[MonetDB-users] Batch sql inserts in monetdb
Hello all, I'm working in a project and we are planning to use monetdb. Our Data: We have around 15 millon users and 6,000 columns. Our Problem We need to be able to query specific log data. We use Hadoop to massage the data and create a sql file with sql inserts (around 15 millon insert with 3000 columns) Out Question Is it efficent to run a command like. By efficent I refe:. betweent 1 to 6 hours efficent, after that, a big problem. monetdb -u ....... < my_sql_insert.sql with 15 millon inserts and between 1000 to 3000 columns? (I'm not always going to create insert for the 6000 columns) could monetdb handle multiple of those command if a split the file. monetdb -u ....... < my_sql_insert_1.sql monetdb -u ....... < my_sql_insert_2.sql monetdb -u ....... < my_sql_insert_3.sql ... Is SQL COPY more efficient than a sql scrip file with inserts? BAT files is it significant faster ? I'm trying to avoid this approach since it seems a little bit obscue. any other suggestion? I'm not able to run all this test and get the info myself becuase I will not have a monetdb ready soon. Thank you very much and I would be glad to send back an email with the results I got and contribute in that way. Federico
On Wed, 1 Feb 2012, Federico Dayan wrote:
Is SQL COPY more efficient than a sql scrip file with inserts?
Yes, much more.
BAT files is it significant faster ? I'm trying to avoid this approach since it seems a little bit obscue.
Could be faster, because you save on looking for 6001 delimiters per line. Then again, it might require a significant amount of time to develop the conversion yourself. Stefan
Hi Federico, for bulk data loads into MonetDB/SQL, COPY INTO from CSV file(s) is strongly recommended over SQL scripts with INSERT statements. COPY INTO is significantly faster. If you do not instantly understand what "creating binary BAT files" and "attaching binary BAT files" means, you most probably do not need to worry about options. Kind regards, Stefan On Wed, Feb 01, 2012 at 12:50:50PM -0300, Federico Dayan wrote:
Hello all,
I'm working in a project and we are planning to use monetdb.
Our Data: We have around 15 millon users and 6,000 columns.
Our Problem We need to be able to query specific log data. We use Hadoop to massage the data and create a sql file with sql inserts (around 15 millon insert with 3000 columns)
Out Question Is it efficent to run a command like. By efficent I refe:. betweent 1 to 6 hours efficent, after that, a big problem.
monetdb -u ....... < my_sql_insert.sql
with 15 millon inserts and between 1000 to 3000 columns? (I'm not always going to create insert for the 6000 columns)
could monetdb handle multiple of those command if a split the file.
monetdb -u ....... < my_sql_insert_1.sql monetdb -u ....... < my_sql_insert_2.sql monetdb -u ....... < my_sql_insert_3.sql ...
Is SQL COPY more efficient than a sql scrip file with inserts?
BAT files is it significant faster ? I'm trying to avoid this approach since it seems a little bit obscue.
any other suggestion?
I'm not able to run all this test and get the info myself becuase I will not have a monetdb ready soon.
Thank you very much and I would be glad to send back an email with the results I got and contribute in that way.
Federico
------------------------------------------------------------------------------ Keep Your Developer Skills Current with LearnDevNow! The most comprehensive online learning library for Microsoft developers is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3, Metro Style Apps, more. Free future releases when you subscribe now! http://p.sf.net/sfu/learndevnow-d2d _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Stefan.Manegold @ CWI.nl | DB Architectures (INS1) | | http://CWI.nl/~manegold/ | Science Park 123 (L321) | | Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Don't forget to provide a good row estimate with the COPY INTO statement
On Wed, Feb 1, 2012 at 11:18 AM, Stefan Manegold
Hi Federico,
for bulk data loads into MonetDB/SQL, COPY INTO from CSV file(s) is strongly recommended over SQL scripts with INSERT statements. COPY INTO is significantly faster.
If you do not instantly understand what "creating binary BAT files" and "attaching binary BAT files" means, you most probably do not need to worry about options.
Kind regards, Stefan
On Wed, Feb 01, 2012 at 12:50:50PM -0300, Federico Dayan wrote:
Hello all,
I'm working in a project and we are planning to use monetdb.
Our Data: We have around 15 millon users and 6,000 columns.
Our Problem We need to be able to query specific log data. We use Hadoop to massage the data and create a sql file with sql inserts (around 15 millon insert with 3000 columns)
Out Question Is it efficent to run a command like. By efficent I refe:. betweent 1 to 6 hours efficent, after that, a big problem.
monetdb -u ....... < my_sql_insert.sql
with 15 millon inserts and between 1000 to 3000 columns? (I'm not always going to create insert for the 6000 columns)
could monetdb handle multiple of those command if a split the file.
monetdb -u ....... < my_sql_insert_1.sql monetdb -u ....... < my_sql_insert_2.sql monetdb -u ....... < my_sql_insert_3.sql ...
Is SQL COPY more efficient than a sql scrip file with inserts?
BAT files is it significant faster ? I'm trying to avoid this approach since it seems a little bit obscue.
any other suggestion?
I'm not able to run all this test and get the info myself becuase I will not have a monetdb ready soon.
Thank you very much and I would be glad to send back an email with the results I got and contribute in that way.
Federico
------------------------------------------------------------------------------
Keep Your Developer Skills Current with LearnDevNow! The most comprehensive online learning library for Microsoft developers is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3, Metro Style Apps, more. Free future releases when you subscribe now! http://p.sf.net/sfu/learndevnow-d2d _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Stefan.Manegold @ CWI.nl | DB Architectures (INS1) | | http://CWI.nl/~manegold/ | Science Park 123 (L321) | | Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
------------------------------------------------------------------------------ Keep Your Developer Skills Current with LearnDevNow! The most comprehensive online learning library for Microsoft developers is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3, Metro Style Apps, more. Free future releases when you subscribe now! http://p.sf.net/sfu/learndevnow-d2d _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
participants (4)
-
David Pennell
-
Federico Dayan
-
Stefan de Konink
-
Stefan Manegold