[MonetDB-users] Preferred bulk load method?
Good afternoon- Just wondering if there is a preferred method to load large datasets into monetdb? Piping SQL through mclient? thanks much -matt
On Wed, Nov 28, 2007 at 12:02:18PM -0700, m h wrote:
Good afternoon-
Just wondering if there is a preferred method to load large datasets into monetdb? Piping SQL through mclient? copy into
some examples from 'tpch' COPY 5 RECORDS INTO region from 'PWD/region.tbl' USING DELIMITERS '|', '|\n'; COPY 25 RECORDS INTO nation from 'PWD/nation.tbl' USING DELIMITERS '|', '|\n'; COPY 10000 RECORDS INTO supplier from 'PWD/supplier.tbl' USING DELIMITERS '|', '|\n'; COPY 150000 RECORDS INTO customer from 'PWD/customer.tbl' USING DELIMITERS '|', '|\n'; COPY 200000 RECORDS INTO part from 'PWD/part.tbl' USING DELIMITERS '|', '|\n'; COPY 800000 RECORDS INTO partsupp from 'PWD/partsupp.tbl' USING DELIMITERS '|', '|\n'; COPY 1500000 RECORDS INTO orders from 'PWD/orders.tbl' USING DELIMITERS '|', '|\n'; COPY 7000000 RECORDS INTO lineitem from 'PWD/lineitem.tbl' USING DELIMITERS '|', '|\n'; replace PWD/filename.tbl with the server reachable location. niels
thanks much
-matt
------------------------------------------------------------------------- SF.Net email is sponsored by: The Future of Linux Business White Paper from Novell. From the desktop to the data center, Linux is going mainstream. Let it simplify your IT future. http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4 _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
On Nov 28, 2007 1:12 PM, Niels Nes
On Wed, Nov 28, 2007 at 12:02:18PM -0700, m h wrote:
Good afternoon-
Just wondering if there is a preferred method to load large datasets into monetdb? Piping SQL through mclient? copy into
some examples from 'tpch'
COPY 5 RECORDS INTO region from 'PWD/region.tbl' USING DELIMITERS '|', '|\n'; replace PWD/filename.tbl with the server reachable location.
niels
Is there a sample tbl file somewhere? When the command I get the following: copy 5 records into vendor_dim from '/tmp/vendor.csv' using delimiters '|', '|\n'; ....contents of file like the following... !2456|'2456'|'FOO BAR CAFE'||'875 CENTER STREET RM'|'NY'|'NY'|' 96813'||||2456|2004-08-18||2004-08-19 !SQLException:importTable:failed to import table
Whoops forgot one bit of the error.... Is there a sample tbl file somewhere? When I run the command I get the following: copy 5 records into vendor_dim from '/tmp/vendor.csv' using delimiters '|', '|\n'; !SQLException:sql:decimal wrong format ....contents of file like the following... !2456|'2456'|'FOO BAR CAFE'||'875 CENTER STREET RM'|'NY'|'NY'|' 96813'||||2456|2004-08-18||2004-08-19 !SQLException:importTable:failed to import table
On Wed, Nov 28, 2007 at 01:45:44PM -0700, m h wrote:
Whoops forgot one bit of the error....
Is there a sample tbl file somewhere? When I run the command I get IN the sources sql/src/benchmark/tpch/SF-0.01/ the following:
copy 5 records into vendor_dim from '/tmp/vendor.csv' using delimiters '|', '|\n';
!SQLException:sql:decimal wrong format ....contents of file like the following... !2456|'2456'|'FOO BAR CAFE'||'875 CENTER STREET RM'|'NY'|'NY'|' 96813'||||2456|2004-08-18||2004-08-19 ^
Which of these is supposed to be a decimal? (quotes are only needed in case the strings aren't safe, ie when they contain the seperator. I you use the two given seperators (field and record) then at the end you need '|\n' or change the seperator into '\n'. niels
!SQLException:importTable:failed to import table
------------------------------------------------------------------------- SF.Net email is sponsored by: The Future of Linux Business White Paper from Novell. From the desktop to the data center, Linux is going mainstream. Let it simplify your IT future. http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4 _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
On Nov 28, 2007 1:49 PM, Niels Nes
On Wed, Nov 28, 2007 at 01:45:44PM -0700, m h wrote:
Whoops forgot one bit of the error....
Is there a sample tbl file somewhere? When I run the command I get IN the sources sql/src/benchmark/tpch/SF-0.01/ the following:
copy 5 records into vendor_dim from '/tmp/vendor.csv' using delimiters '|', '|\n';
!SQLException:sql:decimal wrong format ....contents of file like the following... !2456|'2456'|'FOO BAR CAFE'||'875 CENTER STREET RM'|'NY'|'NY'|' 96813'||||2456|2004-08-18||2004-08-19 ^
Which of these is supposed to be a decimal?
Everything that looks like an INT is actually a NUMERIC (sorry, I'm trying to keep the same schema as the original.... )
(quotes are only needed in case the strings aren't safe, ie when they contain the seperator.
I you use the two given seperators (field and record) then at the end you need '|\n' or change the seperator into '\n'.
Thanks I changed this, am still getting the error. The data is confidential, let me try to create some non-confidential example....
The data is confidential, let me try to create some non-confidential example....
CREATE TABLE bulk2 ( id NUMERIC(9, 0) NOT NULL, fax NUMERIC(10, 0), phone NUMERIC(10, 0), acctnum NUMERIC(7, 0), PRIMARY KEY (vendnum) ); /tmp/bulk.csv 1001||1231231234|1001 1002|1234|1231231235|1002 COPY 2 RECORDS INTO bulk_load from '/tmp/bulk.csv' USING DELIMITERS '|', '\n'; !SQLException:sql:decimal wrong format (1231231234|1001) !SQLException:sql:value 1231231234|1001 from line 0 not inserted !SQLException:importTable:failed to import table Timer 1.114 msec 0 rows
On Nov 28, 2007 2:18 PM, m h
The data is confidential, let me try to create some non-confidential example....
CREATE TABLE bulk2 ( id NUMERIC(9, 0) NOT NULL, fax NUMERIC(10, 0), phone NUMERIC(10, 0), acctnum NUMERIC(7, 0), PRIMARY KEY (vendnum) );
/tmp/bulk.csv 1001||1231231234|1001 1002|1234|1231231235|1002
COPY 2 RECORDS INTO bulk_load from '/tmp/bulk.csv' USING DELIMITERS '|', '\n';
!SQLException:sql:decimal wrong format (1231231234|1001) !SQLException:sql:value 1231231234|1001 from line 0 not inserted !SQLException:importTable:failed to import table Timer 1.114 msec 0 rows
sorry bulk_load should be bulk2.... ie COPY 2 RECORDS INTO bulk2 from '/tmp/bulk.csv' USING DELIMITERS '|', '\n';
On Wed, Nov 28, 2007 at 02:20:10PM -0700, m h wrote:
On Nov 28, 2007 2:18 PM, m h
wrote: The data is confidential, let me try to create some non-confidential example....
CREATE TABLE bulk2 ( id NUMERIC(9, 0) NOT NULL, fax NUMERIC(10, 0), phone NUMERIC(10, 0), acctnum NUMERIC(7, 0), PRIMARY KEY (vendnum) );
/tmp/bulk.csv 1001||1231231234|1001 1002|1234|1231231235|1002
COPY 2 RECORDS INTO bulk_load from '/tmp/bulk.csv' USING DELIMITERS '|', '\n';
!SQLException:sql:decimal wrong format (1231231234|1001) !SQLException:sql:value 1231231234|1001 from line 0 not inserted !SQLException:importTable:failed to import table Timer 1.114 msec 0 rows
sorry bulk_load should be bulk2.... ie COPY 2 RECORDS INTO bulk2 from '/tmp/bulk.csv' USING DELIMITERS '|', '\n';
thanks for the bug, fix is coming soon. Could you next time report it on the sourceforge bugtracker? For now to work arround the problem, use INT and BIGINT types instead of numeric(x,0). And specify a NULL for nulls. Niels
------------------------------------------------------------------------- SF.Net email is sponsored by: The Future of Linux Business White Paper from Novell. From the desktop to the data center, Linux is going mainstream. Let it simplify your IT future. http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4 _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
On Nov 28, 2007 2:45 PM, Niels Nes
thanks for the bug, fix is coming soon. Could you next time report it on the sourceforge bugtracker?
Sure, I can put in on the bugtracker now if you want. Not a problem. I just wanted to make sure it was actually a bug. ;)
For now to work arround the problem, use INT and BIGINT types instead of numeric(x,0). And specify a NULL for nulls.
Thanks
On Wed, Nov 28, 2007 at 03:00:29PM -0700, m h wrote:
On Nov 28, 2007 2:45 PM, Niels Nes
wrote: thanks for the bug, fix is coming soon. Could you next time report it on the sourceforge bugtracker?
Sure, I can put in on the bugtracker now if you want. Not a problem. I just wanted to make sure it was actually a bug. ;)
I'm allready fix it ie no need now but next would be great.
For now to work arround the problem, use INT and BIGINT types instead of numeric(x,0). And specify a NULL for nulls.
Niels
Thanks
------------------------------------------------------------------------- SF.Net email is sponsored by: The Future of Linux Business White Paper from Novell. From the desktop to the data center, Linux is going mainstream. Let it simplify your IT future. http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4 _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
On Wed, Nov 28, 2007 at 11:07:42PM +0100, Niels Nes wrote:
On Wed, Nov 28, 2007 at 03:00:29PM -0700, m h wrote:
On Nov 28, 2007 2:45 PM, Niels Nes
wrote: thanks for the bug, fix is coming soon. Could you next time report it on the sourceforge bugtracker?
Sure, I can put in on the bugtracker now if you want. Not a problem. I just wanted to make sure it was actually a bug. ;)
I'm allready fix it ie no need now but next would be great.
For now to work arround the problem, use INT and BIGINT types instead of numeric(x,0). And specify a NULL for nulls.
fixed, checked into the Stable cvs branch.
Niels
Niels
Thanks
------------------------------------------------------------------------- SF.Net email is sponsored by: The Future of Linux Business White Paper from Novell. From the desktop to the data center, Linux is going mainstream. Let it simplify your IT future. http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4 _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
--
Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
------------------------------------------------------------------------- SF.Net email is sponsored by: The Future of Linux Business White Paper from Novell. From the desktop to the data center, Linux is going mainstream. Let it simplify your IT future. http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4 _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
On Nov 28, 2007 3:19 PM, m h
On Nov 28, 2007 3:15 PM, Niels Nes
wrote: fixed, checked into the Stable cvs branch.
Niels
Thanks again. I'll start putting bugs in as soon as I think it might be one from now on.
This is still failing on today's build. Here's the bug: https://sourceforge.net/tracker/index.php?func=detail&aid=1842019&group_id=56967&atid=482468
On Fri, Nov 30, 2007 at 05:08:54PM -0700, m h wrote:
On Nov 28, 2007 3:19 PM, m h
wrote: On Nov 28, 2007 3:15 PM, Niels Nes
wrote: fixed, checked into the Stable cvs branch.
Niels
Thanks again. I'll start putting bugs in as soon as I think it might be one from now on.
This is still failing on today's build. Here's the bug: https://sourceforge.net/tracker/index.php?func=detail&aid=1842019&group_id=56967&atid=482468
Indeed it didn't handle empty values correctly jet. Now thats fixed too. Niels
------------------------------------------------------------------------- SF.Net email is sponsored by: The Future of Linux Business White Paper from Novell. From the desktop to the data center, Linux is going mainstream. Let it simplify your IT future. http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4 _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
participants (2)
-
m h
-
Niels Nes