Hello Karanbir,
Well, that can't be the cause then. :) I will file the bug-report
regarding the BOM.
When I run the command exactly as you did (with both '-s' and a
pipe), I get:
$ echo 'CREATE TABLE aap (a int);' | mclient -lsql -dtest
$ cat data.dat
1
2
3
4
5
$ N=4; head -n $N data.dat | mclient -lsql -dtest -s "copy $N records into aap from STDIN;"
MAPI = monetdb@localhost:50000
QUERY = copy 4 records into aap from STDIN;
ERROR = !SQLException:sql:value ';' while parsing ';' from line 0
field 0 not inserted, expecting type int
!SQLException:importTable:failed to import table
It seems that mclient is confused about what to read first (the
statement or stdin); perhaps that is a bug? An mclient guru might be
able to answer this.
I also tried (and this seems to work fine):
$ (N=4; echo "copy $N records into aap from STDIN;"; head -n $N data.dat) | mclient -lsql -dtest
[ 4 ]
$ mclient -lsql -dtest -s "select * from aap;"
% sys.aap # table_name
% a # name
% int # type
% 1 # length
[ 1 ]
[ 2 ]
[ 3 ]
[ 4 ]
So this could be another workaround? Have you tried this already?
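If it holds up, the same pattern might extend to your gzipped data
without unpacking it to disk first. A rough sketch only: copy_stream is
a hypothetical helper name, data.dat.gz is a hypothetical file with one
record per line, and I assume the record count is known up front. I
have not tried this against a large load.

```shell
# Hypothetical helper: print the COPY statement followed by the first N
# records of a gzip-compressed file, ready to be piped into mclient.
copy_stream() {
    n=$1       # number of records to load (assumed known in advance)
    table=$2   # target table, e.g. aap
    file=$3    # gzip-compressed input, one record per line
    echo "copy $n records into $table from STDIN;"
    # Decompress to stdout only; nothing is written to disk.
    gzip -dc "$file" | head -n "$n"
}

# Usage (needs a running server and an existing table, so not run here):
#   copy_stream 4 aap data.dat.gz | mclient -lsql -dtest
```

The statement and the data travel over the same pipe, which is the
property that made the variant above work where '-s' plus a pipe did
not.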
Wouter
p.s. Otherwise, I guess another workaround would be to create a
(temporary) named pipe on your filesystem (but I'm not sure whether
that works):
$ mkfifo /tmp/workaroundpipe
$ (cat mydata > /tmp/workaroundpipe) &
$ mclient -lsql -s "copy x records into x from '/tmp/workaroundpipe';"
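For gzipped input, the pipe could be fed by a decompressor instead of
cat. Again only a sketch under assumptions: start_fifo_feed is a
hypothetical helper name, mydata.gz is a hypothetical file, and whether
the server accepts a named pipe as a COPY source is untested.

```shell
# Hypothetical helper: create a named pipe and decompress a gzip file
# into it in the background, so a reader can open the pipe like a file.
start_fifo_feed() {
    src=$1    # gzip-compressed input
    fifo=$2   # path for the named pipe (must not exist yet)
    mkfifo "$fifo" || return 1
    # The writer blocks until some reader opens the pipe.
    gzip -dc "$src" > "$fifo" &
}

# Usage (untested against a server):
#   start_fifo_feed mydata.gz /tmp/workaroundpipe
#   mclient -lsql -s "copy x records into x from '/tmp/workaroundpipe';"
#   rm /tmp/workaroundpipe
```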
2009/5/5 Karanbir Singh
Hi Wouter,
Wouter Alink wrote:
Hello Karanbir,
This sounds like a BOM (Byte Order Mark, http://unicode.org/faq/utf_bom.html#BOM) is not dealt with correctly.
That's interesting, and not something I'd considered at all. However:
If you try:
xxd /home/kbsingh/data/data/1000.utf8 | head
does it start with 'EF BB BF'?
[kbsingh@koala ~]$ xxd /home/kbsingh/data/data/1000.utf8 | head
0000000: 3664 6266 6339 6431 6635 3464 3137 3366  6dbfc9d1f54d173f
0000010: 6130 3962 6664 6131 3965 3566 6335 3062  a09bfda19e5fc50b
So that does not seem to be the issue in this case.
A little experiment (on the head) reveals a bug in mclient (it does not correctly handle the optional BOM at the beginning of the input):
$ cat selectWithBOM.py
print "\xEF\xBB\xBFSELECT 1;"
$ python selectWithBOM.py > queryWithBOM.sql
$ xxd queryWithBOM.sql
0000000: efbb bf53 454c 4543 5420 313b 0a         ...SELECT 1;.
$ cat queryWithBOM.sql
SELECT 1;
$ echo "SELECT 1;" | mclient -lsql
% . # table_name
% single_value # name
% tinyint # type
% 1 # length
[ 1 ]
$ cat queryWithBOM.sql | mclient -lsql
(Hangs)
I guess a bug should be filed.
Good call. Should I go ahead and do that using your test case here, or would you like to file the bug report yourself? The only reason I am hesitant is that, while this issue does seem to exist, it is not one that my data suffers from here.
If your data starts with the BOM, a workaround would be to strip the first three bytes of your data (as the BOM is not very meaningful when using UTF-8).
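Stripping the BOM could look roughly like this. It is a sketch only:
has_bom and strip_bom are hypothetical helper names, and the input is
assumed to be a regular file rather than a stream.

```shell
# Hypothetical helper: succeed if the file starts with the UTF-8 BOM
# (the bytes EF BB BF).
has_bom() {
    # Dump the first three bytes as hex and squeeze out the whitespace.
    [ "$(od -An -tx1 -N3 "$1" | tr -d ' \n')" = "efbbbf" ]
}

# Hypothetical helper: write the file to stdout without a leading BOM.
strip_bom() {
    if has_bom "$1"; then
        tail -c +4 "$1"   # start output at byte 4, skipping the BOM
    else
        cat "$1"          # no BOM, pass the file through unchanged
    fi
}

# Usage, with the file from the experiment above:
#   strip_bom queryWithBOM.sql | mclient -lsql
```

Checking first keeps files without a BOM from losing their first three
bytes.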
I don't think that is the case here, so what workaround options are available? Essentially, I need to load about 600 to 700 G worth of data that is going to be delivered to me in a .gz file; expanding that to raw text is not something I'd like to consider unless that were the _only_ way to get the data loaded here.
-- Karanbir Singh : http://www.karan.org/ : 2522219@icq