Import bug on bug tracker
Hello there,

I reported what I believe to be a bug in the import routines towards the end of September:

https://www.monetdb.org/bugzilla/show_bug.cgi?id=3812#c0

Below is a summary of the issue I encountered. I wasn’t sure whether the bug tracker is live/active, so I am trying this mailing list. Of course, this may be expected behaviour (though I doubt it), in which case it would be useful to know.

Description, Andy Barlow, 2015-09-25 15:19:43 CEST

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML, like Gecko) Version/9.0.1 Safari/601.2.4

If your import file contains a number of character combinations, such as \560, \561, \700 and probably lots of others, the copy into fails with a message like:

    Failed to import table line 5 field 1 'varchar(20)' expected in 'FIVE\700'

Reproducible: Always

Steps to Reproduce:
1. create table test (test varchar(20));
2. create a load file test.txt with the following:
       ONE
       TWO
       THREE
       FOUR
       FIVE\700
3. copy into test from 'test.txt' using delimiters '\t','\n';
4. fails with message: Failed to import table line 5 field 1 'varchar(20)' expected in 'FIVE\700'

Actual Results: Failed to import table line 5 field 1 'varchar(20)' expected in 'FIVE\700'

Expected Results: should import OK, the character combinations are valid

Comment 1, Andy Barlow, 2015-09-25 16:04:00 CEST
https://www.monetdb.org/bugzilla/show_bug.cgi?id=3812#c1

It appears that the copy command is treating the input as octal. If you convert the backslash \ into its octal equivalent, \134, then it “works”. In some cases, such as \700, an error is thrown.

With an import file test.txt:

    ONE\101
    ONE\134101

When imported into table test (test varchar(20)), you get:

    ONEA
    ONE\101

So the copy command is converting octal escapes to their UTF-8/ASCII equivalent.

Andy Barlow
t: +44 7830 302268
www.datum360.com http://www.datum360.com/
...delivering a measured approach to engineering information

Please be advised that this email may contain confidential information. If you are not the intended recipient, please do not read, copy or re-transmit this email. If you have received this email in error, please notify us by email by replying to the sender or by telephone and delete this message and any attachments. Thank you in advance for your cooperation and assistance.

In addition, Datum360 disclaim that the content of this email constitutes an offer to enter into, or the acceptance of, any contract or agreement or any amendment thereto; provided that the foregoing disclaimer does not invalidate the binding effect of any digital or other electronic reproduction of a manual signature that is included in any attachment to this email.
Dear Andy,

thank you very much for using MonetDB and for sharing your experiences and the problems you encountered. Please accept our apologies for the inconvenience and for this belated reply.

We, a small handful of people forming the core MonetDB team of CWI's Database Architectures research group, closely follow both our mailing lists and our bug tracker, and we highly appreciate any user feedback we receive via these and other channels.

Unfortunately, our resources, both in terms of manpower and time, are finite and vary over time. After all, our main "business" is performing research and publishing scientific articles. Maintaining MonetDB, handling user suggestions and requests, and fixing reported bugs largely happens on a voluntary basis in our spare time. As much as we are eager to do more, we face the constraints above.

Having said that, quoting in general, and backslash ('\') quoting and handling in bulk imports (copy into) in particular, is not trivial to make entirely "waterproof" and consistent.

We are aware of your bug report and of related problems in general, and fixing this is on our agenda. Unfortunately, in light of what I said above, we currently cannot give a useful estimate of when we'll be able to devote the necessary time and resources to properly investigate the origin of the problem and solve it.

Please bear with us.

Kind regards,
Stefan

----- On Nov 11, 2015, at 12:18 PM, andy barlow andy.barlow@datum360.com wrote:
Hello there
I reported what I believe to be a bug in the import routines towards the end of September:
https://www.monetdb.org/bugzilla/show_bug.cgi?id=3812#c0
Below is a summary of the issue I encountered. I wasn’t sure whether the bug tracker is live/active, so I am trying this mailing list.
Of course, this may be expected behaviour (though I doubt it), in which case it would be useful to know.
Description, Andy Barlow, 2015-09-25 15:19:43 CEST
If your import file contains a number of character combinations, such as \560, \561, \700 and probably lots of others, the copy into fails with a message like:
Failed to import table line 5 field 1 'varchar(20)' expected in 'FIVE\700'
Reproducible: Always
Steps to Reproduce:
1. create table test (test varchar(20));
2. create a load file test.txt with the following:
       ONE
       TWO
       THREE
       FOUR
       FIVE\700
3. copy into test from 'test.txt' using delimiters '\t','\n';
4. fails with message: Failed to import table line 5 field 1 'varchar(20)' expected in 'FIVE\700'

Actual Results: Failed to import table line 5 field 1 'varchar(20)' expected in 'FIVE\700'

Expected Results: should import OK, the character combinations are valid

Comment 1, Andy Barlow, 2015-09-25 16:04:00 CEST

It appears that the copy command is treating the input as octal.
If you convert the backslash \ into its octal equivalent, \134 then it “works”.
In some cases, such as \700 an error is thrown.
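Since \134 is reported above to load as a literal backslash, one possible workaround is to rewrite the load file before running copy into. The following is only a minimal Python sketch under that assumption: the helper names `escape_backslashes` and `escape_file` are hypothetical, and the approach would also disable any backslash escapes the file contains on purpose.

```python
def escape_backslashes(line: str) -> str:
    # Replace each literal backslash with the octal escape for a backslash,
    # which the report above says is loaded back as a single backslash.
    return line.replace("\\", "\\134")


def escape_file(src_path: str, dst_path: str) -> None:
    # Copy src_path to dst_path line by line, escaping backslashes.
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(escape_backslashes(line))
```

For example, `escape_file("test.txt", "test_escaped.txt")` would produce a file in which the fifth line reads FIVE\134700, which could then be used in the copy into statement instead of the original.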
With an import file test.txt:
    ONE\101
    ONE\134101
When imported into table test (test varchar(20)), you get:
    ONEA
    ONE\101
So the copy command is converting octal escapes to their UTF-8/ASCII equivalent.
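The behaviour observed above can be modelled with a short Python sketch. This is not MonetDB's code, only an illustration of octal-escape decoding that reproduces the reported results. The function name `decode_octal_escapes` is hypothetical, and the rule that values above octal 377 (decimal 255) are rejected is an assumption that happens to match the failing inputs \560, \561 and \700.

```python
import re

# A backslash followed by one to three octal digits.
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{1,3})")


def decode_octal_escapes(field: str) -> str:
    # Replace each octal escape with the character it denotes.
    def replace(match: re.Match) -> str:
        value = int(match.group(1), 8)
        # Assumption: an escape must fit in one byte, so anything
        # above 0o377 (255) is treated as an error.
        if value > 0o377:
            raise ValueError(f"octal escape out of range: {match.group(0)}")
        return chr(value)

    # re.sub does not rescan replaced text, so the backslash produced
    # by \134 is not combined with the digits that follow it.
    return _OCTAL_ESCAPE.sub(replace, field)


if __name__ == "__main__":
    for sample in ("ONE\\101", "ONE\\134101", "FIVE\\700"):
        try:
            print(sample, "->", decode_octal_escapes(sample))
        except ValueError as err:
            print(sample, "-> error:", err)
```

Under these assumptions, the first two inputs decode to ONEA and ONE\101, matching the imported rows, while FIVE\700 raises an error because octal 700 is decimal 448.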
Andy Barlow t: +44 7830 302268
www.datum360.com ...delivering a measured approach to engineering information
_______________________________________________
developers-list mailing list
developers-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/developers-list
--
| Stefan.Manegold@CWI.nl | DB Architectures   (DA) |
| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
| +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |