Re: [Monetdb-developers] [Monetdb-checkins] MonetDB5/src/modules/mal tablet.mx, , 1.226, 1.227
Is this correct? What about the case where you have a field which ends in \\, such as
\\,more data
which is two fields separated by a comma.

On 2009-08-14 08:17, Martin Kersten wrote:
> Update of /cvsroot/monetdb/MonetDB5/src/modules/mal
> In directory 23jxhf1.ch3.sourceforge.com:/tmp/cvs-serv11440
>
> Modified Files:
>     tablet.mx
> Log Message:
> Only start looking at the escape if you really have to.
>
> U tablet.mx
> Index: tablet.mx
> ===================================================================
> RCS file: /cvsroot/monetdb/MonetDB5/src/modules/mal/tablet.mx,v
> retrieving revision 1.226
> retrieving revision 1.227
> diff -u -d -r1.226 -r1.227
> --- tablet.mx 14 Aug 2009 05:50:11 -0000 1.226
> +++ tablet.mx 14 Aug 2009 06:17:10 -0000 1.227
> @@ -2278,21 +2278,21 @@
>      /* we split based on simple character separators for speed */
>      if ( rseplen == 1 ){
>          for (; *e; e++)
> -            if ( *e == '\\')
> -                e++;
> -            else
> -            if ( *e == *rsep)
> +            if ( *e == *rsep) {
> +                if ( e > s && *(e-1) == '\\')
> +                    continue;
>                  break;
> +            }
>          if (*e == 0)
>              e = 0;
>      } else
>      do {
>          for ( ; *e ; e++)
> -            if ( *e == '\\')
> -                e++;
> -            else
> -            if ( *e == *rsep && strncmp(e,rsep,rseplen) == 0 )
> +            if ( *e == *rsep && strncmp(e,rsep,rseplen) == 0 ) {
> +                if ( e > s && *(e-1) == '\\')
> +                    continue;
>                  break;
> +            }
>          if ( *e )
>              break;
>          e = 0;
> _______________________________________________
> Monetdb-checkins mailing list
> Monetdb-checkins@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/monetdb-checkins
-- Sjoerd Mullender
Sjoerd Mullender wrote:
> Is this correct? What about the case where you have a field which ends in \\, such as
> \\,more data

Indeed a tricky one... That one was not covered in the old code either. The only other reasonable solution is to disallow any escaping of col/rec separators altogether, or to rely on the user to deal with these corner cases.
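To make the corner case concrete, here is a minimal sketch of the two single-character scanning loops from the quoted diff, pulled out into standalone functions. The function names, the NULL return value, and the end-of-string guard in the first function are assumptions of this sketch and are not taken from tablet.mx. Read in isolation like this, the 1.226 loop stops at the comma in \\,more data, while the 1.227 look-back treats that comma as escaped; the surrounding code in tablet.mx may of course behave differently.

```c
#include <stddef.h>

/* Sketch of the revision 1.226 loop: a backslash skips the character
 * that follows it. Returns the first unescaped separator, or NULL. */
static const char *find_sep_skip(const char *s, char sep)
{
    const char *e;

    for (e = s; *e; e++) {
        if (*e == '\\') {
            if (e[1] == '\0')   /* guard added for this sketch only */
                return NULL;
            e++;                /* step over the escaped character */
        } else if (*e == sep) {
            return e;
        }
    }
    return NULL;
}

/* Sketch of the revision 1.227 loop: only when a separator is seen,
 * look back one character for a backslash. */
static const char *find_sep_lookback(const char *s, char sep)
{
    const char *e;

    for (e = s; *e; e++) {
        if (*e == sep) {
            if (e > s && e[-1] == '\\')
                continue;       /* preceded by a backslash: keep scanning */
            return e;
        }
    }
    return NULL;
}
```

Calling both functions on the C string "\\\\,more data" (the text \\,more data) with ',' as separator shows the difference: the look-back cannot tell whether the backslash before the comma was itself escaped.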
Martin Kersten wrote:
> Sjoerd Mullender wrote:
>> Is this correct? What about the case where you have a field which ends in \\, such as
>> \\,more data
> indeed a tricky one... That one was not covered in the old code either. The only other reasonable solution is to disallow any escaping of col/rec separators altogether, or to rely on the user to deal with these corner cases
I would have thought it's fairly simple: \ escapes the next character. If the next character is in the set [nbtfa] and perhaps [0-7] the combination has a special meaning (newline, backspace, etc), but otherwise you just get the next character, whether it is a \ or the separator. You could have the rule that the \ is only recognized as escape if it is inside a double-quoted string.
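The rule described above can be sketched as a small decoder. This is only an illustration under stated assumptions: the function name unescape, the caller-supplied output buffer, the limit of three octal digits, and the choice to keep a lone trailing backslash literally are all choices made for this sketch, not behaviour of tablet.mx.

```c
#include <stddef.h>

/* Hypothetical helper: decode backslash escapes from src into dst.
 * dst must have room for strlen(src) + 1 bytes.
 * Returns the number of decoded bytes (excluding the terminator). */
static size_t unescape(const char *src, char *dst)
{
    size_t n = 0;

    while (*src) {
        if (*src != '\\') {
            dst[n++] = *src++;
            continue;
        }
        src++;                              /* step over the backslash */
        switch (*src) {
        case 'n': dst[n++] = '\n'; src++; break;
        case 'b': dst[n++] = '\b'; src++; break;
        case 't': dst[n++] = '\t'; src++; break;
        case 'f': dst[n++] = '\f'; src++; break;
        case 'a': dst[n++] = '\a'; src++; break;
        case '\0':                          /* lone trailing backslash */
            dst[n++] = '\\';                /* kept literally (assumption) */
            break;
        default:
            if (*src >= '0' && *src <= '7') {
                int v = 0, i;
                /* up to three octal digits (assumption) */
                for (i = 0; i < 3 && *src >= '0' && *src <= '7'; i++)
                    v = v * 8 + (*src++ - '0');
                dst[n++] = (char)(v & 0xFF);
            } else {
                dst[n++] = *src++;          /* \\ gives \, \, gives , */
            }
            break;
        }
    }
    dst[n] = '\0';
    return n;
}
```

With this rule the scanner and the decoder agree: any character that follows a backslash is consumed together with it, so an escaped backslash at the end of a field can never be mistaken for an escape of the separator that follows.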
_______________________________________________
Monetdb-developers mailing list
Monetdb-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-developers
-- Sjoerd Mullender
Sjoerd Mullender
> Martin Kersten wrote:
>> Sjoerd Mullender wrote:
>>> Is this correct? What about the case where you have a field which ends in \\, such as
>>> \\,more data
>> indeed a tricky one... That one was not covered in the old code either. The only other reasonable solution is to disallow any escaping of col/rec separators altogether, or to rely on the user to deal with these corner cases
> I would have thought it's fairly simple: \ escapes the next character. If the next character is in the set [nbtfa] and perhaps [0-7] the combination has a special meaning (newline, backspace, etc), but otherwise you just get the next character, whether it is a \ or the separator. You could have the rule that the \ is only recognized as escape if it is inside a double-quoted string.

But in SQL you can change the string_quote character to be something other than the double quote. And there is also the situation where you want to import multiple lines of text into a blob field.
I have some nice examples of import files that contain HTML in blob fields that work with the current implementation. We can use them to test the new code. And the interpretation of the \\ is also a nice one: what would happen if you had a blob column that contained LaTeX files, and then dumped and loaded them? What you may want is not characters to separate fields, records and strings, but strings. The mssql bulkcopy tool allows this.

Arjen
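As a rough illustration of separating fields with a string instead of a single character, here is a minimal sketch. The helper name split_on_string, its in-place splitting, and the example separator |~| are assumptions made for this sketch; the quoted diff already compares against a multi-character rsep with strncmp, so this only shows the idea in isolation, without any escape handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split rec in place on the multi-character
 * separator sep, storing at most max field pointers in fields[].
 * Returns the number of fields stored. When max is reached, the last
 * field holds the unsplit remainder of the record. */
static size_t split_on_string(char *rec, const char *sep,
                              char **fields, size_t max)
{
    size_t n = 0;
    size_t seplen = strlen(sep);
    char *p = rec;

    if (seplen == 0 || max == 0)
        return 0;

    while (n < max) {
        char *hit;

        fields[n++] = p;
        hit = strstr(p, sep);       /* next occurrence of the separator */
        if (hit == NULL || n == max)
            break;
        *hit = '\0';                /* terminate the current field */
        p = hit + seplen;           /* continue after the separator */
    }
    return n;
}
```

With a separator such as |~| that is unlikely to occur in the data, a record like x\y|~|c,d|~|e splits into three fields while the backslash and the comma pass through untouched, which is the appeal for HTML or LaTeX content in blob columns.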
--
====================================================================
CWI, Kamer C0.03
Centrum voor Wiskunde en Informatica
Science Park 123             Email: arjen.de.rijke@cwi.nl
1098 XG Amsterdam            tel: +31-(0)20-5924305
Nederland                         +31-(0)6-51899284
                             fax: +31-(0)20-5924312
===================== http://www.cwi.nl/~rijke/ ====================
participants (3)
- Arjen de Rijke
- Martin Kersten
- Sjoerd Mullender