International characters in MonetDB
Hello All,

I have been trying to do some bulk uploads from CSV files into a MonetDB database. All the fields in the database are VARCHAR. The text in some of the fields contains international characters such as ü, Ó, é, etc.

When I execute:
copy into database from 'filename.csv' using delimiters ',','\n','"' NULL as '';
I get an error stating that the fields with words containing the international characters could not be imported, and the import process stops.

How can I solve this issue? (I am running MonetDB on Windows.)

Thanks,
SG
On 2013-07-15 13:27, aaaaaa ggggggg wrote:
How can I solve this issue? (I am running MonetDB on Windows.)
When reading CSV files, MonetDB only accepts files encoded in UTF-8.
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
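Because COPY INTO only accepts UTF-8 input, it can help to check a file before loading it. The following is a minimal Python sketch and not part of MonetDB; the function name `first_invalid_utf8_offset` is made up for illustration, and the sketch reads the whole file into memory, so it assumes the file fits in RAM.

```python
from pathlib import Path


def first_invalid_utf8_offset(path):
    """Return the byte offset of the first invalid UTF-8 sequence, or None if the file is valid."""
    # Assumption: the file is small enough to be read in one go.
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
    return None
```

A return value of None means the file decodes cleanly as UTF-8. Any other value is the position of the first byte that does not, which usually points at a character saved in a legacy encoding such as ISO-8859-1.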
Hi Sjoerd,
Thanks for that information! It worked like a charm. I just had to convert the file from ISO-8859-1 to UTF-8 using:
iconv -f iso-8859-1 -t utf-8 Filename.csv > Filename.utf8.csv
Converting all files for bulk reading to UTF-8 seems to have sped up the reading process a bit as well!
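On a Windows machine without iconv, the same conversion can be sketched in Python. This is an illustration only: the function name `transcode_to_utf8` and its parameters are hypothetical, and the default source encoding simply mirrors the `-f iso-8859-1` flag of the command above. If the file was actually saved as Windows-1252, passing `"cp1252"` instead would be the assumption to test.

```python
def transcode_to_utf8(src, dst, source_encoding="iso-8859-1"):
    """Copy src to dst, re-encoding the text from source_encoding to UTF-8."""
    # newline="" turns off newline translation, so the record delimiters
    # in the CSV file are written out exactly as they were read.
    with open(src, "r", encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)
```

For example, `transcode_to_utf8("Filename.csv", "Filename.utf8.csv")` would play the role of the iconv call, processing the file line by line rather than loading it all at once.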
I have a follow-up question:
My original data is in XML format (I have the XSD file as well). I couldn't find a good, clear example of how to use MonetDB5 to read an XML file into the database while preserving the schema in the XML file.
It would be very helpful if you could give some pointers.
Best Regards
SG
Hi SG,

since the Apr2011 release, MonetDB (based on the MonetDB5 back-end) is a purely relational (SQL) DBMS.

Regarding XML, we do support some features of the SQL/XML standard, as documented under http://www.monetdb.org/Documentation/Manuals/SQLreference/XML

MonetDB/XQuery as a "native" XML/XQuery DBMS (based on the MonetDB4 back-end) has been deprecated after the Mar2011 release; cf. http://www.monetdb.org/XQuery
The final snapshot is available from http://www.monetdb.org/XQuery/Downloads

Best,
Stefan

On Mon, Jul 15, 2013 at 06:54:31AM -0700, aaaaaa ggggggg wrote:
I have a follow-up question: My original data is in XML format (I have the XSD file as well). I couldn't find a good, clear example of how to use MonetDB5 to read an XML file into the database while preserving the schema in the XML file. It would be very helpful if you could give some pointers.
--
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Hi Stefan,
Thanks for that information regarding XML. Following the information in:
http://www.monetdb.org/Documentation/Manuals/SQLreference/XML
I tried:
copy into MyDatabase from 'Myschema.xml';
for which I get the error message:
"Error: COPY INTO: no such table 'MyDatabase'"
I do not have any tables in the database "MyDatabase", since the documentation says:
"statement reads the XML document and breaks it into a series of relational tables with foreign key references. It results in a structured object representation."
What am I doing wrong?
Best,
SG
Hi SG / aaaaaa ggggggg / veriveri24,

to be honest, I would expect that you have to create the table first, but since I don't know more about MonetDB's SQL/XML support, I cannot even tell you which schema that table is supposed to have.

I hope that someone else who knows more about this will be able to answer your question.

Best,
Stefan

----- Original Message -----
What am I doing wrong?
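The thread leaves the SQL/XML question open. One possible workaround, which does not use MonetDB's SQL/XML support at all, is to flatten the XML into a UTF-8 CSV file outside the database, create the target table by hand as Stefan suggests, and then load the CSV with COPY INTO as in the first message. The Python sketch below assumes the XML is a flat list of record elements with simple child elements; the function name `xml_records_to_csv`, the `record_tag` and `fields` parameters, and any element names passed to it are hypothetical and would have to be adapted to the real XSD.

```python
import csv
import xml.etree.ElementTree as ET


def xml_records_to_csv(xml_path, csv_path, record_tag, fields):
    """Write one CSV row per <record_tag> element, one column per child tag in fields."""
    tree = ET.parse(xml_path)
    # The output is written as UTF-8, the only encoding the thread says COPY INTO accepts.
    with open(csv_path, "w", encoding="utf-8", newline="") as out:
        # "\n" matches the record delimiter used in the COPY INTO statement above.
        writer = csv.writer(out, lineterminator="\n")
        for record in tree.getroot().iter(record_tag):
            # findtext returns the default ("") when the child element is absent.
            writer.writerow([record.findtext(name, default="") for name in fields])
```

Nested or repeating child elements would need one CSV file and one table per level, with key columns added by hand; that mapping is not attempted here.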
participants (3)
- aaaaaa ggggggg
- Sjoerd Mullender
- Stefan Manegold