[MonetDB-users] ODBC + non-ascii
Hi, I don't know if the problem is related to the interface I use (ODBC), but I have a problem with inserting non-ASCII data. All the data I insert seems broken. I started with 'utf-8' but tested 'cp1250' and 'iso8859-2' as well. No luck. All the data I read back has broken non-ASCII characters. Any clues? (Windows 32-bit "Jun 2008" release) Pawel Lewicki
Yes. Windows. Internally, MonetDB and the MonetDB ODBC driver are fully UTF-8. If you use the MonetDB Client program to look at your data, you are almost guaranteed to get weird results if you use non-ASCII characters since it seems to be impossible (that is to say, we haven't found a way) to figure out the character set encoding that is used by the cmd window. There doesn't seem to be anything that differentiates that window from some other interfaces that we have tried, except that it does use a different encoding. But if you use the ODBC driver itself to get the data back from the server, it should still be UTF-8. (It's been a while since I have looked at this, so I forgot the details. But I do remember I tried quite a few things to get the cmd window to behave sanely.) -- Sjoerd Mullender
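The display problem described above can be illustrated with a minimal sketch. It assumes the console uses a legacy code page such as cp852 or cp1250 (common on Central European Windows installations, but not confirmed anywhere in this thread); `render_as` is a hypothetical helper, not part of MonetDB or its client. The stored bytes stay valid UTF-8 throughout and only their interpretation changes.

```python
def render_as(data: bytes, codepage: str) -> str:
    """Decode stored bytes the way a display using `codepage` would."""
    # errors="replace" avoids exceptions for bytes the code page does not define.
    return data.decode(codepage, errors="replace")


if __name__ == "__main__":
    text = "Paweł"
    stored = text.encode("utf-8")  # what a UTF-8 server would hold
    # cp1250 and cp852 are assumed console code pages, for illustration only.
    for codepage in ("utf-8", "cp1250", "cp852"):
        shown = render_as(stored, codepage)
        # ascii() keeps the output printable on any console.
        print(f"{codepage:>7}: {ascii(shown)} matches original: {shown == text}")
```

Only the UTF-8 interpretation reproduces the original text, which is why output that looks broken in a console does not by itself prove the stored data is damaged.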
Hi, I don't mean the cmd window. I use ODBC to transfer data between databases. When I query MonetDB I get wrong data. I'm sure it's wrong, not just improperly displayed :) I'm not sure yet whether it gets broken when inserting into MonetDB or when selecting it later. Pawel Lewicki
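One way to narrow down what kind of damage occurred is to compare a known test string with what comes back and check it against the usual encoding mix-ups. The following is a rough sketch only: `classify_roundtrip` is a hypothetical helper, cp1250 is assumed as the legacy code page because it is one of the encodings mentioned in this thread, and the check says nothing about which layer (driver, client, or server) caused the damage.

```python
def classify_roundtrip(original: str, returned: str, legacy: str = "cp1250") -> str:
    """Guess how `returned` was derived from `original` (heuristic only)."""
    if returned == original:
        return "intact"

    # Case 1: UTF-8 bytes that some layer decoded with the legacy code page.
    utf8_bytes = original.encode("utf-8")
    if utf8_bytes.decode(legacy, errors="replace") == returned:
        return f"utf8-decoded-as-{legacy}"

    # Case 2: legacy-encoded bytes that some layer decoded as UTF-8.
    try:
        legacy_bytes = original.encode(legacy)
    except UnicodeEncodeError:
        legacy_bytes = None  # original is not representable in the legacy code page
    if legacy_bytes is not None:
        if legacy_bytes.decode("utf-8", errors="replace") == returned:
            return f"{legacy}-decoded-as-utf8"

    # Case 3: every non-ASCII character was replaced by '?'.
    if "".join(c if ord(c) < 128 else "?" for c in original) == returned:
        return "non-ascii-replaced"

    return "unknown"


if __name__ == "__main__":
    sample = "Paweł Żółć"
    garbled = sample.encode("utf-8").decode("cp1250")
    print(classify_roundtrip(sample, garbled))
```

Inserting the test string through one interface and reading it back through another (for example, inserting via ODBC and selecting via the native Python bindings) would then indicate whether the insert path or the select path is responsible.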
The ODBC driver seems broken. I inserted the same data using the native Python bindings and it's OK. Pawel Lewicki
It is certainly possible that the driver is broken. Can you submit a bug report, with information on how to reproduce the problem, on the bug tracker at SourceForge? https://sourceforge.net/tracker/?group_id=56967&atid=482468 -- Sjoerd Mullender
participants (2)
- Pawel Lewicki
- Sjoerd Mullender