Since the MonetDB server is UTF-8 *only*, you should *never* have
non-UTF-8 strings inside the server. If you have strings in some other
encoding, they should be converted to UTF-8 by whatever client program
you're using. mclient can do this with its -E (--encoding) option.
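For the integer case asked about in the quoted messages below, here is a minimal sketch of why no conversion should be needed: the digits and minus sign that snprintf produces for "%d" are 7-bit ASCII, and ASCII is a subset of UTF-8. The helper names is_ascii and format_year are made up for illustration; they are not MonetDB functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not part of MonetDB): returns true if every byte
 * of s is 7-bit ASCII.  Such a string is valid UTF-8 as it stands. */
static bool
is_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if ((unsigned char) *s & 0x80)
            return false;   /* high bit set: not plain ASCII */
    }
    return true;
}

/* Hypothetical helper: format an int as decimal text into buf.
 * snprintf never writes more than len bytes (including the NUL) and
 * returns the number of characters the full result needs. */
static int
format_year(char *buf, size_t len, int year)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%d", year);
}
```

Calling format_year(buf, sizeof(buf), 2016) on a 15-byte buffer leaves "2016" in buf, and is_ascii(buf) holds for it, so the text can be used as UTF-8 unchanged.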
If you want to do conversions yourself, take a look at the iconv-related
code in common/stream/stream.c. The console_read and console_write
functions in that file can also give you inspiration. They convert
Windows wide characters (16-bit encodings of Unicode code points) to
and from UTF-8. This would be close to converting ints to UTF-8.
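To illustrate what converting a code point to UTF-8 involves, here is a minimal sketch. It is not the code from stream.c; the function name utf8_encode is hypothetical, and it handles one code point at a time without any surrogate-pair decoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MonetDB function): encode one Unicode code
 * point as NUL-terminated UTF-8 into buf, which must hold at least 5
 * bytes.  Returns the number of bytes written, not counting the NUL, or
 * 0 if cp is a surrogate or lies beyond U+10FFFF (buf is then untouched). */
static size_t
utf8_encode(unsigned int cp, char *buf)
{
    unsigned char *p = (unsigned char *) buf;

    assert(buf != NULL);
    if (cp < 0x80) {                    /* 1 byte: 0xxxxxxx */
        p[0] = (unsigned char) cp;
        p[1] = 0;
        return 1;
    }
    if (cp < 0x800) {                   /* 2 bytes: 110xxxxx 10xxxxxx */
        p[0] = (unsigned char) (0xC0 | (cp >> 6));
        p[1] = (unsigned char) (0x80 | (cp & 0x3F));
        p[2] = 0;
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)   /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                 /* 3 bytes: 1110xxxx + 2 x 10xxxxxx */
        p[0] = (unsigned char) (0xE0 | (cp >> 12));
        p[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        p[2] = (unsigned char) (0x80 | (cp & 0x3F));
        p[3] = 0;
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* 4 bytes: 11110xxx + 3 x 10xxxxxx */
        p[0] = (unsigned char) (0xF0 | (cp >> 18));
        p[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        p[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        p[3] = (unsigned char) (0x80 | (cp & 0x3F));
        p[4] = 0;
        return 4;
    }
    return 0;                           /* outside the Unicode range */
}
```

Code points below 0x80 come out as a single unchanged byte, which is why plain ASCII text is already valid UTF-8; only larger values need the multi-byte forms.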
On 12/29/2016 01:10 AM, imad hajj chahine wrote:
> Hi again Sjoerd,
>
> After digging in the code I found GDKstrFromStr. Does this function
> handle conversion from a normal string to a UTF-8 string?
> Is this the correct way to call the function?
>
> str
> UDFyearbracket(str *ret, const date *v)
> {
>     if (*v == date_nil) {
>         *ret = GDKstrdup(str_nil);
>     } else {
>         int year;
>         fromdate(*v, NULL, NULL, &year);
>         *ret = (str) GDKmalloc(15);
>         sprintf(*ret, "%d", year);
>         GDKstrFromStr((unsigned char *)*ret, (unsigned char *)*ret, 15);
>     }
>     return MAL_SUCCEED;
> }
>
> Thank you.
>
> On Wed, Dec 28, 2016 at 11:40 PM, imad hajj chahine
> <imad.hajj.chahine@gmail.com> wrote:
>
> Thank you Sjoerd,
>
> Any idea how to convert an integer to a UTF-8 string? Does sprintf
> come with a variant that can handle UTF-8?
>
> Thank you.
>
> On Wed, Dec 28, 2016 at 11:08 PM, Sjoerd Mullender
> <sjoerd@monetdb.org> wrote:
>
> See https://dev.monetdb.org/hg/MonetDB-extend/ for a tutorial on how to
> create a UDF in C. You can use the URL to clone from.
>
> On 12/28/2016 09:28 PM, Alberto Ferrari wrote:
> > Imad, I wish you success with this. Please report back if you get it
> > working. Could those new functions then be incorporated into a future
> > version of Monet, or maybe easily compiled into the current one? That
> > way users could suggest new useful functions in the future (shame
> > about SQL UDF performance).
> >
> > Regards!
> >
> > 2016-12-28 14:48 GMT-03:00 imad hajj chahine
> > <imad.hajj.chahine@gmail.com>:
> >
> > Hi,
> >
> > > After reviewing the other alternatives, such as SQL and Python UDFs,
> > > I was stuck either on performance (SQL UDF) or on usability (Python
> > > UDF: it cannot be used with aggregation, and performance with dates
> > > is not great).
> > >
> > > So I decided to go the hard way with C functions. As a bonus, this
> > > lets me change the functionality without worrying about dependencies,
> > > which was not the case with the other languages.
> > >
> > > The purpose is to create a set of formatting functions for Year,
> > > Quarter, Month, Week and Day brackets, and of course I need to
> > > create the bulk version of each function for performance.
> >
> > > Starting from MTIMEdate_extract_year_bulk, I now have the simple
> > > function working and can call it successfully from mclient:
> > /
> > /
> > /str/
> > /UDFyearbracket(str *ret, const date *v)/
> > /{/
> > /if (*v == date_nil) {/
> > /*ret = GDKstrdup(str_nil);/
> > /} else {/
> > /int year;/
> > /fromdate(*v, NULL, NULL, &year);/
> > /*ret = (str) GDKmalloc(15);/
> > /sprintf(*ret, "%d", year);/
> > /}/
> > /return MAL_SUCCEED;/
> > /}/
> >
> >
> > > For the bulk version I get an error in the log:
> > > gdk_atoms.c:1345: strPut: Assertion `(v[i] & 0x80) == 0' failed.
> > >
> > > str
> > > UDFBATyearbracket(bat *ret, const bat *bid)
> > > {
> > >     BAT *b, *bn;
> > >     BUN i, n;
> > >     str *y;
> > >     const date *t;
> > >
> > >     if ((b = BATdescriptor(*bid)) == NULL)
> > >         throw(MAL, "UDF.BATyearbracket", "Cannot access descriptor");
> > >     n = BATcount(b);
> > >
> > >     bn = COLnew(b->hseqbase, TYPE_str, BATcount(b), TRANSIENT);
> > >     if (bn == NULL) {
> > >         BBPunfix(b->batCacheid);
> > >         throw(MAL, "UDF.BATyearbracket", "memory allocation failure");
> > >     }
> > >     bn->tnonil = 1;
> > >     bn->tnil = 0;
> > >
> > >     t = (const date *) Tloc(b, 0);
> > >     y = (str *) Tloc(bn, 0);
> > >     for (i = 0; i < n; i++) {
> > >         if (*t == date_nil) {
> > >             *y = GDKstrdup(str_nil);
> > >         } else
> > >             UDFyearbracket(y, t);
> > >         if (strcmp(*y, str_nil) == 0) {
> > >             bn->tnonil = 0;
> > >             bn->tnil = 1;
> > >         }
> > >         y++;
> > >         t++;
> > >     }
> > >
> > >     BATsetcount(bn, (BUN) (y - (str *) Tloc(bn, 0)));
> > >
> > >     bn->tsorted = BATcount(bn) < 2;
> > >     bn->trevsorted = BATcount(bn) < 2;
> > >
> > >     BBPkeepref(*ret = bn->batCacheid);
> > >     BBPunfix(b->batCacheid);
> > >     return MAL_SUCCEED;
> > > }
> >
> > > PS: I am not a C expert, but I can find my way with basic operations
> > > and pointers.
> > >
> > > Any help or suggestions are appreciated.
> >
> > Thank you.
> >
> > _______________________________________________
> > users-list mailing list
> > > users-list@monetdb.org
> > > https://www.monetdb.org/mailman/listinfo/users-list
> >
>
> --
> Sjoerd Mullender
>
--
Sjoerd Mullender