Thank you Mark and Jennie,

The purpose of this function is to create date brackets, like month and quarter brackets not only to convert date to string.
I handled null values and now its working and returning Nat for null values. Any idea on how to return null instead NaT to stay in the SQL standards?
I also noticed that i cant aggregate over the python function, for instance if i create a view with a column referring the Python UDF and i execute select count(columnPythonUDF) from table1 an error occured.
Am I excepting too much from a Python UDF to behave like a normal SQL UDF, as I didnt face this error when aggregating over an SQL UDF.

CREATE FUNCTION pyDayBracket(d DATE) RETURNS STRING LANGUAGE PYTHON {
    import pandas as pd
    return pd.to_datetime(d, coerce=True).strftime('%Y-%m-%d')
};

as for the execution time:
  1. directly invoking str_to_date in select query took 0.73 sec
  2. UDF with concat date parts tool 1.1 sec
  3. UDF calling str_to_date took 24 sec
Thank you.


On Wed, Dec 21, 2016 at 6:17 PM, Ying Zhang <Y.Zhang@cwi.nl> wrote:

> On 21 Dec 2016, at 16:24, imad hajj chahine <imad.hajj.chahine@gmail.com> wrote:
>
> Hi Jennie,
>
> Thank you for the answer, the unexpected version of the function take 1.1 sec to execute, the table contains 500k values.

You mean the faster version of the function took 1.1 sec?  How long did the slower one take?

>
> You mentioned Bulk functions, I tried to achieve this with python with no luck, can you please check my Python function syntax and let me know the problem.

I’m sorry that I’m not a python expert.  But I saw Mark has already answered.

Regards,
Jennie


> CREATE FUNCTION pyDayBracket(pdate date) returns string
> LANGUAGE PYTHON {
>       from time import strftime, strptime
>       return [strftime("%Y-%m-%d", strptime(pdt,"%Y-%m-%d")) for pdt in pdate]
> };
>
> Throws a buffer size mismatch exception.
> Also why python don't recognize the date type from monetdb?
>
> Thank you.
>
>
> On Wed, Dec 21, 2016 at 5:11 PM, Ying Zhang <Y.Zhang@cwi.nl> wrote:
> Hai
>
> > On 21 Dec 2016, at 15:26, Alberto Ferrari <aferrari@starconnecting.com> wrote:
> >
> > Imad, we had the same problem (you can find our mail in list with
> > subject "Functions in Monetdb", date nov/22), and had no satisfactory
> > solution yet.
> > I think monetDb is not good for UDF yet... (though is great db engine!)
>
> Mostly, it’s because monetdb can’t automatically bulk execute user defined functions.
> So for UDFs, the overhead of a function call is multiplied by the number of values to process.
> While for built-in functions, they all have a bulk version, so that they are only called once with all values to process.
>
> >
> >
> >
> > 2016-12-21 10:40 GMT-03:00 imad hajj chahine <imad.hajj.chahine@gmail.com>:
> >> Hi,
> >>
> >> I have a weird behavior when i package some sql code in a function, the
> >> performance deteriorate.
> >> The same code running directly in select statement is blinking fast.
> >> ex:
> >>
> >> select date_to_str("Expiry_Date",'%Y-%m-%d') from table: exec time is 0.72s
> >>
> >> if i create a function:
> >> CREATE FUNCTION DayBracket(pdate date) returns char(10)
> >> BEGIN
> >>    return date_to_str(pdate,'%Y-%m-%d');
> >> end;
> >> grant execute on function DayBracket to public;
> >>
> >> and execute select DayBracket("Expiry_Date") from table: exec time is 24s
> >>
> >> even when i set the limit to 1 its taking the same time, so i guess it
> >> should be something related to loading the function.
>
> Using LIMIT in the outer query doesn’t help, your UDF is till called with every value.
> The final tuple is only filtered after that.
>
> >>
> >> The following function perform much better:
> >> CREATE FUNCTION DayBracketOpt(pdate date) returns char(10)
> >> BEGIN
> >>    return extract(year from pdate) || '-' || lpad(extract(month from
> >> pdate),2,'0') || '-' || lpad(dayofmonth(pdate),2,'0');
> >> end;
> >> grant execute on function DayBracketOpt to public;
>
> Hmm, a bit unexpected.  How much better is it?  How many values are processed?
>
> >>
> >> I also tried to create a Python function:
> >> CREATE FUNCTION pyDayBracket(pdate date) returns string
> >> LANGUAGE PYTHON {
> >> from time import strftime, strptime
> >> return [strftime("%Y-%m-%d", strptime(pdt,"%Y-%m-%d")) for pdt in pdate]
> >> };
> >> it is taking too long and throwing an exception:buffer size mismatch
> >> PS: the pdate array type is detected in pythons as string not as date;
> >>
> >>
> >> Also how to delete a function with dependency, I need to replace some
> >> functions which are used in others functions and views. Any flags to turn
> >> off dependency check before dropping?
>
> I think you’ll have to drop the depending functions and views first.
>
> Regards,
> Jennie
>
>
> >>
> >> Thank you.
> >>
> >> _______________________________________________
> >> users-list mailing list
> >> users-list@monetdb.org
> >> https://www.monetdb.org/mailman/listinfo/users-list
> >>
> > _______________________________________________
> > users-list mailing list
> > users-list@monetdb.org
> > https://www.monetdb.org/mailman/listinfo/users-list
>
> _______________________________________________
> users-list mailing list
> users-list@monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
>
> _______________________________________________
> users-list mailing list
> users-list@monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list

_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list