UDF with more than 3 parameters does not benefit from PYTHON_MAP
Hi,

I created a table with 9 columns and passed 6 of those columns to a UDF. This UDF is fully vectorizable. However, we found that only a single thread was used during execution.

To find out what the problem might be, we created two simple functions, python_min_map_3 and python_min_map_4, which take 3 and 4 parameters respectively. We are running the July 2017 release of MonetDB on Linux.

If PYTHON_MAP takes effect, the function runs in parallel on segments of the input and returns one result per segment, so the number of results depends on how many threads are used. Otherwise a single value is returned (here because of numpy.min).

The result is:

- python_min_map_3 returns a couple of numbers
- python_min_map_4 returns a single number

When we increased the number of arguments further, a single number was still always returned. Is there a restriction that PYTHON_MAP only works when the number of arguments is no more than 3?

Here is our example code:

    CREATE FUNCTION python_min_map_3(x0 FLOAT, x1 FLOAT, x2 FLOAT)
    RETURNS FLOAT LANGUAGE PYTHON_MAP {
        return numpy.min(x0)
    };
    select python_min_map_3(x0, x1, x2) from table_0;

    CREATE FUNCTION python_min_map_4(x0 FLOAT, x1 FLOAT, x2 FLOAT, x3 FLOAT)
    RETURNS FLOAT LANGUAGE PYTHON_MAP {
        return numpy.min(x0)
    };
    select python_min_map_4(x0, x1, x2, x3) from table_0;

NumPy in MonetDB reference: https://www.monetdb.org/blog/embedded-pythonnumpy-monetdb

Best regards,
Hanfeng Chen
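
For clarity, the behaviour we expected can be sketched in plain NumPy outside MonetDB. This is only an illustration: run_mapped, the chunk count of 4, and the random test columns are our own assumptions and say nothing about how MonetDB actually splits the data. The point is that the number of partial results should depend on the number of chunks, not on how many columns are passed in.

```python
import numpy as np


def python_min(x0, *unused):
    """Same body as the UDFs above: only x0 is used, extra columns are ignored."""
    return np.min(x0)


def run_mapped(func, columns, n_chunks):
    """Hypothetical stand-in for a parallel map (not MonetDB code).

    Splits every column into n_chunks aligned pieces and calls func once
    per piece, returning one partial result per chunk.
    """
    split = [np.array_split(col, n_chunks) for col in columns]
    return [func(*pieces) for pieces in zip(*split)]


# Assumed test data: four columns of 1000 random floats each.
rng = np.random.default_rng(0)
cols = [rng.random(1000) for _ in range(4)]

# Mapped execution: one minimum per chunk, for 3 and for 4 arguments.
partial_3 = run_mapped(python_min, cols[:3], n_chunks=4)
partial_4 = run_mapped(python_min, cols[:4], n_chunks=4)

# Unmapped execution: a single minimum over the whole column.
single = python_min(*cols)

print(len(partial_3), len(partial_4))
print(min(partial_3) == single)
```

In this sketch both calls yield four partial minima, and the smallest of them equals the single unmapped result. That is the output shape we expected from python_min_map_4 as well.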