Hi, I'm trying to define an aggregate function in MonetDB to concatenate rows in select group by (GROUP_CONCAT in MySQL). Postgres does not have this but there is an example to define this function. I'm starting to use monetdb from mysql thanks to MonetDB performance for large data sets. For some small tables, we have the feature to allow users to store small amount of custom data in rows and have a web service to convert rows into columns. We have this feature because we can do it in MySQL and Postgres using GROUP_CONCAT. I'm working on an open source breeding system which stores breeding data in MySQL, environment data in Postgres and now marker data in MonetDB in Australia. The system is called KDDart. Going back to my problem is that there is the same question with GROUP_CONCAT about 3 years ago. I have been googling and trying based on the information available for 2 days now. I have learnt how to define a MAL function and make it available in SQL mode by saving it in the autoload. However, when it comes to define and write a MAL aggregate function. I'm unable to find any step-by-step example. I could not start successfully with a simple aggregate solution. I google further that we a documentation on monetdb website about defining an aggregate function in R. I installed R and recompile my MonetDB source (MonetDB-11.19.9) on my openSuSe 12.2 with a pre-compiled R (R-base-2.15.1-1.1.4.x86_64). I followed the instruction to enable rintegration and compile successfully. R integration is loaded with mserver5 instruction. But when using monetdbd it says 'set: no such property: embedr'. I did bit of C debugging but I only got Segmentation fault. I don't know what you could do to help me out. I wish I could write the GROUP_CONCAT MAL function and being able to define it permanently in .mal file and being able to link it to the SQL. I hope you can help me out. Thanks in advance. Puthick
On Apr 16, 2015, at 09:23, Puthick Hok
wrote: Hi,
I'm trying to define an aggregate function in MonetDB to concatenate rows in select group by (GROUP_CONCAT in MySQL). Postgres does not have this but there is an example to define this function.
Hai Puthick, This function is mainly related to how the query results are displayed. I wonder how much effort it is to capture this functionality in your application code? It's probably easier, as programming in MAL may not be trivial.
I'm starting to use monetdb from mysql thanks to MonetDB performance for large data sets. For some small tables, we have the feature to allow users to store small amount of custom data in rows and have a web service to convert rows into columns. We have this feature because we can do it in MySQL and Postgres using GROUP_CONCAT. I'm working on an open source breeding system which stores breeding data in MySQL, environment data in Postgres and now marker data in MonetDB in Australia. The system is called KDDart.
Interesting! Do you mean this: http://www.kddart.org/?
Going back to my problem is that there is the same question with GROUP_CONCAT about 3 years ago. I have been googling and trying based on the information available for 2 days now. I have learnt how to define a MAL function and make it available in SQL mode by saving it in the autoload. However, when it comes to define and write a MAL aggregate function. I'm unable to find any step-by-step example. I could not start successfully with a simple aggregate solution.
MAL is basically for internal use. It's not intended for end-users. Therefore, the lack of documentation. If you are really interested in programming MAL, I'm afraid that you'll need to read existing MAL code as example. However, unless you have good reasons to go for MAL, I'd recommend SQL UDF, possibly with help of R.
I google further that we a documentation on monetdb website about defining an aggregate function in R.
a.f.a.i.k., we don't have any documentations about R functions. You probably will have more luck searching in the documentations of the R community/website. We merely execute whatever R code is written in de UDF.
I installed R and recompile my MonetDB source (MonetDB-11.19.9) on my openSuSe 12.2 with a pre-compiled R (R-base-2.15.1-1.1.4.x86_64). I followed the instruction to enable rintegration and compile successfully. R integration is loaded with mserver5 instruction. But when using monetdbd it says 'set: no such property: embedr'. I did bit of C debugging but I only got Segmentation fault.
I'm sorry I don't have answers to this problem. You'll need our R experts. Please ping us again if you don't get answers. If possible, you can also try our VirtualImage or Docker images. There, our R integrations are precompiled and installed. Regards, Jennie
I don't know what you could do to help me out. I wish I could write the GROUP_CONCAT MAL function and being able to define it permanently in .mal file and being able to link it to the SQL.
I hope you can help me out.
Thanks in advance. Puthick _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hi Puthick, Last year I was in need of this function and I came across the following example on the mailing list: --- function group_concat(b:bat[:oid, :str]):str; value := ";"; barrier (h, t) := iterator.new(b); value := value + t; value := value + ";"; redo (h, t) := iterator.next(b); exit (h, t); return value; end group_concat; function subgroup_concat(b:bat[:oid,:any_1],g:bat[:oid,:oid],e:bat[:oid,:any_2],skip_ nils:bit) :bat[:oid,:str]; nw := aggr.count(e); nl := calc.lng(nw); bn := bat.new(:oid, :str, nl); # check 'e' has some values - this is the list of group IDs in the head # 'g' is the group to the data BAT head ID mapping # 'b' is the data column BAT that we are aggregating over barrier (grpid, t) := iterator.new(e); # select GID from TID->GID map to get matching TIDs TIDs := algebra.uselect(g,grpid); # get DATA for matching TIDs b_data := algebra.kintersect(b,TIDs); # aggregate grpval := group_concat(b_data); # Store the result for this group bat.insert(bn, grpid, grpval); redo (grpid, t) := iterator.next(e); exit (grpid, t); return bn; end subgroup_concat; --- You should place this function in aggr.mal to prevent issues with the mitosis optimizer, see https://www.monetdb.org/pipermail/users-list/2013-August/006722.html https://www.monetdb.org/pipermail/users-list/2013-August/006722.html for details on this. Register this function in SQL using: CREATE AGGREGATE group_concat (val STRING) returns STRING external name aggr.group_concat; My experience with this MAL implementation is that performance is suboptimal. For better performance you can implement this function as a C UDF for instance.. Hope this helps. Regards, Nik Van: users-list [mailto:users-list-bounces+n.h.schuiling=students.uu.nl@monetdb.org] Namens Puthick Hok Verzonden: donderdag 16 april 2015 9:23 Aan: users-list@monetdb.org Onderwerp: MAL aggregate function difficulty Hi, I'm trying to define an aggregate function in MonetDB to concatenate rows in select group by (GROUP_CONCAT in MySQL). Postgres does not have this but there is an example to define this function. I'm starting to use monetdb from mysql thanks to MonetDB performance for large data sets. For some small tables, we have the feature to allow users to store small amount of custom data in rows and have a web service to convert rows into columns. We have this feature because we can do it in MySQL and Postgres using GROUP_CONCAT. I'm working on an open source breeding system which stores breeding data in MySQL, environment data in Postgres and now marker data in MonetDB in Australia. The system is called KDDart. Going back to my problem is that there is the same question with GROUP_CONCAT about 3 years ago. I have been googling and trying based on the information available for 2 days now. I have learnt how to define a MAL function and make it available in SQL mode by saving it in the autoload. However, when it comes to define and write a MAL aggregate function. I'm unable to find any step-by-step example. I could not start successfully with a simple aggregate solution. I google further that we a documentation on monetdb website about defining an aggregate function in R. I installed R and recompile my MonetDB source (MonetDB-11.19.9) on my openSuSe 12.2 with a pre-compiled R (R-base-2.15.1-1.1.4.x86_64). I followed the instruction to enable rintegration and compile successfully. R integration is loaded with mserver5 instruction. But when using monetdbd it says 'set: no such property: embedr'. I did bit of C debugging but I only got Segmentation fault. I don't know what you could do to help me out. I wish I could write the GROUP_CONCAT MAL function and being able to define it permanently in .mal file and being able to link it to the SQL. I hope you can help me out. Thanks in advance. Puthick
Nik, thank you very much for your detailed instructions. It works.
Jennie, thank you for your comment. Now with the help, I better
understand MAL. I don't want to be a MAL expert, just enough to
explain to my boss. www.kddart.org is indeed the website about our
project.
Thanks again everyone!
Regards,
Puthick
On Sat, Apr 18, 2015 at 12:13 AM, Nik Schuiling
Hi Puthick,
Last year I was in need of this function and I came across the following example on the mailing list:
---
function group_concat(b:bat[:oid, :str]):str;
value := ";";
barrier (h, t) := iterator.new(b);
value := value + t;
value := value + ";";
redo (h, t) := iterator.next(b);
exit (h, t);
return value;
end group_concat;
function subgroup_concat(b:bat[:oid,:any_1],g:bat[:oid,:oid],e:bat[:oid,:any_2],skip_nils:bit) :bat[:oid,:str];
nw := aggr.count(e);
nl := calc.lng(nw);
bn := bat.new(:oid, :str, nl);
# check 'e' has some values - this is the list of group IDs in the head
# 'g' is the group to the data BAT head ID mapping
# 'b' is the data column BAT that we are aggregating over
barrier (grpid, t) := iterator.new(e);
# select GID from TID->GID map to get matching TIDs
TIDs := algebra.uselect(g,grpid);
# get DATA for matching TIDs
b_data := algebra.kintersect(b,TIDs);
# aggregate
grpval := group_concat(b_data);
# Store the result for this group
bat.insert(bn, grpid, grpval);
redo (grpid, t) := iterator.next(e);
exit (grpid, t);
return bn;
end subgroup_concat;
---
You should place this function in aggr.mal to prevent issues with the mitosis optimizer, see https://www.monetdb.org/pipermail/users-list/2013-August/006722.html for details on this.
Register this function in SQL using:
CREATE AGGREGATE group_concat (val STRING) returns STRING external name aggr.group_concat;
My experience with this MAL implementation is that performance is suboptimal. For better performance you can implement this function as a C UDF for instance..
Hope this helps.
Regards,
Nik
Van: users-list [mailto:users-list-bounces+n.h.schuiling=students.uu.nl@monetdb.org] Namens Puthick Hok Verzonden: donderdag 16 april 2015 9:23 Aan: users-list@monetdb.org Onderwerp: MAL aggregate function difficulty
Hi,
I'm trying to define an aggregate function in MonetDB to concatenate rows in select group by (GROUP_CONCAT in MySQL). Postgres does not have this but there is an example to define this function.
I'm starting to use monetdb from mysql thanks to MonetDB performance for large data sets. For some small tables, we have the feature to allow users to store small amount of custom data in rows and have a web service to convert rows into columns. We have this feature because we can do it in MySQL and Postgres using GROUP_CONCAT. I'm working on an open source breeding system which stores breeding data in MySQL, environment data in Postgres and now marker data in MonetDB in Australia. The system is called KDDart.
Going back to my problem is that there is the same question with GROUP_CONCAT about 3 years ago. I have been googling and trying based on the information available for 2 days now. I have learnt how to define a MAL function and make it available in SQL mode by saving it in the autoload. However, when it comes to define and write a MAL aggregate function. I'm unable to find any step-by-step example. I could not start successfully with a simple aggregate solution.
I google further that we a documentation on monetdb website about defining an aggregate function in R. I installed R and recompile my MonetDB source (MonetDB-11.19.9) on my openSuSe 12.2 with a pre-compiled R (R-base-2.15.1-1.1.4.x86_64). I followed the instruction to enable rintegration and compile successfully. R integration is loaded with mserver5 instruction. But when using monetdbd it says 'set: no such property: embedr'. I did bit of C debugging but I only got Segmentation fault.
I don't know what you could do to help me out. I wish I could write the GROUP_CONCAT MAL function and being able to define it permanently in .mal file and being able to link it to the SQL.
I hope you can help me out.
Thanks in advance. Puthick
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
participants (3)
-
Nik Schuiling
-
Puthick Hok
-
Ying Zhang