Niels, thanks for your quick reply.

I'm looking at batxml.c right now. Is there a function I should look at for single and multiple groups? I'm looking at xmlagg. I don't understand why I would need both single-group and multiple-group implementations; I can only compute HLL values for single columns.

Any pointers are greatly appreciated, thanks!

Miguel

On 11/28/2012 05:33 PM, Niels Nes wrote:
On Wed, Nov 28, 2012 at 05:27:47PM +0000, Miguel Ping wrote:
Greetings,
We're currently evaluating MonetDB for an analytical DW and so far we are happy with the results.
I am trying to implement a grouping function that calculates a value over a set of strings, so that my queries would read like this:
select metric, udf_aggregate(string_column) from table group by metric;
For a bit of background, we're using a distinct-value sketch called HyperLogLog
http://metamarkets.com/2012/fast-cheap-and-98-right-cardinality-estimation-for-big-data/
http://blog.aggregateknowledge.com/2012/10/25/sketch-of-the-day-hyperloglog-cornerstone-of-a-big-data-infrastructure/
and we're currently storing estimations for each time period (day). HLL lets you merge/aggregate a set of estimations (each estimation is a vector of numbers, we're currently storing it as a string) for an arbitrary range, and still have an accurate estimation. (I'm sure the literature doesn't call it estimations, sorry for my English)
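To make the merge step concrete: combining two HyperLogLog sketches of the same size amounts to keeping the larger value of each pair of registers. The sketch below is only an illustration of that idea. The names `hll_sketch`, `hll_merge` and `HLL_REGISTERS` are hypothetical, the register count is arbitrarily small, and it assumes the stored string has already been parsed into an array of counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register count; real sketches use m = 2^p for a chosen precision p. */
#define HLL_REGISTERS 16

/* One stored estimation: a fixed-size vector of small counters. */
typedef struct {
    uint8_t reg[HLL_REGISTERS];
} hll_sketch;

/* Merge src into dst by keeping the larger value of each register pair.
 * Both sketches must have been built with the same register count. */
void hll_merge(hll_sketch *dst, const hll_sketch *src)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < HLL_REGISTERS; i++) {
        if (src->reg[i] > dst->reg[i])
            dst->reg[i] = src->reg[i];
    }
}
```

Because a register-wise maximum is commutative and associative, daily sketches can be merged in any order, which is what makes this usable as an aggregate over an arbitrary date range.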
what I would like is a custom UDF like the one provided in MonetDB src (reverse) but that would operate and behave like an aggregate function.
Right now, I'm not considering using it for types other than string (no need for polymorphism right now). Is this possible with a UDF?

Yes.

I found a way of registering aggregate functions on the mailing list, but the HLL is complex enough to warrant its own C impl, instead of a MAL function.

Well, C functions need their own MAL signatures. So the steps needed are:

1. C HLL aggregate implementations for single and multiple groups.
2. Load this library into MonetDB (i.e. the MAL and library files).
3. Use SQL CREATE AGGREGATE to register them.
Examples are in the batxml file(s).
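One reading of "single and multiple groups" is that the single-group variant folds a whole column into one result (a query without GROUP BY), while the multiple-group variant receives a group id per row and produces one result per group. The plain C sketch below illustrates that split only. It does not use the MonetDB BAT API or MAL, and every name in it (`sketch_t`, `aggr_all`, `aggr_grouped`, `NREG`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NREG 16 /* hypothetical register count */

typedef struct {
    uint8_t reg[NREG];
} sketch_t;

/* Register-wise maximum: the HLL merge step. */
static void sketch_merge(sketch_t *dst, const sketch_t *src)
{
    for (size_t i = 0; i < NREG; i++) {
        if (src->reg[i] > dst->reg[i])
            dst->reg[i] = src->reg[i];
    }
}

/* Single group: fold every row into one output sketch. */
void aggr_all(const sketch_t *rows, size_t nrows, sketch_t *out)
{
    memset(out, 0, sizeof *out);
    for (size_t i = 0; i < nrows; i++)
        sketch_merge(out, &rows[i]);
}

/* Multiple groups: group_id[i] names the output slot that row i belongs to,
 * so the result is one sketch per group. */
void aggr_grouped(const sketch_t *rows, const size_t *group_id, size_t nrows,
                  sketch_t *out, size_t ngroups)
{
    memset(out, 0, ngroups * sizeof *out);
    for (size_t i = 0; i < nrows; i++) {
        assert(group_id[i] < ngroups);
        sketch_merge(&out[group_id[i]], &rows[i]);
    }
}
```

Both variants share the same merge step and differ only in how rows are routed to outputs, so the grouped version is mostly bookkeeping on top of the single-group one.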
Niels
Thanks,
Miguel
_______________________________________________
users-list mailing list
users-list@monetdb.org
http://mail.monetdb.org/mailman/listinfo/users-list