I ran into a similar issue when creating my own aggregate function. The problem is that the mitosis pipeline assumes your sub-aggregation is iterative, but the mergetable processing then doesn't call it iteratively, since it only recognises the aggregations sum/count/min/max. I worked around this by changing the opt_mergetable processing to call mat.pack() for b, g and e before passing them to my aggregation function. This allows the mitosis pipeline to be used where possible while still supporting non-iterative sub-aggregation functions. Whether this could be considered a valid enhancement for MonetDB, I'm not sure.

If you want your sub-aggregations to run iteratively, you need to change opt_mergetable to recognise your sub-aggregation name. It would be good to have a dedicated namespace for custom aggregations in which users could define iterative and non-iterative sub-aggregations; that would avoid having to change the optimizer code.

Regards,
Scott

PS: The other option for keeping your sub-aggregation out of the mitosis pipeline is to declare it in the aggr module; that way it is recognised as an aggregation that does not support the iterative approach. This means that none of your query can use the mitosis pipeline, though.

-----Original Message-----
From: users-list [mailto:users-list-bounces+scott.mathieson=pb.com@monetdb.org] On Behalf Of Niels Nes
Sent: 14 August 2013 14:10
To: Communication channel for MonetDB users
Subject: Re: b and g must be aligned

On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:
> I've been tinkering a little more: if I use the minimal_pipe, no_mitosis_pipe or sequential_pipe optimizers, the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.
>
> Where can I learn more about this optimizer? Can someone shed some light?

Could you send us the explain outputs of both with and without mitosis/mergetable? This may give us a hint of what goes wrong.

My guess is that subhllagg should probably be added (recognized).

Niels

> Thanks!
> -- Miguel
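The difference between iterative and non-iterative sub-aggregations mentioned in this thread can be sketched in plain C. This is only an illustration under stated assumptions: demo_sum, demo_count_distinct and demo_pack are hypothetical helpers, not MonetDB functions, and an exact distinct count is used merely as a simple example of an aggregate whose per-partition results cannot be combined by addition. It is not the hll aggregate discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of n values: per-partition sums can simply be added together. */
long demo_sum(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Exact distinct count by pairwise comparison (O(n^2), fine for a demo).
 * Per-partition results cannot be added to get the overall answer. */
size_t demo_count_distinct(const int *v, size_t n)
{
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        while (j < i && v[j] != v[i])
            j++;
        if (j == i)          /* v[i] did not occur earlier */
            distinct++;
    }
    return distinct;
}

/* Hypothetical stand-in for packing two partitions back into one column
 * before aggregating. Returns the number of values written to out. */
size_t demo_pack(const int *p1, size_t n1, const int *p2, size_t n2,
                 int *out, size_t cap)
{
    assert(n1 + n2 <= cap);
    for (size_t i = 0; i < n1; i++)
        out[i] = p1[i];
    for (size_t i = 0; i < n2; i++)
        out[n1 + i] = p2[i];
    return n1 + n2;
}
```

With partitions {1, 2, 3} and {3, 4, 1}, adding the two partial sums matches the sum over the packed column, while adding the two partial distinct counts gives 6 instead of the 4 obtained after packing.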
On 08/13/2013 03:59 PM, Miguel Ping wrote:
I'm trying to come up with a small sample, but it is hard.
- I currently have a small dataset (~600 rows) which gives me the error.
- BUT if I export that data onto a *new* table, the error doesn't happen.

This makes it hard to provide a proper sample.
Running explain on the same query for these two tables, the plans differ: the failing plan seems to use the new subgroup feature for aggregates, which was released in a recent MonetDB version (11.15.x?).
* Does MonetDB keep some sort of internal table statistics that feed the MAL planner?
* Can I force the query to use the 'old' aggregate function?
* Can you guys point me to the part where the group BAT is calculated?

My guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

...
| (X_16,r1_34,X_140) := group.subgroupdone(X_15);
| X_18 := algebra.leftfetchjoin(r1_34,X_15);
...
| X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); # r1_34 should be the bat* gid
I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.
Thanks, Miguel
On 08/13/2013 11:35 AM, Martin Kersten wrote:
If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.
Thanks, Martin
On 8/13/13 11:27 AM, Miguel Ping wrote:
The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls. Also, according to Sjoerd, it's a bug:
"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."
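The point about %d can be sketched as follows. This is a minimal illustration, assuming the count and seq fields are size_t-sized unsigned integers on the build in question; demo_bun, demo_oid and demo_format_bat_info are hypothetical stand-ins, not MonetDB names. MonetDB's own headers provide format macros for these types (BUNFMT and OIDFMT, as far as I recall), which are the safer choice inside the server code; check gdk.h for your version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for MonetDB's BUN and oid types; assumed here
 * to be as wide as size_t, which is why %d would be wrong for them. */
typedef size_t demo_bun;
typedef size_t demo_oid;

/* Format the fields Sjoerd suggests inspecting, using %zu for the
 * size_t-sized values and %u for the unsigned int dense flag.
 * Returns what snprintf returns: the length of the formatted text. */
int demo_format_bat_info(char *buf, size_t cap, const char *name,
                         demo_bun count, demo_oid seq, unsigned int dense)
{
    assert(buf != NULL && cap > 0);
    return snprintf(buf, cap, "%s: count=%zu seq=%zu dense=%u",
                    name, count, seq, dense);
}
```

Called once for b and once for g at the point where the misalignment is reported, this gives the values to compare without needing a debugger.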
Thanks.
On 08/13/2013 08:51 AM, Stefan Manegold wrote:
Dear Miguel,
I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not knowing your code at all, I'm afraid we cannot be of much help.
Best, Stefan
On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:
Some more info:

>SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000;
==> 764314

>SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000;
==> 15
it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?
On 08/12/2013 05:50 PM, Miguel Ping wrote:
> Hi, I'm resurrecting this since I've been out of town and only
> today I got a chance to investigate further. I've recompiled with
> -O0 to prevent optimizations from "hiding" the values, and in
> the debugger I got this:
>
> b->U->count   BUN            764314
> g->U->count   BUN            17
> b->H->seq     oid            0
> g->H->seq     oid            0
> b->H->dense   unsigned int   1
> g->H->dense   unsigned int   1
>
> The stack call is as follows:
>
> BAThllaggr() at udf.c
> AGGRsubhllaggcand() at udf.c
> AGGRsubhllagg() at udf.c
> malCommandCall() at mal_interpreter.c
> runMALsequence() at mal_interpreter.c
> DFLOWworker() at mal_dataflow.c
> start_thread() at pthread_create.c
> clone() at clone.S
> 0x0
>
> -------- Original Message --------
> Subject: Re: b and g must be aligned
> Date: Fri, 26 Jul 2013 11:34:53 +0100
> From: Sjoerd Mullender
> Reply-To: Communication channel for MonetDB users
> To: Communication channel for MonetDB users
>
> On 2013-07-26 12:23, Miguel Ping wrote:
>> On 07/25/2013 01:51 PM, Sjoerd Mullender wrote:
>>> On 2013-07-25 14:04, Miguel Ping wrote:
>>>> Hi all,
>>>>
>>>> We're hitting this error "b and g must be aligned". I
>>>> tracked the src to a commit about some alignment code thing
>>>> in gdk_calc:
>>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html
>>>> (Fix alignment conversion in compatibility code for grouped
>>>> aggregates.)
>>>>
>>>> Can you guys please explain what's the reason behind this
>>>> error? I can't understand by just looking to the src of
>>>> gdk_calc.c
>>>>
>>>> Thanks!
>>>
>>> When using grouped aggregates, the grouping bat must be aligned
>>> with the value bat. The value bat is b and contains the values
>>> you want to aggregate. The group bat is g and contains for
>>> each value in b the group (an oid) it belongs to. Equal group
>>> ids means the same group. These bats must be aligned, because
>>> we need to know for each value in b to which group it belongs.
>>> Aligned means: same length, and same head column values. The
>>> head columns must be dense (a sequence of numbers starting at
>>> some value, and each next value exactly one larger than the
>>> previous). Dense sequences are usually not stored explicitly
>>> in MonetDB. We only store the first value in the hseqbase
>>> field. So the hseqbase fields of b and g must be equal. The
>>> one exception to this is when the bats are both empty. This
>>> last exception is the change to gdk_calc.c in that changeset.
>>>
>>> -- Sjoerd Mullender
>>
>> Thanks for your explanation. I still don't understand how can
>> there be a misalignment. I would expect MonetDB to feed my hll
>> aggregate functions with the correct values. Can it be that I
>> may have some NULL values and the validation is failing because
>> of that?
>
> If this happens when running a SQL query, it's a bug.
> I don't think NULLs have anything to do with it. NULL values
> are stored in-line.
> You might want to look at b->U->count, g->U->count, b->H->seq,
> g->H->seq, b->H->dense, g->H->dense when the misalignment happens
> (either in the debugger or by using printf--but realize that
> count and seq are not int, so %d is not going to work).
> Also things like the MAL plan (prepend SQL query with EXPLAIN)
> and the stack trace might be useful.
>
> -- Sjoerd Mullender

_______________________________________________
users-list mailing list
users-list@monetdb.org
http://mail.monetdb.org/mailman/listinfo/users-list
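Sjoerd's definition of alignment can be written down as a small check. This is a sketch under stated assumptions, not the code in gdk_calc.c: struct demo_bat and demo_aligned are hypothetical, and the struct keeps only the three fields named in the thread (count, seq/hseqbase, dense).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the BAT fields discussed above. */
struct demo_bat {
    size_t count;     /* number of entries (b->U->count) */
    size_t hseqbase;  /* first head oid of the dense head (b->H->seq) */
    bool   dense;     /* head column is a dense sequence (b->H->dense) */
};

/* Aligned as described in the thread: both heads dense, same length,
 * same hseqbase. The one exception is that two empty bats count as
 * aligned regardless of their hseqbase. */
bool demo_aligned(const struct demo_bat *b, const struct demo_bat *g)
{
    assert(b != NULL && g != NULL);
    if (b->count == 0 && g->count == 0)
        return true;
    return b->dense && g->dense &&
           b->count == g->count &&
           b->hseqbase == g->hseqbase;
}
```

With the values from the debugger session (b: count 764314, seq 0, dense; g: count 17, seq 0, dense) the check fails on the count comparison alone, which matches the reported error.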
--
Niels Nes, Centrum Wiskunde & Informatica (CWI)
Science Park 123, 1098 XG Amsterdam, The Netherlands
room L3.14, phone ++31 20 592-4098
sip:4098@sip.cwi.nl
url: http://www.cwi.nl/~niels
e-mail: Niels.Nes@cwi.nl