b and g must be aligned

Miguel Ping

12 Aug 2013 12 Aug '13

4:50 p.m.

Hi, I'm resurrecting this since I've been out of town and only today I got a chance to investigate further. I've recompiled with -O0 to prevent optimizations from "hiding" the values, and in the debugger I got this: b->U->count BUN 764314 g->U->count BUN 17 b->H->seq oid 0 g->H->seq oid 0 b->H->dense unsigned int 1 g->H->dense unsigned int 1 The stack call is as follows: BAThllaggr() at udf.c AGGRsubhllaggcand() at udf.c AGGRsubhllagg() at udf.c malCommandCall() at mal_interpreter.c runMALsequence() at mal_interpreter.c DFLOWworker() at mal_dataflow.c start_thread() at pthread_create.c clone() at clone.S 0x0 -------- Original Message -------- Subject: Re: b and g must be aligned Date: Fri, 26 Jul 2013 11:34:53 +0100 From: Sjoerd Mullender Reply-To: Communication channel for MonetDB users To: Communication channel for MonetDB users -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2013-07-26 12:23, Miguel Ping wrote:

...

On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 14:04, Miguel Ping wrote:

...
...
...
Hi all,

We're hitting this error "b and g must be aligned". I tracked the src to a commit about some alignment code thing in gdk_calc: http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html

(Fix alignment conversion in compatibility code for grouped

...
aggregates.)

Can you guys please explain what's the reason behind this error? I can't understand by just looking to the src of gdk_calc.c

Thanks! When using grouped aggregates, the grouping bat must be aligned with the value bat. The value bat is b and contains the values you want to aggregate. The group bat is g and contains for each value in b the group (an oid) it belongs to. Equal group ids means the same group. These bats must be aligned, because we need to know for each value in b to which group it belongs. Aligned means: same length, and same head column values. The head columns must be dense (a sequence of numbers starting at some value, and each next value exactly one larger than the previous). Dense sequences are usually not stored explicitly in MonetDB. We only store the first value in the hseqbase field. So the hseqbase fields of b and g must be equal. The one exception to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset.

-- Sjoerd Mullender

...
_______________________________________________ users-list mailing listusers-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Thanks for your explanation. I still don't understand how can there be a misalignment. I would expect MonetDB to feed my hll aggregate functions with the correct values. Can it be that I may have some NULL values and the validation is failing because of that?

If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful. - -- Sjoerd Mullender -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms GwdvU7bSZtM= =2zSg -----END PGP SIGNATURE----- _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Attachments:

attachment.html (text/html — 7.4 KB)

Show replies by date

Miguel Ping

12 Aug 12 Aug

4:52 p.m.

And here's the explain: sql>explain SELECT ts_day, hllagg(h_visitors_register) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000 GROUP BY ts_day; +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +=====================================================================================================================================+ | function user.s2_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_157 := language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); | | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | | language.pass(X_120); | | language.pass(X_58); | | language.pass(X_121); | | language.pass(X_60); | | language.pass(X_122); | | language.pass(X_62); | | language.pass(X_123); | | language.pass(X_15); | | language.pass(X_112); | | language.pass(X_113); | | language.pass(X_114); | | language.pass(X_115); | | language.pass(r1_34); | | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s2_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

Miguel Ping

5:59 p.m.

Some more info:

...

SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

...

SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values? On 08/12/2013 05:50 PM, Miguel Ping wrote:

...

Hi, I'm resurrecting this since I've been out of town and only today I got a chance to investigate further. I've recompiled with -O0 to prevent optimizations from "hiding" the values, and in the debugger I got this:

b->U->count BUN 764314 g->U->count BUN 17 b->H->seq oid 0 g->H->seq oid 0 b->H->dense unsigned int 1 g->H->dense unsigned int 1

The stack call is as follows:

BAThllaggr() at udf.c AGGRsubhllaggcand() at udf.c AGGRsubhllagg() at udf.c malCommandCall() at mal_interpreter.c runMALsequence() at mal_interpreter.c DFLOWworker() at mal_dataflow.c start_thread() at pthread_create.c clone() at clone.S 0x0

-------- Original Message -------- Subject: Re: b and g must be aligned Date: Fri, 26 Jul 2013 11:34:53 +0100 From: Sjoerd Mullender Reply-To: Communication channel for MonetDB users To: Communication channel for MonetDB users

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 2013-07-26 12:23, Miguel Ping wrote:

...
On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 14:04, Miguel Ping wrote:

...
...
...
Hi all,

We're hitting this error "b and g must be aligned". I tracked the src to a commit about some alignment code thing in gdk_calc: http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html

(Fix alignment conversion in compatibility code for grouped

...
aggregates.)

Can you guys please explain what's the reason behind this error? I can't understand by just looking to the src of gdk_calc.c

Thanks! When using grouped aggregates, the grouping bat must be aligned with the value bat. The value bat is b and contains the values you want to aggregate. The group bat is g and contains for each value in b the group (an oid) it belongs to. Equal group ids means the same group. These bats must be aligned, because we need to know for each value in b to which group it belongs. Aligned means: same length, and same head column values. The head columns must be dense (a sequence of numbers starting at some value, and each next value exactly one larger than the previous). Dense sequences are usually not stored explicitly in MonetDB. We only store the first value in the hseqbase field. So the hseqbase fields of b and g must be equal. The one exception to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset.

-- Sjoerd Mullender

...
_______________________________________________ users-list mailing listusers-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Thanks for your explanation. I still don't understand how can there be a misalignment. I would expect MonetDB to feed my hll aggregate functions with the correct values. Can it be that I may have some NULL values and the validation is failing because of that?

If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful.

- -- Sjoerd Mullender -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/

iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms GwdvU7bSZtM= =2zSg -----END PGP SIGNATURE----- _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Stefan Manegold

13 Aug 13 Aug

7:51 a.m.

Dear Miguel, I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help. Best, Stefan On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:

...

Some more info:

...
SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

...
SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?

On 08/12/2013 05:50 PM, Miguel Ping wrote:

...
Hi, I'm resurrecting this since I've been out of town and only today I got a chance to investigate further. I've recompiled with -O0 to prevent optimizations from "hiding" the values, and in the debugger I got this:

b->U->count BUN 764314 g->U->count BUN 17 b->H->seq oid 0 g->H->seq oid 0 b->H->dense unsigned int 1 g->H->dense unsigned int 1

The stack call is as follows:

BAThllaggr() at udf.c AGGRsubhllaggcand() at udf.c AGGRsubhllagg() at udf.c malCommandCall() at mal_interpreter.c runMALsequence() at mal_interpreter.c DFLOWworker() at mal_dataflow.c start_thread() at pthread_create.c clone() at clone.S 0x0

-------- Original Message -------- Subject: Re: b and g must be aligned Date: Fri, 26 Jul 2013 11:34:53 +0100 From: Sjoerd Mullender Reply-To: Communication channel for MonetDB users To: Communication channel for MonetDB users

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 2013-07-26 12:23, Miguel Ping wrote:

...
On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 14:04, Miguel Ping wrote:

...
...
...
Hi all,

We're hitting this error "b and g must be aligned". I tracked the src to a commit about some alignment code thing in gdk_calc: http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html

(Fix alignment conversion in compatibility code for grouped

...
aggregates.)

Can you guys please explain what's the reason behind this error? I can't understand by just looking to the src of gdk_calc.c

Thanks! When using grouped aggregates, the grouping bat must be aligned with the value bat. The value bat is b and contains the values you want to aggregate. The group bat is g and contains for each value in b the group (an oid) it belongs to. Equal group ids means the same group. These bats must be aligned, because we need to know for each value in b to which group it belongs. Aligned means: same length, and same head column values. The head columns must be dense (a sequence of numbers starting at some value, and each next value exactly one larger than the previous). Dense sequences are usually not stored explicitly in MonetDB. We only store the first value in the hseqbase field. So the hseqbase fields of b and g must be equal. The one exception to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset.

-- Sjoerd Mullender

...
_______________________________________________ users-list mailing listusers-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Thanks for your explanation. I still don't understand how can there be a misalignment. I would expect MonetDB to feed my hll aggregate functions with the correct values. Can it be that I may have some NULL values and the validation is failing because of that?

If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful.

- -- Sjoerd Mullender -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/

iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms GwdvU7bSZtM= =2zSg -----END PGP SIGNATURE----- _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

...

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Miguel Ping

9:27 a.m.

The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: "If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." Thanks. On 08/13/2013 08:51 AM, Stefan Manegold wrote:

...

Dear Miguel,

I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help.

Best, Stefan

On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:

...
Some more info:

...
SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

...
SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?

On 08/12/2013 05:50 PM, Miguel Ping wrote:

...
Hi, I'm resurrecting this since I've been out of town and only today I got a chance to investigate further. I've recompiled with -O0 to prevent optimizations from "hiding" the values, and in the debugger I got this:

b->U->count BUN 764314 g->U->count BUN 17 b->H->seq oid 0 g->H->seq oid 0 b->H->dense unsigned int 1 g->H->dense unsigned int 1

The stack call is as follows:

BAThllaggr() at udf.c AGGRsubhllaggcand() at udf.c AGGRsubhllagg() at udf.c malCommandCall() at mal_interpreter.c runMALsequence() at mal_interpreter.c DFLOWworker() at mal_dataflow.c start_thread() at pthread_create.c clone() at clone.S 0x0

-------- Original Message -------- Subject: Re: b and g must be aligned Date: Fri, 26 Jul 2013 11:34:53 +0100 From: Sjoerd Mullender Reply-To: Communication channel for MonetDB users To: Communication channel for MonetDB users

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 2013-07-26 12:23, Miguel Ping wrote:

...
On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 14:04, Miguel Ping wrote:

...
...
> Hi all, > > We're hitting this error "b and g must be aligned". I tracked > the src to a commit about some alignment code thing in > gdk_calc: > http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html > > (Fix alignment conversion in compatibility code for grouped > aggregates.) > > Can you guys please explain what's the reason behind this > error? I can't understand by just looking to the src of > gdk_calc.c > > Thanks! When using grouped aggregates, the grouping bat must be aligned with the value bat. The value bat is b and contains the values you want to aggregate. The group bat is g and contains for each value in b the group (an oid) it belongs to. Equal group ids means the same group. These bats must be aligned, because we need to know for each value in b to which group it belongs. Aligned means: same length, and same head column values. The head columns must be dense (a sequence of numbers starting at some value, and each next value exactly one larger than the previous). Dense sequences are usually not stored explicitly in MonetDB. We only store the first value in the hseqbase field. So the hseqbase fields of b and g must be equal. The one exception to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset.

-- Sjoerd Mullender

...
_______________________________________________ users-list mailing listusers-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Thanks for your explanation. I still don't understand how can there be a misalignment. I would expect MonetDB to feed my hll aggregate functions with the correct values. Can it be that I may have some NULL values and the validation is failing because of that? If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful.

- -- Sjoerd Mullender -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/

iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms GwdvU7bSZtM= =2zSg -----END PGP SIGNATURE----- _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Stefan Manegold

9:30 a.m.

Hi Miguel, If it's considered a bug in MonetDB, and unless you've already done so, please file a bug report including all information (data and code as necessary) required to reproduce the problem. Thanks! Stefan ----- Original Message -----

...

The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug:

"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."

Thanks.

On 08/13/2013 08:51 AM, Stefan Manegold wrote:

...
Dear Miguel,

I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help.

Best, Stefan

On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:

...
Some more info:

...
SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

...
SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?

On 08/12/2013 05:50 PM, Miguel Ping wrote:

...
Hi, I'm resurrecting this since I've been out of town and only today I got a chance to investigate further. I've recompiled with -O0 to prevent optimizations from "hiding" the values, and in the debugger I got this:

b->U->count BUN 764314 g->U->count BUN 17 b->H->seq oid 0 g->H->seq oid 0 b->H->dense unsigned int 1 g->H->dense unsigned int 1

The stack call is as follows:

BAThllaggr() at udf.c AGGRsubhllaggcand() at udf.c AGGRsubhllagg() at udf.c malCommandCall() at mal_interpreter.c runMALsequence() at mal_interpreter.c DFLOWworker() at mal_dataflow.c start_thread() at pthread_create.c clone() at clone.S 0x0

-------- Original Message -------- Subject: Re: b and g must be aligned Date: Fri, 26 Jul 2013 11:34:53 +0100 From: Sjoerd Mullender Reply-To: Communication channel for MonetDB users To: Communication channel for MonetDB users

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 2013-07-26 12:23, Miguel Ping wrote:

...
On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 14:04, Miguel Ping wrote:

...
>> Hi all, >> >> We're hitting this error "b and g must be aligned". I tracked >> the src to a commit about some alignment code thing in >> gdk_calc: >> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >> >> (Fix alignment conversion in compatibility code for grouped >> aggregates.) >> >> Can you guys please explain what's the reason behind this >> error? I can't understand by just looking to the src of >> gdk_calc.c >> >> Thanks! When using grouped aggregates, the grouping bat must be aligned with the value bat. The value bat is b and contains the values you want to aggregate. The group bat is g and contains for each value in b the group (an oid) it belongs to. Equal group ids means the same group. These bats must be aligned, because we need to know for each value in b to which group it belongs. Aligned means: same length, and same head column values. The head columns must be dense (a sequence of numbers starting at some value, and each next value exactly one larger than the previous). Dense sequences are usually not stored explicitly in MonetDB. We only store the first value in the hseqbase field. So the hseqbase fields of b and g must be equal. The one exception to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset.

-- Sjoerd Mullender

...
_______________________________________________ users-list mailing listusers-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Thanks for your explanation. I still don't understand how can there be a misalignment. I would expect MonetDB to feed my hll aggregate functions with the correct values. Can it be that I may have some NULL values and the validation is failing because of that? If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful.

- -- Sjoerd Mullender -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/

iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms GwdvU7bSZtM= =2zSg -----END PGP SIGNATURE----- _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Martin Kersten

10:35 a.m.

If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally. Thanks, Martin On 8/13/13 11:27 AM, Miguel Ping wrote:

...

The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug:

"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."

Thanks.

On 08/13/2013 08:51 AM, Stefan Manegold wrote:

...
Dear Miguel,

I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help.

Best, Stefan

On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:

...
Some more info:

...
SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

...
SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?

On 08/12/2013 05:50 PM, Miguel Ping wrote:

...
Hi, I'm resurrecting this since I've been out of town and only today I got a chance to investigate further. I've recompiled with -O0 to prevent optimizations from "hiding" the values, and in the debugger I got this:

b->U->count BUN 764314 g->U->count BUN 17 b->H->seq oid 0 g->H->seq oid 0 b->H->dense unsigned int 1 g->H->dense unsigned int 1

The stack call is as follows:

BAThllaggr() at udf.c AGGRsubhllaggcand() at udf.c AGGRsubhllagg() at udf.c malCommandCall() at mal_interpreter.c runMALsequence() at mal_interpreter.c DFLOWworker() at mal_dataflow.c start_thread() at pthread_create.c clone() at clone.S 0x0

-------- Original Message -------- Subject: Re: b and g must be aligned Date: Fri, 26 Jul 2013 11:34:53 +0100 From: Sjoerd Mullender Reply-To: Communication channel for MonetDB users To: Communication channel for MonetDB users

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 2013-07-26 12:23, Miguel Ping wrote:

...
On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 14:04, Miguel Ping wrote:

...
>> Hi all, >> >> We're hitting this error "b and g must be aligned". I tracked >> the src to a commit about some alignment code thing in >> gdk_calc: >> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >> >> (Fix alignment conversion in compatibility code for grouped >> aggregates.) >> >> Can you guys please explain what's the reason behind this >> error? I can't understand by just looking to the src of >> gdk_calc.c >> >> Thanks! When using grouped aggregates, the grouping bat must be aligned with the value bat. The value bat is b and contains the values you want to aggregate. The group bat is g and contains for each value in b the group (an oid) it belongs to. Equal group ids means the same group. These bats must be aligned, because we need to know for each value in b to which group it belongs. Aligned means: same length, and same head column values. The head columns must be dense (a sequence of numbers starting at some value, and each next value exactly one larger than the previous). Dense sequences are usually not stored explicitly in MonetDB. We only store the first value in the hseqbase field. So the hseqbase fields of b and g must be equal. The one exception to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset.

-- Sjoerd Mullender

...
_______________________________________________ users-list mailing listusers-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Thanks for your explanation. I still don't understand how can there be a misalignment. I would expect MonetDB to feed my hll aggregate functions with the correct values. Can it be that I may have some NULL values and the validation is failing because of that? If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful.

- -- Sjoerd Mullender -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/

iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms GwdvU7bSZtM= =2zSg -----END PGP SIGNATURE----- _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Miguel Ping

2:59 p.m.

I'm trying to come up with a small sample, but it is hard. - I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample. Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) . * Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here: ... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible. Thanks, Miguel On 08/13/2013 11:35 AM, Martin Kersten wrote:

...

If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.

Thanks, Martin

On 8/13/13 11:27 AM, Miguel Ping wrote:

...
The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug:

"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."

Thanks.

On 08/13/2013 08:51 AM, Stefan Manegold wrote:

...
Dear Miguel,

I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help.

Best, Stefan

On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:

...
Some more info:

...
SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

...
SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?

On 08/12/2013 05:50 PM, Miguel Ping wrote:

...
Hi, I'm resurrecting this since I've been out of town and only today I got a chance to investigate further. I've recompiled with -O0 to prevent optimizations from "hiding" the values, and in the debugger I got this:

b->U->count BUN 764314 g->U->count BUN 17 b->H->seq oid 0 g->H->seq oid 0 b->H->dense unsigned int 1 g->H->dense unsigned int 1

The stack call is as follows:

BAThllaggr() at udf.c AGGRsubhllaggcand() at udf.c AGGRsubhllagg() at udf.c malCommandCall() at mal_interpreter.c runMALsequence() at mal_interpreter.c DFLOWworker() at mal_dataflow.c start_thread() at pthread_create.c clone() at clone.S 0x0

-------- Original Message -------- Subject: Re: b and g must be aligned Date: Fri, 26 Jul 2013 11:34:53 +0100 From: Sjoerd Mullender Reply-To: Communication channel for MonetDB users To: Communication channel for MonetDB users

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 2013-07-26 12:23, Miguel Ping wrote:

...
On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 14:04, Miguel Ping wrote: >>> Hi all, >>> >>> We're hitting this error "b and g must be aligned". I tracked >>> the src to a commit about some alignment code thing in >>> gdk_calc: >>> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>> >>> (Fix alignment conversion in compatibility code for grouped >>> aggregates.) >>> >>> Can you guys please explain what's the reason behind this >>> error? I can't understand by just looking to the src of >>> gdk_calc.c >>> >>> Thanks! When using grouped aggregates, the grouping bat must be aligned with the value bat. The value bat is b and contains the values you want to aggregate. The group bat is g and contains for each value in b the group (an oid) it belongs to. Equal group ids means the same group. These bats must be aligned, because we need to know for each value in b to which group it belongs. Aligned means: same length, and same head column values. The head columns must be dense (a sequence of numbers starting at some value, and each next value exactly one larger than the previous). Dense sequences are usually not stored explicitly in MonetDB. We only store the first value in the hseqbase field. So the hseqbase fields of b and g must be equal. The one exception to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset.

-- Sjoerd Mullender > _______________________________________________ users-list > mailing listusers-list@monetdb.org > http://mail.monetdb.org/mailman/listinfo/users-list > > Thanks for your explanation. I still don't understand how can there be a misalignment. I would expect MonetDB to feed my hll aggregate functions with the correct values. Can it be that I may have some NULL values and the validation is failing because of that? If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful.

- -- Sjoerd Mullender -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/

iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms GwdvU7bSZtM= =2zSg -----END PGP SIGNATURE----- _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Miguel Ping

14 Aug 14 Aug

1:04 p.m.

I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step. Where can I learn more about this optimizer? Can someone shed some light? Thanks! -- Miguel On 08/13/2013 03:59 PM, Miguel Ping wrote:

...

I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote:

...
If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.

Thanks, Martin

On 8/13/13 11:27 AM, Miguel Ping wrote:

...
The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug:

"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."

Thanks.

On 08/13/2013 08:51 AM, Stefan Manegold wrote:

...
Dear Miguel,

I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help.

Best, Stefan

On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:

...
Some more info:

...
SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

...
SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?

On 08/12/2013 05:50 PM, Miguel Ping wrote:

...
Hi, I'm resurrecting this since I've been out of town and only today I got a chance to investigate further. I've recompiled with -O0 to prevent optimizations from "hiding" the values, and in the debugger I got this:

b->U->count BUN 764314 g->U->count BUN 17 b->H->seq oid 0 g->H->seq oid 0 b->H->dense unsigned int 1 g->H->dense unsigned int 1

The stack call is as follows:

BAThllaggr() at udf.c AGGRsubhllaggcand() at udf.c AGGRsubhllagg() at udf.c malCommandCall() at mal_interpreter.c runMALsequence() at mal_interpreter.c DFLOWworker() at mal_dataflow.c start_thread() at pthread_create.c clone() at clone.S 0x0

-------- Original Message -------- Subject: Re: b and g must be aligned Date: Fri, 26 Jul 2013 11:34:53 +0100 From: Sjoerd Mullender Reply-To: Communication channel for MonetDB users To: Communication channel for MonetDB users

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 2013-07-26 12:23, Miguel Ping wrote: > On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 > 14:04, Miguel Ping wrote: >>>> Hi all, >>>> >>>> We're hitting this error "b and g must be aligned". I tracked >>>> the src to a commit about some alignment code thing in >>>> gdk_calc: >>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>> >>>> (Fix alignment conversion in compatibility code for grouped >>>> aggregates.) >>>> >>>> Can you guys please explain what's the reason behind this >>>> error? I can't understand by just looking to the src of >>>> gdk_calc.c >>>> >>>> Thanks! > When using grouped aggregates, the grouping bat must be aligned > with the value bat. The value bat is b and contains the values you > want to aggregate. The group bat is g and contains for each value > in b the group (an oid) it belongs to. Equal group ids means the > same group. These bats must be aligned, because we need to know for > each value in b to which group it belongs. Aligned means: same > length, and same head column values. The head columns must be > dense (a sequence of numbers starting at some value, and each next > value exactly one larger than the previous). Dense sequences are > usually not stored explicitly in MonetDB. We only store the first > value in the hseqbase field. So the hseqbase fields of b and g > must be equal. The one exception to this is when the bats are both > empty. This last exception is the change to gdk_calc.c in that > changeset. > > -- Sjoerd Mullender >> _______________________________________________ users-list >> mailing listusers-list@monetdb.org >> http://mail.monetdb.org/mailman/listinfo/users-list >> >> > Thanks for your explanation. I still don't understand how can > there be a misalignment. I would expect MonetDB to feed my hll > aggregate functions with the correct values. Can it be that I may > have some NULL values and the validation is failing because of > that? If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful.

- -- Sjoerd Mullender -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/

iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms GwdvU7bSZtM= =2zSg -----END PGP SIGNATURE----- _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Niels Nes

1:10 p.m.

On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...

I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized). Niels

...

Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote:

...
I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote:

...
If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.

Thanks, Martin

On 8/13/13 11:27 AM, Miguel Ping wrote:

...
The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug:

"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."

Thanks.

On 08/13/2013 08:51 AM, Stefan Manegold wrote:

...
Dear Miguel,

I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help.

Best, Stefan

On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:

...
Some more info: >SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

>SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?

On 08/12/2013 05:50 PM, Miguel Ping wrote: >Hi, I'm resurrecting this since I've been out of town and only >today I got a chance to investigate further. I've recompiled with >-O0 to prevent optimizations from "hiding" the values, and in the >debugger I got this: > >b->U->count BUN 764314 >g->U->count BUN 17 >b->H->seq oid 0 >g->H->seq oid 0 >b->H->dense unsigned int 1 >g->H->dense unsigned int 1 > > >The stack call is as follows: > >BAThllaggr() at udf.c >AGGRsubhllaggcand() at udf.c >AGGRsubhllagg() at udf.c >malCommandCall() at mal_interpreter.c >runMALsequence() at mal_interpreter.c >DFLOWworker() at mal_dataflow.c >start_thread() at pthread_create.c >clone() at clone.S >0x0 > >-------- Original Message -------- >Subject: Re: b and g must be aligned >Date: Fri, 26 Jul 2013 11:34:53 +0100 >From: Sjoerd Mullender >Reply-To: Communication channel for MonetDB users > >To: Communication channel for MonetDB users > > > >-----BEGIN PGP SIGNED MESSAGE----- >Hash: SHA1 > >On 2013-07-26 12:23, Miguel Ping wrote: >>On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>14:04, Miguel Ping wrote: >>>>>Hi all, >>>>> >>>>>We're hitting this error "b and g must be aligned". I tracked >>>>>the src to a commit about some alignment code thing in >>>>>gdk_calc: >>>>>http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>> >>>>> >(Fix alignment conversion in compatibility code for grouped >>>>>aggregates.) >>>>> >>>>>Can you guys please explain what's the reason behind this >>>>>error? I can't understand by just looking to the src of >>>>>gdk_calc.c >>>>> >>>>>Thanks! >>When using grouped aggregates, the grouping bat must be aligned >>with the value bat. The value bat is b and contains the values you >>want to aggregate. The group bat is g and contains for each value >>in b the group (an oid) it belongs to. Equal group ids means the >>same group. These bats must be aligned, because we need to know for >>each value in b to which group it belongs. Aligned means: same >>length, and same head column values. The head columns must be >>dense (a sequence of numbers starting at some value, and each next >>value exactly one larger than the previous). Dense sequences are >>usually not stored explicitly in MonetDB. We only store the first >>value in the hseqbase field. So the hseqbase fields of b and g >>must be equal. The one exception to this is when the bats are both >>empty. This last exception is the change to gdk_calc.c in that >>changeset. >> >>-- Sjoerd Mullender >>>_______________________________________________ users-list >>>mailing listusers-list@monetdb.org >>>http://mail.monetdb.org/mailman/listinfo/users-list >>> >>> >>Thanks for your explanation. I still don't understand how can >>there be a misalignment. I would expect MonetDB to feed my hll >>aggregate functions with the correct values. Can it be that I may >>have some NULL values and the validation is failing because of >>that? >If this happens when running a SQL query, it's a bug. >I don't think NULLs have anything to do with it. NULL values are >stored in-line. >You might want to look at b->U->count, g->U->count, b->H->seq, >g->H->seq, b->H->dense, g->H->dense when the misalignment happens >(either in the debugger or by using printf--but realize that count and >seq are not int, so %d is not going to work). >Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >stack trace might be useful. > >- -- Sjoerd Mullender >-----BEGIN PGP SIGNATURE----- >Version: GnuPG v1.4.13 (GNU/Linux) >Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ > >iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >GwdvU7bSZtM= >=2zSg >-----END PGP SIGNATURE----- >_______________________________________________ >users-list mailing list >users-list@monetdb.org >http://mail.monetdb.org/mailman/listinfo/users-list > > > > > > _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

-- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl

Scott Mathieson

2:18 p.m.

I ran into a similar issue when creating my own aggregate function - the issue was that when using the mitosis pipeline it assumes your sub-aggregation is iterative (but then doesn't properly call it iteratively in the mergetable processing since it only recognises aggregations sum/count/min/max). I worked around this by changing the opt_mergetable processing to call mat.pack() for b, g and e before passing them to my aggregation function. Doing this allows the use of the mitosis pipeline where possible and still supporting non-iterative sub aggregation functions. Whether this could be considered a valid enhancement for MonetDB I'm not sure? If you want your sub-aggregations to run iteratively you need to change opt_mergetable to recognise your sub-aggregation name. It would be good if there was a dedicated namespace available for custom aggregations that user could define iterative and non-iterative sub-aggregations in that would avoid having to change the optimizer code. Regards, Scott PS: The other option for making your sub-aggregation not use the mitosis pipeline is to declare it in the aggr module - that way it gets recognised as an aggregation not supporting the iterative approach. This means that none of your query can use the mitosis pipeline though. -----Original Message----- From: users-list [mailto:users-list-bounces+scott.mathieson=pb.com@monetdb.org] On Behalf Of Niels Nes Sent: 14 August 2013 14:10 To: Communication channel for MonetDB users Subject: Re: b and g must be aligned On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...

I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized). Niels

...

Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote:

...
I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote:

...
If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.

Thanks, Martin

On 8/13/13 11:27 AM, Miguel Ping wrote:

...
The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug:

"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."

Thanks.

On 08/13/2013 08:51 AM, Stefan Manegold wrote:

...
Dear Miguel,

I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help.

Best, Stefan

On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote:

...
Some more info: >SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 764314

>SELECT count(distinct(ts_day)) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND ts<=1374706799000; ==> 15

it seems that b.count is the row count, while g.count is something like the distinct count, with 2 more values?

On 08/12/2013 05:50 PM, Miguel Ping wrote: >Hi, I'm resurrecting this since I've been out of town and only >today I got a chance to investigate further. I've recompiled >with >-O0 to prevent optimizations from "hiding" the values, and in >the debugger I got this: > >b->U->count BUN 764314 >g->U->count BUN 17 >b->H->seq oid 0 >g->H->seq oid 0 >b->H->dense unsigned int 1 >g->H->dense unsigned int 1 > > >The stack call is as follows: > >BAThllaggr() at udf.c >AGGRsubhllaggcand() at udf.c >AGGRsubhllagg() at udf.c >malCommandCall() at mal_interpreter.c >runMALsequence() at mal_interpreter.c >DFLOWworker() at mal_dataflow.c >start_thread() at pthread_create.c >clone() at clone.S >0x0 > >-------- Original Message -------- >Subject: Re: b and g must be aligned >Date: Fri, 26 Jul 2013 11:34:53 +0100 >From: Sjoerd Mullender >Reply-To: Communication channel for MonetDB users > >To: Communication channel for MonetDB users > > > >-----BEGIN PGP SIGNED MESSAGE----- >Hash: SHA1 > >On 2013-07-26 12:23, Miguel Ping wrote: >>On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>14:04, Miguel Ping wrote: >>>>>Hi all, >>>>> >>>>>We're hitting this error "b and g must be aligned". I >>>>>tracked the src to a commit about some alignment code thing >>>>>in >>>>>gdk_calc: >>>>>http://www.mail-archive.com/checkin-list@monetdb.org/msg0973 >>>>>1.html >>>>> >>>>> >(Fix alignment conversion in compatibility code for grouped >>>>>aggregates.) >>>>> >>>>>Can you guys please explain what's the reason behind this >>>>>error? I can't understand by just looking to the src of >>>>>gdk_calc.c >>>>> >>>>>Thanks! >>When using grouped aggregates, the grouping bat must be aligned >>with the value bat. The value bat is b and contains the values >>you want to aggregate. The group bat is g and contains for >>each value in b the group (an oid) it belongs to. Equal group >>ids means the same group. These bats must be aligned, because >>we need to know for each value in b to which group it belongs. >>Aligned means: same length, and same head column values. The >>head columns must be dense (a sequence of numbers starting at >>some value, and each next value exactly one larger than the >>previous). Dense sequences are usually not stored explicitly >>in MonetDB. We only store the first value in the hseqbase >>field. So the hseqbase fields of b and g must be equal. The >>one exception to this is when the bats are both empty. This >>last exception is the change to gdk_calc.c in that changeset. >> >>-- Sjoerd Mullender >>>_______________________________________________ users-list >>>mailing listusers-list@monetdb.org >>>http://mail.monetdb.org/mailman/listinfo/users-list >>> >>> >>Thanks for your explanation. I still don't understand how can >>there be a misalignment. I would expect MonetDB to feed my hll >>aggregate functions with the correct values. Can it be that I >>may have some NULL values and the validation is failing because >>of that? >If this happens when running a SQL query, it's a bug. >I don't think NULLs have anything to do with it. NULL values >are stored in-line. >You might want to look at b->U->count, g->U->count, b->H->seq, >g->H->seq, b->H->dense, g->H->dense when the misalignment >g->H->happens >(either in the debugger or by using printf--but realize that >count and seq are not int, so %d is not going to work). >Also things like the MAL plan (prepend SQL query with EXPLAIN) >and the stack trace might be useful. > >- -- Sjoerd Mullender >-----BEGIN PGP SIGNATURE----- >Version: GnuPG v1.4.13 (GNU/Linux) >Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ > >iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >GwdvU7bSZtM= >=2zSg >-----END PGP SIGNATURE----- >_______________________________________________ >users-list mailing list >users-list@monetdb.org >http://mail.monetdb.org/mailman/listinfo/users-list > > > > > > _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Miguel Ping

2:20 p.m.

Here: *without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +================================================================================================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_68 := language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0); | | (X_33,r1_49) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2); | | X_36 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | | language.pass(X_23); | | language.pass(X_16); | | language.pass(r1_34); | | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+ *with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +=====================================================================================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_157 := language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); | | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | | language.pass(X_120); | | language.pass(X_58); | | language.pass(X_121); | | language.pass(X_60); | | language.pass(X_122); | | language.pass(X_62); | | language.pass(X_123); | | language.pass(X_15); | | language.pass(X_112); | | language.pass(X_113); | | language.pass(X_114); | | language.pass(X_115); | | language.pass(r1_34); | | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+ On 08/14/2013 02:10 PM, Niels Nes wrote:

...

On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...
I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized).

Niels

...
Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote:

...
I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote:

...
If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.

Thanks, Martin

On 8/13/13 11:27 AM, Miguel Ping wrote:

...
The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug:

"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."

Thanks.

On 08/13/2013 08:51 AM, Stefan Manegold wrote:

...
Dear Miguel,

I am not aware of any "hllaggr()" function in the MonetDB release, so I assume the error occurs in your code. Not know your code at all, I'm afraid we cannot be of much help.

Best, Stefan

On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: > Some more info: >> SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE > ts>=1373410800000 AND ts<=1374706799000; > ==> 764314 > >> SELECT count(distinct(ts_day)) FROM > wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND > ts<=1374706799000; > ==> 15 > > it seems that b.count is the row count, while g.count is something > like the distinct count, with 2 more values? > > On 08/12/2013 05:50 PM, Miguel Ping wrote: >> Hi, I'm resurrecting this since I've been out of town and only >> today I got a chance to investigate further. I've recompiled with >> -O0 to prevent optimizations from "hiding" the values, and in the >> debugger I got this: >> >> b->U->count BUN 764314 >> g->U->count BUN 17 >> b->H->seq oid 0 >> g->H->seq oid 0 >> b->H->dense unsigned int 1 >> g->H->dense unsigned int 1 >> >> >> The stack call is as follows: >> >> BAThllaggr() at udf.c >> AGGRsubhllaggcand() at udf.c >> AGGRsubhllagg() at udf.c >> malCommandCall() at mal_interpreter.c >> runMALsequence() at mal_interpreter.c >> DFLOWworker() at mal_dataflow.c >> start_thread() at pthread_create.c >> clone() at clone.S >> 0x0 >> >> -------- Original Message -------- >> Subject: Re: b and g must be aligned >> Date: Fri, 26 Jul 2013 11:34:53 +0100 >> From: Sjoerd Mullender >> Reply-To: Communication channel for MonetDB users >> >> To: Communication channel for MonetDB users >> >> >> >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> On 2013-07-26 12:23, Miguel Ping wrote: >>> On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>> 14:04, Miguel Ping wrote: >>>>>> Hi all, >>>>>> >>>>>> We're hitting this error "b and g must be aligned". I tracked >>>>>> the src to a commit about some alignment code thing in >>>>>> gdk_calc: >>>>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>>> >>>>>> >> (Fix alignment conversion in compatibility code for grouped >>>>>> aggregates.) >>>>>> >>>>>> Can you guys please explain what's the reason behind this >>>>>> error? I can't understand by just looking to the src of >>>>>> gdk_calc.c >>>>>> >>>>>> Thanks! >>> When using grouped aggregates, the grouping bat must be aligned >>> with the value bat. The value bat is b and contains the values you >>> want to aggregate. The group bat is g and contains for each value >>> in b the group (an oid) it belongs to. Equal group ids means the >>> same group. These bats must be aligned, because we need to know for >>> each value in b to which group it belongs. Aligned means: same >>> length, and same head column values. The head columns must be >>> dense (a sequence of numbers starting at some value, and each next >>> value exactly one larger than the previous). Dense sequences are >>> usually not stored explicitly in MonetDB. We only store the first >>> value in the hseqbase field. So the hseqbase fields of b and g >>> must be equal. The one exception to this is when the bats are both >>> empty. This last exception is the change to gdk_calc.c in that >>> changeset. >>> >>> -- Sjoerd Mullender >>>> _______________________________________________ users-list >>>> mailing listusers-list@monetdb.org >>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>> >>>> >>> Thanks for your explanation. I still don't understand how can >>> there be a misalignment. I would expect MonetDB to feed my hll >>> aggregate functions with the correct values. Can it be that I may >>> have some NULL values and the validation is failing because of >>> that? >> If this happens when running a SQL query, it's a bug. >> I don't think NULLs have anything to do with it. NULL values are >> stored in-line. >> You might want to look at b->U->count, g->U->count, b->H->seq, >> g->H->seq, b->H->dense, g->H->dense when the misalignment happens >> (either in the debugger or by using printf--but realize that count and >> seq are not int, so %d is not going to work). >> Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >> stack trace might be useful. >> >> - -- Sjoerd Mullender >> -----BEGIN PGP SIGNATURE----- >> Version: GnuPG v1.4.13 (GNU/Linux) >> Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ >> >> iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >> ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >> PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >> GwdvU7bSZtM= >> =2zSg >> -----END PGP SIGNATURE----- >> _______________________________________________ >> users-list mailing list >> users-list@monetdb.org >> http://mail.monetdb.org/mailman/listinfo/users-list >> >> >> >> >> >> > _______________________________________________ > users-list mailing list > users-list@monetdb.org > http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl

Niels Nes

2:55 p.m.

Miguel As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace. Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable. The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation). Niels On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote:

...

Here:

*without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +================================================================================================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_68 := language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0); | | (X_33,r1_49) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2); | | X_36 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | | language.pass(X_23); | | language.pass(X_16); | | language.pass(r1_34); | | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+

*with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +=====================================================================================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_157 := language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); | | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | | language.pass(X_120); | | language.pass(X_58); | | language.pass(X_121); | | language.pass(X_60); | | language.pass(X_122); | | language.pass(X_62); | | language.pass(X_123); | | language.pass(X_15); | | language.pass(X_112); | | language.pass(X_113); | | language.pass(X_114); | | language.pass(X_115); | | language.pass(r1_34); | | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

On 08/14/2013 02:10 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...
I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized).

Niels

...
Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote:

...
I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote:

...
If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.

Thanks, Martin

On 8/13/13 11:27 AM, Miguel Ping wrote:

...
The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug:

"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful."

Thanks.

On 08/13/2013 08:51 AM, Stefan Manegold wrote: >Dear Miguel, > >I am not aware of any "hllaggr()" function in the MonetDB release, so I >assume the error occurs in your code. Not know your code at all, I'm afraid >we cannot be of much help. > >Best, >Stefan > >On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>Some more info: >>>SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>ts>=1373410800000 AND ts<=1374706799000; >>==> 764314 >> >>>SELECT count(distinct(ts_day)) FROM >>wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>ts<=1374706799000; >>==> 15 >> >>it seems that b.count is the row count, while g.count is something >>like the distinct count, with 2 more values? >> >>On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>Hi, I'm resurrecting this since I've been out of town and only >>>today I got a chance to investigate further. I've recompiled with >>>-O0 to prevent optimizations from "hiding" the values, and in the >>>debugger I got this: >>> >>>b->U->count BUN 764314 >>>g->U->count BUN 17 >>>b->H->seq oid 0 >>>g->H->seq oid 0 >>>b->H->dense unsigned int 1 >>>g->H->dense unsigned int 1 >>> >>> >>>The stack call is as follows: >>> >>>BAThllaggr() at udf.c >>>AGGRsubhllaggcand() at udf.c >>>AGGRsubhllagg() at udf.c >>>malCommandCall() at mal_interpreter.c >>>runMALsequence() at mal_interpreter.c >>>DFLOWworker() at mal_dataflow.c >>>start_thread() at pthread_create.c >>>clone() at clone.S >>>0x0 >>> >>>-------- Original Message -------- >>>Subject: Re: b and g must be aligned >>>Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>From: Sjoerd Mullender >>>Reply-To: Communication channel for MonetDB users >>> >>>To: Communication channel for MonetDB users >>> >>> >>> >>>-----BEGIN PGP SIGNED MESSAGE----- >>>Hash: SHA1 >>> >>>On 2013-07-26 12:23, Miguel Ping wrote: >>>>On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>>>14:04, Miguel Ping wrote: >>>>>>>Hi all, >>>>>>> >>>>>>>We're hitting this error "b and g must be aligned". I tracked >>>>>>>the src to a commit about some alignment code thing in >>>>>>>gdk_calc: >>>>>>>http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>>>> >>>>>>> >>>(Fix alignment conversion in compatibility code for grouped >>>>>>>aggregates.) >>>>>>> >>>>>>>Can you guys please explain what's the reason behind this >>>>>>>error? I can't understand by just looking to the src of >>>>>>>gdk_calc.c >>>>>>> >>>>>>>Thanks! >>>>When using grouped aggregates, the grouping bat must be aligned >>>>with the value bat. The value bat is b and contains the values you >>>>want to aggregate. The group bat is g and contains for each value >>>>in b the group (an oid) it belongs to. Equal group ids means the >>>>same group. These bats must be aligned, because we need to know for >>>>each value in b to which group it belongs. Aligned means: same >>>>length, and same head column values. The head columns must be >>>>dense (a sequence of numbers starting at some value, and each next >>>>value exactly one larger than the previous). Dense sequences are >>>>usually not stored explicitly in MonetDB. We only store the first >>>>value in the hseqbase field. So the hseqbase fields of b and g >>>>must be equal. The one exception to this is when the bats are both >>>>empty. This last exception is the change to gdk_calc.c in that >>>>changeset. >>>> >>>>-- Sjoerd Mullender >>>>>_______________________________________________ users-list >>>>>mailing listusers-list@monetdb.org >>>>>http://mail.monetdb.org/mailman/listinfo/users-list >>>>> >>>>> >>>>Thanks for your explanation. I still don't understand how can >>>>there be a misalignment. I would expect MonetDB to feed my hll >>>>aggregate functions with the correct values. Can it be that I may >>>>have some NULL values and the validation is failing because of >>>>that? >>>If this happens when running a SQL query, it's a bug. >>>I don't think NULLs have anything to do with it. NULL values are >>>stored in-line. >>>You might want to look at b->U->count, g->U->count, b->H->seq, >>>g->H->seq, b->H->dense, g->H->dense when the misalignment happens >>>(either in the debugger or by using printf--but realize that count and >>>seq are not int, so %d is not going to work). >>>Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >>>stack trace might be useful. >>> >>>- -- Sjoerd Mullender >>>-----BEGIN PGP SIGNATURE----- >>>Version: GnuPG v1.4.13 (GNU/Linux) >>>Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ >>> >>>iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >>>ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >>>PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >>>GwdvU7bSZtM= >>>=2zSg >>>-----END PGP SIGNATURE----- >>>_______________________________________________ >>>users-list mailing list >>>users-list@monetdb.org >>>http://mail.monetdb.org/mailman/listinfo/users-list >>> >>> >>> >>> >>> >>> >>_______________________________________________ >>users-list mailing list >>users-list@monetdb.org >>http://mail.monetdb.org/mailman/listinfo/users-list _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Miguel Ping

4:32 p.m.

I renamed the module definition in udf.mal: module batudf; TO module aggr; reinstalled, and it did the trick. I still don't understand how it works with a fresh table. I'm guessing there's some history feeding the mal planner (queryhistory or callhistory steps?). Scott, did you do any tests to check the difference beween iterative and non-iterative versions? And finally, can you share the patch you mention? I'm interested in applying the patch if it's much faster than the "regular" version. Thanks -- Miguel On 08/14/2013 03:55 PM, Niels Nes wrote:

...

Miguel

As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace.

Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable.

The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation).

Niels

On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote:

...
Here:

*without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +================================================================================================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_68 := language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0); | | (X_33,r1_49) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2); | | X_36 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | | language.pass(X_23); | | language.pass(X_16); | | language.pass(r1_34); | | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+

*with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +=====================================================================================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_157 := language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); | | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | | language.pass(X_120); | | language.pass(X_58); | | language.pass(X_121); | | language.pass(X_60); | | language.pass(X_122); | | language.pass(X_62); | | language.pass(X_123); | | language.pass(X_15); | | language.pass(X_112); | | language.pass(X_113); | | language.pass(X_114); | | language.pass(X_115); | | language.pass(r1_34); | | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

On 08/14/2013 02:10 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...
I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized).

Niels

...
Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote:

...
I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote:

...
If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.

Thanks, Martin

On 8/13/13 11:27 AM, Miguel Ping wrote: > The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, > BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: > > "If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment > happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." > > Thanks. > > > On 08/13/2013 08:51 AM, Stefan Manegold wrote: >> Dear Miguel, >> >> I am not aware of any "hllaggr()" function in the MonetDB release, so I >> assume the error occurs in your code. Not know your code at all, I'm afraid >> we cannot be of much help. >> >> Best, >> Stefan >> >> On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>> Some more info: >>>> SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>> ts>=1373410800000 AND ts<=1374706799000; >>> ==> 764314 >>> >>>> SELECT count(distinct(ts_day)) FROM >>> wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>> ts<=1374706799000; >>> ==> 15 >>> >>> it seems that b.count is the row count, while g.count is something >>> like the distinct count, with 2 more values? >>> >>> On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>> Hi, I'm resurrecting this since I've been out of town and only >>>> today I got a chance to investigate further. I've recompiled with >>>> -O0 to prevent optimizations from "hiding" the values, and in the >>>> debugger I got this: >>>> >>>> b->U->count BUN 764314 >>>> g->U->count BUN 17 >>>> b->H->seq oid 0 >>>> g->H->seq oid 0 >>>> b->H->dense unsigned int 1 >>>> g->H->dense unsigned int 1 >>>> >>>> >>>> The stack call is as follows: >>>> >>>> BAThllaggr() at udf.c >>>> AGGRsubhllaggcand() at udf.c >>>> AGGRsubhllagg() at udf.c >>>> malCommandCall() at mal_interpreter.c >>>> runMALsequence() at mal_interpreter.c >>>> DFLOWworker() at mal_dataflow.c >>>> start_thread() at pthread_create.c >>>> clone() at clone.S >>>> 0x0 >>>> >>>> -------- Original Message -------- >>>> Subject: Re: b and g must be aligned >>>> Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>> From: Sjoerd Mullender >>>> Reply-To: Communication channel for MonetDB users >>>> >>>> To: Communication channel for MonetDB users >>>> >>>> >>>> >>>> -----BEGIN PGP SIGNED MESSAGE----- >>>> Hash: SHA1 >>>> >>>> On 2013-07-26 12:23, Miguel Ping wrote: >>>>> On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>>>> 14:04, Miguel Ping wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> We're hitting this error "b and g must be aligned". I tracked >>>>>>>> the src to a commit about some alignment code thing in >>>>>>>> gdk_calc: >>>>>>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>>>>> >>>>>>>> >>>> (Fix alignment conversion in compatibility code for grouped >>>>>>>> aggregates.) >>>>>>>> >>>>>>>> Can you guys please explain what's the reason behind this >>>>>>>> error? I can't understand by just looking to the src of >>>>>>>> gdk_calc.c >>>>>>>> >>>>>>>> Thanks! >>>>> When using grouped aggregates, the grouping bat must be aligned >>>>> with the value bat. The value bat is b and contains the values you >>>>> want to aggregate. The group bat is g and contains for each value >>>>> in b the group (an oid) it belongs to. Equal group ids means the >>>>> same group. These bats must be aligned, because we need to know for >>>>> each value in b to which group it belongs. Aligned means: same >>>>> length, and same head column values. The head columns must be >>>>> dense (a sequence of numbers starting at some value, and each next >>>>> value exactly one larger than the previous). Dense sequences are >>>>> usually not stored explicitly in MonetDB. We only store the first >>>>> value in the hseqbase field. So the hseqbase fields of b and g >>>>> must be equal. The one exception to this is when the bats are both >>>>> empty. This last exception is the change to gdk_calc.c in that >>>>> changeset. >>>>> >>>>> -- Sjoerd Mullender >>>>>> _______________________________________________ users-list >>>>>> mailinglistusers-list@monetdb.org >>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>> >>>>>> >>>>> Thanks for your explanation. I still don't understand how can >>>>> there be a misalignment. I would expect MonetDB to feed my hll >>>>> aggregate functions with the correct values. Can it be that I may >>>>> have some NULL values and the validation is failing because of >>>>> that? >>>> If this happens when running a SQL query, it's a bug. >>>> I don't think NULLs have anything to do with it. NULL values are >>>> stored in-line. >>>> You might want to look at b->U->count, g->U->count, b->H->seq, >>>> g->H->seq, b->H->dense, g->H->dense when the misalignment happens >>>> (either in the debugger or by using printf--but realize that count and >>>> seq are not int, so %d is not going to work). >>>> Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >>>> stack trace might be useful. >>>> >>>> - -- Sjoerd Mullender >>>> -----BEGIN PGP SIGNATURE----- >>>> Version: GnuPG v1.4.13 (GNU/Linux) >>>> Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ >>>> >>>> iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >>>> ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >>>> PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >>>> GwdvU7bSZtM= >>>> =2zSg >>>> -----END PGP SIGNATURE----- >>>> _______________________________________________ >>>> users-list mailing list >>>> users-list@monetdb.org >>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>> >>>> >>>> >>>> >>>> >>>> >>> _______________________________________________ >>> users-list mailing list >>> users-list@monetdb.org >>> http://mail.monetdb.org/mailman/listinfo/users-list > _______________________________________________ > users-list mailing list > users-list@monetdb.org > http://mail.monetdb.org/mailman/listinfo/users-list _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

Niels Nes

4:52 p.m.

On Wed, Aug 14, 2013 at 05:32:02PM +0100, Miguel Ping wrote:

...

I renamed the module definition in udf.mal:

module batudf; TO module aggr;

reinstalled, and it did the trick.

I still don't understand how it works with a fresh table. I'm guessing there's some history feeding the mal planner (queryhistory or callhistory steps?).

mitosis (ie horizontal partitioning) only works for tables with atleast 100k tuples.

...

Scott, did you do any tests to check the difference beween iterative and non-iterative versions? And finally, can you share the patch you mention? I'm interested in applying the patch if it's much faster than the "regular" version.

Niels

...

Thanks -- Miguel

On 08/14/2013 03:55 PM, Niels Nes wrote:

...
Miguel

As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace.

Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable.

The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation).

Niels

On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote:

...
Here:

*without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +================================================================================================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_68 := language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0); | | (X_33,r1_49) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2); | | X_36 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | | language.pass(X_23); | | language.pass(X_16); | | language.pass(r1_34); | | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+

*with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +=====================================================================================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_157 := language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); | | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | | language.pass(X_120); | | language.pass(X_58); | | language.pass(X_121); | | language.pass(X_60); | | language.pass(X_122); | | language.pass(X_62); | | language.pass(X_123); | | language.pass(X_15); | | language.pass(X_112); | | language.pass(X_113); | | language.pass(X_114); | | language.pass(X_115); | | language.pass(r1_34); | | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

On 08/14/2013 02:10 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...
I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized).

Niels

...
Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote:

...
I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote: >If it is hitting before your code, then please provide >the smallest (SQL) test case to reproduce it locally. > >Thanks, Martin > >On 8/13/13 11:27 AM, Miguel Ping wrote: >>The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, >>BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: >> >>"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment >>happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." >> >>Thanks. >> >> >>On 08/13/2013 08:51 AM, Stefan Manegold wrote: >>>Dear Miguel, >>> >>>I am not aware of any "hllaggr()" function in the MonetDB release, so I >>>assume the error occurs in your code. Not know your code at all, I'm afraid >>>we cannot be of much help. >>> >>>Best, >>>Stefan >>> >>>On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>>>Some more info: >>>>>SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>>>ts>=1373410800000 AND ts<=1374706799000; >>>>==> 764314 >>>> >>>>>SELECT count(distinct(ts_day)) FROM >>>>wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>>>ts<=1374706799000; >>>>==> 15 >>>> >>>>it seems that b.count is the row count, while g.count is something >>>>like the distinct count, with 2 more values? >>>> >>>>On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>>>Hi, I'm resurrecting this since I've been out of town and only >>>>>today I got a chance to investigate further. I've recompiled with >>>>>-O0 to prevent optimizations from "hiding" the values, and in the >>>>>debugger I got this: >>>>> >>>>>b->U->count BUN 764314 >>>>>g->U->count BUN 17 >>>>>b->H->seq oid 0 >>>>>g->H->seq oid 0 >>>>>b->H->dense unsigned int 1 >>>>>g->H->dense unsigned int 1 >>>>> >>>>> >>>>>The stack call is as follows: >>>>> >>>>>BAThllaggr() at udf.c >>>>>AGGRsubhllaggcand() at udf.c >>>>>AGGRsubhllagg() at udf.c >>>>>malCommandCall() at mal_interpreter.c >>>>>runMALsequence() at mal_interpreter.c >>>>>DFLOWworker() at mal_dataflow.c >>>>>start_thread() at pthread_create.c >>>>>clone() at clone.S >>>>>0x0 >>>>> >>>>>-------- Original Message -------- >>>>>Subject: Re: b and g must be aligned >>>>>Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>>>From: Sjoerd Mullender >>>>>Reply-To: Communication channel for MonetDB users >>>>> >>>>>To: Communication channel for MonetDB users >>>>> >>>>> >>>>> >>>>>-----BEGIN PGP SIGNED MESSAGE----- >>>>>Hash: SHA1 >>>>> >>>>>On 2013-07-26 12:23, Miguel Ping wrote: >>>>>>On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>>>>>14:04, Miguel Ping wrote: >>>>>>>>>Hi all, >>>>>>>>> >>>>>>>>>We're hitting this error "b and g must be aligned". I tracked >>>>>>>>>the src to a commit about some alignment code thing in >>>>>>>>>gdk_calc: >>>>>>>>>http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>>>>>> >>>>>>>>> >>>>>(Fix alignment conversion in compatibility code for grouped >>>>>>>>>aggregates.) >>>>>>>>> >>>>>>>>>Can you guys please explain what's the reason behind this >>>>>>>>>error? I can't understand by just looking to the src of >>>>>>>>>gdk_calc.c >>>>>>>>> >>>>>>>>>Thanks! >>>>>>When using grouped aggregates, the grouping bat must be aligned >>>>>>with the value bat. The value bat is b and contains the values you >>>>>>want to aggregate. The group bat is g and contains for each value >>>>>>in b the group (an oid) it belongs to. Equal group ids means the >>>>>>same group. These bats must be aligned, because we need to know for >>>>>>each value in b to which group it belongs. Aligned means: same >>>>>>length, and same head column values. The head columns must be >>>>>>dense (a sequence of numbers starting at some value, and each next >>>>>>value exactly one larger than the previous). Dense sequences are >>>>>>usually not stored explicitly in MonetDB. We only store the first >>>>>>value in the hseqbase field. So the hseqbase fields of b and g >>>>>>must be equal. The one exception to this is when the bats are both >>>>>>empty. This last exception is the change to gdk_calc.c in that >>>>>>changeset. >>>>>> >>>>>>-- Sjoerd Mullender >>>>>>>_______________________________________________ users-list >>>>>>>mailinglistusers-list@monetdb.org >>>>>>>http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>> >>>>>>> >>>>>>Thanks for your explanation. I still don't understand how can >>>>>>there be a misalignment. I would expect MonetDB to feed my hll >>>>>>aggregate functions with the correct values. Can it be that I may >>>>>>have some NULL values and the validation is failing because of >>>>>>that? >>>>>If this happens when running a SQL query, it's a bug. >>>>>I don't think NULLs have anything to do with it. NULL values are >>>>>stored in-line. >>>>>You might want to look at b->U->count, g->U->count, b->H->seq, >>>>>g->H->seq, b->H->dense, g->H->dense when the misalignment happens >>>>>(either in the debugger or by using printf--but realize that count and >>>>>seq are not int, so %d is not going to work). >>>>>Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >>>>>stack trace might be useful. >>>>> >>>>>- -- Sjoerd Mullender >>>>>-----BEGIN PGP SIGNATURE----- >>>>>Version: GnuPG v1.4.13 (GNU/Linux) >>>>>Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ >>>>> >>>>>iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >>>>>ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >>>>>PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >>>>>GwdvU7bSZtM= >>>>>=2zSg >>>>>-----END PGP SIGNATURE----- >>>>>_______________________________________________ >>>>>users-list mailing list >>>>>users-list@monetdb.org >>>>>http://mail.monetdb.org/mailman/listinfo/users-list >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>_______________________________________________ >>>>users-list mailing list >>>>users-list@monetdb.org >>>>http://mail.monetdb.org/mailman/listinfo/users-list >>_______________________________________________ >>users-list mailing list >>users-list@monetdb.org >>http://mail.monetdb.org/mailman/listinfo/users-list >_______________________________________________ >users-list mailing list >users-list@monetdb.org >http://mail.monetdb.org/mailman/listinfo/users-list > > _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Miguel Ping

5:10 p.m.

Oh, that explains it. Although the resultset is only 50k, the initial table is much larger. Thanks. On 08/14/2013 05:52 PM, Niels Nes wrote:

...

On Wed, Aug 14, 2013 at 05:32:02PM +0100, Miguel Ping wrote:

...
I renamed the module definition in udf.mal:

module batudf; TO module aggr;

reinstalled, and it did the trick.

I still don't understand how it works with a fresh table. I'm guessing there's some history feeding the mal planner (queryhistory or callhistory steps?). mitosis (ie horizontal partitioning) only works for tables with atleast 100k tuples. Scott, did you do any tests to check the difference beween iterative and non-iterative versions? And finally, can you share the patch you mention? I'm interested in applying the patch if it's much faster than the "regular" version. Niels Thanks -- Miguel

On 08/14/2013 03:55 PM, Niels Nes wrote:

...
Miguel

As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace.

Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable.

The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation).

Niels

On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote:

...
Here:

*without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +================================================================================================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_68 := language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0); | | (X_33,r1_49) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2); | | X_36 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | | language.pass(X_23); | | language.pass(X_16); | | language.pass(r1_34); | | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+

*with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +=====================================================================================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_157 := language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); | | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | | language.pass(X_120); | | language.pass(X_58); | | language.pass(X_121); | | language.pass(X_60); | | language.pass(X_122); | | language.pass(X_62); | | language.pass(X_123); | | language.pass(X_15); | | language.pass(X_112); | | language.pass(X_113); | | language.pass(X_114); | | language.pass(X_115); | | language.pass(r1_34); | | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

On 08/14/2013 02:10 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...
I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized).

Niels

...
Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote: > I'm trying to come up with a small sample, but it is hard. > > - I currently have a small dataset (~600 rows) which gives me the error. > - BUT if I export that data onto a *new* table, the error doesn't > happen. This makes it hard to provide a proper sample. > > Running explain on the same query for these two tables, the plan is > different: the error plan seems to use the new subgroup feature for > aggregates which was released in a recent MonetDB version (11.15.x?) . > > * Does monetdb keep some sort of internal table statistics that feed the > MAL planner? > * Can I force the query to use the 'old' aggregate function? > * Can you guys point me to the part where the group BAT is calculated? > my guess is that the group ids are not being calculated correctly, hence > the difference between g->U->count and b->U->count. I'm guessing the > culprit is around here: > > ... > | (X_16,r1_34,X_140) := group.subgroupdone(X_15); > | X_18 := algebra.leftfetchjoin(r1_34,X_15); > ... > | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); > #r1_r4 should be the bat* gid > > > I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' > aggregate definition) and the query works as expected. > I don't want to do a downgrade since I don't even know if the data files > are compatible. > > Thanks, > Miguel > > On 08/13/2013 11:35 AM, Martin Kersten wrote: >> If it is hitting before your code, then please provide >> the smallest (SQL) test case to reproduce it locally. >> >> Thanks, Martin >> >> On 8/13/13 11:27 AM, Miguel Ping wrote: >>> The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, >>> BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: >>> >>> "If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment >>> happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." >>> >>> Thanks. >>> >>> >>> On 08/13/2013 08:51 AM, Stefan Manegold wrote: >>>> Dear Miguel, >>>> >>>> I am not aware of any "hllaggr()" function in the MonetDB release, so I >>>> assume the error occurs in your code. Not know your code at all, I'm afraid >>>> we cannot be of much help. >>>> >>>> Best, >>>> Stefan >>>> >>>> On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>>>> Some more info: >>>>>> SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>>>> ts>=1373410800000 AND ts<=1374706799000; >>>>> ==> 764314 >>>>> >>>>>> SELECT count(distinct(ts_day)) FROM >>>>> wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>>>> ts<=1374706799000; >>>>> ==> 15 >>>>> >>>>> it seems that b.count is the row count, while g.count is something >>>>> like the distinct count, with 2 more values? >>>>> >>>>> On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>>>> Hi, I'm resurrecting this since I've been out of town and only >>>>>> today I got a chance to investigate further. I've recompiled with >>>>>> -O0 to prevent optimizations from "hiding" the values, and in the >>>>>> debugger I got this: >>>>>> >>>>>> b->U->count BUN 764314 >>>>>> g->U->count BUN 17 >>>>>> b->H->seq oid 0 >>>>>> g->H->seq oid 0 >>>>>> b->H->dense unsigned int 1 >>>>>> g->H->dense unsigned int 1 >>>>>> >>>>>> >>>>>> The stack call is as follows: >>>>>> >>>>>> BAThllaggr() at udf.c >>>>>> AGGRsubhllaggcand() at udf.c >>>>>> AGGRsubhllagg() at udf.c >>>>>> malCommandCall() at mal_interpreter.c >>>>>> runMALsequence() at mal_interpreter.c >>>>>> DFLOWworker() at mal_dataflow.c >>>>>> start_thread() at pthread_create.c >>>>>> clone() at clone.S >>>>>> 0x0 >>>>>> >>>>>> -------- Original Message -------- >>>>>> Subject: Re: b and g must be aligned >>>>>> Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>>>> From: Sjoerd Mullender >>>>>> Reply-To: Communication channel for MonetDB users >>>>>> >>>>>> To: Communication channel for MonetDB users >>>>>> >>>>>> >>>>>> >>>>>> -----BEGIN PGP SIGNED MESSAGE----- >>>>>> Hash: SHA1 >>>>>> >>>>>> On 2013-07-26 12:23, Miguel Ping wrote: >>>>>>> On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>>>>>> 14:04, Miguel Ping wrote: >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> We're hitting this error "b and g must be aligned". I tracked >>>>>>>>>> the src to a commit about some alignment code thing in >>>>>>>>>> gdk_calc: >>>>>>>>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>>>>>>> >>>>>>>>>> >>>>>> (Fix alignment conversion in compatibility code for grouped >>>>>>>>>> aggregates.) >>>>>>>>>> >>>>>>>>>> Can you guys please explain what's the reason behind this >>>>>>>>>> error? I can't understand by just looking to the src of >>>>>>>>>> gdk_calc.c >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>> When using grouped aggregates, the grouping bat must be aligned >>>>>>> with the value bat. The value bat is b and contains the values you >>>>>>> want to aggregate. The group bat is g and contains for each value >>>>>>> in b the group (an oid) it belongs to. Equal group ids means the >>>>>>> same group. These bats must be aligned, because we need to know for >>>>>>> each value in b to which group it belongs. Aligned means: same >>>>>>> length, and same head column values. The head columns must be >>>>>>> dense (a sequence of numbers starting at some value, and each next >>>>>>> value exactly one larger than the previous). Dense sequences are >>>>>>> usually not stored explicitly in MonetDB. We only store the first >>>>>>> value in the hseqbase field. So the hseqbase fields of b and g >>>>>>> must be equal. The one exception to this is when the bats are both >>>>>>> empty. This last exception is the change to gdk_calc.c in that >>>>>>> changeset. >>>>>>> >>>>>>> -- Sjoerd Mullender >>>>>>>> _______________________________________________ users-list >>>>>>>> mailinglistusers-list@monetdb.org >>>>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>>> >>>>>>>> >>>>>>> Thanks for your explanation. I still don't understand how can >>>>>>> there be a misalignment. I would expect MonetDB to feed my hll >>>>>>> aggregate functions with the correct values. Can it be that I may >>>>>>> have some NULL values and the validation is failing because of >>>>>>> that? >>>>>> If this happens when running a SQL query, it's a bug. >>>>>> I don't think NULLs have anything to do with it. NULL values are >>>>>> stored in-line. >>>>>> You might want to look at b->U->count, g->U->count, b->H->seq, >>>>>> g->H->seq, b->H->dense, g->H->dense when the misalignment happens >>>>>> (either in the debugger or by using printf--but realize that count and >>>>>> seq are not int, so %d is not going to work). >>>>>> Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >>>>>> stack trace might be useful. >>>>>> >>>>>> - -- Sjoerd Mullender >>>>>> -----BEGIN PGP SIGNATURE----- >>>>>> Version: GnuPG v1.4.13 (GNU/Linux) >>>>>> Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ >>>>>> >>>>>> iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >>>>>> ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >>>>>> PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >>>>>> GwdvU7bSZtM= >>>>>> =2zSg >>>>>> -----END PGP SIGNATURE----- >>>>>> _______________________________________________ >>>>>> users-list mailing list >>>>>> users-list@monetdb.org >>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> users-list mailing list >>>>> users-list@monetdb.org >>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>> _______________________________________________ >>> users-list mailing list >>> users-list@monetdb.org >>> http://mail.monetdb.org/mailman/listinfo/users-list >> _______________________________________________ >> users-list mailing list >> users-list@monetdb.org >> http://mail.monetdb.org/mailman/listinfo/users-list >> >> > _______________________________________________ > users-list mailing list > users-list@monetdb.org > http://mail.monetdb.org/mailman/listinfo/users-list > > _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl

Miguel Ping

16 Aug 16 Aug

11:19 a.m.

Just a final question? how can I remove the old aggregate? If I recompile the new aggregate under the new module namespace, the old one still exists and the query doesn't work. I've tried looking in sys.functions and updating the 'mod' column for the aggregate but that's not enough. If I try to re-create the aggregate I hit the following:

...

CREATE AGGREGATE: name 'hllagg' (clob) already in use

Thanks! On 08/14/2013 06:10 PM, Miguel Ping wrote:

...

Oh, that explains it. Although the resultset is only 50k, the initial table is much larger. Thanks.

On 08/14/2013 05:52 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 05:32:02PM +0100, Miguel Ping wrote:

...
I renamed the module definition in udf.mal:

module batudf; TO module aggr;

reinstalled, and it did the trick.

I still don't understand how it works with a fresh table. I'm guessing there's some history feeding the mal planner (queryhistory or callhistory steps?). mitosis (ie horizontal partitioning) only works for tables with atleast 100k tuples. Scott, did you do any tests to check the difference beween iterative and non-iterative versions? And finally, can you share the patch you mention? I'm interested in applying the patch if it's much faster than the "regular" version. Niels Thanks -- Miguel

On 08/14/2013 03:55 PM, Niels Nes wrote:

...
Miguel

As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace.

Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable.

The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation).

Niels

On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote:

...
Here:

*without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +================================================================================================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_68 := language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0); | | (X_33,r1_49) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2); | | X_36 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | | language.pass(X_23); | | language.pass(X_16); | | language.pass(r1_34); | | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+

*with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +=====================================================================================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_157 := language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); | | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | | language.pass(X_120); | | language.pass(X_58); | | language.pass(X_121); | | language.pass(X_60); | | language.pass(X_122); | | language.pass(X_62); | | language.pass(X_123); | | language.pass(X_15); | | language.pass(X_112); | | language.pass(X_113); | | language.pass(X_114); | | language.pass(X_115); | | language.pass(r1_34); | | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

On 08/14/2013 02:10 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote: > I've been tinkering a little more, if I use either the minimal_pipe, > no_mitosis_pipe or the sequential_pipe optimizers the error no > longer occurs. I've specified the set of optimizers to use, and it > seems that the optimizer step that's problematic is the *mitosis* > step. > > Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized).

Niels > Thanks! > -- > Miguel > > > > On 08/13/2013 03:59 PM, Miguel Ping wrote: >> I'm trying to come up with a small sample, but it is hard. >> >> - I currently have a small dataset (~600 rows) which gives me the error. >> - BUT if I export that data onto a *new* table, the error doesn't >> happen. This makes it hard to provide a proper sample. >> >> Running explain on the same query for these two tables, the plan is >> different: the error plan seems to use the new subgroup feature for >> aggregates which was released in a recent MonetDB version (11.15.x?) . >> >> * Does monetdb keep some sort of internal table statistics that feed the >> MAL planner? >> * Can I force the query to use the 'old' aggregate function? >> * Can you guys point me to the part where the group BAT is calculated? >> my guess is that the group ids are not being calculated correctly, hence >> the difference between g->U->count and b->U->count. I'm guessing the >> culprit is around here: >> >> ... >> | (X_16,r1_34,X_140) := group.subgroupdone(X_15); >> | X_18 := algebra.leftfetchjoin(r1_34,X_15); >> ... >> | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); >> #r1_r4 should be the bat* gid >> >> >> I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' >> aggregate definition) and the query works as expected. >> I don't want to do a downgrade since I don't even know if the data files >> are compatible. >> >> Thanks, >> Miguel >> >> On 08/13/2013 11:35 AM, Martin Kersten wrote: >>> If it is hitting before your code, then please provide >>> the smallest (SQL) test case to reproduce it locally. >>> >>> Thanks, Martin >>> >>> On 8/13/13 11:27 AM, Miguel Ping wrote: >>>> The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, >>>> BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: >>>> >>>> "If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment >>>> happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." >>>> >>>> Thanks. >>>> >>>> >>>> On 08/13/2013 08:51 AM, Stefan Manegold wrote: >>>>> Dear Miguel, >>>>> >>>>> I am not aware of any "hllaggr()" function in the MonetDB release, so I >>>>> assume the error occurs in your code. Not know your code at all, I'm afraid >>>>> we cannot be of much help. >>>>> >>>>> Best, >>>>> Stefan >>>>> >>>>> On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>>>>> Some more info: >>>>>>> SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>>>>> ts>=1373410800000 AND ts<=1374706799000; >>>>>> ==> 764314 >>>>>> >>>>>>> SELECT count(distinct(ts_day)) FROM >>>>>> wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>>>>> ts<=1374706799000; >>>>>> ==> 15 >>>>>> >>>>>> it seems that b.count is the row count, while g.count is something >>>>>> like the distinct count, with 2 more values? >>>>>> >>>>>> On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>>>>> Hi, I'm resurrecting this since I've been out of town and only >>>>>>> today I got a chance to investigate further. I've recompiled with >>>>>>> -O0 to prevent optimizations from "hiding" the values, and in the >>>>>>> debugger I got this: >>>>>>> >>>>>>> b->U->count BUN 764314 >>>>>>> g->U->count BUN 17 >>>>>>> b->H->seq oid 0 >>>>>>> g->H->seq oid 0 >>>>>>> b->H->dense unsigned int 1 >>>>>>> g->H->dense unsigned int 1 >>>>>>> >>>>>>> >>>>>>> The stack call is as follows: >>>>>>> >>>>>>> BAThllaggr() at udf.c >>>>>>> AGGRsubhllaggcand() at udf.c >>>>>>> AGGRsubhllagg() at udf.c >>>>>>> malCommandCall() at mal_interpreter.c >>>>>>> runMALsequence() at mal_interpreter.c >>>>>>> DFLOWworker() at mal_dataflow.c >>>>>>> start_thread() at pthread_create.c >>>>>>> clone() at clone.S >>>>>>> 0x0 >>>>>>> >>>>>>> -------- Original Message -------- >>>>>>> Subject: Re: b and g must be aligned >>>>>>> Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>>>>> From: Sjoerd Mullender >>>>>>> Reply-To: Communication channel for MonetDB users >>>>>>> >>>>>>> To: Communication channel for MonetDB users >>>>>>> >>>>>>> >>>>>>> >>>>>>> -----BEGIN PGP SIGNED MESSAGE----- >>>>>>> Hash: SHA1 >>>>>>> >>>>>>> On 2013-07-26 12:23, Miguel Ping wrote: >>>>>>>> On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>>>>>>> 14:04, Miguel Ping wrote: >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> We're hitting this error "b and g must be aligned". I tracked >>>>>>>>>>> the src to a commit about some alignment code thing in >>>>>>>>>>> gdk_calc: >>>>>>>>>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>>>>>>>> >>>>>>>>>>> >>>>>>> (Fix alignment conversion in compatibility code for grouped >>>>>>>>>>> aggregates.) >>>>>>>>>>> >>>>>>>>>>> Can you guys please explain what's the reason behind this >>>>>>>>>>> error? I can't understand by just looking to the src of >>>>>>>>>>> gdk_calc.c >>>>>>>>>>> >>>>>>>>>>> Thanks! >>>>>>>> When using grouped aggregates, the grouping bat must be aligned >>>>>>>> with the value bat. The value bat is b and contains the values you >>>>>>>> want to aggregate. The group bat is g and contains for each value >>>>>>>> in b the group (an oid) it belongs to. Equal group ids means the >>>>>>>> same group. These bats must be aligned, because we need to know for >>>>>>>> each value in b to which group it belongs. Aligned means: same >>>>>>>> length, and same head column values. The head columns must be >>>>>>>> dense (a sequence of numbers starting at some value, and each next >>>>>>>> value exactly one larger than the previous). Dense sequences are >>>>>>>> usually not stored explicitly in MonetDB. We only store the first >>>>>>>> value in the hseqbase field. So the hseqbase fields of b and g >>>>>>>> must be equal. The one exception to this is when the bats are both >>>>>>>> empty. This last exception is the change to gdk_calc.c in that >>>>>>>> changeset. >>>>>>>> >>>>>>>> -- Sjoerd Mullender >>>>>>>>> _______________________________________________ users-list >>>>>>>>> mailinglistusers-list@monetdb.org >>>>>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>>>> >>>>>>>>> >>>>>>>> Thanks for your explanation. I still don't understand how can >>>>>>>> there be a misalignment. I would expect MonetDB to feed my hll >>>>>>>> aggregate functions with the correct values. Can it be that I may >>>>>>>> have some NULL values and the validation is failing because of >>>>>>>> that? >>>>>>> If this happens when running a SQL query, it's a bug. >>>>>>> I don't think NULLs have anything to do with it. NULL values are >>>>>>> stored in-line. >>>>>>> You might want to look at b->U->count, g->U->count, b->H->seq, >>>>>>> g->H->seq, b->H->dense, g->H->dense when the misalignment happens >>>>>>> (either in the debugger or by using printf--but realize that count and >>>>>>> seq are not int, so %d is not going to work). >>>>>>> Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >>>>>>> stack trace might be useful. >>>>>>> >>>>>>> - -- Sjoerd Mullender >>>>>>> -----BEGIN PGP SIGNATURE----- >>>>>>> Version: GnuPG v1.4.13 (GNU/Linux) >>>>>>> Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ >>>>>>> >>>>>>> iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >>>>>>> ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >>>>>>> PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >>>>>>> GwdvU7bSZtM= >>>>>>> =2zSg >>>>>>> -----END PGP SIGNATURE----- >>>>>>> _______________________________________________ >>>>>>> users-list mailing list >>>>>>> users-list@monetdb.org >>>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> users-list mailing list >>>>>> users-list@monetdb.org >>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>> _______________________________________________ >>>> users-list mailing list >>>> users-list@monetdb.org >>>> http://mail.monetdb.org/mailman/listinfo/users-list >>> _______________________________________________ >>> users-list mailing list >>> users-list@monetdb.org >>> http://mail.monetdb.org/mailman/listinfo/users-list >>> >>> >> _______________________________________________ >> users-list mailing list >> users-list@monetdb.org >> http://mail.monetdb.org/mailman/listinfo/users-list >> >> > _______________________________________________ > users-list mailing list > users-list@monetdb.org > http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Niels Nes

11:30 a.m.

On Fri, Aug 16, 2013 at 12:19:08PM +0100, Miguel Ping wrote:

...

Just a final question? how can I remove the old aggregate?

If I recompile the new aggregate under the new module namespace, the old one still exists and the query doesn't work. I've tried looking in sys.functions and updating the 'mod' column for the aggregate but that's not enough.

If I try to re-create the aggregate I hit the following:

...
CREATE AGGREGATE: name 'hllagg' (clob) already in use Doesn't DROP AGGREGATE work?

Niels

...

Thanks!

On 08/14/2013 06:10 PM, Miguel Ping wrote:

...
Oh, that explains it. Although the resultset is only 50k, the initial table is much larger. Thanks.

On 08/14/2013 05:52 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 05:32:02PM +0100, Miguel Ping wrote:

...
I renamed the module definition in udf.mal:

module batudf; TO module aggr;

reinstalled, and it did the trick.

I still don't understand how it works with a fresh table. I'm guessing there's some history feeding the mal planner (queryhistory or callhistory steps?). mitosis (ie horizontal partitioning) only works for tables with atleast 100k tuples. Scott, did you do any tests to check the difference beween iterative and non-iterative versions? And finally, can you share the patch you mention? I'm interested in applying the patch if it's much faster than the "regular" version. Niels Thanks -- Miguel

On 08/14/2013 03:55 PM, Niels Nes wrote:

...
Miguel

As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace.

Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable.

The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation).

Niels

On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote:

...
Here:

*without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +================================================================================================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_68 := language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0); | | (X_33,r1_49) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2); | | X_36 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | | language.pass(X_23); | | language.pass(X_16); | | language.pass(r1_34); | | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+

*with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +=====================================================================================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | | barrier X_157 := language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); | | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | | language.pass(X_120); | | language.pass(X_58); | | language.pass(X_121); | | language.pass(X_60); | | language.pass(X_122); | | language.pass(X_62); | | language.pass(X_123); | | language.pass(X_15); | | language.pass(X_112); | | language.pass(X_113); | | language.pass(X_114); | | language.pass(X_115); | | language.pass(r1_34); | | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

On 08/14/2013 02:10 PM, Niels Nes wrote: >On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote: >>I've been tinkering a little more, if I use either the minimal_pipe, >>no_mitosis_pipe or the sequential_pipe optimizers the error no >>longer occurs. I've specified the set of optimizers to use, and it >>seems that the optimizer step that's problematic is the *mitosis* >>step. >> >>Where can I learn more about this optimizer? Can someone shed some light? >Could you send us the explain outputs of both with and without >mitosis/mergetable. This may give us a hint of what goes wrong. > >My guess, is that the subhllagg should probably be added (recognized). > >Niels >>Thanks! >>-- >>Miguel >> >> >> >>On 08/13/2013 03:59 PM, Miguel Ping wrote: >>>I'm trying to come up with a small sample, but it is hard. >>> >>>- I currently have a small dataset (~600 rows) which gives me the error. >>>- BUT if I export that data onto a *new* table, the error doesn't >>>happen. This makes it hard to provide a proper sample. >>> >>>Running explain on the same query for these two tables, the plan is >>>different: the error plan seems to use the new subgroup feature for >>>aggregates which was released in a recent MonetDB version (11.15.x?) . >>> >>>* Does monetdb keep some sort of internal table statistics that feed the >>>MAL planner? >>>* Can I force the query to use the 'old' aggregate function? >>>* Can you guys point me to the part where the group BAT is calculated? >>>my guess is that the group ids are not being calculated correctly, hence >>>the difference between g->U->count and b->U->count. I'm guessing the >>>culprit is around here: >>> >>>... >>>| (X_16,r1_34,X_140) := group.subgroupdone(X_15); >>>| X_18 := algebra.leftfetchjoin(r1_34,X_15); >>>... >>>| X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); >>>#r1_r4 should be the bat* gid >>> >>> >>>I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' >>>aggregate definition) and the query works as expected. >>>I don't want to do a downgrade since I don't even know if the data files >>>are compatible. >>> >>>Thanks, >>>Miguel >>> >>>On 08/13/2013 11:35 AM, Martin Kersten wrote: >>>>If it is hitting before your code, then please provide >>>>the smallest (SQL) test case to reproduce it locally. >>>> >>>>Thanks, Martin >>>> >>>>On 8/13/13 11:27 AM, Miguel Ping wrote: >>>>>The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, >>>>>BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: >>>>> >>>>>"If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment >>>>>happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." >>>>> >>>>>Thanks. >>>>> >>>>> >>>>>On 08/13/2013 08:51 AM, Stefan Manegold wrote: >>>>>>Dear Miguel, >>>>>> >>>>>>I am not aware of any "hllaggr()" function in the MonetDB release, so I >>>>>>assume the error occurs in your code. Not know your code at all, I'm afraid >>>>>>we cannot be of much help. >>>>>> >>>>>>Best, >>>>>>Stefan >>>>>> >>>>>>On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>>>>>>Some more info: >>>>>>>>SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>>>>>>ts>=1373410800000 AND ts<=1374706799000; >>>>>>>==> 764314 >>>>>>> >>>>>>>>SELECT count(distinct(ts_day)) FROM >>>>>>>wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>>>>>>ts<=1374706799000; >>>>>>>==> 15 >>>>>>> >>>>>>>it seems that b.count is the row count, while g.count is something >>>>>>>like the distinct count, with 2 more values? >>>>>>> >>>>>>>On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>>>>>>Hi, I'm resurrecting this since I've been out of town and only >>>>>>>>today I got a chance to investigate further. I've recompiled with >>>>>>>>-O0 to prevent optimizations from "hiding" the values, and in the >>>>>>>>debugger I got this: >>>>>>>> >>>>>>>>b->U->count BUN 764314 >>>>>>>>g->U->count BUN 17 >>>>>>>>b->H->seq oid 0 >>>>>>>>g->H->seq oid 0 >>>>>>>>b->H->dense unsigned int 1 >>>>>>>>g->H->dense unsigned int 1 >>>>>>>> >>>>>>>> >>>>>>>>The stack call is as follows: >>>>>>>> >>>>>>>>BAThllaggr() at udf.c >>>>>>>>AGGRsubhllaggcand() at udf.c >>>>>>>>AGGRsubhllagg() at udf.c >>>>>>>>malCommandCall() at mal_interpreter.c >>>>>>>>runMALsequence() at mal_interpreter.c >>>>>>>>DFLOWworker() at mal_dataflow.c >>>>>>>>start_thread() at pthread_create.c >>>>>>>>clone() at clone.S >>>>>>>>0x0 >>>>>>>> >>>>>>>>-------- Original Message -------- >>>>>>>>Subject: Re: b and g must be aligned >>>>>>>>Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>>>>>>From: Sjoerd Mullender >>>>>>>>Reply-To: Communication channel for MonetDB users >>>>>>>> >>>>>>>>To: Communication channel for MonetDB users >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>-----BEGIN PGP SIGNED MESSAGE----- >>>>>>>>Hash: SHA1 >>>>>>>> >>>>>>>>On 2013-07-26 12:23, Miguel Ping wrote: >>>>>>>>>On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>>>>>>>>14:04, Miguel Ping wrote: >>>>>>>>>>>>Hi all, >>>>>>>>>>>> >>>>>>>>>>>>We're hitting this error "b and g must be aligned". I tracked >>>>>>>>>>>>the src to a commit about some alignment code thing in >>>>>>>>>>>>gdk_calc: >>>>>>>>>>>>http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>(Fix alignment conversion in compatibility code for grouped >>>>>>>>>>>>aggregates.) >>>>>>>>>>>> >>>>>>>>>>>>Can you guys please explain what's the reason behind this >>>>>>>>>>>>error? I can't understand by just looking to the src of >>>>>>>>>>>>gdk_calc.c >>>>>>>>>>>> >>>>>>>>>>>>Thanks! >>>>>>>>>When using grouped aggregates, the grouping bat must be aligned >>>>>>>>>with the value bat. The value bat is b and contains the values you >>>>>>>>>want to aggregate. The group bat is g and contains for each value >>>>>>>>>in b the group (an oid) it belongs to. Equal group ids means the >>>>>>>>>same group. These bats must be aligned, because we need to know for >>>>>>>>>each value in b to which group it belongs. Aligned means: same >>>>>>>>>length, and same head column values. The head columns must be >>>>>>>>>dense (a sequence of numbers starting at some value, and each next >>>>>>>>>value exactly one larger than the previous). Dense sequences are >>>>>>>>>usually not stored explicitly in MonetDB. We only store the first >>>>>>>>>value in the hseqbase field. So the hseqbase fields of b and g >>>>>>>>>must be equal. The one exception to this is when the bats are both >>>>>>>>>empty. This last exception is the change to gdk_calc.c in that >>>>>>>>>changeset. >>>>>>>>> >>>>>>>>>-- Sjoerd Mullender >>>>>>>>>>_______________________________________________ users-list >>>>>>>>>>mailinglistusers-list@monetdb.org >>>>>>>>>>http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>>>>> >>>>>>>>>> >>>>>>>>>Thanks for your explanation. I still don't understand how can >>>>>>>>>there be a misalignment. I would expect MonetDB to feed my hll >>>>>>>>>aggregate functions with the correct values. Can it be that I may >>>>>>>>>have some NULL values and the validation is failing because of >>>>>>>>>that? >>>>>>>>If this happens when running a SQL query, it's a bug. >>>>>>>>I don't think NULLs have anything to do with it. NULL values are >>>>>>>>stored in-line. >>>>>>>>You might want to look at b->U->count, g->U->count, b->H->seq, >>>>>>>>g->H->seq, b->H->dense, g->H->dense when the misalignment happens >>>>>>>>(either in the debugger or by using printf--but realize that count and >>>>>>>>seq are not int, so %d is not going to work). >>>>>>>>Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >>>>>>>>stack trace might be useful. >>>>>>>> >>>>>>>>- -- Sjoerd Mullender >>>>>>>>-----BEGIN PGP SIGNATURE----- >>>>>>>>Version: GnuPG v1.4.13 (GNU/Linux) >>>>>>>>Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ >>>>>>>> >>>>>>>>iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >>>>>>>>ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >>>>>>>>PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >>>>>>>>GwdvU7bSZtM= >>>>>>>>=2zSg >>>>>>>>-----END PGP SIGNATURE----- >>>>>>>>_______________________________________________ >>>>>>>>users-list mailing list >>>>>>>>users-list@monetdb.org >>>>>>>>http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>_______________________________________________ >>>>>>>users-list mailing list >>>>>>>users-list@monetdb.org >>>>>>>http://mail.monetdb.org/mailman/listinfo/users-list >>>>>_______________________________________________ >>>>>users-list mailing list >>>>>users-list@monetdb.org >>>>>http://mail.monetdb.org/mailman/listinfo/users-list >>>>_______________________________________________ >>>>users-list mailing list >>>>users-list@monetdb.org >>>>http://mail.monetdb.org/mailman/listinfo/users-list >>>> >>>> >>>_______________________________________________ >>>users-list mailing list >>>users-list@monetdb.org >>>http://mail.monetdb.org/mailman/listinfo/users-list >>> >>> >>_______________________________________________ >>users-list mailing list >>users-list@monetdb.org >>http://mail.monetdb.org/mailman/listinfo/users-list >-- >Niels Nes, Centrum Wiskunde & Informatica (CWI) >Science Park 123, 1098 XG Amsterdam, The Netherlands >room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl >url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

Miguel Ping

12:43 p.m.

My bad, I tried 'delete aggregate'. Drop aggregate works just fine. On 08/16/2013 12:30 PM, Niels Nes wrote:

...

On Fri, Aug 16, 2013 at 12:19:08PM +0100, Miguel Ping wrote:

...
Just a final question? how can I remove the old aggregate?

If I recompile the new aggregate under the new module namespace, the old one still exists and the query doesn't work. I've tried looking in sys.functions and updating the 'mod' column for the aggregate but that's not enough.

If I try to re-create the aggregate I hit the following:

...
CREATE AGGREGATE: name 'hllagg' (clob) already in use Doesn't DROP AGGREGATE work?

Niels

...
Thanks!

On 08/14/2013 06:10 PM, Miguel Ping wrote:

...
Oh, that explains it. Although the resultset is only 50k, the initial table is much larger. Thanks.

On 08/14/2013 05:52 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 05:32:02PM +0100, Miguel Ping wrote:

...
I renamed the module definition in udf.mal:

module batudf; TO module aggr;

reinstalled, and it did the trick.

I still don't understand how it works with a fresh table. I'm guessing there's some history feeding the mal planner (queryhistory or callhistory steps?). mitosis (ie horizontal partitioning) only works for tables with atleast 100k tuples. Scott, did you do any tests to check the difference beween iterative and non-iterative versions? And finally, can you share the patch you mention? I'm interested in applying the patch if it's much faster than the "regular" version. Niels Thanks -- Miguel

On 08/14/2013 03:55 PM, Niels Nes wrote:

...
Miguel

As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace.

Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable.

The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation).

Niels

On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote: > Here: > > *without* mitosis > +------------------------------------------------------------------------------------------------+ > | mal | > +================================================================================================+ > | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | > | X_27 := nil:bat[:oid,:lng]; | > | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | > | barrier X_68 := language.dataflow(); | > | X_4 := sql.mvc(); | > | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | > | X_5:bat[:oid,:oid] := > sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | > | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | > | (X_11,r1_11) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | > | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | > | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | > | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | > | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | > | X_18 := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | > | (X_20,r1_26) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | > | X_22 := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | > | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | > | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | > | X_27 := algebra.leftfetchjoin(r1_34,X_23); | > | X_29 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0); > | > | (X_33,r1_49) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2); > | > | X_36 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); > | > | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | > | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | > | language.pass(X_5); | > | language.pass(X_23); | > | language.pass(X_16); | > | language.pass(r1_34); | > | exit X_68; | > | X_42 := sql.resultSet(2,1,X_27); | > | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_27); > | > | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_39); > | > | X_51 := io.stdout(); | > | sql.exportResult(X_51,X_42); | > | end s13_1; | > +------------------------------------------------------------------------------------------------+ > > > *with mitosis* > +-------------------------------------------------------------------------------------------------------------------------------------+ > | mal | > +=====================================================================================================================================+ > | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | > | X_18 := nil:bat[:oid,:lng]; | > | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | > | barrier X_157 := language.dataflow(); | > | X_4 := sql.mvc(); | > | X_63:bat[:oid,:lng] := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | > | X_56:bat[:oid,:oid] := > sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | > | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | > | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | > | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | > | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | > | X_75:bat[:oid,:lng] := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | > | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | > | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | > | (X_124,X_125,X_126) := group.subgroupdone(X_120); | > | X_127 := algebra.leftfetchjoin(X_125,X_120); | > | X_64:bat[:oid,:lng] := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | > | X_58:bat[:oid,:oid] := > sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | > | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | > | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | > | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | > | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | > | X_76:bat[:oid,:lng] := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | > | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | > | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | > | (X_128,X_129,X_130) := group.subgroupdone(X_121); | > | X_131 := algebra.leftfetchjoin(X_129,X_121); | > | X_65:bat[:oid,:lng] := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | > | X_60:bat[:oid,:oid] := > sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | > | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | > | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | > | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | > | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | > | X_77:bat[:oid,:lng] := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | > | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | > | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | > | (X_132,X_133,X_134) := group.subgroupdone(X_122); | > | X_135 := algebra.leftfetchjoin(X_133,X_122); | > | X_66:bat[:oid,:lng] := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | > | X_62:bat[:oid,:oid] := > sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | > | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | > | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | > | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | > | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | > | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | > | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | > | X_78:bat[:oid,:lng] := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | > | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | > | X_14 := > sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | > | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | > | (X_136,X_137,X_138) := group.subgroupdone(X_123); | > | X_139 := algebra.leftfetchjoin(X_137,X_123); | > | X_149 := mat.packIncrement(X_127,4); | > | X_150 := mat.packIncrement(X_149,X_131); | > | X_151 := mat.packIncrement(X_150,X_135); | > | X_15 := mat.packIncrement(X_151,X_139); | > | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | > | X_18 := algebra.leftfetchjoin(r1_34,X_15); | > | X_89:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,0,4); > | > | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,0,4); > | > | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | > | X_91:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,1,4); > | > | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,1,4); > | > | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | > | X_93:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,2,4); > | > | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,2,4); > | > | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | > | X_95:bat[:oid,:str] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",0,3,4); > | > | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",2,3,4); > | > | X_24 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_register",1); > | > | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | > | X_153 := mat.packIncrement(X_145,4); | > | X_154 := mat.packIncrement(X_153,X_146); | > | X_155 := mat.packIncrement(X_154,X_147); | > | X_26 := mat.packIncrement(X_155,X_148); | > | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | > | language.pass(X_56); | > | language.pass(X_120); | > | language.pass(X_58); | > | language.pass(X_121); | > | language.pass(X_60); | > | language.pass(X_122); | > | language.pass(X_62); | > | language.pass(X_123); | > | language.pass(X_15); | > | language.pass(X_112); | > | language.pass(X_113); | > | language.pass(X_114); | > | language.pass(X_115); | > | language.pass(r1_34); | > | exit X_157; | > | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | > | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | > | X_30 := sql.resultSet(2,1,X_18); | > | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigint",64,0,X_18); > | > | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",1881,0,X_27); > | > | X_39 := io.stdout(); | > | sql.exportResult(X_39,X_30); | > | end s14_1; | > +-------------------------------------------------------------------------------------------------------------------------------------+ > > > On 08/14/2013 02:10 PM, Niels Nes wrote: >> On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote: >>> I've been tinkering a little more, if I use either the minimal_pipe, >>> no_mitosis_pipe or the sequential_pipe optimizers the error no >>> longer occurs. I've specified the set of optimizers to use, and it >>> seems that the optimizer step that's problematic is the *mitosis* >>> step. >>> >>> Where can I learn more about this optimizer? Can someone shed some light? >> Could you send us the explain outputs of both with and without >> mitosis/mergetable. This may give us a hint of what goes wrong. >> >> My guess, is that the subhllagg should probably be added (recognized). >> >> Niels >>> Thanks! >>> -- >>> Miguel >>> >>> >>> >>> On 08/13/2013 03:59 PM, Miguel Ping wrote: >>>> I'm trying to come up with a small sample, but it is hard. >>>> >>>> - I currently have a small dataset (~600 rows) which gives me the error. >>>> - BUT if I export that data onto a *new* table, the error doesn't >>>> happen. This makes it hard to provide a proper sample. >>>> >>>> Running explain on the same query for these two tables, the plan is >>>> different: the error plan seems to use the new subgroup feature for >>>> aggregates which was released in a recent MonetDB version (11.15.x?) . >>>> >>>> * Does monetdb keep some sort of internal table statistics that feed the >>>> MAL planner? >>>> * Can I force the query to use the 'old' aggregate function? >>>> * Can you guys point me to the part where the group BAT is calculated? >>>> my guess is that the group ids are not being calculated correctly, hence >>>> the difference between g->U->count and b->U->count. I'm guessing the >>>> culprit is around here: >>>> >>>> ... >>>> | (X_16,r1_34,X_140) := group.subgroupdone(X_15); >>>> | X_18 := algebra.leftfetchjoin(r1_34,X_15); >>>> ... >>>> | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); >>>> #r1_r4 should be the bat* gid >>>> >>>> >>>> I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' >>>> aggregate definition) and the query works as expected. >>>> I don't want to do a downgrade since I don't even know if the data files >>>> are compatible. >>>> >>>> Thanks, >>>> Miguel >>>> >>>> On 08/13/2013 11:35 AM, Martin Kersten wrote: >>>>> If it is hitting before your code, then please provide >>>>> the smallest (SQL) test case to reproduce it locally. >>>>> >>>>> Thanks, Martin >>>>> >>>>> On 8/13/13 11:27 AM, Miguel Ping wrote: >>>>>> The query calls a custom aggregate function, but the error occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, >>>>>> BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: >>>>>> >>>>>> "If this happens when running a SQL query, it's a bug. I don't think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment >>>>>> happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." >>>>>> >>>>>> Thanks. >>>>>> >>>>>> >>>>>> On 08/13/2013 08:51 AM, Stefan Manegold wrote: >>>>>>> Dear Miguel, >>>>>>> >>>>>>> I am not aware of any "hllaggr()" function in the MonetDB release, so I >>>>>>> assume the error occurs in your code. Not know your code at all, I'm afraid >>>>>>> we cannot be of much help. >>>>>>> >>>>>>> Best, >>>>>>> Stefan >>>>>>> >>>>>>> On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>>>>>>> Some more info: >>>>>>>>> SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>>>>>>> ts>=1373410800000 AND ts<=1374706799000; >>>>>>>> ==> 764314 >>>>>>>> >>>>>>>>> SELECT count(distinct(ts_day)) FROM >>>>>>>> wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>>>>>>> ts<=1374706799000; >>>>>>>> ==> 15 >>>>>>>> >>>>>>>> it seems that b.count is the row count, while g.count is something >>>>>>>> like the distinct count, with 2 more values? >>>>>>>> >>>>>>>> On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>>>>>>> Hi, I'm resurrecting this since I've been out of town and only >>>>>>>>> today I got a chance to investigate further. I've recompiled with >>>>>>>>> -O0 to prevent optimizations from "hiding" the values, and in the >>>>>>>>> debugger I got this: >>>>>>>>> >>>>>>>>> b->U->count BUN 764314 >>>>>>>>> g->U->count BUN 17 >>>>>>>>> b->H->seq oid 0 >>>>>>>>> g->H->seq oid 0 >>>>>>>>> b->H->dense unsigned int 1 >>>>>>>>> g->H->dense unsigned int 1 >>>>>>>>> >>>>>>>>> >>>>>>>>> The stack call is as follows: >>>>>>>>> >>>>>>>>> BAThllaggr() at udf.c >>>>>>>>> AGGRsubhllaggcand() at udf.c >>>>>>>>> AGGRsubhllagg() at udf.c >>>>>>>>> malCommandCall() at mal_interpreter.c >>>>>>>>> runMALsequence() at mal_interpreter.c >>>>>>>>> DFLOWworker() at mal_dataflow.c >>>>>>>>> start_thread() at pthread_create.c >>>>>>>>> clone() at clone.S >>>>>>>>> 0x0 >>>>>>>>> >>>>>>>>> -------- Original Message -------- >>>>>>>>> Subject: Re: b and g must be aligned >>>>>>>>> Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>>>>>>> From: Sjoerd Mullender >>>>>>>>> Reply-To: Communication channel for MonetDB users >>>>>>>>> >>>>>>>>> To: Communication channel for MonetDB users >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -----BEGIN PGP SIGNED MESSAGE----- >>>>>>>>> Hash: SHA1 >>>>>>>>> >>>>>>>>> On 2013-07-26 12:23, Miguel Ping wrote: >>>>>>>>>> On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On 2013-07-25 >>>>>>>>>> 14:04, Miguel Ping wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> We're hitting this error "b and g must be aligned". I tracked >>>>>>>>>>>>> the src to a commit about some alignment code thing in >>>>>>>>>>>>> gdk_calc: >>>>>>>>>>>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg09731.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> (Fix alignment conversion in compatibility code for grouped >>>>>>>>>>>>> aggregates.) >>>>>>>>>>>>> >>>>>>>>>>>>> Can you guys please explain what's the reason behind this >>>>>>>>>>>>> error? I can't understand by just looking to the src of >>>>>>>>>>>>> gdk_calc.c >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks! >>>>>>>>>> When using grouped aggregates, the grouping bat must be aligned >>>>>>>>>> with the value bat. The value bat is b and contains the values you >>>>>>>>>> want to aggregate. The group bat is g and contains for each value >>>>>>>>>> in b the group (an oid) it belongs to. Equal group ids means the >>>>>>>>>> same group. These bats must be aligned, because we need to know for >>>>>>>>>> each value in b to which group it belongs. Aligned means: same >>>>>>>>>> length, and same head column values. The head columns must be >>>>>>>>>> dense (a sequence of numbers starting at some value, and each next >>>>>>>>>> value exactly one larger than the previous). Dense sequences are >>>>>>>>>> usually not stored explicitly in MonetDB. We only store the first >>>>>>>>>> value in the hseqbase field. So the hseqbase fields of b and g >>>>>>>>>> must be equal. The one exception to this is when the bats are both >>>>>>>>>> empty. This last exception is the change to gdk_calc.c in that >>>>>>>>>> changeset. >>>>>>>>>> >>>>>>>>>> -- Sjoerd Mullender >>>>>>>>>>> _______________________________________________ users-list >>>>>>>>>>> mailinglistusers-list@monetdb.org >>>>>>>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> Thanks for your explanation. I still don't understand how can >>>>>>>>>> there be a misalignment. I would expect MonetDB to feed my hll >>>>>>>>>> aggregate functions with the correct values. Can it be that I may >>>>>>>>>> have some NULL values and the validation is failing because of >>>>>>>>>> that? >>>>>>>>> If this happens when running a SQL query, it's a bug. >>>>>>>>> I don't think NULLs have anything to do with it. NULL values are >>>>>>>>> stored in-line. >>>>>>>>> You might want to look at b->U->count, g->U->count, b->H->seq, >>>>>>>>> g->H->seq, b->H->dense, g->H->dense when the misalignment happens >>>>>>>>> (either in the debugger or by using printf--but realize that count and >>>>>>>>> seq are not int, so %d is not going to work). >>>>>>>>> Also things like the MAL plan (prepend SQL query with EXPLAIN) and the >>>>>>>>> stack trace might be useful. >>>>>>>>> >>>>>>>>> - -- Sjoerd Mullender >>>>>>>>> -----BEGIN PGP SIGNATURE----- >>>>>>>>> Version: GnuPG v1.4.13 (GNU/Linux) >>>>>>>>> Comment: Using GnuPG with Thunderbird -http://www.enigmail.net/ >>>>>>>>> >>>>>>>>> iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn/VL >>>>>>>>> ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5LQbs >>>>>>>>> PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3Gms >>>>>>>>> GwdvU7bSZtM= >>>>>>>>> =2zSg >>>>>>>>> -----END PGP SIGNATURE----- >>>>>>>>> _______________________________________________ >>>>>>>>> users-list mailing list >>>>>>>>> users-list@monetdb.org >>>>>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> users-list mailing list >>>>>>>> users-list@monetdb.org >>>>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>> _______________________________________________ >>>>>> users-list mailing list >>>>>> users-list@monetdb.org >>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>> _______________________________________________ >>>>> users-list mailing list >>>>> users-list@monetdb.org >>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>> >>>>> >>>> _______________________________________________ >>>> users-list mailing list >>>> users-list@monetdb.org >>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>> >>>> >>> _______________________________________________ >>> users-list mailing list >>> users-list@monetdb.org >>> http://mail.monetdb.org/mailman/listinfo/users-list >> -- >> Niels Nes, Centrum Wiskunde & Informatica (CWI) >> Science Park 123, 1098 XG Amsterdam, The Netherlands >> room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl >> url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl > _______________________________________________ > users-list mailing list > users-list@monetdb.org > http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

-- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl

Scott Mathieson

4:13 p.m.

Hi Miguel, I didn't do that much testing in terms of performance as the aggregation we were creating was non-iterative - what I was trying to do was ensure that when we used my custom aggregation in the same query as other iterative aggregations e.g. count() or sum() - we could still use mitosis (and therefore parallel computation) of the group aggregates for those that supported it. I can provide you with the sources if you wish - there are only a couple of changes to opt_mergetable and opt_prelude to detect the new modules I've defined and to either run it iteratively (like the existing aggr module entries or to call mat.pack() on the inputs to the aggregation function to ensure you get a materialised Bat of the complete data when you want to use the non-iterative approach. This was just experimental work so it's not been committed or approved in any way by the MonetDB team, nor can I make any claims as to the real efficacy of this approach :-) Regards, Scott -----Original Message----- From: users-list [mailto:users-list-bounces+scott.mathieson=pb.com@monetdb.org] On Behalf Of Miguel Ping Sent: 14 August 2013 17:32 To: users-list@monetdb.org Subject: Re: b and g must be aligned I renamed the module definition in udf.mal: module batudf; TO module aggr; reinstalled, and it did the trick. I still don't understand how it works with a fresh table. I'm guessing there's some history feeding the mal planner (queryhistory or callhistory steps?). Scott, did you do any tests to check the difference beween iterative and non-iterative versions? And finally, can you share the patch you mention? I'm interested in applying the patch if it's much faster than the "regular" version. Thanks -- Miguel On 08/14/2013 03:55 PM, Niels Nes wrote:

...

Miguel

As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace.

Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable.

The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation).

Niels

On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote:

...
Here:

*without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +==================================================================== +============================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | barrier X_68 := | language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0); | | (X_33,r1_49) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2); | | X_36 := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | language.pass(X_23); | language.pass(X_16); | | language.pass(r1_34); | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigin | t",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",188 | 1,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+

*with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +==================================================================== +=================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | barrier X_157 := | language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2,3,4); | | X_24 := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | language.pass(X_120); | | language.pass(X_58); | language.pass(X_121); | language.pass(X_60); | | language.pass(X_122); | language.pass(X_62); | | language.pass(X_123); | language.pass(X_15); | | language.pass(X_112); | language.pass(X_113); | | language.pass(X_114); | language.pass(X_115); | | language.pass(r1_34); | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigin | t",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",188 | 1,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

On 08/14/2013 02:10 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...
I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized).

Niels

...
Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote:

...
I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := | batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote:

...
If it is hitting before your code, then please provide the smallest (SQL) test case to reproduce it locally.

Thanks, Martin

On 8/13/13 11:27 AM, Miguel Ping wrote: > The query calls a custom aggregate function, but the error > occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: > > "If this happens when running a SQL query, it's a bug. I don't > think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." > > Thanks. > > > On 08/13/2013 08:51 AM, Stefan Manegold wrote: >> Dear Miguel, >> >> I am not aware of any "hllaggr()" function in the MonetDB >> release, so I assume the error occurs in your code. Not know >> your code at all, I'm afraid we cannot be of much help. >> >> Best, >> Stefan >> >> On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>> Some more info: >>>> SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>> ts>=1373410800000 AND ts<=1374706799000; >>> ==> 764314 >>> >>>> SELECT count(distinct(ts_day)) FROM >>> wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>> ts<=1374706799000; ==> 15 >>> >>> it seems that b.count is the row count, while g.count is >>> something like the distinct count, with 2 more values? >>> >>> On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>> Hi, I'm resurrecting this since I've been out of town and >>>> only today I got a chance to investigate further. I've >>>> recompiled with >>>> -O0 to prevent optimizations from "hiding" the values, and in >>>> the debugger I got this: >>>> >>>> b->U->count BUN 764314 >>>> g->U->count BUN 17 >>>> b->H->seq oid 0 >>>> g->H->seq oid 0 >>>> b->H->dense unsigned int 1 >>>> g->H->dense unsigned int 1 >>>> >>>> >>>> The stack call is as follows: >>>> >>>> BAThllaggr() at udf.c >>>> AGGRsubhllaggcand() at udf.c >>>> AGGRsubhllagg() at udf.c >>>> malCommandCall() at mal_interpreter.c >>>> runMALsequence() at mal_interpreter.c >>>> DFLOWworker() at mal_dataflow.c >>>> start_thread() at pthread_create.c >>>> clone() at clone.S >>>> 0x0 >>>> >>>> -------- Original Message -------- >>>> Subject: Re: b and g must be aligned >>>> Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>> From: Sjoerd Mullender >>>> Reply-To: Communication channel for MonetDB users >>>> >>>> To: Communication channel for MonetDB users >>>> >>>> >>>> >>>> -----BEGIN PGP SIGNED MESSAGE----- >>>> Hash: SHA1 >>>> >>>> On 2013-07-26 12:23, Miguel Ping wrote: >>>>> On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On >>>>> 2013-07-25 14:04, Miguel Ping wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> We're hitting this error "b and g must be aligned". I >>>>>>>> tracked the src to a commit about some alignment code >>>>>>>> thing in >>>>>>>> gdk_calc: >>>>>>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg0 >>>>>>>> 9731.html >>>>>>>> >>>>>>>> >>>> (Fix alignment conversion in compatibility code for grouped >>>>>>>> aggregates.) >>>>>>>> >>>>>>>> Can you guys please explain what's the reason behind this >>>>>>>> error? I can't understand by just looking to the src of >>>>>>>> gdk_calc.c >>>>>>>> >>>>>>>> Thanks! >>>>> When using grouped aggregates, the grouping bat must be >>>>> aligned with the value bat. The value bat is b and contains >>>>> the values you want to aggregate. The group bat is g and >>>>> contains for each value in b the group (an oid) it belongs >>>>> to. Equal group ids means the same group. These bats must >>>>> be aligned, because we need to know for each value in b to >>>>> which group it belongs. Aligned means: same length, and same >>>>> head column values. The head columns must be dense (a >>>>> sequence of numbers starting at some value, and each next >>>>> value exactly one larger than the previous). Dense >>>>> sequences are usually not stored explicitly in MonetDB. We >>>>> only store the first value in the hseqbase field. So the >>>>> hseqbase fields of b and g must be equal. The one exception >>>>> to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset. >>>>> >>>>> -- Sjoerd Mullender >>>>>> _______________________________________________ users-list >>>>>> mailinglistusers-list@monetdb.org >>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>> >>>>>> >>>>> Thanks for your explanation. I still don't understand how >>>>> can there be a misalignment. I would expect MonetDB to feed >>>>> my hll aggregate functions with the correct values. Can it >>>>> be that I may have some NULL values and the validation is >>>>> failing because of that? >>>> If this happens when running a SQL query, it's a bug. >>>> I don't think NULLs have anything to do with it. NULL values >>>> are stored in-line. >>>> You might want to look at b->U->count, g->U->count, >>>> b->H->seq, >>>> g->H->seq, b->H->dense, g->H->dense when the misalignment >>>> g->H->happens >>>> (either in the debugger or by using printf--but realize that >>>> count and seq are not int, so %d is not going to work). >>>> Also things like the MAL plan (prepend SQL query with >>>> EXPLAIN) and the stack trace might be useful. >>>> >>>> - -- Sjoerd Mullender >>>> -----BEGIN PGP SIGNATURE----- >>>> Version: GnuPG v1.4.13 (GNU/Linux) >>>> Comment: Using GnuPG with Thunderbird >>>> -http://www.enigmail.net/ >>>> >>>> iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn >>>> /VL >>>> ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5L >>>> ttSsmAF+Qbs >>>> PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3 >>>> Gms >>>> GwdvU7bSZtM= >>>> =2zSg >>>> -----END PGP SIGNATURE----- >>>> _______________________________________________ >>>> users-list mailing list >>>> users-list@monetdb.org >>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>> >>>> >>>> >>>> >>>> >>>> >>> _______________________________________________ >>> users-list mailing list >>> users-list@monetdb.org >>> http://mail.monetdb.org/mailman/listinfo/users-list > _______________________________________________ > users-list mailing list > users-list@monetdb.org > http://mail.monetdb.org/mailman/listinfo/users-list _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list ________________________________

Miguel Ping

5:20 p.m.

Thanks Scott, I'm willing to give it a try, I'm sure I'll learn something along the way. I still think it's a bug; although putting the custom function in the aggr module fixes it, I'm sure the mitosis optimizer would be nice for some plans. And my aggregation is iterative anyway, so I still don't know if the BATgroupaggrinit function is buggy for this scenario, or it's the mitosis optimizer itself. Can you attach the sources? Thanks! On 08/16/2013 05:13 PM, Scott Mathieson wrote:

...

Hi Miguel,

I didn't do that much testing in terms of performance as the aggregation we were creating was non-iterative - what I was trying to do was ensure that when we used my custom aggregation in the same query as other iterative aggregations e.g. count() or sum() - we could still use mitosis (and therefore parallel computation) of the group aggregates for those that supported it.

I can provide you with the sources if you wish - there are only a couple of changes to opt_mergetable and opt_prelude to detect the new modules I've defined and to either run it iteratively (like the existing aggr module entries or to call mat.pack() on the inputs to the aggregation function to ensure you get a materialised Bat of the complete data when you want to use the non-iterative approach.

This was just experimental work so it's not been committed or approved in any way by the MonetDB team, nor can I make any claims as to the real efficacy of this approach :-)

Regards,

Scott

-----Original Message----- From: users-list [mailto:users-list-bounces+scott.mathieson=pb.com@monetdb.org] On Behalf Of Miguel Ping Sent: 14 August 2013 17:32 To: users-list@monetdb.org Subject: Re: b and g must be aligned

I renamed the module definition in udf.mal:

module batudf; TO module aggr;

reinstalled, and it did the trick.

I still don't understand how it works with a fresh table. I'm guessing there's some history feeding the mal planner (queryhistory or callhistory steps?).

Scott, did you do any tests to check the difference beween iterative and non-iterative versions? And finally, can you share the patch you mention? I'm interested in applying the patch if it's much faster than the "regular" version.

Thanks -- Miguel

On 08/14/2013 03:55 PM, Niels Nes wrote:

...
Miguel

As pointed out by scott, the easiest fix is to put your aggregation function (as we expect) into the aggr mal namespace.

Then mitosis won't trigger for queries with your aggregation, but other queries still get mitosis/mergetable.

The complex solution is to add your case (2 step aggregation) into the mergetable code (and fix mitosis that it doesn't bail out on your aggregation).

Niels

On Wed, Aug 14, 2013 at 03:20:06PM +0100, Miguel Ping wrote:

...
Here:

*without* mitosis +------------------------------------------------------------------------------------------------+ | mal | +==================================================================== +============================+ | function user.s13_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_27 := nil:bat[:oid,:lng]; | | X_39:bat[:oid,:str] := nil:bat[:oid,:str]; | barrier X_68 := | language.dataflow(); | | X_4 := sql.mvc(); | | X_8 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0); | | X_5:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07"); | | X_65 := algebra.subselect(X_8,X_5,A0,A1,true,true,false); | | (X_11,r1_11) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2); | | X_66 := algebra.subselect(r1_11,A0,A1,true,true,false); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_67 := algebra.subselect(X_14,X_5,A0,A1,true,true,false); | | X_16 := sql.subdelta(X_65,X_5,X_11,X_66,X_67); | | X_18 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0); | | (X_20,r1_26) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2); | | X_22 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_23 := sql.projectdelta(X_16,X_18,X_20,r1_26,X_22); | | (X_24,r1_34,r2_34) := group.subgroupdone(X_23); | | X_27 := algebra.leftfetchjoin(r1_34,X_23); | | X_29 := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0); | | (X_33,r1_49) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2); | | X_36 := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",1); | | X_38 := sql.projectdelta(X_16,X_29,X_33,r1_49,X_36); | | X_39:bat[:oid,:str] := batudf.subhllagg(X_38,X_24,r1_34,true); | | language.pass(X_5); | language.pass(X_23); | language.pass(X_16); | | language.pass(r1_34); | exit X_68; | | X_42 := sql.resultSet(2,1,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigin | t",64,0,X_27); | | sql.rsColumn(X_42,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",188 | 1,0,X_39); | | X_51 := io.stdout(); | | sql.exportResult(X_51,X_42); | | end s13_1; | +------------------------------------------------------------------------------------------------+

*with mitosis* +-------------------------------------------------------------------------------------------------------------------------------------+ | mal | +==================================================================== +=================================================================+ | function user.s14_1{autoCommit=true}(A0:lng,A1:lng):void; | | X_18 := nil:bat[:oid,:lng]; | | X_27:bat[:oid,:str] := nil:bat[:oid,:str]; | barrier X_157 := | language.dataflow(); | | X_4 := sql.mvc(); | | X_63:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,0,4); | | X_56:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",0,4); | | X_104 := algebra.subselect(X_63,X_56,A0,A1,true,true,false); | | (X_67:bat[:oid,:oid] ,X_68:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,0,4); | | X_108 := algebra.subselect(X_68,A0,A1,true,true,false); | | X_112 := sql.subdelta(X_104,X_56,X_67,X_108); | | X_75:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,0,4); | | (X_79:bat[:oid,:oid] ,X_80:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,0,4); | | X_120 := sql.projectdelta(X_112,X_75,X_79,X_80); | | (X_124,X_125,X_126) := group.subgroupdone(X_120); | | X_127 := algebra.leftfetchjoin(X_125,X_120); | | X_64:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,1,4); | | X_58:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",1,4); | | X_105 := algebra.subselect(X_64,X_58,A0,A1,true,true,false); | | (X_69:bat[:oid,:oid] ,X_70:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,1,4); | | X_109 := algebra.subselect(X_70,A0,A1,true,true,false); | | X_113 := sql.subdelta(X_105,X_58,X_69,X_109); | | X_76:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,1,4); | | (X_81:bat[:oid,:oid] ,X_82:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,1,4); | | X_121 := sql.projectdelta(X_113,X_76,X_81,X_82); | | (X_128,X_129,X_130) := group.subgroupdone(X_121); | | X_131 := algebra.leftfetchjoin(X_129,X_121); | | X_65:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,2,4); | | X_60:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",2,4); | | X_106 := algebra.subselect(X_65,X_60,A0,A1,true,true,false); | | (X_71:bat[:oid,:oid] ,X_72:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,2,4); | | X_110 := algebra.subselect(X_72,A0,A1,true,true,false); | | X_114 := sql.subdelta(X_106,X_60,X_71,X_110); | | X_77:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,2,4); | | (X_83:bat[:oid,:oid] ,X_84:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,2,4); | | X_122 := sql.projectdelta(X_114,X_77,X_83,X_84); | | (X_132,X_133,X_134) := group.subgroupdone(X_122); | | X_135 := algebra.leftfetchjoin(X_133,X_122); | | X_66:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",0,3,4); | | X_62:bat[:oid,:oid] := sql.tid(X_4,"wa_sapo_pt_audience","kpi_2013_07",3,4); | | X_107 := algebra.subselect(X_66,X_62,A0,A1,true,true,false); | | (X_73:bat[:oid,:oid] ,X_74:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",2,3,4); | | X_111 := algebra.subselect(X_74,A0,A1,true,true,false); | | X_10 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts",1); | | X_53 := algebra.subselect(X_10,X_62,A0,A1,true,true,false); | | X_115 := sql.subdelta(X_107,X_62,X_73,X_111,X_53); | | X_78:bat[:oid,:lng] := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",0,3,4); | | (X_85:bat[:oid,:oid] ,X_86:bat[:oid,:lng] ) := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",2,3,4); | | X_14 := sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","ts_day",1); | | X_123 := sql.projectdelta(X_115,X_78,X_85,X_86,X_14); | | (X_136,X_137,X_138) := group.subgroupdone(X_123); | | X_139 := algebra.leftfetchjoin(X_137,X_123); | | X_149 := mat.packIncrement(X_127,4); | | X_150 := mat.packIncrement(X_149,X_131); | | X_151 := mat.packIncrement(X_150,X_135); | | X_15 := mat.packIncrement(X_151,X_139); | | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | | X_18 := algebra.leftfetchjoin(r1_34,X_15); | | X_89:bat[:oid,:str] := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0,0,4); | | (X_96:bat[:oid,:oid] ,X_97:bat[:oid,:str] ) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2,0,4); | | X_145 := sql.projectdelta(X_112,X_89,X_96,X_97); | | X_91:bat[:oid,:str] := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0,1,4); | | (X_98:bat[:oid,:oid] ,X_99:bat[:oid,:str] ) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2,1,4); | | X_146 := sql.projectdelta(X_113,X_91,X_98,X_99); | | X_93:bat[:oid,:str] := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0,2,4); | | (X_100:bat[:oid,:oid] ,X_101:bat[:oid,:str] ) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2,2,4); | | X_147 := sql.projectdelta(X_114,X_93,X_100,X_101); | | X_95:bat[:oid,:str] := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",0,3,4); | | (X_102:bat[:oid,:oid] ,X_103:bat[:oid,:str] ) := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",2,3,4); | | X_24 := | sql.bind(X_4,"wa_sapo_pt_audience","kpi_2013_07","h_visitors_regist | er",1); | | X_148 := sql.projectdelta(X_115,X_95,X_102,X_103,X_24); | | X_153 := mat.packIncrement(X_145,4); | | X_154 := mat.packIncrement(X_153,X_146); | | X_155 := mat.packIncrement(X_154,X_147); | | X_26 := mat.packIncrement(X_155,X_148); | | X_27:bat[:oid,:str] := batudf.subhllagg(X_26,X_16,r1_34,true); | | language.pass(X_56); | language.pass(X_120); | | language.pass(X_58); | language.pass(X_121); | language.pass(X_60); | | language.pass(X_122); | language.pass(X_62); | | language.pass(X_123); | language.pass(X_15); | | language.pass(X_112); | language.pass(X_113); | | language.pass(X_114); | language.pass(X_115); | | language.pass(r1_34); | exit X_157; | | X_152:bat[:oid,:lng] := nil:bat[:oid,:lng]; | | X_156:bat[:oid,:str] := nil:bat[:oid,:str]; | | X_30 := sql.resultSet(2,1,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","ts_day","bigin | t",64,0,X_18); | | sql.rsColumn(X_30,"wa_sapo_pt_audience.kpi_2013_07","L1","clob",188 | 1,0,X_27); | | X_39 := io.stdout(); | | sql.exportResult(X_39,X_30); | | end s14_1; | +-------------------------------------------------------------------------------------------------------------------------------------+

On 08/14/2013 02:10 PM, Niels Nes wrote:

...
On Wed, Aug 14, 2013 at 02:04:27PM +0100, Miguel Ping wrote:

...
I've been tinkering a little more, if I use either the minimal_pipe, no_mitosis_pipe or the sequential_pipe optimizers the error no longer occurs. I've specified the set of optimizers to use, and it seems that the optimizer step that's problematic is the *mitosis* step.

Where can I learn more about this optimizer? Can someone shed some light? Could you send us the explain outputs of both with and without mitosis/mergetable. This may give us a hint of what goes wrong.

My guess, is that the subhllagg should probably be added (recognized).

Niels

...
Thanks! -- Miguel

On 08/13/2013 03:59 PM, Miguel Ping wrote:

...
I'm trying to come up with a small sample, but it is hard.

- I currently have a small dataset (~600 rows) which gives me the error. - BUT if I export that data onto a *new* table, the error doesn't happen. This makes it hard to provide a proper sample.

Running explain on the same query for these two tables, the plan is different: the error plan seems to use the new subgroup feature for aggregates which was released in a recent MonetDB version (11.15.x?) .

* Does monetdb keep some sort of internal table statistics that feed the MAL planner? * Can I force the query to use the 'old' aggregate function? * Can you guys point me to the part where the group BAT is calculated? my guess is that the group ids are not being calculated correctly, hence the difference between g->U->count and b->U->count. I'm guessing the culprit is around here:

... | (X_16,r1_34,X_140) := group.subgroupdone(X_15); | X_18 := algebra.leftfetchjoin(r1_34,X_15); ... | X_27:bat[:oid,:str] := | batudf.subhllagg(X_26,X_16,r1_34,true); #r1_r4 should be the bat* gid

I have tried the same dataset with MonetDB 11.13.7 (which had the 'old' aggregate definition) and the query works as expected. I don't want to do a downgrade since I don't even know if the data files are compatible.

Thanks, Miguel

On 08/13/2013 11:35 AM, Martin Kersten wrote: > If it is hitting before your code, then please provide the > smallest (SQL) test case to reproduce it locally. > > Thanks, Martin > > On 8/13/13 11:27 AM, Miguel Ping wrote: >> The query calls a custom aggregate function, but the error >> occurs before hitting my code; the code path just starts to prepare things (it's just the boilerplate to run a custom aggr function), and it hits the error while calling BATgroupaggrinit. In fact, BATgroupaggrinit is the very first thing that the BAThllaggr function calls; Also according to Sjoerd, it's a bug: >> >> "If this happens when running a SQL query, it's a bug. I don't >> think NULLs have anything to do with it. NULL values are stored in-line. You might want to look at b->U->count, g->U->count, b->H->seq, g->H->seq, b->H->dense, g->H->dense when the misalignment happens (either in the debugger or by using printf--but realize that count and seq are not int, so %d is not going to work). Also things like the MAL plan (prepend SQL query with EXPLAIN) and the stack trace might be useful." >> >> Thanks. >> >> >> On 08/13/2013 08:51 AM, Stefan Manegold wrote: >>> Dear Miguel, >>> >>> I am not aware of any "hllaggr()" function in the MonetDB >>> release, so I assume the error occurs in your code. Not know >>> your code at all, I'm afraid we cannot be of much help. >>> >>> Best, >>> Stefan >>> >>> On Mon, Aug 12, 2013 at 06:59:33PM +0100, Miguel Ping wrote: >>>> Some more info: >>>>> SELECT count(*) FROM wa_sapo_pt_audience.kpi_2013_07 WHERE >>>> ts>=1373410800000 AND ts<=1374706799000; >>>> ==> 764314 >>>> >>>>> SELECT count(distinct(ts_day)) FROM >>>> wa_sapo_pt_audience.kpi_2013_07 WHERE ts>=1373410800000 AND >>>> ts<=1374706799000; ==> 15 >>>> >>>> it seems that b.count is the row count, while g.count is >>>> something like the distinct count, with 2 more values? >>>> >>>> On 08/12/2013 05:50 PM, Miguel Ping wrote: >>>>> Hi, I'm resurrecting this since I've been out of town and >>>>> only today I got a chance to investigate further. I've >>>>> recompiled with >>>>> -O0 to prevent optimizations from "hiding" the values, and in >>>>> the debugger I got this: >>>>> >>>>> b->U->count BUN 764314 >>>>> g->U->count BUN 17 >>>>> b->H->seq oid 0 >>>>> g->H->seq oid 0 >>>>> b->H->dense unsigned int 1 >>>>> g->H->dense unsigned int 1 >>>>> >>>>> >>>>> The stack call is as follows: >>>>> >>>>> BAThllaggr() at udf.c >>>>> AGGRsubhllaggcand() at udf.c >>>>> AGGRsubhllagg() at udf.c >>>>> malCommandCall() at mal_interpreter.c >>>>> runMALsequence() at mal_interpreter.c >>>>> DFLOWworker() at mal_dataflow.c >>>>> start_thread() at pthread_create.c >>>>> clone() at clone.S >>>>> 0x0 >>>>> >>>>> -------- Original Message -------- >>>>> Subject: Re: b and g must be aligned >>>>> Date: Fri, 26 Jul 2013 11:34:53 +0100 >>>>> From: Sjoerd Mullender >>>>> Reply-To: Communication channel for MonetDB users >>>>> >>>>> To: Communication channel for MonetDB users >>>>> >>>>> >>>>> >>>>> -----BEGIN PGP SIGNED MESSAGE----- >>>>> Hash: SHA1 >>>>> >>>>> On 2013-07-26 12:23, Miguel Ping wrote: >>>>>> On 07/25/2013 01:51 PM, Sjoerd Mullender wrote: On >>>>>> 2013-07-25 14:04, Miguel Ping wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> We're hitting this error "b and g must be aligned". I >>>>>>>>> tracked the src to a commit about some alignment code >>>>>>>>> thing in >>>>>>>>> gdk_calc: >>>>>>>>> http://www.mail-archive.com/checkin-list@monetdb.org/msg0 >>>>>>>>> 9731.html >>>>>>>>> >>>>>>>>> >>>>> (Fix alignment conversion in compatibility code for grouped >>>>>>>>> aggregates.) >>>>>>>>> >>>>>>>>> Can you guys please explain what's the reason behind this >>>>>>>>> error? I can't understand by just looking to the src of >>>>>>>>> gdk_calc.c >>>>>>>>> >>>>>>>>> Thanks! >>>>>> When using grouped aggregates, the grouping bat must be >>>>>> aligned with the value bat. The value bat is b and contains >>>>>> the values you want to aggregate. The group bat is g and >>>>>> contains for each value in b the group (an oid) it belongs >>>>>> to. Equal group ids means the same group. These bats must >>>>>> be aligned, because we need to know for each value in b to >>>>>> which group it belongs. Aligned means: same length, and same >>>>>> head column values. The head columns must be dense (a >>>>>> sequence of numbers starting at some value, and each next >>>>>> value exactly one larger than the previous). Dense >>>>>> sequences are usually not stored explicitly in MonetDB. We >>>>>> only store the first value in the hseqbase field. So the >>>>>> hseqbase fields of b and g must be equal. The one exception >>>>>> to this is when the bats are both empty. This last exception is the change to gdk_calc.c in that changeset. >>>>>> >>>>>> -- Sjoerd Mullender >>>>>>> _______________________________________________ users-list >>>>>>> mailinglistusers-list@monetdb.org >>>>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>>>> >>>>>>> >>>>>> Thanks for your explanation. I still don't understand how >>>>>> can there be a misalignment. I would expect MonetDB to feed >>>>>> my hll aggregate functions with the correct values. Can it >>>>>> be that I may have some NULL values and the validation is >>>>>> failing because of that? >>>>> If this happens when running a SQL query, it's a bug. >>>>> I don't think NULLs have anything to do with it. NULL values >>>>> are stored in-line. >>>>> You might want to look at b->U->count, g->U->count, >>>>> b->H->seq, >>>>> g->H->seq, b->H->dense, g->H->dense when the misalignment >>>>> g->H->happens >>>>> (either in the debugger or by using printf--but realize that >>>>> count and seq are not int, so %d is not going to work). >>>>> Also things like the MAL plan (prepend SQL query with >>>>> EXPLAIN) and the stack trace might be useful. >>>>> >>>>> - -- Sjoerd Mullender >>>>> -----BEGIN PGP SIGNATURE----- >>>>> Version: GnuPG v1.4.13 (GNU/Linux) >>>>> Comment: Using GnuPG with Thunderbird >>>>> -http://www.enigmail.net/ >>>>> >>>>> iQCVAwUBUfJQyT7g04AjvIQpAQK/JAP9HCp/aFaYWv0jodfPUnSRVgFSsdjTn >>>>> /VL >>>>> ttSsmAF+yomGMDIne2311f/D51F3/nte7Utx+01lgvArapWErhjGN1hzPSr5L >>>>> ttSsmAF+Qbs >>>>> PZ6dUfNcH8Rt2AtT3uSxfkFZy9VRDCNNXPei43IgMS2HxVZ48pnAVkNcpBbW3 >>>>> Gms >>>>> GwdvU7bSZtM= >>>>> =2zSg >>>>> -----END PGP SIGNATURE----- >>>>> _______________________________________________ >>>>> users-list mailing list >>>>> users-list@monetdb.org >>>>> http://mail.monetdb.org/mailman/listinfo/users-list >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> users-list mailing list >>>> users-list@monetdb.org >>>> http://mail.monetdb.org/mailman/listinfo/users-list >> _______________________________________________ >> users-list mailing list >> users-list@monetdb.org >> http://mail.monetdb.org/mailman/listinfo/users-list > _______________________________________________ > users-list mailing list > users-list@monetdb.org > http://mail.monetdb.org/mailman/listinfo/users-list > > _______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list -- Niels Nes, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098sip:4098@sip.cwi.nl url:http://www.cwi.nl/~niels e-mail:Niels.Nes@cwi.nl

users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

________________________________

_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list

4152

Age (days ago)

4156

Last active (days ago)

List overview

Download

20 comments

5 participants

participants (5)

Martin Kersten
Miguel Ping
Niels Nes
Scott Mathieson
Stefan Manegold