[MonetDB-users] wrong results for group by queries
Hi, we are using MonetDB (Oct-2010) for analyzing a 250G data warehouse and have found at least one "thing" which might be a serious bug. The simplest scenario that I have found to describe it, is as follows: SELECT comp.id as compid, comp.name, affnets.name as affid FROM toc INNER JOIN app ON toc.application_id=app.id INNER JOIN comp ON app.company_id = comp.id INNER JOIN affnets ON toc.affiliate_network_id=affnets.id GROUP BY compid, comp.name, affid This query is supposed to return all unique company/affnet pairs which occur in the fact table "toc". However, it returns only the unique company-names, each with a random (the first?) affnet. (Of course, this could be accomplished with DISTINCT, but our aim to to aggregate other fields). If I change the query to select app.company_id as compid, comp.name, affnets.name as affid ... or select comp.id as compid, comp.name, affnets.name as affid ... i.e. semantically the same but only with fields from 2 directly joined tables in the GROUP BY, then the result is correct. Question: Is this a bug or are we missing something? Thank you in advance! Daniel
Hi again, now I managed to isolate the problem even more. Attached you will find a SQL script which creates 4 tables, inserts a few tuples and then queries them twice: the first query produces too few rows, but after dropping a primary key on one of the tables (it matters which one), the second query returns the correct results. Looks like the optimizer optimizes a little too eagerly in the presence of primary keys? Best regards, Daniel Daniel Boesswetter wrote:
Hi,
we are using MonetDB (Oct-2010) for analyzing a 250G data warehouse and have found at least one "thing" which might be a serious bug. The simplest scenario that I have found to describe it, is as follows:
SELECT comp.id as compid, comp.name, affnets.name as affid FROM toc INNER JOIN app ON toc.application_id=app.id INNER JOIN comp ON app.company_id = comp.id INNER JOIN affnets ON toc.affiliate_network_id=affnets.id GROUP BY compid, comp.name, affid
This query is supposed to return all unique company/affnet pairs which occur in the fact table "toc". However, it returns only the unique company-names, each with a random (the first?) affnet. (Of course, this could be accomplished with DISTINCT, but our aim to to aggregate other fields).
If I change the query to
select app.company_id as compid, comp.name, affnets.name as affid ...
or
select comp.id as compid, comp.name, affnets.name as affid ...
i.e. semantically the same but only with fields from 2 directly joined tables in the GROUP BY, then the result is correct.
Question: Is this a bug or are we missing something?
Thank you in advance!
Daniel
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
I am getting same results with latest development release: +------+------+------+ | name | id | name | +======+======+======+ | b1 | 1 | c1 | | b1 | 2 | c2 | | b1 | 3 | c3 | +------+------+------+ 3 tuples +------+------+------+ | name | id | name | +======+======+======+ | b1 | 1 | c1 | | b1 | 2 | c2 | | b1 | 3 | c3 | | b2 | 1 | c1 | | b2 | 2 | c2 | | b2 | 3 | c3 | | b3 | 1 | c1 | | b3 | 2 | c2 | | b3 | 3 | c3 | +------+------+------+ 9 tuples Mike On Fri, Apr 8, 2011 at 11:50 PM, Daniel Boesswetter < daniel.boesswetter@fu-berlin.de> wrote:
Hi again,
now I managed to isolate the problem even more. Attached you will find a SQL script which creates 4 tables, inserts a few tuples and then queries them twice: the first query produces too few rows, but after dropping a primary key on one of the tables (it matters which one), the second query returns the correct results.
Looks like the optimizer optimizes a little too eagerly in the presence of primary keys?
Best regards, Daniel
Daniel Boesswetter wrote:
Hi,
we are using MonetDB (Oct-2010) for analyzing a 250G data warehouse and have found at least one "thing" which might be a serious bug. The simplest scenario that I have found to describe it, is as follows:
SELECT comp.id as compid, comp.name, affnets.name as affid FROM toc INNER JOIN app ON toc.application_id=app.id INNER JOIN comp ON app.company_id = comp.id INNER JOIN affnets ON toc.affiliate_network_id=affnets.id GROUP BY compid, comp.name, affid
This query is supposed to return all unique company/affnet pairs which occur in the fact table "toc". However, it returns only the unique company-names, each with a random (the first?) affnet. (Of course, this could be accomplished with DISTINCT, but our aim to to aggregate other fields).
If I change the query to
select app.company_id as compid, comp.name, affnets.name as affid ...
or
select comp.id as compid, comp.name, affnets.name as affid ...
i.e. semantically the same but only with fields from 2 directly joined tables in the GROUP BY, then the result is correct.
Question: Is this a bug or are we missing something?
Thank you in advance!
Daniel
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi list, this seems particularly annoying, to say the least, and I don't see any follow up on the list. Is this confirmed and a known problem (maybe with a bug report), or is it solved, or what ? Franck Le samedi 09 avril 2011 à 05:57 +0300, Michael Sioutis a écrit :
I am getting same results with latest development release:
+------+------+------+ | name | id | name | +======+======+======+ | b1 | 1 | c1 | | b1 | 2 | c2 | | b1 | 3 | c3 | +------+------+------+ 3 tuples
+------+------+------+ | name | id | name | +======+======+======+ | b1 | 1 | c1 | | b1 | 2 | c2 | | b1 | 3 | c3 | | b2 | 1 | c1 | | b2 | 2 | c2 | | b2 | 3 | c3 | | b3 | 1 | c1 | | b3 | 2 | c2 | | b3 | 3 | c3 | +------+------+------+ 9 tuples
Mike
On Fri, Apr 8, 2011 at 11:50 PM, Daniel Boesswetter
wrote: Hi again, now I managed to isolate the problem even more. Attached you will find a SQL script which creates 4 tables, inserts a few tuples and then queries them twice: the first query produces too few rows, but after dropping a primary key on one of the tables (it matters which one), the second query returns the correct results.
Looks like the optimizer optimizes a little too eagerly in the presence of primary keys?
Best regards, Daniel
Daniel Boesswetter wrote: Hi,
we are using MonetDB (Oct-2010) for analyzing a 250G data warehouse and have found at least one "thing" which might be a serious bug. The simplest scenario that I have found to describe it, is as follows:
SELECT comp.id as compid, comp.name, affnets.name as affid FROM toc INNER JOIN app ON toc.application_id=app.id INNER JOIN comp ON app.company_id = comp.id INNER JOIN affnets ON toc.affiliate_network_id=affnets.id GROUP BY compid, comp.name, affid
This query is supposed to return all unique company/affnet pairs which occur in the fact table "toc". However, it returns only the unique company-names, each with a random (the first?) affnet. (Of course, this could be accomplished with DISTINCT, but our aim to to aggregate other fields).
If I change the query to
select app.company_id as compid, comp.name, affnets.name as affid ...
or
select comp.id as compid, comp.name, affnets.name as affid ...
i.e. semantically the same but only with fields from 2 directly joined tables in the GROUP BY, then the result is correct.
Question: Is this a bug or are we missing something?
Thank you in advance!
Daniel
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi, please bear with us. We're also only humans with 24 hours to spend per day. ;-) I suppose this is a bug; looks like it might be a too eager attempt to (in this case, i.e., after the join, incorrectly) exploit the uniqueness of the primary key of table c when determining the groups. Could you --- Franck or Daniel --- please file a bug report using Daniel's detailed example? I'll try to fix it as soon as our agendas allow so ... Thanks, Stefan On Thu, May 05, 2011 at 09:55:37AM +0200, Franck Routier wrote:
Hi list,
this seems particularly annoying, to say the least, and I don't see any follow up on the list. Is this confirmed and a known problem (maybe with a bug report), or is it solved, or what ?
Franck
Le samedi 09 avril 2011 à 05:57 +0300, Michael Sioutis a écrit :
I am getting same results with latest development release:
+------+------+------+ | name | id | name | +======+======+======+ | b1 | 1 | c1 | | b1 | 2 | c2 | | b1 | 3 | c3 | +------+------+------+ 3 tuples
+------+------+------+ | name | id | name | +======+======+======+ | b1 | 1 | c1 | | b1 | 2 | c2 | | b1 | 3 | c3 | | b2 | 1 | c1 | | b2 | 2 | c2 | | b2 | 3 | c3 | | b3 | 1 | c1 | | b3 | 2 | c2 | | b3 | 3 | c3 | +------+------+------+ 9 tuples
Mike
On Fri, Apr 8, 2011 at 11:50 PM, Daniel Boesswetter
wrote: Hi again, now I managed to isolate the problem even more. Attached you will find a SQL script which creates 4 tables, inserts a few tuples and then queries them twice: the first query produces too few rows, but after dropping a primary key on one of the tables (it matters which one), the second query returns the correct results.
Looks like the optimizer optimizes a little too eagerly in the presence of primary keys?
Best regards, Daniel
Daniel Boesswetter wrote: Hi,
we are using MonetDB (Oct-2010) for analyzing a 250G data warehouse and have found at least one "thing" which might be a serious bug. The simplest scenario that I have found to describe it, is as follows:
SELECT comp.id as compid, comp.name, affnets.name as affid FROM toc INNER JOIN app ON toc.application_id=app.id INNER JOIN comp ON app.company_id = comp.id INNER JOIN affnets ON toc.affiliate_network_id=affnets.id GROUP BY compid, comp.name, affid
This query is supposed to return all unique company/affnet pairs which occur in the fact table "toc". However, it returns only the unique company-names, each with a random (the first?) affnet. (Of course, this could be accomplished with DISTINCT, but our aim to to aggregate other fields).
If I change the query to
select app.company_id as compid, comp.name, affnets.name as affid ...
or
select comp.id as compid, comp.name, affnets.name as affid ...
i.e. semantically the same but only with fields from 2 directly joined tables in the GROUP BY, then the result is correct.
Question: Is this a bug or are we missing something?
Thank you in advance!
Daniel
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Stefan.Manegold @ CWI.nl | DB Architectures (INS1) | | http://CWI.nl/~manegold/ | Science Park 123 (L321) | | Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Stefan: I'll file the bug. Franck: thanks for bringing up the topic again ;-) Best, Daniel Stefan.Manegold@cwi.nl wrote:
Hi,
please bear with us. We're also only humans with 24 hours to spend per day. ;-)
I suppose this is a bug; looks like it might be a too eager attempt to (in this case, i.e., after the join, incorrectly) exploit the uniqueness of the primary key of table c when determining the groups.
Could you --- Franck or Daniel --- please file a bug report using Daniel's detailed example?
I'll try to fix it as soon as our agendas allow so ...
Thanks, Stefan
On Thu, May 05, 2011 at 09:55:37AM +0200, Franck Routier wrote:
Hi list,
this seems particularly annoying, to say the least, and I don't see any follow up on the list. Is this confirmed and a known problem (maybe with a bug report), or is it solved, or what ?
Franck
Le samedi 09 avril 2011 à 05:57 +0300, Michael Sioutis a écrit :
I am getting same results with latest development release:
+------+------+------+ | name | id | name | +======+======+======+ | b1 | 1 | c1 | | b1 | 2 | c2 | | b1 | 3 | c3 | +------+------+------+ 3 tuples
+------+------+------+ | name | id | name | +======+======+======+ | b1 | 1 | c1 | | b1 | 2 | c2 | | b1 | 3 | c3 | | b2 | 1 | c1 | | b2 | 2 | c2 | | b2 | 3 | c3 | | b3 | 1 | c1 | | b3 | 2 | c2 | | b3 | 3 | c3 | +------+------+------+ 9 tuples
Mike
On Fri, Apr 8, 2011 at 11:50 PM, Daniel Boesswetter
wrote: Hi again, now I managed to isolate the problem even more. Attached you will find a SQL script which creates 4 tables, inserts a few tuples and then queries them twice: the first query produces too few rows, but after dropping a primary key on one of the tables (it matters which one), the second query returns the correct results.
Looks like the optimizer optimizes a little too eagerly in the presence of primary keys?
Best regards, Daniel
Daniel Boesswetter wrote: Hi,
we are using MonetDB (Oct-2010) for analyzing a 250G data warehouse and have found at least one "thing" which might be a serious bug. The simplest scenario that I have found to describe it, is as follows:
SELECT comp.id as compid, comp.name, affnets.name as affid FROM toc INNER JOIN app ON toc.application_id=app.id INNER JOIN comp ON app.company_id = comp.id INNER JOIN affnets ON toc.affiliate_network_id=affnets.id GROUP BY compid, comp.name, affid
This query is supposed to return all unique company/affnet pairs which occur in the fact table "toc". However, it returns only the unique company-names, each with a random (the first?) affnet. (Of course, this could be accomplished with DISTINCT, but our aim to to aggregate other fields).
If I change the query to
select app.company_id as compid, comp.name, affnets.name as affid ...
or
select comp.id as compid, comp.name, affnets.name as affid ...
i.e. semantically the same but only with fields from 2 directly joined tables in the GROUP BY, then the result is correct.
Question: Is this a bug or are we missing something?
Thank you in advance!
Daniel
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Xperia(TM) PLAY It's a major breakthrough. An authentic gaming smartphone on the nation's most reliable network. And it wants your games. http://p.sf.net/sfu/verizon-sfdev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi, Le jeudi 05 mai 2011 à 16:18 +0200, Stefan.Manegold@cwi.nl a écrit :
Hi,
please bear with us. We're also only humans with 24 hours to spend per day. ;-)
Sure Stefan, there was no intent to be rude or look demanding. I'm conscious this is a FOSS project, and a very interesting one. It's just that returning unreliable results is rather embarrassing for a database :-)
I suppose this is a bug; looks like it might be a too eager attempt to (in this case, i.e., after the join, incorrectly) exploit the uniqueness of the primary key of table c when determining the groups.
Could you --- Franck or Daniel --- please file a bug report using Daniel's detailed example?
I'll let Daniel do this as he picked it up first ! Thanks, Franck
Hi, speaking of optimizer bugs, I'm afraid I've got at least one more (some lines removed for readability): sql>select count(distinct x) from a; 3 sql>select count(x) from a; 6 sql>select count(x), count(distinct x) from a; 6|6 The expected answer is "6|3". If everybody agrees, I'll file a bug? Daniel Franck Routier wrote:
Hi,
Le jeudi 05 mai 2011 à 16:18 +0200, Stefan.Manegold@cwi.nl a écrit :
Hi,
please bear with us. We're also only humans with 24 hours to spend per day. ;-)
Sure Stefan, there was no intent to be rude or look demanding. I'm conscious this is a FOSS project, and a very interesting one. It's just that returning unreliable results is rather embarrassing for a database :-)
I suppose this is a bug; looks like it might be a too eager attempt to (in this case, i.e., after the join, incorrectly) exploit the uniqueness of the primary key of table c when determining the groups.
Could you --- Franck or Daniel --- please file a bug report using Daniel's detailed example?
I'll let Daniel do this as he picked it up first !
Thanks,
Franck
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
On Thu, May 05, 2011 at 04:42:48PM +0200, Daniel Boesswetter wrote:
Hi,
speaking of optimizer bugs, I'm afraid I've got at least one more (some lines removed for readability):
sql>select count(distinct x) from a;
3
sql>select count(x) from a;
6
sql>select count(x), count(distinct x) from a;
6|6
The expected answer is "6|3".
"nice" catch ...
If everybody agrees, I'll file a bug?
yes please --- see also below ... ... indeed "embarrassing" ... :-( Stefan ======== sql>select count(name) from tables; +------+ | L17 | +======+ | 37 | +------+ 1 tuple (3.446ms) sql>select count(distinct name) from tables; +------+ | L20 | +======+ | 31 | +------+ 1 tuple (4.475ms) sql>select count(name) , count(distinct name) from tables; +------+------+ | L21 | L22 | +======+======+ | 37 | 37 | +------+------+ 1 tuple (4.061ms) sql>PLAN select count(name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count no nil (tables.name) as L23 ] | ^^^^^^^^^^^^ | ) [ L23 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.973ms) sql>PLAN select count(distinct name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count unique no nil (tables.name) as L24 ] | ^^^^^^!!!!!!^^^^^^^^ | ) [ L24 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.874ms) sql>PLAN select count(name) , count(distinct name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count no nil (tables.name) as L25, L25 as L26 ] | ^^^^^^^^^^^^ !!!!!!!!!! | ) [ L25, L26 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.351ms) ========
Daniel
Franck Routier wrote:
Hi,
Le jeudi 05 mai 2011 à 16:18 +0200, Stefan.Manegold@cwi.nl a écrit :
Hi,
please bear with us. We're also only humans with 24 hours to spend per day. ;-)
Sure Stefan, there was no intent to be rude or look demanding. I'm conscious this is a FOSS project, and a very interesting one. It's just that returning unreliable results is rather embarrassing for a database :-)
I suppose this is a bug; looks like it might be a too eager attempt to (in this case, i.e., after the join, incorrectly) exploit the uniqueness of the primary key of table c when determining the groups.
Could you --- Franck or Daniel --- please file a bug report using Daniel's detailed example?
I'll let Daniel do this as he picked it up first !
Thanks,
Franck
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Stefan.Manegold @ CWI.nl | DB Architectures (INS1) | | http://CWI.nl/~manegold/ | Science Park 123 (L321) | | Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Hi Stefan, I filed that one as well ... Why didn't you tell me that there's a PLAN command? ;-) I've been struggling with EXPLAIN and so far I did not find it very useful, but PLAN looks promising .... Best regards, Daniel Stefan.Manegold@cwi.nl wrote:
On Thu, May 05, 2011 at 04:42:48PM +0200, Daniel Boesswetter wrote:
Hi,
speaking of optimizer bugs, I'm afraid I've got at least one more (some lines removed for readability):
sql>select count(distinct x) from a;
3
sql>select count(x) from a;
6
sql>select count(x), count(distinct x) from a;
6|6
The expected answer is "6|3".
"nice" catch ...
If everybody agrees, I'll file a bug?
yes please --- see also below ...
... indeed "embarrassing" ... :-(
Stefan
======== sql>select count(name) from tables; +------+ | L17 | +======+ | 37 | +------+ 1 tuple (3.446ms) sql>select count(distinct name) from tables; +------+ | L20 | +======+ | 31 | +------+ 1 tuple (4.475ms) sql>select count(name) , count(distinct name) from tables; +------+------+ | L21 | L22 | +======+======+ | 37 | 37 | +------+------+ 1 tuple (4.061ms) sql>PLAN select count(name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count no nil (tables.name) as L23 ] | ^^^^^^^^^^^^ | ) [ L23 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.973ms) sql>PLAN select count(distinct name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count unique no nil (tables.name) as L24 ] | ^^^^^^!!!!!!^^^^^^^^ | ) [ L24 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.874ms) sql>PLAN select count(name) , count(distinct name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count no nil (tables.name) as L25, L25 as L26 ] | ^^^^^^^^^^^^ !!!!!!!!!! | ) [ L25, L26 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.351ms) ========
Daniel
Franck Routier wrote:
Hi,
Le jeudi 05 mai 2011 à 16:18 +0200, Stefan.Manegold@cwi.nl a écrit :
Hi,
please bear with us. We're also only humans with 24 hours to spend per day. ;-)
Sure Stefan, there was no intent to be rude or look demanding. I'm conscious this is a FOSS project, and a very interesting one. It's just that returning unreliable results is rather embarrassing for a database :-)
I suppose this is a bug; looks like it might be a too eager attempt to (in this case, i.e., after the join, incorrectly) exploit the uniqueness of the primary key of table c when determining the groups.
Could you --- Franck or Daniel --- please file a bug report using Daniel's detailed example?
I'll let Daniel do this as he picked it up first !
Thanks,
Franck
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
On Thu, May 05, 2011 at 05:10:18PM +0200, Daniel Boesswetter wrote:
Hi Stefan,
I filed that one as well ...
Thanks!
Why didn't you tell me that there's a PLAN command? ;-)
don't recall you asked ;-) I admit it's not mentioned on the web site (yet?) http://monetdb.cwi.nl/SQL/Documentation/SQL-Runtime-Features.html but it has been mentioned once in a while on this list ... anyway, we'll also try to improve documentation ... Stefan
I've been struggling with EXPLAIN and so far I did not find it very useful, but PLAN looks promising ....
Best regards, Daniel
Stefan.Manegold@cwi.nl wrote:
On Thu, May 05, 2011 at 04:42:48PM +0200, Daniel Boesswetter wrote:
Hi,
speaking of optimizer bugs, I'm afraid I've got at least one more (some lines removed for readability):
sql>select count(distinct x) from a;
3
sql>select count(x) from a;
6
sql>select count(x), count(distinct x) from a;
6|6
The expected answer is "6|3".
"nice" catch ...
If everybody agrees, I'll file a bug?
yes please --- see also below ...
... indeed "embarrassing" ... :-(
Stefan
======== sql>select count(name) from tables; +------+ | L17 | +======+ | 37 | +------+ 1 tuple (3.446ms) sql>select count(distinct name) from tables; +------+ | L20 | +======+ | 31 | +------+ 1 tuple (4.475ms) sql>select count(name) , count(distinct name) from tables; +------+------+ | L21 | L22 | +======+======+ | 37 | 37 | +------+------+ 1 tuple (4.061ms) sql>PLAN select count(name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count no nil (tables.name) as L23 ] | ^^^^^^^^^^^^ | ) [ L23 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.973ms) sql>PLAN select count(distinct name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count unique no nil (tables.name) as L24 ] | ^^^^^^!!!!!!^^^^^^^^ | ) [ L24 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.874ms) sql>PLAN select count(name) , count(distinct name) from tables; +------------------------------------------------------------------------------------------------------------------------------------------+ | rel | +==========================================================================================================================================+ | project ( | | | group by ( | | | | project ( | | | | | union ( | | | | | | project ( | | | | | | | select ( | | | | | | | | table(sys._tables) [ _tables.id as p.id, _tables.name as p.name, _tables.type as p.type, _tables.%TID% NOT NULL as p.%TID% ] | | | | | | | ) [ p.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ p.id, p.name, p.type ], | | | | | | project ( | | | | | | | select ( | | | | | | | | table(tmp._tables) [ _tables.id as t.id, _tables.name as t.name, _tables.type as t.type, _tables.%TID% NOT NULL as t.%TID% ] | | | | | | | ) [ t.type < convert(2 from tinyint to smallint) ] | | | | | | ) [ t.id, t.name, t.type ] | | | | | ) [ id as tables.id, name as tables.name, type as tables.type ] | | | | ) [ tables.id, tables.name ] | | | ) [ ] [ count no nil (tables.name) as L25, L25 as L26 ] | ^^^^^^^^^^^^ !!!!!!!!!! | ) [ L25, L26 ] | +------------------------------------------------------------------------------------------------------------------------------------------+ 18 tuples (0.351ms) ========
Daniel
Franck Routier wrote:
Hi,
Le jeudi 05 mai 2011 à 16:18 +0200, Stefan.Manegold@cwi.nl a écrit :
Hi,
please bear with us. We're also only humans with 24 hours to spend per day. ;-)
Sure Stefan, there was no intent to be rude or look demanding. I'm conscious this is a FOSS project, and a very interesting one. It's just that returning unreliable results is rather embarrassing for a database :-)
I suppose this is a bug; looks like it might be a too eager attempt to (in this case, i.e., after the join, incorrectly) exploit the uniqueness of the primary key of table c when determining the groups.
Could you --- Franck or Daniel --- please file a bug report using Daniel's detailed example?
I'll let Daniel do this as he picked it up first !
Thanks,
Franck
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Stefan.Manegold @ CWI.nl | DB Architectures (INS1) | | http://CWI.nl/~manegold/ | Science Park 123 (L321) | | Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
participants (4)
-
Daniel Boesswetter
-
Franck Routier
-
Michael Sioutis
-
Stefan.Manegold@cwi.nl