[Monetdb-developers] Bug report: Poor query performance
Hi, I have a database with 52 columns and 7 million rows. The columns are Date (timestamp data type), Customer (real data type) and a bunch of other columns that indicate values for each customer on each date (all of these columns are of real data type). The 'Date' column contains 5 unique values. The 'Customer' column contains 1.7 million unique values. 1. The following query returns instantly: SELECT "CUSTOMER" FROM "TABLE". 2. The following query takes a long time (stopped after a few minutes): SELECT "CUSTOMER" FROM "TABLE" GROUP BY "CUSTOMER". 3. The following query returns instantly: SELECT "DATE", "CUSTOMER" FROM "TABLE" GROUP BY "DATE", "CUSTOMER". 4. The following query takes a long time (stopped after a few minutes): SELECT "CUSTOMER", "DATE" from "TABLE" GROUP BY "CUSTOMER", "DATE". * Note that the only difference between query 1 and query 2 are the addition of a group by clause to query 2. * Note that the only difference between query 3 and query 4 are the order of columns in the select and group by clauses. Is there a way to work around this? I'm really stuck. Thanks in advance, Elad _________________________________________________________________ Talk to your Yahoo! Friends via Windows Live Messenger. Find out how. http://www.windowslive.com/explore/messenger?ocid=TXT_TAGLM_WL_messenger_yah...
Elad . wrote:
Hi,
I have a database with 52 columns and 7 million rows.
dear Elad, Thanks for using MonetDB. The problem you describe is known. No easy solution exists, because the front-end optimizer does not use, nor rely on table statistics to derive the best plan. A solution will emerge, but don't hold your breath regards, Martin
The columns are Date (timestamp data type), Customer (real data type) and a bunch of other columns that indicate values for each customer on each date (all of these columns are of real data type).
The 'Date' column contains 5 unique values.
The 'Customer' column contains 1.7 million unique values.
1. The following query returns instantly: SELECT "CUSTOMER" FROM "TABLE".
2. The following query takes a long time (stopped after a few minutes): SELECT "CUSTOMER" FROM "TABLE" GROUP BY "CUSTOMER".
3. The following query returns instantly: SELECT "DATE", "CUSTOMER" FROM "TABLE" GROUP BY "DATE", "CUSTOMER".
4. The following query takes a long time (stopped after a few minutes): SELECT "CUSTOMER", "DATE" from "TABLE" GROUP BY "CUSTOMER", "DATE".
* Note that the only difference between query 1 and query 2 are the addition of a group by clause to query 2.
* Note that the only difference between query 3 and query 4 are the order of columns in the select and group by clauses.
Is there a way to work around this? I'm really stuck.
Thanks in advance,
Elad
------------------------------------------------------------------------ Talk to your Yahoo! Friends via Windows Live Messenger. Find Out How http://www.windowslive.com/explore/messenger?ocid=TXT_TAGLM_WL_messenger_yah...
------------------------------------------------------------------------
------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ ------------------------------------------------------------------------
_______________________________________________ Monetdb-developers mailing list Monetdb-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-developers
participants (2)
-
Elad .
-
Martin Kersten