Hi all,
I reopen this thread about excessive slowness in query "SELECT id, count(distinct value) FROM table1 GROUP BY id;" (you can find more info in my mail below)
About your suggestion (thanks Stefan!), I can provide the below info.
The required statistics are the following ones:
select count(*) from table: 92.912.272 items
select count(id) from table: 92.912.272 items
select count(distinct id) from table: 63 items
select count(value) from table: 92.912.272 items
select count(distinct value) from table: 86.964.522 items
select count(*) from (select id,value from table group by id,value) as x: 86.964.522 items
The result of the TRACE command is:
sql>TRACE SELECT id, count(distinct value) FROM table1 GROUP BY id;
+--------------------------+---------+
| id | L3 |
+==========================+=========+
| 20160906 | 779489 |
| 20160907 | 1616739 |
| 20160908 | 1769753 |
| 20160909 | 1764169 |
| 20160910 | 668824 |
| 20160911 | 716343 |
| 20160912 | 1685183 |
| 20160913 | 1619177 |
| 20160914 | 1630866 |
| 20160915 | 1831826 |
| 20160916 | 1717702 |
| 20160917 | 638397 |
| 20160918 | 693979 |
| 20160919 | 1635602 |
| 20160920 | 1595859 |
| 20160921 | 1650712 |
| 20160922 | 1777228 |
| 20160923 | 1783591 |
| 20160924 | 652866 |
| 20160925 | 722674 |
| 20160926 | 1641260 |
| 20160927 | 1673879 |
| 20160928 | 1689016 |
| 20160929 | 1865369 |
| 20160930 | 1791773 |
| 20161001 | 654382 |
| 20161002 | 731022 |
| 20161003 | 1302974 |
| 20161004 | 1593157 |
| 20161005 | 1570264 |
| 20161006 | 1750299 |
| 20161007 | 1788563 |
| 20161008 | 590823 |
| 20161009 | 700178 |
| 20161010 | 1649577 |
| 20161011 | 1602492 |
| 20161012 | 1639719 |
| 20161013 | 1759836 |
| 20161014 | 1732986 |
| 20161015 | 641682 |
| 20161016 | 672841 |
| 20161017 | 1652638 |
| 20161018 | 1631122 |
| 20161019 | 1646245 |
| 20161020 | 1831482 |
| 20161021 | 1836734 |
| 20161022 | 618921 |
| 20161023 | 687258 |
| 20161024 | 1659389 |
| 20161025 | 1743597 |
| 20161026 | 1709957 |
| 20161027 | 1774661 |
| 20161028 | 1796815 |
| 20161029 | 665849 |
| 20161030 | 726861 |
| 20161031 | 1686187 |
| 20161101 | 1694782 |
| 20161102 | 1582901 |
| 20161103 | 1761407 |
| 20161104 | 1815731 |
| 20161105 | 629033 |
| 20161106 | 684745 |
| 20161107 | 1135136 |
+--------------------------+---------+
63 tuples (5m 25s)
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| usec | statement |
+==========+============================================================================================================================================================================================================================+
| 67 | X_1=0@0:void := querylog.define("trace select id, count(distinct value) from table1 group by id;":str, "default_pipe":str, 38:int); |
| 218 | X_40=
Hi all,
we're testing MonetDB performances in order to understand if it could be possible to use it as data mart in our BI software.
We're considering a set of queries executed in our test environment (CentOS 7, 32 GB RAM, 4 cpu).
All the executed queries have quite good performances (between 10 and 45 seconds) except the following one that is taking several minutes in order to be executed:
SELECT id, count(distinct value)
FROM table
GROUP BY id;
The table contains 100 millions of data. It has about 130 columns (varchar, int, bigint, smallint, tinyint, timestamp, datetime). We have no BLOB/CLOB/Text types.
The fields used in the query have these data types:
id: int
value: bigint
Can you suggest us any optimizations in order to have this query run faster? Do you know some possible reasons why it's so slow?
Kind Regards,
Alfio Bettini
The information contained in this message, together with any attachments, may be privileged and confidential and is intended for the use of the addressee(s) only. Any review, copying, use or distribution of this email by others (including any of its attachments) is strictly prohibited. If you are not the intended recipient, please destroy this email and its attachments and contact the sender immediately. This email was sent by EidosMedia SpA
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) | _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list The information contained in this message, together with any attachments, may be privileged and confidential and is intended for the use of the addressee(s) only. Any review, copying, use or distribution of this email by others (including any of its attachments) is strictly prohibited. If you are not the intended recipient, please destroy this email and its attachments and contact the sender immediately. This email was sent by EidosMedia SpAhttp://www.eidosmedia.com