Hi, as you can see, the majority of the time is spent in the "count(distinct userid)". All other operations are rather quick. There is no tuning option. The "count(distinct userid)" need to build a hash table to find all unique values, and that takes time. Only option for improvement is to find a better / more efficient implementation of "distinct" in particular or hash tables in general. Best, Stefan ----- On Nov 7, 2016, at 9:54 AM, Rajendra Bhat rajhalkere@gmail.com wrote:
Hi,
My table has 308270423 records..
I am using MonetDB 5 server v11.23.13 "Jun2016-SP2" (64-bit, 64-bit oids, 128-bit integers).
Monet running on Centos6 AWS machine. Please find the attached table schema.
Please find the below request query and and there results.
Query record count time Select count(*) from historical; 308270423 13.291ms Select count(userid) from historical; 308270423 4 m.8s Select count(distinct userid) from historical; 14363134 13 min 1 sec Select count(*) from historical where gender='men'; 114971336 2.9 sec Select count(*) from historical where brand='puma'; 9253872 13.5 sec Select count(*) from historical where gender='men' and brand='puma'; 4254774 4.6s Select count(userid) from historical where gender='men' and brand='puma'; 4254764 4.6 s Select count(distinct userid) from historical where gender='men' and brand='puma'; 2780272 2m.58s
Query time select sum(mrp)/sum(noOfItems) from historical where gender='men' and brand='puma'; 26.3s select sum(mrp)/sum(orderCount) from historical where gender='men' and brand='puma'; 11.2s select sum(revenue) from historical where gender='men' and brand='puma'; 10.9 s select sum(orderCount) from historical where gender='men' and brand='puma'; 5.0 s select sum(noOfItems) from historical where gender='men' and brand='puma'; 4.0 s select sum(returnCount) from historical where gender='men' and brand='puma';10.8 s select sum(rtoCount) from historical where gender='men' and brand='puma'; 9.9s select sum(0) from historical where gender='men' and brand='puma'; 4.6 s select sum(tDiscountPer) from historical where gender='men' and brand='puma'; 31.6s select sum(cDiscountPer) from historical where gender='men' and brand='puma'; 24.7s
On Mon, Nov 7, 2016 at 1:39 PM, Stefan Manegold < Stefan.Manegold@cwi.nl > wrote:
Hi,
which version of MonetDB are you running on which OS?
How many tuples (rows) does your table have?
Of what type are the columns used in your query?
Could you run the query twice and compare/report the timings?
Could you run the following queries and report their result and runtime?
Select count(*) from historical; | 308270423 | +-----------+ 1 tuple (13.291ms)
Select count(userid) from historical; Select count(distinct userid) from historical; Select count(*) from historical where gender='men'; Select count(*) from historical where brand='puma'; Select count(*) from historical where gender='men' and brand='puma'; Select count(userid) from historical where gender='men' and brand='puma'; Select count(distinct userid) from historical where gender='men' and brand='puma';
Could you run the following queries and report their runtime?
select sum(mrp)/sum(noOfItems) from historical where gender='men' and brand='puma'; select sum(mrp)/sum(orderCount) from historical where gender='men' and brand='puma'; select sum(revenue) from historical where gender='men' and brand='puma'; select sum(orderCount) from historical where gender='men' and brand='puma'; select sum(noOfItems) from historical where gender='men' and brand='puma'; select sum(returnCount) from historical where gender='men' and brand='puma'; select sum(rtoCount) from historical where gender='men' and brand='puma'; select sum(0) from historical where gender='men' and brand='puma'; select sum(tDiscountPer) from historical where gender='men' and brand='puma'; select sum(cDiscountPer) from historical where gender='men' and brand='puma';
Best, Stefan
----- On Nov 7, 2016, at 8:14 AM, Rajendra Bhat rajhalkere@gmail.com wrote:
Hi team,
I am running monetdb on 16 core processor and 30gb ram.
I have table around 54 fields. gzip compressed csv size is around 6.5 GB.
query will take around 6 min. please suggest tuning options.
query: Select count(distinct userid),sum(mrp)/sum(noOfItems),sum(mrp)/sum(orderCount),sum(revenue),sum(orderCount),sumnoOfItems),sum(returnCount),sum(rtoCount),sum(0),sum(0),sum(tDiscountPer),sum(cDiscountPer) from historical where gender='men' and brand='puma' Thanks and Regards
Rajendra Bhat
-- Thanks and Regards
Rajendra Bhat
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) | _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Thanks and Regards
Rajendra Bhat
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |