Re: Very different query speeds for two similar queries

29 Nov 2013


      On 29/11/13 11:36, Will Muldrew wrote:
...
Hi
I've been trying out monetdb on a 350M row table (modern linux
workstation with 24GB RAM + SSD).  This table has a sorted TIMESTAMP
column and a second unsorted VARCHAR(6) column which is non-unique
(perhaps 50M distinct values fairly randomly distributed).
This query is very slow, spends huge amount of time writing, and has
very low CPU (intermediate results?):
select count(*) from t_order where orderId = 'XXXXXX'
This takes hours to complete.
However, this query is a lot faster, even though the timestamp clause is
redundant (all my data is > '2012-01-01'):
It decided it can use parallel processing effectively.
Why it did not in the first query is unknown.
...
select count(*) from t_order where orderId = 'XXXXXX' and time >
'1970-01-01';
The first run is slowish (10s of seconds - I guess because we're reading
the orderId column fully), however subsequent runs are sub-second.
Yes, that is caching effect.
...
Obviously I could write an app which puts the spurious timestamp clause
into the where, but it's still confusing and not ideal.  Is there
something I can do?
regards, Martin

Re: Very different query speeds for two similar queries

Martin Kersten