Re: [Monetdb-developers] Single column select vs aggregation

18 Feb 2010


On Thu, Feb 18, 2010 at 07:40:15PM +0100, Stefan de Konink wrote:
...
On 18-02-10 19:22, Stefan Manegold wrote:
...
"It" (in fact we) choose to do a hash select, and since there is no hash
table yet, we need to build it, which is in fact more expensive than a
simple scan select for this very operation (later operations *might* then
benefit from the hash table ...):
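
To illustrate the trade-off described above, here is a minimal Python sketch. It is not the MonetDB code referred to in the thread. The function names (`scan_select`, `build_hash`, `hash_select`) are hypothetical and only model the idea: a scan select touches every value once, while a hash select first has to touch every value once to build the table (plus allocation and hashing) before the single cheap probe.

```python
def scan_select(column, value):
    """Equality select by a single pass over the column."""
    return [pos for pos, v in enumerate(column) if v == value]


def build_hash(column):
    """Build a value -> positions table; this also visits every element
    once, and additionally pays for hashing and memory allocation."""
    table = {}
    for pos, v in enumerate(column):
        table.setdefault(v, []).append(pos)
    return table


def hash_select(table, value):
    """Equality select by one probe into an already built hash table."""
    return table.get(value, [])


column = [412657690010, 7, 42, 7, 99]

# Both strategies return the same positions.
assert scan_select(column, 7) == hash_select(build_hash(column), 7)
```

For a single lookup, building the table costs at least as much as the scan it replaces. The hash only pays off if later lookups can reuse the same table.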
A question about that, then: if we build a hash on the fly, why isn't
it 'maintained' between queries (or does this depend on the chosen
pipeline)? The query doesn't seem to get faster when it is run
multiple times.
Because it is built on an intermediate result (base BAT plus delta BATs
applied) that is gone again after the query has been executed.

Stefan
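
The lifetime issue in this answer can be sketched as follows. This is a simplified Python model, not MonetDB internals. The function `run_query` and the representation of the deltas as a list of inserted values plus a set of deleted positions are assumptions made for illustration. The point it shows is that the hash is built on a per-query intermediate, so it disappears with that intermediate and each run pays the build cost again.

```python
def run_query(base, inserts, deleted_positions, value):
    """Model one query: apply deltas, hash the intermediate, probe once.

    Returns the matching positions and the number of values hashed.
    """
    # Intermediate result: base column with the deltas applied.
    intermediate = [v for pos, v in enumerate(base)
                    if pos not in deleted_positions] + list(inserts)

    # The hash table is built on the intermediate, not on the base column.
    table = {}
    for pos, v in enumerate(intermediate):
        table.setdefault(v, []).append(pos)

    # Both the intermediate and its hash table go out of scope on return,
    # so nothing is left over for the next query to reuse.
    return table.get(value, []), len(intermediate)


base = [10, 20, 30]
first = run_query(base, inserts=[20], deleted_positions={0}, value=20)
second = run_query(base, inserts=[20], deleted_positions={0}, value=20)

# The second run hashes just as many values as the first one did.
assert first == second
```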
...
...
For now, you can just locally disable/remove that alternative in the above code,
try again, and report the result.
Cold:
sql>select kvk from kvk where kvk = 412657690010;
+--------------+
| kvk          |
+==============+
| 412657690010 |
+--------------+
1 tuple
Timer    1174.737 msec 1 rows
Hot:
sql>select kvk from kvk where kvk = 412657690010;
+--------------+
| kvk          |
+==============+
| 412657690010 |
+--------------+
1 tuple
Timer      23.741 msec 1 rows
sql>select kvk from kvk where kvk = 412657690010;
Thanks for this 20x performance increase! (And it gets even better,
because numbers that don't exist are excluded in ~13 ms.)
Stefan
-- 
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4199       |