On Thu, Feb 18, 2010 at 07:40:15PM +0100, Stefan de Konink wrote:
On 18-02-10 19:22, Stefan Manegold wrote:
"It" (in fact, we) choose to do a hash select, and since there is no hash table yet, we need to build it, which is in fact more expensive than a simple scan select for this very operation (later operations *might* then benefit from the hash table ...):
A question about that, then: if we build a hash on the fly, why isn't it 'maintained' between queries (or does this depend on the chosen pipeline)? The query doesn't seem to get faster when I run it multiple times.
Because it is built on an intermediate result (the base BAT with the delta BATs applied) that is gone again after the query has been executed.

Stefan
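To make the trade-off concrete, here is a minimal Python sketch. It is an illustration only, not MonetDB's implementation and not the code this thread refers to; the names `scan_select`, `build_hash`, `hash_select` and `run_query`, and the simplified delta handling, are all hypothetical. It shows why a hash built on a per-query intermediate result helps nothing across queries: the intermediate and its hash table are recreated and dropped on every call, so each query pays the build cost on top of a single probe.

```python
from collections import defaultdict


def scan_select(column, value):
    """Single pass over the column; no auxiliary structure is built."""
    return [pos for pos, v in enumerate(column) if v == value]


def build_hash(column):
    """Map each value to its positions; also a full pass, plus allocation."""
    table = defaultdict(list)
    for pos, v in enumerate(column):
        table[v].append(pos)
    return table


def hash_select(table, value):
    """Probe an existing hash table; cheap once the table exists."""
    return table.get(value, [])


def run_query(base, inserts, deleted_positions, value):
    """Hypothetical per-query flow: apply deltas, hash, probe once."""
    # Assumption: the intermediate is the base column minus deleted
    # positions plus pending inserts. It exists only during this call.
    intermediate = [v for pos, v in enumerate(base)
                    if pos not in deleted_positions] + list(inserts)
    table = build_hash(intermediate)  # build cost is paid on every call
    hits = hash_select(table, value)
    # 'intermediate' and 'table' go out of scope on return, so the next
    # query cannot reuse the hash table.
    return [intermediate[pos] for pos in hits]
```

For a single equality lookup, `scan_select` does strictly less work than `build_hash` followed by `hash_select`; the hash only pays off if it survives long enough to be probed more than once.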
For now, you can just locally disable/remove that alternative in the above code, try again, and report the result.
Cold:
sql>select kvk from kvk where kvk = 412657690010;
+--------------+
| kvk          |
+==============+
| 412657690010 |
+--------------+
1 tuple
Timer 1174.737 msec 1 rows
Hot:
sql>select kvk from kvk where kvk = 412657690010;
+--------------+
| kvk          |
+==============+
| 412657690010 |
+--------------+
1 tuple
Timer 23.741 msec 1 rows
sql>select kvk from kvk where kvk = 412657690010;
Thanks for this 20x performance increase! (And it gets even better, because numbers that don't exist are excluded in ~13 ms.)
Stefan
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4199       |