On Thu, Feb 18, 2010 at 07:05:47PM +0100, Stefan de Konink wrote:
Op 18-02-10 18:23, Stefan Manegold schreef:
On Thu, Feb 18, 2010 at 05:46:31PM +0100, Stefan de Konink wrote:
Attached is a trace of a single column select operation, with a max, avg and equality.
The equality takes about 4 times more time than the avg operation.
From the plan I see that the operation is executed paralel, but I do wonder why suddenly the uselect takes such a high amount of time. Which version of MonetDB? CVS HEAD, 2 am.
How configured / compiled? --enable-assert --enable-debug
How was the mserver5 started (which command line options were given)? /opt/monetdb-head/bin/mserver5 --dbfarm=/home/skinkie/monetb --dbname=kvk --dbinit="include sphinx; include sql;" --set mapi_port=50001
Any changes compared to the default monetdb5.conf? No, not changed anything in there.
Is the "kvk" column sorted? No, it does include a partial sort (msb are sorted). A 32bit column (kvks) that is (or either should be...) ends up with between 170ms ~ 220ms.
Could you run the select query in the MAL debugger (DEBUG select kvk from kvk where kvk = 412657690010;), single step through it, set the debug mask to 2097152 (debug 2097152) just before the algebra.uselect() is executed, and report the output on the server console after the algebra.uselect() has been executed? Do you mean this?
indeed
mdb># _44 := algebra.uselect(_40=
:bat[:oid,:lng][1196532],A0=412657690010:lng); mdb> #BAT_select_(b=tmp_5710): sampling: tmp1 = BATslice(b=tmp_5710, _lo=598266, _hi=598371); #BAT_select_(b=tmp_5710): sampling: tmp2 = BAT_select_(tmp1=tmp_5704, tl, th, tail); #BAT_select_(b=tmp_5704): BAT_scanselect(b=tmp_5704, bn=tmp_5703, tl, th, equi=1, nequi=0, lval=1, hval=1, nocheck=1); #seqscan_eq_lng_void_tloc_oid_vid_nocheck_noinc[if ( simple_EQ(tl ,v,lng) ),v,oid_ctr,oid_ctr++;](b=tmp_5704, bn=tmp_5703, tl, th, oid_ctr=598266, str_idx=0); #BAT_select_(b=tmp_5704): tmp_5703: hkey=1, tkey=0, hsorted=139659451564097, tsorted=0. #BAT_select_(b=tmp_5710): BAT_hashselect(b=tmp_5710, bn=tmp_5703, tl); (building hash-table on the fly) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ "It" (in fact we) choose to do a hash select, and since there is no hash table, yet, we need to build it, which is infact more expensive than a simple scan select for this very operation (later operation *might* then benefit from the hash table ...): ======== $ grep -9n --color 'building hash-table on the fly' MonetDB/src/gdk/gdk_batop.mx 1210- int nocheck = (estimate >= batcnt); 1211- 1212- if (!preserve_order && equi && b->T->hash) { 1213- ALGODEBUG THRprintf(GDKout, "#BAT_select_(b=%s): BAT_hashselect(b=%s, bn=%s, tl); (using existing hash-table)\n", BATgetId(b), BATgetId(b), BATgetId(bn)); 1214- 1215- bn = BAT_hashselect(b, bn, tl); 1216- } else if (!preserve_order && equi && ATOMsize(b->ttype) > 1 && estimate * 100 < batcnt && batcnt * 2 * sizeof(int) < (GDK_mem_maxsize / 4)) { 1217- /* Build a hash-table on the fly for equi-select if the selectivity is low 1218- * and it is not too big */ 1219: ALGODEBUG THRprintf(GDKout, "#BAT_select_(b=%s): BAT_hashselect(b=%s, bn=%s, tl); (building hash-table on the fly)\n", BATgetId(b), BATgetId(b), BATgetId(bn)); 1220- 1221- bn = BAT_hashselect(b, bn, tl); 1222- } else { 1223- ALGODEBUG THRprintf(GDKout, "#BAT_select_(b=%s): BAT_scanselect(b=%s, bn=%s, tl, th, equi=%d, nequi=%d, lval=%d, hval=%d, nocheck=%d);\n", BATgetId(b), BATgetId(b), BATgetId(bn), equi, nequi, lval, hval, nocheck); 1224- 1225- bn = BAT_scanselect(b, bn, tl, th, li, hi, equi, nequi, lval, hval, nocheck); 1226- } 1227- } 1228- if (bn == NULL) { ========
In fact, I doubt whether investing in building a hash table is a good decision in such cases where we do/can not know whether it will ever pay off ...
Hence, I'd propose to simply drop the choise to build a hash table on the fly, and rather fall through to the basic scan select also in this (rare?) case. I support it. It might pay off to build the hash when the column is key, though, because
Stefan Manegold wrote: that is a good indicator for future point select. Provided the table is persistent and we can predict it to be used.
For now, you can just locally disable/remove that alternative in the above code, try again, and report the result.
Stefan
#BAThash: create hash(1196532); #BAT_select_(b=tmp_5710): tmp_5703: hkey=1, tkey=0, hsorted=139659451564032, tsorted=139659451564032.