On 5 Dec 2013 17:05, "Stefan Manegold" <Stefan.Manegold@cwi.nl> wrote:

hm, that's with the index in place?

If so:
good for our optimizer,
but unfortunately not so good for you ...

If not:
- good for you!
- impressive for our "execution engine" (scan)
(- "embarrassing" for our optimizer ... well, it's a minor detail ...)

;-)

Stefan

----- Original Message -----
> Sorry!
>
> sql>select count(*) from t_order2 where orderId between 'AAAAAA' and
> 'AAAAAA';
> +------+
> | L1 |
> +======+
> | 0 |
> +------+
> 1 tuple (596.202ms)
>
>
>
> On 5 December 2013 16:49, Stefan Manegold < Stefan.Manegold@cwi.nl > wrote:
>
>
> Hi Will,
>
> thank for reminding me: the optimizer still sees your query as point query
> and thus prevents parallelization in order to build/use the hash index.
>
> Maybe you can "fool" the optimizer (and I hope noone else is reading what I'm
> writing now ... ;-)) by "disguising" the point predicate as range predicate,
> say, "where orderId between 'FOO' and 'FOO'" --- in fact, I rather hope (for
> us)
> that is will not work ...
>
> Stefan
>
> ----- Original Message -----
> > Hi Stefan
> >
> > I've added that hash = FALSE patch, and now my simple query returns
> > reliably
> > in 6 seconds. It's still a lot slower than the indexed one (obviously), and
> > 12x slower than the one with the time > '1970-01-01' optimiser nudge (which
> > causes parallelism), but it'll do for now.
> >
> > Thanks
> >
> > -Will
> >
> >
> > On 5 December 2013 16:32, Will Muldrew < will.muldrew@gmail.com > wrote:
> >
> >
> >
> > Hi Stefan
> >
> > Here are my version details:
> >
> > MonetDB-11.15.17
> > CentOS release 6.3
> > kernel 3.6.2
> >
> > I think I might proceed with a patched version with the hash disabled.
> > That's
> > totally fine for me right now, though obviously it'd be nice to be on the
> > trunk!
> >
> > W.r.t. a less random access hash generation. This is all very hand-wavey
> > but
> > how about: 1) generate a sequence of (hash,oid) pairs to some file or
> > mmapped area. 2). use some well-known "external sort" algo to sort these
> > pairs by hash (e.g. split, qsort, merge). 3). build your final hash by
> > iterating through this sorted datastructure.
> >
> > Your memory writing would then be sequential and you'd get good page hit
> > characteristics. You just need to choose a decent sort algo - ideally which
> > works transparently over memmapped data.
> >
> > What do you think? I might even have a crack at it myself :)
> >
> > -Will
> >
> >
> >
> >
> >
> >
> >
> >
> > On 5 December 2013 15:54, Stefan Manegold < Stefan.Manegold@cwi.nl > wrote:
> >
> >
> > Hi Will,
> >
> > could you please also share with us which version of MonetDB you are using,
> > and OS you are running?
> >
> > Pleass find more comments & questions inlined below.
> >
> > ----- Original Message -----
> > > Hi
> > >
> > > I've been investigating very slow queries on my 350M row dataset when
> > > selecting for equality on a non-unique string id. e.g.
> > >
> > > select count(*) from t_order where orderId = 'FOO';
> >
> > Is your orderId column unique, or are there duplicate values?
> >
> > > This query takes ~4.5 hours to build a thash index. Subsequent repeats
> > > hit
> > > the cache and are sub milli.
> > >
> > > select count(*) from t_order where time > '1970-01-01' and orderId =
> > > 'FOO'
> > >
> > > This second one is consistently finished in a few seconds or less, and
> > > doesn't create any indexes.
> > >
> > > The first query is a bit of a show-stopper. It spends a huge about of
> > > time
> > > in
> > > BAThash (gdk_search.c), and incurs a lot of page faults while reading and
> > > writing at random across a multi GB mmaped thash file. Additionally these
> > > page faults are hampered by the OS trying to write dirty pages out in the
> > > background. I've got plenty of RAM overall (130GB), but the 'active' page
> > > proportion for the thash file seems to be stuck at about 25% - giving me
> > > a
> > > 0.75 probability of a miss.
> > >
> > > * Is there some way to make my OS less keen to evict pages? (swappiness =
> > > 0
> > > already) I should have plenty of room to have the whole thash file
> > > resident.
> > > Additionally, allowing it to sit on dirty pages longer would reduce the
> > > total write IO.
> >
> > In fact, if there is sufficient memory in total, there would be no need to
> > flush
> > dirty (non-persistent) pages at all, but we have not yet found a good way
> > to
> > hit
> > this to the OS --- madvice() did not have much/any useful efficet for us on
> > Linux.
> >
> > Maybe some of the issues dicsussed here play a role:
> > http://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases
> > ?
> >
> > While we have not yet encountered the NUMA problem reported there,
> > we have encountered the transparent huge page problems, and usually
> > disable them as also suggested in the above posting.
> >
> > > * Is there some way to simply prevent large thash creation? For many
> > > applications I'd rather have slower consistent queries, then incur a
> > > massively slow query following a restart (not great in a prod
> > > environment!)
> >
> > Not easily right now.
> > However, you can either add the "fake" time predicate as you did above,
> > which (in this case) triggers MonetDB's optimizer to push the time
> > predicate
> > (either because it's in timestamp (int) rather then varchar (string), or
> > because the time column happens to be ordered) below the orderId predicate
> > and this avoid the hash build,
> > or change the following line(s) in gdk/gdk_select.c (in case you built
> > MonetDB yourself from source):
> >
> > hash = b->batPersistence == PERSISTENT &&
> > (size_t) ATOMsize(b->ttype) > sizeof(BUN) / 4 &&
> > BATcount(b) * (ATOMsize(b->ttype) + 2 * sizeof(BUN)) < GDK_mem_maxsize / 2;
> > ->
> > hash = FALSE;
> >
> > We'll consider whether we can provide a better solution in a future
> > release.
> >
> > > * Is there some way to generate the hash index with a more sympathetic
> > > algo
> > > that doesn't degrade so steeply. e.g some hash + sort. (I have literally
> > > no
> > > experience with this!)
> >
> > We'll also look into this. However, building a hash index inherently incurs
> > random access, so chances are slim ...
> >
> > Best,
> > Stefan
> >
> > > -Will Muldrew
> > >
> > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > users-list mailing list
> > > users-list@monetdb.org
> > > https://www.monetdb.org/mailman/listinfo/users-list
> > >
> >
> > --
> > | Stefan.Manegold@CWI.nl | DB Architectures (DA) |
> > | www.CWI.nl/~manegold/ | Science Park 123 (L321) |
> > | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
> >
> > _______________________________________________
> > users-list mailing list
> > users-list@monetdb.org
> > https://www.monetdb.org/mailman/listinfo/users-list
> >
> >
> >
> > _______________________________________________
> > users-list mailing list
> > users-list@monetdb.org
> > https://www.monetdb.org/mailman/listinfo/users-list
> >
>
> --
> | Stefan.Manegold@CWI.nl | DB Architectures (DA) |
> | www.CWI.nl/~manegold/ | Science Park 123 (L321) |
> | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
>
> _______________________________________________
> users-list mailing list
> users-list@monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
>
>
> _______________________________________________
> users-list mailing list
> users-list@monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
>

--
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |

_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list