Dear Klem fra Nils and others,

on a Dual Core AMD Opteron(tm) Processor 270 @ 2 GHz, using an optimized (default) compilation of the latest CVS version of the February 2008 release (i.e., MonetDB Server v4.22.1 (based on GDK v1.22.1) + MonetDB/XQuery module v0.22.1), I get basically the same numbers as you do:

========
$ echo 'pf:add-doc(".../dblp.xml", "dblp.xml")' | mclient -lx -t
Trans 3.459 msec
Shred 0.000 msec
Query 2.332 msec
Update 58531.674 msec
Timer 58542.243 msec
========
$ mclient -lx -t < dblp_q7.xq
4
Trans 22.646 msec
Shred 3.885 msec
Query 17379.378 msec
Print 0.692 msec
Timer 17529.196 msec
========
$ mclient -lx -t < dblp_q7.xq
4
Trans 22.432 msec
Shred 3.073 msec
Query 15287.170 msec
Print 0.691 msec
Timer 15443.823 msec
========

However, I then started "playing" a bit with the way the query is phrased, hopefully without changing the semantics. Here's what I found:

0) your original query:
========
$ echo 'count(doc("dblp.xml")/dblp/*[author="Michael Stonebraker" and author="Hector Garcia-Molina" and year > 1950]/title)' | mclient -lx -t
4
Trans 22.409 msec
Shred 3.411 msec
Query 15285.472 msec
Print 0.786 msec
Timer 15413.006 msec
========

1) break up the conjunctive predicate into a sequence ("conjunction") of simple predicates:
========
$ echo 'count(doc("dblp.xml")/dblp/*[author="Michael Stonebraker"][author="Hector Garcia-Molina"][year > 1950]/title)' | mclient -lx -t
4
Trans 26.631 msec
Shred 3.399 msec
Query 6565.645 msec
Print 0.773 msec
Timer 6647.749 msec
========

2) use the new (not yet "official" in the Feb'08 release) "Algebra" back-end:
========
$ echo 'count(doc("dblp.xml")/dblp/*[author="Michael Stonebraker"][author="Hector Garcia-Molina"][year > 1950]/title)' | mclient -lx -t -G
<?xml version="1.0" encoding="utf-8"?>
<XQueryResult>4</XQueryResult>
Timer 4667.186 msec
========

3) trigger the use of the built-in (standard) text-value indices by explicitly using fn:text() (the "minor" change in semantics should not affect the result for this query + document):
========
$ echo 'count(doc("dblp.xml")/dblp/*[author/text()="Michael Stonebraker"][author/text()="Hector Garcia-Molina"][year > 1950]/title)' | mclient -lx -t -G
<?xml version="1.0" encoding="utf-8"?>
<XQueryResult>4</XQueryResult>
Timer 3334.295 msec
========

0) / 3) == 15413.006 msec / 3334.295 msec == factor 4.622568189 speedup!

Hence, it is now our task to see whether we can automate some of these hand-made optimizations:

2) The new "Algebra" back-end will be the default in the next release of MonetDB/XQuery.

1) We need to check why the conjunctive predicate (in this case?) is so much slower than the sequence ("conjunction") of simple predicates, and either fix the (translation of the) former or (try to) always rewrite the former into the latter.

3) We'll try to improve the automatic detection of cases/situations where the built-in (standard) text-value indices can be exploited.

Greetings from A'dam,
Stefan

On Fri, May 09, 2008 at 03:30:45PM +0200, Lefteris wrote:
Hi,
On my machine, which is very close to your setup (AMD 3800, 2 GB RAM, ~100 MB/s disk read), I get the following for a hot run of q7:
mclient -lxq --time < dblp_q7.xq
Trans 36.749 msec
Shred 6.051 msec
Query 44211.431 msec
Print 3.011 msec
Timer 44424.728 msec
which is almost 3 times your time. *But* I just used my current installation, which is built with debugging enabled and optimization disabled, so this difference in execution time is expected (you gain 30-40% with optimization enabled). So I would say that your reported times are OK.
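As an illustration of the comparison above, here is a small Python sketch that parses the "label value msec" pairs printed by mclient and computes the ratio between the two warm runs. The helper name parse_timings is made up for this example, and the sketch assumes the timing output always has the "<label> <number> msec" shape seen in this thread.

```python
import re

# Matches pairs such as "Query 44211.431 msec" in mclient's timing output.
TIMING_RE = re.compile(r"([A-Za-z]+)\s+([0-9]+(?:\.[0-9]+)?)\s+msec")


def parse_timings(text):
    """Return {label: milliseconds} for every '<label> <number> msec' pair."""
    return {label: float(value) for label, value in TIMING_RE.findall(text)}


# Timing output of the debug build quoted above.
debug_run = parse_timings(
    "Trans 36.749 msec Shred 6.051 msec Query 44211.431 msec "
    "Print 3.011 msec Timer 44424.728 msec"
)
# Timing output Nils reported for the same query.
nils_run = parse_timings("Timer 15512.186 msec")

ratio = debug_run["Timer"] / nils_run["Timer"]
print(f"debug build / Nils's run: {ratio:.2f}x")
```

On these two numbers the ratio comes out at roughly 2.86, which matches the "almost 3 times" estimate.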
There are no additional indices that you can ask for. MonetDB maintains the indices it needs by itself.
Hope this was of some help.
regards,
lefteris
On Fri, May 9, 2008 at 11:28 AM, Nils Grimsmo wrote:
Using the "super-ball" from February.
I am running some queries, and just want to check if my performance is sane. Are there more indexes I can ask to get built to increase performance?
I don't ask for an accurate verification, just a hint of whether or not we are within an order of magnitude of what seems right :)
$ Mserver --dbinit="module(pathfinder);" --set mapi_port=60000
$ echo 'pf:add-doc("path/dblp.xml", "dblp.xml")' | mclient -lx -p60000 --time
Timer 73068.138 msec
$ echo 'pf:documents()' | mclient -lx -p60000
<document updatable="false" url="path/dblp.xml" collection="dblp.xml">dblp.xml</document>
$ cat dblp_q7.xq
count(doc("dblp.xml")/dblp/*[author="Michael Stonebraker" and author="Hector Garcia-Molina" and year > 1950 ]/title)
$ mclient -lx -p 60000 --time < dblp_q7.xq
4
Timer 15512.186 msec
(Repeated multiple times for warm-up)
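The reply at the top of this thread splits the conjunctive predicate of this query by hand into a chain of simple predicates. A rough Python sketch of that rewrite as a plain string transformation follows. It is not how MonetDB/XQuery translates queries, and the function names split_conjunction and chain_predicates are invented for this example. It assumes the conjuncts are separated by " and " with single spaces, and that none of them is positional (a number, or a use of position() or last()), since only then do [a and b] and [a][b] select the same nodes.

```python
def split_conjunction(predicate):
    """Split 'a and b and c' on top-level ' and ' into ['a', 'b', 'c'].

    Text inside string literals, parentheses, or nested brackets is left alone.
    """
    parts, current = [], []
    depth = 0     # nesting depth of () and []
    quote = None  # the quote character while inside a string literal
    i = 0
    while i < len(predicate):
        ch = predicate[i]
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif depth == 0 and predicate.startswith(" and ", i):
            # Top-level conjunction found: close the current conjunct.
            parts.append("".join(current).strip())
            current = []
            i += len(" and ")
            continue
        current.append(ch)
        i += 1
    parts.append("".join(current).strip())
    return parts


def chain_predicates(predicate):
    """Turn 'a and b' into '[a][b]'."""
    return "".join(f"[{part}]" for part in split_conjunction(predicate))


# The predicate body of dblp_q7.xq, without the enclosing brackets.
original = ('author="Michael Stonebraker" and '
            'author="Hector Garcia-Molina" and year > 1950 ')
print(chain_predicates(original))
```

For the q7 predicate this prints the chained form used in variant 1) of the reply.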
My dblp.xml is 441 MB. Mserver uses 1.9 of 3.4 GB memory.
$ cat /proc/cpuinfo
<snip>
model name : AMD Athlon(tm) 64 Processor 3500+
cpu MHz : 2210.763
cache size : 512 KB
The dbfarm resides on a WD740GD (http://techreport.com/articles.x/6390).
Klem fra Nils
------------------------------------------------------------------------- This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javao... _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |
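
For reference, the "Timer" values of the four query variants 0) to 3) in Stefan's message can be put side by side with a few lines of Python. The numbers are copied from the thread; the dictionary labels are shorthand chosen for this sketch, not MonetDB terminology.

```python
# Total "Timer" values (msec) reported for the four variants of q7.
timer_msec = {
    "0) original conjunctive predicate": 15413.006,
    "1) chain of simple predicates": 6647.749,
    "2) chain + Algebra back-end (-G)": 4667.186,
    "3) chain + Algebra + text() predicates": 3334.295,
}

baseline = timer_msec["0) original conjunctive predicate"]

# Speedup of each variant relative to the original query.
speedups = {label: baseline / msec for label, msec in timer_msec.items()}

for label, factor in speedups.items():
    print(f"{label:<40} {timer_msec[label]:>10.3f} msec  {factor:5.2f}x")
```

The last row reproduces the factor of about 4.62 quoted for 0) / 3); the two intermediate steps come out at roughly 2.3x and 3.3x.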