Re: [Monetdb-developers] [MonetDB-users] too deep recursion

10 Mar 2009

      On Mar 9, 2009, at 23:35, Martin Kersten wrote:
...
Please run it also against the HEAD, because most of the problems
may have been resolved there.
Jan Rittinger wrote:
...
Hi Martin and others,
I just tested what part the Pathfinder code generation plays and  
generated MIL code for the Aug2008 (0.24), the Nov2008, and the  
Feb2009 release branches. I ran all queries using the newest stable  
version (Feb2009) on Mac OS X.
The observations are:
* The problem with gdk_heap.mx, mmap, and Mac OS X still persists
(all queries run in 10 seconds instead of 2 seconds); Peter knows
what I'm talking about.
* As Nils reported, the queries are getting slower.
* The main performance decrease in my scenario is the document  
loading.
* The problem does not stem from Pathfinder's MIL code generation.
For more details see the attached file...
------------------------------------------------------------------------
BTW: For today's HEAD version the results are even worse...
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attached you find a bundle with the queries and the test results.

(I'm using a MacBook Pro 2.6GHz Intel Core 2 Duo with 4GB of RAM
running Mac OS X 10.5.6.)
...
...
Jan
On Mar 9, 2009, at 18:08, Martin Kersten wrote:
...
For all interested. Indeed there are performance differences
between the various releases. Some can be traced back to
functional enhancements, others are a result of internal
administrative activities.
Recent experiments with the TPC-H scale-factor 2 on the Feb 2009
branch show a performance degradation compared to Aug 2008,
as reported on the website.
It appears that some low-level actions related to allocation
of BATs and their management in memory-scarce situations are
to blame for this situation.
Solutions are integrated into the HEAD, and may (depending
on our resources) be backported into a bugfix release
of the Feb 2009 version.
Nils Grimsmo wrote:
...
On Wed, Mar 04, 2009 at 11:08:40PM +0100, Jan Rittinger wrote:
...
Hi Nils,
I just ran your queries with the latest (not yet announced) Feb2009
release (http://monetdb.cwi.nl/downloads/sources/Feb2009/) and
received an answer in 1.5 (Q1) and 2.5 (Q2) seconds. If you still have
problems with the new version, then please let us know.
Thank you for your answer, Jan.  Feb2009 is indeed faster than Nov2008,
but on my computer it is still slower than Aug2008.  I also see some
strange and unfavorable performance characteristics on subsequent queries
for Nov2008 and Feb2009 (see below).
Aug2008:
# MonetDB Server v4.24.0
# based on GDK   v1.24.0
# PF/Tijah module v0.5.0 loaded. http://dbappl.cs.utwente.nl/pftijah
# MonetDB/XQuery module v0.24.0 loaded (default back-end is 'algebra')
Nov2008-SP2:
# MonetDB Server v4.26.4
# based on GDK   v1.26.4
# PF/Tijah module v0.9.0 loaded. http://dbappl.cs.utwente.nl/pftijah
# MonetDB/XQuery module v0.26.4 loaded (default back-end is 'algebra')
Feb2009:
# MonetDB Server v4.28.0
# Based on GDK   v1.28.0
# PF/Tijah module v0.9.0 loaded. http://dbappl.cs.utwente.nl/pftijah
# MonetDB/XQuery module v0.28.0 loaded (default back-end is 'algebra')
I run the queries multiple times in different scenarios.
A - Have just indexed the document, first run.
B - Second run (subsequent have similar timing).
C - Restart the server (Mserver), then first run.
D - Second run (subsequent have similar timing).
Query Q0:
     Aug2008    Nov2008    Feb2009
A       1101       3687       1760
B       1031       4510       3015
C       1350       5216       3390
D       1035      12620       9533
Query Q1:
     Aug2008    Nov2008    Feb2009
A       2161      15119       3013
B       2099      19292       4072
C       2526      18523       4567
D       2117      42555      10602
This seems very strange to me.  The timings make sense for Aug2008, where
the query is slightly slower right after restarting the server (C).  For
Nov2008 and Feb2009, the second (and subsequent) runs are slower than the
first.  How can this be?  It can make sense for the first run after
restarting the server (C) to be slower (reading stuff from disk etc.), but
why is the second (D) terribly slower?  If I just keep running the query,
the timings are similar to D.
Note:  If I start mixing Q0 and Q1 after step D, they are both as slow as
in step D.
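
To put numbers on the slowdown described above, the timings can be
expressed as factors relative to Aug2008. A minimal sketch in Python: the
values are copied from the two tables, the unit is assumed to be
milliseconds (as in the original report), and the names TIMINGS and
slowdown are made up for this illustration.

```python
# Timings copied from the Q0 and Q1 tables (assumed to be milliseconds).
# Scenarios: A = first run after indexing, B = second run,
#            C = first run after server restart, D = second run after restart.
TIMINGS = {
    "Q0": {
        "A": {"Aug2008": 1101, "Nov2008": 3687, "Feb2009": 1760},
        "B": {"Aug2008": 1031, "Nov2008": 4510, "Feb2009": 3015},
        "C": {"Aug2008": 1350, "Nov2008": 5216, "Feb2009": 3390},
        "D": {"Aug2008": 1035, "Nov2008": 12620, "Feb2009": 9533},
    },
    "Q1": {
        "A": {"Aug2008": 2161, "Nov2008": 15119, "Feb2009": 3013},
        "B": {"Aug2008": 2099, "Nov2008": 19292, "Feb2009": 4072},
        "C": {"Aug2008": 2526, "Nov2008": 18523, "Feb2009": 4567},
        "D": {"Aug2008": 2117, "Nov2008": 42555, "Feb2009": 10602},
    },
}


def slowdown(timings, baseline="Aug2008"):
    """Return each release's time divided by the baseline release's time."""
    factors = {}
    for query, scenarios in timings.items():
        for scenario, by_release in scenarios.items():
            base = by_release[baseline]
            for release, ms in by_release.items():
                if release != baseline:
                    factors[(query, scenario, release)] = ms / base
    return factors


# Print one line per (query, scenario, release) combination.
for (query, scenario, release), factor in sorted(slowdown(TIMINGS).items()):
    print(f"{query} {scenario} {release}: {factor:.1f}x")
```

For scenario D this gives roughly 12x (Nov2008) and 9x (Feb2009) for Q0,
and roughly 20x and 5x for Q1.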
I hope this feedback is helpful.  Is there something strange with my
setup, or is this a "bug"?  (My timings in step (A) seem similar to Jan's
timings).
If I want to compare MonetDB/XQuery to other implementations in a
scientific paper, I typically want to warm up the system, then run the
query multiple times to get an average timing.  It is kind of inconvenient
not to be able to close down Mserver between experiments...
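
The warm-up-then-average procedure could be scripted roughly as follows.
This is only a sketch: it measures wall-clock time around the mclient call
quoted in the original report (so client start-up is included) rather than
parsing the output of --time, and run_query, benchmark, and the file name
q0.xq are hypothetical names chosen for the example.

```python
import statistics
import subprocess
import time


def run_query(query_file):
    """Feed one query file to mclient; return elapsed wall-clock milliseconds."""
    start = time.perf_counter()
    with open(query_file, "rb") as stdin:
        subprocess.run(
            # Flags taken from the original report in this thread.
            ["mclient", "--language=xquery", "--algebra", "--time"],
            stdin=stdin,
            stdout=subprocess.DEVNULL,  # discard the query result
            check=True,                 # raise if mclient exits with an error
        )
    return (time.perf_counter() - start) * 1000.0


def benchmark(query_file, warmups=1, runs=5, runner=run_query):
    """Run `warmups` untimed passes, then collect `runs` timed samples."""
    for _ in range(warmups):
        runner(query_file)
    samples = [runner(query_file) for _ in range(runs)]
    return {
        "samples": samples,
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
    }


# Example (requires a running server and a query file):
#   stats = benchmark("q0.xq", warmups=1, runs=5)
#   print(stats["mean"], stats["samples"])
```

Keeping the individual samples next to the mean matters here, because the
first and the subsequent runs differ so much in scenarios C and D that an
average alone would hide the effect.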
...
P.S.: The E-Mail subject seems slightly off topic here :)
Yes, thought I'd avoid touching the mouse to copy the email address.  Cut
away In-Reply-To:, but forgot to change Subject:...
Thank you for your assistance!
Klem fra Nils
...
On Mar 4, 2009, at 16:30, Nils Grimsmo wrote:
...
Hi, I just upgraded from the August to the November super-ball, and the
performance has worsened badly.
Example queries on dblp.xml (441 MB):
Q0: count(/dblp//author[text()="Michael Stonebraker"])
Q1: count(/dblp/*/author[text()="Michael Stonebraker"])
Query time in milliseconds:
     August    November
Q0     1100        4867
Q1     3993       17999
I have compiled with --enable-optimise both times.  I query with:
mclient --language=xquery --algebra --time < $QUERYFILE
Is this performance degradation expected?  If so, why?
BTW:  Is there any way of finding how much disk space a collection
uses?
Thank you for contributing free software!
Klem fra Nils
-- 
Jan Rittinger
Lehrstuhl Datenbanken und Informationssysteme
Wilhelm-Schickard-Institut für Informatik
Eberhard-Karls-Universität Tübingen

http://www-db.informatik.uni-tuebingen.de/team/rittinger