[MonetDB-users] MonetDB/XQuery: performance issues
I've been experimenting with MonetDB/XQuery for some time now and I wonder what kind of machine was used to produce the benchmark results as published on the website. Are there some guidelines ("do's and don'ts") with respect to performance?
Peter van der Kamp
I've been experimenting with MonetDB/XQuery for some time now and I wonder what kind of machine was used to produce the benchmark results as published on the website.
I believe it was tested on a 1.6 GHz AMD Opteron system with 8 GB RAM.
The benchmark results are discussed in this paper:
http://www.pathfinder-xquery.org/files/pathfinder-vldb2005.pdf
Regards,
Jennie
On Wed, Aug 17, 2005 at 12:15:07PM +0200, Ying Zhang wrote:
I've been experimenting with MonetDB/XQuery for some time now and I wonder what kind of machine was used to produce the benchmark results as published on the website.
I believe it was tested on a 1.6 GHz AMD Opteron system with 8 GB RAM.
Right. Moreover, the code (both MonetDB and Pathfinder) was compiled with gcc 3.3, with full optimization switched on (configure --enable-optimize) and using 32-bit oids (configure --enable-oid32).

Are you experiencing any performance problems? If so, could you please provide us with the details: the detailed specification of your hardware, the exact version of MonetDB/XQuery, the compiler and compilation options, and the documents and queries?

Stefan
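Most of the system details requested above can be gathered with a short script. The following is only a minimal sketch in Python and is not part of MonetDB: the function names `compiler_version` and `collect_report` and the report layout are made up for illustration. The MonetDB/XQuery version and the configure options are left as fields to fill in by hand, because the sketch does not assume any particular MonetDB command for querying them. Memory size, documents and queries still have to be added separately.

```python
import platform
import shutil
import subprocess


def compiler_version(compiler="gcc"):
    """Return the first line of `<compiler> --version`, or None if unavailable."""
    path = shutil.which(compiler)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


def collect_report(monetdb_version="(fill in)", configure_options="(fill in)"):
    """Collect basic system facts for a performance report (hypothetical layout)."""
    return {
        "os": platform.platform(),
        "machine": platform.machine(),
        "processor": platform.processor() or "(unknown)",
        "compiler": compiler_version() or "(gcc not found)",
        # Assumption: these two values are supplied by the person reporting.
        "monetdb_xquery_version": monetdb_version,
        "configure_options": configure_options,
    }


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

Calling `collect_report(configure_options="--enable-optimize --enable-oid32")` would record the two configure switches mentioned above alongside the detected system information.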
The benchmark results are discussed in this paper:
http://www.pathfinder-xquery.org/files/pathfinder-vldb2005.pdf
Regards,
Jennie
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ |
| 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 |
| The Netherlands | Fax : +31 (20) 592-4312 |
Do you experience any performance problems? If so, could you please provide us with the details, like the detailed specification of your hardware, the exact version of MonetDB/XQuery, compiler and compilation options, documents and queries?
My first experiments were done on an old Compaq Proliant 3000 with a 350 MHz Pentium II processor and 128 MB of memory, running Fedora Core 3. I loaded 10 dictionary documents with a total size of 146 MB. Retrieving the headwords took about 2 minutes. Full text searches were slower, and searches on attributes did not finish.

As MonetDB was designed for high performance, I became suspicious, not only of the machine but also of my queries. For example, I had to loop over all the documents, and I wondered whether this could be a drawback. So I 'glued' the documents together and loaded that single file, but it made little difference to performance.

I have now moved my experiments to a 2.8 GHz Pentium machine with 1 GB of memory, also running Fedora Core 3, and performance is much better. The complete (WNT) dictionary is now loaded: 40 files, total size c. 450 MB.

To give some more background information: we are currently selecting an XML database system for our dictionary data: the Woordenboek der Nederlandsche Taal (WNT, Dictionary of the Dutch Language), the Dictionary of Early Middle Dutch and the General Dutch Dictionary (ANW). Requirements include good performance, especially for full text searches (e.g. searching for one or more words in a sense or citation), and the ability to cooperate with XML editors like XMLSpy.

Peter
Peter,

On Wed, Aug 17, 2005 at 04:22:21PM +0200, Peter van der Kamp wrote:
My first experiments were done on an old Compaq Proliant 3000 with a 350 MHz Pentium II processor and 128 MB memory, running Fedora Core 3. I loaded 10 dictionary documents with a total size of 146 MB. Retrieving the headwords from that file took about 2 minutes. Full text searches were slower and searching on attributes did not finish. As MonetDB was designed for high performance I became suspicious, not only about the machine, but also about my queries. E.g. I had to loop over all the documents and I wonder if this could be a drawback. So I 'glued' the documents together and loaded that single file. But from a performance point of view it didn't make much difference.
I have now transferred my experiments to a 2.8 GHz Pentium machine with 1 GB of memory, also running Fedora Core 3, and that's much better with respect to performance. The complete (WNT) dictionary is now loaded, 40 files, total size c. 450 MB.
Indeed, your first machine *seems* to be a bit small for the given document size, especially as (some) XQuery queries might require large intermediate results. To know whether this is the case with your queries, and whether we might improve or extend the MonetDB XQuery compiler to avoid these large intermediate results (provided they are not inherent to the query), we would need to see your queries. Please feel free to send them to us via this list or by personal email.
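As a rough illustration of why the first machine looks small, the total document size can be compared with the available memory on both setups. This is only a back-of-the-envelope sketch in Python using the figures reported in this thread; the function name `size_to_ram_ratio` is made up for illustration, 1 GB is taken as 1024 MB, and the ratio is a crude indicator that ignores the size of intermediate results, which may be much larger than the documents themselves.

```python
def size_to_ram_ratio(document_mb, ram_mb):
    """Ratio of total document size to physical memory (both in MB)."""
    if ram_mb <= 0:
        raise ValueError("ram_mb must be positive")
    return document_mb / ram_mb


# (label, total document size in MB, RAM in MB), figures as reported in the thread
setups = [
    ("350 MHz Pentium II", 146, 128),
    ("2.8 GHz Pentium", 450, 1024),  # assumption: 1 GB = 1024 MB
]

if __name__ == "__main__":
    for label, document_mb, ram_mb in setups:
        ratio = size_to_ram_ratio(document_mb, ram_mb)
        print(f"{label}: documents are {ratio:.2f}x the size of RAM")
    # prints:
    # 350 MHz Pentium II: documents are 1.14x the size of RAM
    # 2.8 GHz Pentium: documents are 0.44x the size of RAM
```

On the first machine the documents alone exceed physical memory, before any intermediate results are taken into account; on the second they occupy less than half of it.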
To give some more background information: we are currently in the process of selecting an xml database system for our dictionary data: Woordenboek der Nederlandsche Taal (WNT, Dictionary of the Dutch Language), Dictionary of Early Middle Dutch and General Dutch Dictionary (ANW). Requirements are (amongst others): good performance especially for full text searches (e.g. searching for a word(s) in a sense or citation) and the ability to cooperate with xml editors like XMLSpy.
We have not yet spent any time or effort on including any particular high-performance full text search support in MonetDB/XQuery. However, within the CIRQUID project (http://wwwhome.cs.utwente.nl/~cirquid/), Arjen de Vries (http://www.cwi.nl/~arjen/, arjen@acm.org) and his colleagues are investigating how to "design and build a DBMS that seamlessly integrates relevance-oriented querying of semi-structured data (XML) with traditional querying of this data". This work also includes full text search of XML documents. Please feel free to contact Arjen (directly or via me) for more details.

Could you please specify what kind of "cooperation" with XML editors you require?

Regards,
Stefan
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ |
| 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 |
| The Netherlands | Fax : +31 (20) 592-4312 |