Re: [MonetDB-users] MonetDB/XQuery: performance issues

23 Aug 2005

      Peter,

On Wed, Aug 17, 2005 at 04:22:21PM +0200, Peter van der Kamp wrote:
...
My first experiments were done on an old Compaq Proliant 3000 with a 
350 MHz Pentium II processor and 128 MB memory, running Fedora Core 
3. I loaded 10 dictionary documents with a total size of 146 MB. 
Retrieving the headwords from that file took about 2 minutes. Full 
text searches were slower and searching on attributes did not finish. 
As MonetDB was designed for high-performance I became suspicious, not 
only about the machine, but also about my queries. E.g. I had to loop 
over all the documents and I wonder if this could be a drawback. So I 
'glued' the documents together and loaded that single file. But from 
a performance point of view it didn't make much difference.
I have now transferred my experiments to a 2.8 GHz Pentium machine 
with 1 GB of memory, also running Fedora Core 3, and that's much 
better with respect to performance. The complete (WNT) dictionary is 
now loaded, 40 files, total size c. 450 MB.
Indeed, your first machine *seems* to be a bit small for the given document
size, especially as (some) XQuery queries might require large intermediate
results. To find out whether this is the case with your queries, and
whether we might improve/extend the MonetDB XQuery compiler to avoid these
large intermediate results (provided they are not inherent to the query),
I/we would need to see your queries. Please feel free to send them to me/us
via this list or via personal email.
...
To give some more background information: we are currently in the 
process of selecting an xml database system for our dictionary data: 
Woordenboek der Nederlandsche Taal (WNT, Dictionary of the Dutch 
Language), Dictionary of Early Middle Dutch and General Dutch 
Dictionary (ANW). Requirements are (amongst others): good performance 
especially for full text searches (e.g. searching for a word(s) in a 
sense or citation) and the ability to cooperate with xml editors like 
XMLSpy.
We have not yet spent any time/effort on including any particular
high-performance full text search support in MonetDB/XQuery. However, within
the CIRQUID project (http://wwwhome.cs.utwente.nl/~cirquid/), Arjen de Vries
(http://www.cwi.nl/~arjen/, arjen@acm.org) and his colleagues are
investigating how to "design and build a DBMS that seamlessly integrates
relevance-oriented querying of semi-structured data (XML) with traditional
querying of this data". This work also includes full text search of XML
documents. Please feel free to contact Arjen (directly or via me) for more
details.

Could you please specify what kind of "cooperation" with XML editors you
require?

Regards,

Stefan

-- 
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |