[MonetDB-users] MonetDB XML - slow execution of query with test on attribute

Hi,

I am trying to write an XQuery expression that tests an attribute (id="1"). Unfortunately, the execution time for this query is really bad (about 30 seconds).

Is there any way to find out why this query takes so long? 30 seconds for a test on a single attribute is far too high for our use case. I read that MonetDB automatically creates indices on all attributes. Is there a way to check whether an index was created for that attribute, and whether that index was used during query execution?

The query (in file q3.xp) is supposed to return the molecule with id "1":

    declare namespace xq="http://www.xml-cml.org/schema";
    <smiles> {
      for $i in collection("tu")
      where $i/xq:molecule/@id = "1"
      return <smile>{$i}</smile>
    } </smiles>

The query is executed like this:

    # mclient -t -lx -G q3.xp
    ...
    Trans 55.684 msec
    Shred 18.818 msec
    Query 29335.766 msec
    Print 0.258 msec

The database contains 1,000,000 XML documents in a single collection, is updatable, and is about 2601 MiB in size. A document looks similar to this one:

    <?xml version="1.0" encoding="UTF-8"?>
    <molecule id="1" xmlns="http://www.xml-cml.org/schema">
      <bondArray atomRef1="..." order="..."/>
      <atomArray atomID="..." elementType="..." x3="..."/>
      <identifier convention="iupac:smile" value="..."/>
      <identifier convention="iupac:inchi" value="..."/>
    </molecule>

We are using: "MonetDB/XQuery 0.36.5, based on the pathfinder http://www.pathfinder-xquery.org compiler and MonetDB 4.36.1 http://monetdb.cwi.nl"

Thank you very much for any help!

Patrick

Hi Patrick,

the evaluation plan of your query looks quite bad compared with what would be possible.

If you could use 'doc("tu")' (with a single big document containing all the small documents), you would get the result in *no* time, because the attribute index is then used. Unfortunately, I do not know how the attribute index behaves differently for collections.

Regards,
Jan

(In reply to Patrick Schäfer's message of Jun 15, 2010, 14:42.)
-- Jan Rittinger Lehrstuhl Datenbanken und Informationssysteme Wilhelm-Schickard-Institut für Informatik Eberhard-Karls-Universität Tübingen http://www-db.informatik.uni-tuebingen.de/team/rittinger
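
For illustration, the single-document variant Jan describes might look like the sketch below. It assumes that all molecule documents have been merged into one document stored under the name "tu", with the molecule elements sitting somewhere beneath its root; that layout is an assumption and is not described in the thread. Whether the attribute index is actually used would still need to be confirmed with timings on the real data.

```xquery
(: Sketch only: assumes a single stored document named "tu" that
   contains every <molecule> element (hypothetical layout). :)
declare namespace xq = "http://www.xml-cml.org/schema";

<smiles> {
  (: Select the matching molecule element directly with a predicate
     on @id, instead of iterating over all documents of a collection. :)
  for $m in doc("tu")//xq:molecule[@id = "1"]
  return <smile>{$m}</smile>
} </smiles>
```

Unlike the original query, which returns the whole matching document node, this variant returns the matching molecule element itself.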

Hi,

How about:

    <smiles> {
      for $i in pf:collection("tu")
      where $i//xq:molecule/@id = "1"
      return <smile>{$i}</smile>
    } </smiles>

Is that any faster?

Peter
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
participants (3)
- Jan Rittinger
- Patrick Schäfer
- Peter Boncz