[Monetdb-developers] problems with XQuery aggregation
hej monetdb developers,
i tried to write a simple aggregation query:
    let $cands := doc("x.xml")//c
    for $eid in distinct-values($cands/@id)
    return <candidate id="{$eid}" count="{count($cands[@id eq $eid])}"/>
however, executing the above query turns out to be rather difficult. i always killed the process after waiting for more than half an hour without getting any result.
x.xml is 17 MB in size and contains 300000 candidate nodes c. i ran it on Monet Database Server V4.19.0, where the query crashed:
    ERROR = !ERROR: GDKload: cannot mmap(): name=10/1030, ext=buns.priv
    !OS: Cannot allocate memory
    !ERROR: GDKload failed: name=10/1030, ext=buns.priv
    !ERROR: CMDleftjoin: operation failed.
and on the newer stable release MonetDB Server v4.20.0, on a machine with 16 GB of memory, where the query just runs endlessly.
apparently the query compiler does not recognise the necessary join and aggregation here. is that a known problem?
btw. sorting the candidates by @id value is done in seconds:
    let $cands := doc("x.xml")//c
    for $c in $cands
    order by $c/@id
    return $c
best
-henning
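To illustrate why recognising the join and aggregation matters for this query, here is a minimal Python sketch. It is not how either MonetDB/XQuery backend works internally; it only contrasts a literal, nested-loop reading of the query with a single-pass grouped count. The sample document and all function names are hypothetical. If every one of the 300000 `@id` values were distinct, the nested-loop reading would need on the order of 300000 × 300000 comparisons, while the grouped version touches each candidate once.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Tiny stand-in for x.xml (assumed structure: <c> elements carrying an id attribute).
SAMPLE = "<root><c id='a'/><c id='b'/><c id='a'/><g><c id='c'/><c id='a'/></g></root>"


def candidate_ids(xml_text):
    """Collect the id of every <c> element, similar to doc("x.xml")//c/@id."""
    root = ET.fromstring(xml_text)
    return [c.get("id") for c in root.iter("c") if c.get("id") is not None]


def counts_nested_loop(ids):
    """Literal reading of the query: rescan all candidates for each distinct id."""
    distinct = list(dict.fromkeys(ids))  # distinct values in first-seen order
    return {eid: sum(1 for other in ids if other == eid) for eid in distinct}


def counts_grouped(ids):
    """Single pass over the candidates using a hash table (group and count)."""
    return dict(Counter(ids))


def render(counts):
    """Format the counts like the <candidate .../> elements the query returns."""
    return [f'<candidate id="{eid}" count="{n}"/>' for eid, n in counts.items()]


if __name__ == "__main__":
    ids = candidate_ids(SAMPLE)
    # Both strategies must agree; only their cost differs.
    assert counts_nested_loop(ids) == counts_grouped(ids)
    print("\n".join(render(counts_grouped(ids))))
```

Both functions return the same mapping; the difference is only in how often the candidate list is scanned, which is the kind of gap a plan with or without a recognised join can produce.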
Henning,
I suppose you use the default MPS ("milprint_summer") backend, right? (i.e., a simple `mclient -lxquery q.xq` or `pf [-M] q.xq | mclient -lmil`.)
Could you please also try the algebra backend via `pf -A q.xq | mclient -lmil` and report the results/behaviour?
Stefan
On Thu, Oct 25, 2007 at 12:39:07PM +0200, Henning Rode wrote:
------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ Monetdb-developers mailing list Monetdb-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-developers
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |
oh, amazing... with the algebra backend it works in seconds.
thanks a lot. is MPS not able to create an efficient query plan here?
-henning
On Thu, Oct 25, 2007 at 02:44:44PM +0200, Henning Rode wrote:
oh, amazing... with the algebra backend it works in seconds.
thanks a lot. is MPS not able to create an efficient query plan here?
apparently.
Stefan
I'm sure it did recognize the join in this query some time/versions ago ...
Peter, although we decided not to spend a lot of time on maintaining the mps-backend, it would be a pity if the reason for it is related to the recently introduced value index. Perhaps you may want to check whether it is simply a matter of the join pattern no longer being recognized due to the XQuery-Core rewritings for the value index.
Maurice.
--
----------------------------------------------------------------------
Dr.Ir. M. van Keulen - Assistant Professor, Data Management Technology
Univ. of Twente, Dept of EEMCS, POBox 217, 7500 AE Enschede, Netherlands
Email: m.vankeulen@utwente.nl, Phone: +31 534893688, Fax: +31 534892927
Room: ZI 3039, WWW: http://www.cs.utwente.nl/~keulen
participants (3)
- Henning Rode
- Keulen, M. van (Maurice)
- Stefan Manegold