Re: [MonetDB-users] Shredding DBLP XML document

Hi Erwin,

shredding dblp.xml works fine for me with MonetDB/XQuery 0.13.1 (latest CVS version) on an Athlon64 X2 with 2 GB RAM under Fedora Core 4:

========
# Monet Database Server V4.13.1
# Copyright (c) 1993-2006, CWI. All rights reserved.
# Compiled for x86_64-redhat-linux-gnu/64bit with 64bit OIDs; dynamically linked.
# Visit http://monetdb.cwi.nl/ for further information.
MonetDB>shred_doc("/tmp/dblp.xml","dblp.xml");
# Elapsed time = 02m 05s 557ms 075us [005us/node]
# Shredded 1 XML documents, total time after commit=133.708s
========

Hence, could you please be more verbose about your setup? Which version of MonetDB/XQuery are you using? What kind of "PC" are you using (operating system, hardware)?

You say "the shredding process was not finished". Does this mean that MonetDB/XQuery was still busy (with shredding)?

Stefan

ps: I felt free to cc this to the MonetDB-users list.

On Thu, Sep 21, 2006 at 02:02:05PM +0800, Erwin Leonardi wrote:
Hi all,
I tried to shred the DBLP XML document (around 300 MB), but ran into the following situation. I left my PC running over the weekend, and the shredding process did not finish and there was no error message. I checked my free disk space and found that it remained the same. That is, it seems that MonetDB has not written any data to disk. Do you know why this happened? Can you try to shred the DBLP XML document?
Thanks,
Erwin
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |

I use a machine with an Intel Xeon 2 GHz and 1 GB RAM. The OS is Windows XP Pro SP2. The version of MonetDB/XQuery is 0.12.0 (downloaded from the MonetDB website, Win32 binaries).
You say "the shredding process was not finished". Does this mean that MonetDB/XQuery was still busy (with shredding)?
Yes, I noticed that MapiClient.exe was still busy with shredding. I also noticed that RAM usage increased slowly (starting from 5 MB). In addition, there was no error or warning message. I left my PC running from last Friday night to Monday morning. (Now I am trying again on another PC -- a P4 2.4 GHz with 512 MB and WinXP -- and will leave it running tonight.)
Thanks
Erwin
(in reply to Stefan Manegold's message of 9/21/06)

Erwin,

also with the MonetDB/XQuery 0.12.0 release, shredding dblp.xml works without problems on my Linux desktop, both explicitly via shred_doc on Mserver's MIL console and "on-the-fly" via XQuery's fn:doc():

========
$ Mserver
# Monet Database Server V4.12.0
# Copyright (c) 1993-2006, CWI. All rights reserved.
# Compiled for x86_64-redhat-linux-gnu/64bit with 32bit OIDs; dynamically linked.
# Visit http://monetdb.cwi.nl/ for further information.
MonetDB>module(pathfinder);
MonetDB>shred_doc("/tmp/dblp.xml","dblp");
# Shredded XML doc("dblp"), total time after commit=41.651s
MonetDB>
--------
$ echo 'count(doc("/tmp/dblp.xml")//*)' | MapiClient -lx -T
8253146

Trans      16.000 msec
Shred   42128.000 msec
Query     420.000 msec
Print      44.000 msec
Timer   43064.586 msec
========

Mserver grows to just below 1 GB virtual memory, but never beyond 600 MB real memory usage.

Unfortunately, I have no Windows machine to also test it on Windows.

Please let us know how your experiments go.

Stefan

(in reply to Erwin Leonardi's message of Thu, Sep 21, 2006 at 07:06:47PM +0800)
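The timing lines printed by MapiClient -T above can be put in proportion with a little arithmetic. The following is only a sketch: the helper name parse_timings is hypothetical, and the input is simply the output quoted in the message, not something read from MonetDB.

```python
import re

# Timing lines as printed by `MapiClient -lx -T` in the message above.
TIMING_OUTPUT = """\
Trans      16.000 msec
Shred   42128.000 msec
Query     420.000 msec
Print      44.000 msec
Timer   43064.586 msec
"""

# Result of count(doc("/tmp/dblp.xml")//*) from the same run.
ELEMENT_COUNT = 8253146


def parse_timings(text):
    """Map each phase name to its duration in milliseconds."""
    pattern = re.compile(r"^(\w+)\s+([\d.]+)\s+msec\s*$", re.MULTILINE)
    return {name: float(value) for name, value in pattern.findall(text)}


timings = parse_timings(TIMING_OUTPUT)

# Fraction of the total wall-clock time that went into shredding.
shred_share = timings["Shred"] / timings["Timer"]

# Average shredding cost per element, in microseconds.
micros_per_element = timings["Shred"] * 1000.0 / ELEMENT_COUNT

print(f"Shredding share of total time: {shred_share:.1%}")
print(f"Shredding cost per element:    {micros_per_element:.2f} us")
```

With these numbers, shredding accounts for roughly 98% of the run, at about 5 microseconds per element, so the query itself is a small part of the total.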

MonetDB (4.12.0) on my machine (Centrino 1.5 GHz, 512 MB, Windows XP Pro SP2) surely did not like shredding 330 megabytes (today's version of dblp.xml), but finally it responded with:

mil>shred_doc("c:/tmp/dblp.xml","dblp.xml");
# Shredded XML doc("dblp.xml"), total time after commit=782.355s
mil>

The majority of the time was spent in I/O (the CPU was about 90% idle most of the time).

Greetings,
Wouter

p.s. I had to change the DTD-reference from "dblp.dtd" to "http://dblp.uni-trier.de/xml/dblp.xml" for MonetDB to be able to locate the DTD.

(in reply to Stefan Manegold's message of Thursday, 21 September 2006, 22:04)

_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users

Thanks! I can shred DBLP now. I removed the DTD reference in the XML
document. After I specified the full path of the folder in which I
keep the DTD, it works!
Thanks
erwin
(in reply to Wouter Alink's message of 9/22/06)
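The fix described above (pointing the DTD reference at a full path or URL) can be scripted instead of editing a 300 MB file by hand. This is a minimal sketch, not part of MonetDB: the function name rewrite_doctype is hypothetical, and it assumes the document declares its DTD near the top in the form <!DOCTYPE dblp SYSTEM "dblp.dtd">.

```python
import re
import shutil

# Matches <!DOCTYPE name SYSTEM "uri"> and captures the prefix and the quote
# character, so that only the URI between the quotes is replaced.
DOCTYPE_RE = re.compile(rb'(<!DOCTYPE\s+\S+\s+SYSTEM\s+)(["\'])(.*?)\2')


def rewrite_doctype(src_path, dst_path, new_system_id, head_size=64 * 1024):
    """Copy src_path to dst_path, replacing the DOCTYPE system identifier.

    Only the first head_size bytes are searched; the rest of the file is
    streamed through unchanged, so large documents are not loaded into
    memory. Returns True if a DOCTYPE was rewritten, False otherwise
    (the file is copied in either case).
    """
    new_id = new_system_id.encode("ascii")
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        head = src.read(head_size)
        new_head, count = DOCTYPE_RE.subn(
            lambda m: m.group(1) + m.group(2) + new_id + m.group(2),
            head,
            count=1,
        )
        dst.write(new_head)
        shutil.copyfileobj(src, dst)  # stream the remainder of the file
    return count == 1
```

A call such as rewrite_doctype("c:/tmp/dblp.xml", "c:/tmp/dblp-fixed.xml", "c:/tmp/dblp.dtd") would write a copy whose DTD reference is absolute; the paths here are only examples.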

Erwin,

great.

The only thing I'm wondering about: before fixing the DTD URI, didn't you get any error message from MonetDB/XQuery saying that it couldn't find the DTD? I did get it (and I suppose Wouter got it too, right?), but did not mention that I fixed the URI because I thought you had done this too, since you mentioned not getting any error message...

How exactly did you shred the document?
(1) via MIL command shred_doc() on the Mserver console?
(2) via MIL command shred_doc() in a MIL MapiClient?
(3) via XQuery function fn:doc() in an XQuery MapiClient (-lxquery)?
(4) else?

Stefan

(in reply to Erwin Leonardi's message of Fri, Sep 22, 2006 at 05:15:38PM +0800)

oops, minor mistake in my earlier response... of course I didn't change "dblp.dtd" into "http://dblp.uni-trier.de/xml/dblp.xml" but into "http://dblp.uni-trier.de/xml/dblp.dtd".

and yes, I got a proper warning:

mil>shred_doc("c:\\tmp\\dblp.xml","dblp.xml");
MAPI = monetdb@localhost:50000
QUERY = shred_doc("c:\\tmp\\dblp.xml","dblp.xml");
ERROR = !ERROR: I/O warning : failed to load external entity "dblp.dtd"
!ERROR: shredder.mx:catchExternalSubset: WARNING: xmlParseDTD("dblp.dtd") FAILED, NO ID/IDREF QUERIES
!ERROR: shredder.mx:catchExternalSubset: NOTE : maybe using absolute filenames works, sorry!
!ERROR: CMDshred2bats: operation failed.
mil>

Wouter

(in reply to Stefan Manegold's message of Friday, 22 September 2006, 11:24, "Re: [MonetDB-users] * Re: Shredding DBLP XML document")
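Since the warning above hints that absolute filenames may work, it can help to check the DTD reference before starting a long shredding run. The sketch below only inspects the XML file; the function names are hypothetical, it assumes a <!DOCTYPE ... SYSTEM "..."> declaration near the start of the file, and it makes no claim about how MonetDB resolves relative references.

```python
import os
import re
from urllib.parse import urlparse

# Captures the system identifier of a <!DOCTYPE name SYSTEM "uri"> declaration.
DOCTYPE_RE = re.compile(rb'<!DOCTYPE\s+\S+\s+SYSTEM\s+(["\'])(.*?)\1')


def doctype_system_id(xml_path, head_size=64 * 1024):
    """Return the DOCTYPE system identifier near the start of the file, or None."""
    with open(xml_path, "rb") as f:
        match = DOCTYPE_RE.search(f.read(head_size))
    return match.group(2).decode("ascii", "replace") if match else None


def is_relative_reference(system_id):
    """True if system_id is neither a URL with a scheme nor an absolute path."""
    # A one-letter "scheme" is a Windows drive letter (c:/...), not a URL.
    if len(urlparse(system_id).scheme) > 1:
        return False
    has_drive = re.match(r"^[A-Za-z]:[\\/]", system_id) is not None
    return not (os.path.isabs(system_id) or has_drive)
```

For a file declaring "dblp.dtd", doctype_system_id returns "dblp.dtd" and is_relative_reference reports it as relative, which is the case that triggered the warning in this thread.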
participants (3)
- Erwin Leonardi
- Stefan Manegold
- Wouter Alink