[MonetDB-users] [Questions] Shredding 1GB XML File and Query Performance
I just installed MonetDB/XQuery 0.24.0 (released on 30 June 2008) and wanted to store some XML documents. When I tried to shred an XML document with almost 1GB filesize, I received the following error message.
MAPI = monetdb@localhost:50000
ACTION = read_line
QUERY = pf:add-doc("C:/Data/TestData-10.xml","TestData-10.xml")
ERROR = Connection terminated
After receiving this error in my mclient window, I noticed that both the MonetDB XQuery Server and MClient had terminated. FYI, I used a Windows XP Pro SP3 PC with an Intel Core2 Duo E6550 processor and 3.25GB of RAM. The size of my hard disk is 232 GB. May I know how I can shred this file?
My other question is about testing query performance. I tried to execute an XQuery using the following command.
mclient.bat -lxq -t -G XQuery-1.xq
This command will print out the query result and return the "Timer".
a. Can I hide the result? I tried to add "-f none" to the above command, but it did not work.
b. What is the "Timer"? How can I get only the query time?
Thanks,
John
Hello John,
First of all, thanks for using MonetDB(/XQuery)!
On Jul 30, 2008, at 04:49 , John wrote:
I just installed MonetDB/XQuery 0.24.0 (released on 30 June 2008) and wanted to store some XML documents. When I tried to shred an XML document with almost 1GB filesize, I received the following error message.
MAPI = monetdb@localhost:50000 ACTION= read_line QUERY = pf:add-doc("C:/Data/TestData-10.xml","TestData-10.xml") ERROR = Connection terminated
After receiving this error on my mclient window, I noticed that the MonetDB XQuery Server and MClient were terminated. FYI, I used a Windows XP Pro SP3 PC with Intel Core2 Duo E6550 processor and 3.25GB of RAM. The size of my hard disk is 232 GB. May I know how I can shred this file?
Unfortunately, I don't have an answer to this question. On a Fedora 8 machine with 2 GB RAM, it was no problem to shred a 1GB document. I hope that other members in our group with Windows experience will react soon. However, since it is summer holiday time, it might take longer than usual. Please bear with us.
Another question is about to test the query performance. I tried to execute an XQuery using the following command.
mclient.bat -lxq -t -G XQuery-1.xq
This command will print out the query result and return the "Timer". a. Can I hide the result? I tried to add "-f none" to the above command, but it did not work. b. What is the "Timer"? How can I get only the query time?
Since you use '-G', you are using the new (just released) algebra frontend. I think '-f none' is not implemented yet in the algebra frontend; 'mclient -h' says that XQuery (only) supports the output formats {xml, typed, dm}.
I think "Timer" includes i) the time to translate your query, ii) the time to shred the XML document used by your query, iii) the time to execute the query, and iv) the time to print the result.
Since the algebra frontend is new, it does not yet support all functionality provided by the old frontend. Maybe you can also try the old frontend, which does support '-f none' and gives more detailed timer information:
mclient.bat -lxq -t -g XQuery-1.xq -f none
Kind regards,
Jennie
Thanks,
John
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
On Wed, Jul 30, 2008 at 01:11:41PM +0200, Ying Zhang wrote:
Hello John,
First of all, thanks for using MonetDB(/XQuery)!
On Jul 30, 2008, at 04:49 , John wrote:
I just installed MonetDB/XQuery 0.24.0 (released on 30 June 2008) and wanted to store some XML documents. When I tried to shred an XML document with almost 1GB filesize, I received the following error message.
MAPI = monetdb@localhost:50000 ACTION= read_line QUERY = pf:add-doc("C:/Data/TestData-10.xml","TestData-10.xml") ERROR = Connection terminated
After receiving this error on my mclient window, I noticed that the MonetDB XQuery Server and MClient were terminated. FYI, I used a Windows XP Pro SP3 PC with Intel Core2 Duo E6550 processor and 3.25GB of RAM. The size of my hard disk is 232 GB. May I know how I can shred this file?
Were you able to see any (error) message in the server window? On your 64-bit hardware, are you running 64-bit or 32-bit Windows? If 64-bit, did you install the 64-bit or the 32-bit MonetDB/XQuery? The serialized size ("1 GB") does not say much about the actual complexity of the XML document, e.g., how many nodes, attributes, etc. does the document contain? Do you know these statistics, or could you share your document with us for testing?
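For readers who want to collect the statistics asked about here, the following is a minimal sketch rather than anything that ships with MonetDB/XQuery. It assumes a standard Python installation and uses the standard-library SAX parser to stream through the file, so the document is never held in memory as a whole. The names NodeCounter and count_nodes are hypothetical, and only elements, attributes, and nesting depth are counted (text nodes are not).

```python
import xml.sax


class NodeCounter(xml.sax.ContentHandler):
    """Counts elements and attributes and tracks the deepest nesting level."""

    def __init__(self):
        super().__init__()
        self.elements = 0
        self.attributes = 0
        self.depth = 0
        self.max_depth = 0

    def startElement(self, name, attrs):
        # Called once per opening tag; attrs holds that tag's attributes.
        self.elements += 1
        self.attributes += len(attrs)
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def endElement(self, name):
        self.depth -= 1


def count_nodes(source):
    """source is a file name or an open binary file object.

    Returns (elements, attributes, max_depth).
    """
    handler = NodeCounter()
    xml.sax.parse(source, handler)
    return handler.elements, handler.attributes, handler.max_depth
```

Calling count_nodes("C:/Data/TestData-10.xml") would then report the three numbers for the document from the original question.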
Unfortunately, I don't have an answer to this question. On a Fedora 8 machine with 2 GB RAM, it was no problem to shred a 1GB document.
(For Jennie: was that on your *old* desktop at work, or on your own laptop at home?)
I hope that other members in our group with Windows experience will react soon. However, since it is summer holiday time, it might take longer than usual. Please bear with us.
Another question is about to test the query performance. I tried to execute an XQuery using the following command.
mclient.bat -lxq -t -G XQuery-1.xq
This command will print out the query result and return the "Timer". a. Can I hide the result? I tried to add "-f none" to the above command, but it did not work. b. What is the "Timer"? How can I get only the query time?
Since you use '-G', you are using the new (just released) algebra frontend. I think '-f none' is not implemented yet in the algebra frontend; 'mclient -h' says that XQuery (only) supports the output formats {xml, typed, dm}.
I think "Timer" includes i) time to translate your query, ii) time to shred XML document used by your query, iii) time to execute the query, and iv) time to print result.
i) Yes (translation includes optimization).
ii) Only if the document is not pre-loaded into the database, but shredded on the fly!
iii) Yes.
iv) Yes.
When measuring performance, make sure that you redirect any output (stdout) to a file (or to /dev/null on Unix, or C:nul on Windows, hoping that the timing info is echoed to stderr). Otherwise, with large(r) results, you mainly measure the scrolling speed of your terminal.
Since the algebra frontend is new, it does not yet support all functionality provided by the old frontend. Maybe you can also try the old frontend, which does support '-f none' and gives more detailed timer information:
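The redirection advice can be sketched in a small script. This is only an illustration under assumptions: it measures wall-clock time from the outside (so client start-up and connection time are included), it assumes, as hoped above, that the timing information goes to stderr, and time_command is a hypothetical helper name. The demo runs a stand-in Python command; the mclient invocation from this thread is shown only in a comment.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd with stdout discarded; return (wall-clock seconds, stderr text)."""
    start = time.perf_counter()
    proc = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,  # discard the query result
        stderr=subprocess.PIPE,     # keep whatever is written to stderr
        text=True,
    )
    elapsed = time.perf_counter() - start
    return elapsed, proc.stderr


if __name__ == "__main__":
    # Stand-in command that prints a large result. For the case in this thread
    # the list would be something like:
    #   ["mclient.bat", "-lxq", "-t", "-g", "XQuery-1.xq", "-f", "none"]
    cmd = [sys.executable, "-c", "print('x' * 1000000)"]
    elapsed, err = time_command(cmd)
    print(f"wall-clock: {elapsed:.3f}s")
    print(f"stderr: {err!r}")
```

Repeating the measurement several times and looking at the spread gives a more trustworthy number than a single run.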
mclient.bat -lxq -t -g XQuery-1.xq -f none
Of course, this will tell you about the performance of the old frontend (back-end?) ...
Stefan
Kind regards,
Jennie
Thanks,
John
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |
Unfortunately, I don't have an answer to this question. On a Fedora 8 machine with 2 GB RAM, it was no problem to shred a 1GB document.
(For Jennie: was that on your *old* desktop at work, or on your own laptop at home?)
Both. I used to shred a 1 GB document on my previous desktop (using an older version of MXQ, of course). Shredding the same document on my Mac, using the latest release, says:
MonetDB>shred_doc("/Users/jennie/xmark/xmark-1000MB.xml", "xmark.xml");
# Elapsed time = 03m 17s 507ms 882us [004us/node]
# Shredded 1 XML document (xmark.xml), total time after commit=353.870s
Jennie
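As a side note on the node-count question raised earlier in the thread, the shredder output above allows a rough estimate. The sketch below is an assumption-laden back-of-the-envelope calculation, not a MonetDB feature: it assumes the bracketed figure is the elapsed time divided by the number of nodes, shown in whole microseconds, so the result is only an order of magnitude. The helper names elapsed_seconds and estimate_nodes are hypothetical.

```python
import re

# Seconds per unit for the fields in a string such as "03m 17s 507ms 882us".
UNIT_SECONDS = {"m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}


def elapsed_seconds(text):
    """Convert an elapsed-time string as printed above to seconds."""
    total = 0.0
    # Two-letter units are listed first so "ms" is not read as "m".
    for value, unit in re.findall(r"(\d+)(ms|us|m|s)\b", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


def estimate_nodes(elapsed_text, us_per_node):
    """Rough node count, assuming per-node time = elapsed time / node count."""
    return elapsed_seconds(elapsed_text) / (us_per_node * 1e-6)


if __name__ == "__main__":
    seconds = elapsed_seconds("03m 17s 507ms 882us")
    nodes = estimate_nodes("03m 17s 507ms 882us", 4)
    print(f"elapsed: {seconds:.6f}s, estimated nodes: {nodes:,.0f}")
```

With the figures above this comes out at roughly 49 million nodes for the 1 GB XMark document, which is the kind of statistic that says more about shredding cost than the file size alone.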
participants (3)
- John
- Stefan Manegold
- Ying Zhang