Hi Klarinda,

[first of all, our apologies for the late reply; it's summer and vacation time... ;-)]

Concerning your problem, it looks as if the 1GB document requires more (virtual) memory (and/or disk space) to load than is available on your system.

The error indicates that MonetDB/XQuery fails to allocate 442499072 bytes (422 MB), while 794361856 bytes (757 MB) of your machine's virtual memory (512 MB physical memory plus the size of your Windows installation's "page file") are already used by MonetDB.
Most probably, your virtual memory is less than 422 MB + 757 MB = 1179 MB, right?
It could also be that your hard disk (at least the partition used by MonetDB) has less than 422 MB free at the time that MonetDB/XQuery tries to allocate the extra 422 MB.

Given that, it remains open whether MonetDB/XQuery indeed requires all that memory to load the 1GB document, or whether there is a bug that makes it request more memory than it should.

To test this, we'd need to have your document. The size of the serialized document alone is not enough information for us to estimate how big the internal data structure will/must be; we also need to know (among other things) how many nodes the document has, what its structure looks like, etc.

I tried to generate your document myself in order to analyse the problem, but (while working fine for Text-Centric documents) the XBench/ToXgene generator fails for me with Data-Centric documents (at least with the "large" and "huge" ones):

========
$ perl ./xdbgen.pl
 -----------------------------------
| XBench Database Generator v1.0    |
| (c)2002 by University of Waterloo |
 -----------------------------------
Database Class:
[1]TC/SD [2]TC/MD [3]DC/SD [4]DC/MD
Please choose database class (any other key to exit): 3
Database Size:
[1]Small [2]Normal [3]Large [4]Huge
Please choose database size (default is Normal): 3
Generating template templates/DCSD.tsl==>templates/newDCSD.tsl
Generating TPC-W titles/lastnames ...
sh: wgen/tpcw: cannot execute binary file
sh: wgen/tpcw: cannot execute binary file
ToXgene 1.1a - (c) 2001 by University of Toronto and IBM Corporation
***** Parsing template: Done!
Generating 250000 elements in items: Done!
Reading list titles from input/titles.xml: java.lang.ArrayIndexOutOfBoundsException: 1
        at genes.lists.ToxListParser.endElement(ToxListParser.java:125)
        at org.apache.xerces.parsers.SAXParser.endElement(SAXParser.java:1403)
        at org.apache.xerces.validators.common.XMLValidator.callEndElement(XMLValidator.java:1480)
        at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1204)
        at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
        at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1081)
        at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1122)
        at genes.lists.ToxListParser.parse(ToxListParser.java:53)
        at genes.lists.ToxList.readFromFile(ToxList.java:354)
        at genes.lists.ToxList.generate(ToxList.java:230)
        at toxgene.ToXgene.main(ToXgene.java:160)
***** UNRECOGNIZED ERROR: An unrecognized error has occurred.
Please report the following debug information to toxgene-bugs@cs.toronto.edu
Please include the template that causes this problem.
java.lang.NullPointerException
        at genes.lists.ToxList.readFromFile(ToxList.java:419)
        at genes.lists.ToxList.generate(ToxList.java:230)
        at toxgene.ToXgene.main(ToXgene.java:160)
Total elapsed time: 18759ms.
Parsing time: 835ms.
List processing time: -1154356332715ms.
Done.
========

Hence, could you please send me the (compressed!) document by email (stefan.manegold@cwi.nl), or put it somewhere where I could download it from?

One final question (for now): Which version of MonetDB/XQuery are you using?

Kind Regards,
Stefan

On Mon, Jul 24, 2006 at 05:43:33PM +0800, kla gw wrote:
Hi,
I tried MonetDB/XQuery to shred a 1gb xml file, but it failed.
Following is the error message:
MonetDB>shred_doc("D:xbench/output/DC1000catalog.xml", "DC1000catalog.xml");
!ERROR: MT_mmap: MapViewOfFile(6b4, 2, 0, 0, 442499072, 0) failed
!OS: Not enough space
!GDKmmap(442499072) fail => BBPtrim(enter) usage[mem=101297568,vm=794361856]
I use Windows XP Professional version 2002 service pack 2, Pentium 4 CPU, 2.40GHz, 512 MB of RAM.
Previously I tried to shred a 100 MB XML file, and it took 18.532 sec. For this 1 GB file, I left it running overnight, so I don't know how long it took until the error message occurred.
Can you please help me to solve this problem?
Regards,
Klarinda
Below is the complete error message:
MonetDB>shred_doc("D:xbench/output/DC1000catalog.xml", "DC1000catalog.xml");
!ERROR: MT_mmap: MapViewOfFile(6b4, 2, 0, 0, 442499072, 0) failed
!OS: Not enough space
!GDKmmap(442499072) fail => BBPtrim(enter) usage[mem=101297568,vm=794361856]
#
!mallinfo.arena = 15613828
!mallinfo.ordblks = 46134
!mallinfo.smblks = 15492
!mallinfo.hblkhd = 0
!mallinfo.hblks = 0
!mallinfo.usmblks = 13718408
!mallinfo.fsmblks = 899720
!mallinfo.uordblks = 950740
!mallinfo.fordblks = 44960
#BBPTRIM_ENTER: memsize=101297568,vmsize=794361856
#BBPTRIM: memtarget=0 vmtarget=1073741824
#TRIMSCAN: mem=0 vm=1, start=1, limit=1
#TRIMSCAN: 145030 0=tmp_35 (#0)
#TRIMSCAN: 145059 1=tmp_36 (#0)
#TRIMSCAN: 145088 2=tmp_37 (#0)
#TRIMSCAN: 145146 3=tmp_41 (#0)
#TRIMSCAN: 149075 4=doc_query (#0)
#TRIMSCAN: 149092 5=doc_sema (#0)
#TRIMSCAN: 155215 6=tmp_374 (#0)
#TRIMSCAN: 155218 7=prop_pre_39 (#0)
#TRIMSCAN: 157895 8=tmp_533 (#0)
#TRIMSCAN: 157898 9=prop_pre_310 (#0)
#TRIMSCAN: end at 1 (size=628)
#TRIMSELECT: dirty = 0
#TRIMSELECT: candidate=tmp_35 BAT*=03D6A230
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=145030,145030,145030)
#TRIMSELECT: keep tmp_35 [224,0] bytes [224,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=tmp_36 BAT*=03D66E60
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=145059,145059,145059)
#TRIMSELECT: keep tmp_36 [224,0] bytes [224,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=tmp_37 BAT*=058080B0
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=145088,145088,145088)
#TRIMSELECT: keep tmp_37 [224,0] bytes [224,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=tmp_41 BAT*=058066F0
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=145146,145146,145146)
#TRIMSELECT: keep tmp_41 [224,0] bytes [224,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=doc_query BAT*=057F9D10
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=149075,149075,149075)
#TRIMSELECT: keep doc_query [224,0] bytes [224,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=doc_sema BAT*=05845370
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=149092,149092,149092)
#TRIMSELECT: keep doc_sema [224,0] bytes [224,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=tmp_374 BAT*=03D722F0
# (cnt=0, mode=4096, refs=0, wait=0, parent=0, lastused=155215,155215,155215)
#TRIMSELECT: keep tmp_374 [224,0] bytes [0,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=prop_pre_39 BAT*=05845CD0
# (cnt=0, mode=4096, refs=0, wait=0, parent=0, lastused=155218,155218,155218)
#TRIMSELECT: keep prop_pre_39 [224,0] bytes [0,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=tmp_533 BAT*=0583C598
# (cnt=0, mode=4096, refs=0, wait=0, parent=0, lastused=157895,157895,157895)
#TRIMSELECT: keep tmp_533 [224,0] bytes [0,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: candidate=prop_pre_310 BAT*=057F7E70
# (cnt=0, mode=4096, refs=0, wait=0, parent=0, lastused=157898,157898,157898)
#TRIMSELECT: keep prop_pre_310 [224,0] bytes [0,0] dirty target(mem=0 vm=1073741824)
#TRIMSELECT: end
#TRIMSELECT: dirty = 1
#TRIMSELECT: candidate=tmp_35 BAT*=03D6A230
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=145030,145030,145030)
#TRIMSELECT: delete tmp_35 from trimlist (does not match trim needs)
#TRIMSELECT: candidate=tmp_36 BAT*=03D66E60
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=145059,145059,145059)
#TRIMSELECT: delete tmp_36 from trimlist (does not match trim needs)
#TRIMSELECT: candidate=tmp_37 BAT*=058080B0
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=145088,145088,145088)
#TRIMSELECT: delete tmp_37 from trimlist (does not match trim needs)
#TRIMSELECT: candidate=tmp_41 BAT*=058066F0
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=145146,145146,145146)
#TRIMSELECT: delete tmp_41 from trimlist (does not match trim needs)
#TRIMSELECT: candidate=doc_query BAT*=057F9D10
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=149075,149075,149075)
#TRIMSELECT: delete doc_query from trimlist (does not match trim needs)
#TRIMSELECT: candidate=doc_sema BAT*=05845370
# (cnt=0, mode=1024, refs=0, wait=0, parent=0, lastused=149092,149092,149092)
#TRIMSELECT: delete doc_sema from trimlist (does not match trim needs)
#TRIMSELECT: candidate=tmp_374 BAT*=03D722F0
# (cnt=0, mode=4096, refs=0, wait=0, parent=0, lastused=155215,155215,155215)
#TRIMSELECT: delete tmp_374 from trimlist (does not match trim needs)
#TRIMSELECT: candidate=prop_pre_39 BAT*=05845CD0
# (cnt=0, mode=4096, refs=0, wait=0, parent=0, lastused=155218,155218,155218)
#TRIMSELECT: delete prop_pre_39 from trimlist (does not match trim needs)
#TRIMSELECT: candidate=tmp_533 BAT*=0583C598
# (cnt=0, mode=4096, refs=0, wait=0, parent=0, lastused=157895,157895,157895)
#TRIMSELECT: delete tmp_533 from trimlist (does not match trim needs)
#TRIMSELECT: candidate=prop_pre_310 BAT*=057F7E70
# (cnt=0, mode=4096, refs=0, wait=0, parent=0, lastused=157898,157898,157898)
#TRIMSELECT: delete prop_pre_310 from trimlist (does not match trim needs)
#TRIMSELECT: end
#BBPTRIM: no more unload candidates!
#BBPTRIM_EXIT: memsize=95140356,vmsize=794361856
!GDKmmap(442499072) fail => BBPtrim(ready) usage[mem=101297568,vm=794361856]
#
!mallinfo.arena = 15613828
!mallinfo.ordblks = 46134
!mallinfo.smblks = 15492
!mallinfo.hblkhd = 0
!mallinfo.hblks = 0
!mallinfo.usmblks = 13718408
!mallinfo.fsmblks = 899720
!mallinfo.uordblks = 950740
!mallinfo.fordblks = 44960
!ERROR: MT_mmap: MapViewOfFile(6b0, 2, 0, 0, 442499072, 0) failed
!OS: Not enough space
!ERROR: GDKload: cannot mmap(): name=05\552, ext=theap.priv
!OS: Not enough space
!ERROR: GDKload failed: name=05\552, ext=theap.priv
!ERROR: shredder.mx:append_str2bat: APPEND-STR[PROP_TEXT](final foxes since the silent, quick realms should breach never sheaves--ruthless, daring waters beneath the close asymptotes c), BUNappend fails
!ERROR: CMDshred2bats: operation failed.
MonetDB>
_______________________________________________
Monetdb-developers mailing list
Monetdb-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-developers
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |
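
For reference, the memory arithmetic in the reply (422 MB requested on top of 757 MB already in use) can be reproduced from the log with a short Python sketch. This is an illustration only, not part of MonetDB: the function names are hypothetical, the regular expression assumes the exact "GDKmmap(...) fail ... usage[mem=...,vm=...]" line format shown in the error message, and "MB" is taken to mean bytes divided by 2^20 and rounded down, which is what matches the figures quoted in the reply.

```python
import re

MB = 1024 * 1024  # assumption: the "MB" figures in the reply are bytes / 2**20, rounded down


def parse_mmap_failure(log: str) -> tuple[int, int]:
    """Return (requested_bytes, vm_in_use_bytes) from a GDKmmap failure line.

    Hypothetical helper; it only understands the line format quoted in this thread.
    """
    match = re.search(r"GDKmmap\((\d+)\) fail.*?usage\[mem=\d+,vm=(\d+)\]", log)
    if match is None:
        raise ValueError("no GDKmmap failure line found")
    return int(match.group(1)), int(match.group(2))


def required_vm_mb(log: str) -> int:
    """Virtual memory (in MB) needed for the failed allocation to succeed."""
    requested, in_use = parse_mmap_failure(log)
    return requested // MB + in_use // MB


# The failure line as it appears in the error message of this thread.
LOG_LINE = "!GDKmmap(442499072) fail => BBPtrim(enter) usage[mem=101297568,vm=794361856]"

if __name__ == "__main__":
    requested, in_use = parse_mmap_failure(LOG_LINE)
    print("requested MB:", requested // MB)
    print("in use MB:   ", in_use // MB)
    print("needed MB:   ", required_vm_mb(LOG_LINE))
```

If the sum printed as "needed MB" exceeds physical memory plus the Windows page file size, the allocation would be expected to fail in the way the log shows.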