Thanks Stefan,

This is much better. I have started looking. If I find that something is wrong here, a bug should be opened at SourceForge: go to http://sf.net/projects/monetdb and choose Tracker -> Bugs.

I am currently importing the dataset; the shredding has already taken an hour. This is due to many hash collisions on the attributes that have a purely numeric value. Your document has no text nodes, but lots of those attributes. I had already noted that the MonetDB hash function is very fragile in this domain, and I think I will open a bug report on that. The bad thing is that fixing it will alter our binary repository format, but that has been done before.

I will keep you posted about my progress in reproducing your remap error. I am trying with a 64-bit compilation and 64-bit oids, so there should be no scalability problems. One difference is that I will be using Fedora Core, not Gentoo.

Peter
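To illustrate the collision problem described above: the sketch below does not use the MonetDB hash function. It assumes a deliberately weak, hypothetical string hash (a sum of character codes) and compares it with 32-bit FNV-1a, which is brought in only as a better-mixing reference. The key range and bucket count are made up for the illustration.

```python
from collections import Counter

def weak_hash(s: str, buckets: int) -> int:
    """Hypothetical weak hash: sum of character codes. NOT MonetDB's function."""
    return sum(ord(c) for c in s) % buckets

def fnv1a(s: str, buckets: int) -> int:
    """32-bit FNV-1a, used here only as a better-mixing comparison."""
    h = 2166136261
    for b in s.encode("utf-8"):
        h ^= b
        h = (h * 16777619) & 0xFFFFFFFF
    return h % buckets

def longest_chain(hash_fn, keys, buckets: int) -> int:
    """Size of the fullest bucket, i.e. the worst-case collision chain."""
    counts = Counter(hash_fn(k, buckets) for k in keys)
    return max(counts.values())

# Purely numeric attribute values, as in the document being shredded.
keys = [str(n) for n in range(100000, 200000)]
buckets = 1 << 16

weak = longest_chain(weak_hash, keys, buckets)
good = longest_chain(fnv1a, keys, buckets)
print("longest chain, weak hash:", weak)
print("longest chain, FNV-1a:   ", good)
```

Digit strings use only ten distinct characters, so a hash that mixes poorly maps them onto very few buckets and every insert walks a long chain. That is the kind of behaviour that would make shredding slow.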
-----Original Message-----
From: Stefan de Konink [mailto:skinkie@xs4all.nl]
Sent: Thursday, October 11, 2007 10:19 AM
To: P.Boncz@cwi.nl
Cc: monetdb-developers@lists.sourceforge.net
Subject: Re: Monetdb-developers Digest, Vol 17, Issue 6 (was: Fixing the update issue for Large XML documents)
Dear Peter,
Peter Boncz wrote:
> Thanks for your question, but I regret to inform you that your way of communicating technical problems falls severely short of what is required by technical etiquette:
>
> (1) ambiguous software version description: you cannot be running MonetDB4 and MonetDB5 at the same time. Moreover, I want to know the exact version number of the software, and whether it was a 32-bit or 64-bit build (and if 64-bit, whether it used 64-bit or 32-bit oids).
Why can't I run MonetDB4 and MonetDB5 at the same time, with respect to XML and SQL? In this case I'm reporting a bug for MonetDB4 with the Pathfinder module.
Monet Database Server V4.18.2 Compiled for x86_64-unknown-linux-gnu/64bit with 64bit OIDs;
> (2) major omissions in your platform description. I presume you have installed an operating system on your Xeon. Would you care to tell us which operating system that is?
Gentoo, Linux, xen01 2.6.20-xen-r3 #3 SMP Sun Oct 7 05:22:20 CEST 2007 x86_64 Intel(R) Xeon(R) CPU L5320 @ 1.86GHz GenuineIntel GNU/Linux
(Before further questions: it is the host machine, not a virtual server)
> (3) ill-described reproduction path. Your bug report suggests it is about updates, but as far as I read your email, I doubt you did any updates yet. This again is crucial information.
- Download http://mirror.openstreetmap.nl/planet/planet-071003.osm.bz2
- Decompress the document.
- Start the MonetDB4 server with the Pathfinder module.
- Add this document to the database server with 5 percent space for updates.
- Requesting any operation on this document results in the failure reported.
This error doesn't happen if the document was added without the ability to update the document.
> The error you see is the "remap" call failing; the most probable cause is VM space shortage. For MonetDB to work on 20GB datasets you must use a 64-bit operating system and MonetDB binary, and you may even need 64-bit oids (because your volume of text nodes is likely to be significant).
This doesn't explain why the read-only version works.

Also, here is my current ls() output:
MonetDB>ls();
#---------------------------------------------------------------------------------------------------------------------------------------------#
# name  htype  ttype  count  heat  dirty  status  kind  refcnt  lrefcnt  # name
# str  str  str  lng  int  str  str  str  int  int  # type
#---------------------------------------------------------------------------------------------------------------------------------------------#
[ "1000000000_attr_own", "void", "oid", 747750437, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_attr_prop", "void", "oid", 747750437, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_attr_qn", "void", "oid", 747750437, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_frag_root", "oid", "oid", 1, 0, "clean", "disk", "pers", 0, 1 ]
[ "1000000000_map_pid", "void", "void", 42822, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_nid_rid", "void", "void", 701587412, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_prop_com", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_prop_ins", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_prop_text", "void", "str", 3, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_prop_tgt", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_prop_val", "void", "str", 118077679, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_qn_histogram", "void", "lng", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_qn_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_qn_nid", "oid", "oid", 271876931, 0, "clean", "disk", "pers", 0, 1 ]
[ "1000000000_qn_prefix", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_qn_prefix_uri_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_qn_uri", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_qn_uri_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_rid_kind", "void", "chr", 701587412, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_rid_level", "void", "chr", 701587412, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_rid_nid", "void", "void", 701587412, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_rid_prop", "void", "oid", 701587412, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_rid_size", "void", "int", 701587412, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000000_vx_hsh_nid", "int", "oid", 402481184, 0, "clean", "disk", "pers", 0, 1 ]
[ "1000000001_attr_own", "void", "oid", 754314985, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_attr_prop", "void", "oid", 754314985, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_attr_qn", "void", "oid", 754314985, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_frag_root", "oid", "oid", 1, 0, "clean", "disk", "pers", 0, 1 ]
[ "1000000001_map_pid", "void", "oid", 45450, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_nid_rid", "void", "oid", 707424150, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_prop_com", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_prop_ins", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_prop_text", "void", "str", 3, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_prop_tgt", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_prop_val", "void", "str", 119784146, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_qn_histogram", "void", "lng", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_qn_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_qn_prefix", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_qn_prefix_uri_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_qn_uri", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_qn_uri_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_rid_kind", "void", "chr", 744652800, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_rid_level", "void", "chr", 744652800, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_rid_nid", "void", "oid", 744652800, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_rid_prop", "void", "oid", 744652800, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000001_rid_size", "void", "int", 744652800, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_attr_own", "void", "oid", 754314985, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_attr_prop", "void", "oid", 754314985, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_attr_qn", "void", "oid", 754314985, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_frag_root", "oid", "oid", 1, 0, "clean", "disk", "pers", 0, 1 ]
[ "1000000002_map_pid", "void", "void", 43178, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_nid_rid", "void", "void", 707424150, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_prop_com", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_prop_ins", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_prop_text", "void", "str", 3, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_prop_tgt", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_prop_val", "void", "str", 119784146, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_qn_histogram", "void", "lng", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_qn_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_qn_nid", "oid", "oid", 273809205, 0, "clean", "disk", "pers", 0, 1 ]
[ "1000000002_qn_prefix", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_qn_prefix_uri_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_qn_uri", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_qn_uri_loc", "void", "str", 19, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_rid_kind", "void", "chr", 707424150, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_rid_level", "void", "chr", 707424150, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_rid_nid", "void", "void", 707424150, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_rid_prop", "void", "oid", 707424150, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_rid_size", "void", "int", 707424150, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000002_vx_hsh_nid", "int", "oid", 407415839, 0, "clean", "disk", "pers", 0, 1 ]
[ "1000000003_attr_own", "void", "oid", 5870236, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_attr_prop", "void", "oid", 5870236, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_attr_qn", "void", "oid", 5870236, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_frag_root", "oid", "oid", 1, 0, "clean", "disk", "pers", 0, 1 ]
[ "1000000003_map_pid", "void", "oid", 351, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_nid_rid", "void", "oid", 5455263, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_prop_com", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_prop_ins", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_prop_text", "void", "str", 3, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_prop_tgt", "void", "str", 0, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_prop_val", "void", "str", 1814631, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_qn_histogram", "void", "lng", 16, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_qn_loc", "void", "str", 16, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_qn_prefix", "void", "str", 16, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_qn_prefix_uri_loc", "void", "str", 16, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_qn_uri", "void", "str", 16, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_qn_uri_loc", "void", "str", 16, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_rid_kind", "void", "chr", 5750784, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_rid_level", "void", "chr", 5750784, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_rid_nid", "void", "oid", 5750784, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_rid_prop", "void", "oid", 5750784, 0, "clean", "disk", "pers", 0, 2 ]
[ "1000000003_rid_size", "void", "int", 5750784, 0, "clean", "disk", "pers", 0, 2 ]
[ "collection_name", "oid", "str", 4, 0, "clean", "load", "pers", 0, 2 ]
[ "collection_size", "oid", "lng", 4, 0, "clean", "disk", "pers", 0, 2 ]
[ "doc_collection", "oid", "oid", 6, 0, "clean", "load", "pers", 0, 2 ]
[ "doc_location", "oid", "str", 4, 0, "clean", "disk", "pers", 0, 2 ]
[ "doc_name", "oid", "str", 4, 0, "clean", "load", "pers", 0, 2 ]
[ "doc_timestamp", "oid", "timestamp", 4, 0, "clean", "disk", "pers", 0, 2 ]
[ "uri_lifetime", "str", "lng", 1, 0, "clean", "load", "pers", 0, 2 ]
[ "xquery_catalog", "int", "str", 86, 0, "clean", "load", "pers", 1, 1 ]
[ "xquery_seqs", "int", "lng", 1, 0, "clean", "load", "pers", 1, 2 ]
[ "xquery_snapshots", "int", "int", 0, 0, "clean", "load", "pers", 1, 2 ]
Yours Sincerely,
Stefan de Konink
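
For a rough sense of the address space involved in the VM-shortage point, the counts from the ls() listing can be turned into a back-of-the-envelope size estimate. This is only a sketch. The column widths are assumptions for a build with 64-bit oids (oid 8 bytes, int 4, chr 1), "void" columns are assumed to be virtual and to take no storage, and variable-width str columns are left out, so the real footprint will differ.

```python
# Assumed fixed column widths in bytes (64-bit oids); "void" is assumed to be
# a virtual dense column with no storage. str columns are not estimated.
WIDTH = {"void": 0, "oid": 8, "int": 4, "lng": 8, "chr": 1}

# (name, htype, ttype, count) copied from the listing for document 1000000000.
rows = [
    ("1000000000_attr_own", "void", "oid", 747750437),
    ("1000000000_attr_prop", "void", "oid", 747750437),
    ("1000000000_attr_qn", "void", "oid", 747750437),
    ("1000000000_rid_kind", "void", "chr", 701587412),
    ("1000000000_rid_level", "void", "chr", 701587412),
    ("1000000000_rid_prop", "void", "oid", 701587412),
    ("1000000000_rid_size", "void", "int", 701587412),
]

def estimate_bytes(htype: str, ttype: str, count: int) -> int:
    """Head width plus tail width, times the number of entries."""
    return (WIDTH[htype] + WIDTH[ttype]) * count

total = sum(estimate_bytes(h, t, c) for _, h, t, c in rows)
print(f"fixed-width columns of one document: {total / 2**30:.1f} GiB")
```

Under these assumptions, the fixed-width columns of a single document already need far more than the 4 GiB that a 32-bit process can address. That fits the advice to use a 64-bit build, but it does not by itself explain why only the updatable variant fails.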