On 2010-01-22 12:31:15 +0100, Ying Zhang said:
Hello Xander,
On Fri, Jan 22, 2010 at 09:48:31AM +0100, Xander Schrijen wrote:
Hi Jennie,
Thank you for your response. To answer the easy questions first: I am using Mac OS X 10.5.8, and running a 32-bit MonetDB. I grabbed the MonetDB-Nov2009-SuperBall from the monetdb.cwi.nl website. I add the documents as read-only. Anything suspicious?
I can't see anything suspicious directly. If your documents are large or contain a large number of nodes, a 32-bit system could be a problem: 32-bit OIDs might not be enough to number all the nodes. But still, it should not corrupt the database.
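To make the OID remark concrete, here is a rough Python sketch that estimates how many nodes a document contains and compares the total against a 32-bit range. It assumes one OID per element, attribute, and non-whitespace text node, and it assumes the full 2**32 range is usable; the real numbering scheme and usable range in MonetDB/XQuery may differ. The names `estimate_node_count` and `fits_in_32bit_oids` are made up for this illustration.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the full 32-bit range is available; the usable range may be smaller.
OID_LIMIT_32BIT = 2 ** 32


def estimate_node_count(source):
    """Roughly count element, attribute, and non-whitespace text nodes.

    `source` is a filename or a file object, as accepted by ET.parse.
    Comments and processing instructions are ignored by the default parser.
    """
    count = 0
    for elem in ET.parse(source).iter():
        count += 1 + len(elem.attrib)  # the element itself plus its attributes
        # Text before the first child and text after the end tag are separate nodes.
        count += sum(1 for t in (elem.text, elem.tail) if t and t.strip())
    return count


def fits_in_32bit_oids(total_nodes, limit=OID_LIMIT_32BIT):
    """True if every node could receive a distinct OID below `limit`."""
    return total_nodes < limit


sample = b'<a x="1"><b>hi</b>tail<c/></a>'
nodes = estimate_node_count(io.BytesIO(sample))
print(nodes, fits_in_32bit_oids(nodes))
```

Summing the estimate over all documents in a collection gives a first indication of whether the node count is anywhere near the 32-bit range.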
I always get the error on the same documents, but sometimes the database seems to be corrupted after the error, and queries crash as well, and sometimes I get a clean abort. The corruption usually happens when I do a batch import.
I've switched to a different XQuery/XML database for the moment because I really needed to run the queries, but I'll try to set up a reproducible crash. I need to check if I can share the data+query.
Sorry for the inconvenience. We would appreciate it very much if you could provide us with the data and query. Just a minimal set to reproduce the problem would be great. Then, we can look into the problem.
Unfortunately, I can't provide you with the data since it's customer data. What happens is that there are xmlns:prefix="This is not a URI" attributes on which the shredder gives a warning. It looks as if, when 'enough' of these happen, a query on namespaces like the one below crashes.

    declare function local:namespaces() as xs:string*
    {
      for $node in $collection//*
      let $name := namespace-uri($node)
      return $name
    };

    let $nspaces := distinct-values(local:namespaces())
    let $rows :=
      for $namespace in $nspaces
      let $count := count($collection//*[namespace-uri() = $namespace])
      let $namespace-text :=
        if ($namespace = "") then "<empty namespace>" else $namespace
      order by $count descending
      return <tr><td>{$namespace-text}</td><td class="num">{$count}</td></tr>
    return subsequence($rows,1,10)

(Gives a top 10 list of the most often occurring namespaces.)

Xander.
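Since the customer data cannot be shared, one way to narrow the problem down is to find the documents that carry such namespace declarations before they are shredded. The Python sketch below lists namespace declarations whose value contains characters that are not allowed in a URI. The check is a heuristic based on the RFC 3986 character set, not the test the MonetDB shredder actually applies, and the helper names `looks_like_uri` and `suspicious_namespaces` are made up for this illustration.

```python
import io
import re
import xml.etree.ElementTree as ET

# Characters permitted in a URI reference per RFC 3986
# (unreserved, reserved, and '%' for percent-encoding).
_URI_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")


def looks_like_uri(value):
    """Heuristic only: accept any string built purely from URI characters."""
    return _URI_CHARS.fullmatch(value) is not None


def suspicious_namespaces(source):
    """Return (prefix, uri) pairs whose URI fails the heuristic.

    `source` is a filename or a binary file object, as accepted by iterparse.
    """
    bad = []
    # "start-ns" events report each namespace declaration as (prefix, uri).
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        if not looks_like_uri(uri):
            bad.append((prefix, uri))
    return bad


sample = (
    b'<root xmlns:ok="http://example.org/ns" '
    b'xmlns:bad="This is not a URI"><ok:a/><bad:b/></root>'
)
print(suspicious_namespaces(io.BytesIO(sample)))
```

Running this over a batch before import gives the set of documents to leave out or repair, which may also help in building a minimal test case from non-confidential data.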
Regards,
Jennie
Xander.
On 2010-01-19 17:22:13 +0100, Ying Zhang said:
Hello Xander,
First of all, thanks for trying MonetDB/XQuery, and sorry for the inconvenience.
Before we can help you, I have several questions:
- Which OS are you using? Is it 32-bit or 64-bit?
- Which MonetDB/XQuery version are you using, and how did you install it?
- How large are your documents, i.e., sizes of individual documents and total size?
- I assume that you use the pf:add-doc() function to add your documents. Do you add them as read-only or updatable documents?
- Do you always get the errors, i.e., no matter how many documents are contained in a set? Or do you only/mainly get the errors if your set contains a large number of documents?
- It would help a lot if we could reproduce your problem. So, I'm wondering if you can make (part of) your data+query accessible to us.
Kind regards,
Jennie
On Thu, Jan 14, 2010 at 10:47:14AM +0100, Xander Schrijen wrote:
Hi list,
I'm using MonetDB/XQuery to run queries on sets of XML documents of all kinds (build files, process descriptions). These sets can range from a few dozen to a few thousand documents, so the speed of MonetDB can be a real boon.
However, when shredding these documents there may be a few errors. Sometimes a namespace URI is reported as not well formed, sometimes shredding 'hangs' (esp. when using the batch-import method from the manual), sometimes I can't find a reason.
After such an error, the database seems to be corrupted. Sometimes queries that ask which collections are in the database give errors, sometimes an XPath query doesn't finish. In most cases, the ERROR=.... from mclient doesn't give a proper error message (ASCII garbage).
I realize these reports are vague. However, even though this happens a lot, reproducing the exact sequence can be hard. (Importing a few hundred documents, waiting for the import error to occur, using the right query to trigger the error...) But as it is now, I don't even know if these things are known, have work-arounds, or require fixing the XML documents upfront.
Are there people on the list with experience with this that are willing to help me out?
Regards,
Xander.
------------------------------------------------------------------------------ Throughout its 18-year history, RSA Conference consistently attracts the world's best and brightest in the field, creating opportunities for Conference attendees to learn about information security's most important issues through interactions with peers, luminaries and emerging and established companies. http://p.sf.net/sfu/rsaconf-dev2dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users