As a further experiment I shredded the individual documents used to make the large composite into the db on the Linux box. The shredding completed without error, adding 532 documents to the database. After creating the tijah index I ran the following query:

    tijah:queryall("//p[about(., 'drug misuse')]")

and it returned a number of correct results. (The same sequence of shredding, indexing and querying on a Windows box produced no results.)

Not all results were printed to the console, however, as the query produced the following error:

    !ERROR: XML Generation: tmpr_1231 BAT does not have a 120 head.
    ...
    ERROR = !ERROR:
    !ERROR: xquery_print_result_main: operation failed.

The error may be memory-related, as my Jaunty installation is running under VirtualBox.

-- Roy

Roy Walter wrote:
It's a strange one. I have been experimenting a little.

I was working with a single large [composite] document (176MB) that 
showed the problem. So I took a subset of files containing the search 
terms and created a smaller composite (8MB).

The small composite works correctly, as far as I can tell. So instead of 
a large composite I shredded all the individual files used to create 
the large composite to see if it makes any difference. It doesn't. The 
problem persists.

When working with the small composite I also noticed that the query 
produces more [correct] results than when working with the large 
composite, and that the large composite returns incorrect results.

For example, when searching for the phrase 'drug treatment', querying 
the large composite document returned hits containing 'drug' AND 
'treatment' and only one hit containing the sought phrase. Searching the 
small composite returned 20 correct results for the sought phrase. (To 
clarify: the large and small composites contain the same documents.)
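
A quick way to tell the two kinds of hit apart is to post-process the returned text outside the database. The following is a minimal sketch, not part of MonetDB or tijah: the function name classify_hit and the three sample strings are made up for illustration, and it assumes the text of each returned <p> has already been extracted as a plain string.

```python
import re

def classify_hit(text, phrase):
    """Return 'phrase', 'all-terms' or 'none' for the text of one hit."""
    # Lower-case and tokenise so line breaks and punctuation do not hide a phrase.
    words = re.findall(r"\w+", text.lower())
    terms = phrase.lower().split()
    n = len(terms)
    # Phrase match: the terms occur adjacent to each other and in order.
    if any(words[i:i + n] == terms for i in range(len(words) - n + 1)):
        return "phrase"
    # Weaker match: every term occurs somewhere in the text (the AND case).
    if all(t in words for t in terms):
        return "all-terms"
    return "none"

# Hypothetical sample hits, invented for this example.
hits = [
    "Access to drug treatment has improved.",
    "The drug was withdrawn before treatment began.",
    "Alcohol services were reviewed.",
]
for h in hits:
    print(classify_hit(h, "drug treatment"), "-", h)
```

Counting the 'phrase' and 'all-terms' labels for the large and the small composite would put numbers on the difference described above.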

I don't know if it's related, but I installed MonetDB4/XQuery on an 
Ubuntu box and I cannot shred the large composite into the database 
owing to a parsing error. This error, clearly, is not occurring under 
Windows. I will try shredding the small composite on the Linux box.

(Incidentally, if files are added to the database in bulk using UNC 
pathnames, e.g., <doc path="\\nas\public\export\" name="myfile.xml"/> 
the files are shredded into the db but tijah:create-ft-index() 
thereafter fails with a shred error because of the pathname. I guess 
it's expecting Unix-style pathnames, although I question why the tijah 
indexer cares about the pathname since the documents are in the db. I 
will add this to the bug list.)
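
One way to test that guess would be to rewrite the path attribute with forward slashes before the bulk add. This is only a sketch under that assumption: the helper name to_forward_slashes is hypothetical, and whether the shredder or tijah:create-ft-index() accepts the rewritten path has not been verified here.

```python
def to_forward_slashes(path):
    """Rewrite a Windows or UNC path so that it uses forward slashes."""
    # Assumption: the indexer only objects to the backslashes, not to the UNC host part.
    return path.replace("\\", "/")

# A raw string cannot end in a backslash, so the trailing one is appended separately.
unc = r"\\nas\public\export" + "\\"
print(to_forward_slashes(unc))  # prints //nas/public/export/
```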

-- Roy

Henning Rode wrote:
  
sounds clearly like a bug. could you send me a short example document 
that i can index and experiment with to find the bug?

-henning

Roy Walter wrote:
    
The following query:

    tijah:queryall("//p[about(., 'drug treatment')]")

returns a number of results from my sample document. Some of these 
results contain the phrase "drug misuse". The following query:

    tijah:queryall("//p[about(., 'drug misuse')]")

returns zero results from the sample document, which is clearly 
incorrect since some results returned by the first query should be 
returned by the second.

I have deleted and reloaded the sample document and I have recreated 
the tijah index and the result is consistently incorrect. Is this a bug?

-- Roy

_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
  
      