On 2009-07-22 16:18, Roy Walter wrote:
OK thanks.
Trying to create an index on a single 300MB document I get this error:
xquery>tijah:create-ft-index()
more>^Z
#VirtualAlloc(00000000,161714176,MEM_COMMIT,PAGE_READWRITE): failed
#GDKvmalloc(161710088) fails, try to free up space [memory in use=269395036,virtual memory in use=1611661312]
#GDKvmalloc(161710088) result [mem=220991540,vm=1554972672]
#VirtualAlloc(00000000,161714176,MEM_COMMIT,PAGE_READWRITE): failed
#GDKmmap(162004992) fails, try to free up space [memory in use=220991564,virtual memory in use=1554972672]
#GDKmmap(162004992) result [mem=218123036,vm=1553268736]
MAPI = monetdb@localhost:50000
QUERY = tijah:create-ft-index()
ERROR = !ERROR: HEAPalloc: Insufficient space for HEAP of 162004992 bytes.
!ERROR: CMDkdiff: operation failed.
Is there a workaround?
Assuming this is the same system as the one you started this thread with, the "workaround" may have to be to get a bigger machine. In particular, you may have to upgrade to a 64 bit architecture. Since MonetDB needs to address all tables it is working on simultaneously, it needs a large enough address space. With a 32 bit architecture you reach the limit (2 GB address space) pretty quickly if you're using large documents.
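As a rough illustration of the address-space argument above, the figures from the first failed allocation in the log can be checked against the 2 GB limit mentioned in the reply. This is only a sketch: the function name is made up for the example, and the reading that the allocation needs one contiguous region is an assumption, not something stated in this thread.

```python
# Figures copied from the console output above (first failed VirtualAlloc).
ADDRESS_SPACE_32BIT = 2 ** 31   # the 2 GB address space quoted in the reply
vm_in_use = 1611661312          # "virtual memory in use" at the first failure
request = 161714176             # bytes requested from VirtualAlloc


def headroom(vm_used, limit=ADDRESS_SPACE_32BIT):
    """Return how many bytes of address space are nominally still unused."""
    return limit - vm_used


free = headroom(vm_in_use)
mib = 2 ** 20
print(f"in use:  {vm_in_use / mib:.0f} MiB")
print(f"free:    {free / mib:.0f} MiB (nominal)")
print(f"request: {request / mib:.1f} MiB")
# Assumption: a single allocation of this size needs one contiguous free
# region, so nominal headroom alone does not guarantee that it succeeds.
print("fits nominally:", request <= free)
```

With these numbers about three quarters of the address space is already in use when the request is made, so a single document of this size leaves little room to spare on a 32-bit system.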
Incidentally, I tried to install the Ubuntu version from http://monetdb.cwi.nl/downloads/Ubuntu/. After running apt-get install monetdb\* I get an error: Couldn't find package monetdb*.
It worked the last time I tried it, but that was a while ago. I'll try again as soon as I'm near an Ubuntu virtual machine. By the way, you *are* trying it for Jaunty?
-- Roy
Sjoerd Mullender wrote:
This is an internal error. Just yesterday I fixed a problem which resulted in this very error. I hope that my fix also fixes this instance of the error.
I'm currently working on a new bug fix release, to be released next week (if things work out with the build) which contains my fix.
Roy Walter wrote:
Another problem:
Because of the difficulty I was having with collections and indexes I decided to combine my XML documents to create several large documents of approx. 100MB. Adding and querying one document worked fine. Then I tried to add a second document which produced the following error:
xquery>pf:add-doc("http://dev.govmonitor.com/export/debates2006.xml", "debates2006.xml", "debates")
more>^Z
xquery>pf:add-doc("http://dev.govmonitor.com/export/debates2007.xml", "debates2007.xml", "debates")
more>^Z
MAPI = monetdb@localhost:50000
QUERY = pf:add-doc("http://dev.govmonitor.com/export/debates2007.xml", "debates2007.xml", "debates")
ERROR = !ERROR: GDKremovedir: rmdir(bat\DELETE_ME) failed.
!OS: The directory is not empty.
!OS: XQDY0062: checkpoint failed (in pf_checkpoint), query aborted.
xquery>
Anyone know what this means?
-- Roy
Roy Walter wrote:
OK a few more problems and a success.
I deleted my data and started again to get around the index corruption.
I reloaded the 110MB XMark document and ran a tijah:query(). The query completed in < 0.5 seconds. Good!
I then reloaded a collection of 360 of my documents. The load was fine so I ran a basic tijah:query() that took a long time. Something must be wrong so I thought I would delete and re-create the index.
Deleting the index appeared to work without error. On recreating the index I got an error. The error refers to the XML document that was used to create my collection, i.e., the offending document is not in the database. Here's the console output:
QUERY = tijah:create-ft-index()
ERROR = !ERROR: [shred_url]: 1 times inserted nil due to errors at tuples 0@0.
!ERROR: [shred_url]: first error was:
!ERROR: shred: cannot stat `::4417490901795001::\\nas\public\2006docs.xml': No such file or directory
!ERROR: CMDshred_url: operation failed.
!ERROR: interpret_params: leftfetchjoin(param 2): evaluation error.
Timer 1439.966 msec
-- Roy
Sjoerd Mullender wrote:
On Windows the .bat scripts that you use to start the server (possibly via the Start menu) specify the dbfarm directory as %APPDATA%\MonetDB4\dbfarm. APPDATA is your Application Data folder, C:\Documents and Settings\<username>\Application Data (on XP).
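To make the location above concrete, here is a minimal sketch that builds the dbfarm path from an APPDATA value. The function name and the user name "roy" in the sample path are made up for the example; only the %APPDATA%\MonetDB4\dbfarm layout comes from the message above.

```python
import ntpath
import os


def dbfarm_path(appdata):
    """Join an APPDATA folder with the MonetDB4/dbfarm suffix, Windows style."""
    # ntpath applies Windows path rules even when run on another platform.
    return ntpath.join(appdata, "MonetDB4", "dbfarm")


# Hypothetical XP-style APPDATA value; the user name is only an example.
print(dbfarm_path(r"C:\Documents and Settings\roy\Application Data"))

# On a real Windows machine, use the APPDATA environment variable instead.
appdata = os.environ.get("APPDATA")
if appdata:
    path = dbfarm_path(appdata)
    print(path, "exists" if os.path.isdir(path) else "does not exist")
```

Running this on the affected machine shows whether the directory the .bat scripts point to exists at all.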
Roy Walter wrote:
OK so this is interesting. I don't have a dbfarm directory.
I looked in monetdb.conf and noticed a couple of things.
The MonetDB installation path value appears as:
prefix=c:\documents and settings\sjoerd\my documents\src\stable\icc32\nt32
This means that neither the datadir nor the gdk_dbfarm entries can be properly processed. Might this explain why I [and others] have to start the XQuery Server from the command line and load the modules manually?
So right now I can't find my data :) What file extensions does MonetDB use?
-- Roy
Lefteris wrote:
> Hi,
>
> it looks like something got corrupted in the process. This is starting
> to be very common with Windows installations. Anyway, you should delete
> the contents of the dbfarm directory in the monetdb installation
> directory. This will *delete* all data in your database (I guess you
> don't have any important data in it yet because you are testing). After
> you delete dbfarm, start mserver, which will populate dbfarm again with
> the default files. Then add your data again with the usual
> pf:add-doc, then create the tijah indices and then run your queries:)
>
> If this does not work, we will have to try to "wake up" the pf/tijah
> people to help us.
>
> Hope this will help,
>
> lefteris
>
> On Tue, Jul 21, 2009 at 10:35 AM, Roy Walter wrote:
>
>> OK, I restarted and got an error message that pointed to a problem document
>> in a collection. I deleted the offending document and then tried to generate
>> the default index with tijah:create-ft-index(). This failed because,
>> apparently, the DFLT_FT_INDEX already exists.
>>
>> So I thought that even though the index compilation appeared to have failed
>> at the earlier error an index must have been created.
>>
>> I tried a tijah:query(). That failed because the DFLT_FT_INDEX does not
>> exist.
>>
>> Hmm, so I tried tijah:delete-ft-index() and it too told me that
>> DFLT_FT_INDEX does not exist.
>>
>> tijah:create-ft-index() still fails with: !ERROR tj_init_collection, pftijah
>> collection already exists: DFLT_FT_INDEX
>>
>> How do I reset?
>>
>> -- Roy
>>
>>
>> Lefteris wrote:
>>
>> This is not expected.
>>
>> Did you try to restart the server and retry?
>>
>> You might also have a corrupted dbfarm or the documents didn't shred
>> correctly to begin with. Which version of monet are you using? How did
>> you install it?
>>
>> lefteris
>>
>> On Mon, Jul 20, 2009 at 8:52 PM, Roy Walter wrote:
>>
>>
>> Hi lefteris
>>
>> Well that seems to tick all the boxes.
>>
>> I tried the global index creation:
>>
>> tijah:create-ft-index()
>>
>> and it crashed the server with:
>>
>> !WARNING: readClient: unexpected end of file; discarding partial input
>>
>> Hmm...
>>
>> R.
>>
>> Lefteris wrote:
>>
>>
>> Hi Roy,
>>
>> I suggest that you try the pf/tijah module for MonetDB/XQuery.
>>
>> http://dbappl.cs.utwente.nl/pftijah/
>>
>> This will create specific indices for your queries to facilitate text
>> search.
>>
>> Hope this helps for now. We will also investigate where the time is
>> spent in your case (without pf/tijah) and come back to you. How many p
>> elements do your documents have? The problem might be that because monet
>> does not build inverted indices on text by itself, it has to visit
>> each p element and search with the help of the pcre library. Pf/tijah
>> was built for that purpose and should help a lot.
>>
>> Please feel free to contact us for further clarification and new
>> findings from your tests:)
>>
>> cheers,
>>
>> lefteris
>>
>> On Mon, Jul 20, 2009 at 6:28 PM, Roy Walter wrote:
>>
>>
>>
>> Running MonetDB/XQuery on a 2.6GHz 32-bit Windows XP box with 1GB of RAM.
>>
>> What is the best way to organise XML in MonetDB for rapid text searching?
>> A run down of my recent experience might help.
>>
>> I created a collection of around 450 documents (153MB approx.). I ran the
>> following query from the command line:
>>
>> collection("papers")//p[contains(., 'wind farm')]
>>
>> The query time is at best 19 seconds. That's bad. (It's worse than
>> querying a Postgres database with documents stored in the XML field type.)
>>
>> So to get a reference point I loaded up the 114MB XMark document and ran
>> this query:
>>
>> doc("standard")//text[contains(., "yoke")]
>>
>> The query time varies from 2 to 4 seconds. Better, but still not great.
>>
>> Now, adding more RAM (and moving to 64-bit) would speed things up I hope!
>> But hardware aside:
>>
>> 1. Is it better to have big documents rather than big collections?
>>
>> 2. Is having small collections (<10 docs) of big documents also
>> inefficient?
>>
>> Ideally I need to query collections comprising several thousand documents
>> using 'text search' predicates. Are there other, better ways to run this
>> type of query against a MonetDB XML database? Or should I really be using
>> some other platform for this task?
>>
>> Thanks in advance for any pointers.
>>
>> -- Roy
>>
>> _______________________________________________
>> MonetDB-users mailing list
>> MonetDB-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/monetdb-users
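The point made in the quoted thread, that a contains() predicate has to visit every p element while a text index can jump straight to candidates, can be sketched in a few lines. This is a toy illustration only, not how pf/tijah or MonetDB work internally; all names and the sample paragraphs are made up, and the index lookup assumes the search phrase consists of whole words.

```python
import re
from collections import defaultdict


def scan(paragraphs, phrase):
    """Linear scan: test every paragraph, like a contains() predicate."""
    return [i for i, text in enumerate(paragraphs) if phrase in text]


def build_index(paragraphs):
    """Toy inverted index: lowercased word -> set of paragraph numbers."""
    index = defaultdict(set)
    for i, text in enumerate(paragraphs):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(i)
    return index


def lookup(index, paragraphs, phrase):
    """Use the index to find candidates, then verify the exact phrase."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        return []
    # Only paragraphs that contain every word of the phrase are candidates.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(i for i in candidates if phrase in paragraphs[i])


# Made-up sample data for the example.
paragraphs = [
    "The wind farm opened last year.",
    "A farm stood in the wind.",
    "Nothing relevant here.",
]
index = build_index(paragraphs)
print(scan(paragraphs, "wind farm"))
print(lookup(index, paragraphs, "wind farm"))
```

The scan touches all paragraphs on every query, whereas the lookup only re-reads the paragraphs that contain every word of the phrase; building the index is a one-off cost paid at load time.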
-- Sjoerd Mullender