[MonetDB-users] shrinking BATs and distributed/streaming MonetDB
I was finally able to install MonetDB and put data into it (on OSX), although I don't have any performance numbers to report back just yet.
- If I delete or even drop a large table from MDB5, many of the BAT files remain on the file system. Is there a command that deflates/shrinks or completely removes the files? I wasn't able to find anything in the docs.
- According to previous messages on the mailing list and some other docs, MonetDB 5 is supposed to work with distributed databases (Armada?). I don't see anything regarding this in the docs either; is it coming soon?
- I am very (VERY) interested in the ability to have a cluster/grid of MDB servers which lets me distribute data and computation while providing an easy-to-use interface for end users and admins. Along with that, if we get the ability to process streaming data (such as stock quotes), MonetDB will become king of Wall Street :)
- I see that one of the recommended projects for students is to extend MonetDB to process time series data. If someone is looking at this, please take a look at this paper: http://citeseer.ist.psu.edu/610025.html (AQuery for ordered data).
Thanks
Shahbaz wrote:
I was finally able to install monetdb and put data into it (on OSX), although I don't have any performance numbers to report back just yet.
Thanks. Good to know.
-If I delete or even drop a large table from MDB5, many of the BAT files remain on the file system. Is there a command that deflates/shrinks or completely removes the files? I wasn't able to find anything in the docs
This does not sound good and should not happen. It would be nice if you could demonstrate this with a small example and file a bug report. If you drop a table, the underlying files should be gone. If you delete, the files remain but will be empty. [Aside: we noticed some memory leakage, which has been attacked recently. This has been resolved in the bug-fix release issued 2 days ago.] Suggested quick approach: dump your SQL database, throw away the dbfarm, and reload the SQL database.
-According to previous messages on the mailing list and some other docs, MonetDB 5 is supposed to work with distributed databases (Armada?). I don't see anything regarding this in the docs either, is it coming soon?
Armada is a research project on distributed and autonomous DBMS. There won't be a releasable version soon. Currently, distribution would have to be taken care of in the application/middleware. We haven't tried out JDBC-based stuff like GORDA (http://gorda.di.uminho.pt/publications).
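To make "distribution taken care of in the application" a little more concrete, here is a minimal sketch of one common approach: the client hashes a partitioning key and decides which server each row belongs to before loading. This is not something described in the thread and not a MonetDB feature; the server names and the helper functions `server_for` and `partition` are hypothetical, and the actual loading into each server is left out.

```python
import hashlib

# Hypothetical server addresses, used only for illustration.
SERVERS = ["mdb-node-0:50000", "mdb-node-1:50000", "mdb-node-2:50000"]


def server_for(key, servers=SERVERS):
    """Pick a server by hashing the partitioning key (stable across runs)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def partition(rows, key_of, servers=SERVERS):
    """Group rows by target server so each group can be loaded separately."""
    groups = {server: [] for server in servers}
    for row in rows:
        groups[server_for(key_of(row), servers)].append(row)
    return groups


if __name__ == "__main__":
    quotes = [("IBM", 101.5), ("MSFT", 29.1), ("IBM", 101.7), ("AAPL", 88.0)]
    for server, rows in partition(quotes, key_of=lambda r: r[0]).items():
        print(server, rows)
```

With this scheme all rows for one key land on the same server, so per-key queries touch a single node, while queries across keys have to be sent to every node and merged by the application.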
-I am very (VERY) interested in the ability to have a cluster/grid of MDB servers which let me distribute data and computation while providing an easy to use interface for end users and admins. Along
Yes, we are too, and we are working on innovations in this area in the context of scaling up the SkyServer implementation (http://cas.sdss.org/dr5/en/).
with that, if we get the ability to process streaming data (such as stock quotes), MonetDB will become king of Wall Street :)
Yes, this is being looked into with 2 PhDs.
-I see that one of the recommended projects for students is to extend MonetDB to process time series data. If someone is looking at this, please take a look at this paper: http://citeseer.ist.psu.edu/610025.html (AQuery for ordered data).
Yes, array-based support has been part of our research agenda for some time, for information retrieval. The latest paper is available from the website: R. Cornacchia, S. Heman, M. Zukowski, A. P. de Vries, P. A. Boncz. Flexible and efficient IR using array databases. Technical Report INS-E0701, CWI, Amsterdam, The Netherlands, January 2007.
regards, Martin
Thanks Martin,
Apparently the BATs stayed large because my server had crashed. I again tried to load a file into my server and it crashed again. The reason is probably that I am on a 32 bit machine (osx laptop) and I was trying to insert a file slightly over 5 gigs. One of the problems was that I wasn't getting any messages, I couldn't tell if the file was even being inserted.
Initially I tried loading my file using the following syntax:
./MapiClient -l sql -d demo -u monetdb -P monetdb -s "copy into sys.dash5 from 'data/dump.final' using delimiters '|'"
This command just seems to hang there.
I then logged in using mjclient and just issued the command inside the quotes: "copy into ... using deli..." This ran for a while then I received the following message:
Error: End of stream reached (Mserver still alive?)
When I checked my server, it had crashed, with the following message:
./mserver5
# MonetDB Server v5.0.0_beta2_1
# Copyright (c) 1993-2007 CWI, all rights reserved
# Compiled for i686-apple-darwin8.8.1/32bit with 32bit OIDs dynamically linked
# dbname:demo
# Visit http://monetdb.cwi.nl/ for further information
#warning: please don't forget to set your vault key!
#(see /Users/me/MonetDB/etc/monetdb5.conf)
include sql;
##GDKvmtrim(4294967295) MT_mmap_trim(4095 MB): rss = 0 MB
##GDKvmtrim(4294967295) MT_mmap_trim(4095 MB): rss = 0 MB
##GDKvmtrim(4294967295) MT_mmap_trim(4095 MB): rss = 0 MB
gdk_utils.mx:1149: failed assertion `(oldsize & 2) == 0'
Abort trap
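The numbers in the log line up with the 32-bit hypothesis: 4294967295 is 2^32 - 1, which is just under 4096 MB, while the input file is slightly over 5 GB. The small sketch below only checks that arithmetic. It does not show that the address-space limit is what triggered the failed assertion; that remains the poster's guess. The helper name `fits_in_address_space` is made up for illustration.

```python
ADDRESS_BITS = 32
MB = 1024 * 1024
GB = 1024 * MB

# Largest value representable in 32 bits, as printed in the GDKvmtrim lines.
max_addressable = 2**ADDRESS_BITS - 1


def fits_in_address_space(size_bytes, address_bits=ADDRESS_BITS):
    """True if a region of size_bytes could be addressed at all with address_bits."""
    return size_bytes < 2**address_bits


if __name__ == "__main__":
    print(max_addressable)                # 4294967295
    print(max_addressable // MB)          # 4095
    print(fits_in_address_space(5 * GB))  # False
```

In practice the usable limit for a single process is lower than 4 GB, since the address space is shared with code, stack, and other mappings.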
Part of the problem is that I don't really see any messages saying what the server is doing. I tried to follow the change in disk size (df -kh), but I saw that number decreasing very slowly and by a very small amount.
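In the absence of progress messages from the server, one rough workaround is to watch the size of the database directory itself rather than the whole disk. The following is only a sketch under assumptions: the dbfarm path is hypothetical, `dir_size_bytes` and `watch` are illustrative helpers, and nothing here talks to MonetDB. It sums apparent file sizes, which can differ from what df reports for memory-mapped or sparse files.

```python
import os
import time


def dir_size_bytes(root):
    """Sum the apparent sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file disappeared between listing and stat
    return total


def watch(root, interval_s=5.0, samples=3):
    """Print and return (size, delta) readings taken interval_s apart."""
    readings = []
    previous = dir_size_bytes(root)
    for _ in range(samples):
        time.sleep(interval_s)
        current = dir_size_bytes(root)
        readings.append((current, current - previous))
        print(f"{current} bytes ({current - previous:+d} since last sample)")
        previous = current
    return readings


if __name__ == "__main__":
    # Hypothetical location; substitute the dbfarm directory of your install.
    watch("/Users/me/MonetDB/var/MonetDB5/dbfarm/demo")
```

A delta that stays at zero for a long time while the client still appears to be waiting is a hint to check whether the server process is still alive.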
Thanks for this detailed report; we'll take it along in scrutinizing the code and improving documentation/error reporting.
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
participants (2)
- Martin Kersten
- Shahbaz