[MonetDB-users] Database size
Hi,
I have read previously that monetdb is intended to be used only when the data can fit in memory. I'm not sure how to interpret this statement - does this mean that the entire database must fit in memory, or does it mean that only the data being operated on when selected from the database must fit in memory?
We are wanting to store *huge* amounts of data, ultimately several TBs. The actual data being selected at any given time will only be several hundred MBs, however. Is this a reasonable usage of MonetDB?
But most importantly, I have noticed that all the row ID indexes are "int" - why the 32 bit limitation? This just seems like a symptom of the above problem; that the entire database must fit in memory.
Thanks a lot for your advice,
Charles
-- Charles Samuels
Hi,
MonetDB reads into memory only the data relevant to the query. Since
MonetDB is also a column store, it does not need to read the entire
table into memory, only the columns that are needed. As such, there
is *no* requirement that the entire database fit in memory. Moreover,
MonetDB will continue working even if the data relevant to the query does
not fit in memory :) so in that sense MonetDB is not "intended to be used
only when the data can fit in memory". Also note that only the columns
that participate in the current operator of a query plan will be in
memory, not all columns of the entire plan.
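To illustrate the point about columns, here is a minimal sketch. It is not MonetDB's actual storage layout or API: it assumes a table stored as one binary file per column and memory-maps only the columns a query touches. The ".col" file naming, the write_column and sum_where helpers, and the 8-byte integer encoding are all assumptions made up for this example.

```python
import mmap
import os
import struct
import tempfile

ITEM = struct.Struct("<q")  # assumption: every value is a little-endian 64-bit int


def write_column(directory, name, values):
    """Store one column as its own file (hypothetical layout)."""
    path = os.path.join(directory, name + ".col")
    with open(path, "wb") as f:
        for v in values:
            f.write(ITEM.pack(v))
    return path


def sum_where(directory, value_col, filter_col, threshold):
    """Roughly: SELECT sum(value_col) WHERE filter_col > threshold.

    Only the two named column files are mapped; the other columns of
    the table are never opened.
    """
    total = 0
    with open(os.path.join(directory, filter_col + ".col"), "rb") as ff, \
         open(os.path.join(directory, value_col + ".col"), "rb") as vf:
        fmap = mmap.mmap(ff.fileno(), 0, access=mmap.ACCESS_READ)
        vmap = mmap.mmap(vf.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            for offset in range(0, len(fmap), ITEM.size):
                if ITEM.unpack_from(fmap, offset)[0] > threshold:
                    total += ITEM.unpack_from(vmap, offset)[0]
        finally:
            fmap.close()
            vmap.close()
    return total


def demo():
    with tempfile.TemporaryDirectory() as d:
        write_column(d, "a", [1, 2, 3, 4])
        write_column(d, "b", [10, 20, 30, 40])
        write_column(d, "c", [7, 7, 7, 7])  # never touched by the query
        return sum_where(d, "b", "a", 2)
```

Because the files are memory-mapped, the operating system pages in only the parts that are actually read, which is the general idea behind a working set smaller than the database.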
As for the row IDs, which in MonetDB terminology are called OIDs, they
can also be 64-bit. In fact, for the size of database you are talking
about, they need to be 64-bit. This can be done by compiling with the
--oid-64 option or by downloading the 64-bit version of the binary
distribution.
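As a rough check of when 32-bit OIDs stop being enough, the sketch below computes how many bits are needed to number a given count of rows. It assumes an OID only has to number the rows of the largest table starting at 0; the exact usable range in MonetDB may be smaller (for example if a value is reserved for nil). The helper names are hypothetical and not part of MonetDB.

```python
def oid_bits_needed(rows):
    """Smallest number of bits that can number `rows` rows, starting at 0."""
    if rows < 1:
        raise ValueError("rows must be positive")
    # The largest row number is rows - 1; bit_length() gives its width.
    return max(1, (rows - 1).bit_length())


def fits_in_32_bit_oid(rows):
    """True if all row numbers fit in an unsigned 32-bit value (assumption)."""
    return oid_bits_needed(rows) <= 32
```

For example, 2 billion rows need 31 bits, while 20 billion rows need 35 bits and therefore 64-bit OIDs.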
I hope I could help. Feel free to send email if you have any more questions.
Regards,
lefteris
On Wed, Jul 22, 2009 at 8:30 PM, Charles Samuels wrote:
Hi,
I have read previously that monetdb is intended to be used only when the data can fit in memory. I'm not sure how to interpret this statement - does this mean that the entire database must fit in memory, or does it mean that only the data being operated on when selected from the database must fit in memory?
We are wanting to store *huge* amounts of data, ultimately several TBs. The actual data being selected at any given time will only be several hundred MBs, however. Is this a reasonable usage of MonetDB?
But most importantly, I have noticed that all the row ID indexes are "int" - why the 32 bit limitation? This just seems like a symptom of the above problem; that the entire database must fit in memory.
Thanks a lot for your advice,
Charles
-- Charles Samuels
------------------------------------------------------------------------------ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Charles Samuels wrote:
Hi,
I have read previously that monetdb is intended to be used only when the data can fit in memory. I'm not sure how to interpret this statement - does this mean that the entire database must fit in memory, or does it mean that only the data being operated on when selected from the database must fit in memory?
MonetDB does *not* require your database to fit in main memory. However, it uses main-memory techniques to address the elements in tables loaded for processing. This means that the hot-set accessed during query execution is limited when you run on 32-bit machines. For 64-bit architectures you don't have this 'problem'.
We are wanting to store *huge* amounts of data, ultimately several TBs. The actual data being selected at any given time will only be several hundred MBs, however. Is this a reasonable usage of MonetDB?
The largest database we use internally to experiment with is a 6TB astronomical database. It contains two large tables, one with 400M rows of 500 columns and another with 20B rows and 6 columns. Both tables are larger than 1TB each. This runs on a Linux dual quad-core with 64G RAM and lots of disk space. Of course, at some point you will notice IO behavior ;)
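A back-of-the-envelope sketch of why IO shows up at some point on such a machine: assuming each value takes 8 bytes (an assumption, since the real column types are not stated here) and treating "64G" as 64 * 10**9 bytes, the footprint of a single column of each table can be compared with RAM. The function name is made up for this illustration.

```python
GB = 10**9


def column_bytes(rows, bytes_per_value=8):
    """Size of one column, assuming fixed-width values (8 bytes by default)."""
    return rows * bytes_per_value


small = column_bytes(400 * 10**6)  # one column of the 400M-row table
large = column_bytes(20 * 10**9)   # one column of the 20B-row table
ram = 64 * GB                      # assumption: "64G" read as 64 * 10**9 bytes

columns_in_ram_small = ram // small  # how many such columns fit in RAM at once
fits_large = large <= ram            # does one column of the big table fit?
```

Under these assumptions one column of the wide table is 3.2 GB, so about 20 of its 500 columns fit in RAM at once, while a single column of the 20-billion-row table is 160 GB and does not fit, so scans over it have to come from disk.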
But most importantly, I have noticed that all the row ID indexes are "int" - why the 32 bit limitation? This just seems like a symptom of the above problem; that the entire database must fit in memory.
You can use the system with both 32- and 64-bit oids.
regards, Martin
Thanks a lot for your advice,
Charles
Hi,
I think that is quite an interesting point. We tested MonetDB (MonetDB5-SQL-Installer-x86_64-20090708.msi) on a 64-bit Windows Server 2003 machine with 2GB RAM and sufficient free disk space, and always got "runtime error R6031 - Attempt to initialize the CRT more than once. This indicates a bug in your application." when we tried to load more than about 3.4 GB of data into the tables in a single transaction.
After extending the memory to 4GB we were able to load the data into the tables, but complex cross joins with ORDER BY clauses again crashed the server or caused the error mentioned above. We were able to reproduce this error and found that a second after the free RAM drops below 200MB, the CPU usage stops and the MonetDB server either responds with the error above or freezes. The database is always fully loaded into memory and there does not seem to be any mechanism that clears the memory?!
So it would indeed be very interesting to know whether MonetDB on a Windows machine is able to manage a database that is significantly bigger than the available RAM, and whether the problem described is a known bug.
To reproduce the issue: we used the star-schema-benchmark described here http://www.cs.umb.edu/~xuedchen/research/publications/StarSchemaB.PDF with this data generator http://www.tpc.org/tpch/ and a scale factor 10 [dbgen -s 1 -T a (SF = 10)] (about 5.6 GB of data).
Thank you in advance for your answers.
Regards,
Sebastian Bräuer
-----Original Message-----
From: Martin Kersten [mailto:Martin.Kersten@cwi.nl]
Sent: Wednesday, July 22, 2009 21:25
To: Communication channel for MonetDB users
Subject: Re: [MonetDB-users] Database size
Charles Samuels wrote:
Hi,
I have read previously that monetdb is intended to be used only when the data can fit in memory. I'm not sure how to interpret this statement - does this
mean that the entire database must fit in memory, or does it mean that only the data being operated on when selected from the database must fit in memory?
MonetDB does *not* require your database to fit in main memory. However, it uses main-memory techniques to address the elements in tables loaded for processing. This means that the hot-set accessed during query execution is limited when you run on 32-bit machines. For 64-bit architectures you don't have this 'problem'.
We are wanting to store *huge* amounts of data, ultimately several TBs. The actual data being selected at any given time will only be several hundred MBs, however. Is this a reasonable usage of MonetDB?
The largest database we use internally to experiment with is a 6TB astronomical database. It contains two large tables, one with 400M rows of 500 columns and another with 20B rows and 6 columns. Both tables are larger than 1TB each. This runs on a Linux dual quad-core with 64G RAM and lots of disk space. Of course, at some point you will notice IO behavior ;)
But most importantly, I have noticed that all the row ID indexes are "int" - why the 32 bit limitation? This just seems like a symptom of the above problem; that the entire database must fit in memory.
You can use the system with both 32- and 64-bit oids.
regards, Martin
Thanks a lot for your advice,
Charles
participants (4)
- Charles Samuels
- Lefteris
- Martin Kersten
- Sebastian Bräuer