[MonetDB-users] BAT Memory Usage
Hello everyone,
We are evaluating Monet to store and analyze a very large quantity of data. We have several fact collections that range from 1 million rows and 200 columns to well over 300 million rows and over 10,000 columns.
We are aware of the memory limitations of the current Monet version and are looking forward to the new one, but in the meantime we would like to run some performance and load tests.
To know which data we will be able to store, we need to know how much memory a BAT will use depending on the datatype and the number of elements we want to store in it.
Is there a linear relation between those variables (i.e. 8 bytes oid, 8 bytes int => 16 bytes per element), or does Monet encode data using intervals or something similar?
Thank you in advance and kudos for your big efforts,
Guillermo Arbeiza, Open Sistemas de Información e Internet garbeiza(AT)opensistemas.com
Dear Guillermo,
Thanks for your email and the great challenge. Your assumption about the BAT size is largely correct. The cost can be derived from the underlying type and the word alignment cost.
A design decision that is exploited throughout our applications (e.g. SQL) is to use virtual oids, because they don't require any storage at all. In mapping scenarios for relational tables, this means there need not be any overhead from the oids to reconstruct the rows.
The number of columns is not so relevant, provided you don't need them all in memory at the same time. The cost for your individual (int) columns runs from 8 MB to 2.4 GB, excluding possible hash index overhead.
regards, Martin
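As a rough illustration of the arithmetic in the reply above, the following sketch multiplies the row count by the per-element width. The function name `estimate_bat_bytes` and its parameters are hypothetical and not part of MonetDB. It assumes 8-byte values, as in the original question, treats a virtual oid head as 0 bytes, and ignores word-alignment padding and any hash index overhead.

```python
def estimate_bat_bytes(rows, tail_width=8, head_width=0):
    """Rough BAT size in bytes: rows * (head width + tail width).

    head_width=0 models a virtual oid head (no storage);
    head_width=8 models a materialized 8-byte oid head.
    Alignment padding and hash indexes are NOT included.
    """
    return rows * (head_width + tail_width)


# Smallest and largest columns from the question, with virtual oids.
small = estimate_bat_bytes(1_000_000)
large = estimate_bat_bytes(300_000_000)

# Same large column if the oid head had to be stored (16 bytes per element).
materialized = estimate_bat_bytes(300_000_000, head_width=8)

print(small / 1e6, "MB")
print(large / 1e9, "GB")
print(materialized / 1e9, "GB")
```

Under these assumptions the two virtual-oid estimates match the 8 MB and 2.4 GB figures quoted in the reply, while storing the oids would double the per-column cost.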
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi,
I am evaluating MonetDB5 for our existing system. Currently our database size is ~600 GB; only a few tables are huge (700M rows and growing!). It runs on a 4-CPU, 64-bit machine with 16 GB of RAM and 16 GB of swap space.
Can someone point me to a document or a mail thread that summarizes the lessons learned while dealing with very large datasets?
One of the questions I have in mind: since MonetDB is a main-memory database, should I keep my RAM plus swap size big enough to hold the largest table used in the SQL?
Thanks,
Bharani
participants (3)
- Bharani
- Guillermo Arbeiza
- Martin Kersten