Introducing a testing scenario on big data
Hello list (and the people behind it), you are doing a great job on MonetDB and I have great respect for your work. I have recently done some testing with MonetDB. I like the concept of memory-mapped files, since files are "in memory" even after MonetDB is restarted. Please understand my following mails to this list as a contribution to your project.

I am not very familiar with the BAT algebra or the MAL language; rather, I am looking for an open-source alternative to commercial in-memory databases that can run on commodity hardware and is not bound to a specific operating system. For my use cases it is important to have a rich implementation of SQL and, even more important, a good SQL optimizer, since I have complex SQL statements with several joins over huge tables. From my observations, an improved optimizer would be the key to a wider MonetDB user community. On simple statements I see performance comparable to commercial vendors/products. If my scenarios are interesting for you (the developers), please feel free to contact me to discuss internals or to look at the server's behavior yourself.

I am working on 6 tables, each with up to 1 billion rows and between 4 and 12 columns, on a server with 300 GB RAM, fast SSDs in RAID 10, and 2 CPUs with 6 cores each. Testing was done both in DBVisualizer and in mclient (to avoid JDBC issues and fetch times), running MonetDB v11.17.21 (Jan2014-SP3) on a fresh and clean Debian Wheezy.

I will go on to describe my development-related observations in separate mails, to enable a separate discussion on each issue.

Robert
participants (1)
- Robert Koch