Hi, Thanks for your interest in MonetDB. I'll try to answer your questions wherever I can. Quartz wrote:
Hi,
I'm not too familiar with the BAT concept. Regardless, I want to know more about the SQL equivalents.
We need a database of about 300 million records, roughly 500 GB. We insert about 1000 records per second, or about 100 million records per day, and keep 3 days of data.
Wow. Can you tell us which database this runs on, and on what OS and hardware?
We 'roll' the data by time, deleting the tail end of the tables, but the deletes are too slow. So we are looking for a better DB or data-storage pattern (one that would allow a quick, DROP-table-like or file.delete()-style response time).
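The "DROP-table-like response time" idea can be sketched concretely: keep one table per day, so expiring the oldest day is a metadata-level DROP TABLE instead of a slow row-by-row DELETE. This is a hypothetical illustration using Python's sqlite3 as a stand-in engine (the table names and helpers are made up); the pattern itself is engine-agnostic.

```python
import sqlite3

# Sketch of a time-rolled store: one table per day, so expiring a day
# is a DROP TABLE (file-delete speed), not a bulk DELETE (row scan).
conn = sqlite3.connect(":memory:")

def table_for(day):
    # Hypothetical naming scheme, e.g. events_20240101
    return f"events_{day}"

def insert(day, record_id, payload):
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table_for(day)} (id INTEGER, payload TEXT)")
    conn.execute(
        f"INSERT INTO {table_for(day)} VALUES (?, ?)", (record_id, payload))

def roll(expired_day):
    # Dropping a whole table is a metadata operation, independent of row count.
    conn.execute(f"DROP TABLE IF EXISTS {table_for(expired_day)}")

insert("20240101", 1, "a")
insert("20240102", 2, "b")
roll("20240101")

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)  # ['events_20240102']
```

Queries that span days then have to union the per-day tables, which is the price of getting constant-time expiry.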
So, as I look at MonetDB, I have some questions:
1-What is the insert speed relatively to other DBs?
Using SQL, the current version won't compete very well. We get much more acceptable speeds in 'raw' mode, where much of the parsing overhead is removed.
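The point about parsing overhead can be demonstrated in miniature: issuing one SQL statement per row pays per-statement overhead that a bulk path avoids. This is only an analogy, again using sqlite3 as a stand-in (executemany here plays the role of a bulk/"raw" loading path, not MonetDB's actual mechanism).

```python
import sqlite3
import time

# Compare per-row INSERT statements against a bulk submission of the
# same rows. The bulk path prepares the statement once and loops in C.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

rows = [(i,) for i in range(100_000)]

t0 = time.perf_counter()
for r in rows:
    conn.execute("INSERT INTO t VALUES (?)", r)  # one statement per row
per_row = time.perf_counter() - t0

conn.execute("DELETE FROM t")

t0 = time.perf_counter()
conn.executemany("INSERT INTO t VALUES (?)", rows)  # one prepared statement
bulk = time.perf_counter() - t0

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(f"rows={count}  per-row: {per_row:.3f}s  bulk: {bulk:.3f}s")
```

At 1000 inserts per second sustained, shaving per-statement overhead is exactly where the headroom is.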
2-What is the delete speed relatively to other DBs?
If I recall correctly, on a moving window of data like yours, our BATs keep on growing. Delete speed itself should be fine, but you will run into another serious problem with the growing BATs.
3-What is the insert speed degradation with the addition of indexes?
I believe we don't have indices; if necessary, they are created on the fly. Someone should correct me if I'm wrong.
4-What is the delete speed degradation with the addition of indexes?
idem
Also, in the forum, I saw some discussions about size. We may remain on a 32 bits kernel, so:
5-The 2 gigs limit on BATs, is that record count or byte size?
I think it is byte size. A BAT must be fully memory addressable, which imposes a limit of around 2GB on a 32-bit system. Again, someone correct me if I'm wrong, or not entirely right.
A performance constraint was also highlighted: "the performance degradation starts when MonetDB has to access BATs larger than the available memory"
6-In our case, with 300 millions records, how big would be the BATs?
A BAT can be thought of as a single column. If you have 300 million ints, then that one column alone needs over 1.1GB. If you don't have that amount of memory, it will be memory mapped, in which case your performance is limited by the speed of your disk. The performance degradation has to do with MonetDB having to resort to virtual memory, which is of course much slower than real memory.
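The size estimate above is simple arithmetic, assuming 4-byte ints (the assumption the figure in the reply appears to rest on):

```python
# Back-of-envelope size of one BAT holding 300 million 4-byte ints.
records = 300_000_000
bytes_per_int = 4  # assumed int width
size_gib = records * bytes_per_int / 2**30
print(f"{size_gib:.3f} GiB")  # 1.118 GiB
```

Each additional column of comparable width adds roughly the same amount again, so a wide 300-million-row table multiplies this figure per column.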
Last question, lucky 7: 7-Do you think MonetDB is at all a good choice for our high-throughput, highly volatile data?
I think you would seriously need horizontal partitioning, which will be available in the next-generation MonetDB kernel, to get around some of these problems, both memory-wise and delete-wise. Since you have a very specific use case, it might be worth having a more detailed discussion on whether MonetDB can be of use to you in its current form or in the next generation, and what the planning looks like. Regards, Fabian