
Dear Jeremy,

1) I guess they used an older 32-bit Windows version. See http://www.monetdb.org/Documentation/UserGuide/resources for information about database sizes and platforms. In case of a crash, I hope they reported the problem on the MonetDB user/developer list so it can be handled.

2) Remarks on TPC-H in isolation do not make much sense. They should provide comparative figures on the same platform and report against other open-source systems. On the old website we had figures that clearly showed MySQL and PostgreSQL to be way slower above SF-2. MonetDB runs SF-100 (100 GB) on my Linux desktop without problems, and that machine certainly does not have 100 GB of RAM ;) Overall, on the 4 TB SkyServer application, MonetDB and MS SQL Server were comparable, with some queries handled better by either one.

3) You might also have a look at http://homepages.cwi.nl/~mk/ontimeReport for a comparison on a 120M-row data warehouse, with comparative performance against other systems.

Overall, performance evaluations have to be done with care. They require a good understanding of many aspects of the systems under test, most importantly the impact of resource parameters and software versions.

regards, Martin

On 10/19/11 1:53 AM, Stefan de Konink wrote:
On Tue, 18 Oct 2011, Jeremy OSullivan wrote:
till system crash when data volume exceeds total capacity of memory and swap.
Since MonetDB uses memory mapping, any "data volume" is in essence 'swap': it is up to the operating system to load into memory the parts that are actually being processed.
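To make the point concrete, here is a minimal sketch (not MonetDB's actual code) of how memory mapping defers loading to the operating system: mapping a file costs almost nothing, and only the pages you actually touch are faulted into memory. File name and sizes are made up for illustration.

```python
import mmap
import os
import tempfile

# A file standing in for a large on-disk column (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), "column.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * (8 * 1_000_000))  # 1M 8-byte values, zero-filled

with open(path, "rb") as f:
    # Map the whole file; nothing is read yet -- the OS pages it in lazily.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Touch only a small region: only the pages covering it get loaded.
    chunk = mm[8 * 500_000 : 8 * 500_010]  # values 500000..500009
    mm.close()

print(len(chunk))  # 80 bytes touched, regardless of the file's total size
```

The same mechanism means a file far larger than RAM can be "mapped" without any up-front memory cost; paging pressure only arises from the regions a query actually visits.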
So the high performance of MMDB relies on sufficient memory supply. "
Is this a fair assessment?
I don't think so; it is about selectivity. First of all, MonetDB is a column store, so only the columns that are used in a query plan have to be in memory, either in parallel or in sequence. The total amount of data is therefore not the bottleneck; the selectivity of the query is. Since metadata on the columns exists, data can be read from a specific offset.
The memory mapping allows a specific region to be loaded into memory by the operating system. Obviously this will not really help with a sequential scan over a column that is bigger than the available memory of the host system, but in that scenario the actual performance issue is the bandwidth from storage to memory, not the amount of memory: the data file itself effectively serves as its own swap space.
The only thing regarding a 'system crash' I can currently think of: the intermediate results exceeding the amount of available memory (physical RAM plus swap)...
Stefan
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users