large scale hardware configuration for MonetDB

Hi there,
From https://www.monetdbsolutions.com/news/monetdb-numascale : "The system is fine-tuned to exploit the large main memories of modern computer systems effectively and has proven to be highly efficient in data warehouse query processing. NumaQ can easily scale to thousands of cores and terabytes of memory for data-intensive."
I wonder if there is any public documentation on how to configure large-scale hardware for optimal MonetDB performance, so that one could try to set up a NumaQ-like machine using what you have already learned.
Also from https://stackoverflow.com/questions/7435235/is-it-worth-trying-monetdb : "MonetDB is developed/tuned under two main assumptions: 1. Your workload is analytical, i.e., you have lots of (grouped) aggregations and the like. 2. Even more important: your hot dataset (the data that you actually work with) fits into the main memory of your system. MonetDB does not have it's own Buffer Manager but relies on the OS to handle disk I/O. Since the OS (especially windows but Linux too) is sometimes very dumb about disk swapping that may become a problem (especially for joins that run out of memory)."
Are there any official benchmarks for Windows vs Linux MonetDB performance (especially for queries that require disk swapping)?
Thank you,
Anton

On 6 Sep 2017, at 17:39, Anton Kravchenko wrote:
Hi there,
From https://www.monetdbsolutions.com/news/monetdb-numascale : "The system is fine-tuned to exploit the large main memories of modern computer systems effectively and has proven to be highly efficient in data warehouse query processing. NumaQ can easily scale to thousands of cores and terabytes of memory for data-intensive."
I wonder if there is any public documentation on how to configure large-scale hardware for optimal MonetDB performance, so that one could try to set up a NumaQ-like machine using what you have already learned.
Hi Anton,
I'm not aware of such a document, not even internally (correct me if I'm wrong). There is no single answer to "how to configure large scale hardware for optimal MonetDB performance". It depends heavily on the application use case, e.g. the size of the data, the rate and size of updates, the level of concurrency, the types of queries, and the expected response time.
In addition, the definition of "large scale hardware" can vary a lot. We're doing projects on completely different types of hardware: i) a single "fat" server machine with many powerful CPU cores and a lot of memory; and ii) a huge number of "thin" servers with weak CPUs and very little memory (say, a million Raspberry Pis). It's quite a challenge to make the best use of both types of hardware, and we define very different workloads to evaluate those systems.
MonetDB likes single "fat" servers very much. That's the category of hardware it was originally designed for.
Also from https://stackoverflow.com/questions/7435235/is-it-worth-trying-monetdb : "MonetDB is developed/tuned under two main assumptions:
• Your workload is analytical, i.e., you have lots of (grouped) aggregations and the like.
• Even more important: your hot dataset (the data that you actually work with) fits into the main memory of your system.
MonetDB does not have it's own Buffer Manager but relies on the OS to handle disk I/O. Since the OS (especially windows but Linux too) is sometimes very dumb about disk swapping that may become a problem (especially for joins that run out of memory)."
Are there any official benchmarks for Windows vs Linux MonetDB performance (especially for queries that require disk swapping)?
I'm not aware of any such benchmarks (again, correct me if I'm wrong), but MonetDB on Linux easily runs twice as fast as on Windows. A main reason is the difference between mmap on Linux and on Windows. When main memory is not large enough, SSDs might help. Also, MonetDB has changed a lot (I mean, really a lot) since 2015.
Regards,
Jennie
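To make the mmap point concrete, here is a minimal Python sketch of what "relying on the OS to handle disk I/O" looks like in practice. It is not MonetDB code. It assumes a column is stored as a flat file of native 64-bit integers, and the names write_column and sum_column are hypothetical, chosen only for this illustration. The program never reads the file explicitly: it maps it, and the OS pages the data in (and evicts it again) as memory allows.

```python
import mmap
import struct


def write_column(path, values):
    """Store integers as a flat file of native 64-bit values (assumed layout)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}q", *values))


def sum_column(path):
    """Sum a column through a read-only memory map.

    There is no buffer manager here: the OS faults pages in on access and
    decides what stays in memory. If the file is larger than RAM, the same
    code still runs, but its speed then depends on OS paging behaviour.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # View the mapped bytes as 64-bit integers without copying them.
            with memoryview(mapped) as raw, raw.cast("q") as column:
                return sum(column)
```

A scan like sum_column behaves the same on Linux and Windows from the program's point of view; any speed difference comes from how each OS implements the mapping and paging underneath.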
Thank you, Anton