Re: [MonetDB-users] tuning advice for TPC-H
I thought I would try asking these questions again, since there was no response.

- How does MonetDB handle its virtual memory? Does it swap out entire BATs at a time? Or just smaller blocks? What type of caching algorithms does it use?
- If I configure a large swap, MonetDB is not using it. I guess it allocates based on actual physical memory. If I want to run a test that lets the OS handle virtual memory and has MonetDB try to use more of the system's virtual memory, how do I do that? I am interested in seeing how this affects performance when the database is too large to fit in memory.
- Are there any other tuning settings I should set when using a database that is much larger than memory?

Thanks in advance,
Rahul
----- Original Message -----
From: "Rahul Chopra"
To: "Communication channel for MonetDB users"
Subject: Re: [MonetDB-users] tuning advice for TPC-H
Date: Thu, 10 Jan 2008 03:14:54 +0800

I am still trying to get better performance when doing TPC-H tests for a database more than twice the size of physical memory.
I am not sure how MonetDB handles its virtual memory, but it sure seems to be going to disk an awful lot. Does it swap out entire BATs? Or just smaller blocks? What type of caching algorithms does it use?
Also, one of the things I wanted to try is to set up a really big swap partition on a separate device with write-back cache enabled and have MonetDB use it (basically, to see if Linux will manage virtual memory better than MonetDB does, and to take advantage of putting swap on a separate device for speed). When I try to do that, though, MonetDB seems to just stay within the confines of the physical memory available. Is there something I can configure to get it to use system virtual memory? Has anyone tried anything like that?
-Rahul
I am trying to run some TPC-H queries to get a sense of how large a database MonetDB can handle. I am currently trying a factor 20 database. I notice that on query 9, it is taking over two hours (still running). In "top" it claims that virtual memory for the process started at 27 GB or so, and is now down to 10 GB. What is noteworthy is that CPU usage is pretty small, at 2%. I previously observed 80-100% with a factor 1 database.
Does this mean you got the sf-20 loaded? What fixed your problem with loading?
Yes, I got it loaded. Since I had broken up the file into smaller chunks, I just restarted on the last chunk that it failed on before the crash. Eventually, it went through all of the tables.
The low CPU usage indicates that MonetDB is waiting for IO. Later this week I'll try to repeat your experiment on a similar system. Then I'll have more input on how to improve things.
After writing the email I noticed that system CPU wait time was at 25%, with MonetDB at 2% and idle at 73% or so. So, yes, it is waiting on IO. Perhaps it is just my hardware.
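For anyone wanting to check the same thing without "top", here is a minimal sketch (not part of the original exchange) that computes the iowait share from two samples of the aggregate "cpu" line in Linux's /proc/stat. The function name and the sample counter values are made up for illustration; the values were simply chosen to reproduce the 2% / 73% / 25% split mentioned above.

```python
def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent in iowait between two 'cpu' lines
    taken from /proc/stat (Linux)."""
    def fields(line: str) -> list:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        # Use only user, nice, system, idle, iowait, irq, softirq, steal.
        # The trailing guest fields are already counted in user/nice.
        return [int(x) for x in parts[1:9]]

    deltas = [a - b for b, a in zip(fields(before), fields(after))]
    total = sum(deltas)
    # Index 4 is iowait in the /proc/stat field order.
    return 100.0 * deltas[4] / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical samples, not measurements from the thread.
    t0 = "cpu 100 0 50 800 50 0 0 0 0 0"
    t1 = "cpu 102 0 50 873 75 0 0 0 0 0"
    print(iowait_percent(t0, t1))
```

In practice the two lines would be read from /proc/stat a few seconds apart while the query is running.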
I disabled the 2 GB swap partition and reran the query. After doing so, execution time improved somewhat. I am not sure whether this means one should always disable swap on a MonetDB system, since the system drive is slower than the data drives. I will try another experiment and let you know the results.
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hello,

the kernel guy is currently in the US. Here is a short reaction.

Martin

Rahul Chopra wrote:
I thought I would try and ask these questions again, since there was no response.
- How does MonetDB handle its virtual memory? Does it swap out entire BATs at a time? Or just smaller blocks? What type of caching algorithms does it use?
It relies completely on the OS facilities, using memory-map advice to indicate the anticipated access behavior. BATs are always the unit of memory mapping, but a background thread triggers flushing when we start to run low on real memory. There is no such notion as blocks in BATs. The caching effect depends on the virtual memory manager, whereby most temporary BATs won't even reach disk at all.
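As a rough illustration of what "memory-map advice" means at the OS level, here is a minimal sketch, not MonetDB's actual code: it maps a file read-only, hints to the kernel that access will be sequential, and then scans it. The function name is made up for this example, and the file merely stands in for a BAT's on-disk image. The madvise call is only issued where the platform exposes it (Python 3.8+ on Linux and similar systems).

```python
import mmap
import os


def sum_bytes_mapped(path: str) -> int:
    """Memory-map a file read-only, advise sequential access, and sum its bytes."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Tell the virtual memory manager what access pattern to expect.
            # The constant and the method are missing on some platforms.
            advice = getattr(mmap, "MADV_SEQUENTIAL", None)
            if advice is not None and hasattr(m, "madvise"):
                m.madvise(advice)
            total = 0
            chunk = 1 << 20  # scan 1 MiB at a time
            for offset in range(0, size, chunk):
                total += sum(m[offset:offset + chunk])
            return total
```

The paging itself is left entirely to the kernel here: the program never reads or evicts blocks explicitly, it only states its intent, which is the division of labour described above.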
- If I configure a large swap, MonetDB is not using it. I guess it allocates based on actual physical memory. If I want to run a test to let the OS handle virtual memory and have MonetDB try and use more of system virtual memory, how do I do that? I am interested in seeing how this affects performance when the database is too large to fit in memory.
For the latter issue, just look at the TPC-H figures on the website; they show the effect for in-memory, near-memory-limit, and out-of-memory performance. This gives a rough indication. Overall, performance degrades significantly if a hash lookup is used to access elements in a single BAT that consumes much more than your primary memory.
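The reason hash lookups hurt so much once a BAT outgrows memory can be sketched with a toy simulation. This is an illustration only: it assumes a plain LRU page cache (a real OS uses a different and more elaborate policy), and the function names and sizes are invented for the example. The data is set to twice the cache size, mirroring the "more than twice the size of physical memory" setup in this thread.

```python
import random
from collections import OrderedDict


def page_misses(accesses, cache_pages: int) -> int:
    """Count misses of an LRU page cache that holds `cache_pages` pages."""
    cache = OrderedDict()
    misses = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
        else:
            misses += 1
            cache[page] = None
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used
    return misses


def compare(total_pages=200, cache_pages=100, items_per_page=50, seed=0):
    """Return (sequential_misses, random_misses) for the same number of accesses."""
    n_items = total_pages * items_per_page
    # A scan touches items in order, so each page is faulted in once.
    scan = (i // items_per_page for i in range(n_items))
    # Hash probes land on effectively random items, hence random pages.
    rng = random.Random(seed)
    probes = (rng.randrange(n_items) // items_per_page for _ in range(n_items))
    return page_misses(scan, cache_pages), page_misses(probes, cache_pages)


if __name__ == "__main__":
    sequential, scattered = compare()
    print("sequential misses:", sequential)
    print("random-probe misses:", scattered)
```

With a sequential scan the miss count equals the number of pages, whereas random probes keep evicting pages that will be needed again, so the miss count grows with the number of lookups rather than with the data size.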
- Are there any other tuning settings I should set when using a database that is much larger than memory?
The only controllable setting is the minimum BAT size at which MonetDB uses memory-mapped BATs.
Thanks in advance,
Rahul
participants (2): Martin Kersten, Rahul Chopra