[MonetDB-users] Memory Allocation errors
Hi all,

I've been testing MonetDB and I've been getting strange memory allocation errors while running large 'group by' queries. From a dataset of around 60GB, I'm running 'group by' queries against it to test for speed. Queries of the sort:

select keyword, count(*) as c from raw_data group by keyword order by c desc limit 100;

What's happening is that it will trundle along OK for a while, generally consuming all the available memory (98%) on a 3GB box. But after a while I get these messages on the server:

# Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
# MonetDB/SQL module v2.28.4 loaded
#GDKmmap(10412883968) fails, try to free up space [memory in use=72431192,virtual memory in use=71134150656]
#GDKmmap(10412883968) result [mem=71387384,vm=71134150656]
#GDKvmalloc(10412883984) fails, try to free up space [memory in use=71386904,virtual memory in use=71134150656]
#GDKvmalloc(10412883984) result [mem=71130480,vm=71134150656]
!ERROR: HEAPalloc: Insufficient space for HEAP of 10412883968 bytes.
Also, while checking top it reports a much larger RES size than physical memory, e.g. 250% for %MEM.

I haven't changed any out-of-the-box settings. Is there a minimum memory size that MonetDB requires to play nice? Is there anywhere I can go to limit the amount of memory MonetDB requires?

The version I'm using is:

mserver5 --version
MonetDB server v5.10.4 (64-bit), based on kernel v1.28.4 (64-bit oids)
Copyright (c) 1993-July 2008 CWI
Copyright (c) August 2008-2009 MonetDB B.V., all rights reserved
Visit http://monetdb.cwi.nl/ for further information
Configured for prefix: /usr
Libraries:
  libpcre: 6.6 06-Feb-2006 (compiled with 6.6)
  openssl: OpenSSL 0.9.8e-rhel5 01 Jul 2008 (compiled with OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008)
  libxml2: 2.6.26 (compiled with 2.6.26)
Compiled by: mockbuild@surya.karan.org
Compilation: gcc -O2 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -std=c99 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize
Linking: ld -IPA -m elf_x86_64

Cheers
Hi Rob, On Thu, May 28, 2009 at 01:25:58PM +0100, Rob Wickert wrote:
Hi all,
I've been testing MonetDB and I've been getting strange memory allocation errors while running large 'group by' queries. From a dataset of around 60GB, I'm running 'group by' queries against it to test for speed. Queries of the sort:
select keyword, count(*) as c from raw_data group by keyword order by c desc limit 100;
How many tuples are in your 60GB dataset, i.e., your table "raw_data"? Of which datatype is attribute "keyword"? How many distinct values are there / do you expect in column "keyword"?
What's happening is that it will trundle along OK for a while, generally consuming all the available memory (98%) on a 3GB box. But after a while I get these messages on the server:
# Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
# MonetDB/SQL module v2.28.4 loaded
#GDKmmap(10412883968) fails, try to free up space [memory in use=72431192,virtual memory in use=71134150656]
#GDKmmap(10412883968) result [mem=71387384,vm=71134150656]
#GDKvmalloc(10412883984) fails, try to free up space [memory in use=71386904,virtual memory in use=71134150656]
#GDKvmalloc(10412883984) result [mem=71130480,vm=71134150656]
!ERROR: HEAPalloc: Insufficient space for HEAP of 10412883968 bytes.
Since address space restrictions should not apply on your 64-bit system, could it be that the disk (partition) where your dbfarm resides is/was (almost) full, i.e., had less than 10 GB free at that time?
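One quick way to check this is to look at the free space on the filesystem that holds the dbfarm. A minimal sketch (the default dbfarm path below is an assumption; substitute the location from your own monetdbd/mserver5 configuration):

```shell
# Show free space on the filesystem holding the dbfarm.
# /var/MonetDB5/dbfarm is a guessed default location; adjust as needed.
DBFARM="${DBFARM:-/var/MonetDB5/dbfarm}"
df -h "$DBFARM" 2>/dev/null || df -h /
```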
Also, while checking top it reports a much larger RES size than physical memory, e.g. 250% for %MEM.
I haven't changed any out of the box settings. Is there a minimum memory size that MonetDB requires to play nice?
This depends on the amount and kind of data as well as the type of queries. While we manage to run 100GB TPC-H on an 8 GB machine, generally "the more the better" does apply...
Is there anywhere I can go to limit the amount of memory MonetDB requires?
Other than manually partitioning your data horizontally and rephrasing your queries accordingly, no. MonetDB needs to have all data that it requires for processing a query mapped into the process's address space. It will resort to virtual memory and memory-mapped files once it runs out of physical memory.
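As a sketch of what that manual horizontal partitioning could look like for the query in this thread (the partition table names raw_data_p1/raw_data_p2 are hypothetical; this is standard SQL, not a MonetDB-specific feature): aggregate each partition separately, so only one piece needs to be resident at a time, then merge the partial counts:

```sql
-- Assume raw_data has been split into hypothetical pieces
-- raw_data_p1, raw_data_p2, ... (e.g. by load batch).
select keyword, sum(c) as c
from (
    select keyword, count(*) as c from raw_data_p1 group by keyword
    union all
    select keyword, count(*) as c from raw_data_p2 group by keyword
) as parts
group by keyword
order by c desc
limit 100;
```

Summing the per-partition counts is valid because count(*) is decomposable across disjoint partitions; the same rewrite works for sum, min, and max (avg needs sum and count carried separately).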
The version I'm using is:
mserver5 --version
MonetDB server v5.10.4 (64-bit), based on kernel v1.28.4 (64-bit oids)
Copyright (c) 1993-July 2008 CWI
Copyright (c) August 2008-2009 MonetDB B.V., all rights reserved
Visit http://monetdb.cwi.nl/ for further information
Configured for prefix: /usr
Libraries:
  libpcre: 6.6 06-Feb-2006 (compiled with 6.6)
  openssl: OpenSSL 0.9.8e-rhel5 01 Jul 2008 (compiled with OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008)
  libxml2: 2.6.26 (compiled with 2.6.26)
Compiled by: mockbuild@surya.karan.org
Compilation: gcc -O2 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -std=c99 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize
Linking: ld -IPA -m elf_x86_64
Cheers
Stefan

--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |
How many tuples are in your 60GB dataset, i.e., your table "raw_data"? Of which datatype is attribute "keyword"? How many distinct values are there / do you expect in column "keyword"?
This depends on the amount and kind of data as well as the type of queries. While we manage to run 100GB TPC-H on an 8 GB machine, generally "the more the better" does apply...
There are approx 28 million distinct keywords (varchar(100)) out of a 680-million-row table. Yes, I know 3GB is not ideal and I wasn't expecting blindingly fast queries until we bump up the memory, but I wasn't expecting the memory mapping issue. There's plenty of space on the disk: 151GB.
On Fri, May 29, 2009 at 11:35:12AM +0100, Rob Wickert wrote:
How many tuples are in your 60GB dataset, i.e., your table "raw_data"? Of which datatype is attribute "keyword"? How many distinct values are there / do you expect in column "keyword"?
This depends on the amount and kind of data as well as the type of queries. While we manage to run 100GB TPC-H on an 8 GB machine, generally "the more the better" does apply...
There are approx 28 million distinct keywords (varchar(100)) out of a 680-million-row table. Yes, I know 3GB is not ideal and I wasn't expecting blindingly fast queries until we bump up the memory, but I wasn't expecting the memory mapping issue. There's plenty of space on the disk: 151GB.
Rob, also for this one, could you please upgrade to the latest May2009 release (http://monetdb.cwi.nl/Development/Releases/May2009/), try again (preferably prefixing your query in mclient with "TRACE " to produce a performance trace that will give us more hints about where the problem occurs), and report back? Thanks, Stefan
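For reference, prefixing the query from the original post with the TRACE keyword mentioned above would look like this in mclient (same query as before, only the prefix is added):

```sql
TRACE select keyword, count(*) as c
from raw_data
group by keyword
order by c desc
limit 100;
```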
participants (2)
- Rob Wickert
- Stefan Manegold