Hi Rob,

On Thu, May 28, 2009 at 01:25:58PM +0100, Rob Wickert wrote:
Hi all,
I've been testing MonetDB and I've been getting strange memory allocation errors while running large 'group by' queries. I'm running these queries against a dataset of around 60GB to test for speed. Queries of the sort:
select keyword, count(*) as c from raw_data group by keyword order by c desc limit 100;
How many tuples are in your 60GB dataset, i.e., your table "raw_data"? Of which datatype is attribute "keyword"? How many distinct values are there / do you expect in column "keyword"?
What's happening is that it will trundle along ok for a while, generally consuming all the available memory (98%) on a 3GB box. But after a while I get these messages on the server:
# Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
# MonetDB/SQL module v2.28.4 loaded
#GDKmmap(10412883968) fails, try to free up space [memory in use=72431192,virtual memory in use=71134150656]
#GDKmmap(10412883968) result [mem=71387384,vm=71134150656]
#GDKvmalloc(10412883984) fails, try to free up space [memory in use=71386904,virtual memory in use=71134150656]
#GDKvmalloc(10412883984) result [mem=71130480,vm=71134150656]
!ERROR: HEAPalloc: Insufficient space for HEAP of 10412883968 bytes.
Since address space restrictions should not apply on your 64-bit system, could it be that the disk (partition) on which your dbfarm resides is/was (almost) full, i.e., had less than 10 GB free at that time?
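A quick way to check this is to compare the size of the failed HEAP allocation with the free space on the dbfarm partition. The following is only a minimal sketch: the helper `has_room` and the `dbfarm` path are placeholders of mine, not part of MonetDB, and the path must be replaced with your actual dbfarm directory.

```python
import shutil

# Size of the failed HEAP allocation, copied from the error message above.
HEAP_BYTES = 10412883968


def has_room(path, needed_bytes):
    """Return True if the filesystem holding `path` has at least `needed_bytes` free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    dbfarm = "."  # placeholder (assumption): replace with the actual dbfarm directory
    free = shutil.disk_usage(dbfarm).free
    print(f"requested: {HEAP_BYTES / 2**30:.2f} GiB, free: {free / 2**30:.2f} GiB")
    print("enough room" if has_room(dbfarm, HEAP_BYTES) else "partition too full")
```

The requested heap is about 9.7 GiB, so a partition with less than that free at query time could not hold the memory-mapped file.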
Also, while checking top, it reports a much larger RES size than physical memory, e.g. 250% for %MEM.
I haven't changed any out of the box settings. Is there a minimum memory size that MonetDB requires to play nice?
This depends on the amount and kind of data as well as the type of queries. While we manage to run 100GB TPC-H on an 8 GB machine, generally "the more the better" does apply...
Is there anywhere I can go to limit the amount of memory MonetDB requires?
Other than manually partitioning your data horizontally and rephrasing your queries accordingly, no. MonetDB needs to have all data that it requires for processing a query mapped into the process's address space. It will resort to virtual memory and memory-mapped files once it runs out of physical memory.
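To illustrate what manual horizontal partitioning could look like for the query above, here is a rough sketch. It makes several assumptions that are not from your setup: an integer column `id` in `raw_data` to slice on (hypothetical), Python's built-in sqlite3 as a stand-in for the database connection so the example runs anywhere, and a helper name `top_keywords` of my own choosing. Each slice is grouped separately and the partial counts are merged on the client.

```python
import sqlite3
from collections import Counter


def top_keywords(conn, n_parts, limit=100):
    """Run the GROUP BY once per horizontal slice and merge the partial counts."""
    # Assumption: raw_data has an integer column `id` that can be used for slicing.
    lo, hi = conn.execute("SELECT min(id), max(id) FROM raw_data").fetchone()
    if lo is None:  # empty table
        return []
    step = (hi - lo) // n_parts + 1
    totals = Counter()
    for start in range(lo, hi + 1, step):
        rows = conn.execute(
            "SELECT keyword, count(*) FROM raw_data "
            "WHERE id >= ? AND id < ? GROUP BY keyword",
            (start, start + step),
        )
        for keyword, c in rows:
            totals[keyword] += c  # add this slice's count to the running total
    # Equivalent of ORDER BY c DESC LIMIT <limit> on the merged counts.
    return totals.most_common(limit)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the real database
    conn.execute("CREATE TABLE raw_data (id INTEGER PRIMARY KEY, keyword TEXT)")
    conn.executemany(
        "INSERT INTO raw_data (keyword) VALUES (?)",
        [(k,) for k in ["a", "b", "a", "c", "b", "a"]],
    )
    print(top_keywords(conn, n_parts=3))
```

Each per-slice query touches only a fraction of the rows, so its intermediate results are smaller. Note that the merged counter still holds one entry per distinct keyword on the client side, so this only helps if that set fits comfortably in memory.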
The version I'm using is:
mserver5 --version
MonetDB server v5.10.4 (64-bit), based on kernel v1.28.4 (64-bit oids)
Copyright (c) 1993-July 2008 CWI
Copyright (c) August 2008-2009 MonetDB B.V., all rights reserved
Visit http://monetdb.cwi.nl/ for further information
Configured for prefix: /usr
Libraries:
  libpcre: 6.6 06-Feb-2006 (compiled with 6.6)
  openssl: OpenSSL 0.9.8e-rhel5 01 Jul 2008 (compiled with OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008)
  libxml2: 2.6.26 (compiled with 2.6.26)
Compiled by: mockbuild@surya.karan.org
Compilation: gcc -O2 -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -std=c99 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize
Linking : ld -IPA -m elf_x86_64
Cheers
Stefan

--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ |
| 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 |
| The Netherlands | Fax : +31 (20) 592-4312 |