Sjoerd, Thanks for these details. Let me focus on these two concepts:
Allocated address space may or may not reside in physical memory. The kernel decides that.
Absolutely. Still, you can decide what's the max that malloc() can use:
gdk_mem_maxsize is the maximum amount of address space we want to allocate using malloc and friends.
Isn't malloc() using RSS + swap to back its allocations? Does that mean
that gdk_mem_maxsize should be a cap on what we want to be able to allocate
on RSS + swap?
In this case, actually, swap usage was 0.
So I still don't understand why mallocs for 16g happened,
when gdk_mem_maxsize was 14g.
On Tue, 9 Mar 2021 at 17:27, Sjoerd Mullender wrote:
We do not in any way control RSS (resident set size). That is fully under control of the kernel.
gdk_mem_maxsize is the maximum amount of address space we want to allocate using malloc and friends. gdk_vm_maxsize is the maximum amount of address space we want to allocate (malloc + mmap). Neither value has anything to do with how much actual, physical memory is being used. They are just measures of how much address space is used, allocated either through malloc or malloc+mmap. Allocated address space may or may not reside in physical memory. The kernel decides that.
Of course, if you're using the address space (however you got it), there must be physical memory to which the address space is mapped.
The difference between malloc and mmap is mostly where the physical, disk-based backing (if any) for the virtual memory is located, i.e. where the kernel can copy the memory to if it needs space. In the case of mmap (our use of it, anyway) it is files in the file system, and in the case of malloc it is swap (if you have it) or physical memory (if you don't).
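The distinction above between allocated address space and resident memory can be observed directly. What follows is only a minimal sketch, assuming a Linux system where /proc/self/status exposes VmSize and VmRSS. It uses Python's mmap module, not MonetDB's allocator, and the helper name vm_and_rss_kb is made up for illustration.

```python
import mmap

def vm_and_rss_kb():
    """Return (VmSize, VmRSS) in kB for this process (Linux /proc only)."""
    fields = {}
    with open("/proc/self/status") as status:
        for line in status:
            key, _, rest = line.partition(":")
            if key in ("VmSize", "VmRSS"):
                fields[key] = int(rest.split()[0])
    return fields["VmSize"], fields["VmRSS"]

SIZE = 256 * 1024 * 1024   # address space to reserve
TOUCH = 32 * 1024 * 1024   # portion that is actually written to

vm0, rss0 = vm_and_rss_kb()
region = mmap.mmap(-1, SIZE)   # anonymous mapping: address space only, so far
vm1, rss1 = vm_and_rss_kb()

# Writing one byte per page forces the kernel to back those pages.
for offset in range(0, TOUCH, mmap.PAGESIZE):
    region[offset:offset + 1] = b"\x01"
vm2, rss2 = vm_and_rss_kb()

print(f"after mmap : VmSize +{vm1 - vm0} kB, VmRSS +{rss1 - rss0} kB")
print(f"after touch: VmSize +{vm2 - vm1} kB, VmRSS +{rss2 - rss1} kB")
region.close()
```

On a typical Linux machine one would expect VmSize to grow by the full mapping as soon as mmap returns, while VmRSS grows mainly once the pages are written. The exact figures depend on the kernel and the Python build.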
On 09/03/2021 16.58, Roberto Cornacchia wrote:
Hi,
I would appreciate some help interpreting the following memory-related issues.
I've got a mserver5 instance running with cgroups v1 constraints
- memory.limit_in_bytes = 17179869184 (16g)
- memory.memsw.limit_in_bytes = 9223372036854771712
gdk_mem_maxsize is initialised as 0.815 * 17179869184 = 14001593384.
So I get:

sql>select * from env() where name in ('gdk_mem_maxsize', 'gdk_vm_maxsize');
+-----------------+---------------+
| name            | value         |
+=================+===============+
| gdk_vm_maxsize  | 4398046511104 |
| gdk_mem_maxsize | 14001593384   |
+-----------------+---------------+
That looks good.
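The figures quoted above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: a 4 KiB page size, and truncation of the 0.815 product to an integer (the post only gives the factor and the result). The very large memsw value is read here as the cgroups v1 "no limit" sentinel, i.e. the largest page-aligned signed 64-bit number.

```python
CGROUP_LIMIT = 17179869184          # memory.limit_in_bytes from the post
MEMSW_LIMIT = 9223372036854771712   # memory.memsw.limit_in_bytes from the post
PAGE_SIZE = 4096                    # assumption: 4 KiB pages
GIB = 2**30

# Largest page-aligned value that fits in a signed 64-bit integer,
# interpreted here as "no limit configured".
unlimited = (2**63 - 1) // PAGE_SIZE * PAGE_SIZE

def describe(limit):
    """Render a limit in GiB, or as 'unlimited' for the sentinel value."""
    return "unlimited" if limit >= unlimited else f"{limit / GIB:.2f} GiB"

# Factor quoted in the post; truncating to an integer is an assumption
# that reproduces the reported value.
gdk_mem_maxsize = int(0.815 * CGROUP_LIMIT)

print("memory limit    :", describe(CGROUP_LIMIT))
print("memory+swap     :", describe(MEMSW_LIMIT))
print("gdk_mem_maxsize :", gdk_mem_maxsize, f"({gdk_mem_maxsize / GIB:.2f} GiB)")
```

Note that the resulting 14001593384 bytes is about 14.0 GB in decimal units but 13.04 GiB in the binary units used for the 16g cgroup limit.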
To my surprise, this instance gets frequently OOM-killed for reaching 16g of RSS (no swap used):
memory: usage 16777216kB, limit 16777216kB, failcnt 244063804
memory+swap: usage 16777964kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup out of memory: Kill process 975803 (mserver5) score 1003 or sacrifice child
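To read the counters in the kernel report above, a small parser is enough. This is a sketch, not a general-purpose tool: the regular expression only targets the three counter lines as quoted, and the function name parse_report is made up for illustration. Treating the difference between the memory+swap and memory counters as swapped-out memory follows cgroups v1 accounting and should be taken as an assumption.

```python
import re

OOM_REPORT = """\
memory: usage 16777216kB, limit 16777216kB, failcnt 244063804
memory+swap: usage 16777964kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
"""

LINE = re.compile(
    r"^(?P<name>[\w+]+): usage (?P<usage>\d+)kB, "
    r"limit (?P<limit>\d+)kB, failcnt (?P<failcnt>\d+)$"
)

def parse_report(text):
    """Map each counter name to its usage, limit (both kB) and failcnt."""
    counters = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            counters[match["name"]] = {
                key: int(match[key]) for key in ("usage", "limit", "failcnt")
            }
    return counters

counters = parse_report(OOM_REPORT)
mem, memsw = counters["memory"], counters["memory+swap"]

print("memory at limit :", mem["usage"] == mem["limit"])
print("limit in bytes  :", mem["limit"] * 1024)
print("memsw - memory  :", memsw["usage"] - mem["usage"], "kB")
```

With the numbers from the report, the memory counter sits exactly at its limit (16777216 kB, i.e. the configured 17179869184 bytes), and the memory+swap counter exceeds it by only 748 kB.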
Now, there are two different aspects: giving a process a memory cap and making the process respect that cap without getting killed.
- if the process allocates more than defined with cgroups, then it gets killed. That is fine, it doesn't surprise me
- the question is: why did MonetDB surpass the 16g limit?
Even more surprising, given that it "prudently" initialises itself at 80% of the available memory.
Perhaps I was under the wrong assumption that MonetDB would never allocate more than gdk_mem_maxsize, but now I seem to realise that it simply uses this value to optimise its memory management (e.g. to decide how early to mmap).
So, am I correct that setting gdk_mem_maxsize (indirectly via cgroups or directly via the memmaxsize parameter) does not guarantee that RSS will stay under that value?
If that is true, I am back at square 1 in my quest for how to cap RSS usage (without getting the process killed).
Thanks for your help.
Roberto
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender