Hi,
I would appreciate some help interpreting the following memory-related issues.
I've got an mserver5 instance running with the following cgroups v1 constraints:
- memory.limit_in_bytes = 17179869184 (16g)
- memory.memsw.limit_in_bytes = 9223372036854771712
gdk_mem_maxsize is initialised as 0.815 * 17179869184 = 14001593384.
So I get:
sql>select * from env() where name in ('gdk_mem_maxsize', 'gdk_vm_maxsize');
+-----------------+---------------+
| name            | value         |
+=================+===============+
| gdk_vm_maxsize  | 4398046511104 |
| gdk_mem_maxsize | 14001593384   |
+-----------------+---------------+
That looks good.
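For anyone who wants to double-check the arithmetic, here is a minimal sketch in Python. The function names are only illustrative (they are not part of MonetDB), and the 0.815 factor is simply the ratio I observed above, not something I have confirmed in the source code.

```python
# Sanity check of the numbers quoted in this post.
# Assumption: gdk_mem_maxsize = 0.815 * cgroup limit, truncated to an integer.

CGROUP_LIMIT_BYTES = 17179869184  # memory.limit_in_bytes (16 GiB)


def expected_mem_maxsize(limit_bytes: int) -> int:
    """Return 0.815 * limit, using integer arithmetic to avoid float rounding."""
    return limit_bytes * 815 // 1000


def kb_to_bytes(kb: int) -> int:
    """Convert a kB figure as printed in the kernel OOM report to bytes."""
    return kb * 1024


mem_maxsize = expected_mem_maxsize(CGROUP_LIMIT_BYTES)
headroom = CGROUP_LIMIT_BYTES - mem_maxsize

print("expected gdk_mem_maxsize:", mem_maxsize)
print("headroom below the cgroup limit (bytes):", headroom)
```

The headroom value is the gap between gdk_mem_maxsize and the cgroup limit, i.e. how far RSS would have to overshoot gdk_mem_maxsize before the kernel steps in.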
To my surprise, this instance is frequently OOM-killed for reaching 16g of RSS (no swap used):
memory: usage 16777216kB, limit 16777216kB, failcnt 244063804
memory+swap: usage 16777964kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup out of memory: Kill process 975803 (mserver5) score 1003 or sacrifice child
Now, there are two different aspects: giving a process a memory cap and making the process respect that cap without getting killed.
- if the process allocates more than the cgroup limit allows, then it gets killed. That is fine, it doesn't surprise me
- the question is: why did MonetDB surpass the 16g limit?
This is even more surprising given that it "prudently" initialises itself at roughly 80% of the available memory.
Perhaps I was under the wrong assumption that MonetDB would never allocate more than gdk_mem_maxsize, but now I seem to realise that it simply uses this value to optimise its memory management (e.g. to decide how early to mmap).
So, am I correct that setting gdk_mem_maxsize (indirectly via cgroups or directly via the memmaxsize parameter) does not guarantee that RSS will stay under that value?
If that is true, I am back at square one in my quest for how to cap RSS usage (without getting the process killed).
Thanks for your help.
Roberto