Latest release (11.33.11) still doesn't honor numactl
I have just upgraded MonetDB on an 88-core CentOS box (2 CPU modules). I was also one of the people who originally found the NUMA issue. The system doesn't behave as I would expect.

Here is the sequence of commands I execute to start up MonetDB (overkill, but it creates a clean database for the test):

1. numactl --cpunodebind=1 --membind=1 monetdbd create /srv/zfsdata/monetdbTesting/nTC
2. numactl --cpunodebind=1 --membind=1 monetdbd set listenaddr=0.0.0.0 /srv/zfsdata/monetdbTesting/nTC
3. numactl --cpunodebind=1 --membind=1 monetdbd set discovery=false /srv/zfsdata/monetdbTesting/nTC
4. numactl --cpunodebind=1 --membind=1 monetdbd set control=false /srv/zfsdata/monetdbTesting/nTC
5. numactl --cpunodebind=1 --membind=1 monetdbd start /srv/zfsdata/monetdbTesting/nTC
6. numactl --cpunodebind=1 --membind=1 monetdb -p 50000 create nTC
7. numactl --cpunodebind=1 --membind=1 monetdb -p 50000 release nTC

If I launch MonetDB via monetdbd under "numactl --cpunodebind=1 --membind=1", we are assigning only 44 cores for use with MonetDB (mserver5). I would expect that either command 1 or command 5 is the one that spawns mserver5. What I then see with numactl --show --pid=`pidof mserver5` is:

policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87
cpubind: 0 1
nodebind: 0 1
membind: 0 1

I would have expected that if we launch mserver5 via the monetdbd wrapper (as documented in the tutorials), the NUMA settings would be honored. Note also that I would expect 44 threads (plus a few extra), not 88 (plus a few extra); we are actually seeing 96 threads.
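A quick way to see what a process actually inherited is to read the affinity masks straight out of /proc, as the next message does. A minimal sketch (it demonstrates on the current shell's PID; for the case above, substitute `pid=$(pidof mserver5)` after starting monetdbd under numactl):

```shell
# Read the CPU and memory-node masks a process actually inherited.
# Demo uses the current shell's own PID; replace with the PID of the
# process you care about, e.g. pid=$(pidof mserver5).
pid=$$
grep -E 'Cpus_allowed_list|Mems_allowed_list' "/proc/$pid/status"
```

If monetdbd forks mserver5, the child should normally inherit both lists, so comparing this output for monetdbd and mserver5 shows at which step the binding is lost.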
Looking at the useful part of the /proc/`pidof mserver5`/status file gives the following:

Cpus_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00fffffc,00000fff,ffc00000
Cpus_allowed_list: 22-43,66-87
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list: 0-1

This implies it IS putting the program on 44 cores, yet it still creates 96 threads. It also shows that the memory constraint isn't honored: with --membind=1, Mems_allowed_list should be 1, not 0-1.

Any suggestion on how to launch MonetDB correctly so that it honors our NUMA requirements?

Thanks,
Dave
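If mserver5 sizes its thread pool from the machine's full core count rather than from the inherited affinity mask, the pool can be capped explicitly. The sketch below assumes the per-database `nthreads` property of the `monetdb` tool; that property name and this command sequence are assumptions to verify against your version's man pages, not something confirmed in this thread:

```shell
# Hedged sketch: cap mserver5's worker threads at the 44 cores of one
# NUMA node. Assumes the `nthreads` database property exists in this
# MonetDB version; check `man monetdb` before relying on it.
if command -v monetdb >/dev/null 2>&1; then
    monetdb -p 50000 stop nTC
    monetdb -p 50000 set nthreads=44 nTC
    numactl --cpunodebind=1 --membind=1 monetdb -p 50000 start nTC
else
    echo "monetdb tools not installed; commands shown for illustration only"
fi
```

This only addresses the thread-count symptom; it does not by itself fix the memory-binding question above.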
Hai Dave,

Unfortunately, the very short answer is that we haven't had an opportunity to work on making MonetDB respect hardware limitations set by external tools such as numactl. We did some deeper internal brainstorming triggered by Roberto's e-mail "Guidelines for MonetDB in production environments" earlier this year (https://www.monetdb.org/pipermail/users-list/2019-June/010437.html). We see several possibilities, but none of them is simple, and all of them require significant resource allocation and planning, i.e. not something we will include in a bugfix release.

Regards,
Jennie
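Until MonetDB itself cooperates with such tools, one purely external workaround (not suggested in this thread; a sketch under stated assumptions) is a cgroup-v1 cpuset, which the kernel enforces on an already-running mserver5 and every thread it spawns. The paths assume cgroup v1 and root; the CPU list 22-43,66-87 is node 1 on the box described above and must be adjusted to your topology:

```shell
# Hedged workaround sketch: confine a running mserver5 (all of its
# threads) to one NUMA node via a cgroup-v1 cpuset. Needs root on a
# cgroup-v1 system; the CPU list is node 1 on the 88-core box above.
cg=/sys/fs/cgroup/cpuset/monetdb
if [ -d /sys/fs/cgroup/cpuset ] && [ -w /sys/fs/cgroup/cpuset ]; then
    mkdir -p "$cg"
    echo 22-43,66-87 > "$cg/cpuset.cpus"     # CPUs of NUMA node 1
    echo 1           > "$cg/cpuset.mems"     # memory from node 1 only
    for pid in $(pidof mserver5); do
        echo "$pid" > "$cg/cgroup.procs"     # moves the whole thread group
    done
else
    echo "cpuset cgroup v1 not available/writable; shown for illustration"
fi
```

Unlike numactl, which only sets the policy at exec time, a cpuset also constrains threads created later, which is the failure mode reported here.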
On 3 Sep 2019, at 23:06, Gotwisner, Dave wrote:
participants (2)
- Gotwisner, Dave
- Ying Zhang