Hello,
We had a DB crash with the messages below. Do we need to change any system-level parameters or settings to avoid this issue?
The server has 128 CPUs and 1 TB of memory.
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
/etc/security/limits.conf
monetadmin soft nproc 102400
monetadmin hard nproc 102400
monetadmin soft nofile 500000
monetadmin hard nofile 500000
monetadmin soft stack 10240
monetadmin soft memlock unlimited
monetadmin hard memlock unlimited
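To confirm that these limits actually apply to the running mserver5 process, something like the following could be run. This is a minimal sketch, assuming Linux with /proc mounted; the helper name `proc_limits` is hypothetical and not part of MonetDB. Values in limits.conf are applied by pam_limits at session start, so a daemon started another way may run with different ones.

```shell
#!/bin/sh
# Hypothetical helper (not a MonetDB tool): show the limits the kernel
# enforces for a running process and how many threads it currently has.
# Assumes Linux with /proc mounted.
proc_limits() {
    pid="$1"
    # Limits as applied to this process, which may differ from limits.conf.
    grep -E 'Max (processes|open files|stack size|locked memory)' "/proc/$pid/limits"
    # Each entry under task/ corresponds to one thread of the process.
    echo "threads: $(ls "/proc/$pid/task" | wc -l | tr -d ' ')"
}

# Example: proc_limits 57408    (57408 is the mserver5 PID shown in the log)
```

Running it against the mserver5 PID while the load builds up would show whether the thread count approaches any of the configured limits.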
[monetadmin@lnx1535 DBFARM_TSV_P_B]$ /monet_binaries/MonetDB-11.27.13_PY/bin/monetdb -p 50010 get all TSV_PROD_DB_B
name           prop      source   value
TSV_PROD_DB_B  name      -        TSV_PROD_DB_B
TSV_PROD_DB_B  type      default  database
TSV_PROD_DB_B  shared    default  yes
TSV_PROD_DB_B  nthreads  default  128
TSV_PROD_DB_B  optpipe   local    sequential_pipe
TSV_PROD_DB_B  readonly  local    yes
TSV_PROD_DB_B  embedr    default  no
TSV_PROD_DB_B  embedpy   local    yes
TSV_PROD_DB_B  embedpy3  default  no
TSV_PROD_DB_B  nclients  local    2048
TSV_PROD_DB_B  dbextra   default  <unknown>
[monetadmin@lnx1535 DBFARM_TSV_P_B]$
2018-08-13 08:46:30 MSG merovingian[36790]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2018-08-13 08:46:30 MSG merovingian[36790]: proxying client 10.106.5.250:29929 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B
2018-08-13 08:46:30 MSG merovingian[36790]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:33 MSG merovingian[36790]: proxying client 10.106.5.250:29957 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B
2018-08-13 08:46:33 MSG merovingian[36790]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2018-08-13 08:46:36 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:46:40 MSG merovingian[36790]: proxying client 10.106.5.250:64830 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B
2018-08-13 08:46:40 MSG merovingian[36790]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2018-08-13 08:46:42 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads
2018-08-13 08:47:47 MSG discovery[36790]: new neighbour lnx1536.ch3.prod.i.com (lnx1536.ch3.prod.i.com)
2018-08-13 08:47:48 MSG discovery[36790]: new database mapi:monetdb://lnx1536.ch3.prod.i.com:50010/TSV_PROD_DB_B (ttl=660s)
2018-08-13 08:48:55 MSG merovingian[36790]: database 'TSV_PROD_DB_B' (57408) has crashed (dumped core)
2018-08-13 08:49:12 MSG merovingian[36790]: database 'TSV_PROD_DB_B' has crashed after start on 2018-08-13 05:35:17, attempting restart, up min/avg/max: 2h/5d/3w, crash average: 1.00 0.70 0.43 (23-10=13)
2018-08-13 08:49:12 MSG TSV_PROD_DB_B[72455]: arguments: /monet_binaries/MonetDB-11.27.13_PY/bin/mserver5 --dbpath=/monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B --set merovingian_uri=mapi:monetdb://lnx1535.ch3.prod.i.com:50010/TSV_PROD_DB_B --set mapi_open=false --set mapi_port=0 --set
2018-08-13 08:49:12 MSG TSV_PROD_DB_B[72455]: mapi_usock=/monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock --set monet_vault_key=/monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.vaultkey --set gdk_nr_threads=128 --set max_clients=2048 --set sql_optimizer=sequential_pipe --set embedded_py=true --readonly --set monet_daemon=yes
2018-08-13 08:49:12 MSG TSV_PROD_DB_B[72455]:
2018-08-13 08:49:12 MSG merovingian[36790]: proxying client 10.106.5.250:16467 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B
2018-08-13 08:49:12 MSG merovingian[36790]: proxying client 10.106.5.250:45439 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B
2018-08-13 08:49:12 MSG merovingian[36790]: starting a proxy failed: cannot connect: Connection refused
2018-08-13 08:49:12 ERR control[36790]: !monetdbd: an internal error has occurred 'cannot connect: Connection refused'
2018-08-13 08:49:12 ERR merovingian[36790]: client error: cannot connect: Connection refused
2018-08-13 08:49:12 MSG merovingian[36790]: proxying client 10.106.5.250:45447 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B
2018-08-13 08:49:12 ERR control[36790]: !monetdbd: an internal error has occurred 'cannot connect: Connection refused'
2018-08-13 08:49:12 MSG merovingian[36790]: starting a proxy failed: cannot connect: Connection refused
2018-08-13 08:49:13 ERR merovingian[36790]: client error: cannot connect: Connection refused
2018-08-13 08:49:13 MSG merovingian[36790]: proxying client 10.106.5.250:36605 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B
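To see how the errors cluster over time, the log can be summarised per second. This is a rough sketch; the function name is hypothetical and the log file path is an assumption (it is whatever monetdbd is configured to write to).

```shell
#!/bin/sh
# Hypothetical helper: count "THRnew: too many threads" errors per second.
# $1 is the path to the monetdbd log file (an assumption; use your own path).
# The first 19 characters of each line are the "YYYY-MM-DD HH:MM:SS" stamp,
# so cutting them out and counting duplicates gives errors per second.
thread_errors_per_second() {
    grep -F 'THRnew: too many threads' "$1" |
        cut -c1-19 |
        sort |
        uniq -c
}

# Example: thread_errors_per_second /path/to/merovingian.log
```

Each output line is a count followed by the timestamp it applies to, which makes it easier to line the bursts up with the client connections logged at the same time.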