problem stopping/starting monetdbd
Hi all -
Yesterday our monetdb instance seemed to be running slowly, perhaps due to user queries that had been killed but maybe not cleaned up. We decided to stop the dbs, then start/stop monetdbd. I executed the following commands, but was unable to restart the daemon:
sudo monetdbd stop /local/monetdb/dbfarmBL
sudo monetdbd start /local/monetdb/dbfarmBL
We had our admin reboot the machine, thinking this would clear up everything. This morning I tried again to start monetdbd. The command comes back with no messages, but I get a connection refused message when trying to access a database:
could not connect to localhost:50000: Connection refused
Get all shows this:
[lcj34@cbsudc01 anp68_ije_Locus9Combined_Result]$ monetdbd get all /local/monetdb/dbfarmBL
property         value
hostname         cbsudc01
dbfarm           /local/monetdb/dbfarmBL
status           no monetdbd is serving this dbfarm
mserver          unknown (monetdbd not running)
logfile          /local/monetdb/dbfarmBL/merovingian.log
pidfile          /local/monetdb/dbfarmBL/merovingian.pid
sockdir          /tmp
listenaddr       128.84.3.206
port             50000
exittimeout      60
forward          proxy
discovery        yes
discoveryttl     600
control          no
passphrase       <unknown>
mapisock         /tmp/.s.monetdb.50000
controlsock      /tmp/.s.merovingian.50000
[lcj34@cbsudc01 anp68_ije_Locus9Combined_Result]$
The log shows the following messages:
2017-01-27 07:53:26 MSG maizeFullGenomeDB[14286]: # Listening for UNIX domain connection requests on mapi:monetdb:///local/monetdb/dbfarmBL/maizeFullGenomeDB/.mapi.sock
2017-01-27 07:53:26 MSG maizeFullGenomeDB[14286]: # MonetDB/SQL module loaded
2017-01-27 07:53:29 ERR merovingian[13705]: client error: cannot connect: Connection refused
2017-01-27 07:55:41 MSG merovingian[13705]: caught SIGTERM, starting shutdown sequence
2017-01-27 07:55:50 MSG merovingian[15395]: Merovingian 1.7 (Jun2016-SP2) starting
2017-01-27 07:55:50 MSG merovingian[15395]: monitoring dbfarm /local/monetdb/dbfarmBL
2017-01-27 07:55:50 MSG merovingian[15395]: accepting connections on TCP socket cbsudc01:50000
2017-01-27 07:55:50 MSG merovingian[15395]: accepting connections on UNIX domain socket /tmp/.s.monetdb.50000
2017-01-27 07:55:50 ERR merovingian[15395]: binding to datagram socket port 50000 failed: no available address
2017-01-27 07:55:50 MSG merovingian[15395]: Merovingian 1.7 stopped
2017-01-27 07:55:50 ERR merovingian[15395]: fatal startup condition encountered, aborting startup
Any ideas what to try?
Thanks - Lynn
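For scripting a check like the one above, the status line of the `monetdbd get all` output can be pulled out with a small shell helper. This is only a sketch: the function name `dbfarm_status` is made up for illustration, and it assumes the real output prints one property per line with the property name in the first column.

```shell
#!/bin/sh
# Hypothetical helper (not part of MonetDB): read `monetdbd get all <dbfarm>`
# output on stdin and print only the value of the "status" property.
# Assumption: one "property value" pair per line, separated by whitespace.
dbfarm_status() {
    # Blank out the property name, trim the leading separator, print the rest.
    awk '$1 == "status" { $1 = ""; sub(/^ +/, ""); print }'
}
```

A possible use would be `monetdbd get all /local/monetdb/dbfarmBL | dbfarm_status`, which in the situation described here should print the "no monetdbd is serving this dbfarm" text.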
Lynn,
Since you are getting a SIGTERM with limited error messages, you may need to start the crashing process in gdb to get a stack trace. That should help isolate the problem.
https://www.monetdb.org/Documentation/UserGuide/Debugging
Dave
-----Original Message-----
From: users-list [mailto:users-list-bounces+david.b.anderson=citi.com@monetdb.org] On Behalf Of Lynn Carol Johnson
Sent: Friday, January 27, 2017 8:07 AM
To: Communication channel for MonetDB users
Cc: Karl Anton Kremling
Subject: problem stopping/starting monetdbd
Thanks Dave.
I read through the link below. The debugger is attached after the daemon
is running. But I can't get to this stage - I am unable to get the daemon
running. The problem seems to be with attaching to port 50000. From the
log below:
binding to datagram socket port 50000 failed: no available address
Could there be a UNIX socket issue I should investigate?
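When merovingian.log is long, it can help to look only at the errors written since the daemon's most recent start attempt. The following is a minimal sketch, assuming the log format shown in this thread (date, time, level, then source); the helper name `errors_since_last_start` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a merovingian.log on stdin and print only the
# ERR lines logged after the last "Merovingian ... starting" message.
# Assumption: the level (MSG/ERR) is the third whitespace-separated field.
errors_since_last_start() {
    awk '
        /Merovingian .* starting/ { n = 0 }          # new start attempt: forget older errors
        $3 == "ERR"               { buf[++n] = $0 }  # remember errors of the current attempt
        END { for (i = 1; i <= n; i++) print buf[i] }
    '
}
```

It could be run as `errors_since_last_start < /local/monetdb/dbfarmBL/merovingian.log`. On the log excerpt in this thread that would leave the "binding to datagram socket" line and the "fatal startup condition" line, which points at the datagram (most likely UDP) bind on port 50000 rather than at the UNIX domain socket.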
On 1/27/17, 10:16 AM, "users-list on behalf of Anderson, David B" wrote:
Lynn,
Since you are getting a SIGTERM with limited error messages, you may need to start the crashing process in gdb to get a stack trace. That should help isolate the problem.
https://www.monetdb.org/Documentation/UserGuide/Debugging
Dave
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Netstat should let you know if the port is in use. I don't remember the exact syntax. . .
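One way the port check could look on a Linux host is sketched below. It relies on `ss` or `netstat` with the options `-t -u -l -p -n` (TCP, UDP, listening sockets, owning process, numeric addresses); the function names are made up for this example, and process names are usually only shown when the command runs as root.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines on stdin that mention the given
# port number right after a colon (so 50000 does not match 500001).
listeners_on_port() {
    grep -E ":$1([^0-9]|\$)"
}

# Hypothetical wrapper: list TCP and UDP listeners on a port, using ss when
# it is installed and falling back to netstat otherwise.
show_port_users() {
    if command -v ss >/dev/null 2>&1; then
        ss -tulpn 2>/dev/null | listeners_on_port "$1"
    elif command -v netstat >/dev/null 2>&1; then
        netstat -tulpn 2>/dev/null | listeners_on_port "$1"
    else
        echo "neither ss nor netstat found" >&2
        return 2
    fi
}
```

For the case in this thread the call would be `show_port_users 50000`; an empty result means neither tool reports a listener on that port. Since the failing bind in the log is for a datagram socket, the UDP lines are the ones to look at first.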
-----Original Message-----
From: users-list [mailto:users-list-bounces+david.b.anderson=citi.com@monetdb.org] On Behalf Of Lynn Carol Johnson
Sent: Friday, January 27, 2017 10:25 AM
To: Communication channel for MonetDB users
Subject: Re: problem stopping/starting monetdbd
Thanks Dave.
I read through the link below. The debugger is attached after the daemon is running. But I can't get to this stage - I am unable to get the daemon
running. The problem seems to be with attaching to port 50000. From the
log below:
binding to datagram socket port 50000 failed: no available address
Could there be a UNIX socket issue I should investigate?
Thanks for your responses. We found the problem. There was a bad process
running we needed to kill, then all got cleaned up and we’re good.
I appreciate your help - Lynn
On 1/27/17, 10:30 AM, "users-list on behalf of Anderson, David B" wrote:
Netstat should let you know if the port is in use. I don't remember the exact syntax. . .
Hi Lynn,
Out of interest, what kind of bad process was it? The daemon binds the TCP and
UNIX sockets, then dies with no real explanation.
Were some files locked by another process?
Regards,
Brian Hood
On Fri, Jan 27, 2017 at 5:51 PM, Lynn Carol Johnson wrote:
Thanks for your responses. We found the problem. There was a bad process running we needed to kill, then all got cleaned up and we’re good.
I appreciate your help - Lynn
Hi Brian -
The machine was rebooted in the evening, which should have killed all processes. The next morning I tried starting monetdbd but got error messages.
When doing a grep for monet we found this (I have the command and result saved in an email):
[root@cbsudc01 bin]# ps -ef | grep monet
root 14286 1 0 07:53 ? 00:00:26 /usr/bin/mserver5 --dbpath=/local/monetdb/dbfarmBL/maizeFullGenomeDB --set merovingian_uri mapi:monetdb://cbsudc01:50000/maizeFullGenomeDB --set mapi_open false --set mapi_port 0 --set mapi_usock /local/monetdb/dbfarmBL/maizeFullGenomeDB/.mapi.sock --set monet_vault_key /local/monetdb/dbfarmBL/maizeFullGenomeDB/.vaultkey --set gdk_nr_threads 24 --set max_clients 64 --set sql_optimizer default_pipe --set monet_daemon yes
Once we killed pid 14286 I was able to start the daemon. It looks like this process was started that morning. Could the mserver5 process above have been started by a user request, but hung before monetdbd had started successfully? Or is it related to the monetdbd start request I issued?
Lynn
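The leftover process above shows a parent PID of 1, so it was no longer attached to a running monetdbd. A search for such processes could be scripted roughly as follows. This is a sketch under assumptions: the helper name `orphaned_mservers` is invented here, it expects `ps -eo pid,ppid,args` output on stdin, and on systems where orphans are re-parented to something other than PID 1 the test would need adjusting.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -eo pid,ppid,args` output on stdin and print
# the PIDs of mserver5 processes that have parent PID 1 and whose --dbpath
# lies under the given dbfarm directory.
orphaned_mservers() {
    awk -v farm="$1" '
        $2 == 1 && $3 ~ /mserver5$/ && index($0, "--dbpath=" farm "/") { print $1 }
    '
}
```

For this dbfarm the call would be `ps -eo pid,ppid,args | orphaned_mservers /local/monetdb/dbfarmBL`. Any PID it prints can then be inspected and, if it really is a stale server, stopped with a plain `kill` before monetdbd is started again.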
Hi Lynn,
Well, if the monetdbd process is running (it manages all connections to
your schemas) and a schema crashes because of a user request or query, it
will restart that schema.
To stop a schema you run
sudo monetdb stop schemaname
However, if a user request then comes in, the schema will be restarted.
Your
sudo monetdbd stop /local/monetdb/dbfarmBL
should have stopped everything, though, and there shouldn't be anything left
listening to invoke that restart procedure.
With the reboot, maybe an init script started the monetdbd process.
However, this message I don't understand:
2017-01-27 07:55:50 ERR merovingian[15395]: fatal startup condition encountered, aborting startup
Regards,
Brian Hood
On Mon, Jan 30, 2017 at 3:39 PM, Lynn Carol Johnson wrote:
Hi Brian -
The machine was rebooted in the evening, which should have killed all processes. The next morning I tried starting monetdbd but got error messages.
When doing a grep for monet we found this (I have the command and result saved in an email):
[root@cbsudc01 bin]# ps -ef | grep monet
root 14286 1 0 07:53 ? 00:00:26 /usr/bin/mserver5 --dbpath=/local/monetdb/dbfarmBL/maizeFullGenomeDB --set merovingian_uri mapi:monetdb://cbsudc01:50000/maizeFullGenomeDB --set mapi_open false --set mapi_port 0 --set mapi_usock /local/monetdb/dbfarmBL/maizeFullGenomeDB/.mapi.sock --set monet_vault_key /local/monetdb/dbfarmBL/maizeFullGenomeDB/.vaultkey --set gdk_nr_threads 24 --set max_clients 64 --set sql_optimizer default_pipe --set monet_daemon yes
Once we killed pid 14286 I was able to start the daemon. It looks like this process was started that morning. Could the mserver5 process above have been started by a user request, but hung before monetdbd had started successfully? Or is it related to the monetdbd start request I issued?
Lynn
From: users-list on behalf of Brian Hood
Reply-To: Communication channel for MonetDB users
Date: Saturday, January 28, 2017 at 10:23 AM
To: Communication channel for MonetDB users
Subject: Re: problem stopping/starting monetdbd
Hi Lynn,
Out of interest, what kind of bad process was it? The daemon binds the TCP and UNIX sockets, then dies with no real explanation.
Were some files locked by another process?
Regards,
Brian Hood
participants (3)
- Anderson, David B
- Brian Hood
- Lynn Carol Johnson