Yes. Actually I did observe that after starting to load the data,
within a few minutes, the server machine would freeze and I was not even
able to SSH into it. But I wonder why this would happen on one machine and
not on the other, which is using exactly the same sources of MonetDB
server and client, and the same data to load!
The machine where mserver is getting killed is running CentOS. The
following is the output of "uname -a" (masking out the server domain
name for security):
"Linux <servername>.<inst>.edu 2.6.18-128.4.1.el5 #1 SMP Tue Aug 4
20:19:25 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux"
I checked "dmesg". The process was indeed killed by the OS. Here's the
output from dmesg:
---------------------------------------
Out of memory: Killed process 5857 (mserver5).
Out of memory: Killed process 5858 (mserver5).
mserver5: page allocation failure. order:0, mode:0x200d2
Out of memory: Killed process 14891 (mserver5).
Out of memory: Killed process 14892 (mserver5).
mserver5: page allocation failure. order:0, mode:0x200d2
mserver5: page allocation failure. order:0, mode:0x200d2
VM: killing process mserver5
mserver5: page allocation failure. order:0, mode:0x200d2
Out of memory: Killed process 16950 (mserver5).
Out of memory: Killed process 16951 (mserver5).
mserver5: page allocation failure. order:0, mode:0x200d2
mserver5: page allocation failure. order:0, mode:0x200d2
mserver5 invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Out of memory: Killed process 26793 (mserver5).
Out of memory: Killed process 26794 (mserver5).
mserver5: page allocation failure. order:0, mode:0x200d2
mserver5: page allocation failure. order:0, mode:0x200d2
mserver5: page allocation failure. order:0, mode:0x200d2
---------------------------------------
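For anyone reading a longer dmesg dump, the "Out of memory: Killed process" lines above can be tallied with a small script. This is only a sketch: the function name `summarize_oom_kills` and the regular expression are illustrative assumptions, written against the exact line format shown in this output, and other kernel versions may word these lines differently.

```python
import re
from collections import Counter

# Matches kernel lines of the form seen in the dmesg output above, e.g.
# "Out of memory: Killed process 5857 (mserver5)."
# The pattern is an assumption based on this one kernel's wording.
OOM_KILL = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")


def summarize_oom_kills(dmesg_text):
    """Return (kill count per process name, killed PIDs in order of appearance)."""
    per_name = Counter()
    pids = []
    for line in dmesg_text.splitlines():
        match = OOM_KILL.search(line)
        if match:
            pids.append(int(match.group(1)))
            per_name[match.group(2)] += 1
    return per_name, pids


# A few lines copied from the dmesg output above.
sample = """\
Out of memory: Killed process 5857 (mserver5).
Out of memory: Killed process 5858 (mserver5).
mserver5: page allocation failure. order:0, mode:0x200d2
mserver5 invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Out of memory: Killed process 26793 (mserver5).
"""

counts, killed_pids = summarize_oom_kills(sample)
print(counts)
print(killed_pids)
```

PID 26793 in this output is the same PID that the merovingian log reports for database 'rdf', which ties the OOM kill to that particular server instance.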
Medha
On Fri, Oct 16, 2009 at 4:00 PM, Sjoerd Mullender wrote:
On 2009-10-16 21:54, Medha Atre wrote:
Actually, I made a typo in my previous message. The response is not "connection *timeout*" but "connection *terminated*"! Since I don't own this machine, I will have to contact its administrators to check with them. In the meanwhile, searching the web for the same error showed me that several users have reported the same problem and that a bug was filed against MonetDB sometime early this year. I was just wondering if it's the same thing?
For more information, here are the last few lines of the merovingian log:
2009-10-16 07:13:52 MSG rdf[26793]: # MonetDB server v5.14.2, based on kernel v1.32.2
2009-10-16 07:13:52 MSG rdf[26793]: # Serving database 'rdf', using 8 threads
2009-10-16 07:13:52 MSG rdf[26793]: # Compiled for x86_64-unknown-linux-gnu/64bit with 64bit OIDs dynamically linked
2009-10-16 07:13:52 MSG rdf[26793]: # Copyright (c) 1993-July 2008 CWI.
2009-10-16 07:13:52 MSG rdf[26793]: # Copyright (c) August 2008-2009 MonetDB B.V., all rights reserved
2009-10-16 07:13:52 MSG rdf[26793]: # Visit http://monetdb.cwi.nl/ for further information
2009-10-16 07:13:52 MSG rdf[26793]: # Listening for connection requests on mapi:monetdb://127.0.0.1:50001/
2009-10-16 07:13:52 MSG rdf[26793]: # MonetDB/SQL module v2.32.2 loaded
2009-10-16 07:14:22 MSG merovingian[26784]: proxying client localhost.localdomain:43986 for database 'rdf' to mapi:monetdb://127.0.0.1:50001/rdf
2009-10-16 07:49:59 MSG merovingian[26784]: database 'rdf' (26793) was killed by signal 9
This "killed by signal 9" is a hint that the OS thought the process was using too many resources (presumably memory) and killed it.
2009-10-16 07:50:00 MSG merovingian[26784]: client localhost.localdomain:43986 has disconnected from proxy
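The "killed by signal 9" entry in the log above can be picked out and translated to a signal name programmatically. The following is a rough sketch, not a merovingian tool: the helper name `parse_kill` and the regular expression are assumptions derived only from the single log line quoted in this thread.

```python
import re
import signal

# Pattern based on the one merovingian line quoted in this thread:
# "database 'rdf' (26793) was killed by signal 9"
KILLED = re.compile(r"database '([^']+)' \((\d+)\) was killed by signal (\d+)")


def parse_kill(line):
    """Return (database, pid, signal number, signal name), or None if no match."""
    match = KILLED.search(line)
    if match is None:
        return None
    name = match.group(1)
    pid = int(match.group(2))
    signum = int(match.group(3))
    try:
        # On Linux, signal 9 is SIGKILL, which a process cannot catch or ignore.
        signame = signal.Signals(signum).name
    except ValueError:
        # The number is not a known signal on this platform.
        signame = "unknown"
    return name, pid, signum, signame


line = ("2009-10-16 07:49:59 MSG merovingian[26784]: "
        "database 'rdf' (26793) was killed by signal 9")
print(parse_kill(line))
```

Because SIGKILL cannot be handled by the receiving process, the server gets no chance to log anything itself, which is why the only trace is in the merovingian log and in dmesg.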
I will also look into the connection timeout problem if that's the cause.
Thanks for the help. Medha
On Fri, Oct 16, 2009 at 3:15 PM, Martin Kersten wrote:
The error response produced is not MonetDB related. It is a result of the compute environment, which closes a connection if it has been silent for a long time. It may be as simple as the timeout settings of your basic communication channel. There is nothing we can do about that. This may happen if you start any query (or data load with integrity checking) that takes a long time.
Lefteris wrote:
Since you are loading triples in integer ID form, a billion triples should be no problem on a machine with 16 GB of memory. Could you please give more information on the commands/queries you use to load the data, and the exact, full error message? Which OS are you running on the machine, and which version of MonetDB did you install (stable or current)?
regards,
lefteris
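The sizing estimate in the message above can be checked with a quick back-of-the-envelope calculation. This is only a sketch of the raw column data for a three-column triple table; the ID widths (4 and 8 bytes) are assumptions, since the thread does not state the column types, and the figures ignore any working memory the server needs while loading.

```python
def triple_table_bytes(n_triples, bytes_per_id, n_columns=3):
    """Raw size of n_triples rows of fixed-width integer IDs, in bytes."""
    return n_triples * n_columns * bytes_per_id


GIB = 1024 ** 3
n = 1_000_000_000  # one billion triples, as in the message above

# Hypothetical ID widths: 4-byte and 8-byte integers.
for width in (4, 8):
    size = triple_table_bytes(n, width)
    print(f"{width}-byte IDs: {size / GIB:.1f} GiB")
```

Under these assumptions the raw data fits in 16 GB of memory with 4-byte IDs but not with 8-byte IDs, so the column type used for the load is worth stating when reporting the problem.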