
My database crashes quite systematically in production during a data import (a series of DELETE ... WHERE / COPY INTO statements). The relevant information from merovingian.log seems to be:

2011-06-02 04:53:27 MSG prod_reporting[25449]: !SQLException:SQLinit:Catalogue initialization failed
2011-06-02 04:53:27 MSG prod_reporting[25449]: !ERROR: HEAPextend: failed to extend to 3316460814336 for 11/40/114026theap

A) What does this error mean?
-----------------------------------------

A lack of memory? I am quite confused, as the entire database is barely 5G on disk and there is 7G of RAM on this machine, which is dedicated solely to MonetDB. Moreover, I only run DELETE / COPY INTO against a single table at a time, and the biggest table is barely 1.6G. Clearly there are some memory dynamics that I do not understand.

B) How do I recover the crashed database?
-----------------------------------------------------------

It will no longer start:

2011-06-02 04:55:19 MSG prod_reporting[1665]: # Listening for UNIX domain connection requests on mapi:monetdb:///var/monetdb5/dbfarm/prod_reporting/.mapi.sock
2011-06-02 04:55:19 MSG prod_reporting[1665]: # MonetDB/SQL module loaded
2011-06-02 04:55:19 MSG merovingian[1644]: database 'prod_reporting' (1665) was killed by signal SIGSEGV
2011-06-02 04:55:29 ERR control[1644]: (local): failed to fork mserver: database 'prod_reporting' has crashed after starting, manual intervention needed, check merovingian's logfile for details

Thanks in advance,

- Philippe

-- more complete extract from merovingian.log ---

2011-06-02 04:40:38 MSG merovingian[639]: proxying client ip-10-204-61-105.ec2.internal:52864 for database 'prod_reporting' to mapi:monetdb:///var/monetdb5/dbfarm/prod_reporting/.mapi.sock?database=prod_reporting
2011-06-02 04:40:38 MSG merovingian[639]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2011-06-02 04:53:09 MSG control[639]: (local): served status list
2011-06-02 04:53:23 MSG merovingian[639]: caught SIGTERM, starting shutdown sequence
2011-06-02 04:53:24 MSG control[639]: control channel closed
2011-06-02 04:53:27 MSG merovingian[639]: sending process 25449 (database 'prod_reporting') the TERM signal
2011-06-02 04:53:27 MSG prod_reporting[25449]: !SQLException:SQLinit:Catalogue initialization failed
2011-06-02 04:53:27 MSG prod_reporting[25449]: !ERROR: HEAPextend: failed to extend to 3316460814336 for 11/40/114026theap
2011-06-02 04:53:27 MSG merovingian[639]: database 'prod_reporting' (25449) has exited with exit status 0
2011-06-02 04:53:27 MSG merovingian[639]: database 'prod_reporting' has shut down
2011-06-02 04:53:27 MSG merovingian[639]: Merovingian 1.4 stopped
2011-06-02 04:55:16 MSG merovingian[1644]: Merovingian 1.4 (Apr2011-SP1) starting
2011-06-02 04:55:16 MSG merovingian[1644]: monitoring dbfarm /var/monetdb5/dbfarm
2011-06-02 04:55:16 MSG merovingian[1644]: accepting connections on TCP socket 0.0.0.0:50000
2011-06-02 04:55:16 MSG merovingian[1644]: accepting connections on UNIX domain socket /tmp/.s.monetdb.50000
2011-06-02 04:55:16 MSG discovery[1644]: listening for UDP messages on 0.0.0.0:50000
2011-06-02 04:55:16 MSG control[1644]: accepting connections on UNIX domain socket /tmp/.s.merovingian.50001
2011-06-02 04:55:16 MSG control[1644]: accepting connections on TCP socket 0.0.0.0:50001
2011-06-02 04:55:16 MSG discovery[1644]: new neighbour ip-10-32-111-2 (ip-10-32-111-2.ec2.internal)
2011-06-02 04:55:16 MSG discovery[1644]: new database mapi:monetdb://ip-10-32-111-2:50000/prod_reporting (ttl=660s)
2011-06-02 04:55:16 MSG discovery[1644]: registered neighbour ip-10-32-111-2:50001
2011-06-02 04:55:19 MSG control[1644]: (local): served status list
2011-06-02 04:55:19 MSG merovingian[1644]: starting database 'prod_reporting', up min/avg/max: 1m/31m/1h, crash average: 0.00 0.10 0.03 (6-5=1)
2011-06-02 04:55:19 MSG prod_reporting[1665]: arguments: /usr/bin/mserver5 --set gdk_dbfarm=/var/monetdb5/dbfarm --dbname=prod_reporting --set merovingian_uri=mapi:monetdb://ip-10-32-111-2:50000/prod_reporting --set mapi_open=false --set mapi_port=0 --set mapi_usock=/var/monetdb5/dbfarm/prod_reporting/.mapi.sock --set monet_vault_key=/var/monetdb5/dbfarm/prod_reporting/.vaultkey --set monet_daemon=yes
2011-06-02 04:55:19 MSG prod_reporting[1665]: # MonetDB 5 server v11.3.3 "Apr2011-SP1"
2011-06-02 04:55:19 MSG prod_reporting[1665]: # Serving database 'prod_reporting', using 2 threads
2011-06-02 04:55:19 MSG prod_reporting[1665]: # Compiled for x86_64-pc-linux-gnu/64bit with 64bit OIDs dynamically linked
2011-06-02 04:55:19 MSG prod_reporting[1665]: # Found 7.294 GiB available main-memory.
2011-06-02 04:55:19 MSG prod_reporting[1665]: # Copyright (c) 1993-July 2008 CWI.
2011-06-02 04:55:19 MSG prod_reporting[1665]: # Copyright (c) August 2008-2011 MonetDB B.V., all rights reserved
2011-06-02 04:55:19 MSG prod_reporting[1665]: # Listening for UNIX domain connection requests on mapi:monetdb:///var/monetdb5/dbfarm/prod_reporting/.mapi.sock
2011-06-02 04:55:19 MSG prod_reporting[1665]: # MonetDB/SQL module loaded
2011-06-02 04:55:19 MSG merovingian[1644]: database 'prod_reporting' (1665) was killed by signal SIGSEGV
2011-06-02 04:55:29 ERR control[1644]: (local): failed to fork mserver: database 'prod_reporting' has crashed after starting, manual intervention needed, check merovingian's logfile for details
2011-06-02 05:03:18 MSG merovingian[1644]: database 'prod_reporting' has crashed after start on 2011-06-02 04:55:19, attempting restart, up min/avg/max: 1m/31m/1h, crash average: 1.00 0.20 0.07 (7-5=2)
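
For scale, here is a small sketch that pulls the size out of the HEAPextend line in the log and converts it to larger units. It assumes the number is a byte count, which the log does not state explicitly; the function name requested_bytes is only illustrative and not part of MonetDB.

```python
import re

# Line copied verbatim from the merovingian.log extract.
LOG_LINE = ("2011-06-02 04:53:27 MSG prod_reporting[25449]: "
            "!ERROR: HEAPextend: failed to extend to 3316460814336 "
            "for 11/40/114026theap")

# From the startup banner: "Found 7.294 GiB available main-memory."
AVAILABLE_GIB = 7.294


def requested_bytes(line):
    """Return the size from a 'HEAPextend: failed to extend to N' line, or None."""
    match = re.search(r"HEAPextend: failed to extend to (\d+)", line)
    return int(match.group(1)) if match else None


size = requested_bytes(LOG_LINE)
size_gib = size / 2**30  # assumption: the logged value is in bytes
size_tib = size / 2**40

print(f"requested: {size} bytes = {size_gib:.1f} GiB = {size_tib:.2f} TiB")
print(f"ratio to available memory: {size_gib / AVAILABLE_GIB:.0f}x")
```

If that reading is right, the failed extension is on the order of 3 TiB, which is far beyond both the 5G database on disk and the 7G of RAM, so the request itself looks implausible rather than the machine merely being a little short on memory.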