
Great info. Thanks a lot.
I am going to keep my db in this state. I will be able to perform your suggestions tomorrow.
Until then this is what I see in the logs when I do a fresh monetdb start:
2013-02-11 10:49:46 MSG merovingian[31990]: database 'msearch_stats_db' (32018) was killed by signal SIGSEGV
2013-02-11 10:49:46 ERR control[31990]: (local): failed to fork mserver: database 'msearch_stats_db' has crashed after starting, manual intervention ...
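For anyone who wants to count such events over a longer log, here is a minimal Python sketch. It assumes the log lines have exactly the layout of the "was killed by signal" line quoted above; the function name `find_kills` and the regular expression are illustrative and not part of MonetDB, and other merovingian message layouts are not handled.

```python
import re

# Pattern modelled only on the 'was killed by signal' line quoted above.
# Assumption: real logs use the same layout; other messages are ignored.
KILLED = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) MSG merovingian\[\d+\]: "
    r"database '(?P<db>[^']+)' \((?P<pid>\d+)\) "
    r"was killed by signal (?P<sig>\w+)"
)


def find_kills(lines):
    """Return (timestamp, database, pid, signal) for each matching line."""
    kills = []
    for line in lines:
        m = KILLED.match(line.strip())
        if m:
            kills.append((m["ts"], m["db"], int(m["pid"]), m["sig"]))
    return kills


# The two log lines from this message, used as sample input.
sample = [
    "2013-02-11 10:49:46 MSG merovingian[31990]: database 'msearch_stats_db' "
    "(32018) was killed by signal SIGSEGV",
    "2013-02-11 10:49:46 ERR control[31990]: (local): failed to fork mserver: "
    "database 'msearch_stats_db' has crashed after starting, manual intervention ...",
]

for kill in find_kills(sample):
    print(kill)
```

Only the first sample line matches; the ERR line from control is skipped because it does not report a signal.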
FYI, I have not included my UDF. I did a basic configure-make-make install with no modifications or extra options on Ubuntu 12.10 64-bit (mercurial changeset: 46861:45c89b2e2ac2 Wed Feb 06 11:42:37).
Usage profile: There is a constant transactional insert/update load on the order of 100 update attempts per second. A "select 1;" is fired every 10 seconds to check whether the DB is alive. There has been no significant select load yet (we are still loading historical data into the DB).
Thanks and Regards,
Tapomay.
________________________________
From: Stefan Manegold
Dear Tapomay,
Taking the non-released revision is indeed "living on the edge".
On 2/11/13 6:44 AM, Tapomay Dey wrote:
> 1. BTW I am running a non-released revision of the Feb13 branch. Could this be the reason for such a crash? I am doing so because I need a fix that Niels made for a concurrency issue that caused duplicate keys. I am also planning to implement a group_concat UDF as per the changed semantics of Feb13. I already have a partially running one for Oct12.
There are a few known cases where it may crash; they are being worked on. You can find these cases in the testweb. In general, you would have to be unlucky to hit them immediately. They seem rare.
> 2. As the DB crashes each time I try to start it, I think it's a perfect state to gather more diagnostics. How do I do so? I really need the DB never to reach a non-recoverable state.
If it never passes the initialization phase after a restart, it is most likely a corrupted database. This could happen as a result of a hardware failure, or an unknown software error that caused a crash. It may be your UDF that went haywire and brought the system down. If it crashes without your UDF, then running mserver under gdb may provide a hint on the whereabouts (see the calling sequence in merovingian.log to start mserver directly).
My approach would now be:
1) restore database from backup (or a small testdb)
2) ensure it is working correctly without your UDF
3) prepare test cases for your UDF
4) add your UDF
5) start/stop after the first few calls of code with UDF to observe behavior.
Success, Martin
My setup is such that there would be non-stop Inserts/updates into the DB 24/7.
Thanks and Regards, Tapomay.
------------------------------------------------------------------------
*From:* Tapomay Dey
*To:* Communication channel for MonetDB users
*Sent:* Monday, February 11, 2013 10:47 AM
*Subject:* Re: monetdb status health

Thanks a lot. But since the time I asked the question the DB has gone into a state where it keeps logging
2013-02-11 04:12:40 ERR merovingian[15380]: client error: database 'msearch_stats_db' has crashed after starting, manual intervention needed, check monetdbd's logfile for details
in merovingian.log.
Health is 1%.
What can I do at this stage?
Thanks and Regards, Tapomay.
------------------------------------------------------------------------
*From:* Fabian Groffen
*To:* users-list@monetdb.org
*Sent:* Sunday, February 10, 2013 11:33 PM
*Subject:* Re: monetdb status health

On 10-02-2013 09:44:09 -0800, Tapomay Dey wrote:
> My questions are simple:
>
> what causes crashes?
The mserver5 (monetdb database) terminates in such a way that it can not be considered a clean shutdown, this is usually the case when the program gets terminated due to a condition that makes further execution impossible, e.g. memory faults. These are almost always program errors.
> what is health?
Health is the percentage of start-stop sequences relative to the number of times the database was actually started, i.e. how many times a start was followed by a clean shutdown (hence no crash).
> how do we stop health from degrading?
You can't. A database that crashes, and keeps on doing so, will see its health degrade.
> Following is the status of my db-
> start count: 140
> stop count: 1
> crash count: 138
So, essentially, every time you start your database, it never reaches a point where you stop it cleanly, but instead your database crashes all the time.
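As a rough check of the definition above against these counts, here is a small Python sketch. It assumes health is simply the stop count divided by the start count, expressed as a percentage; the function name `health_percent` is illustrative, and the exact rounding monetdb applies is not stated in this thread.

```python
def health_percent(start_count: int, stop_count: int) -> float:
    """Share of starts that were followed by a clean stop, in percent.

    Assumption: health = stops / starts, per the description above.
    """
    if start_count <= 0:
        # Assumption: health is undefined for a database never started.
        raise ValueError("start_count must be positive")
    return 100.0 * stop_count / start_count


# Counts from the status output quoted above.
print(round(health_percent(140, 1)))
```

With 140 starts and 1 clean stop this gives roughly 0.7, which rounds to the 1% reported elsewhere in this thread.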
-- 
Fabian Groffen
fabian@monetdb.org
column-store pioneer
http://www.monetdb.org/Home
_______________________________________________
users-list mailing list
users-list@monetdb.org
http://mail.monetdb.org/mailman/listinfo/users-list
-- 
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |