Hi,
I think you were right about the "interesting" part: I now suspect it was more of a local environment problem.
As Niels replied in the thread "undefined symbol: msab_init while starting monetdb", it could be a stale library path issue, since I have not seen most of the errors listed below after I rebooted. The only one I have seen, once, is "client [XX.XX.XX.XX]:46739 sent challenge in incomplete block".
@Niels: I suspect the reboot triggered something like sudo ldconfig, which could have corrected the stale library problems.

Thanks and Regards,
Tapomay.


From: Tapomay Dey <tapomay@yahoo.com>
To: Communication channel for developers of the MonetDB suite. <developers-list@monetdb.org>
Sent: Friday, January 11, 2013 9:32 PM
Subject: Re: Error in merovingian log while performing a specific select: gdk_batop.c insert_string_bat: Assertion failed

RAM: 32GB. 
I am running an ETL that keeps sending inserts/updates/DDLs to the DB continuously (more details below).
The errors occurred when the ETL had completed around 3.5 lakh (350,000) inserts. The farm size is around 137MB.
The ETL sends batches of 100 inserts/1 update/1 DDL at a time, with auto-commit=false. Initially, inserts ran at around 1 insert batch per second.
When I checked after the table had grown beyond 2 lakh (200,000) records, the speed was 1 insert batch per 10-15 seconds.
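
To put those batch timings in rows per second, here is a rough sketch of the arithmetic. It assumes 1 lakh = 100,000 and that every batch holds exactly 100 inserts; the variable and function names are only illustrative.

```python
# Rough throughput arithmetic for the batch timings quoted above.
# Assumptions: 1 lakh = 100,000; each batch holds exactly 100 inserts.
LAKH = 100_000
BATCH_SIZE = 100


def rows_per_second(batch_size, seconds_per_batch):
    """Rows inserted per second when one batch takes seconds_per_batch."""
    return batch_size / seconds_per_batch


initial_rate = rows_per_second(BATCH_SIZE, 1)      # 1 batch per second
slow_rate_best = rows_per_second(BATCH_SIZE, 10)   # 1 batch per 10 seconds
slow_rate_worst = rows_per_second(BATCH_SIZE, 15)  # 1 batch per 15 seconds

# Number of insert batches needed to reach 3.5 lakh rows.
total_batches = int(3.5 * LAKH) // BATCH_SIZE

print(f"initial: {initial_rate:.0f} rows/s")
print(f"later:   {slow_rate_worst:.1f} to {slow_rate_best:.1f} rows/s")
print(f"batches for 3.5 lakh rows: {total_batches}")
```

So the slowdown is roughly a factor of 10 to 15 in insert throughput.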

The client setup is as follows:
JDBC with c3p0 connection pooling, using these parameters:
min_size: 20,
max_size: 40 (pool size),
acquire_increment: 5,
idle_test_period: 60,
max_statements: 0,
timeout: 2000

I am also trying to achieve master-master replication here. I acquire a Java lock per master before firing a query (to prevent commit failures due to optimistic locking), so we can assume that the inserts are pretty much funneled on the client itself.
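
The per-master locking idea can be sketched roughly as follows. This is only an illustration in Python rather than the Java used in the actual client, and every name in it (`lock_for`, `run_serialized`, the master labels) is hypothetical; it is not the real ETL code.

```python
import threading
from collections import defaultdict

# One lock per master, created on first use.
# All names here are hypothetical; the real client does this in Java.
_master_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()  # protects the dictionary itself


def lock_for(master):
    """Return the single lock object associated with a master."""
    with _registry_guard:
        return _master_locks[master]


def run_serialized(master, work):
    """Run work() while holding the lock for `master`.

    Only one caller at a time can execute work for a given master,
    so statements for that master are funneled on the client side.
    """
    with lock_for(master):
        return work()


# Example: the callable stands in for "send one batch and commit".
result = run_serialized("master-a", lambda: "batch sent")
print(result)
```

With this shape, work for different masters can still proceed in parallel, while work for the same master is strictly one at a time.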
I have never seen the memory usage exceed 3% after switching to Feb2013 (no selects yet).
Following is the monetdb status of the failing master:
  start count: 33
  stop count: 7
  crash count: 25
  current uptime: 12m 6s
  average uptime: 2h 20m 13s
  maximum uptime: 4h 51m 24s
  average of crashes in the last 10 start attempts: 0.50
  average of crashes in the last 30 start attempts: 0.73
  crash average: 0.00 0.50 0.73 (over 1, 15, 30 starts) in total 25 crashes
  uptime stats (min/avg/max): 4s/2h/4h over 7 runs
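
As a quick sanity check on those counters, here is a small sketch of the arithmetic. It assumes that every start ends in either a stop or a crash unless the server is still running, which is my reading of the counters and not something stated in the output.

```python
# Figures copied from the status output above.
starts, stops, crashes = 33, 7, 25

# Assumption: every start ends in a stop, a crash, or is still running.
still_running = starts - stops - crashes

# Fraction of all starts that ended in a crash.
overall_crash_fraction = crashes / starts

# Crash counts implied by the reported averages (the averages are rounded).
crashes_in_last_10 = round(0.50 * 10)
crashes_in_last_30 = round(0.73 * 30)

print(f"instances still running: {still_running}")
print(f"overall crash fraction:  {overall_crash_fraction:.2f}")
print(f"crashes in last 10 starts: {crashes_in_last_10}")
print(f"crashes in last 30 starts: {crashes_in_last_30}")
```

Under that assumption the counters are consistent with one running instance, and roughly three out of four starts ended in a crash.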

As we use pooled connections, we can assume that there have been around 40 connections, with 3.5 lakh/100/40 queries per connection (at least theoretically).
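
Worked out, that estimate looks like this. It is a rough sketch that assumes 1 lakh = 100,000, that the pool stayed at its maximum of 40 connections, and that batches were spread evenly across them.

```python
# Theoretical number of insert batches handled by each pooled connection.
# Assumptions: 1 lakh = 100,000; the pool stays at its maximum of 40
# connections and batches are spread evenly across them.
total_inserts = 350_000  # 3.5 lakh
inserts_per_batch = 100
pool_size = 40

batches_total = total_inserts // inserts_per_batch
batches_per_connection = batches_total / pool_size

print(f"total insert batches:   {batches_total}")
print(f"batches per connection: {batches_per_connection}")
```

That gives fewer than 100 insert batches per connection over the whole run.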

NEW ISSUE:
After this incident I am facing a new issue: I am unable to connect to the database. mclient just hangs after I enter my password, and nothing is logged to merovingian.log. monetdbd stop fails to end the monetdbd process. Access eventually comes back, with some luck, after many forced kills.

Thanks and Regards,
Tapomay.

From: Fabian Groffen <fabian@monetdb.org>
To: developers-list@monetdb.org
Sent: Friday, January 11, 2013 3:54 PM
Subject: Re: Error in merovingian log while performing a specific select: gdk_batop.c insert_string_bat: Assertion failed

On 11-01-2013 02:12:32 -0800, Tapomay Dey wrote:
> Following is an excerpt from the merovingian logs:
>
> 2013-01-10 10:40:49 ERR msearch_stats_db[19210]: !mal_mapi.listen: expected
> filedescriptor, but received something else
>
> 2013-01-10 10:40:57 ERR merovingian[8996]: client error: client
> [XX.XX.XX.XX]:51573 sent challenge in incomplete block:
>
> 2013-01-10 10:45:29 ERR msearch_stats_db[19210]: mserver5: bat_storage.c:40:
> delta_bind_del: Assertion `b' failed.
>
> 2013-01-10 10:45:29 ERR msearch_stats_db[19210]: mserver5: bat_storage.c:40:
> delta_bind_del: Assertion `b' failed.
>
> 2013-01-10 23:48:18 ERR msearch_stats_db[31637]: !mal_mapi.listen: expected
> filedescriptor, but received something else
>
> 2013-01-10 23:48:19 ERR msearch_stats_db[31637]: !mal_mapi.listen: expected
> filedescriptor, but received something else
>
> 2013-01-10 23:48:32 ERR msearch_stats_db[31637]: !FATAL: BBPdir: subcommit
> attempted without backup BBP.dir.
>
> 2013-01-11 04:29:31 ERR msearch_stats_db[4480]: mserver5: gdk_batop.c:175:
> insert_string_bat: Assertion `v >= ((var_t) (((1<<10) * sizeof(stridx_t)) >>
> 0))' failed.

You have an interesting amount of errors.  Can you describe what load
you are imposing on the database, and the system?  Something along the
lines of connections per second, queries per connection, memory usage,
load average, and start, stop and crashcounters of the database would be
interesting.


--
Fabian Groffen                              fabian@monetdb.org
column-store pioneer              http://www.monetdb.org/Home

_______________________________________________
developers-list mailing list
developers-list@monetdb.org
http://mail.monetdb.org/mailman/listinfo/developers-list


