commit failures with dbfarm on iSCSI LUN

24 Nov 2015

Hi there,

Do you have any experience with running a dbfarm over iSCSI?

We have been trying to use the NAS on our 1 Gbit LAN for our largish daily
experiments with MonetDB. It's a very handy setup and seems better suited
than NFS.

It seems to achieve reasonable performance, but we regularly (though so far
not predictably) get commit failures during a rather long ETL run.
We do not get these commit failures when the same db and ETL are run on a
local disk.

Excerpt from merovingian.log (Jul2015-SP1):

2015-11-24 11:52:37 ERR trec01[19110]: !ERROR: bm_subcommit: commit failed
2015-11-24 11:52:37 ERR trec01[19110]: !ERROR: log_tend: write failed
2015-11-24 11:52:37 ERR trec01[19110]: !FATAL: 40000!COMMIT: transation
commit failed (perhaps your disk is full?) exiting (kernel error: !ERROR:
GDKsave: error on: name=07/717, ext=theap, mode=1
2015-11-24 11:52:37 ERR trec01[19110]: !OS: Input/output error
2015-11-24 11:52:37 ERR trec01[19110]: )

The disk is most definitely not full: 1.5 TB is available (the same ETL
works on a local disk with less space available).
It looks like iSCSI is the problem, even though it works perfectly apart
from these random failures.
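
One way to check whether the "Input/output error" shows up independently
of MonetDB might be a plain write-and-fsync loop against a directory on the
iSCSI mount. The following is only a minimal sketch under assumptions: the
function name fsync_stress and its parameters are made up for illustration,
and it does not reproduce MonetDB's actual commit path (it only forces
data to the block device the way a commit eventually has to).

```python
import errno
import os
import tempfile


def fsync_stress(directory, rounds=100, chunk_size=1 << 20):
    """Append `rounds` chunks to a scratch file in `directory`, calling
    fsync after each one. Returns a list of (round_index, errno_name)
    for every OSError raised by the write or the fsync."""
    failures = []
    payload = os.urandom(chunk_size)
    # Scratch file created inside the directory under test (hypothetical
    # prefix, chosen only to make leftovers easy to spot).
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync_stress_")
    try:
        with os.fdopen(fd, "wb") as scratch:
            for i in range(rounds):
                try:
                    scratch.write(payload)
                    scratch.flush()
                    # Force the data out of the page cache to the device.
                    os.fsync(scratch.fileno())
                except OSError as exc:
                    name = errno.errorcode.get(exc.errno, str(exc.errno))
                    failures.append((i, name))
    finally:
        # Remove the scratch file whether or not errors occurred.
        if os.path.exists(path):
            os.unlink(path)
    return failures
```

Calling fsync_stress("/path/to/iscsi/mount") (a placeholder path) and
getting back entries such as (17, "EIO") would suggest that the errors come
from the storage layer rather than from MonetDB; an empty list proves
nothing, since the failures are intermittent.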

Can you think of any reason why iSCSI could fail where a real local
block device would not?

iSCSI client (where MonetDB runs): libiscsi 1.11.0
iSCSI storage (where the dbfarm is stored): iscsid 2.0-871

The iSCSI LUN is created as a regular file with thin provisioning (a file
that grows dynamically on the NAS). We haven't yet tried a fixed-size
block-level LUN (we are trying this today anyway).

Hopefully someone already has an idea.

Roberto

Roberto Cornacchia
