Hi Roberto,
The obvious explanation is network "issues". In that case the error message is likely to be misleading, because MonetDB assumes it runs on top of a local filesystem. If a write takes longer because of a network problem, it will time out at some point. But timeout settings on a network connection are different from timeouts on writes to hard disks. This might lead to different errors that MonetDB does not handle properly. I am sure we do not test this setup at the moment.
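To check whether the I/O error shows up without MonetDB in the picture, something like the following could be run against a file on the iSCSI mount. It is only a minimal sketch: the function name `probe_write` and the default sizes are made up for illustration, and it does not reproduce what MonetDB does during a commit. It simply writes in chunks, calls fsync after each chunk, and reports the errno (for example EIO) if the kernel returns one.

```python
import errno
import os


def probe_write(path, total_bytes=1 << 20, chunk_bytes=1 << 16):
    """Write total_bytes to path in chunks, calling fsync after each chunk.

    Hypothetical helper, not part of MonetDB.
    Returns (ok, detail). ok is True when every write and fsync succeeded;
    otherwise detail names the errno and the offset reached so far.
    """
    buf = b"\0" * chunk_bytes
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while written < total_bytes:
            # os.write may write fewer bytes than requested, so track the count.
            n = os.write(fd, buf[: total_bytes - written])
            written += n
            # Force the data to the device so a storage error surfaces here.
            os.fsync(fd)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return False, f"{name} at offset {written}: {exc.strerror}"
    finally:
        os.close(fd)
    return True, f"wrote {written} bytes"
```

Running it in a loop during the ETL window, with a timestamp per failure, would show whether the failures line up with events in the iSCSI initiator or NAS logs.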
Another thing that might be important is how well memory mapping works with iSCSI.
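The memory-mapping path can be exercised in isolation in a similar way. Again this is just a sketch under assumptions: `probe_mmap` is a hypothetical name, the size is arbitrary, and it says nothing about how MonetDB itself maps its heaps. It maps a file, dirties one byte per page, and flushes the mapping (on POSIX systems Python's `mmap.flush` calls msync), returning any error raised during the flush.

```python
import mmap
import os


def probe_mmap(path, size=1 << 20):
    """Map a file of `size` bytes, dirty every page, and flush it to disk.

    Hypothetical helper, not part of MonetDB.
    Returns None on success, or the OSError raised while flushing.
    """
    with open(path, "w+b") as f:
        f.truncate(size)  # give the file its full length before mapping
        with mmap.mmap(f.fileno(), size) as m:
            for off in range(0, size, mmap.PAGESIZE):
                m[off] = 1  # touch one byte per page so each page is dirty
            try:
                m.flush()  # write the dirty pages back to the file
                os.fsync(f.fileno())
            except OSError as exc:
                return exc
    return None
```

If this fails on the iSCSI LUN but not on a local disk, that would point at the storage layer rather than at MonetDB.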
Arjen de Rijke
----- Original Message -----
> From: "Roberto Cornacchia" <roberto.cornacchia@gmail.com>
> To: "Communication channel for MonetDB users" <users-list@monetdb.org>
> Sent: Tuesday, November 24, 2015 12:24:44 PM
> Subject: commit failures with dbfarm on iSCSI LUN
> Hi there,
>
> Do you have any experience with running a dbfarm over iSCSI?
>
> We have tried to use the NAS in our 1Gbit LAN for our largish daily experiments
> with MonetDB. It's a very handy setup and seems more suited than NFS.
>
> It seems to achieve reasonable performance, but quite regularly (though not
> predictably for now) we get commit failures during a rather long ETL.
> We do not get such commit failures when the same db and ETL are run on a local
> disk.
>
> Excerpt from merovingian.log (Jul2015-SP1):
>
> 2015-11-24 11:52:37 ERR trec01[19110]: !ERROR: bm_subcommit: commit failed
> 2015-11-24 11:52:37 ERR trec01[19110]: !ERROR: log_tend: write failed
> 2015-11-24 11:52:37 ERR trec01[19110]: !FATAL: 40000!COMMIT: transation commit
> failed (perhaps your disk is full?) exiting (kernel error: !ERROR: GDKsave:
> error on: name=07/717, ext=theap, mode=1
> 2015-11-24 11:52:37 ERR trec01[19110]: !OS: Input/output error
> 2015-11-24 11:52:37 ERR trec01[19110]: )
>
> The disk is most definitely not full, 1.5 TB available (the same works on a
> local disk with less space available).
> It looks like iSCSI is the problem (it works perfectly except for these
> random failures).
>
> Can you think of any reason why iSCSI could fail where a real local block
> device would not?
>
> iscsi client (where MonetDB runs): libiscsi 1.11.0
> iscsi storage (where the dbfarm is stored): iscsid 2.0-871
>
> The iSCSI LUN is created as a regular file with thin provisioning (a file that
> dynamically grows on the NAS). We haven't yet tried a fixed-size
> block-level LUN (trying this today anyway).
>
> Hoping someone already has an idea.
>
> Roberto
_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list