Hi all -
Sorry for the multiple posts. Earlier in the fall I downloaded the MonetDB Jul2015 version on my Mac and could successfully load a file with 60M rows and 37 columns of mixed int/real/text/boolean data. Later I updated to the Jul2015-SP1 pack and was unable to load this same file. After trying to debug, I removed MonetDB and reinstalled Jul2015-SP1, but still had the problem. I then uninstalled it, reinstalled the Jul2015 version, and it worked again.
So the problem appears to be with the Jul2015-SP1 version.
Details: After creating a schema, I load a .sql file that creates this table:
sql>\d annosites
CREATE TABLE "testjeff"."annosites" (
    "chr" INTEGER,
    "pos" INTEGER,
    "hapmap31_total_depth" INTEGER,
    "hapmap31_num_taxa" SMALLINT,
    "hapmap31_num_alleles" SMALLINT,
    "hapmap31_minor_allele_avg_depth" REAL,
    "hapmap31_minor_allele_avg_phred" REAL,
    "hapmap31_num_hets" SMALLINT,
    "hapmap31_ed_factor" REAL,
    "hapmap31_seg_test_p_value" REAL,
    "hapmap31_ibd_one_allele" BOOLEAN,
    "hapmap31_in_local_ld" BOOLEAN,
    "hapmap31_maf" REAL,
    "hapmap31_near_indel" BOOLEAN,
    "hapmap31_first_alt_allele_is_ins_or_del" BOOLEAN,
    "snpeff40e_effect_hapmap31" CHARACTER LARGE OBJECT,
    "snpeff40e_effectimpact_hapmap31" CHARACTER LARGE OBJECT,
    "snpeff40e_functionalclass_hapmap31" CHARACTER LARGE OBJECT,
    "gerp_neutral_tree_length" REAL,
    "gerp_score" REAL,
    "gerp_conserved" BOOLEAN,
    "mnase_low_minus_high_rpm_shoots" REAL,
    "mnase_bayes_factor_shoots" REAL,
    "mnase_hotspot_shoots" BOOLEAN,
    "mnase_low_minus_high_rpm_roots" REAL,
    "mnase_bayes_factor_roots" REAL,
    "mnase_hotspot_roots" BOOLEAN,
    "within_gene" BOOLEAN,
    "within_transcript" BOOLEAN,
    "within_exon" BOOLEAN,
    "within_cds" BOOLEAN,
    "within_cds_from_gff3" BOOLEAN,
    "within_five_prime_utr" BOOLEAN,
    "within_three_prime_utr" BOOLEAN,
    "codon_position" SMALLINT,
    "go_term_accession" CHARACTER LARGE OBJECT,
    "go_term_name" CHARACTER LARGE OBJECT
);
sql>
Then I use a COPY INTO command (sometimes specifying the number of records, sometimes not) to load my data file containing 60M lines. This is the command when it WORKS on the Jul2015 load:
sql>COPY INTO annoSites FROM '/Users/lcj34/notes_files/machineLearningDB/annoDB_related/siteAnnoNoHdrsCol35Fixed_20151011.txt' USING DELIMITERS '\t','\n';
60362853 affected rows (3m 5s)
sql>
When running with Jul2015-SP1 installed on the Mac (OS 10.9.5), the COPY INTO command dies (i.e., I get back an sql prompt with no message). The merovingian.log from the Mac shows this:
2015-11-23 13:08:21 MSG test1[1531]: # MonetDB/SQL module loaded
2015-11-23 13:08:21 MSG merovingian[1528]: proxying client 0.0.0.0:0 for database 'test1' to mapi:monetdb:///users/lcj34/development/mydbfarm/test1/.mapi.sock?database=test1
2015-11-23 13:08:21 MSG merovingian[1528]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2015-11-23 13:08:21 MSG merovingian[1528]: proxying client 0.0.0.0:0 for database 'test1' to mapi:monetdb:///users/lcj34/development/mydbfarm/test1/.mapi.sock?database=test1
2015-11-23 13:08:21 MSG merovingian[1528]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2015-11-23 13:13:01 ERR test1[1531]: mserver5(1531,0x38967f000) malloc: *** error for object 0x7fc17393b208: incorrect checksum for freed object - object was probably modified after being freed.
2015-11-23 13:13:01 ERR test1[1531]: *** set a breakpoint in malloc_error_break to debug
2015-11-23 13:13:03 MSG merovingian[1528]: database 'test1' (1531) was killed by signal SIGABRT
My Mac is just a test bed; our real server is RedHat Release, which I am trying to get set up with MonetDB. On this machine I loaded the Jul2015-SP1 for RedHat/CentOS following these instructions: http://rogerhosto.com/installing-monetdb-on-centosredhat/
The errors are a bit different. On RedHat, the command also aborts, sometimes with no message, sometimes with the message below. I show running both with and without specifying the number of records, as the error message is slightly different. (Note that on the Mac I did not receive a command line message.)
sql>COPY 61000000 records INTO annosites FROM '/home/lcj34/monetdbFiles/sites10000Jeff.txt' USING DELIMITERS '\t','\n';
Failed to import table Leftover data 'component of nuclear inner membrane;molecular_function;biological_process;endoplasmic reticulum'
sql>
sql>COPY INTO annosites FROM '/home/lcj34/monetdbFiles/sites10000Jeff.txt' USING DELIMITERS '\t','\n';
Failed to import table Leftover data 'binding'
sql>
The log covers multiple days/attempts (yesterday and today). The last attempt to COPY the file has these messages:
2015-11-24 12:39:45 MSG merovingian[15627]: proxying client (local) for database 'jeffTest' to mapi:monetdb:///opt/dbfarm/jeffTest/.mapi.sock?database=jeffTest
2015-11-24 12:39:45 MSG merovingian[15627]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2015-11-24 12:43:38 ERR jeffTest[15693]: *** Error in `/usr/bin/mserver5': free(): invalid next size (normal): 0x00007f5b44004ac0 ***
2015-11-24 12:43:38 ERR jeffTest[15693]: ======= Backtrace: =========
2015-11-24 12:43:38 ERR jeffTest[15693]: /lib64/libc.so.6(+0x7d1fd)[0x7f5b628101fd]
2015-11-24 12:43:38 ERR jeffTest[15693]: /lib64/libbat.so.12(GDKfree+0x13)[0x7f5b64f6f093]
2015-11-24 12:43:38 ERR jeffTest[15693]: /lib64/libmonetdb5.so.19(+0x135e11)[0x7f5b655e1e11]
2015-11-24 12:43:38 ERR jeffTest[15693]: /lib64/libmonetdb5.so.19(+0x136534)[0x7f5b655e2534]
2015-11-24 12:43:38 ERR jeffTest[15693]: /lib64/libpthread.so.0(+0x7df5)[0x7f5b62b5bdf5]
2015-11-24 12:43:38 ERR jeffTest[15693]: /lib64/libc.so.6(clone+0x6d)[0x7f5b628891ad]
2015-11-24 12:43:38 ERR jeffTest[15693]: ======= Memory map: ========
2015-11-24 12:43:38 ERR jeffTest[15693]: 00400000-00405000 r-xp 00000000 fd:01 1076941767 /usr/bin/mserver5
2015-11-24 12:43:38 ERR jeffTest[15693]: 00604000-00605000 r--p 00004000 fd:01 1076941767 /usr/bin/mserver5
2015-11-24 12:43:38 ERR jeffTest[15693]: 00605000-00606000 rw-p 00005000 fd:01 1076941767 /usr/bin/mserver5
2015-11-24 12:43:38 ERR jeffTest[15693]: 00606000-00608000 rw-p 00000000 00:00 0
2015-11-24 12:43:38 ERR jeffTest[15693]: 012af000-02860000 rw-p 00000000 00:00 0 [heap]
I can forward a log file if requested.
I'm thinking perhaps there is some problem in the file with a floating point/real number? Though importing this file works with the Jul2015 release. Were there changes related to this in the service pack?
Final question: We need to be testing on the RedHat machine. Is the Jul2015 EPEL for RedHat/CentOS still available?
Thanks - Lynn
I have isolated the problem. Apparently, the Jul2015-SP1 load is more sensitive than the Jul2015 load. I was loading data curated by another group. The last column in my tab-delimited file was a text field containing a space-delimited biological description. Unfortunately, the “spaces” were not always spaces; some of them were tabs. Because this text field was the last column of the schema, each stray tab created an extra trailing field. The Jul2015 load ignored these extraneous fields, but they caused errors when loading the same file into Jul2015-SP1, which expected only a certain number of fields per row.
I think the extra sensitivity is good: it pointed out an error in the data. However, an informative error message (for example, one naming the offending line) would be nice, as it would have saved a bit of trouble.
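For anyone who hits the same "Leftover data" message, a field-count check outside the database can point at the offending rows. The following is only a minimal sketch, not the procedure used in this thread; the function name `field_count_report` and the sample rows are made up for illustration, and the default of 37 fields assumes one field per column of the annosites table.

```python
from collections import Counter

EXPECTED_FIELDS = 37  # assumption: one field per column of annosites


def field_count_report(lines, expected=EXPECTED_FIELDS, delimiter="\t", limit=10):
    """Count fields per row; return (histogram, first `limit` bad rows).

    `lines` is any iterable of text lines, e.g. an open file object.
    Bad rows are reported as (1-based line number, field count).
    """
    histogram = Counter()
    bad_rows = []
    for lineno, line in enumerate(lines, start=1):
        # Strip only the line ending so empty trailing fields still count.
        n_fields = len(line.rstrip("\r\n").split(delimiter))
        histogram[n_fields] += 1
        if n_fields != expected and len(bad_rows) < limit:
            bad_rows.append((lineno, n_fields))
    return histogram, bad_rows


# Tiny in-memory demo with a 3-column layout (hypothetical data):
# the second row has a tab where a space was intended.
sample = ["1\t100\tmembrane binding\n", "1\t200\tmembrane\tbinding\n"]
histogram, bad_rows = field_count_report(sample, expected=3)
for lineno, n_fields in bad_rows:
    print(f"line {lineno}: {n_fields} fields, expected 3")
```

For a real load file, pass an open text file instead of the sample list. The function reads one line at a time, so a 60M-row file does not need to fit in memory.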
I have fixed the data and am no longer having problems with the july2015-sp1 load.
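How the data was fixed is not described here. One possible repair, assuming the stray tabs occur only inside the final text column as described above, is to fold any surplus trailing fields back into that column with spaces. This is a sketch under that assumption; `fold_extra_fields` is a hypothetical helper, not part of MonetDB.

```python
def fold_extra_fields(line, expected=37, delimiter="\t", joiner=" "):
    """Merge surplus trailing fields into the last column.

    Rows with `expected` or fewer fields are returned unchanged, so rows
    that are too short still need separate attention.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]  # preserve the original line ending
    fields = body.split(delimiter)
    if len(fields) <= expected:
        return line
    head = fields[:expected - 1]
    tail = joiner.join(fields[expected - 1:])
    return delimiter.join(head + [tail]) + ending


# Hypothetical 3-column row whose last field contains a stray tab.
broken = "1\t200\tmembrane\tbinding\n"
print(repr(fold_extra_fields(broken, expected=3)))
```

If stray tabs could also appear in earlier columns, this approach would silently shift values into the wrong fields, so it is worth checking a sample of the repaired rows before loading them.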
Thanks- Lynn
From: users-list
Hi All,
Yes, I had a problem loading CSVs with quoted strings, i.e. "text", which
failed with the message "failed to load leftover data"; this didn't happen
with the same data in previous releases.
Regards,
Brian Hood
On Nov 30, 2015 2:04 PM, "Lynn Carol Johnson"
From: users-list on behalf of Lynn Carol Johnson
Reply-To: Communication channel for MonetDB users
Date: Wednesday, November 25, 2015 at 7:01 AM
To: Communication channel for MonetDB users
Subject: July2015-sp1 fails loading data
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
participants (2)
- Brian Hood
- Lynn Carol Johnson