Hi everybody,
I have a problem inserting a large amount of data into a MonetDB database using bulk import (COPY INTO).
I'm running the following commands in a loop:
connection.execute("COPY
19000000 OFFSET 2 RECORDS INTO XXX FROM '" + csv + "' USING DELIMITERS ';','\n' ;")
connection.commit()
where csv points to a different CSV file in each iteration of the loop, each file containing 18000001 rows of data.
To be sure that enough memory is allocated, I chose 19000000 as the record count in the COPY statement.
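For reference, here is a minimal sketch of the loop as I understand it. It assumes the pymonetdb client (the snippet above uses connection.execute(), which may come from a different wrapper); the credentials and the list of CSV files are placeholders, while the table name XXX, the delimiter ';' and the record count are the ones from my command:

import pymonetdb

# Placeholder connection parameters; adjust to the actual database.
connection = pymonetdb.connect(
    username="monetdb", password="monetdb",
    hostname="localhost", database="demo")
cursor = connection.cursor()

# Hypothetical list of input files; each file has 18000001 lines.
csv_files = ["/data/part1.csv", "/data/part2.csv"]

for csv in csv_files:
    # 19000000 is a per-file upper bound on the record count;
    # OFFSET 2 starts reading at the second record, skipping the header line.
    cursor.execute(
        "COPY 19000000 OFFSET 2 RECORDS INTO XXX "
        "FROM '" + csv + "' USING DELIMITERS ';','\\n';")
    connection.commit()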
I now have two questions:
1. Should the number of records (here 19000000) be the number of lines per CSV file, or the number of lines in the final table (number of CSV files * 18 million)?
2. Can you think of any reason why MonetDB would stop reading one specific variable while continuing to read the others? Say my CSV has 8 columns and 18000000 rows with no missing values in the raw data. Up to row 16537472 all data is read in correctly, but from there until line 18000000 variable 3 is missing, while variables 1-2 and 4-8 are perfectly fine. Can this be due to memory or hard-disk speed constraints? And why is no error message raised?