Re: Monetdb copy binary time varys very much!
Hi Ying,
It is the second case you mentioned: I alternated between creating the 21 files and executing COPY INTO. The times in the plots include only the execution time of COPY INTO.
Within 15 seconds, the data volume I have to process is about the same as in this simulation program: storing 200000 rows of 21 columns.
Thanks!
Meng
------------------ Original ------------------
From: "Ying Zhang"
Hi dear all,
I tested inserting 2,000,000,000 rows into MonetDB with "COPY BINARY INTO FROM binary_file". In this test:
1. I generate 21 files; each file represents one table column and has 200000 rows. I fill them with rand() numbers and write them in binary with fwrite().
The table creation SQL command is:
create table tmatch(id bigint, a float, b float, c float, d float, e float, f float, g float, h float, i float, j float, k float, l float, m float, n float, o float, p float, q float, r float, s float, t float);
The table has 21 columns and each value takes 8 bytes, so each column file is 200000*8 bytes = 1600000 bytes (about 1.6 MB), and the 21 files together are 200000*21*8 bytes = 33600000 bytes (about 32 MB).
2. I use "COPY BINARY INTO FROM above_binary" to load each binary file into tmatch. The test was run 10000 times repeatedly.
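As a rough illustration of step 1, here is a minimal Python sketch that writes one binary file per column. The original test used C with rand() and fwrite(); Python is used here only for brevity. The helper names (write_column_files, expected_sizes), the file names, and the assumption that each column file is a plain array of native 8-byte values are illustrative and not taken from the thread. It also spells out the size arithmetic: 200000 rows * 8 bytes = 1600000 bytes per column file, and 21 such files add up to 33600000 bytes.

```python
import os
import random
import tempfile
from array import array

ROWS = 200_000
# 21 column names as in the tmatch table: id plus a..t
COLUMNS = ["id"] + list("abcdefghijklmnopqrst")


def write_column_files(directory, rows=ROWS):
    """Write one binary file per column and return the list of paths."""
    paths = []
    for name in COLUMNS:
        if name == "id":
            # 'q' is a signed 64-bit integer (assumed layout for bigint)
            values = array("q", range(rows))
        else:
            # 'd' is a 64-bit double (assumed layout for float)
            values = array("d", (random.random() for _ in range(rows)))
        path = os.path.join(directory, name + ".bin")
        with open(path, "wb") as f:
            values.tofile(f)  # raw native bytes, comparable to fwrite()
        paths.append(path)
    return paths


def expected_sizes(rows=ROWS):
    """Return (bytes per column file, bytes for all 21 files)."""
    per_file = rows * 8
    return per_file, per_file * len(COLUMNS)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_column_files(tmp, rows=1000)
        print(len(paths), os.path.getsize(paths[0]))
```

Comparing os.path.getsize() of each file against expected_sizes() is a cheap sanity check before handing the files to the loader.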
The average time over the 10000 runs is only 1.0635589558955727, but the 9043rd run took 227m36.136, and in some later runs the time again rises to a large value. Is this because data is flushed from the cache into the database once the cache is full? Since we have to keep the total processing time within 15 seconds, I am wondering if you can help me reduce the maximum time?
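Because the average hides the rare slow runs described above, it can help to report the worst case and a high percentile next to the mean. The following Python sketch shows one way to do that. The function name summarize, the nearest-rank style 99th percentile, and the assumption that all times are in seconds are illustrative; only the 15 second budget comes from the thread.

```python
import statistics


def summarize(times, budget_s=15.0):
    """Summarize per-run load times (assumed to be in seconds)."""
    ordered = sorted(times)
    # Simple rank-based 99th percentile, using integer arithmetic
    p99 = ordered[min(len(ordered) - 1, (99 * len(ordered)) // 100)]
    worst = max(range(len(times)), key=times.__getitem__)
    return {
        "mean": statistics.fmean(times),
        "p99": p99,
        "max": times[worst],
        "max_run": worst + 1,  # 1-based run number of the slowest run
        "over_budget": [i + 1 for i, t in enumerate(times) if t > budget_s],
    }
```

The over_budget list gives the run numbers that exceeded the budget, which makes it easier to see whether the slow runs occur at regular intervals.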
Thanks very much! Meng
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
Hai Meng,
Thanks for your info.
Jennie
On Jul 28, 2013, at 11:57, integrity wrote:
Hi Ying,
It is the second case you mentioned: I alternated between creating the 21 files and executing COPY INTO. The times in the plots include only the execution time of COPY INTO.
Within 15 seconds, the data volume I have to process is about the same as in this simulation program: storing 200000 rows of 21 columns.
Thanks! Meng
------------------ Original ------------------
From: "Ying Zhang"
Date: Sun, Jul 28, 2013 00:47 AM
To: "Communication channel for MonetDB users"
Subject: Re: Monetdb copy binary time varys very much!
Hai Meng,
Since you repeat the COPY INTO statement 10000 times, I wonder how exactly you did it and measured the time. For instance, did you first create _all_ necessary files (i.e., 10000 x 21 files) and then execute the 10000 COPY INTO statements consecutively? Or did you alternate between creating 21 files and executing COPY INTO? In the second case, are the times in your plots only the execution time of COPY INTO, or do they also include the time to create the files? Again, in the second case, have you also measured the time of creating the files?
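To keep the two measurements apart in the alternating case, each round can be timed in two phases. The Python sketch below is one possible way to do it. The callables create_files and execute are hypothetical placeholders for whatever writes the column files and whatever sends SQL to the server. The multi-file form COPY BINARY INTO table FROM ('file1', 'file2', ...) is how I understand MonetDB's binary bulk load syntax, so please check it against the MonetDB documentation.

```python
import time


def copy_binary_statement(table, paths):
    """Build a COPY BINARY INTO statement listing one file per column."""
    quoted = ", ".join("'" + p.replace("'", "''") + "'" for p in paths)
    return f"COPY BINARY INTO {table} FROM ({quoted});"


def timed_round(create_files, execute, table="tmatch"):
    """Run one round and time file creation and loading separately.

    create_files: hypothetical callable returning the column file paths.
    execute: hypothetical callable that sends one SQL statement to the server.
    """
    t0 = time.perf_counter()
    paths = create_files()
    t1 = time.perf_counter()
    execute(copy_binary_statement(table, paths))
    t2 = time.perf_counter()
    return {"create_s": t1 - t0, "copy_s": t2 - t1}
```

Recording create_s and copy_s for every round shows directly whether a slow round is caused by writing the files or by the load itself.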
You once mentioned that the total available execution time in your application is 15 seconds. How much data do you need to process within this time?
Regards,
Jennie
On Jul 26, 2013, at 10:40, integrity wrote:
Hi dear all,
I tested inserting 2,000,000,000 rows into MonetDB with "COPY BINARY INTO FROM binary_file". In this test:
1. I generate 21 files; each file represents one table column and has 200000 rows. I fill them with rand() numbers and write them in binary with fwrite().
The table creation SQL command is:
create table tmatch(id bigint, a float, b float, c float, d float, e float, f float, g float, h float, i float, j float, k float, l float, m float, n float, o float, p float, q float, r float, s float, t float);
The table has 21 columns and each value takes 8 bytes, so each column file is 200000*8 bytes = 1600000 bytes (about 1.6 MB), and the 21 files together are 200000*21*8 bytes = 33600000 bytes (about 32 MB).
2. I use "COPY BINARY INTO FROM above_binary" to load each binary file into tmatch. The test was run 10000 times repeatedly.
The average time over the 10000 runs is only 1.0635589558955727, but the 9043rd run took 227m36.136, and in some later runs the time again rises to a large value. Is this because data is flushed from the cache into the database once the cache is full? Since we have to keep the total processing time within 15 seconds, I am wondering if you can help me reduce the maximum time?
Thanks very much! Meng
participants (2)
- integrity
- Ying Zhang