Re: Monetdb copy binary time varys very much!
Is it possible to set the cache to low down the maximum copy time?
------------------ Original ------------------
From: "Martin Kersten"
Hi Martin, I follow you advise, and there is error as follows:
ERROR = !COPY INTO: cannot pass number of records when using binary COPY INTO in: "copy 200000 records binary into tmatch from('/data/writeBinaryWM/id','/data/writeBinaryWM/0','/data/writeBinaryWM/1','/data/writeBinaryWM/2','/data/writeBinaryWM/3','/data/writeBinaryWM/4', '/data/writeBinaryWM/5','/data/writeBinaryWM/6','/data/writeBinaryWM/7','/data/writeBinaryWM/8','/data/writeBinaryWM/9','/data/writeBinaryWM/10', '/data/writeBinaryWM/11', '/data/writeBinaryWM/12', '/data/writeBinaryWM/13', '/data/writeBinaryWM/14', '/data/writeBinaryWM/15', '/data/writeBinaryWM/16', '/data/writeBinaryWM/17', '/data/writeBinaryWM/18', '/data/writeBinaryWM/19' )" !current transaction is aborted (please ROLLBACK)
2013/7/26 Martin Kersten
mailto:Martin.Kersten@cwi.nl> This runs shows expected flushing.
You might get somewhat better performance when you follow the advice in http://www.monetdb.org/__Documentation/Cookbooks/__SQLrecipes/LoadingBulkDat... http://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/LoadingBulkData e.g. use COPY 44000000 RECORDS ....
The COPY INTO forms mentioned so far share one important drawback. When a lot of data is to be inserted, the server doesn't know how much memory to allocate for the tables, and so will likely allocate too little. This means that during the insertion process, the server has to grow the allocated memory area. This can be an expensive operation. Therefore, it is better to give the server a count of how many records are to be inserted:
COPY n RECORDS INTO table FROM 'file';
Here n should be a number that is at least as large as the actual number of records to be inserted. If the file contains more than n records, only n will be inserted, if the file contains fewer, all values will be inserted. Giving a higher number is especially useful if multiple COPY INTO queries are to be done on the same table. The first COPY INTO, when the table is still empty, should be the total count of to-be-inserted values so that the server will allocate enough memory when the tables are first created (they are only really created once data is inserted). On 7/26/13 10:58 AM, integrity wrote:
------------------ Original ------------------ *From: * "Martin Kersten"
mailto:Martin.Kersten@cwi.nl__>; *Date: * Fri, Jul 26, 2013 04:46 PM *To: * "users-list" mailto:users-list@monetdb.org>; *Subject: * Re: Monetdb copy binary time varys very much! Hi, You should plot the times of each of your runs to see the trend. This single run may be an outlier, which could come from anything in your system environment. Even a seemingly harmless concurrent program using significant memory could compete with MonetDB.
And indeed, at some point you will see disk IO.
regards, Martin
_________________________________________________ users-list mailing list users-list@monetdb.org mailto:users-list@monetdb.org http://mail.monetdb.org/__mailman/listinfo/users-list http://mail.monetdb.org/mailman/listinfo/users-list
_________________________________________________ users-list mailing list users-list@monetdb.org mailto:users-list@monetdb.org http://mail.monetdb.org/__mailman/listinfo/users-list http://mail.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list .
No, at some point your data has to be shipped from memory to disk. I would consider using an SSD if timing becomes an issue here. On 7/26/13 4:20 PM, integrity wrote:
Is it possible to set the cache to low down the maximum copy time?
------------------ Original ------------------ *From: * "Martin Kersten"
; *Date: * Fri, Jul 26, 2013 10:13 PM *To: * "users-list" ; *Subject: * Re: Monetdb copy binary time varys very much! Oops, correct, for binary copies it does not recognize it
On 7/26/13 4:11 PM, Angelasweet wrote:
Hi Martin, I follow you advise, and there is error as follows:
ERROR = !COPY INTO: cannot pass number of records when using binary COPY INTO in: "copy 200000 records binary into tmatch
from('/data/writeBinaryWM/id','/data/writeBinaryWM/0','/data/writeBinaryWM/1','/data/writeBinaryWM/2','/data/writeBinaryWM/3','/data/writeBinaryWM/4',
'/data/writeBinaryWM/5','/data/writeBinaryWM/6','/data/writeBinaryWM/7','/data/writeBinaryWM/8','/data/writeBinaryWM/9','/data/writeBinaryWM/10', '/data/writeBinaryWM/11', '/data/writeBinaryWM/12', '/data/writeBinaryWM/13', '/data/writeBinaryWM/14', '/data/writeBinaryWM/15', '/data/writeBinaryWM/16', '/data/writeBinaryWM/17', '/data/writeBinaryWM/18', '/data/writeBinaryWM/19' )"
!current transaction is aborted (please ROLLBACK)
2013/7/26 Martin Kersten
mailto:Martin.Kersten@cwi.nl> This runs shows expected flushing.
You might get somewhat better performance when you follow the advice in
http://www.monetdb.org/__Documentation/Cookbooks/__SQLrecipes/LoadingBulkDat...
http://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/LoadingBulkData
e.g. use COPY 44000000 RECORDS ....
The COPY INTO forms mentioned so far share one important drawback. When a lot of data is to be inserted, the server doesn't know how much memory to allocate for the tables, and so will likely allocate too little. This means that during the insertion process, the server has to grow the allocated memory area. This can be an expensive operation. Therefore, it is better to give the server a count of how many records are to be inserted:
COPY n RECORDS INTO table FROM 'file';
Here n should be a number that is at least as large as the actual number of records to be inserted. If the file contains more than n records, only n will be inserted, if the file contains fewer, all values will be inserted. Giving a higher number is especially useful if multiple COPY INTO queries are to be done on the same table. The first COPY INTO, when the table is still empty, should be the total count of to-be-inserted values so that the server will allocate enough memory when the tables are first created (they are only really created once data is inserted). On 7/26/13 10:58 AM, integrity wrote:
------------------ Original ------------------ *From: * "Martin Kersten"
mailto:Martin.Kersten@cwi.nl__>; *Date: * Fri, Jul 26, 2013 04:46 PM *To: * "users-list" mailto:users-list@monetdb.org>; *Subject: * Re: Monetdb copy binary time varys very much! Hi, You should plot the times of each of your runs to see the trend. This single run may be an outlier, which could come from
anything
in your system environment. Even a seemingly harmless concurrent program using significant memory could compete with MonetDB.
And indeed, at some point you will see disk IO.
regards, Martin
_________________________________________________ users-list mailing list users-list@monetdb.org mailto:users-list@monetdb.org http://mail.monetdb.org/__mailman/listinfo/users-list http://mail.monetdb.org/mailman/listinfo/users-list
_________________________________________________ users-list mailing list users-list@monetdb.org mailto:users-list@monetdb.org http://mail.monetdb.org/__mailman/listinfo/users-list http://mail.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list .
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
Yes, time is the biggest issue to our project, I cannot find SSD, could you please explain the SSD concept? Thanks!
see http://en.wikipedia.org/wiki/Solid-state_drive On 7/26/13 4:48 PM, Angelasweet wrote:
Yes, time is the biggest issue to our project, I cannot find SSD, could you please explain the SSD concept? Thanks!
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
participants (3)
-
Angelasweet
-
integrity
-
Martin Kersten