Re: [MonetDB-users] constant bulk inserts to monetdb
uriel katz wrote:
Ok, but why doesn't it utilize one core to the max? It seems to keep using 8% of the core and takes about 17 seconds to insert 200,000 records of 1 KB each (the actual file is 64 MB).
That depends on the total setup. Especially when you are exceeding memory, Linux is not the best operating system; one of the issues is that dirty pages are not properly flushed. A BATsave at critical points in the new bulk loader helped to avoid this case. If you do a COPY into a clean BAT, the log overhead is negligible.
Can I somehow skip the logs and make it insert right into the BAT files? Also, how can I stream the tuples programmatically from C MAPI or ODBC?
See the C Mapi library. Or you might look at the Stethoscope, which streams tuples from the server to an application.
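For illustration, a minimal C MAPI sketch could look like the following (the host, port, credentials, table name and file path are made-up placeholders, and error handling is kept to the bare minimum):

    #include <stdio.h>
    #include <mapi.h>

    int main(void)
    {
        /* Connect to a hypothetical database "demo" on the default port. */
        Mapi mid = mapi_connect("localhost", 50000, "monetdb", "monetdb",
                                "sql", "demo");
        if (mid == NULL || mapi_error(mid)) {
            if (mid) mapi_explain(mid, stderr);
            return 1;
        }

        /* Let the server bulk-load a file directly; "mytable" and the
         * path are placeholders. */
        MapiHdl hdl = mapi_query(mid,
            "COPY INTO mytable FROM '/tmp/data.csv' USING DELIMITERS '|';");
        if (hdl == NULL || mapi_error(mid))
            mapi_explain(mid, stderr);

        if (hdl) mapi_close_handle(hdl);
        mapi_destroy(mid);
        return 0;
    }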
I am running MonetDB on Windows with 8 GB of RAM.
That should be more than enough ;)
What is Stethoscope?
It is part of the source distribution: a Linux utility that picks up a stream of tuples from the server.

The information you provide is hard to track down to a possible cause. A COPY operation most likely reads the complete file into its buffers before processing it. One pitfall you may have stumbled upon is the following: did you indicate the number of records that you are about to copy into the table? If not, then the system has to guess and will repeatedly adjust this guess, which involves quite some overhead. Please use: COPY 50000 RECORDS INTO .....
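For example (the table and file names here are made up), a complete statement with the record count looks like this:

    COPY 50000 RECORDS INTO mytable
    FROM '/tmp/data.csv'
    USING DELIMITERS '|','\n';

With the count given up front, the system does not have to guess and repeatedly adjust the size of the target columns while loading.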
Do you happen to be a Ruby-on-Rails expert?
I am not a Ruby on Rails expert, sorry; I am a Python guy :)
Thanks again.
Martin Kersten wrote:
uriel katz wrote:
I have an application with constant bulk inserts (about 50,000 rows of 1 KB each) every minute. I am wondering which is the fastest and most efficient way to insert data.
COPY INTO is the fastest.
I am now using the COPY command, but it is kind of awkward since I need to dump my data as CSV and then execute COPY to insert it. Is there any streaming method or a special API for bulk inserts?
I think it is possible to inline the tuples into the SQL stream as well. The SQL developer will answer this one.
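As an untested sketch of that idea: with COPY ... FROM STDIN the records follow the statement in the same SQL stream, so something along these lines might avoid the intermediate CSV file entirely. The table name and connection details are placeholders, and whether the server accepts the records in the same query buffer as the statement is an assumption you would need to verify:

    #include <stdio.h>
    #include <mapi.h>

    int main(void)
    {
        /* Connection details are placeholders, as in the earlier sketch. */
        Mapi mid = mapi_connect("localhost", 50000, "monetdb", "monetdb",
                                "sql", "demo");
        if (mid == NULL || mapi_error(mid)) {
            if (mid) mapi_explain(mid, stderr);
            return 1;
        }

        /* Three records inlined right after the COPY statement, using
         * the default '|' field and '\n' record separators. */
        MapiHdl hdl = mapi_query(mid,
            "COPY 3 RECORDS INTO mytable FROM STDIN;\n"
            "1|hello\n"
            "2|world\n"
            "3|monetdb\n");
        if (hdl == NULL || mapi_error(mid))
            mapi_explain(mid, stderr);

        if (hdl) mapi_close_handle(hdl);
        mapi_destroy(mid);
        return 0;
    }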
Also, when I issue a COPY command it uses only a little CPU, around 2% (it is a quad-core setup with Windows, so I guess this means 8% of one CPU), and it keeps loading and releasing memory (I have 8 GB of RAM) even though it doesn't get near 2 GB. Is this ok? What is it actually doing?
We have recently upgraded our bulk loader to utilize as many cores as possible. This is not yet in the release. A preview of the effects can be seen at http://monetdb.cwi.nl/projects/monetdb//SQL/Benchmark/TPCH/
P.S.: Are inserts/selects multi-threaded?
Thanks for this awesome piece of software!
-Uriel