Andreas Streichardt wrote:
Hi,
Hi Andreas,
Thank you for taking the time to experiment and, more importantly, to report back. There has been a lot of work on the loader, and its importance for users getting started has our full attention. An external evaluation of MonetDB using extensive copying is reported in http://www.mysqlperformanceblog.com/2009/09/29/quick-comparison-of-myisam-in... and http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-perform... and up-to-date information on the same experiment is in http://www.cwi.nl/~mk/ontimeReport
I had a look at MonetDB a while (a year or so) ago and stumbled upon a few crashes/memory issues back then. Apart from that, it made a very promising impression on me.
Today I thought I could give it another try to see if these issues have been fixed, and found that copy into still seems to have memory issues. I wonder if I am the only one who hits such issues.
The first and most important question is: which version exactly did you use? (mserver5 --version)
I am experimenting with dummy data created by a script.
Data is created the following way:
20 smallints: Values 1-10
100 tinyints: Values 0-1
20 ints: Values 1-10000
40 varchars: md5 of some random value
20 floats: rand(1000000)/1000000
I created a CSV file with ~2.1 million rows and then tried to insert the data.
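A script along the following lines could produce comparable dummy data. It is only a sketch: the original script was not posted, so the column order, the function names `make_row` and `write_csv`, and the exact random ranges are assumptions based on the column list above.

```python
import csv
import hashlib
import io
import random


def make_row(rng):
    """Build one 200-column row; the column order is an assumption."""
    smallints = [rng.randint(1, 10) for _ in range(20)]
    tinyints = [rng.randint(0, 1) for _ in range(100)]
    ints = [rng.randint(1, 10000) for _ in range(20)]
    # md5 hex digest of some random value, as described in the post
    varchars = [
        hashlib.md5(str(rng.random()).encode()).hexdigest() for _ in range(40)
    ]
    floats = [rng.randrange(1000000) / 1000000 for _ in range(20)]
    return smallints + tinyints + ints + varchars + floats


def write_csv(out, n_rows, seed=0):
    """Write n_rows comma-separated rows, one record per line, to a text stream."""
    rng = random.Random(seed)
    writer = csv.writer(out, delimiter=",", lineterminator="\n")
    for _ in range(n_rows):
        writer.writerow(make_row(rng))


if __name__ == "__main__":
    # Small preview; the post used roughly 2.1 million rows.
    preview = io.StringIO()
    write_csv(preview, 2)
    print(preview.getvalue())
```

Writing to a stream rather than a fixed path keeps the sketch easy to try with a handful of rows before generating the full file.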
My first attempt was:
sql>copy into md_data FROM '/home/mop/performance/hans.csv' delimiters ',','\n' null as '';
MAPI = monetdb@localhost:50000
ACTION= read_line
QUERY = copy into md_data FROM '/home/mop/performance/hans.csv' delimiters ',','\n' null as '';
ERROR = Connection terminated
This was of course the quick and dirty way. What happened in the background was that MonetDB consumed all memory. My system is a Debian lenny 64bit. Server has 4GB and decent CPUs. Why does it need all my memory here?
Did you kill the connection or the server? If the server crashed, then the input file would be interesting for us to look at. Quote from the manual: "It is strongly adviced to announce the number of records to be inserted. It avoids guessing by the server and subsequent re-allocation of table space, which involves potential copying." Furthermore, MonetDB uses whatever memory there is. The tables constructed are memory-mapped files.
I then did the same but loaded only parts of the file (1000000 rows).
sql>copy 1000000 records into md_data FROM '/home/mop/performance/hans.csv' delimiters ',','\n' null as '' ;
Same issue. It will have to reallocate space, which may involve copying. (at least on the released versions)
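For reference, the record count that the manual asks for can be obtained before loading and placed in the statement. The following is a minimal sketch, assuming a plain CSV with one record per line; `count_records` and `copy_statement` are hypothetical helper names, and the generated statement simply mirrors the one used in this thread.

```python
def count_records(path):
    """Count records in a file with one record per line.

    Reads in binary chunks so the whole file is never held in memory.
    """
    count = 0
    last = b"\n"
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            count += chunk.count(b"\n")
            last = chunk[-1:]
    # A final record without a trailing newline still counts.
    if last != b"\n":
        count += 1
    return count


def copy_statement(table, path, n_records):
    """Build a COPY statement that announces the number of records."""
    return (
        f"copy {n_records} records into {table} "
        f"FROM '{path}' delimiters ',','\\n' null as '';"
    )
```

The resulting string can be pasted at the sql> prompt. Whether announcing the count avoids all re-allocation depends on the server version, as noted above.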
This worked, but the server used ~2-3GB of my RAM during the process and, even more importantly, it kept using the same amount after the copy process had finished. Why does it hold on to the RAM even though it is no longer needed?
Tables are retained in memory until they are no longer needed and are flushed. Persistent tables therefore likely stay around.
Then I tried to insert 1000000 more rows, and my expectation was that MonetDB would reuse the memory it allocated (and didn't free) in the first loading process.
This wasn't the case. Again MonetDB consumed all my memory, and when loading finished it indicated that it was still doing something (CPU @ ~20%).
Log records have to be written, your OS may kick in to flush dirty pages, reload swapped-out processes, ...
When I tried a:
sql>SELECT COUNT(*) FROM md_data;
it hung completely. The same happened when trying to establish a new connection.
This is not what I would expect. Did you accidentally run the query as another parallel user, or in the same session as the copy command?
I stopped the server and restarted again. Everything was fine again.
good :)
I inserted the rest of the data (memory rose again but there were only 100000 rows left so there wasn't any problem this time).
Am I the only one hitting such issues? One could argue that I should import fewer records at once, but even then there seems to be some nasty memleak in the background. Or is this a configuration issue? Or am I doing something completely wrong?
Memory leaks have been addressed in the Aug release. We are not aware of serious ones and would need to redo the precise experiment to confirm it. What you see is memory fragmentation due to re-allocation of the BATs in memory.
I think your expectations are not yet aligned with what MonetDB does differently from other systems.
I am using the .debs: http://monetdb.cwi.nl/downloads/Debian/
Thanks in advance,
Thanks to you for sharing this information,
Martin Kersten
Andreas Streichardt