[Monetdb-developers] mclient mem-usage during --dump
Hello,

Question: is there any reason for mclient to use (large) amounts of memory during a dump of an SQL database?

syntax used: $ mclient -lsql -D -dsomedatabase > dump.sql

I observe >12 GB of resident memory use when dumping a 2 GB (in dump text format) database (it steadily grows), using the May2009 stable branch (of last week).

Top shows: 28371 walink 16 0 12.2g 12g 2944 R 87 4.0 10:48.58 mclient

I haven't investigated it any further, but I was first of all wondering whether it actually needs these amounts of memory?

Greetings,
Wouter
Hello,
I had a look at the code just now... looking for why so much memory
was used (I think mclient was using 100 GB of memory in the end).
I am not familiar with the mapiclient, but perhaps the following
diff is a solution?
Index: src/mapiclient/MapiClient.mx
===================================================================
RCS file: /cvsroot/monetdb/clients/src/mapiclient/MapiClient.mx,v
retrieving revision 1.141
diff -u -r1.141 MapiClient.mx
--- src/mapiclient/MapiClient.mx 19 May 2009 12:02:59 -0000 1.141
+++ src/mapiclient/MapiClient.mx 27 May 2009 22:25:24 -0000
@@ -2048,7 +2048,7 @@
 		fprintf(stderr,"%s\n",mapi_error_str(mid));
 		exit(2);
 	}
-	mapi_cache_limit(mid, -1);
+	/* mapi_cache_limit(mid, -1); */
 	if (dump) {
 		if (mode == SQL) {
 			dump_tables(mid, toConsole, 0);
This seems to work for me (at least mclient's memory consumption now
remains constant), but I can't foresee all the consequences.
Could somebody perhaps say something sensible about it?
Reasoning behind it: this call to mapi_cache_limit sets rowlimit == -1,
and this, together with cacheall == 0, makes mapi_extend_cache (in
Mapi.mx) allocate more memory each time it is called, so the cache
becomes as large as the largest table.
Without the call "mapi_cache_limit(mid, -1);" the default rowlimit of
100 lines applies, so with this change the cache gets flushed every
100 lines.
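To make the cache behaviour concrete, here is a minimal fetch-loop
sketch (not the MapiClient.mx code; the connection settings and the
table name are made up) where a positive row limit keeps the
client-side cache bounded, while -1, as used before a dump, lets it
grow to the size of the whole result:

/* sketch.c: stream a large result with a bounded client-side cache */
#include <stdio.h>
#include <stdlib.h>
#include <mapi.h>

int
main(void)
{
	/* hypothetical connection settings, just for illustration */
	Mapi mid = mapi_connect("localhost", 50000, "monetdb", "monetdb",
				"sql", "demo");
	MapiHdl hdl;

	if (mid == NULL || mapi_error(mid)) {
		fprintf(stderr, "connect failed: %s\n",
			mid ? mapi_error_str(mid) : "out of memory");
		exit(2);
	}

	/* keep at most 100 rows cached on the client; with -1 the cache
	 * would instead grow to hold the complete result set */
	mapi_cache_limit(mid, 100);

	hdl = mapi_query(mid, "SELECT * FROM sys.largetable;");
	if (hdl == NULL || mapi_error(mid)) {
		fprintf(stderr, "query failed: %s\n", mapi_error_str(mid));
		exit(2);
	}
	while (mapi_fetch_row(hdl)) {
		/* only the first column, just to show the fetch loop */
		const char *val = mapi_fetch_field(hdl, 0);
		printf("%s\n", val ? val : "NULL");
	}
	mapi_close_handle(hdl);
	mapi_destroy(mid);
	return 0;
}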
I think I should have filed a bug :)
Wouter
p.s. While investigating this issue I tried to limit the amount of
memory that mclient would get using "ulimit -v $((256*1024))". This
revealed that there are a number of places in Mapi.mx where a
(m)alloc call goes unchecked. I don't know the MonetDB coding policy
here, but perhaps they should all at least have an accompanying
assert? The following one-liner in the clients package reveals some
of the issues:
$ grep "alloc(" -A2 src/mapilib/Mapi.mx
The log message for that line was (yes, just that single line):

"Fix a potential deadlock. Suppose you have a largish file with SQL queries and that those queries are sent to the server in 8K (or so) chunks (as is done in the doFile function). Also suppose that many of those SQL queries produce a significant number of rows.

What could happen in this situation is that the first SQL query to produce lots of output is executed. Only the first 100 or 10000 (depending on the age of your mclient) rows are produced. The server then continues on with the next query and produces its results (or the first 100/10000 rows). mclient consumes the rows of the first query and when at the end of the 100/10000 rows it needs to ask the server for more results for the query. This means that mclient has to read all results until it gets to a server prompt so that it can send the appropriate X command (X-export). However, chances are that the last part of the query set that mclient had sent ends in the middle of a query. This means that the server at this stage needs to read more of a query and cannot cope with an X command. mclient at the same time is somewhere deep in the Mapi library reading output from the server and it cannot send the rest of the queries that the server craves.

The solution is to not limit the output of the server to only the first 100/10000 rows of each query but to let the server send it all so that the client doesn't have to send X-export commands."

So I'm not going to change that the way you suggest.
-- Sjoerd Mullender
participants (2): Sjoerd Mullender, Wouter Alink