[MonetDB-users] multi-users and memory problems
Dear developers,

I'm currently running the CVS head of Dec 11, 2008. I apologize in advance for any missing details. If needed, I'll provide more.

Thank you, and lots of wonderful wishes for Xmas and for the New Year!!! (this is the last email you'll get from me this year, I promise :))

l.

Observation 1: When trying to run a large number of clients simultaneously, the server crashes. For me it crashed with 70 and with 100 clients fired at the same time.

Observation 2: After running one query in a multi-user scenario (N clients/threads simultaneously), the memory footprint of mserver (% of the memory that the server occupies) grew. After repeating the experiment M times, the memory grew each time by a constant amount. Running the same query sequentially the same number of times leaves the footprint of mserver constant. Could it be faulty memory cleanup?

Query: q3.xq

    let $col := fn:collection("MotiesTweedeKamer")
    let $years := fn:distinct-values(
        for $date in $col//hiddendatum
        return fn:substring(fn:string($date),1,4))
    for $y in $years
    order by $y ascending
    return <result year="{$y}" count="{ count($col//document[fn:substring(fn:string(.//hiddendatum),1,4) = $y]) }"/>

N=50
M=1
    $ perl runNclients.pl N=50
    $ top
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    28563 lafanasi 20  0 6266m 5.1g 100m S    0 26.0  10:50.71 Mserver

M=2
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    28563 lafanasi 20  0 7546m 5.9g 100m S    0 30.1  21:42.67 Mserver
...

N=60
M=10
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    31182 lafanasi 20  0 17.3g 8.3g 100m S    0 42.3 263:45.37 Mserver
...

M=30
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    31182 lafanasi 20  0 22.7g 9.6g 100m S    0 48.8 395:08.87 Mserver

Sequential run:

N=1
M=50 times
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    28372 lafanasi 20  0  839m 430m 100m S    1  2.1   6:26.65 Mserver

M=100 times
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    28372 lafanasi 20  0  839m 430m 100m S    1  2.1   6:26.65 Mserver

Observation 3: The time it takes to run 10 Tijah search queries simultaneously is the same as the time it takes to run them sequentially. The percentage of CPU used by mserver is also similar in both cases. Does Tijah support multiple users?

Query: q2.xq

    let $opt := <TijahOptions ft-index="polietiekedata" ir-model="NLLR"/>
    let $c := collection("HAN")
    let $qid := tijah:query-id($c, "//spreker[about(.,%KEYWORD%)]", $opt)
    for $res in tijah:nodes($qid)
    return <pair>{( string($res/@naam), tijah:score($qid, $res))}</pair>

20 users/threads
stopMserver, startMserver (to make sure that the server freed the memory)
    $ time Run20Threads(mclient -lx q2.xq)
    0.134u 0.118s 0:20.01 1.1% 0+0k 0+8io 0pf+0w

sequential run, 20 times
stopMserver, startMserver
    $ time for i in `seq 1 10`; do mclient -lx q2.xq; done
    0.060u 0.067s 0:19.74 0.6% 0+0k 0+8io 0pf+0w

Observation 4: The mserver memory footprint grows very fast when running Tijah queries. When the footprint reaches 98%, query processing gets really slow or the server crashes.

Query: the same as above

Sequential run, M=90 times
    $ stopMserver, startMserver
    $ for i in `seq 1 10`; do mclient -lx q2.xq; done
    $ top
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    24527 lafanasi 20  0 26.2g  19g  80m S   66 98.0   4:24.57 Mserver
Dear Loredana,

thank you for your detailed report. Please find some initial comments (and questions ;-)) below.

With some additional information (see my questions below), each of your 4 observations should become a bug report --- that way, both (interested) users (you and others) and (involved) developers can easily follow the discussion and the progress in analysing and fixing them...

All the best wishes for Xmas & the New Year!

Stefan

On Sun, Dec 21, 2008 at 12:26:48AM +0100, Loredana Afanasiev wrote:
Dear developers,
I'm currently running the CVS head of Dec 11, 2008. I apologize in advance for any missing details. If needed, I'll provide more.
Thank you, and lots of wonderful wishes for Xmas and for the New Year!!! (this is the last email you'll get from me this year, I promise :)) l.
Observation 1: When trying to run a large number of clients simultaneously, the server crashes. For me it crashed with 70 and with 100 clients fired at the same time.
How does the server crash? Does it simply stop, or does it segfault, and does it print any (error) message before crashing? (I assume you tested XQuery clients only?)

For this (and Observation 3 below): since you did not say otherwise, may I assume that you did not set the "mapi_clients" option when starting Mserver (either in its MonetDB.conf or on its command line via `set mapi_clients=...`)? See also http://monetdb.cwi.nl/XQuery/Documentation/mclient-Options.html

In fact, I am not sure whether/how the "mapi_clients" option is still used in the server --- one of our MAPI experts should check and comment on this ...
Observation 2: After running one query in a multi-user scenario (N clients/threads simultaneously), the memory footprint of mserver (% of the memory that the server occupies) grew. After repeating the experiment M times, the memory grew each time by a constant amount. Running the same query sequentially the same number of times leaves the footprint of mserver constant. Could it be faulty memory cleanup?
Yes, this could indeed be a memory leak --- we will have to investigate this in detail ...
Query: q3.xq

    let $col := fn:collection("MotiesTweedeKamer")
    let $years := fn:distinct-values(
        for $date in $col//hiddendatum
        return fn:substring(fn:string($date),1,4))
    for $y in $years
    order by $y ascending
    return <result year="{$y}" count="{ count($col//document[fn:substring(fn:string(.//hiddendatum),1,4) = $y]) }"/>
N=50
M=1
    $ perl runNclients.pl N=50
    $ top
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    28563 lafanasi 20  0 6266m 5.1g 100m S    0 26.0  10:50.71 Mserver

M=2
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    28563 lafanasi 20  0 7546m 5.9g 100m S    0 30.1  21:42.67 Mserver
...

N=60
M=10
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    31182 lafanasi 20  0 17.3g 8.3g 100m S    0 42.3 263:45.37 Mserver
...

M=30
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    31182 lafanasi 20  0 22.7g 9.6g 100m S    0 48.8 395:08.87 Mserver

Sequential run:

N=1
M=50 times
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    28372 lafanasi 20  0  839m 430m 100m S    1  2.1   6:26.65 Mserver

M=100 times
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    28372 lafanasi 20  0  839m 430m 100m S    1  2.1   6:26.65 Mserver
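To put a number on the "constant" growth per round, the RES column of the top snapshots above can be turned into a per-round delta. The following is only a rough sketch: it assumes top's usual suffixes (`m` for MiB, `g` for GiB, no suffix for KiB), and the helper names are made up for illustration. With only two snapshots per configuration it gives a crude average, not a measured leak rate.

```python
def parse_top_size(field: str) -> float:
    """Convert a top(1) memory field such as '6266m' or '5.1g' to MiB.

    Assumption: suffixes follow the common procps convention
    (k = KiB, m = MiB, g = GiB, t = TiB; no suffix means KiB).
    """
    units = {"k": 1 / 1024, "m": 1.0, "g": 1024.0, "t": 1024.0 * 1024.0}
    suffix = field[-1].lower()
    if suffix in units:
        return float(field[:-1]) * units[suffix]
    return float(field) / 1024


def growth_per_round(samples):
    """Average growth in MiB per round between the first and last sample.

    samples: list of (round_number, size_field) pairs, ordered by round.
    """
    (first_round, first_size), (last_round, last_size) = samples[0], samples[-1]
    delta = parse_top_size(last_size) - parse_top_size(first_size)
    return delta / (last_round - first_round)


# RES values copied from the top snapshots quoted above.
res_n50 = [(1, "5.1g"), (2, "5.9g")]      # N=50 concurrent clients
res_n60 = [(10, "8.3g"), (30, "9.6g")]    # N=60 concurrent clients
res_seq = [(50, "430m"), (100, "430m")]   # N=1, sequential runs

for label, samples in [("N=50", res_n50), ("N=60", res_n60), ("N=1", res_seq)]:
    print(f"{label}: {growth_per_round(samples):.1f} MiB of RES per round")
```

On the quoted numbers this comes out at roughly 0.8 GiB per round for N=50 and well under 0.1 GiB per round for N=60 (between M=10 and M=30), so more snapshots per configuration would help to tell whether the growth is really constant.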
Observation 3: The time it takes to run 10 Tijah search queries simultaneously is the same as the time it takes to run them sequentially. The percentage of CPU used by mserver is also similar in both cases. Does Tijah support multiple users?
I cannot say anything about the multi-user capabilities of PF/tijah, but since the pathfinder compiler itself is currently not yet thread-safe, MonetDB/XQuery can only *compile* one XQuery (to MIL) at a time (read: sequentially); the actual execution of the generated MIL plan then runs concurrently with both the translation of (one) other XQuery query to MIL and the execution of (possibly multiple) generated MIL plans.

In your case, each query seems to run for about a second; hence, the sequential compilation should not determine the total execution time (assuming that query compilation does not take a major fraction of a second ...)

However, parallel speed-up is of course limited by the amount of resources that your machine provides (and your query demands), i.e., are you running on a multi-CPU/multi-core machine? If so, how many cores? How large is the data that you run your query on? How much memory does your machine have? Is the data already "hot" (e.g., in the system's file-system cache) or is it initially loaded from disk? What are the (performance) characteristics of your I/O system? How much memory does Mserver use while executing a single instance of your query? Do the multi-query memory problems of Observation 2 also occur in this case?
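As a back-of-the-envelope illustration of this argument, the sketch below computes a lower bound on wall-clock time when compilation is serialized but execution may overlap. It is a simplified model, not a description of how Mserver schedules work: the function name and all numbers are hypothetical, and it ignores memory pressure, I/O, and any parallelism inside a single MIL plan.

```python
def estimate_wall_time(n_queries, compile_s, exec_s, cores):
    """Lower bound on wall-clock seconds for n identical queries.

    Assumptions (illustrative only):
    - compilation is strictly serialized (one query at a time),
    - execution of different queries can overlap freely,
    - at most `cores` units of work proceed at any moment.
    """
    # The last query cannot start executing before all compilations are done.
    serial_compile_bound = n_queries * compile_s + exec_s
    # The total work has to be shared by the available cores.
    work_bound = n_queries * (compile_s + exec_s) / cores
    return max(serial_compile_bound, work_bound)


# Hypothetical figures: 20 queries, 0.1 s to compile and 0.9 s to execute each.
for cores in (1, 4, 64):
    t = estimate_wall_time(20, 0.1, 0.9, cores)
    print(f"{cores:2d} cores: at least {t:.1f} s")
```

Under these made-up figures, the serialized compile phase only becomes the limiting factor once there are many cores; on a single core the concurrent run cannot beat the sequential one at all, which is why the questions about the hardware matter.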
Query: q2.xq

    let $opt := <TijahOptions ft-index="polietiekedata" ir-model="NLLR"/>
    let $c := collection("HAN")
    let $qid := tijah:query-id($c, "//spreker[about(.,%KEYWORD%)]", $opt)
    for $res in tijah:nodes($qid)
    return <pair>{( string($res/@naam), tijah:score($qid, $res))}</pair>
20 users/threads
stopMserver, startMserver (to make sure that the server freed the memory)
    $ time Run20Threads(mclient -lx q2.xq)
    0.134u 0.118s 0:20.01 1.1% 0+0k 0+8io 0pf+0w

sequential run, 20 times
stopMserver, startMserver
    $ time for i in `seq 1 10`; do mclient -lx q2.xq; done
    0.060u 0.067s 0:19.74 0.6% 0+0k 0+8io 0pf+0w

"10" or "20"? ;-) The heading says 20 runs, but the loop iterates over `seq 1 10`.
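To rule out such mismatches between the two measurements, both runs can be driven from one small harness that uses the same command and the same count. This is a minimal sketch with hypothetical helper names; the stand-in command below merely sleeps so the snippet runs anywhere, and on the real setup it would be replaced by the client call from this thread, e.g. `["mclient", "-lx", "q2.xq"]`.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def _run(cmd):
    """Run one client process, discard its output, return its exit code."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode


def time_sequential(cmd, n):
    """Run cmd n times, one after another; return (elapsed seconds, exit codes)."""
    start = time.perf_counter()
    codes = [_run(cmd) for _ in range(n)]
    return time.perf_counter() - start, codes


def time_concurrent(cmd, n):
    """Start n copies of cmd at once; return (elapsed seconds, exit codes)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        codes = list(pool.map(_run, [cmd] * n))
    return time.perf_counter() - start, codes


# Stand-in for the real client command (assumption: swap in mclient on the real setup).
demo_cmd = [sys.executable, "-c", "import time; time.sleep(0.2)"]
n_clients = 3

seq_elapsed, seq_codes = time_sequential(demo_cmd, n_clients)
con_elapsed, con_codes = time_concurrent(demo_cmd, n_clients)
print(f"sequential: {seq_elapsed:.2f} s, exit codes {seq_codes}")
print(f"concurrent: {con_elapsed:.2f} s, exit codes {con_codes}")
```

Collecting the exit codes alongside the timings also shows whether any client failed, which is relevant for the crashes described in Observation 1.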
Observation 4: The mserver memory footprint grows very fast when running Tijah queries. When the footprint reaches 98%, query processing gets really slow or the server crashes.
Query: the same as above
This might (hence) be one reason for your Observation 3 above.
Sequential run, M=90 times
    $ stopMserver, startMserver
    $ for i in `seq 1 10`; do mclient -lx q2.xq; done
    $ top
      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    24527 lafanasi 20  0 26.2g  19g  80m S   66 98.0   4:24.57 Mserver
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |