FW: Monetdb on Solaris/Sparc: high system time problem

20 Jan 2013

Dear Martin,

Thanks for your prompt reply. We have verified that the order in which the queries are executed doesn't affect the behavior of the query. The same problem arises at the same queries in any arbitrary order, which implies that it's related to the query itself. In addition, the queries' execution time and utilization are pretty deterministic.
We have also verified that there is plenty of memory available and there are no other processes competing for resources (i.e., disk bandwidth, memory).

We have used DTrace to profile which functions MonetDB executes in each of the two phases (normal and problematic). In the normal phase, which lasts a few tens of seconds and behaves like what we see on Linux, MonetDB mainly executes the following functions:

- BUNfastins
- re_match_no_ignore
- strstr

When MonetDB enters the second (problematic) phase, the User/System time is 4% to 21% for a couple of minutes without any disk activity or swapping. In this phase, MonetDB mainly executes the following functions:

- MT_sleep_ms
- memset
- __pollsys
- _pollsys
- _save_nv_regs
- pselect
- select
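
One possible reading of this second list is a thread that waits by repeatedly sleeping in a timed select call. The sketch below illustrates that pattern only; it assumes that MT_sleep_ms is built on select() with a timeout and no file descriptors, which is not confirmed anywhere in this thread, and the names sleep_ms and idle_poll are hypothetical.

```python
import select


def sleep_ms(ms):
    """Sleep for roughly `ms` milliseconds via select() with a timeout.

    Assumption: this mirrors how a select-based sleep helper could work
    on a POSIX system. It is not MonetDB's actual code.
    """
    # No file descriptors are watched, so select() simply times out.
    select.select([], [], [], ms / 1000.0)


def idle_poll(is_done, interval_ms=10, max_polls=100):
    """Check is_done() and sleep between checks; return the number of sleeps."""
    polls = 0
    while not is_done() and polls < max_polls:
        sleep_ms(interval_ms)  # one system call per iteration
        polls += 1
    return polls
```

Each pass through such a loop enters the kernel without doing any disk I/O, so a profile taken while threads wait this way would be expected to show select- and poll-related frames rather than query-processing functions.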

Do these function names help you to understand the problematic behavior?

Thanks in advance,
Javier

________________________________
From: Martin Kersten <martin@monetdb.org>
Subject: Re: Monetdb on Solaris/Sparc: high system time problem
Date: January 17, 2013 6:56:00 PM GMT+01:00
To: <users-list@monetdb.org>
Reply-To: Communication channel for MonetDB users <users-list@monetdb.org>

Dear Javier,

Thank you for your report. It is hard to assess the consequences/impact of your
particular setup. Your report suggests that something else was also running on your
platform during this sequence, at that particular time. It could mean that MonetDB
was fighting with another process over resources, i.e. memory and disk bandwidth.
It could also indicate a thrashing kernel on shared kernel resources.

If the problem persists during repetitive runs under controlled system settings
and it is immune to query order, then we might look further into the issue.

regards, Martin

On 1/17/13 3:44 PM, Javier Picorel wrote:
Dear all,

I'm running MonetDB (Dec2011-SP2) on Sparc Solaris 10 u9. I'm testing the TPC-H queries using SF10 and SF100 as the scale factors. My machine has 4 cores and 32GB of RAM and my farm is on a disk array.

Unfortunately, some queries (e.g., Qry 7, 8, 9) enter a phase in which the User/System time is 4% to 21% for a couple of minutes. Interestingly, the same issue doesn't show up on a Linux machine with a similar setup, where the queries' system time is pretty low (<5%).

Interestingly, vmstat output doesn't show any disk activity during the phase in which the process is just in system time; there are only high values in the "re" and "mf" fields. There is no swapping going on ("po" and "pi" values are 0 in vmstat).
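
The vmstat check described above can be sketched as a small filter. This is a minimal sketch, assuming the usual Solaris vmstat column layout with a header line that names "re", "mf", "pi" and "po"; the function names and the mf_threshold value are hypothetical and only mark what might count as "high".

```python
def parse_vmstat(lines):
    """Parse vmstat text lines into one dict per sample, keyed by column name."""
    header = None
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if "re" in fields and "mf" in fields:
            header = fields  # column-name line (vmstat repeats it periodically)
            continue
        if header is None or len(fields) != len(header):
            continue  # group-title line or truncated output
        if not all(f.lstrip("-").isdigit() for f in fields):
            continue
        # "sy" occurs twice in the header (faults and cpu); the later
        # column wins here, so row["sy"] is the CPU system-time percentage.
        rows.append(dict(zip(header, (int(f) for f in fields))))
    return rows


def fault_only_intervals(rows, mf_threshold=1000):
    """Return samples with many minor faults but no page-ins or page-outs."""
    return [
        r for r in rows
        if r["pi"] == 0 and r["po"] == 0 and r["mf"] >= mf_threshold
    ]
```

Feeding captured vmstat output through parse_vmstat and then fault_only_intervals would list the intervals that match the pattern described here: reclaim and minor-fault activity without any paging to or from disk.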

These are the parameters of the DB:

type default database
shared default yes
nthreads default 4
optpipe default_pipe
master default no
slave default <unknown>
readonly default no
nclients default 64

I test the queries by running just one at a time. How can I mitigate this OS time?

Thanks in advance,
Javier

_______________________________________________
users-list mailing list
users-list@monetdb.org
http://mail.monetdb.org/mailman/listinfo/users-list