Re: [MonetDB-users] Strange disk I/O behavior

11 Jan 2012

      Hi Viktor,

unfortunately, we did not manage to dive deeper into these problems, yet.
I hope to find some time by end of next week --- too much travelling and
other obligations in between.

Wrt. to the I/O, I did locate and fix some issues in the Dec2011 branch;
cf.,
http://dev.monetdb.org/hg/MonetDB/rev/dc8066d67faa
http://dev.monetdb.org/hg/MonetDB/rev/bb56535aa6df
http://dev.monetdb.org/hg/MonetDB/rev/d30146b4d610
http://dev.monetdb.org/hg/MonetDB/rev/cf2ff66d08fa

It would be great, if you could test your scenario(s) with the latest
Dec2011 branch, either from HG or from
http://monetdb.cwi.nl/testweb/web/42751:3ab84718f7e1/
or
http://monetdb.cwi.nl/testweb/web/MonetDB-Dec2011-latest.tar.bz2
and report.

(Dec2011 has not yet been released, yet, but we hope to do so still this
month.)

Please note that efficient handling of "BETWEEN SYMMETRIC" has not been
implemented, yet; cf.,
http://bugs.monetdb.org/show_bug.cgi?id=2945

regards,
Stefan

On Wed, Jan 11, 2012 at 11:19:25AM +0100, Viktor Rosenfeld wrote:
...
Hi Stephan,
has there been any progress on this problem? Or is there something I can
do to move it along?
Also, I still don't fully understand why the data is written to disk at
all especially since it would easily fit into main memory. Is it because
BATs are memory-mapped to files?
Cheers,
Viktor
Stefan Manegold wrote:
...
Hi Viktor,
thanks for the detailed report. I'll read it later once I find the time;
might be only end of next week, though ...
With "point predicates" I refer to predicates the check for equality with a
scalar value: column-expression = scalar-value, i.e., predicates that can
benefit from a hash index.
And my first guess with your Q8 and the inequality (self-?)join(?) is that this
creates almost a Cartesian product of the one/two inputs, with large inputs,
this can indeed become hugh ...
More hopefully soon,
Stefan
On Fri, Dec 16, 2011 at 04:06:58PM +0100, Viktor Rosenfeld wrote:
...
Hi,
Stefan Manegold wrote:
...
Ah, I now releaize that you have point predicates in you query/ies, right?
In that case, mitosis does not trigger since we want to (build and) exploit
hash indices in the base column(s).
What do you mean by point predicates?
...
Briefly looking at your plans, I notice that they contain BAT iterators in
MAL; that's why the traces blow up.
I don't have the time right now to explain what that means (the "experts
will know" ;-)).
It appears that they are triggered by the "between SYMMETRIC" construct.
I filed a bug report / feature request abut this:
http://bugs.monetdb.org/show_bug.cgi?id=2945
For now, if you would be able to omit the "SYMMETRIC", I expect at least the
traces not to blow up. I cannot tell, whether also the memory usage / IO
explosion would be avoided by this ...
I have done a more detailed analysis with the 3 problematic queries. (They are
called 7 through 9 in my benchmark set and I kept that numbering for simplicity.)
I've uploaded the original SQL queries:
http://www.informatik.hu-berlin.de/~rosenfel/monet/q7-count.sql
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-count.sql
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-count.sql
All three queries fill up the hard disk on the large dataset. Only queries 7
and 9 use a BETWEEN SYMMETRIC predicate. Query 8 uses an inequality predicate
on node.id (see below).
The table below shows their evaluation time and disk I/O behavior on the small
dataset.
| Query | Time (ms) | Data written |
|-------+-----------+--------------|
| 7     |       962 |           6M |
| 8     |       203 |          92k |
| 9     |      9646 |         100M |
As per Stefan's suggestion, I removed the SYMMETRIC predicate. I also tried 2
other strategies:
- rewriting the BETWEEN predicate with an equivalent >= and <= comparison and
- substituting the BETWEEN predicate with a single < comparison.
(The last strategy is not semantically equivalent, but nevertheless correct for
my application for these queries. However, I use the BETWEEN predicate in other
places where this option is not available.)
The next table shows the evaluation times (in ms) on the small dataset for
query 7 and 9 using these 4 strategies:
| Query | BETWEEN SYMMETRIC | BETWEEN | >= AND <= |   < |
|-------+-------------------+---------+-----------+-----|
|     7 |              1002 |     342 |       335 |  66 |
|     9 |              9908 |    6040 |      5990 | 185 |
The next table shows the disk I/O behavior of query 9 using these 4 strategies
on the small dataset.
| Strategy          | Read (MB) | Write (MB) |
|-------------------+-----------+------------|
| BETWEEN SYMMETRIC |         0 |        100 |
| BETWEEN           |         0 |         84 |
| >= AND <=         |         0 |         84 |
| <                 |         0 |          0 |
Using BETWEEN and using >= and <= has the same runtime behavior. Indeed, the
MAL plans are identical. I have uploaded the MAL plans for query 9 for all 4
strategies.
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-with-mitosis.plan (BETWEEN SYMMETRIC)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-no-symmetric.plan (BETWEEN)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-workaround.plan (>= AND <=)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-no-precedence-restrict... (<)
Without the SYMMETRIC predicate the trace do not explode any longer. I have
uploaded these for query 9.
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-no-symmetric.trace (BETWEEN)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-workaround.trace (>= AND <=)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-no-precedence-restrict... (<)
Out of curiosity, I also generated plans for a much simpler query. Starting
from the query below, I generated MAL plans for all 4 strategies:
SELECT 
    count(*)
  FROM
    (
      SELECT 
        node1.id AS id1, node2.id AS id2, node1.toplevel_corpus
      FROM
        node AS node1,
        node AS node2
      WHERE
        node1.right_token BETWEEN SYMMETRIC node2.left_token - 1 AND node2.left_token - 50 AND
        node1.text_ref = node2.text_ref
    ) AS solutions;
http://www.informatik.hu-berlin.de/~rosenfel/monet/between-symmetric.plan (BETWEEN SYMMETRIC)
http://www.informatik.hu-berlin.de/~rosenfel/monet/between-no-symmetric.plan (BETWEEN)
http://www.informatik.hu-berlin.de/~rosenfel/monet/between-workaround.plan (>= AND <=)
http://www.informatik.hu-berlin.de/~rosenfel/monet/no-precedence-restriction... (<)
Interestingly, the BETWEEN variant and the >= AND <= variant do not produce the
same MAL plan for the simple query as they do for query 9.
Without the SYMMETRIC predicate, queries 7 and 9 still fill up my disk when I
evaluate them on the large dataset. However, using only the < comparison they
finish. The next table shows their runtime and disk I/O behavior for the <
strategy.
| Query | Time (ms) | Read (MB) | Written (MB) |
|-------+-----------+-----------+--------------|
|     7 |       990 |         0 |            3 |
|     9 |     56903 |       266 |         1621 |
There was considerable variation in the evaluation times, which I usually do
not observe for other queries.
- Query 7 variation: 685 - 1833
- Query 9 variation: 48253 - 64511
There was also some variation in the amount of disk I/O for query 9:
- Query 9 reads: 257, 296, 254, 237, 285
- Query 9 writes: 1736, 1545, 1545, 1736, 1545
(However, due to the long runtime, there might have been competing I/O from
other processes.)
Query 8 does not use a BETWEEN SYMMETRIC predicate, but does a few comparisons
of the type nodeX.id <> nodeY.id where nodeX and nodeY are aliases for the node
table. This seems to be the reason for filling up the disk. If I remove these
predicates, query 8 finishes on the large dataset, although it still generates
quite a lot of I/O:
| Query | Time (ms) | Read (MB) | Written (MB) |
|-------+-----------+-----------+--------------|
|     8 |      6443 |         0 |          107 |
I've uploaded the traces and plans for query 8 with and without the <>
comparisons (the traces are from the small dataset):
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-no-identical-sibling.p... (with <>)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-no-identical-sibling.t... (with <>)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-identical-sibling.plan (without <>)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-identical-sibling.trac... (without <>)
I think the next step would be to run stethoscope on the query variations
(except BETWEEN SYMMETRIC) that still fill up the disk and report the last few
statements before the crash. Would this be helpful?
I'm still puzzled though why data is being written to disk in the first place.
The query runtime appears to be roughly proportional to the amount written
(which isn't surprising).
Thanks,
Viktor
------------------------------------------------------------------------------
Learn Windows Azure Live!  Tuesday, Dec 13, 2011
Microsoft is holding a special Learn Windows Azure training event for 
developers. It will provide a great way to learn Windows Azure and what it 
provides. You can attend the event by watching it streamed LIVE online.  
Learn more at http://p.sf.net/sfu/ms-windowsazure
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- 
| Stefan.Manegold @ CWI.nl | DB Architectures (INS1) |
| http://CWI.nl/~manegold/ | Science Park 123 (L321) |
| Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam  (NL) |
------------------------------------------------------------------------------
Learn Windows Azure Live!  Tuesday, Dec 13, 2011
Microsoft is holding a special Learn Windows Azure training event for 
developers. It will provide a great way to learn Windows Azure and what it 
provides. You can attend the event by watching it streamed LIVE online.  
Learn more at http://p.sf.net/sfu/ms-windowsazure
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- 
| Stefan.Manegold @ CWI.nl | DB Architectures (INS1) |
| http://CWI.nl/~manegold/ | Science Park 123 (L321) |
| Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam  (NL) |