Re: [MonetDB-users] Strange disk I/O behavior

16 Dec 2011

Hi,

Stefan Manegold wrote:
...
Ah, I now realize that you have point predicates in your query/ies, right?
In that case, mitosis does not trigger since we want to (build and) exploit
hash indices in the base column(s).

What do you mean by point predicates?

...
Briefly looking at your plans, I notice that they contain BAT iterators in
MAL; that's why the traces blow up.
I don't have the time right now to explain what that means (the "experts
will know" ;-)).
It appears that they are triggered by the "between SYMMETRIC" construct.
I filed a bug report / feature request about this:
http://bugs.monetdb.org/show_bug.cgi?id=2945
For now, if you would be able to omit the "SYMMETRIC", I expect at least the
traces not to blow up. I cannot tell, whether also the memory usage / IO
explosion would be avoided by this ...

I have done a more detailed analysis of the 3 problematic queries. (They are
called 7 through 9 in my benchmark set, and I kept that numbering for simplicity.)

I've uploaded the original SQL queries:

http://www.informatik.hu-berlin.de/~rosenfel/monet/q7-count.sql
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-count.sql
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-count.sql

All three queries fill up the hard disk on the large dataset. Only queries 7
and 9 use a BETWEEN SYMMETRIC predicate. Query 8 uses an inequality predicate
on node.id (see below).

The table below shows their evaluation time and disk I/O behavior on the small
dataset.

| Query | Time (ms) | Data written |
|-------+-----------+--------------|
| 7     |       962 |           6M |
| 8     |       203 |          92k |
| 9     |      9646 |         100M |

As per Stefan's suggestion, I removed the SYMMETRIC predicate. I also tried 2
other strategies:

- rewriting the BETWEEN predicate with an equivalent >= and <= comparison and
- substituting the BETWEEN predicate with a single < comparison.

(The last strategy is not semantically equivalent, but nevertheless correct for
my application for these queries. However, I use the BETWEEN predicate in other
places where this option is not available.)

The next table shows the evaluation times (in ms) on the small dataset for
queries 7 and 9 using these 4 strategies:

| Query | BETWEEN SYMMETRIC | BETWEEN | >= AND <= |   < |
|-------+-------------------+---------+-----------+-----|
|     7 |              1002 |     342 |       335 |  66 |
|     9 |              9908 |    6040 |      5990 | 185 |

The next table shows the disk I/O behavior of query 9 using these 4 strategies
on the small dataset.

| Strategy          | Read (MB) | Write (MB) |
|-------------------+-----------+------------|
| BETWEEN SYMMETRIC |         0 |        100 |
| BETWEEN           |         0 |         84 |
| >= AND <=         |         0 |         84 |
| <                 |         0 |          0 |

BETWEEN and the >= AND <= rewrite show the same runtime behavior. Indeed, the
MAL plans are identical. I have uploaded the MAL plans for query 9 for all 4
strategies.

http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-with-mitosis.plan (BETWEEN SYMMETRIC)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-no-symmetric.plan (BETWEEN)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-workaround.plan (>= AND <=)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-no-precedence-restrict... (<)

Without the SYMMETRIC predicate the traces no longer explode. I have
uploaded these for query 9.

http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-no-symmetric.trace (BETWEEN)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-workaround.trace (>= AND <=)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q9-no-precedence-restrict... (<)

Out of curiosity, I also generated plans for a much simpler query. Starting
from the query below, I generated MAL plans for all 4 strategies:

  SELECT 
    count(*)
  FROM
    (
      SELECT 
        node1.id AS id1, node2.id AS id2, node1.toplevel_corpus
      FROM
        node AS node1,
        node AS node2
      WHERE
        node1.right_token BETWEEN SYMMETRIC node2.left_token - 1 AND node2.left_token - 50 AND
        node1.text_ref = node2.text_ref
    ) AS solutions;

http://www.informatik.hu-berlin.de/~rosenfel/monet/between-symmetric.plan (BETWEEN SYMMETRIC)
http://www.informatik.hu-berlin.de/~rosenfel/monet/between-no-symmetric.plan (BETWEEN)
http://www.informatik.hu-berlin.de/~rosenfel/monet/between-workaround.plan (>= AND <=)
http://www.informatik.hu-berlin.de/~rosenfel/monet/no-precedence-restriction... (<)

Interestingly, the BETWEEN variant and the >= AND <= variant do not produce the
same MAL plan for the simple query as they do for query 9.
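
As a rough illustration of how the four predicate variants relate
semantically (not how MonetDB plans them), here is a minimal Python sketch
that runs them against an in-memory SQLite table. Everything in it is an
assumption made for illustration: the rows are made up, SQLite stands in for
MonetDB, BETWEEN SYMMETRIC is emulated with MIN/MAX because SQLite does not
support it, and the < variant is assumed to be
node1.right_token < node2.left_token.

```python
import sqlite3

# Toy stand-in for the node table. Column names follow the simple query in
# this mail; the rows are invented and SQLite replaces MonetDB here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER, text_ref INTEGER, "
    "left_token INTEGER, right_token INTEGER)"
)
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?, ?)",
    [(1, 1, 0, 4), (2, 1, 5, 9), (3, 1, 60, 64), (4, 2, 5, 9)],
)

# Predicates on n1.right_token relative to n2.left_token.
PREDICATES = {
    # SQLite has no BETWEEN SYMMETRIC, so it is emulated with MIN/MAX.
    "between_symmetric": (
        "n1.right_token BETWEEN MIN(n2.left_token - 1, n2.left_token - 50) "
        "AND MAX(n2.left_token - 1, n2.left_token - 50)"
    ),
    # Plain BETWEEN needs the lower bound first.
    "between": "n1.right_token BETWEEN n2.left_token - 50 AND n2.left_token - 1",
    # Same bounds as in the SYMMETRIC query, but without SYMMETRIC: empty range.
    "between_reversed": "n1.right_token BETWEEN n2.left_token - 1 AND n2.left_token - 50",
    "ge_and_le": (
        "n1.right_token >= n2.left_token - 50 AND "
        "n1.right_token <= n2.left_token - 1"
    ),
    # Assumed form of the single < comparison; not equivalent to the others.
    "less_than": "n1.right_token < n2.left_token",
}

def count_pairs(predicate):
    """Count node pairs in the same text that satisfy the given predicate."""
    sql = (
        "SELECT count(*) FROM node AS n1, node AS n2 "
        f"WHERE {predicate} AND n1.text_ref = n2.text_ref"
    )
    return conn.execute(sql).fetchone()[0]

counts = {name: count_pairs(pred) for name, pred in PREDICATES.items()}
for name, value in counts.items():
    print(f"{name:18} {value}")
```

On this toy data the SYMMETRIC, BETWEEN and >= AND <= variants agree, the <
variant also matches pairs that are more than 50 tokens apart, and a plain
BETWEEN with the bounds in the original order matches nothing.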

Without the SYMMETRIC predicate, queries 7 and 9 still fill up my disk when I
evaluate them on the large dataset. However, using only the < comparison they
finish. The next table shows their runtime and disk I/O behavior for the <
strategy.

| Query | Time (ms) | Read (MB) | Written (MB) |
|-------+-----------+-----------+--------------|
|     7 |       990 |         0 |            3 |
|     9 |     56903 |       266 |         1621 |

There was considerable variation in the evaluation times, which I usually do
not observe for other queries.

- Query 7 variation: 685 - 1833 ms
- Query 9 variation: 48253 - 64511 ms

There was also some variation in the amount of disk I/O for query 9:

- Query 9 reads (MB): 257, 296, 254, 237, 285
- Query 9 writes (MB): 1736, 1545, 1545, 1736, 1545

(However, due to the long runtime, there might have been competing I/O from
other processes.)
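
The per-run I/O figures listed above can be summarized with a few lines of
Python. This is only a sketch over the five read and write values quoted in
this mail; the helper name summarize and the spread measure (max minus min,
relative to the mean) are choices made for illustration.

```python
from statistics import mean

# Per-run disk I/O for query 9 with the < strategy, in MB (from this mail).
reads_mb = [257, 296, 254, 237, 285]
writes_mb = [1736, 1545, 1545, 1736, 1545]

def summarize(samples):
    """Return rounded mean, min, max and relative spread of the samples."""
    avg = mean(samples)
    return {
        "mean": round(avg),
        "min": min(samples),
        "max": max(samples),
        # Spread as a percentage of the mean, one decimal place.
        "spread_pct": round((max(samples) - min(samples)) / avg * 100, 1),
    }

read_summary = summarize(reads_mb)
write_summary = summarize(writes_mb)
print("reads: ", read_summary)
print("writes:", write_summary)
print("distinct write volumes:", sorted(set(writes_mb)))
```

The rounded means (266 MB read, 1621 MB written) agree with the query 9 row
in the table above, and the write volume takes only two distinct values,
1545 and 1736 MB.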

Query 8 does not use a BETWEEN SYMMETRIC predicate, but does a few comparisons
of the type nodeX.id <> nodeY.id, where nodeX and nodeY are aliases for the node
table. These comparisons seem to be the reason the disk fills up. If I remove
them, query 8 finishes on the large dataset, although it still generates
quite a lot of I/O:

| Query | Time (ms) | Read (MB) | Written (MB) |
|-------+-----------+-----------+--------------|
|     8 |      6443 |         0 |          107 |

I've uploaded the traces and plans for query 8 with and without the <>
comparisons (the traces are from the small dataset):

http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-no-identical-sibling.p... (with <>)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-no-identical-sibling.t... (with <>)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-identical-sibling.plan (without <>)
http://www.informatik.hu-berlin.de/~rosenfel/monet/q8-identical-sibling.trac... (without <>)

I think the next step would be to run stethoscope on the query variations
(except BETWEEN SYMMETRIC) that still fill up the disk and report the last few
statements before the crash. Would this be helpful?

I'm still puzzled, though, as to why data is being written to disk in the
first place. The query runtime appears to be roughly proportional to the
amount written (which isn't surprising).
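
One way to put a number on "roughly proportional" is to divide the runtime
by the amount written for the query 9 measurements reported in this mail. A
minimal Python sketch; the helper name ms_per_mb is made up for illustration,
and the figures are copied from the query 9 tables above.

```python
# (label, time in ms, MB written) for query 9, copied from the tables above.
QUERY9_RUNS = [
    ("BETWEEN SYMMETRIC, small", 9908, 100),
    ("BETWEEN, small", 6040, 84),
    (">= AND <=, small", 5990, 84),
    ("<, small", 185, 0),
    ("<, large", 56903, 1621),
]

def ms_per_mb(time_ms, written_mb):
    """Return runtime per MB written, or None when nothing was written."""
    if written_mb == 0:
        return None
    return round(time_ms / written_mb, 1)

ratios = {label: ms_per_mb(t, w) for label, t, w in QUERY9_RUNS}
for label, ratio in ratios.items():
    print(f"{label:26} {ratio}")
```

With these figures the ratio lies between about 35 and 99 ms per MB, so the
relationship is only approximate; the < variant on the small dataset writes
nothing and therefore has no ratio.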

Thanks,
Viktor

Viktor Rosenfeld