Bug in Query Plan??

9 Jan 2015

      Hi,

I have been studying on multiple join performances in MonetDB. I am running
the latest version of MonetDB Oct2014 - SP1 (MonetDB-11.19.3) on a CentOS
machine with 128 GB RAM.

All my tables have 100 to 300 columns and around 35 lakh rows.

I have the following query with 5 joins and a few criteria.

SELECT * FROM table1
INNER JOIN table2 on table1.col1=table2.col1
LEFT JOIN table3 ON table1.col2=table3.col1
LEFT JOIN table3 ON table1.col3=table3.col1
LEFT JOIN table4 ON table2.col2=table4.col1
LEFT JOIN table5 ON table2.col3=table5.col1
WHERE  (((table2.col10 like '%a%' OR table5.col11 like '%s%') and
((table2.col12 like '%w%' *and table4.col13 like '%k'*) OR table2.col14
like '%h%') and (table1.col21 = 1) AND (table1.col22 IN (1,2,3))) AND
(((table1.col23 >= 15842000000000000) AND (table1.col23 <=
15842999999999999)) OR ((table1.col23 >= 0) AND (table1.col23 <=
999999999999)))) ORDER BY 1 DESC LIMIT 10 OFFSET 0;

This query has only 2000 matching rows and it returns the result in *2.9
seconds*.

In the same query, if I remove one criteria alone (the bolded part), the
result would still match only 2000 rows, but the result would come in *300
milliseconds* even.

I used the PLAN statement and found that the query plan that the executor
takes for the first query is a much complex path and *it performs 3 joins
for all the 35 lakh rows* (i.e. before executing the criteria even) and
hence the time has spiked to 3 seconds. May be the hot data of 35 lakh
tuples would have not fit into the memory?? Not sure.

But if I remove the highlighted criteria, the executor evaluates the
criteria first and *it performs the joins for only 7000 rows* and hence it
is faster 1000 times.

Is this a bug?
Or is there any way to optimise my query to match the executor to choose
the right plan?

Any help much appreciated. I could even demo this.

Thanks & Regards,

Vijayakrishna.P.
Mobile : (+91) 9500402305.

Vijay Krishna

tags

participants (1)