Hi, I have been studying on multiple join performances in MonetDB. I am running the latest version of MonetDB Oct2014 - SP1 (MonetDB-11.19.3) on a CentOS machine with 128 GB RAM. All my tables have 100 to 300 columns and around 35 lakh rows. I have the following query with 5 joins and a few criteria. SELECT * FROM table1 INNER JOIN table2 on table1.col1=table2.col1 LEFT JOIN table3 ON table1.col2=table3.col1 LEFT JOIN table3 ON table1.col3=table3.col1 LEFT JOIN table4 ON table2.col2=table4.col1 LEFT JOIN table5 ON table2.col3=table5.col1 WHERE (((table2.col10 like '%a%' OR table5.col11 like '%s%') and ((table2.col12 like '%w%' *and table4.col13 like '%k'*) OR table2.col14 like '%h%') and (table1.col21 = 1) AND (table1.col22 IN (1,2,3))) AND (((table1.col23 >= 15842000000000000) AND (table1.col23 <= 15842999999999999)) OR ((table1.col23 >= 0) AND (table1.col23 <= 999999999999)))) ORDER BY 1 DESC LIMIT 10 OFFSET 0; This query has only 2000 matching rows and it returns the result in *2.9 seconds*. In the same query, if I remove one criteria alone (the bolded part), the result would still match only 2000 rows, but the result would come in *300 milliseconds* even. I used the PLAN statement and found that the query plan that the executor takes for the first query is a much complex path and *it performs 3 joins for all the 35 lakh rows* (i.e. before executing the criteria even) and hence the time has spiked to 3 seconds. May be the hot data of 35 lakh tuples would have not fit into the memory?? Not sure. But if I remove the highlighted criteria, the executor evaluates the criteria first and *it performs the joins for only 7000 rows* and hence it is faster 1000 times. Is this a bug? Or is there any way to optimise my query to match the executor to choose the right plan? *I am looking on to install MonetDB in our production setup, but finding this as a showstopper. * Any help much appreciated. I could even demo this. Thanks & Regards, Vijayakrishna.P. Mobile : (+91) 9500402305.
participants (1)
-
Vijay Krishna