
Hi All - I have 2 queries which run very different based on how I
order the WHERE clause.
When I put the larger filter first, it runs for 3.5 seconds.
Query 1:
select c5, c6, sum(sale) from sobi.sample
where
c5 in (5253,5110,5375,5268,5257,5266,5243,5256,5252,5361,5379,5363,5386,5367,5240,5241,5261,5249,5245,5260,5247,5264,5354,5250,5373,5366,5246,5356,5365,5255,5262,5259,5251,5369,5357,5269,5364,5380,5244,5248,5242,5254,5358,5360,5263,5362,5374,5378,5370,5294,5339,5272,5277,5275,5295,5340,5273,5276,5270,5061,5081,5079,5080,5066,5056,5073,5060,5070,5099,5101,5100,5067,5063,5077,5072,5062,5075,5083,5086,5071,5095,5058,5087,5090,5102,5084,5098,5076,5059,5074,5094,5091,5065,5069,5068,5104,5109,5107,5108,5013,5019,5022,5010,5009,5023,5011,5024,5012,5021,5007,5008,5025,5020,5055,5006,5036,5038,5054,5032,5033,5031,5039,5035,5026,5027,5030,5200,5043,5005,5041,5049,5042,5052,5040,5053,5045,5047,5048,5050,5051,5044,5046)
and c6 in (5182)
group by c5, c6;
When I reverse the order of the WHERE clause, it runs in about .4 seconds.
Query 2:
c5, c6, sum(sale) from sobi.sample
where
c6 in (5182)
and
c5 in (5253,5110,5375,5268,5257,5266,5243,5256,5252,5361,5379,5363,5386,5367,5240,5241,5261,5249,5245,5260,5247,5264,5354,5250,5373,5366,5246,5356,5365,5255,5262,5259,5251,5369,5357,5269,5364,5380,5244,5248,5242,5254,5358,5360,5263,5362,5374,5378,5370,5294,5339,5272,5277,5275,5295,5340,5273,5276,5270,5061,5081,5079,5080,5066,5056,5073,5060,5070,5099,5101,5100,5067,5063,5077,5072,5062,5075,5083,5086,5071,5095,5058,5087,5090,5102,5084,5098,5076,5059,5074,5094,5091,5065,5069,5068,5104,5109,5107,5108,5013,5019,5022,5010,5009,5023,5011,5024,5012,5021,5007,5008,5025,5020,5055,5006,5036,5038,5054,5032,5033,5031,5039,5035,5026,5027,5030,5200,5043,5005,5041,5049,5042,5052,5040,5053,5045,5047,5048,5050,5051,5044,5046)
group by c5, c6;
This in on MonetDB server v5.6.0, based on kernel v1.24.0 which I
built from the super tarball.
I can attach the entire trace files for each query if appropriate for the list.
While I am still learning to read the trace, it looks like most of the
time spent in the first query on a series of algebra.semijoin
operations. Here are the excerpted lines:
Query 1:
[2320345 usec # _461 :=
algebra.semijoin(_167=<~tmp_2116>bat[:str,:oid]{8280020},
_460=<~tmp_2016>bat[:str,:oid]{142}) ]
[776017 usec # _463 :=
algebra.semijoin(_158=