[MonetDB-users] Order of WHERE clause has significant performance impact

Hi All - I have 2 queries which run very different based on how I
order the WHERE clause.
When I put the larger filter first, it runs for 3.5 seconds.
Query 1:
select c5, c6, sum(sale) from sobi.sample
where
c5 in (5253,5110,5375,5268,5257,5266,5243,5256,5252,5361,5379,5363,5386,5367,5240,5241,5261,5249,5245,5260,5247,5264,5354,5250,5373,5366,5246,5356,5365,5255,5262,5259,5251,5369,5357,5269,5364,5380,5244,5248,5242,5254,5358,5360,5263,5362,5374,5378,5370,5294,5339,5272,5277,5275,5295,5340,5273,5276,5270,5061,5081,5079,5080,5066,5056,5073,5060,5070,5099,5101,5100,5067,5063,5077,5072,5062,5075,5083,5086,5071,5095,5058,5087,5090,5102,5084,5098,5076,5059,5074,5094,5091,5065,5069,5068,5104,5109,5107,5108,5013,5019,5022,5010,5009,5023,5011,5024,5012,5021,5007,5008,5025,5020,5055,5006,5036,5038,5054,5032,5033,5031,5039,5035,5026,5027,5030,5200,5043,5005,5041,5049,5042,5052,5040,5053,5045,5047,5048,5050,5051,5044,5046)
and c6 in (5182)
group by c5, c6;
When I reverse the order of the WHERE clause, it runs in about .4 seconds.
Query 2:
c5, c6, sum(sale) from sobi.sample
where
c6 in (5182)
and
c5 in (5253,5110,5375,5268,5257,5266,5243,5256,5252,5361,5379,5363,5386,5367,5240,5241,5261,5249,5245,5260,5247,5264,5354,5250,5373,5366,5246,5356,5365,5255,5262,5259,5251,5369,5357,5269,5364,5380,5244,5248,5242,5254,5358,5360,5263,5362,5374,5378,5370,5294,5339,5272,5277,5275,5295,5340,5273,5276,5270,5061,5081,5079,5080,5066,5056,5073,5060,5070,5099,5101,5100,5067,5063,5077,5072,5062,5075,5083,5086,5071,5095,5058,5087,5090,5102,5084,5098,5076,5059,5074,5094,5091,5065,5069,5068,5104,5109,5107,5108,5013,5019,5022,5010,5009,5023,5011,5024,5012,5021,5007,5008,5025,5020,5055,5006,5036,5038,5054,5032,5033,5031,5039,5035,5026,5027,5030,5200,5043,5005,5041,5049,5042,5052,5040,5053,5045,5047,5048,5050,5051,5044,5046)
group by c5, c6;
This in on MonetDB server v5.6.0, based on kernel v1.24.0 which I
built from the super tarball.
I can attach the entire trace files for each query if appropriate for the list.
While I am still learning to read the trace, it looks like most of the
time spent in the first query on a series of algebra.semijoin
operations. Here are the excerpted lines:
Query 1:
[2320345 usec # _461 :=
algebra.semijoin(_167=<~tmp_2116>bat[:str,:oid]{8280020},
_460=<~tmp_2016>bat[:str,:oid]{142}) ]
[776017 usec # _463 :=
algebra.semijoin(_158=

On Tue, Aug 12, 2008 at 08:31:22PM -0500, Ross Bates wrote:
Hi All - I have 2 queries which run very different based on how I order the WHERE clause.
When I put the larger filter first, it runs for 3.5 seconds.
Query 1:
select c5, c6, sum(sale) from sobi.sample where c5 in (5253,5110,5375,5268,5257,5266,5243,5256,5252,5361,5379,5363,5386,5367,5240,5241,5261,5249,5245,5260,5247,5264,5354,5250,5373,5366,5246,5356,5365,5255,5262,5259,5251,5369,5357,5269,5364,5380,5244,5248,5242,5254,5358,5360,5263,5362,5374,5378,5370,5294,5339,5272,5277,5275,5295,5340,5273,5276,5270,5061,5081,5079,5080,5066,5056,5073,5060,5070,5099,5101,5100,5067,5063,5077,5072,5062,5075,5083,5086,5071,5095,5058,5087,5090,5102,5084,5098,5076,5059,5074,5094,5091,5065,5069,5068,5104,5109,5107,5108,5013,5019,5022,5010,5009,5023,5011,5024,5012,5021,5007,5008,5025,5020,5055,5006,5036,5038,5054,5032,5033,5031,5039,5035,5026,5027,5030,5200,5043,5005,5041,5049,5042,5052,5040,5053,5045,5047,5048,5050,5051,5044,5046) and c6 in (5182) group by c5, c6;
When I reverse the order of the WHERE clause, it runs in about .4 seconds.
Query 2:
c5, c6, sum(sale) from sobi.sample where c6 in (5182) and c5 in (5253,5110,5375,5268,5257,5266,5243,5256,5252,5361,5379,5363,5386,5367,5240,5241,5261,5249,5245,5260,5247,5264,5354,5250,5373,5366,5246,5356,5365,5255,5262,5259,5251,5369,5357,5269,5364,5380,5244,5248,5242,5254,5358,5360,5263,5362,5374,5378,5370,5294,5339,5272,5277,5275,5295,5340,5273,5276,5270,5061,5081,5079,5080,5066,5056,5073,5060,5070,5099,5101,5100,5067,5063,5077,5072,5062,5075,5083,5086,5071,5095,5058,5087,5090,5102,5084,5098,5076,5059,5074,5094,5091,5065,5069,5068,5104,5109,5107,5108,5013,5019,5022,5010,5009,5023,5011,5024,5012,5021,5007,5008,5025,5020,5055,5006,5036,5038,5054,5032,5033,5031,5039,5035,5026,5027,5030,5200,5043,5005,5041,5049,5042,5052,5040,5053,5045,5047,5048,5050,5051,5044,5046) group by c5, c6;
This in on MonetDB server v5.6.0, based on kernel v1.24.0 which I built from the super tarball.
I can attach the entire trace files for each query if appropriate for the list.
While I am still learning to read the trace, it looks like most of the time spent in the first query on a series of algebra.semijoin operations. Here are the excerpted lines:
Query 1:
[2320345 usec # _461 := algebra.semijoin(_167=<~tmp_2116>bat[:str,:oid]{8280020}, _460=<~tmp_2016>bat[:str,:oid]{142}) ] [776017 usec # _463 := algebra.semijoin(_158=
bat[:oid,:str]{8280020}, _462=<~tmp_1572>bat[:oid,:str]{5135708}) ] [253551 usec # _470 := algebra.semijoin(_464=<~tmp_2116>bat[:str,:oid]{5135708}, _469=<~tmp_2016>bat[:str,:oid]{1}) ] Query 2:
[412122 usec # _179 := algebra.semijoin(_167=<~tmp_2076>bat[:str,:oid]{8280020}, _178=<~tmp_2016>bat[:str,:oid]{1}) ] [ 194 usec # _181 := algebra.semijoin(_158=
bat[:oid,:str]{8280020}, _180=<~tmp_1572>bat[:oid,:str]{304}) ] [ 0 usec # _470 := algebra.semijoin(_182=<~tmp_2076>bat[:str,:oid]{304}, _469=<~tmp_2016>bat[:str,:oid]{142}) ] Any assistance would be greatly appreciated.
Indeed the optimizer isn't smart enough to use the correct (most selective) in first. Could you send the create statement for the sobi.sample table? If possible could you try this on the 'current' and with mclient -lsql -G. That last -G enables the new implementation which first translates sql into a relational algebra and then in binary algebra. Niels
-Ross
------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
participants (2)
-
Niels Nes
-
Ross Bates