hi,
my query schema is SSBM.My database size is 65G.There are five tables :
lineorder 600038145 rows
part 1400000
date 2556
supplier 1000000
customer 3000000
and my host have enough memory space up to 500G
when i run query
select sum(lo_extendedprice*lo_discount) as revenue
from voc.lineorder, voc.dates
where lo_orderdate = d_datekey
and d_weeknuminyear = 6
and d_year = 1994
and lo_discount between 5 and 7
and lo_quantity between 26 and 35;
the result is below
| revenue |
+===============+
| 2605756481323 |
+---------------+
1 tuple (6.2s)
but if i delete the condition of " lo_discount between 5 and 7"
the run result is more faster than the original one.
query
select sum(lo_extendedprice*lo_discount) as revenue
from voc.lineorder, voc.dates
where lo_orderdate = d_datekey
and d_weeknuminyear = 6
and d_year = 1994
and lo_quantity between 26 and 35;
+---------------+
| revenue |
+===============+
| 7990711777299 |
+---------------+
1 tuple (410.451ms)
at the same time ,if i delete the condition of "lo_quantity between 26 and 35"
the run result is more faster than the original one too.
query
select sum(lo_extendedprice*lo_discount) as revenue
from voc.lineorder, voc.dates
where lo_orderdate = d_datekey
and d_weeknuminyear = 6
and d_year = 1994
and lo_quantity between 26 and 35;
+---------------+
| revenue |
+===============+
| 7990711777299 |
+---------------+
1 tuple (463.617ms)
i am confused and if the semijoin costs a lot of execution time ? Thank you for your help.
best reguards,
guangliang peng