We are seeing a very strange behavior from MonetDB from simple queries on a single table.
In monthly_tou_avg_price table, we have monthly average aggregate data that were inserted from a very large table (> 2 billion) with hourly data. We have created the aggregate table to speed up certain queries. Strangely, we are getting incorrect results from very simple queries on the aggregate table.
Q1 below returns 9,890 records with yyyy_mm column having empty value for the 2012-07 records. All of the 2012-07 records are missing the yyyy_mm value. We ran the SQL statement that inserted those records and it does not show any empty yyyy_mm values. If we check for empty string or null value in yyyy_mm column (Q2 and Q3) against the aggregate table, we do not get any hits. If we query for yyyy_mm= '2012-07' as in Q4, we get exactly 9890 records showing yyyy_mm populated with value 2012-07. How could this be possible, Q1 showing empty yyyy_mm values for the same 9,890 records and Q4 showing 2012-07 yyyy_mm values?
Q1: select * from monthly_tou_avg_price where header_id = 6
Q2: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = ''
Q3: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm is null
Q4: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '2012-07'
Q5 below returns empty value for yyyy_mm even though yyyy_mm column does not have any empty or null values. Q6 does not return any empty or null yyyy_mm values and contradicts Q5 result by returning 2012-07. Furthermore, Q7 contradicts Q5 with 9,890 records with 2012-07 value in yyyy_mm column.
Q5: select distinct yyyy_mm from monthly_tou_avg_price where header_id = 6
Q6: select distinct yyyy_mm from monthly_tou_avg_price
Q7: select * from monthly_tou_avg_price where header_id = 6 and years = 2012 and months = 7
yyyy_mm L15
9890
2009-04 7574
2009-05 7574
2009-06 7750
2009-07 7786
2009-08 7788
2009-09 7828
2009-10 7814
2009-11 7844
2009-12 8322
2010-01 8317
2010-02 8363
2010-03 8361
2010-04 9214
2010-05 9124
2010-06 9156
2010-07 9165
2010-08 9195
2010-09 9197
2010-10 9198
2010-11 9234
2010-12 9232
2011-01 9246
2011-02 9238
2011-03 9696
2011-04 9672
2011-05 9670
2011-06 9668
2011-07 9656
2011-08 9698
2011-09 9682
2011-10 9724
2011-11 9706
2011-12 9768
2012-01 9766
2012-02 9764
2012-03 9832
2012-04 9824
2012-05 9826
2012-06 9902
We also ran an UPDATE statement to set yyyy_mm to 2012-07 for rows with header_id = 6 and years = 2012 and months = 7. And, 9,890 records were successfully updated. However, Q1 through Q8 returned same results as above with no improvement.
We have not seen this kind of issues in other tables. This problem seems to be limited to a table that is populated with an INSERT INTO statement. We checked the result returned by the select portion of Q8 and yyyy_mm is populated with 2012-07. But, the aggregate table seems to end up in a weird state where it returns incorrect results.
Q8: insert into "monthly_tou_avg_price"
select a.header_id, extract(year from a.opr_date) as y, extract(month from a.opr_date) as m,
b.yyyymm as yyyy_mm, b.svalue as TOU, a.resource_name,
round(avg(a.lmp_cong_prc),5), round(avg(a.lmp_ene_prc),5), round(avg(a.lmp_loss_prc),5), round(avg(a.lmp_prc),5), round(avg(a.ns_clr_prc),5),
round(avg(a.rd_clr_prc),5), round(avg(a.ru_clr_prc),5), round(avg(a.sp_clr_prc),5), round(avg(a.shadow_prc),5),
round(avg(a.fuel_prc),5), count(lmp_cong_prc) as actual_lmpcong_tou_count
from oasislive.price a
join oasislive.time b
on a.yyyymmddhh = b.yyyymmddhh
where a.opr_date between '2012-07-01' and '2012-07-31'
group by a.header_id, y, m, yyyy_mm, TOU, a.resource_name
order by y, m, a.header_id, TOU, a.resource_name;
We can see this INSERT INTO related issue with other queries. Q9 shows that we do not have any values for 2011-03 ON period which is incorrect. In fact, if we specify '2011-03' as in Q10, it shows that we do in fact have values for the 2011-03 ON period!
Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L6
.
.
2011-03 OFF 16.721532797520894
2011-03 ON (null)
2011-04 OFF 15.427080074671187
.
.
Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L11
2011-03 OFF 16.72153279752101
2011-03 ON 30.276486575893575
I apologize for the long-winded email. I just wasn't sure how to describe what I was seeing.... Please let me know if there is any additional information that I can provide you with. By the way, we are currently using August 2011 -SP2 release, but saw this issue in older releases as well.
Best regards,
Henry
We are currently running MonetDB August 2011 SP2 release on CentOS