[MonetDB-users] Incorrect Results from Very Simple Queries
We are seeing a very strange behavior from MonetDB from simple queries on a single table. In monthly_tou_avg_price table, we have monthly average aggregate data that were inserted from a very large table (> 2 billion) with hourly data. We have created the aggregate table to speed up certain queries. Strangely, we are getting incorrect results from very simple queries on the aggregate table. Q1 below returns 9,890 records with yyyy_mm column having empty value for the 2012-07 records. All of the 2012-07 records are missing the yyyy_mm value. We ran the SQL statement that inserted those records and it does not show any empty yyyy_mm values. If we check for empty string or null value in yyyy_mm column (Q2 and Q3) against the aggregate table, we do not get any hits. If we query for yyyy_mm= '2012-07' as in Q4, we get exactly 9890 records showing yyyy_mm populated with value 2012-07. How could this be possible, Q1 showing empty yyyy_mm values for the same 9,890 records and Q4 showing 2012-07 yyyy_mm values? Q1: select * from monthly_tou_avg_price where header_id = 6 Q2: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '' Q3: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm is null Q4: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '2012-07' Q5 below returns empty value for yyyy_mm even though yyyy_mm column does not have any empty or null values. Q6 does not return any empty or null yyyy_mm values and contradicts Q5 result by returning 2012-07. Furthermore, Q7 contradicts Q5 with 9,890 records with 2012-07 value in yyyy_mm column. Q5: select distinct yyyy_mm from monthly_tou_avg_price where header_id = 6 Q6: select distinct yyyy_mm from monthly_tou_avg_price Q7: select * from monthly_tou_avg_price where header_id = 6 and years = 2012 and months = 7 yyyy_mm L15 9890 2009-04 7574 2009-05 7574 2009-06 7750 2009-07 7786 2009-08 7788 2009-09 7828 2009-10 7814 2009-11 7844 2009-12 8322 2010-01 8317 2010-02 8363 2010-03 8361 2010-04 9214 2010-05 9124 2010-06 9156 2010-07 9165 2010-08 9195 2010-09 9197 2010-10 9198 2010-11 9234 2010-12 9232 2011-01 9246 2011-02 9238 2011-03 9696 2011-04 9672 2011-05 9670 2011-06 9668 2011-07 9656 2011-08 9698 2011-09 9682 2011-10 9724 2011-11 9706 2011-12 9768 2012-01 9766 2012-02 9764 2012-03 9832 2012-04 9824 2012-05 9826 2012-06 9902 We also ran an UPDATE statement to set yyyy_mm to 2012-07 for rows with header_id = 6 and years = 2012 and months = 7. And, 9,890 records were successfully updated. However, Q1 through Q8 returned same results as above with no improvement. We have not seen this kind of issues in other tables. This problem seems to be limited to a table that is populated with an INSERT INTO statement. We checked the result returned by the select portion of Q8 and yyyy_mm is populated with 2012-07. But, the aggregate table seems to end up in a weird state where it returns incorrect results. Q8: insert into "monthly_tou_avg_price" select a.header_id, extract(year from a.opr_date) as y, extract(month from a.opr_date) as m, b.yyyymm as yyyy_mm, b.svalue as TOU, a.resource_name, round(avg(a.lmp_cong_prc),5), round(avg(a.lmp_ene_prc),5), round(avg(a.lmp_loss_prc),5), round(avg(a.lmp_prc),5), round(avg(a.ns_clr_prc),5), round(avg(a.rd_clr_prc),5), round(avg(a.ru_clr_prc),5), round(avg(a.sp_clr_prc),5), round(avg(a.shadow_prc),5), round(avg(a.fuel_prc),5), count(lmp_cong_prc) as actual_lmpcong_tou_count from oasislive.price a join oasislive.time b on a.yyyymmddhh = b.yyyymmddhh where a.opr_date between '2012-07-01' and '2012-07-31' group by a.header_id, y, m, yyyy_mm, TOU, a.resource_name order by y, m, a.header_id, TOU, a.resource_name; We can see this INSERT INTO related issue with other queries. Q9 shows that we do not have any values for 2011-03 ON period which is incorrect. In fact, if we specify '2011-03' as in Q10, it shows that we do in fact have values for the 2011-03 ON period! Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou yyyy_mm tou L6 . . 2011-03 OFF 16.721532797520894 2011-03 ON (null) 2011-04 OFF 15.427080074671187 . . Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou yyyy_mm tou L11 2011-03 OFF 16.72153279752101 2011-03 ON 30.276486575893575 I apologize for the long-winded email. I just wasn't sure how to describe what I was seeing.... Please let me know if there is any additional information that I can provide you with. By the way, we are currently using August 2011 -SP2 release, but saw this issue in older releases as well. Best regards, Henry We are currently running MonetDB August 2011 SP2 release on CentOS
Henry, is there any chance you could upgrade from Aug2011-SP2 to a more recent release of MonetDB, preferably the latest Jul2012 ? (Or at least test the problematic cases with this latest release?) Stefan ----- Original Message -----
We are seeing a very strange behavior from MonetDB from simple queries on a single table.
In monthly_tou_avg_price table, we have monthly average aggregate data that were inserted from a very large table (> 2 billion) with hourly data. We have created the aggregate table to speed up certain queries. Strangely, we are getting incorrect results from very simple queries on the aggregate table.
Q1 below returns 9,890 records with yyyy_mm column having empty value for the 2012-07 records. All of the 2012-07 records are missing the yyyy_mm value. We ran the SQL statement that inserted those records and it does not show any empty yyyy_mm values. If we check for empty string or null value in yyyy_mm column (Q2 and Q3) against the aggregate table, we do not get any hits. If we query for yyyy_mm= '2012-07' as in Q4, we get exactly 9890 records showing yyyy_mm populated with value 2012-07. How could this be possible, Q1 showing empty yyyy_mm values for the same 9,890 records and Q4 showing 2012-07 yyyy_mm values?
Q1: select * from monthly_tou_avg_price where header_id = 6 Q2: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '' Q3: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm is null Q4: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '2012-07'
Q5 below returns empty value for yyyy_mm even though yyyy_mm column does not have any empty or null values. Q6 does not return any empty or null yyyy_mm values and contradicts Q5 result by returning 2012-07. Furthermore, Q7 contradicts Q5 with 9,890 records with 2012-07 value in yyyy_mm column.
Q5: select distinct yyyy_mm from monthly_tou_avg_price where header_id = 6 Q6: select distinct yyyy_mm from monthly_tou_avg_price Q7: select * from monthly_tou_avg_price where header_id = 6 and years = 2012 and months = 7
yyyy_mm L15 9890 2009-04 7574 2009-05 7574 2009-06 7750 2009-07 7786 2009-08 7788 2009-09 7828 2009-10 7814 2009-11 7844 2009-12 8322 2010-01 8317 2010-02 8363 2010-03 8361 2010-04 9214 2010-05 9124 2010-06 9156 2010-07 9165 2010-08 9195 2010-09 9197 2010-10 9198 2010-11 9234 2010-12 9232 2011-01 9246 2011-02 9238 2011-03 9696 2011-04 9672 2011-05 9670 2011-06 9668 2011-07 9656 2011-08 9698 2011-09 9682 2011-10 9724 2011-11 9706 2011-12 9768 2012-01 9766 2012-02 9764 2012-03 9832 2012-04 9824 2012-05 9826 2012-06 9902
We also ran an UPDATE statement to set yyyy_mm to 2012-07 for rows with header_id = 6 and years = 2012 and months = 7. And, 9,890 records were successfully updated. However, Q1 through Q8 returned same results as above with no improvement.
We have not seen this kind of issues in other tables. This problem seems to be limited to a table that is populated with an INSERT INTO statement. We checked the result returned by the select portion of Q8 and yyyy_mm is populated with 2012-07. But, the aggregate table seems to end up in a weird state where it returns incorrect results.
Q8: insert into "monthly_tou_avg_price" select a.header_id, extract(year from a.opr_date) as y, extract(month from a.opr_date) as m, b.yyyymm as yyyy_mm, b.svalue as TOU, a.resource_name, round(avg(a.lmp_cong_prc),5), round(avg(a.lmp_ene_prc),5), round(avg(a.lmp_loss_prc),5), round(avg(a.lmp_prc),5), round(avg(a.ns_clr_prc),5), round(avg(a.rd_clr_prc),5), round(avg(a.ru_clr_prc),5), round(avg(a.sp_clr_prc),5), round(avg(a.shadow_prc),5), round(avg(a.fuel_prc),5), count(lmp_cong_prc) as actual_lmpcong_tou_count from oasislive.price a join oasislive.time b on a.yyyymmddhh = b.yyyymmddhh where a.opr_date between '2012-07-01' and '2012-07-31' group by a.header_id, y, m, yyyy_mm, TOU, a.resource_name order by y, m, a.header_id, TOU, a.resource_name;
We can see this INSERT INTO related issue with other queries. Q9 shows that we do not have any values for 2011-03 ON period which is incorrect. In fact, if we specify '2011-03' as in Q10, it shows that we do in fact have values for the 2011-03 ON period!
Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L6 . .
2011-03 OFF 16.721532797520894 2011-03 ON (null) 2011-04 OFF 15.427080074671187 . .
Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L11 2011-03 OFF 16.72153279752101 2011-03 ON 30.276486575893575
I apologize for the long-winded email. I just wasn't sure how to describe what I was seeing.... Please let me know if there is any additional information that I can provide you with. By the way, we are currently using August 2011 -SP2 release, but saw this issue in older releases as well.
Best regards,
Henry
We are currently running MonetDB August 2011 SP2 release on CentOS
------------------------------------------------------------------------------ Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Yes. We are building the entire db with the latest release in the test
environment right now. I will have the test result tomorrow. Thank you.
Best regards,
Henry
On Tue, Jul 31, 2012 at 1:11 AM, Stefan Manegold
Henry,
is there any chance you could upgrade from Aug2011-SP2 to a more recent release of MonetDB, preferably the latest Jul2012 ?
(Or at least test the problematic cases with this latest release?)
Stefan
----- Original Message -----
We are seeing a very strange behavior from MonetDB from simple queries on a single table.
In monthly_tou_avg_price table, we have monthly average aggregate data that were inserted from a very large table (> 2 billion) with hourly data. We have created the aggregate table to speed up certain queries. Strangely, we are getting incorrect results from very simple queries on the aggregate table.
Q1 below returns 9,890 records with yyyy_mm column having empty value for the 2012-07 records. All of the 2012-07 records are missing the yyyy_mm value. We ran the SQL statement that inserted those records and it does not show any empty yyyy_mm values. If we check for empty string or null value in yyyy_mm column (Q2 and Q3) against the aggregate table, we do not get any hits. If we query for yyyy_mm= '2012-07' as in Q4, we get exactly 9890 records showing yyyy_mm populated with value 2012-07. How could this be possible, Q1 showing empty yyyy_mm values for the same 9,890 records and Q4 showing 2012-07 yyyy_mm values?
Q1: select * from monthly_tou_avg_price where header_id = 6 Q2: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '' Q3: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm is null Q4: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '2012-07'
Q5 below returns empty value for yyyy_mm even though yyyy_mm column does not have any empty or null values. Q6 does not return any empty or null yyyy_mm values and contradicts Q5 result by returning 2012-07. Furthermore, Q7 contradicts Q5 with 9,890 records with 2012-07 value in yyyy_mm column.
Q5: select distinct yyyy_mm from monthly_tou_avg_price where header_id = 6 Q6: select distinct yyyy_mm from monthly_tou_avg_price Q7: select * from monthly_tou_avg_price where header_id = 6 and years = 2012 and months = 7
yyyy_mm L15 9890 2009-04 7574 2009-05 7574 2009-06 7750 2009-07 7786 2009-08 7788 2009-09 7828 2009-10 7814 2009-11 7844 2009-12 8322 2010-01 8317 2010-02 8363 2010-03 8361 2010-04 9214 2010-05 9124 2010-06 9156 2010-07 9165 2010-08 9195 2010-09 9197 2010-10 9198 2010-11 9234 2010-12 9232 2011-01 9246 2011-02 9238 2011-03 9696 2011-04 9672 2011-05 9670 2011-06 9668 2011-07 9656 2011-08 9698 2011-09 9682 2011-10 9724 2011-11 9706 2011-12 9768 2012-01 9766 2012-02 9764 2012-03 9832 2012-04 9824 2012-05 9826 2012-06 9902
We also ran an UPDATE statement to set yyyy_mm to 2012-07 for rows with header_id = 6 and years = 2012 and months = 7. And, 9,890 records were successfully updated. However, Q1 through Q8 returned same results as above with no improvement.
We have not seen this kind of issues in other tables. This problem seems to be limited to a table that is populated with an INSERT INTO statement. We checked the result returned by the select portion of Q8 and yyyy_mm is populated with 2012-07. But, the aggregate table seems to end up in a weird state where it returns incorrect results.
Q8: insert into "monthly_tou_avg_price" select a.header_id, extract(year from a.opr_date) as y, extract(month from a.opr_date) as m, b.yyyymm as yyyy_mm, b.svalue as TOU, a.resource_name, round(avg(a.lmp_cong_prc),5), round(avg(a.lmp_ene_prc),5), round(avg(a.lmp_loss_prc),5), round(avg(a.lmp_prc),5), round(avg(a.ns_clr_prc),5), round(avg(a.rd_clr_prc),5), round(avg(a.ru_clr_prc),5), round(avg(a.sp_clr_prc),5), round(avg(a.shadow_prc),5), round(avg(a.fuel_prc),5), count(lmp_cong_prc) as actual_lmpcong_tou_count from oasislive.price a join oasislive.time b on a.yyyymmddhh = b.yyyymmddhh where a.opr_date between '2012-07-01' and '2012-07-31' group by a.header_id, y, m, yyyy_mm, TOU, a.resource_name order by y, m, a.header_id, TOU, a.resource_name;
We can see this INSERT INTO related issue with other queries. Q9 shows that we do not have any values for 2011-03 ON period which is incorrect. In fact, if we specify '2011-03' as in Q10, it shows that we do in fact have values for the 2011-03 ON period!
Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L6 . .
2011-03 OFF 16.721532797520894 2011-03 ON (null) 2011-04 OFF 15.427080074671187 . .
Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L11 2011-03 OFF 16.72153279752101 2011-03 ON 30.276486575893575
I apologize for the long-winded email. I just wasn't sure how to describe what I was seeing.... Please let me know if there is any additional information that I can provide you with. By the way, we are currently using August 2011 -SP2 release, but saw this issue in older releases as well.
Best regards,
Henry
We are currently running MonetDB August 2011 SP2 release on CentOS
------------------------------------------------------------------------------
Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi Stefan: We have rebuilt our database using the latest MonetDB release; and the issue described with queries 1 through 8 has been resolved. We are not positive; but we think there was some sort of data corruption in the large timeseries data which got carried over to the aggregate table. We some just a handful of records in the timeseries table that would randomly return garbled characters or empty string. The reason we think the issue is not completely resolved in the latest release is because when we rebuilt our database in Aug2011-SP2 the problem also went away. Here is the remaining issue. We ran Q9 and Q10 in the rebuilt Aug2011-SP2 release and the latest release. They show identical results as they are based on same data. However, the 2011-03 OFF aggregates (avg, min, max) are incorrect though the count seems to be ok. We have values in the 2011-03 OFF bucket and should have aggregates. Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou
From rebuilt Aug2011-SP2 yyyy_mm, tou, avg(lmp_prc), count(lmp_prc), min(lmp_prc), max(lmp_prc) ...... ...... 2011-03 OFF 16.72153279686157 15167 0.00000 34.88962 2011-03 ON (null) 15166 (null) (null) 2011-04 OFF 15.427080074671187 15133 -24.89539 30.68835 2011-04 ON 32.05149548777285 15130 -34.29957 48.99472 ..... ......
From July2012 Feature Release yyyy_mm, tou, avg(lmp_prc), count(lmp_prc), min(lmp_prc), max(lmp_prc) ...... ...... 2011-03 OFF 16.721532797520897 1516700000 0.00000 34.88962 2011-03 ON (null) 1516600000 (null) (null) 2011-04 OFF 15.427080077975218 1513300000 -24.89539 30.68835 2011-04 ON 32.051495489755666 1513000000 -34.29957 48.99472 ....... .......
Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou
From rebuilt Aug2011-SP2 2011-03 OFF 16.721532796861684 15167 0.00000 34.88962 2011-03 ON 30.276486572596724 15166 -2.62440 54.96787
From July2012 Feature Release yyyy_mm, tou, avg(lmp_prc), count(lmp_prc), min(lmp_prc), max(lmp_prc) yyyy_mm tou L1 scale_up_L2 L3 L4 2011-03 OFF 16.72153279752101 1516700000 0.00000 34.88962 2011-03 ON 30.276486575893575 1516600000 -2.62440 54.96787
There are 2 things that I see are problems. First, the count() in the
July2012 release appear to 10,000 times the actual count in both Q9 and
Q10. Also, Q10 with the where clause shows that we do in fact have values
that should show up in the aggregates in Q9.
Do you have any suggestions as to how I can provide you with more
information to troubleshoot the issue? I know it will be difficult to
duplicate what I am seeing. Thanks for all your hard work!
Best regards,
Henry
On Tue, Jul 31, 2012 at 1:11 AM, Stefan Manegold
Henry,
is there any chance you could upgrade from Aug2011-SP2 to a more recent release of MonetDB, preferably the latest Jul2012 ?
(Or at least test the problematic cases with this latest release?)
Stefan
----- Original Message -----
We are seeing a very strange behavior from MonetDB from simple queries on a single table.
In monthly_tou_avg_price table, we have monthly average aggregate data that were inserted from a very large table (> 2 billion) with hourly data. We have created the aggregate table to speed up certain queries. Strangely, we are getting incorrect results from very simple queries on the aggregate table.
Q1 below returns 9,890 records with yyyy_mm column having empty value for the 2012-07 records. All of the 2012-07 records are missing the yyyy_mm value. We ran the SQL statement that inserted those records and it does not show any empty yyyy_mm values. If we check for empty string or null value in yyyy_mm column (Q2 and Q3) against the aggregate table, we do not get any hits. If we query for yyyy_mm= '2012-07' as in Q4, we get exactly 9890 records showing yyyy_mm populated with value 2012-07. How could this be possible, Q1 showing empty yyyy_mm values for the same 9,890 records and Q4 showing 2012-07 yyyy_mm values?
Q1: select * from monthly_tou_avg_price where header_id = 6 Q2: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '' Q3: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm is null Q4: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '2012-07'
Q5 below returns empty value for yyyy_mm even though yyyy_mm column does not have any empty or null values. Q6 does not return any empty or null yyyy_mm values and contradicts Q5 result by returning 2012-07. Furthermore, Q7 contradicts Q5 with 9,890 records with 2012-07 value in yyyy_mm column.
Q5: select distinct yyyy_mm from monthly_tou_avg_price where header_id = 6 Q6: select distinct yyyy_mm from monthly_tou_avg_price Q7: select * from monthly_tou_avg_price where header_id = 6 and years = 2012 and months = 7
yyyy_mm L15 9890 2009-04 7574 2009-05 7574 2009-06 7750 2009-07 7786 2009-08 7788 2009-09 7828 2009-10 7814 2009-11 7844 2009-12 8322 2010-01 8317 2010-02 8363 2010-03 8361 2010-04 9214 2010-05 9124 2010-06 9156 2010-07 9165 2010-08 9195 2010-09 9197 2010-10 9198 2010-11 9234 2010-12 9232 2011-01 9246 2011-02 9238 2011-03 9696 2011-04 9672 2011-05 9670 2011-06 9668 2011-07 9656 2011-08 9698 2011-09 9682 2011-10 9724 2011-11 9706 2011-12 9768 2012-01 9766 2012-02 9764 2012-03 9832 2012-04 9824 2012-05 9826 2012-06 9902
We also ran an UPDATE statement to set yyyy_mm to 2012-07 for rows with header_id = 6 and years = 2012 and months = 7. And, 9,890 records were successfully updated. However, Q1 through Q8 returned same results as above with no improvement.
We have not seen this kind of issues in other tables. This problem seems to be limited to a table that is populated with an INSERT INTO statement. We checked the result returned by the select portion of Q8 and yyyy_mm is populated with 2012-07. But, the aggregate table seems to end up in a weird state where it returns incorrect results.
Q8: insert into "monthly_tou_avg_price" select a.header_id, extract(year from a.opr_date) as y, extract(month from a.opr_date) as m, b.yyyymm as yyyy_mm, b.svalue as TOU, a.resource_name, round(avg(a.lmp_cong_prc),5), round(avg(a.lmp_ene_prc),5), round(avg(a.lmp_loss_prc),5), round(avg(a.lmp_prc),5), round(avg(a.ns_clr_prc),5), round(avg(a.rd_clr_prc),5), round(avg(a.ru_clr_prc),5), round(avg(a.sp_clr_prc),5), round(avg(a.shadow_prc),5), round(avg(a.fuel_prc),5), count(lmp_cong_prc) as actual_lmpcong_tou_count from oasislive.price a join oasislive.time b on a.yyyymmddhh = b.yyyymmddhh where a.opr_date between '2012-07-01' and '2012-07-31' group by a.header_id, y, m, yyyy_mm, TOU, a.resource_name order by y, m, a.header_id, TOU, a.resource_name;
We can see this INSERT INTO related issue with other queries. Q9 shows that we do not have any values for 2011-03 ON period which is incorrect. In fact, if we specify '2011-03' as in Q10, it shows that we do in fact have values for the 2011-03 ON period!
Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L6 . .
2011-03 OFF 16.721532797520894 2011-03 ON (null) 2011-04 OFF 15.427080074671187 . .
Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L11 2011-03 OFF 16.72153279752101 2011-03 ON 30.276486575893575
I apologize for the long-winded email. I just wasn't sure how to describe what I was seeing.... Please let me know if there is any additional information that I can provide you with. By the way, we are currently using August 2011 -SP2 release, but saw this issue in older releases as well.
Best regards,
Henry
We are currently running MonetDB August 2011 SP2 release on CentOS
------------------------------------------------------------------------------
Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi Stefan:
I am very puzzled by the erratic behavior that we are seeing with both
Aug2011-SP2 and the latest version of MonetDB. I would appreciate if you
can share some of your insight on the erratic behavior we are seeing.
The problem we saw with Q1 through Q8 queries returned in Aug2011-SP2
environment after we inserted 3 batches of data. After 2 daily inserts
(copy into), the data worked fine. After inserting the data for 8/4/2012,
we started seeing 8/1/2012 data showing up with empty string in one of the
columns again and returning incorrect query results. We restored the
database from a working backup and started to insert the data for each
daily set to observe where the failure occurs. After inserting 8/3/2012
data, we immediately saw some 50,000 records with empty string in one of
the columns for 8/1/2012 data which looked fine earlier. We loaded 8/4/2012
data and saw no change in the problematic records. Ironically, after
inserting 8/5/2012 data, the problematic records were reduced to just 164
records. I can't find any rationale for the behavior that we are seeing in
our MonetDB.
You had suggested to run some tests using the latest release of MonetDB. We
did see couple of issues as described in our previous email. Unfortunately,
the latest version also has significant problems like the ones that we are
seeing with the Aug2011-SP2 version. The query below returned 37 records
with empty strings in yyyy_mm column. Ironically, MonetDB thinks they are
"different" empty strings because they are in different rows!
select distinct yyyy_mm from monthly_tou_avg_price where header_id = 6
order by yyyy_mm
yyyy_mm L1
9890
9902
9826
9824
9832
9764
9766
9768
9706
9724
9682
9698
9656
9668
9670
9672
9696
9238
9246
9232
9234
9198
9197
9195
9165
9156
9124
9214
8361
8363
8317
8322
7844
7814
7828
7788
7786
2009-04 7574
2009-05 7574
2009-06 7750
We have a very large dataset (largest table has 2.5 billion rows with 18
columns) that we are working with. But, we have a robust environment (VMs
with 40G to 140G memory and plenty of hard disk space) to support the data
volume. The Aug2011-SP2 has been very stable up until this week when
everything seems to just fall apart.
Have we hit some sort of a limit in the data volume that MonetDB can handle?
Again, I apologize for the long-winded posts. Any suggestions for more
information or a way for us to troubleshoot more effectively will be very
much appreciated. Thank you.
Best regards,
Henry
On Thu, Aug 2, 2012 at 5:59 PM, Henry Addington
Hi Stefan:
We have rebuilt our database using the latest MonetDB release; and the issue described with queries 1 through 8 has been resolved. We are not positive; but we think there was some sort of data corruption in the large timeseries data which got carried over to the aggregate table. We some just a handful of records in the timeseries table that would randomly return garbled characters or empty string. The reason we think the issue is not completely resolved in the latest release is because when we rebuilt our database in Aug2011-SP2 the problem also went away.
Here is the remaining issue. We ran Q9 and Q10 in the rebuilt Aug2011-SP2 release and the latest release. They show identical results as they are based on same data. However, the 2011-03 OFF aggregates (avg, min, max) are incorrect though the count seems to be ok. We have values in the 2011-03 OFF bucket and should have aggregates.
Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou
From rebuilt Aug2011-SP2 yyyy_mm, tou, avg(lmp_prc), count(lmp_prc), min(lmp_prc), max(lmp_prc) ...... ...... 2011-03 OFF 16.72153279686157 15167 0.00000 34.88962 2011-03 ON (null) 15166 (null) (null) 2011-04 OFF 15.427080074671187 15133 -24.89539 30.68835 2011-04 ON 32.05149548777285 15130 -34.29957 48.99472 ..... ......
From July2012 Feature Release yyyy_mm, tou, avg(lmp_prc), count(lmp_prc), min(lmp_prc), max(lmp_prc) ...... ...... 2011-03 OFF 16.721532797520897 1516700000 0.00000 34.88962 2011-03 ON (null) 1516600000 (null) (null) 2011-04 OFF 15.427080077975218 1513300000 -24.89539 30.68835 2011-04 ON 32.051495489755666 1513000000 -34.29957 48.99472 ....... .......
Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou
From rebuilt Aug2011-SP2 2011-03 OFF 16.721532796861684 15167 0.00000 34.88962 2011-03 ON 30.276486572596724 15166 -2.62440 54.96787
From July2012 Feature Release yyyy_mm, tou, avg(lmp_prc), count(lmp_prc), min(lmp_prc), max(lmp_prc) yyyy_mm tou L1 scale_up_L2 L3 L4 2011-03 OFF 16.72153279752101 1516700000 0.00000 34.88962 2011-03 ON 30.276486575893575 1516600000 -2.62440 54.96787
There are 2 things that I see are problems. First, the count() in the July2012 release appear to 10,000 times the actual count in both Q9 and Q10. Also, Q10 with the where clause shows that we do in fact have values that should show up in the aggregates in Q9.
Do you have any suggestions as to how I can provide you with more information to troubleshoot the issue? I know it will be difficult to duplicate what I am seeing. Thanks for all your hard work!
Best regards,
Henry
On Tue, Jul 31, 2012 at 1:11 AM, Stefan Manegold
wrote: Henry,
is there any chance you could upgrade from Aug2011-SP2 to a more recent release of MonetDB, preferably the latest Jul2012 ?
(Or at least test the problematic cases with this latest release?)
Stefan
----- Original Message -----
We are seeing a very strange behavior from MonetDB from simple queries on a single table.
In monthly_tou_avg_price table, we have monthly average aggregate data that were inserted from a very large table (> 2 billion) with hourly data. We have created the aggregate table to speed up certain queries. Strangely, we are getting incorrect results from very simple queries on the aggregate table.
Q1 below returns 9,890 records with yyyy_mm column having empty value for the 2012-07 records. All of the 2012-07 records are missing the yyyy_mm value. We ran the SQL statement that inserted those records and it does not show any empty yyyy_mm values. If we check for empty string or null value in yyyy_mm column (Q2 and Q3) against the aggregate table, we do not get any hits. If we query for yyyy_mm= '2012-07' as in Q4, we get exactly 9890 records showing yyyy_mm populated with value 2012-07. How could this be possible, Q1 showing empty yyyy_mm values for the same 9,890 records and Q4 showing 2012-07 yyyy_mm values?
Q1: select * from monthly_tou_avg_price where header_id = 6 Q2: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '' Q3: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm is null Q4: select * from monthly_tou_avg_price where header_id = 6 and yyyy_mm = '2012-07'
Q5 below returns empty value for yyyy_mm even though yyyy_mm column does not have any empty or null values. Q6 does not return any empty or null yyyy_mm values and contradicts Q5 result by returning 2012-07. Furthermore, Q7 contradicts Q5 with 9,890 records with 2012-07 value in yyyy_mm column.
Q5: select distinct yyyy_mm from monthly_tou_avg_price where header_id = 6 Q6: select distinct yyyy_mm from monthly_tou_avg_price Q7: select * from monthly_tou_avg_price where header_id = 6 and years = 2012 and months = 7
yyyy_mm L15 9890 2009-04 7574 2009-05 7574 2009-06 7750 2009-07 7786 2009-08 7788 2009-09 7828 2009-10 7814 2009-11 7844 2009-12 8322 2010-01 8317 2010-02 8363 2010-03 8361 2010-04 9214 2010-05 9124 2010-06 9156 2010-07 9165 2010-08 9195 2010-09 9197 2010-10 9198 2010-11 9234 2010-12 9232 2011-01 9246 2011-02 9238 2011-03 9696 2011-04 9672 2011-05 9670 2011-06 9668 2011-07 9656 2011-08 9698 2011-09 9682 2011-10 9724 2011-11 9706 2011-12 9768 2012-01 9766 2012-02 9764 2012-03 9832 2012-04 9824 2012-05 9826 2012-06 9902
We also ran an UPDATE statement to set yyyy_mm to 2012-07 for rows with header_id = 6 and years = 2012 and months = 7. And, 9,890 records were successfully updated. However, Q1 through Q8 returned same results as above with no improvement.
We have not seen this kind of issues in other tables. This problem seems to be limited to a table that is populated with an INSERT INTO statement. We checked the result returned by the select portion of Q8 and yyyy_mm is populated with 2012-07. But, the aggregate table seems to end up in a weird state where it returns incorrect results.
Q8: insert into "monthly_tou_avg_price" select a.header_id, extract(year from a.opr_date) as y, extract(month from a.opr_date) as m, b.yyyymm as yyyy_mm, b.svalue as TOU, a.resource_name, round(avg(a.lmp_cong_prc),5), round(avg(a.lmp_ene_prc),5), round(avg(a.lmp_loss_prc),5), round(avg(a.lmp_prc),5), round(avg(a.ns_clr_prc),5), round(avg(a.rd_clr_prc),5), round(avg(a.ru_clr_prc),5), round(avg(a.sp_clr_prc),5), round(avg(a.shadow_prc),5), round(avg(a.fuel_prc),5), count(lmp_cong_prc) as actual_lmpcong_tou_count from oasislive.price a join oasislive.time b on a.yyyymmddhh = b.yyyymmddhh where a.opr_date between '2012-07-01' and '2012-07-31' group by a.header_id, y, m, yyyy_mm, TOU, a.resource_name order by y, m, a.header_id, TOU, a.resource_name;
We can see this INSERT INTO related issue with other queries. Q9 shows that we do not have any values for 2011-03 ON period which is incorrect. In fact, if we specify '2011-03' as in Q10, it shows that we do in fact have values for the 2011-03 ON period!
Q9: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L6 . .
2011-03 OFF 16.721532797520894 2011-03 ON (null) 2011-04 OFF 15.427080074671187 . .
Q10: select yyyy_mm, tou, avg(lmp_prc) from monthly_tou_avg_price where yyyy_mm = '2011-03' group by yyyy_mm, tou order by yyyy_mm, tou
yyyy_mm tou L11 2011-03 OFF 16.72153279752101 2011-03 ON 30.276486575893575
I apologize for the long-winded email. I just wasn't sure how to describe what I was seeing.... Please let me know if there is any additional information that I can provide you with. By the way, we are currently using August 2011 -SP2 release, but saw this issue in older releases as well.
Best regards,
Henry
We are currently running MonetDB August 2011 SP2 release on CentOS
------------------------------------------------------------------------------
Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
participants (2)
-
Henry Addington
-
Stefan Manegold