MonetDb crashes on a join using date between from and until date
Hi all, We are using the June 2020 release, trying to query a table with 24 mio rows and join each row from this 'left' table with one row from a right table according to a specific join clause. This clause does not result in a carthesian product. The query structure is as follows: select left.* from myschema.mytable left inner join otherschema.yourtable right on left.somedatefield >= right.fromdate and left.somedatefield <= right.untildate where left.otherdatefield is null ; This query results in following error in the merovingian.log: 2020-07-17 11:58:00 MSG merovingian[401281]: database 'datalake' (-1) has crashed with signal SIGSEGV (dumped core) However, if I change the query like below, making the left table a derived table and sorting the date field used in the join, the error doesn't occur: select left.* from (select * from myschema.mytable order by somedatefield) left inner join otherschema.yourtable right on left.somedatefield >= right.fromdate and left.somedatefield <= right.untildate where left.otherdatefield is null ; When I tested the original query with a subset from the left table there was a threshold of a certain amount of rows (in this case 26 rows) below which the query ran fine. Also, repeating the test on a different 'left' table with ~200k rows didn't result in a crash. Could it be that there's been a change in the query pipeline or join handling? Kind regards, Frank
Hello Frank, I think this a bug already reported on the range join: https://www.monetdb.org/bugzilla/show_bug.cgi?id=6878 It has already been fixed for the upcoming Jun2020-SP1 release. We are going to release it within the next few days. Best regards, Pedro On 7/17/20 1:52 PM, Groot, Frank (Sociale Verzekeringsbank) wrote:
Hi all,
We are using the June 2020 release, trying to query a table with 24 mio rows and join each row from this ‘left’ table with one row from a right table according to a specific join clause. This clause does not result in a carthesian product.
The query structure is as follows:
select left.*
from myschema.mytable left
inner join otherschema.yourtable right
on left.somedatefield >= right.fromdate
and left.somedatefield <= right.untildate
where left.otherdatefield is null ;
This query results in following error in the merovingian.log:
2020-07-17 11:58:00 MSG merovingian[401281]: database 'datalake' (-1) has crashed with signal SIGSEGV (dumped core)
However, if I change the query like below, making the left table a derived table and sorting the date field used in the join, the error doesn’t occur:
select left.*
from *(select * from myschema.mytable order by somedatefield)* left
inner join otherschema.yourtable right
on left.somedatefield >= right.fromdate
and left.somedatefield <= right.untildate
where left.otherdatefield is null ;
When I tested the original query with a subset from the left table there was a threshold of a certain amount of rows (in this case 26 rows) below which the query ran fine.
Also, repeating the test on a different ‘left’ table with ~200k rows didn’t result in a crash.
Could it be that there’s been a change in the query pipeline or join handling?
Kind regards,
Frank
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
participants (2)
-
ferreira
-
Groot, Frank (Sociale Verzekeringsbank)