Vasilis Vassalos wrote:
Hello again,
for the student project I've told you about, we need to perform range joins, so we tried some on MonetDB. We're having trouble getting range joins on strings working properly.
Michalis, my student, has looked into it and reports that ALGrangejoin (algebra.c) calls BATrangejoin (gdk_rangejoin.c), which performs the join and returns a BAT as a result. BATrangejoin has a switch with a case for TYPE_chr that compares strings.

You are really delving deep into the sources ;) An area only meant for the brave at heart.
When we try to perform a range join on a string attribute (char(n), even for n=1, or varchar(n)), we get an exception:
!MALException:algebra.rangejoin:GDKerror
!ERROR: BATrangejoin: type not implemented
The switch statement skips the TYPE_chr case and goes to the default. For example,
sql>create table s1 (id int, v char(10));
sql>create table s2 (id int, v char(10));
sql>insert into s1 values (1,'s1');
sql>insert into s1 values (2,'s2');
sql>insert into s1 values (3,'s3');
sql>insert into s1 values (4,'s4');
sql>insert into s1 values (5,'s5');
sql>insert into s1 values (6,'s6');
sql>insert into s2 values (1,'s1');
sql>insert into s2 values (2,'s2');
sql>insert into s2 values (3,'s3');
sql>insert into s2 values (4,'s4');
sql>insert into s2 values (5,'s5');
sql>insert into s2 values (6,'s6');
sql>select s1.id from s1, s2 where s1.v between s2.v and s2.v||'c' ;
!MALException:algebra.rangejoin:GDKerror
!ERROR: BATrangejoin: type not implemented

This is an instance of the more general theta join. We will have a look into the issue. It certainly should not have fallen back to an error this low in the system.

whereas the same kind of range join on the integer column works:
sql>select s1.id from s1, s2 where s2.id=2 and s1.id between s2.id and s2.id+1 ;
+----+
| id |
+====+
|  2 |
|  3 |
+----+
Thanks!
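
For reference, the semantics that both queries above ask for can be sketched as a plain nested-loop theta join. This is only an illustration in Python, not MonetDB's BATrangejoin: the helper name range_join and the (id, value) tuple layout are made up for this sketch, and Python's code-point string ordering is assumed to stand in for the SQL comparison on char(10) (padding and collation are ignored).

```python
def range_join(left, right, low, high):
    """Nested-loop range join (hypothetical helper, not a MonetDB API).

    Pairs every left row with every right row whose interval
    [low(r_val), high(r_val)] contains the left value.
    """
    return [
        (l_id, r_id)
        for l_id, l_val in left
        for r_id, r_val in right
        if low(r_val) <= l_val <= high(r_val)
    ]


# String case: s1.v between s2.v and s2.v || 'c'
s1 = [(i, f"s{i}") for i in range(1, 7)]
s2 = [(i, f"s{i}") for i in range(1, 7)]
string_ids = [
    l_id
    for l_id, _ in range_join(s1, s2, low=lambda v: v, high=lambda v: v + "c")
]

# Integer case: s2.id = 2 and s1.id between s2.id and s2.id + 1
s1_ids = [(i, i) for i in range(1, 7)]
s2_ids = [(i, i) for i in range(1, 7) if i == 2]
int_ids = [
    l_id
    for l_id, _ in range_join(s1_ids, s2_ids, low=lambda v: v, high=lambda v: v + 1)
]

print(string_ids)
print(int_ids)
```

Under these assumptions the integer case reproduces the ids 2 and 3 shown above, and the string case would pair each s1 row only with the s2 row holding the same value.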