Hi there, Triggered by a suspiciously slow join, I had a look at BATsubjoin, in particular at how it is decided which algorithm to use. I looked at a couple of cases with a join on strings. This is one (pseudo syntax): l.count ~16M l.T.sorted = 0 l.T.revsorted = 0 r.count = 1 r.T.sorted = 1 r.T.revsorted = 1 In this case, a hash table on r is created (because it's smaller). This join takes *430ms*. I forced swapping l and r, thus built the hash table on the larger bat, and then it takes *0.8ms*. Notice that though r.count = 1 is a bit of an extreme case, it does happen often in practice. And the case with a few tuples would not be very different. The question I wanted to ask is: is it always right to build the hash table on the small table? Perhaps some more heuristics can help? Like: if the smaller is very small, and the larger is not extremely large, then hash on the larger one? Roberto