11 Apr
2007
1:54 p.m.
Hmm, this sounds like a triple store. If the number of distinct link-ids is small enough (<100K), I would seriously consider horizontal fragmentation on link-id as a starting point. The automatic hash on each (from_id,to_id) BAT then gives you the answer quickly.
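To make the fragmentation idea concrete, here is a rough sketch in plain Python (not MonetDB's MAL). It only illustrates the lookup pattern: one fragment per link-id, with a hash on from_id inside each fragment. The names `fragment_by_link_id` and `targets` are made up for this example, and a Python dict stands in for the hash that the kernel would build on a BAT.

```python
from collections import defaultdict


def fragment_by_link_id(triples):
    """Split (link_id, from_id, to_id) rows into one fragment per link_id.

    Each fragment is a dict keyed on from_id, standing in for a hashed
    (from_id, to_id) table. This is an illustration, not MonetDB code.
    """
    fragments = defaultdict(lambda: defaultdict(list))
    for link_id, from_id, to_id in triples:
        fragments[link_id][from_id].append(to_id)
    return fragments


def targets(fragments, link_id, from_id):
    """Return all to_ids reachable from from_id via the given link_id."""
    fragment = fragments.get(link_id)  # .get avoids creating empty fragments
    if fragment is None:
        return []
    return fragment.get(from_id, [])


if __name__ == "__main__":
    rows = [(1, 10, 20), (1, 10, 21), (2, 10, 30)]
    frags = fragment_by_link_id(rows)
    print(targets(frags, 1, 10))
```

A lookup touches only the fragment for the requested link-id, so the cost depends on the size of that fragment rather than on the total number of links.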
We normally have between 25 and 100 million links... ;-)
How volatile is your data? If it rarely changes, simply sort them on the oid, and the hash will bring you quickly to the desired place.
The data change rarely. So, do you think it'd be good to keep three sorted BATs (one for link_id, one for from, and one for to) and use the appropriate ones? Thanks, -- A
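For reference, the three-sorted-copies idea from the question could look roughly like the following in plain Python (again not MAL, and only a sketch). It assumes the data is static, keeps one sorted key list per column, and uses binary search from the standard `bisect` module instead of a hash. The class name `SortedLinkIndex` and the column constants are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Column positions within a (link_id, from_id, to_id) row (illustrative names).
LINK_ID, FROM_ID, TO_ID = 0, 1, 2


class SortedLinkIndex:
    """Keeps one sorted ordering per column over a static set of rows."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.by_col = []
        for col in range(3):
            # Sort (key, row position) pairs once; the data rarely changes.
            pairs = sorted((row[col], pos) for pos, row in enumerate(self.rows))
            keys = [key for key, _ in pairs]
            positions = [pos for _, pos in pairs]
            self.by_col.append((keys, positions))

    def lookup(self, col, value):
        """Return all rows whose column `col` equals `value`."""
        keys, positions = self.by_col[col]
        lo = bisect_left(keys, value)   # first entry equal to value
        hi = bisect_right(keys, value)  # one past the last equal entry
        return [self.rows[pos] for pos in positions[lo:hi]]


if __name__ == "__main__":
    index = SortedLinkIndex([(7, 1, 2), (8, 2, 3), (9, 1, 3)])
    print(index.lookup(FROM_ID, 1))
```

The trade-off is roughly three times the index storage in exchange for a logarithmic lookup on any of the three columns, which only pays off while updates stay rare.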