I see, thanks Sjoerd.
As a follow up, I'd like to share this - I hope it can be interesting.
The short summary is : I found that FP-key constraints can hurt performance (at least in this case, I didn't make a comprehensive test), while I had always assumed they would help.
CREATE TABLE "dict" (
"termid" INTEGER NOT NULL,
"term" CHARACTER LARGE OBJECT NOT NULL,
CONSTRAINT "dict" PRIMARY KEY ("termid"),
);
CREATE TABLE "termdoc" (
"termid" INTEGER,
"docid" INTEGER,
CONSTRAINT "termdoc_termid_fkey" FOREIGN KEY ("termid") REFERENCES "dict" ("termid")
);
This is the query:
select termid,docid
from termdoc, (select termid from dict where termid=100) as q
where termdoc.termid=q.termid;
I list here 4 cases:
1) with FP constraint, table "termdoc" not sorted on termid.
This query performs a foregign-primary key join. The explain shows indeed that the _idxbat is used.
This executes in about 150ms (the exact hw/sw specs are not important, this timing is only used for comparison).
2) without FP constraint, table "termdoc" not sorted on termid.
If I remove the constraint, the same query executes in about 250ms. That's what I would expect. The table wasn't sorted, so removing the constraints means hashing needs to take place.
3) without FP constraint, table "termdoc" sorted on termid.
Still without constraint, but now the table is stored with "ORDER BY termid".
This takes about 60ms. Which confirms what you explained earlier, Sjoerd. The bat isn't marked as sorted, but the join finds out and exploits that anyway. Good.
4) with FP constraint, table "termdoc" sorted on termid.
Now I re-introduce the FP-key constraint, whith the table still sorted on "termid".
This takes again 150ms. That is, the usage of the _idxbat has priority over the fact that we have a more useful sorting already. I think I understand why: using or not the FP-key constraint for the join is a plan-generation-time decision, while using the sort information is an execution-time decision - which is too late, the plan as already been written for the FP-key path. That is quite unfortunate.
Which takes me to the question: couldn't that have been solved at plan-generation-time, if the sorting information had been marked in the persistent bat as a result of the ORDER BY clause in the CREATE TABLE statement? (I understand why that information is lost due to indirection in the plan, but I suspect it could be retained with some more effort if needed).
Roberto.