
Hi Joris, On 08-09-2011 15:12:48 +0200, Joris Slob wrote:
For large tables (>1M rows), I encounter strange behavior when using the COUNT function in SQL in the mclient.
When I do the following query:
SELECT DISTINCT(predicate) FROM facts;
I get a nice resultset of 31 tuples in 20ms.
When I try the following query:
SELECT COUNT(DISTINCT(predicate)) FROM facts;
I get my answer in 565ms.
It seems weird that the database doesn't 'optimize' this count to:
SELECT COUNT(*) FROM (SELECT DISTINCT(predicate) FROM facts) AS b;
I get my answer in 20ms.
That is very well possible. Please file a bug and attach the outputs of your queries prefixed with 'PLAN' and 'EXPLAIN'. They should clarify what's going wrong.
This is my first attempt to make a knowledge store and I am aware there is lots I can do to optimize how I store the data. I was just surprised that the count took significantly longer than getting all the results out and counting them.
My first guess would be that the optimisers take a wrong option here, and do it the long and lengthy way. Regards, Fabian Groffen