[MonetDB-users] Query time takes equally long on whole and just small part of table
Hello,

I'm new to MonetDB and installed it two weeks ago to do some tests. I was very impressed with the speed on large tables, but in some of my latest tests on subsets of these tables, selected by a Unix timestamp, the query time was the same as for querying the whole table. Does anyone know what I can do to get higher speeds on queries in MonetDB that only use part of a table, e.g. partition pruning, explicit foreign key or unique constraints, indexes, or some other way?

I have the following two tables and record counts:

CREATE TABLE table1 (T1vid INT NOT NULL, T1Field1 VARCHAR(20) NOT NULL, T1Field2 TINYINT NOT NULL);

This table has about 400,000 records. The records are not time-bounded; it is just a list. T1vid is an incrementing ID, but it is currently generated in MySQL, not in MonetDB.

CREATE TABLE table2 (T2mid INT NOT NULL, T2vid INT NOT NULL, T2timestamp INT NOT NULL, T2Field1 INT NOT NULL, T2Field2 INT NOT NULL);

This table has about 1,000,000 records and a Unix timestamp field. It gets a few new records every second, and records older than 24 hours are deleted. The T2mid field refers to a table that is not in MonetDB's database, so it is just an integer. The T2vid field, however, refers to the T1vid field in table1 (a foreign key, but I didn't define it that way).

The following query takes about 3.5 seconds; it does not have the timestamp in the WHERE clause:

SELECT SUM(t2.T2Field2), t1.T1Field1, t1.T1Field2, (t2.T2timestamp / 3600) AS interval
FROM table2 AS t2
LEFT JOIN table1 AS t1 ON t1.T1vid = t2.T2vid
WHERE t2.T2Field1 = 8
  AND t2.T2vid IN (list of 15 unique T1vid IDs)
GROUP BY interval, t1.T1Field1, t1.T1Field2
ORDER BY interval DESC;

The result is a list of the 15 T1vid IDs times the number of intervals. T2timestamp is divided by 3600 and grouped on that result, so each set of 15 T1vid IDs reflects one hour, returning at most 24x15 rows.

The following query also takes about 3.5 seconds, but it does have the timestamp in the WHERE clause. It only needs to access 1 hour of data instead of the whole 24 hours, and it groups the data into intervals of 5 minutes (300 seconds):

SELECT SUM(t2.T2Field2), t1.T1Field1, t1.T1Field2, (t2.T2timestamp / 300) AS interval
FROM table2 AS t2
LEFT JOIN table1 AS t1 ON t1.T1vid = t2.T2vid
WHERE t2.T2timestamp BETWEEN 1298360000 AND 1298363600
  AND t2.T2Field1 = 8
  AND t2.T2vid IN (list of 15 unique T1vid IDs)
GROUP BY interval, t1.T1Field1, t1.T1Field2
ORDER BY interval DESC;

T2timestamp is now divided by 300, so each set of 15 T1vid IDs reflects 5 minutes, returning at most 12x15 rows.

I would very much appreciate any help or hints; please let me know if you have any questions.

Kind regards,
Rob Berentsen
Hi Rob:
I am no technical expert on MonetDB, but we have been using MonetDB for
more than a year now, so I can give you a couple of suggestions
(or workarounds) that might help, based on our experience. We are using a
third-party application that generates queries and sends them to MonetDB,
which has a transaction table with more than 1.6 billion records and growing
(time-series data like yours) linked to dimension tables. The queries being
generated are almost exactly the type you are showing: some sort of
aggregation query using GROUP BY and filtered with a WHERE clause.
One thing that we noticed with queries that join tables is that an inner join
seems to be significantly faster than left/right joins, even when the results
are no different between the three join types. I know that might not be an
option for you, but it works for our use case.
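(As an illustration, and assuming every T2vid in your list has a matching T1vid in table1 so that the join type does not change the result, your first query could be rewritten like this:

SELECT SUM(t2.T2Field2), t1.T1Field1, t1.T1Field2, (t2.T2timestamp / 3600) AS interval
FROM table2 AS t2
INNER JOIN table1 AS t1 ON t1.T1vid = t2.T2vid    -- inner join instead of LEFT JOIN
WHERE t2.T2Field1 = 8
  AND t2.T2vid IN (...)                           -- the 15 unique T1vid IDs
GROUP BY interval, t1.T1Field1, t1.T1Field2
ORDER BY interval DESC;
)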
Another thing that we noticed is that if you add constraints, i.e. primary and
foreign keys, you will get better performance. I think you might already
have them in your tables, though...
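(A minimal sketch of how those keys could be declared after the fact, using standard SQL constraint syntax; the constraint names are made up here, and this assumes T1vid is unique in table1 and that every T2vid value actually exists in table1:

ALTER TABLE table1 ADD CONSTRAINT table1_pk PRIMARY KEY (T1vid);
ALTER TABLE table2 ADD CONSTRAINT table2_t2vid_fk FOREIGN KEY (T2vid) REFERENCES table1 (T1vid);
)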
We are in the process of testing the Oct-2010 SP1 release so we can upgrade
our Nov-2009 SP2 environment. We are seeing a significant performance
improvement in most of our typical queries and a much smaller memory footprint
with the Oct-2010 SP1 release, but we do see some performance degradation with
queries based on views instead of tables. You might want to try the Nov-2009
optimizer and see if that helps. You can do this by specifying the Nov-2009
optimizer in the monetdb5.conf file.
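(A rough sketch of what that setting might look like; the exact property and pipeline names can differ between releases, so check the optimizer section of your own monetdb5.conf. In the config file, something like

# select the Nov-2009 optimizer pipeline
sql_optimizer=nov2009_pipe

or, per SQL session,

SET optimizer = 'nov2009_pipe';

should switch to the Nov-2009 pipeline, if it is available in your build.)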
Hope that helps!
Best regards,
Henry
Hello Henry,
Thank you for your reply! See inline.
On Thu, Feb 24, 2011 at 7:54 AM, Henry Addington wrote:
I am no technical expert on MonetDB, but we have been using MonetDB for more than a year now, so I can give you a couple of suggestions (or workarounds) that might help, based on our experience. We are using a third-party application that generates queries and sends them to MonetDB, which has a transaction table with more than 1.6 billion records and growing (time-series data like yours) linked to dimension tables. The queries being generated are almost exactly the type you are showing: some sort of aggregation query using GROUP BY and filtered with a WHERE clause.
One thing that we noticed with queries that join tables is that an inner join seems to be significantly faster than left/right joins, even when the results are no different between the three join types. I know that might not be an option for you, but it works for our use case.
I've changed my queries into two separate ones: one for the total time scope, and one for (occasionally) getting the results spread over the intervals for one item. The new total-time-scope query is faster on a smaller time scope, and as fast as before on the full 24-hour scope, so definitely an improvement for me.
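(For reference, a hypothetical sketch of such a split; the exact shape of the two queries depends on what the front end needs, and 42 just stands in for a single T1vid value:

-- totals per item over the selected time scope
SELECT SUM(t2.T2Field2), t1.T1Field1, t1.T1Field2
FROM table2 AS t2
INNER JOIN table1 AS t1 ON t1.T1vid = t2.T2vid
WHERE t2.T2timestamp BETWEEN 1298360000 AND 1298363600
  AND t2.T2Field1 = 8
  AND t2.T2vid IN (...)                           -- the 15 unique T1vid IDs
GROUP BY t1.T1Field1, t1.T1Field2;

-- per-interval breakdown for one item, run only when needed
SELECT SUM(t2.T2Field2), (t2.T2timestamp / 300) AS interval
FROM table2 AS t2
WHERE t2.T2timestamp BETWEEN 1298360000 AND 1298363600
  AND t2.T2Field1 = 8
  AND t2.T2vid = 42                               -- hypothetical single T1vid value
GROUP BY interval
ORDER BY interval DESC;
)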
Another thing that we noticed is that if you add constraints, i.e. primary and foreign keys, you will get better performance. I think you might already have them in your tables, though...
I do have them, but I didn't declare them in my CREATE statements. I'm also going to test whether this adds extra speed.
We are in the process of testing the Oct-2010 SP1 release so we can upgrade our Nov-2009 SP2 environment. We are seeing a significant performance improvement in most of our typical queries and a much smaller memory footprint with the Oct-2010 SP1 release, but we do see some performance degradation with queries based on views instead of tables. You might want to try the Nov-2009 optimizer and see if that helps. You can do this by specifying the Nov-2009 optimizer in the monetdb5.conf file.
This didn't show any difference for the original query, but thanks for mentioning it; I might need it in the future.

Kind regards,
Rob Berentsen
participants (2)
- Henry Addington
- Rob Berentsen