On Wed, Apr 30, 2008 at 03:58:11PM +0200, Nils Grimsmo wrote:
I have tables with around 12 million rows and large primary keys. I find the performance for various queries to be rather strange.
* Why is Q2 (see below) so much slower than Q1? It is actually faster for me to use Q1 and do the rest of the matching myself.
* Why is Q3 so immensely much faster than Q1? I understand that the first part of the primary key is selective in Q3. Do I understand right if Q1 scans until "type" matches, then scans until "name" matches, then scans until "value" matches? No indexes are put on disk, right? Then the matching for Q3 should be scan until "value" matches, which shouldn't be more than at most three times faster, right (if three columns are read instead of one for Q1)? I do Q1 twice to look for cash effects, but none are seen... Could some columns from the table "value" be cached and kept, without giving room for those from "type_name_value"?
* Q4 has the same performance as Q1, as expected.
* Why isn't Q5 faster than Q6? 82% of the rows have type=3. Fewer should be scanned to find that "!!foo" doesn't exist than for "~~foo", as "!" has a low ASCII value, while "~" is high. (See COUNT queries at the bottom.)
The performance differences are related to the order in which the where conditions are applied. Please send us also the table create statement, then we can use 'explain' to see which order is taken. You could also run your queries prefixed with 'trace' to see where the time goes. Niels
Klem fra Nils
Q1: SELECT * FROM type_name_value WHERE type = 3 AND name = 'author' AND value = 'Jurgen Annevelink' Q2: SELECT * FROM type_name_value WHERE type = 3 AND name = 'author' AND value = 'Jurgen Annevelink' AND doc = 1 AND nenc = '1.1.1.1' AND pos = 4 Q3: SELECT * FROM value WHERE value = 'Jurgen Annevelink' AND type = 3 AND name = 'author' Q4: SELECT * FROM type_name_value WHERE type = 3 AND name = 'author' AND value = '__bar' Q5: SELECT * FROM type_name_value WHERE type = 3 AND name = '!!foo' AND value = '__bar' Q6: SELECT * FROM type_name_value WHERE type = 3 AND name = '~~foo' AND value = '__bar' Q7: SELECT * FROM type_name_value WHERE type = 3000 AND name = '__foo' AND value = '__bar'
See runs below.
Number of rows SELECT COUNT(type) FROM type_name_value % nilsgri.type_name_value # table_name % count_no_nil_type # name % int # type % 8 # length [ 12165109 ] Query time: 0.066192 SELECT COUNT(type) FROM value % nilsgri.value # table_name % count_no_nil_type # name % int # type % 8 # length [ 12165109 ] Query time: 0.058111
The PRIMARY KEY of type_name_value is (type,name,value,doc,nenc,pos) The PRIMARY KEY of value is (value,doc,nenc,pos)
Q1: Search 3 first, match 3 first. SELECT * FROM type_name_value WHERE type = 3 AND name = 'author' AND value = 'Jurgen Annevelink' % nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value # table_name % type, name, value, doc, nenc, pos, pathid, complete # name % int, varchar, clob, int, varchar, int, int, boolean # type % 1, 6, 17, 1, 12, 8, 3, 5 # length [ 3, "author", "Jurgen Annevelink", 1, "1.1.1.1", 4, 5, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.23579.2.1", 724888, 56, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.2487.3.1", 81996, 5, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.42.2.1", 1071, 5, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.533986.2.1", 17294725, 56, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.576139.3.1", 18745432, 56, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.944085.1.1", 30104242, 107, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.99173.2.1", 3191470, 56, true ] Query time: 1.643438
Q1: Repeat to check for disk cache effects. SELECT * FROM type_name_value WHERE type = 3 AND name = 'author' AND value = 'Jurgen Annevelink' % nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value # table_name % type, name, value, doc, nenc, pos, pathid, complete # name % int, varchar, clob, int, varchar, int, int, boolean # type % 1, 6, 17, 1, 12, 8, 3, 5 # length [ 3, "author", "Jurgen Annevelink", 1, "1.1.1.1", 4, 5, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.23579.2.1", 724888, 56, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.2487.3.1", 81996, 5, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.42.2.1", 1071, 5, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.533986.2.1", 17294725, 56, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.576139.3.1", 18745432, 56, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.944085.1.1", 30104242, 107, true ] [ 3, "author", "Jurgen Annevelink", 1, "1.99173.2.1", 3191470, 56, true ] Query time: 1.615900
Q2: Search entire, match entire. SELECT * FROM type_name_value WHERE type = 3 AND name = 'author' AND value = 'Jurgen Annevelink' AND doc = 1 AND nenc = '1.1.1.1' AND pos = 4 % nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value # table_name % type, name, value, doc, nenc, pos, pathid, complete # name % int, varchar, clob, int, varchar, int, int, boolean # type % 1, 6, 17, 1, 7, 1, 1, 5 # length [ 3, "author", "Jurgen Annevelink", 1, "1.1.1.1", 4, 5, true ] Query time: 11.073827
Q3: Use table value instead. SELECT * FROM value WHERE value = 'Jurgen Annevelink' AND type = 3 AND name = 'author' % nilsgri.value, nilsgri.value, nilsgri.value, nilsgri.value, nilsgri.value, nilsgri.value, nilsgri.value, nilsgri.value # table_name % value, doc, nenc, pos, type, name, pathid, complete # name % clob, int, varchar, int, int, varchar, int, boolean # type % 17, 1, 12, 8, 1, 6, 3, 5 # length [ "Jurgen Annevelink", 1, "1.1.1.1", 4, 3, "author", 5, true ] [ "Jurgen Annevelink", 1, "1.23579.2.1", 724888, 3, "author", 56, true ] [ "Jurgen Annevelink", 1, "1.2487.3.1", 81996, 3, "author", 5, true ] [ "Jurgen Annevelink", 1, "1.42.2.1", 1071, 3, "author", 5, true ] [ "Jurgen Annevelink", 1, "1.533986.2.1", 17294725, 3, "author", 56, true ] [ "Jurgen Annevelink", 1, "1.576139.3.1", 18745432, 3, "author", 56, true ] [ "Jurgen Annevelink", 1, "1.944085.1.1", 30104242, 3, "author", 107, true ] [ "Jurgen Annevelink", 1, "1.99173.2.1", 3191470, 3, "author", 56, true ] Query time: 0.000477
Q4: Search 3 first, match 2 first. SELECT * FROM type_name_value WHERE type = 3 AND name = 'author' AND value = '__bar' % nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value # table_name % type, name, value, doc, nenc, pos, pathid, complete # name % int, varchar, clob, int, varchar, int, int, boolean # type % 1, 0, 0, 1, 0, 1, 1, 5 # length Query time: 1.617259
Q5: Search 3 first, match 1 first. '!' early in alphabet. SELECT * FROM type_name_value WHERE type = 3 AND name = '!!foo' AND value = '__bar' % nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value # table_name % type, name, value, doc, nenc, pos, pathid, complete # name % int, varchar, clob, int, varchar, int, int, boolean # type % 1, 0, 0, 1, 0, 1, 1, 5 # length Query time: 0.530727
Q6: Search 3 first, match 1 first. '~' late in alphabet. SELECT * FROM type_name_value WHERE type = 3 AND name = '~~foo' AND value = '__bar' % nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value # table_name % type, name, value, doc, nenc, pos, pathid, complete # name % int, varchar, clob, int, varchar, int, int, boolean # type % 1, 0, 0, 1, 0, 1, 1, 5 # length Query time: 0.534592
Q7: Search 3 first, match 0 first. SELECT * FROM type_name_value WHERE type = 3000 AND name = '__foo' AND value = '__bar' % nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value, nilsgri.type_name_value # table_name % type, name, value, doc, nenc, pos, pathid, complete # name % int, varchar, clob, int, varchar, int, int, boolean # type % 1, 0, 0, 1, 0, 1, 1, 5 # length Query time: 0.000407
Number of rows with type and name mathcing. SELECT COUNT(type) FROM type_name_value WHERE type = 3 AND name = 'author' % nilsgri.type_name_value # table_name % count_no_nil_type # name % int # type % 7 # length [ 2730959 ] Query time: 0.683863
Number of rows with type mathcing. SELECT COUNT(type) FROM type_name_value WHERE type = 3 % nilsgri.type_name_value # table_name % count_no_nil_type # name % int # type % 8 # length [ 10020569 ] Query time: 0.032573
SELECT COUNT(type) FROM type_name_value WHERE type = 3 AND name < '!!foo' % nilsgri.type_name_value # table_name % count_no_nil_type # name % int # type % 1 # length [ 0 ] Query time: 0.161186
SELECT COUNT(type) FROM type_name_value WHERE type = 3 AND name < '~~foo' % nilsgri.type_name_value # table_name % count_no_nil_type # name % int # type % 8 # length [ 10020569 ] Query time: 0.341732
Number of rows with type mathcing.
real 0m18.372s user 0m0.005s sys 0m0.002s
------------------------------------------------------------------------- This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javao... _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl