[MonetDB-users] monetdb sql - same query different results
Hi, Same query produces different results: sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql> Please help.
Darius, to have any chance to help, we'd need to be able to reproduce the problem, which means we'd require sufficient information to do so. Such information includes (in addition to the queries you gave) also - the version of MonetDB that your using and what kind of system you're running on (the complete output of `mserver5 --version` would be perfrect) - the shcema of your databasem; at least of the table involved in the given query - if possible at least some sample data Please also consider filing a bug report that contains all this information to reproduce the problem. Regrads, Stefan On Tue, Apr 13, 2010 at 02:11:23PM -0400, dariuszs wrote:
Hi, Same query produces different results:
sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql>
Please help.
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4199 |
sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +-------+ | L31 | +=======+ | 26152 | +-------+ 1 tuple Timer 14.874 msec 1 rows sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +--------+ | L31 | +========+ | 103081 | +--------+ 1 tuple Timer 12.718 msec 1 rows sql>\d test CREATE TABLE "sys"."test" ( "c15" varchar(5), "c66" varchar(2) ); sql>\d zip_list CREATE TABLE "sys"."zip_list" ( "zip" varchar(5) ); sql> test.c15 has 3227 zip codes zip_list.zip has 696 zip codes sql>select c66,count(*) from test group by c66; +------+---------+ | c66 | L4 | +======+=========+ | 12 | 2097484 | | 14 | 2431062 | | 13 | 276387 | | 11 | 135711 | | 16 | 303217 | | 15 | 12048 | | 02 | 989701 | | 03 | 407467 | | 06 | 735483 | | 01 | 470888 | | 05 | 211905 | | 08 | 464530 | | 04 | 1260159 | | 07 | 115491 | | 10 | 77411 | | 09 | 11056 | +------+---------+ 16 tuples Timer 48.639 msec 16 rows sql> root@monetdb:/var/log# mserver5 --version MonetDB server v5.18.3 (64-bit), based on kernel v1.36.3 (64-bit oids) Copyright (c) 1993-July 2008 CWI Copyright (c) August 2008-2010 MonetDB B.V., all rights reserved Visit http://monetdb.cwi.nl/ for further information Found 47.3GiB available memory, 16 available cpu cores Configured for prefix: /usr Libraries: libpcre: 7.8 2008-09-05 (compiled with 7.8) openssl: OpenSSL 0.9.8g 19 Oct 2007 (compiled with OpenSSL 0.9.8g 19 Oct 2007) libxml2: 2.7.5 (compiled with 2.7.5) Compiled by: root@ubuntu-karmic-64 (x86_64-pc-linux-gnu) Compilation: gcc -O2 -Wall -O2 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize Linking : ld -IPA -m elf_x86_64 -Wl,-Bsymbolic-functions root@monetdb:/var/log# Thanks Stefan Manegold wrote:
Darius,
to have any chance to help, we'd need to be able to reproduce the problem, which means we'd require sufficient information to do so. Such information includes (in addition to the queries you gave) also
- the version of MonetDB that your using and what kind of system you're running on (the complete output of `mserver5 --version` would be perfrect)
- the shcema of your databasem; at least of the table involved in the given query
- if possible at least some sample data
Please also consider filing a bug report that contains all this information to reproduce the problem.
Regrads, Stefan
On Tue, Apr 13, 2010 at 02:11:23PM -0400, dariuszs wrote:
Hi, Same query produces different results:
sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql>
Please help.
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Darius, thanks for the details. Given your 16 core machine, I suspect some parallelisation problem (race condition, scheduling, etc.). Could you please run the following tests and report the results (possibly zipped, as they might become larger)? 1) Start your mserver5 with `--set gdk_nr_threads=1` to force single-threaded query evaluation, run your query multiple times, and check whether you still get varying results. 2) Still with the single-threaded server, run your query twice with prefix "PLAN", twice with prefix "EXPLAIN" and twice with prefix "TRACE" and check whether each pair of results per prefix is identical or differs. For those prefixes where the two results are identical, send one output, for those prefixes where the two results differ, please send both. 3) Restart your mserver5 without `--set gdk_nr_threads=1`, i.e., in normal multi-threaded mode, again, and repeat step 2) also for that case. Thanks! Stefan On Tue, Apr 13, 2010 at 03:00:14PM -0400, dariuszs wrote:
sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +-------+ | L31 | +=======+ | 26152 | +-------+ 1 tuple Timer 14.874 msec 1 rows sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +--------+ | L31 | +========+ | 103081 | +--------+ 1 tuple Timer 12.718 msec 1 rows sql>\d test CREATE TABLE "sys"."test" ( "c15" varchar(5), "c66" varchar(2) ); sql>\d zip_list CREATE TABLE "sys"."zip_list" ( "zip" varchar(5) ); sql>
test.c15 has 3227 zip codes zip_list.zip has 696 zip codes
sql>select c66,count(*) from test group by c66; +------+---------+ | c66 | L4 | +======+=========+ | 12 | 2097484 | | 14 | 2431062 | | 13 | 276387 | | 11 | 135711 | | 16 | 303217 | | 15 | 12048 | | 02 | 989701 | | 03 | 407467 | | 06 | 735483 | | 01 | 470888 | | 05 | 211905 | | 08 | 464530 | | 04 | 1260159 | | 07 | 115491 | | 10 | 77411 | | 09 | 11056 | +------+---------+ 16 tuples Timer 48.639 msec 16 rows sql>
root@monetdb:/var/log# mserver5 --version MonetDB server v5.18.3 (64-bit), based on kernel v1.36.3 (64-bit oids) Copyright (c) 1993-July 2008 CWI Copyright (c) August 2008-2010 MonetDB B.V., all rights reserved Visit http://monetdb.cwi.nl/ for further information Found 47.3GiB available memory, 16 available cpu cores Configured for prefix: /usr Libraries: libpcre: 7.8 2008-09-05 (compiled with 7.8) openssl: OpenSSL 0.9.8g 19 Oct 2007 (compiled with OpenSSL 0.9.8g 19 Oct 2007) libxml2: 2.7.5 (compiled with 2.7.5) Compiled by: root@ubuntu-karmic-64 (x86_64-pc-linux-gnu) Compilation: gcc -O2 -Wall -O2 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize Linking : ld -IPA -m elf_x86_64 -Wl,-Bsymbolic-functions root@monetdb:/var/log#
Thanks
Stefan Manegold wrote:
Darius,
to have any chance to help, we'd need to be able to reproduce the problem, which means we'd require sufficient information to do so. Such information includes (in addition to the queries you gave) also
- the version of MonetDB that your using and what kind of system you're running on (the complete output of `mserver5 --version` would be perfrect)
- the shcema of your databasem; at least of the table involved in the given query
- if possible at least some sample data
Please also consider filing a bug report that contains all this information to reproduce the problem.
Regrads, Stefan
On Tue, Apr 13, 2010 at 02:11:23PM -0400, dariuszs wrote:
Hi, Same query produces different results:
sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql>
Please help.
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4199 |
I'll do that tomorrow In the mean time, if I change this select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); to this select count(*) from test t1 join zip_list t2 on (t1.c15=t2.zip) and c66 in ('01'); Everything is fine. It looks like the problem is with subquery. Dariusz. Stefan Manegold wrote:
Darius,
thanks for the details. Given your 16 core machine, I suspect some parallelisation problem (race condition, scheduling, etc.).
Could you please run the following tests and report the results (possibly zipped, as they might become larger)?
1) Start your mserver5 with `--set gdk_nr_threads=1` to force single-threaded query evaluation, run your query multiple times, and check whether you still get varying results.
2) Still with the single-threaded server, run your query twice with prefix "PLAN", twice with prefix "EXPLAIN" and twice with prefix "TRACE" and check whether each pair of results per prefix is identical or differs. For those prefixes where the two results are identical, send one output, for those prefixes where the two results differ, please send both.
3) Restart your mserver5 without `--set gdk_nr_threads=1`, i.e., in normal multi-threaded mode, again, and repeat step 2) also for that case.
Thanks!
Stefan
On Tue, Apr 13, 2010 at 03:00:14PM -0400, dariuszs wrote:
sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +-------+ | L31 | +=======+ | 26152 | +-------+ 1 tuple Timer 14.874 msec 1 rows sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +--------+ | L31 | +========+ | 103081 | +--------+ 1 tuple Timer 12.718 msec 1 rows sql>\d test CREATE TABLE "sys"."test" ( "c15" varchar(5), "c66" varchar(2) ); sql>\d zip_list CREATE TABLE "sys"."zip_list" ( "zip" varchar(5) ); sql>
test.c15 has 3227 zip codes zip_list.zip has 696 zip codes
sql>select c66,count(*) from test group by c66; +------+---------+ | c66 | L4 | +======+=========+ | 12 | 2097484 | | 14 | 2431062 | | 13 | 276387 | | 11 | 135711 | | 16 | 303217 | | 15 | 12048 | | 02 | 989701 | | 03 | 407467 | | 06 | 735483 | | 01 | 470888 | | 05 | 211905 | | 08 | 464530 | | 04 | 1260159 | | 07 | 115491 | | 10 | 77411 | | 09 | 11056 | +------+---------+ 16 tuples Timer 48.639 msec 16 rows sql>
root@monetdb:/var/log# mserver5 --version MonetDB server v5.18.3 (64-bit), based on kernel v1.36.3 (64-bit oids) Copyright (c) 1993-July 2008 CWI Copyright (c) August 2008-2010 MonetDB B.V., all rights reserved Visit http://monetdb.cwi.nl/ for further information Found 47.3GiB available memory, 16 available cpu cores Configured for prefix: /usr Libraries: libpcre: 7.8 2008-09-05 (compiled with 7.8) openssl: OpenSSL 0.9.8g 19 Oct 2007 (compiled with OpenSSL 0.9.8g 19 Oct 2007) libxml2: 2.7.5 (compiled with 2.7.5) Compiled by: root@ubuntu-karmic-64 (x86_64-pc-linux-gnu) Compilation: gcc -O2 -Wall -O2 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize Linking : ld -IPA -m elf_x86_64 -Wl,-Bsymbolic-functions root@monetdb:/var/log#
Thanks
Stefan Manegold wrote:
Darius,
to have any chance to help, we'd need to be able to reproduce the problem, which means we'd require sufficient information to do so. Such information includes (in addition to the queries you gave) also
- the version of MonetDB that your using and what kind of system you're running on (the complete output of `mserver5 --version` would be perfrect)
- the shcema of your databasem; at least of the table involved in the given query
- if possible at least some sample data
Please also consider filing a bug report that contains all this information to reproduce the problem.
Regrads, Stefan
On Tue, Apr 13, 2010 at 02:11:23PM -0400, dariuszs wrote:
Hi, Same query produces different results:
sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql>
Please help.
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
select count(*) from test t1 join zip_list t2 on (t1.c15=t2.zip) where t1.c66 in ('01'); (that's what I meant) dariuszs wrote:
I'll do that tomorrow In the mean time, if I change this
select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01');
to this select count(*) from test t1 join zip_list t2 on (t1.c15=t2.zip) and c66 in ('01');
Everything is fine. It looks like the problem is with subquery. Dariusz.
Stefan Manegold wrote:
Darius,
thanks for the details. Given your 16 core machine, I suspect some parallelisation problem (race condition, scheduling, etc.).
Could you please run the following tests and report the results (possibly zipped, as they might become larger)?
1) Start your mserver5 with `--set gdk_nr_threads=1` to force single-threaded query evaluation, run your query multiple times, and check whether you still get varying results.
2) Still with the single-threaded server, run your query twice with prefix "PLAN", twice with prefix "EXPLAIN" and twice with prefix "TRACE" and check whether each pair of results per prefix is identical or differs. For those prefixes where the two results are identical, send one output, for those prefixes where the two results differ, please send both.
3) Restart your mserver5 without `--set gdk_nr_threads=1`, i.e., in normal multi-threaded mode, again, and repeat step 2) also for that case.
Thanks!
Stefan
On Tue, Apr 13, 2010 at 03:00:14PM -0400, dariuszs wrote:
sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +-------+ | L31 | +=======+ | 26152 | +-------+ 1 tuple Timer 14.874 msec 1 rows sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +--------+ | L31 | +========+ | 103081 | +--------+ 1 tuple Timer 12.718 msec 1 rows sql>\d test CREATE TABLE "sys"."test" ( "c15" varchar(5), "c66" varchar(2) ); sql>\d zip_list CREATE TABLE "sys"."zip_list" ( "zip" varchar(5) ); sql>
test.c15 has 3227 zip codes zip_list.zip has 696 zip codes
sql>select c66,count(*) from test group by c66; +------+---------+ | c66 | L4 | +======+=========+ | 12 | 2097484 | | 14 | 2431062 | | 13 | 276387 | | 11 | 135711 | | 16 | 303217 | | 15 | 12048 | | 02 | 989701 | | 03 | 407467 | | 06 | 735483 | | 01 | 470888 | | 05 | 211905 | | 08 | 464530 | | 04 | 1260159 | | 07 | 115491 | | 10 | 77411 | | 09 | 11056 | +------+---------+ 16 tuples Timer 48.639 msec 16 rows sql>
root@monetdb:/var/log# mserver5 --version MonetDB server v5.18.3 (64-bit), based on kernel v1.36.3 (64-bit oids) Copyright (c) 1993-July 2008 CWI Copyright (c) August 2008-2010 MonetDB B.V., all rights reserved Visit http://monetdb.cwi.nl/ for further information Found 47.3GiB available memory, 16 available cpu cores Configured for prefix: /usr Libraries: libpcre: 7.8 2008-09-05 (compiled with 7.8) openssl: OpenSSL 0.9.8g 19 Oct 2007 (compiled with OpenSSL 0.9.8g 19 Oct 2007) libxml2: 2.7.5 (compiled with 2.7.5) Compiled by: root@ubuntu-karmic-64 (x86_64-pc-linux-gnu) Compilation: gcc -O2 -Wall -O2 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize Linking : ld -IPA -m elf_x86_64 -Wl,-Bsymbolic-functions root@monetdb:/var/log#
Thanks
Stefan Manegold wrote:
Darius,
to have any chance to help, we'd need to be able to reproduce the problem, which means we'd require sufficient information to do so. Such information includes (in addition to the queries you gave) also
- the version of MonetDB that your using and what kind of system you're running on (the complete output of `mserver5 --version` would be perfrect)
- the shcema of your databasem; at least of the table involved in the given query
- if possible at least some sample data
Please also consider filing a bug report that contains all this information to reproduce the problem.
Regrads, Stefan
On Tue, Apr 13, 2010 at 02:11:23PM -0400, dariuszs wrote:
Hi, Same query produces different results:
sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql>
Please help.
------------------------------------------------------------------------------
Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------
Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi, Single thread does the trick, but is that the way I should run it all the time? I've got 2 cpu's with 4 cores each in this box (http://ark.intel.com/Product.aspx?id=37111). Thanks. Stefan Manegold wrote:
Darius,
thanks for the details. Given your 16 core machine, I suspect some parallelisation problem (race condition, scheduling, etc.).
Could you please run the following tests and report the results (possibly zipped, as they might become larger)?
1) Start your mserver5 with `--set gdk_nr_threads=1` to force single-threaded query evaluation, run your query multiple times, and check whether you still get varying results.
2) Still with the single-threaded server, run your query twice with prefix "PLAN", twice with prefix "EXPLAIN" and twice with prefix "TRACE" and check whether each pair of results per prefix is identical or differs. For those prefixes where the two results are identical, send one output, for those prefixes where the two results differ, please send both.
3) Restart your mserver5 without `--set gdk_nr_threads=1`, i.e., in normal multi-threaded mode, again, and repeat step 2) also for that case.
Thanks!
Stefan
On Tue, Apr 13, 2010 at 03:00:14PM -0400, dariuszs wrote:
sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +-------+ | L31 | +=======+ | 26152 | +-------+ 1 tuple Timer 14.874 msec 1 rows sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +--------+ | L31 | +========+ | 103081 | +--------+ 1 tuple Timer 12.718 msec 1 rows sql>\d test CREATE TABLE "sys"."test" ( "c15" varchar(5), "c66" varchar(2) ); sql>\d zip_list CREATE TABLE "sys"."zip_list" ( "zip" varchar(5) ); sql>
test.c15 has 3227 zip codes zip_list.zip has 696 zip codes
sql>select c66,count(*) from test group by c66; +------+---------+ | c66 | L4 | +======+=========+ | 12 | 2097484 | | 14 | 2431062 | | 13 | 276387 | | 11 | 135711 | | 16 | 303217 | | 15 | 12048 | | 02 | 989701 | | 03 | 407467 | | 06 | 735483 | | 01 | 470888 | | 05 | 211905 | | 08 | 464530 | | 04 | 1260159 | | 07 | 115491 | | 10 | 77411 | | 09 | 11056 | +------+---------+ 16 tuples Timer 48.639 msec 16 rows sql>
root@monetdb:/var/log# mserver5 --version MonetDB server v5.18.3 (64-bit), based on kernel v1.36.3 (64-bit oids) Copyright (c) 1993-July 2008 CWI Copyright (c) August 2008-2010 MonetDB B.V., all rights reserved Visit http://monetdb.cwi.nl/ for further information Found 47.3GiB available memory, 16 available cpu cores Configured for prefix: /usr Libraries: libpcre: 7.8 2008-09-05 (compiled with 7.8) openssl: OpenSSL 0.9.8g 19 Oct 2007 (compiled with OpenSSL 0.9.8g 19 Oct 2007) libxml2: 2.7.5 (compiled with 2.7.5) Compiled by: root@ubuntu-karmic-64 (x86_64-pc-linux-gnu) Compilation: gcc -O2 -Wall -O2 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize Linking : ld -IPA -m elf_x86_64 -Wl,-Bsymbolic-functions root@monetdb:/var/log#
Thanks
Stefan Manegold wrote:
Darius,
to have any chance to help, we'd need to be able to reproduce the problem, which means we'd require sufficient information to do so. Such information includes (in addition to the queries you gave) also
- the version of MonetDB that your using and what kind of system you're running on (the complete output of `mserver5 --version` would be perfrect)
- the shcema of your databasem; at least of the table involved in the given query
- if possible at least some sample data
Please also consider filing a bug report that contains all this information to reproduce the problem.
Regrads, Stefan
On Tue, Apr 13, 2010 at 02:11:23PM -0400, dariuszs wrote:
Hi, Same query produces different results:
sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql>
Please help.
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
On Thu, Apr 15, 2010 at 03:54:54PM -0400, dariuszs wrote:
Hi, Single thread does the trick, but is that the way I should run it all the time?
No, surely not! I only asked you to run single thread (and multi-threaded) and send us the requested results (PLAN: 'logical' query plan; EXPLAIN: 'physical' query plan; TRACE: detailed execution trace) for both, so that we can analyze the differences to find out what is going wrong in the multi-threaded case. Basically, since you have the data and the system running, I ask(ed) you to help us with what we call "debugging" or "forensics" or "CSI" ;-) Hope you don't mind "being exploited" like this ... ;-)
I've got 2 cpu's with 4 cores each in this box
... and hyperthreading enabled; at least mserver5 reports 16 cores --- and yes, both you and we want ot exploit them and keep them busy, while still getting correct results ;-)
(http://ark.intel.com/Product.aspx?id=37111). Thanks.
Stefan
Stefan Manegold wrote:
Darius,
thanks for the details. Given your 16 core machine, I suspect some parallelisation problem (race condition, scheduling, etc.).
Could you please run the following tests and report the results (possibly zipped, as they might become larger)?
1) Start your mserver5 with `--set gdk_nr_threads=1` to force single-threaded query evaluation, run your query multiple times, and check whether you still get varying results.
2) Still with the single-threaded server, run your query twice with prefix "PLAN", twice with prefix "EXPLAIN" and twice with prefix "TRACE" and check whether each pair of results per prefix is identical or differs. For those prefixes where the two results are identical, send one output, for those prefixes where the two results differ, please send both.
3) Restart your mserver5 without `--set gdk_nr_threads=1`, i.e., in normal multi-threaded mode, again, and repeat step 2) also for that case.
Thanks!
Stefan
On Tue, Apr 13, 2010 at 03:00:14PM -0400, dariuszs wrote:
sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +-------+ | L31 | +=======+ | 26152 | +-------+ 1 tuple Timer 14.874 msec 1 rows sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +--------+ | L31 | +========+ | 103081 | +--------+ 1 tuple Timer 12.718 msec 1 rows sql>\d test CREATE TABLE "sys"."test" ( "c15" varchar(5), "c66" varchar(2) ); sql>\d zip_list CREATE TABLE "sys"."zip_list" ( "zip" varchar(5) ); sql>
test.c15 has 3227 zip codes zip_list.zip has 696 zip codes
sql>select c66,count(*) from test group by c66; +------+---------+ | c66 | L4 | +======+=========+ | 12 | 2097484 | | 14 | 2431062 | | 13 | 276387 | | 11 | 135711 | | 16 | 303217 | | 15 | 12048 | | 02 | 989701 | | 03 | 407467 | | 06 | 735483 | | 01 | 470888 | | 05 | 211905 | | 08 | 464530 | | 04 | 1260159 | | 07 | 115491 | | 10 | 77411 | | 09 | 11056 | +------+---------+ 16 tuples Timer 48.639 msec 16 rows sql>
root@monetdb:/var/log# mserver5 --version MonetDB server v5.18.3 (64-bit), based on kernel v1.36.3 (64-bit oids) Copyright (c) 1993-July 2008 CWI Copyright (c) August 2008-2010 MonetDB B.V., all rights reserved Visit http://monetdb.cwi.nl/ for further information Found 47.3GiB available memory, 16 available cpu cores Configured for prefix: /usr Libraries: libpcre: 7.8 2008-09-05 (compiled with 7.8) openssl: OpenSSL 0.9.8g 19 Oct 2007 (compiled with OpenSSL 0.9.8g 19 Oct 2007) libxml2: 2.7.5 (compiled with 2.7.5) Compiled by: root@ubuntu-karmic-64 (x86_64-pc-linux-gnu) Compilation: gcc -O2 -Wall -O2 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize Linking : ld -IPA -m elf_x86_64 -Wl,-Bsymbolic-functions root@monetdb:/var/log#
Thanks
Stefan Manegold wrote:
Darius,
to have any chance to help, we'd need to be able to reproduce the problem, which means we'd require sufficient information to do so. Such information includes (in addition to the queries you gave) also
- the version of MonetDB that your using and what kind of system you're running on (the complete output of `mserver5 --version` would be perfrect)
- the shcema of your databasem; at least of the table involved in the given query
- if possible at least some sample data
Please also consider filing a bug report that contains all this information to reproduce the problem.
Regrads, Stefan
On Tue, Apr 13, 2010 at 02:11:23PM -0400, dariuszs wrote:
Hi, Same query produces different results:
sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql>
Please help.
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4199 |
Hi, I'll do that as soon as I can. Dariusz. Stefan Manegold wrote:
On Thu, Apr 15, 2010 at 03:54:54PM -0400, dariuszs wrote:
Hi, Single thread does the trick, but is that the way I should run it all the time?
No, surely not!
I only asked you to run single thread (and multi-threaded) and send us the requested results (PLAN: 'logical' query plan; EXPLAIN: 'physical' query plan; TRACE: detailed execution trace) for both, so that we can analyze the differences to find out what is going wrong in the multi-threaded case.
Basically, since you have the data and the system running, I ask(ed) you to help us with what we call "debugging" or "forensics" or "CSI" ;-)
Hope you don't mind "being exploited" like this ... ;-)
I've got 2 cpu's with 4 cores each in this box
... and hyperthreading enabled; at least mserver5 reports 16 cores --- and yes, both you and we want ot exploit them and keep them busy, while still getting correct results ;-)
(http://ark.intel.com/Product.aspx?id=37111). Thanks.
Stefan
Stefan Manegold wrote:
Darius,
thanks for the details. Given your 16 core machine, I suspect some parallelisation problem (race condition, scheduling, etc.).
Could you please run the following tests and report the results (possibly zipped, as they might become larger)?
1) Start your mserver5 with `--set gdk_nr_threads=1` to force single-threaded query evaluation, run your query multiple times, and check whether you still get varying results.
2) Still with the single-threaded server, run your query twice with prefix "PLAN", twice with prefix "EXPLAIN" and twice with prefix "TRACE" and check whether each pair of results per prefix is identical or differs. For those prefixes where the two results are identical, send one output, for those prefixes where the two results differ, please send both.
3) Restart your mserver5 without `--set gdk_nr_threads=1`, i.e., in normal multi-threaded mode, again, and repeat step 2) also for that case.
Thanks!
Stefan
On Tue, Apr 13, 2010 at 03:00:14PM -0400, dariuszs wrote:
sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +-------+ | L31 | +=======+ | 26152 | +-------+ 1 tuple Timer 14.874 msec 1 rows sql>select count(*) from test where c15 in (select zip from zip_list) and c66 in ('01'); +--------+ | L31 | +========+ | 103081 | +--------+ 1 tuple Timer 12.718 msec 1 rows sql>\d test CREATE TABLE "sys"."test" ( "c15" varchar(5), "c66" varchar(2) ); sql>\d zip_list CREATE TABLE "sys"."zip_list" ( "zip" varchar(5) ); sql>
test.c15 has 3227 zip codes zip_list.zip has 696 zip codes
sql>select c66,count(*) from test group by c66; +------+---------+ | c66 | L4 | +======+=========+ | 12 | 2097484 | | 14 | 2431062 | | 13 | 276387 | | 11 | 135711 | | 16 | 303217 | | 15 | 12048 | | 02 | 989701 | | 03 | 407467 | | 06 | 735483 | | 01 | 470888 | | 05 | 211905 | | 08 | 464530 | | 04 | 1260159 | | 07 | 115491 | | 10 | 77411 | | 09 | 11056 | +------+---------+ 16 tuples Timer 48.639 msec 16 rows sql>
root@monetdb:/var/log# mserver5 --version MonetDB server v5.18.3 (64-bit), based on kernel v1.36.3 (64-bit oids) Copyright (c) 1993-July 2008 CWI Copyright (c) August 2008-2010 MonetDB B.V., all rights reserved Visit http://monetdb.cwi.nl/ for further information Found 47.3GiB available memory, 16 available cpu cores Configured for prefix: /usr Libraries: libpcre: 7.8 2008-09-05 (compiled with 7.8) openssl: OpenSSL 0.9.8g 19 Oct 2007 (compiled with OpenSSL 0.9.8g 19 Oct 2007) libxml2: 2.7.5 (compiled with 2.7.5) Compiled by: root@ubuntu-karmic-64 (x86_64-pc-linux-gnu) Compilation: gcc -O2 -Wall -O2 -O6 -fomit-frame-pointer -finline-functions -falign-loops=4 -falign-jumps=4 -falign-functions=4 -fexpensive-optimizations -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -ftree-vectorize Linking : ld -IPA -m elf_x86_64 -Wl,-Bsymbolic-functions root@monetdb:/var/log#
Thanks
Stefan Manegold wrote:
Darius,
to have any chance to help, we'd need to be able to reproduce the problem, which means we'd require sufficient information to do so. Such information includes (in addition to the queries you gave) also
- the version of MonetDB that your using and what kind of system you're running on (the complete output of `mserver5 --version` would be perfrect)
- the shcema of your databasem; at least of the table involved in the given query
- if possible at least some sample data
Please also consider filing a bug report that contains all this information to reproduce the problem.
Regrads, Stefan
On Tue, Apr 13, 2010 at 02:11:23PM -0400, dariuszs wrote:
Hi, Same query produces different results:
sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 14945 | +-------+ 1 tuple Timer 69.446 msec 1 rows sql>select count(*) from plk100 WHERE 0=0 and c15 in (select zip from zip_list) and ( c102 in ('1','0')) and ( c28 in ('1','0')) and ( c66 in ('01')); +-------+ | L73 | +=======+ | 57257 | +-------+ 1 tuple Timer 146.666 msec 1 rows sql>
Please help.
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
participants (2)
-
dariuszs
-
Stefan Manegold