Adding sum() to a query appears to disable multithreading/parallel execution
All, Not entirely sure how to submit a bug report for this, but it appears that adding a sum() term to a query completely disables parallel execution of the query. When I run this query on my server (11.27.13 built from source), linux RH6, 24 core, 190GB memory: select s.product, d.fctrdt, d.delinq, count(*), count(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq; It executes in ~20 seconds (the table stt contains ~20 million records, and the table dyn contains ~550 million records), and when I run top I can see the process using all 24 cores, but when I change the term count(d.realized) to sum(d.realized), the query now takes 5-6 minutes to run, and when I run top, the mserver process appears to use a single core for most of the query. select s.product, d.fctrdt, d.delinq, count(*), sum(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq; Is this expected behavior? Am I doing something wrong? Queries are both run on a hot server. Note: the explain from the query containing sum(d.realized) only contains 103 tuples, but the initial query with count(d.realized) contains 1052 tuples. Thanks, Dave Here are the definitions of the stt/dyn tables: create TABLE stt ( product char(12) NULL, prefix char(2) NULL, coupon float NULL, productionyear smallint NULL, cusip char(9) NULL, loanseqnum char(12) NULL, loanpurpose char(1) NULL, tpoflag char(1) NULL, proptype char(2) NULL, occstatus char(1) NULL, numunits smallint NULL, state char(2) NULL, cs smallint NULL, origterm smallint NULL, origltv float NULL, orignoterate float NULL, origloanamt float NULL, origwala smallint NULL, origserv char(8) NULL, govtagy char(1) NULL); create TABLE dyn ( fctrdt DATE NULL, loanseqnum char(12) NULL, scheduled float NULL, realized float NULL, smm float NULL, delinq int NULL, paidahead int NULL); And here is the Merovingian log: $ more merovingian.log 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:17:16 MSG merovingian[26418]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 stopped 2018-03-26 09:17:16 ERR merovingian[26418]: binding to stream socket port 50000 failed: Address already in use 2018-03-26 09:17:16 ERR merovingian[26418]: fatal startup condition encountered, aborting startup 2018-03-26 09:18:15 MSG merovingian[28537]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:18:15 MSG merovingian[28537]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on TCP socket localhost:50000 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on UNIX domain socket /tmp/.s.monetdb.50000 2018-03-26 09:18:15 MSG discovery[28537]: listening for UDP messages on localhost:50000 2018-03-26 09:18:15 MSG control[28537]: accepting connections on UNIX domain socket /tmp/.s.merovingian.50000 2018-03-26 09:18:15 MSG control[28537]: (local): created database 'loan' 2018-03-26 09:18:15 MSG control[28537]: (local): served status list 2018-03-26 09:18:15 MSG control[28537]: (local): released database 'loan' 2018-03-26 09:18:15 MSG merovingian[28537]: starting database 'loan', up min/avg/max: 0s/0s/0s, crash average: 0.00 0.00 0.00 (0-0=0) 2018-03-26 09:18:15 MSG merovingian[28537]: proxying client (local) for database 'loan' to mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock?database=loan 2018-03-26 09:18:15 MSG merovingian[28537]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying 2018-03-26 09:18:15 MSG loan[28550]: arguments: /opt/tools/root/MonetDB-11.27.13/bin/mserver5 --dbpath=/opt/flash1/test/DB/loan --set merovingian_uri=mapi:monetdb://malxcs1p:50000/loan --set mapi_open=false --set mapi_port=0 - -set mapi_usock=/opt/flash1/test/DB/loan/.mapi.sock --set monet_vault_key=/opt/flash1/test/DB/loan/.vaultkey --set gdk_nr_threads=24 --set max_clients=64 --set sql_optimizer=default_pipe --set monet_daemon=yes 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB 5 server v11.27.13 "Jul2017-SP4" 2018-03-26 09:18:15 MSG loan[28550]: # Serving database 'loan', using 24 threads 2018-03-26 09:18:15 MSG loan[28550]: # Compiled for x86_64-pc-linux-gnu/64bit with 128bit integers 2018-03-26 09:18:15 MSG loan[28550]: # Found 189.145 GiB available main-memory. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) 1993 - July 2008 CWI. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved 2018-03-26 09:18:15 MSG loan[28550]: # Visit https://www.monetdb.org/ for further information 2018-03-26 09:18:15 MSG loan[28550]: # Listening for UNIX domain connection requests on mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB/SQL module loaded 2018-03-26 09:18:16 MSG loan[28550]: # SQL catalog created, loading sql scripts once 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 09_like.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 10_math.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 11_times.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 12_url.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 13_date.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 14_inet.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 15_querylog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 16_tracelog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 17_temporal.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 18_index.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 20_vacuum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 21_dependency_functions.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 22_clients.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 23_skyserver.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 25_debug.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 26_sysmon.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 27_rejects.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 41_md5sum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 45_uuid.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 46_profiler.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 51_sys_schema_extension.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 75_storagemodel.sql
The reason for the lack of parallelization in case of sum() is that the column is of type float. When summing floating point numbers we try to give the correct result, even if there are large differences between individual values (e.g. a column that has 1e37, 1e-37, and -1e37, the result should be 1e-37, not 0). For this reason, we use an algorithm that remembers all bits during summing, but we then cannot parallelize the sum since the bits that didn't fit in a float (or double) for the partial sums get lost. On 27/03/18 15:42, Anderson, David B wrote:
All,
Not entirely sure how to submit a bug report for this, but it appears that adding a sum() term to a query completely disables parallel execution of the query.
When I run this query on my server (11.27.13 built from source), linux RH6, 24 core, 190GB memory:
select s.product, d.fctrdt, d.delinq, count(*), count(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq;
It executes in ~20 seconds (the table stt contains ~20 million records, and the table dyn contains ~550 million records), and when I run top I can see the process using all 24 cores,
but when I change the term count(d.realized) to sum(d.realized), the query now takes 5-6 minutes to run, and when I run top, the mserver process appears to use a single core for most of the query.
select s.product, d.fctrdt, d.delinq, count(*), sum(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq;
Is this expected behavior? Am I doing something wrong? Queries are both run on a hot server.
Note: the explain from the query containing sum(d.realized) only contains 103 tuples, but the initial query with count(d.realized) contains 1052 tuples.
Thanks, Dave
Here are the definitions of the stt/dyn tables:
create TABLE stt ( product char(12) NULL, prefix char(2) NULL, coupon float NULL, productionyear smallint NULL, cusip char(9) NULL, loanseqnum char(12) NULL, loanpurpose char(1) NULL, tpoflag char(1) NULL, proptype char(2) NULL, occstatus char(1) NULL, numunits smallint NULL, state char(2) NULL, cs smallint NULL, origterm smallint NULL, origltv float NULL, orignoterate float NULL, origloanamt float NULL, origwala smallint NULL, origserv char(8) NULL, govtagy char(1) NULL);
create TABLE dyn ( fctrdt DATE NULL, loanseqnum char(12) NULL, scheduled float NULL, realized float NULL, smm float NULL, delinq int NULL, paidahead int NULL);
And here is the Merovingian log:
$ more merovingian.log 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:17:16 MSG merovingian[26418]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 stopped 2018-03-26 09:17:16 ERR merovingian[26418]: binding to stream socket port 50000 failed: Address already in use 2018-03-26 09:17:16 ERR merovingian[26418]: fatal startup condition encountered, aborting startup 2018-03-26 09:18:15 MSG merovingian[28537]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:18:15 MSG merovingian[28537]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on TCP socket localhost:50000 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on UNIX domain socket /tmp/.s.monetdb.50000 2018-03-26 09:18:15 MSG discovery[28537]: listening for UDP messages on localhost:50000 2018-03-26 09:18:15 MSG control[28537]: accepting connections on UNIX domain socket /tmp/.s.merovingian.50000 2018-03-26 09:18:15 MSG control[28537]: (local): created database 'loan' 2018-03-26 09:18:15 MSG control[28537]: (local): served status list 2018-03-26 09:18:15 MSG control[28537]: (local): released database 'loan' 2018-03-26 09:18:15 MSG merovingian[28537]: starting database 'loan', up min/avg/max: 0s/0s/0s, crash average: 0.00 0.00 0.00 (0-0=0) 2018-03-26 09:18:15 MSG merovingian[28537]: proxying client (local) for database 'loan' to mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock?database=loan 2018-03-26 09:18:15 MSG merovingian[28537]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying 2018-03-26 09:18:15 MSG loan[28550]: arguments: /opt/tools/root/MonetDB-11.27.13/bin/mserver5 --dbpath=/opt/flash1/test/DB/loan --set merovingian_uri=mapi:monetdb://malxcs1p:50000/loan --set mapi_open=false --set mapi_port=0 - -set mapi_usock=/opt/flash1/test/DB/loan/.mapi.sock --set monet_vault_key=/opt/flash1/test/DB/loan/.vaultkey --set gdk_nr_threads=24 --set max_clients=64 --set sql_optimizer=default_pipe --set monet_daemon=yes 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB 5 server v11.27.13 "Jul2017-SP4" 2018-03-26 09:18:15 MSG loan[28550]: # Serving database 'loan', using 24 threads 2018-03-26 09:18:15 MSG loan[28550]: # Compiled for x86_64-pc-linux-gnu/64bit with 128bit integers 2018-03-26 09:18:15 MSG loan[28550]: # Found 189.145 GiB available main-memory. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) 1993 - July 2008 CWI. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved 2018-03-26 09:18:15 MSG loan[28550]: # Visit https://www.monetdb.org/ for further information 2018-03-26 09:18:15 MSG loan[28550]: # Listening for UNIX domain connection requests on mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB/SQL module loaded 2018-03-26 09:18:16 MSG loan[28550]: # SQL catalog created, loading sql scripts once 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 09_like.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 10_math.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 11_times.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 12_url.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 13_date.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 14_inet.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 15_querylog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 16_tracelog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 17_temporal.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 18_index.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 20_vacuum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 21_dependency_functions.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 22_clients.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 23_skyserver.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 25_debug.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 26_sysmon.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 27_rejects.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 41_md5sum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 45_uuid.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 46_profiler.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 51_sys_schema_extension.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 75_storagemodel.sql
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
participants (2)
-
Anderson, David B
-
Sjoerd Mullender