RE: Adding sum() to a query appears to disable multithreading/parallel execution
But couldn't the exact summation operate in parallel on each of the sub-groups coming from the group by statement? I guess I was thinking that part of the computation would be split up as well. Would it be difficult to add a psum() function that would drop bits (so understood to be an approximation), but still operate in parallel? Thanks, Dave -----Original Message----- From: users-list [mailto:users-list-bounces+david.b.anderson=citi.com@monetdb.org] On Behalf Of Sjoerd Mullender Sent: Tuesday, March 27, 2018 10:36 AM To: users-list@monetdb.org Subject: Re: Adding sum() to a query appears to disable multithreading/parallel execution The reason for the lack of parallelization in case of sum() is that the column is of type float. When summing floating point numbers we try to give the correct result, even if there are large differences between individual values (e.g. a column that has 1e37, 1e-37, and -1e37, the result should be 1e-37, not 0). For this reason, we use an algorithm that remembers all bits during summing, but we then cannot parallelize the sum since the bits that didn't fit in a float (or double) for the partial sums get lost. On 27/03/18 15:42, Anderson, David B wrote:
All,
Not entirely sure how to submit a bug report for this, but it appears that adding a sum() term to a query completely disables parallel execution of the query.
When I run this query on my server (11.27.13 built from source), linux RH6, 24 core, 190GB memory:
select s.product, d.fctrdt, d.delinq, count(*), count(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq;
It executes in ~20 seconds (the table stt contains ~20 million records, and the table dyn contains ~550 million records), and when I run top I can see the process using all 24 cores,
but when I change the term count(d.realized) to sum(d.realized), the query now takes 5-6 minutes to run, and when I run top, the mserver process appears to use a single core for most of the query.
select s.product, d.fctrdt, d.delinq, count(*), sum(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq;
Is this expected behavior? Am I doing something wrong? Queries are both run on a hot server.
Note: the explain from the query containing sum(d.realized) only contains 103 tuples, but the initial query with count(d.realized) contains 1052 tuples.
Thanks, Dave
Here are the definitions of the stt/dyn tables:
create TABLE stt ( product char(12) NULL, prefix char(2) NULL, coupon float NULL, productionyear smallint NULL, cusip char(9) NULL, loanseqnum char(12) NULL, loanpurpose char(1) NULL, tpoflag char(1) NULL, proptype char(2) NULL, occstatus char(1) NULL, numunits smallint NULL, state char(2) NULL, cs smallint NULL, origterm smallint NULL, origltv float NULL, orignoterate float NULL, origloanamt float NULL, origwala smallint NULL, origserv char(8) NULL, govtagy char(1) NULL);
create TABLE dyn ( fctrdt DATE NULL, loanseqnum char(12) NULL, scheduled float NULL, realized float NULL, smm float NULL, delinq int NULL, paidahead int NULL);
And here is the Merovingian log:
$ more merovingian.log 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:17:16 MSG merovingian[26418]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 stopped 2018-03-26 09:17:16 ERR merovingian[26418]: binding to stream socket port 50000 failed: Address already in use 2018-03-26 09:17:16 ERR merovingian[26418]: fatal startup condition encountered, aborting startup 2018-03-26 09:18:15 MSG merovingian[28537]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:18:15 MSG merovingian[28537]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on TCP socket localhost:50000 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on UNIX domain socket /tmp/.s.monetdb.50000 2018-03-26 09:18:15 MSG discovery[28537]: listening for UDP messages on localhost:50000 2018-03-26 09:18:15 MSG control[28537]: accepting connections on UNIX domain socket /tmp/.s.merovingian.50000 2018-03-26 09:18:15 MSG control[28537]: (local): created database 'loan' 2018-03-26 09:18:15 MSG control[28537]: (local): served status list 2018-03-26 09:18:15 MSG control[28537]: (local): released database 'loan' 2018-03-26 09:18:15 MSG merovingian[28537]: starting database 'loan', up min/avg/max: 0s/0s/0s, crash average: 0.00 0.00 0.00 (0-0=0) 2018-03-26 09:18:15 MSG merovingian[28537]: proxying client (local) for database 'loan' to mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock?database=loan 2018-03-26 09:18:15 MSG merovingian[28537]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying 2018-03-26 09:18:15 MSG loan[28550]: arguments: /opt/tools/root/MonetDB-11.27.13/bin/mserver5 --dbpath=/opt/flash1/test/DB/loan --set merovingian_uri=mapi:monetdb://malxcs1p:50000/loan --set mapi_open=false --set mapi_port=0 - -set mapi_usock=/opt/flash1/test/DB/loan/.mapi.sock --set monet_vault_key=/opt/flash1/test/DB/loan/.vaultkey --set gdk_nr_threads=24 --set max_clients=64 --set sql_optimizer=default_pipe --set monet_daemon=yes 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB 5 server v11.27.13 "Jul2017-SP4" 2018-03-26 09:18:15 MSG loan[28550]: # Serving database 'loan', using 24 threads 2018-03-26 09:18:15 MSG loan[28550]: # Compiled for x86_64-pc-linux-gnu/64bit with 128bit integers 2018-03-26 09:18:15 MSG loan[28550]: # Found 189.145 GiB available main-memory. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) 1993 - July 2008 CWI. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved 2018-03-26 09:18:15 MSG loan[28550]: # Visit https://www.monetdb.org/ for further information 2018-03-26 09:18:15 MSG loan[28550]: # Listening for UNIX domain connection requests on mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB/SQL module loaded 2018-03-26 09:18:16 MSG loan[28550]: # SQL catalog created, loading sql scripts once 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 09_like.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 10_math.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 11_times.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 12_url.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 13_date.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 14_inet.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 15_querylog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 16_tracelog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 17_temporal.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 18_index.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 20_vacuum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 21_dependency_functions.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 22_clients.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 23_skyserver.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 25_debug.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 26_sysmon.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 27_rejects.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 41_md5sum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 45_uuid.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 46_profiler.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 51_sys_schema_extension.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 75_storagemodel.sql
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
David, while currently not implemented, both options you suggest are in principle possible, with varying coding effort. However, they both come with caveats: psum(): - it is not part of the SQL standard - it will be hard (if not impossible) to assess the amount of approximation, i.e., the size of the error in the calculated result. parallelism by group: - (maximal) degree of parallelism is limited by the number of groups (of course, only a problem if there are very few groups) - (perfect) load-balancing between groups is difficult (NP-hard) - partitioning by groups requires an extra copy of the entire column currently, mitosis-based parallelism simply "slices" the input columns horizontally (irrespective of the actual data) and thus (1) does not require any (extra) data movement/copy, (2) allows (almost) arbitrary degree of parallelism (well, expect when correctness prevents such straight-forward parallelism as in the given case), and (2) "ensures" (almost) perfect load-balancing at virtually no cost. Having say that, we are looking into data-partitioned parallelism (rather than simple slice-based parallelism), and then your suggestion of parallelism by group is to be considered as well. Best, Stefan ----- On Mar 27, 2018, at 4:59 PM, Anderson, David B david.b.anderson@citi.com wrote:
But couldn't the exact summation operate in parallel on each of the sub-groups coming from the group by statement? I guess I was thinking that part of the computation would be split up as well.
Would it be difficult to add a psum() function that would drop bits (so understood to be an approximation), but still operate in parallel?
Thanks, Dave
-----Original Message----- From: users-list [mailto:users-list-bounces+david.b.anderson=citi.com@monetdb.org] On Behalf Of Sjoerd Mullender Sent: Tuesday, March 27, 2018 10:36 AM To: users-list@monetdb.org Subject: Re: Adding sum() to a query appears to disable multithreading/parallel execution
The reason for the lack of parallelization in case of sum() is that the column is of type float.
When summing floating point numbers we try to give the correct result, even if there are large differences between individual values (e.g. a column that has 1e37, 1e-37, and -1e37, the result should be 1e-37, not 0). For this reason, we use an algorithm that remembers all bits during summing, but we then cannot parallelize the sum since the bits that didn't fit in a float (or double) for the partial sums get lost.
On 27/03/18 15:42, Anderson, David B wrote:
All,
Not entirely sure how to submit a bug report for this, but it appears that adding a sum() term to a query completely disables parallel execution of the query.
When I run this query on my server (11.27.13 built from source), linux RH6, 24 core, 190GB memory:
select s.product, d.fctrdt, d.delinq, count(*), count(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq;
It executes in ~20 seconds (the table stt contains ~20 million records, and the table dyn contains ~550 million records), and when I run top I can see the process using all 24 cores,
but when I change the term count(d.realized) to sum(d.realized), the query now takes 5-6 minutes to run, and when I run top, the mserver process appears to use a single core for most of the query.
select s.product, d.fctrdt, d.delinq, count(*), sum(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq;
Is this expected behavior? Am I doing something wrong? Queries are both run on a hot server.
Note: the explain from the query containing sum(d.realized) only contains 103 tuples, but the initial query with count(d.realized) contains 1052 tuples.
Thanks, Dave
Here are the definitions of the stt/dyn tables:
create TABLE stt ( product char(12) NULL, prefix char(2) NULL, coupon float NULL, productionyear smallint NULL, cusip char(9) NULL, loanseqnum char(12) NULL, loanpurpose char(1) NULL, tpoflag char(1) NULL, proptype char(2) NULL, occstatus char(1) NULL, numunits smallint NULL, state char(2) NULL, cs smallint NULL, origterm smallint NULL, origltv float NULL, orignoterate float NULL, origloanamt float NULL, origwala smallint NULL, origserv char(8) NULL, govtagy char(1) NULL);
create TABLE dyn ( fctrdt DATE NULL, loanseqnum char(12) NULL, scheduled float NULL, realized float NULL, smm float NULL, delinq int NULL, paidahead int NULL);
And here is the Merovingian log:
$ more merovingian.log 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:17:16 MSG merovingian[26418]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 stopped 2018-03-26 09:17:16 ERR merovingian[26418]: binding to stream socket port 50000 failed: Address already in use 2018-03-26 09:17:16 ERR merovingian[26418]: fatal startup condition encountered, aborting startup 2018-03-26 09:18:15 MSG merovingian[28537]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:18:15 MSG merovingian[28537]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on TCP socket localhost:50000 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on UNIX domain socket /tmp/.s.monetdb.50000 2018-03-26 09:18:15 MSG discovery[28537]: listening for UDP messages on localhost:50000 2018-03-26 09:18:15 MSG control[28537]: accepting connections on UNIX domain socket /tmp/.s.merovingian.50000 2018-03-26 09:18:15 MSG control[28537]: (local): created database 'loan' 2018-03-26 09:18:15 MSG control[28537]: (local): served status list 2018-03-26 09:18:15 MSG control[28537]: (local): released database 'loan' 2018-03-26 09:18:15 MSG merovingian[28537]: starting database 'loan', up min/avg/max: 0s/0s/0s, crash average: 0.00 0.00 0.00 (0-0=0) 2018-03-26 09:18:15 MSG merovingian[28537]: proxying client (local) for database 'loan' to mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock?database=loan 2018-03-26 09:18:15 MSG merovingian[28537]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying 2018-03-26 09:18:15 MSG loan[28550]: arguments: /opt/tools/root/MonetDB-11.27.13/bin/mserver5 --dbpath=/opt/flash1/test/DB/loan --set merovingian_uri=mapi:monetdb://malxcs1p:50000/loan --set mapi_open=false --set mapi_port=0 - -set mapi_usock=/opt/flash1/test/DB/loan/.mapi.sock --set monet_vault_key=/opt/flash1/test/DB/loan/.vaultkey --set gdk_nr_threads=24 --set max_clients=64 --set sql_optimizer=default_pipe --set monet_daemon=yes 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB 5 server v11.27.13 "Jul2017-SP4" 2018-03-26 09:18:15 MSG loan[28550]: # Serving database 'loan', using 24 threads 2018-03-26 09:18:15 MSG loan[28550]: # Compiled for x86_64-pc-linux-gnu/64bit with 128bit integers 2018-03-26 09:18:15 MSG loan[28550]: # Found 189.145 GiB available main-memory. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) 1993 - July 2008 CWI. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved 2018-03-26 09:18:15 MSG loan[28550]: # Visit https://www.monetdb.org/ for further information 2018-03-26 09:18:15 MSG loan[28550]: # Listening for UNIX domain connection requests on mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB/SQL module loaded 2018-03-26 09:18:16 MSG loan[28550]: # SQL catalog created, loading sql scripts once 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 09_like.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 10_math.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 11_times.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 12_url.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 13_date.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 14_inet.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 15_querylog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 16_tracelog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 17_temporal.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 18_index.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 20_vacuum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 21_dependency_functions.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 22_clients.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 23_skyserver.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 25_debug.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 26_sysmon.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 27_rejects.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 41_md5sum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 45_uuid.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 46_profiler.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 51_sys_schema_extension.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 75_storagemodel.sql
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- Sjoerd Mullender
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
I know Monet keeps a lot of metadata around, and one thing I have done in a similar system is to leverage any existing grouping key ordering to partition for parallelism. It is not load balanced and vulnerable to skew, but it is not serial either.
On Mar 28, 2018, at 07:07, Stefan Manegold
wrote: parallelism by group: - (maximal) degree of parallelism is limited by the number of groups (of course, only a problem if there are very few groups) - (perfect) load-balancing between groups is difficult (NP-hard) - partitioning by groups requires an extra copy of the entire column currently, mitosis-based parallelism simply "slices" the input columns horizontally (irrespective of the actual data) and thus (1) does not require any (extra) data movement/copy, (2) allows (almost) arbitrary degree of parallelism (well, expect when correctness prevents such straight-forward parallelism as in the given case), and (2) "ensures" (almost) perfect load-balancing at virtually no cost.
Having say that, we are looking into data-partitioned parallelism (rather than simple slice-based parallelism), and then your suggestion of parallelism by group is to be considered as well.
Best regards, ------------------------------------------------- Richard Wesley Senior Research Scientist Tableau Software t: 206.633.3400 x6335249 f: 206.633.3004 e: hawkfish@tableau.com
Stefan, Thank you for the explanation. Dave -----Original Message----- From: users-list [mailto:users-list-bounces+david.b.anderson=citi.com@monetdb.org] On Behalf Of Stefan Manegold Sent: Wednesday, March 28, 2018 10:08 AM To: Communication channel for MonetDB users Subject: Re: Adding sum() to a query appears to disable multithreading/parallel execution David, while currently not implemented, both options you suggest are in principle possible, with varying coding effort. However, they both come with caveats: psum(): - it is not part of the SQL standard - it will be hard (if not impossible) to assess the amount of approximation, i.e., the size of the error in the calculated result. parallelism by group: - (maximal) degree of parallelism is limited by the number of groups (of course, only a problem if there are very few groups) - (perfect) load-balancing between groups is difficult (NP-hard) - partitioning by groups requires an extra copy of the entire column currently, mitosis-based parallelism simply "slices" the input columns horizontally (irrespective of the actual data) and thus (1) does not require any (extra) data movement/copy, (2) allows (almost) arbitrary degree of parallelism (well, expect when correctness prevents such straight-forward parallelism as in the given case), and (2) "ensures" (almost) perfect load-balancing at virtually no cost. Having say that, we are looking into data-partitioned parallelism (rather than simple slice-based parallelism), and then your suggestion of parallelism by group is to be considered as well. Best, Stefan ----- On Mar 27, 2018, at 4:59 PM, Anderson, David B david.b.anderson@citi.com wrote:
But couldn't the exact summation operate in parallel on each of the sub-groups coming from the group by statement? I guess I was thinking that part of the computation would be split up as well.
Would it be difficult to add a psum() function that would drop bits (so understood to be an approximation), but still operate in parallel?
Thanks, Dave
-----Original Message----- From: users-list [mailto:users-list-bounces+david.b.anderson=citi.com@monetdb.org] On Behalf Of Sjoerd Mullender Sent: Tuesday, March 27, 2018 10:36 AM To: users-list@monetdb.org Subject: Re: Adding sum() to a query appears to disable multithreading/parallel execution
The reason for the lack of parallelization in case of sum() is that the column is of type float.
When summing floating point numbers we try to give the correct result, even if there are large differences between individual values (e.g. a column that has 1e37, 1e-37, and -1e37, the result should be 1e-37, not 0). For this reason, we use an algorithm that remembers all bits during summing, but we then cannot parallelize the sum since the bits that didn't fit in a float (or double) for the partial sums get lost.
On 27/03/18 15:42, Anderson, David B wrote:
All,
Not entirely sure how to submit a bug report for this, but it appears that adding a sum() term to a query completely disables parallel execution of the query.
When I run this query on my server (11.27.13 built from source), linux RH6, 24 core, 190GB memory:
select s.product, d.fctrdt, d.delinq, count(*), count(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq;
It executes in ~20 seconds (the table stt contains ~20 million records, and the table dyn contains ~550 million records), and when I run top I can see the process using all 24 cores,
but when I change the term count(d.realized) to sum(d.realized), the query now takes 5-6 minutes to run, and when I run top, the mserver process appears to use a single core for most of the query.
select s.product, d.fctrdt, d.delinq, count(*), sum(d.realized) from dyn d, stt s where s.loanseqnum = d.loanseqnum group by s.product, d.fctrdt, d.delinq order by s.product, d.fctrdt, d.delinq;
Is this expected behavior? Am I doing something wrong? Queries are both run on a hot server.
Note: the explain from the query containing sum(d.realized) only contains 103 tuples, but the initial query with count(d.realized) contains 1052 tuples.
Thanks, Dave
Here are the definitions of the stt/dyn tables:
create TABLE stt ( product char(12) NULL, prefix char(2) NULL, coupon float NULL, productionyear smallint NULL, cusip char(9) NULL, loanseqnum char(12) NULL, loanpurpose char(1) NULL, tpoflag char(1) NULL, proptype char(2) NULL, occstatus char(1) NULL, numunits smallint NULL, state char(2) NULL, cs smallint NULL, origterm smallint NULL, origltv float NULL, orignoterate float NULL, origloanamt float NULL, origwala smallint NULL, origserv char(8) NULL, govtagy char(1) NULL);
create TABLE dyn ( fctrdt DATE NULL, loanseqnum char(12) NULL, scheduled float NULL, realized float NULL, smm float NULL, delinq int NULL, paidahead int NULL);
And here is the Merovingian log:
$ more merovingian.log 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:17:16 MSG merovingian[26418]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:17:16 MSG merovingian[26418]: Merovingian 1.7 stopped 2018-03-26 09:17:16 ERR merovingian[26418]: binding to stream socket port 50000 failed: Address already in use 2018-03-26 09:17:16 ERR merovingian[26418]: fatal startup condition encountered, aborting startup 2018-03-26 09:18:15 MSG merovingian[28537]: Merovingian 1.7 (Jul2017-SP4) starting 2018-03-26 09:18:15 MSG merovingian[28537]: monitoring dbfarm /opt/flash1/test/DB 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on TCP socket localhost:50000 2018-03-26 09:18:15 MSG merovingian[28537]: accepting connections on UNIX domain socket /tmp/.s.monetdb.50000 2018-03-26 09:18:15 MSG discovery[28537]: listening for UDP messages on localhost:50000 2018-03-26 09:18:15 MSG control[28537]: accepting connections on UNIX domain socket /tmp/.s.merovingian.50000 2018-03-26 09:18:15 MSG control[28537]: (local): created database 'loan' 2018-03-26 09:18:15 MSG control[28537]: (local): served status list 2018-03-26 09:18:15 MSG control[28537]: (local): released database 'loan' 2018-03-26 09:18:15 MSG merovingian[28537]: starting database 'loan', up min/avg/max: 0s/0s/0s, crash average: 0.00 0.00 0.00 (0-0=0) 2018-03-26 09:18:15 MSG merovingian[28537]: proxying client (local) for database 'loan' to mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock?database=loan 2018-03-26 09:18:15 MSG merovingian[28537]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying 2018-03-26 09:18:15 MSG loan[28550]: arguments: /opt/tools/root/MonetDB-11.27.13/bin/mserver5 --dbpath=/opt/flash1/test/DB/loan --set merovingian_uri=mapi:monetdb://malxcs1p:50000/loan --set mapi_open=false --set mapi_port=0 - -set mapi_usock=/opt/flash1/test/DB/loan/.mapi.sock --set monet_vault_key=/opt/flash1/test/DB/loan/.vaultkey --set gdk_nr_threads=24 --set max_clients=64 --set sql_optimizer=default_pipe --set monet_daemon=yes 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB 5 server v11.27.13 "Jul2017-SP4" 2018-03-26 09:18:15 MSG loan[28550]: # Serving database 'loan', using 24 threads 2018-03-26 09:18:15 MSG loan[28550]: # Compiled for x86_64-pc-linux-gnu/64bit with 128bit integers 2018-03-26 09:18:15 MSG loan[28550]: # Found 189.145 GiB available main-memory. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) 1993 - July 2008 CWI. 2018-03-26 09:18:15 MSG loan[28550]: # Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved 2018-03-26 09:18:15 MSG loan[28550]: # Visit https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_ &d=DwICAg&c=j-EkbjBYwkAB4f8ZbVn1Fw&r=8_GYjk1edsyLJlNaxMxYxBJsviF3JXYv wDK42uy5KWU&m=dteTlzDsyubyQ7CHmAjvXoB96QBfced4aZs3EOAWqqY&s=DlnuisS0G 2F0pw4Nem6oCRIN1JcYO3vKwcjVnuBI3CE&e= for further information 2018-03-26 09:18:15 MSG loan[28550]: # Listening for UNIX domain connection requests on mapi:monetdb:///opt/flash1/test/DB/loan/.mapi.sock 2018-03-26 09:18:15 MSG loan[28550]: # MonetDB/SQL module loaded 2018-03-26 09:18:16 MSG loan[28550]: # SQL catalog created, loading sql scripts once 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 09_like.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 10_math.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 11_times.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 12_url.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 13_date.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 14_inet.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 15_querylog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 16_tracelog.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 17_temporal.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 18_index.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 20_vacuum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 21_dependency_functions.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 22_clients.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 23_skyserver.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 25_debug.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 26_sysmon.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 27_rejects.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 39_analytics_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 40_json_hge.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 41_md5sum.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 45_uuid.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 46_profiler.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 51_sys_schema_extension.sql 2018-03-26 09:18:16 MSG loan[28550]: # loading sql script: 75_storagemodel.sql
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_ mailman_listinfo_users-2Dlist&d=DwICAg&c=j-EkbjBYwkAB4f8ZbVn1Fw&r=8_G Yjk1edsyLJlNaxMxYxBJsviF3JXYvwDK42uy5KWU&m=dteTlzDsyubyQ7CHmAjvXoB96Q Bfced4aZs3EOAWqqY&s=HyMR7a6SGEjDpKZDyD3hOsQIUPjFBMqj1X5CPtmK2Q8&e=
-- Sjoerd Mullender
_______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_m ailman_listinfo_users-2Dlist&d=DwICAg&c=j-EkbjBYwkAB4f8ZbVn1Fw&r=8_GYj k1edsyLJlNaxMxYxBJsviF3JXYvwDK42uy5KWU&m=dteTlzDsyubyQ7CHmAjvXoB96QBfc ed4aZs3EOAWqqY&s=HyMR7a6SGEjDpKZDyD3hOsQIUPjFBMqj1X5CPtmK2Q8&e=
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | https://urldefense.proofpoint.com/v2/url?u=http-3A__www.CWI.nl_-7Emane | gold_&d=DwICAg&c=j-EkbjBYwkAB4f8ZbVn1Fw&r=8_GYjk1edsyLJlNaxMxYxBJsviF3 | JXYvwDK42uy5KWU&m=dteTlzDsyubyQ7CHmAjvXoB96QBfced4aZs3EOAWqqY&s=Q0UJEi | YRH0DL2b1ahE9vmtbp-iE_DVNhbgpcEJFDnTc&e= | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) | _______________________________________________ users-list mailing list users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_mailman_listinfo_users-2Dlist&d=DwICAg&c=j-EkbjBYwkAB4f8ZbVn1Fw&r=8_GYjk1edsyLJlNaxMxYxBJsviF3JXYvwDK42uy5KWU&m=dteTlzDsyubyQ7CHmAjvXoB96QBfced4aZs3EOAWqqY&s=HyMR7a6SGEjDpKZDyD3hOsQIUPjFBMqj1X5CPtmK2Q8&e=
participants (3)
-
Anderson, David B
-
Richard Wesley
-
Stefan Manegold