
Hi,
Thank you for your reply.
The system has 16 physical cores (2 sockets, 8 cores/socket, 20MB L3 cache, 256KB L2) and 64GB of RAM. The MonetDB server uses 16 threads to serve my test database and it finds ~63GB of available memory.
I used synthetic data to populate the two tables (with names A and B). The particular query I reported about in my previous e-mail generates a lot of output: approx. 500 million tuples. The join predicate is (A.a1 < B.b1) where columns A.a1 and B.b1 contain all unique integers from 1 to 32768.
With a lower selectivity (say, 1%, by controlling the values inserted into A.a1 and B.b1), the query executes faster but its behaviour is the same: only 1 thread is utilised.
Attached is a trace of the query.
Hope this helps,
Alexandros
PS: Apologies for the separate thread; somehow I ended up subscribing in digest mode.
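As a rough check on the "approx. 500 million tuples" figure, the output size of the predicate A.a1 < B.b1 can be worked out directly when both columns hold each integer from 1 to 32768 exactly once. The sketch below is only an illustration; the function names are made up for this example and nothing here touches MonetDB.

```python
def theta_join_cardinality(n: int) -> int:
    """Number of pairs (a, b) with a < b when a and b each range over 1..n."""
    # For a given b there are b - 1 smaller values of a; summing over b gives n(n-1)/2.
    return n * (n - 1) // 2


def brute_force_cardinality(n: int) -> int:
    """Same count by explicit enumeration; only practical for small n."""
    return sum(1 for a in range(1, n + 1) for b in range(1, n + 1) if a < b)


n = 32768
total = theta_join_cardinality(n)
print(total)            # 536854528
print(total / (n * n))  # share of the full Cartesian product, just under one half
```

So the query returns a little under half of the 32768 x 32768 Cartesian product, which agrees with the reported output size.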
Hi
This is insufficient information to shed light on it. Crucial information is system characteristics. And also a TRACE of your query can shed light on the issue.
Furthermore, is it a cold or hot execution of the query?
regards, Martin
On 20/01/16 13:24, Alexandros Koliousis wrote:
Hi,
I am joining two tables, each with 32K tuples. The query runs for ~23-24 seconds on my (multi-core) machine.
During execution, however, only 1 core is utilised, while the rest of the CPU cores are idle. The cores are under-utilised although I have configured the MonetDB server to use all the physical cores of my machine (either by setting 'nthreads' or, in the case of mserver5, 'gdk_nr_threads').
I am running MonetDB server v11.21.13 "Jul2015-SP2", compiled from source.
Is this due to some configuration setting I missed? Or, due to the fact that I am running a single-operator query?
Alternatively, I was thinking of creating merge tables, thus partitioning my tables manually, hoping that this will allow for intra-operator parallelism.
Thanks,
Alexandros
_______________________________________________ users-list mailing list users-list at monetdb.org https://www.monetdb.org/mailman/listinfo/users-list

Hi
You are using a theta-join (A.a1 < B.b1), which means a database engine mostly has to fall back on a Cartesian product. This leads to a large intermediate result.
To make it more efficient, I would have expected at least one other join term, e.g.
SELECT count(*) FROM A, B WHERE A.k = B.k AND A.a1 < B.b1
regards, Martin
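To make the point about the extra join term concrete, here is a small sketch that counts how many row comparisons each plan needs. It is an assumption-laden toy: the column k, its 10 distinct values, and both function names are hypothetical, and the code models a plain nested loop and a hash join in Python rather than anything MonetDB actually does internally. Note that adding A.k = B.k also changes the query result, so the two counts of matches differ.

```python
from collections import defaultdict


def theta_only(A, B):
    """Nested loop over all pairs: every row of A is compared with every row of B."""
    comparisons = matches = 0
    for _, a1 in A:
        for _, b1 in B:
            comparisons += 1
            if a1 < b1:
                matches += 1
    return matches, comparisons


def equi_then_theta(A, B):
    """Hash B on k first, then test a1 < b1 only among rows sharing the same k."""
    by_key = defaultdict(list)
    for k, b1 in B:
        by_key[k].append(b1)
    comparisons = matches = 0
    for k, a1 in A:
        for b1 in by_key.get(k, ()):
            comparisons += 1
            if a1 < b1:
                matches += 1
    return matches, comparisons


# Hypothetical data: rows are (k, value), with k taking 10 distinct values.
A = [(i % 10, i) for i in range(1, 101)]
B = [(i % 10, i) for i in range(1, 101)]

print(theta_only(A, B))       # (4950, 10000)
print(equi_then_theta(A, B))  # (450, 1000)
```

With the equality term, the inequality is evaluated on 1000 candidate pairs instead of 10000, which is the kind of reduction in intermediate result size the suggestion is aiming at.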

Hi,
On 21/01/16 16:56, Martin Kersten wrote:
You are using a theta-join (A.a1 < B.b1), which means a database engine mostly has to fall back on a Cartesian product. This leads to a large intermediate result.
To make it more efficient, I would have expected at least one other join term, e.g.
SELECT count(*) FROM A, B WHERE A.k = B.k AND A.a1 < B.b1
That's a valid point. I have reduced the selectivity of my query. At this point, however, I am more interested in data parallelism than in absolute query execution time. In other words, I wanted to show that my query uses all available cores. I have achieved this by:
a) partitioning my two tables manually by creating merge tables.
b) reshaping my query to:
select * from (TJ1 union all TJ2 ... union TJn) as tmp;
where each TJ(i) is a theta-join operating on a different pair of table partitions.
I can observe now (using tomograph, which by the way is an excellent tool) that the engine parallelises the sub-thetajoin operations across all cores.
I notice a number of "bat.append" calls, which I presume put the partial results back together. Do these calls materialise the results in memory or on disk?
Thanks,
Alexandros
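The decomposition described in a) and b) can be sketched outside the database as follows. This is only an illustration of the idea: partition, theta_join and parallel_theta_join are hypothetical helper names, list slices stand in for the merge-table partitions, and extending a Python list stands in for UNION ALL. Python threads will not give real CPU parallelism for this loop, so the sketch shows how the work splits into independent partition pairs, not the speed-up seen in tomograph.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def partition(rows, n_parts):
    """Split rows into at most n_parts contiguous slices."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def theta_join(part_a, part_b):
    """One sub-join TJ(i): all pairs (a, b) from the two slices with a < b."""
    return [(a, b) for a in part_a for b in part_b if a < b]


def parallel_theta_join(A, B, n_parts, workers):
    """Run one theta-join per pair of partitions and concatenate the partial results."""
    pairs = list(product(partition(A, n_parts), partition(B, n_parts)))
    result = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields partial results in the order of `pairs`.
        for chunk in pool.map(lambda p: theta_join(*p), pairs):
            result.extend(chunk)  # plays the role of UNION ALL
    return result


A = list(range(1, 201))
B = list(range(1, 201))
joined = parallel_theta_join(A, B, n_parts=4, workers=4)
print(len(joined))  # 19900
```

Every pair of partitions has to be joined (n_parts squared sub-joins here), since a row in any slice of A can match rows in any slice of B.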
participants (3)
- Alexandros Koliousis
- Koliousis, Alexandros
- Martin Kersten