Hello MonetDB Users,

In short:
1) What does the nthreads setting mean?
2) Why does performance increase as you increase the number of database instances on a single farm? Is there any way to avoid this?




In long:
What does the nthreads setting mean? According to the man page [1], it is the number of worker threads that perform main processing. Is this the total number of threads for that database instance, or is it per query? I compared the performance of different settings and find it strange that nthreads=8 performs best, because I have 12 cores (confirmed with nproc). Here are my plots showing that 8 threads perform best on my system:

Throughput: https://cs.uwaterloo.ca/~jmate/nthreads-and-database-instances/throughput-vs-num-clients-vs-nthreads.pdf
Average Response Time: https://cs.uwaterloo.ca/~jmate/nthreads-and-database-instances/rsptm-avg-vs-num-clients-vs-nthreads.pdf
99th Percentile Response Time: https://cs.uwaterloo.ca/~jmate/nthreads-and-database-instances/rsptm-p99-vs-num-clients-vs-nthreads.pdf



The second thing is: why does performance increase if I distribute the concurrent clients over multiple database instances on the same farm? I expected the opposite: I thought adding more database instances to the same farm would increase overhead, so performance should decrease. Is there any way to avoid this? Below are the plots I have. I used 40 concurrent TPC-H clients for all of them:

Throughput: https://cs.uwaterloo.ca/~jmate/nthreads-and-database-instances/throughput-vs-num-dbs-vs-nthreads.pdf
Average Response Time: https://cs.uwaterloo.ca/~jmate/nthreads-and-database-instances/rsptm-avg-vs-num-dbs-vs-nthreads.pdf
99th Percentile Response Time: https://cs.uwaterloo.ca/~jmate/nthreads-and-database-instances/rsptm-p99-vs-num-dbs-vs-nthreads.pdf




Context:
To give you guys some context on my project: I'm a Master's student doing my research on DBaaS tenant placement. I am evaluating some placement algorithms by using the TPC-H workload and MonetDB as the database. I am using separate database instances on a single farm for isolation.

The workload I'm testing with runs read-only queries from the TPC-H benchmark. Each TPC-H client is a thread with its own persistent connection to the database, and it runs the following pseudocode:

while true
  for queryNum in 1 ... 22 # except query 15, which creates a tmp table
    run query queryNum
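
For anyone who wants to reproduce the client behaviour, the pseudocode above could be sketched in Python roughly as follows. This is only an illustration, not my actual benchmark driver: run_query is a hypothetical callable that stands in for executing one TPC-H query over the client's persistent connection, and stop_event is an addition of mine so that the otherwise endless loop can be stopped cleanly.

```python
import threading

# TPC-H queries 1..22, skipping query 15 because it creates a tmp table.
QUERY_NUMBERS = [n for n in range(1, 23) if n != 15]


def client_loop(run_query, stop_event):
    """Cycle through the read-only queries until stop_event is set.

    run_query is a hypothetical callable taking a query number; in a real
    setup it would execute that query on the client's own connection.
    Returns the number of queries this client completed.
    """
    completed = 0
    while not stop_event.is_set():
        for query_num in QUERY_NUMBERS:
            if stop_event.is_set():
                break
            run_query(query_num)
            completed += 1
    return completed
```

Each client would run client_loop in its own threading.Thread with its own connection, and the driver would set the shared threading.Event once the measurement interval is over.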

Each database instance has 100MB of data.

Thank you for your time,
Joseph

References:
[1] https://www.monetdb.org/Documentation/monetdb-man-page