Join memory overhead, COPY INTO csv, MT_mremap failure.
Hello all,

For the purpose of evaluating MonetDB, I am trying to perform the following:

- Load two csv files into two tables, from disk A.
- Perform a JOIN operation on two fields, using disk B for the database.
- Pipe the result to a csv file, on disk A.

This results in very high disk B usage and a subsequent MT_mremap failure, as seen in merovingian.log:

2015-04-08 12:50:14 ERR google-trace[2737]: = gdk_posix.c:428: MT_mremap(./bat/06/675.tail,7e8738e90000,91992817664,110391328768): GDKextendf() failed

Does somebody have ideas or explanations? Details follow.

I am loading csv data from hard drive A. The files have these sizes:

-rw-r--r-- 1 fre fre 2.6G Apr  5 16:25 task_events_cut.csv
-rw-r--r-- 1 fre fre  34G Apr  5 16:04 task_usage_cut.csv

I create a database on a separate hard drive B (2TB):

rm -rf /mnt/diskB/mdb
mkdir /mnt/diskB/mdb/
monetdbd create /mnt/diskB/mdb
monetdbd start /mnt/diskB/mdb
monetdb create google-trace
monetdb release google-trace

I then load the data into the database:

mclient ct_trace_events_reduced.sql -d google-trace
pv /mnt/diskA/task_events_cut.csv | mclient -d google-trace -s "COPY INTO task_events_reduced FROM STDIN USING DELIMITERS ',','\\n'" -
mclient ct_trace_usage_reduced.sql -d google-trace
pv /mnt/diskA/task_usage_cut.csv | mclient -d google-trace -s "COPY INTO task_usage_reduced FROM STDIN USING DELIMITERS ',','\\n'" -

using very standard table-creation scripts:

$ cat ct_trace_events_reduced.sql ct_trace_usage_reduced.sql
DROP TABLE task_events_reduced;
CREATE TABLE task_events_reduced (
  "job_id"   BIGINT,
  "task_id"  BIGINT,
  "class"    SMALLINT,
  "priority" SMALLINT);
DROP TABLE task_usage_reduced;
CREATE TABLE task_usage_reduced (
  "job_id"     BIGINT,
  "task_id"    BIGINT,
  "cpu_mean"   FLOAT,
  "cpu_sample" FLOAT);

These two load operations take about 50 minutes, which is very reasonable.

I then use mclient to run my join:

mclient join.sql -d google-trace

using the script:

$ cat join.sql
COPY (SELECT te.job_id, te.task_id, te.class, te.priority, tu.cpu_mean, tu.cpu_sample
      FROM (SELECT * FROM task_events_reduced) AS te
      RIGHT JOIN (SELECT * FROM task_usage_reduced) AS tu
      ON (te.job_id = tu.job_id AND te.task_id = tu.task_id))
INTO '/diskA/join.csv' USING DELIMITERS ',','\n';

This results in more than three hours of data crunching on a Google Compute Engine machine (16 processors and 100GB RAM), during which disk B fills up until it is full (2TB HDD). Then the aforementioned error happens.

I am not expecting MonetDB to perform streaming I/O on the right-hand table of the join, but the disk usage seems quite high. Is there a way to force MonetDB to do a hash join?

Thanks a lot,
Valentin Reis
University of Warsaw
More information: this is using the MonetDB packages on Debian Wheezy :)

Thanks,
Valentin
On Wed, Apr 08, 2015 at 04:01:29PM +0200, Valentin Reis wrote:
> $ cat join.sql
> COPY (SELECT te.job_id, te.task_id, te.class, te.priority, tu.cpu_mean, tu.cpu_sample
>       FROM (SELECT * FROM task_events_reduced) AS te
>       RIGHT JOIN (SELECT * FROM task_usage_reduced) AS tu
>       ON (te.job_id = tu.job_id AND te.task_id = tu.task_id))
> INTO '/diskA/join.csv' USING DELIMITERS ',','\n';

Why are you using a RIGHT 'OUTER' join? Outer joins are much more expensive than inner joins.
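For comparison, an inner-join variant of that export could look like this (a sketch only, keeping the table and column names from the thread; the output file name is made up, and rows of task_usage_reduced without a match would simply be dropped):

COPY (SELECT te.job_id, te.task_id, te.class, te.priority, tu.cpu_mean, tu.cpu_sample
      FROM task_events_reduced AS te
      JOIN task_usage_reduced AS tu            -- plain inner join instead of RIGHT JOIN
        ON te.job_id = tu.job_id AND te.task_id = tu.task_id)
INTO '/diskA/join_inner.csv' USING DELIMITERS ',','\n';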
Niels
-- Niels Nes, Manager ITF, Centrum Wiskunde & Informatica (CWI) Science Park 123, 1098 XG Amsterdam, The Netherlands room L3.14, phone ++31 20 592-4098 sip:4098@sip.cwi.nl url: https://www.cwi.nl/people/niels e-mail: Niels.Nes@cwi.nl
Quoting Niels Nes (2015-04-08 19:16:09)
> Why are you using a RIGHT 'OUTER' join? Outer joins are much more expensive than inner joins.
> Niels

Hello,
Denote:
A = task_usage_reduced
B = task_events_reduced

Morally, the goal is to augment the rows of A (the big table) with some information (class and priority) from B (the smaller table). I am not sure whether B contains information for all (job_id, task_id) entries of A (a probe query for this is sketched right after this message). In the cases where a row from A has no entry in B, I would like that row to be present in the output, only with NAs instead of the information from B. Therefore: RIGHT JOIN. Please do correct me again if I am mistaken, the SQL class is far away :)

What I hoped from MonetDB (and what I still hope to get :) ) is the following behaviour, denoting n the length of A and m the length of B:
- build a hash table of B indexed by job_id and task_id (somehow O(m));
- augment each row of A by doing an O(1) lookup in the hash table -> O(n).
Therefore: O(m+n) processing time with O(m+n) memory usage (hash table + result).

I am nonetheless looking at the performance of the inner join now.

Cheers,
Valentin
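The probe referenced above could be a query along these lines (a sketch, not run in this thread; if it returns zero, every row of A has a match in B and an inner join would produce the same rows):

-- Sketch: rows of the big table (A) with no matching (job_id, task_id) in the small table (B).
SELECT count(*)
FROM task_usage_reduced AS tu
WHERE NOT EXISTS (SELECT 1
                  FROM task_events_reduced AS te
                  WHERE te.job_id = tu.job_id
                    AND te.task_id = tu.task_id);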
Hi,

MonetDB will automatically create the hashes if that is considered worthwhile for the join operations. These hashes are non-persistent (in the stable releases), and it is known that there are corner cases where the hash is built on the wrong join leg. Calling the server with --algorithms will show you some details on the decisions that are made.

In general, understanding the behaviour of a system should be a multi-step approach, from tiny, small, ... up to huge tables. The SQL TRACE command will then show you where time is actually spent. This is particularly relevant when queries become more complex.

regards, Martin
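For the TRACE part, a scaled-down probe in mclient could look like this (a sketch; the join predicate mirrors the one in the thread, and one would typically try it on small samples of the tables first):

-- Sketch: prefixing a query with TRACE makes MonetDB report where the time is spent.
TRACE SELECT count(*)
FROM task_events_reduced AS te
JOIN task_usage_reduced AS tu
  ON te.job_id = tu.job_id AND te.task_id = tu.task_id;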
Valentin, some more questions for clarification:

task_usage_reduced (task_usage_cut.csv) is your larger table?
task_events_reduced (task_events_cut.csv) is your smaller table?

How many rows does each table have?

Is (job_id, task_id) unique in both tables (or at least in the smaller table)? I.e., is

select count(*) from (select job_id, task_id from task_events_reduced group by job_id, task_id) as x;

equal to

select count(*) from task_events_reduced;

and

select count(*) from (select job_id, task_id from task_usage_reduced group by job_id, task_id) as x;

equal to

select count(*) from task_usage_reduced;

?

Could you share the "PLAN" and "EXPLAIN" of your query? (Well, of the inner part, without "COPY ... INTO ...".)

Could you try to run

SELECT count(*)
FROM (SELECT job_id, task_id FROM task_events_reduced) AS te
RIGHT JOIN (SELECT job_id, task_id FROM task_usage_reduced) AS tu
ON (te.job_id = tu.job_id AND te.task_id = tu.task_id);

and report behaviour and result?

Thanks!
Stefan
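For reference, the two outputs can be obtained in mclient by prefixing the inner query, e.g. on the count query above (a sketch):

-- PLAN shows the relational plan, EXPLAIN the generated MAL program.
PLAN SELECT count(*)
FROM (SELECT job_id, task_id FROM task_events_reduced) AS te
RIGHT JOIN (SELECT job_id, task_id FROM task_usage_reduced) AS tu
ON (te.job_id = tu.job_id AND te.task_id = tu.task_id);

EXPLAIN SELECT count(*)
FROM (SELECT job_id, task_id FROM task_events_reduced) AS te
RIGHT JOIN (SELECT job_id, task_id FROM task_usage_reduced) AS tu
ON (te.job_id = tu.job_id AND te.task_id = tu.task_id);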
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Hello,

@Niels: I ran the same query using an INNER JOIN. MonetDB shows the same behaviour: disk I/O until the 2TB are used, then a crash in the same function (observed in merovingian.log).

@Martin: Is the --algorithms option supposed to be used with monetdb or monetdbd? I can't seem to find it in the man page.

@Stefan:
> task_usage_reduced (task_usage_cut.csv) is your larger table?
> task_events_reduced (task_events_cut.csv) is your smaller table?

Correct.
> How many rows does each table have?

select count(*) from (select job_id, task_id from task_events_reduced group by job_id, task_id) as x;
  25424731
select count(*) from task_events_reduced;
  144648288
select count(*) from (select job_id, task_id from task_usage_reduced group by job_id, task_id) as x;
  24281242
select count(*) from task_usage_reduced;
  1232799308
> Could you share the "PLAN" and "EXPLAIN" of your query? (Well, of the inner part without "COPY ... INTO ...")

I ran PLAN and EXPLAIN on the following query:

SELECT count(*)
FROM (SELECT job_id, task_id FROM task_events_reduced) AS te
RIGHT JOIN (SELECT job_id, task_id FROM task_usage_reduced) AS tu
ON (te.job_id = tu.job_id AND te.task_id = tu.task_id);

The results are attached.
> Could you try to run the count query and report behaviour and result?

The query does not finish. Same behaviour as always: the disk containing the mdb database fills up until it is full; then the aforementioned error happens.
To my understanding of the RIGHT JOIN, it should have exactly the same result as:

SELECT count(*) FROM task_usage_reduced;

Shouldn't it? Please do correct me if I am wrong on this one :) It is surprising to us: shouldn't the optimizer reduce it to this much simpler query?

Thanks all for the very responsive support,
Valentin & Krzysztof
> How many rows does each table have?
> select count(*) from (select job_id, task_id from task_events_reduced group by job_id, task_id) as x;
>   25424731
> select count(*) from task_events_reduced;
>   144648288
> select count(*) from (select job_id, task_id from task_usage_reduced group by job_id, task_id) as x;
>   24281242
> select count(*) from task_usage_reduced;
>   1232799308
Aha. On average, each (job_id, task_id) pair occurs in task_events_reduced roughly 144648288 / 25424731 ~= 5.7 times. Thus, each tuple from task_usage_reduced that finds a match in task_events_reduced actually finds ~5.7 matches on average, plus there are all the tuples that find no match at all. Hence the result is larger than task_usage_reduced; in fact it is both longer (more tuples) and wider (more attributes).
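The exact match count can be computed from the per-key counts alone, without materialising the wide result (a sketch, not something that was run in this thread):

-- Number of matching row pairs in the join, derived from the key distributions only.
SELECT sum(e.cnt * u.cnt) AS matching_pairs
FROM (SELECT job_id, task_id, count(*) AS cnt
      FROM task_events_reduced GROUP BY job_id, task_id) AS e
JOIN (SELECT job_id, task_id, count(*) AS cnt
      FROM task_usage_reduced GROUP BY job_id, task_id) AS u
  ON e.job_id = u.job_id AND e.task_id = u.task_id;
-- The RIGHT JOIN result has this many rows plus the unmatched rows of task_usage_reduced.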
> The results are attached.
Unfortunately not. Forgotten?
> To my understanding of the RIGHT JOIN, it should have exactly the same result as:
> SELECT count(*) FROM task_usage_reduced;
> Shouldn't it? Please do correct me if I am wrong on this one :)
Nope. It's a join, thus all multiple matches are in the result, plus all non-matches once.
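A toy illustration of that (table names and values are made up, not from the trace data):

CREATE TABLE small_t (k INT, v INT);
CREATE TABLE big_t   (k INT, w INT);
INSERT INTO small_t VALUES (1, 10);
INSERT INTO small_t VALUES (1, 11);   -- key 1 occurs twice on the left side
INSERT INTO big_t   VALUES (1, 100);
INSERT INTO big_t   VALUES (2, 200);
SELECT s.v, b.w
FROM small_t AS s RIGHT JOIN big_t AS b ON s.k = b.k;
-- 3 rows: (10,100), (11,100), (NULL,200), i.e. more than select count(*) from big_t, which is 2.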
> It is surprising to us: shouldn't the optimizer reduce it to this much simpler query?
Reduce it to which simpler query?

Stefan
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
participants (4): Martin Kersten, Niels Nes, Stefan Manegold, Valentin Reis