Hello all,

For the purpose of evaluating MonetDB, I am trying to do the following:

- load two CSV files from disk A into two tables,
- perform a JOIN on two fields, using disk B for the database,
- write the result to a CSV file on disk A.

This results in very high usage of disk B and a subsequent MT_mremap failure, as seen in merovingian.log:

2015-04-08 12:50:14 ERR google-trace[2737]: = gdk_posix.c:428: MT_mremap(./bat/06/675.tail,7e8738e90000,91992817664,110391328768): GDKextendf() failed

Does somebody have ideas or explanations? Details follow.

I am loading CSV data from hard drive A. The files have the following sizes:

-rw-r--r-- 1 fre fre 2.6G Apr 5 16:25 task_events_cut.csv
-rw-r--r-- 1 fre fre 34G Apr 5 16:04 task_usage_cut.csv

I create a database on a separate hard drive B (2 TB):

rm -rf /mnt/diskB/mdb
mkdir /mnt/diskB/mdb/
monetdbd create /mnt/diskB/mdb
monetdbd start /mnt/diskB/mdb
monetdb create google-trace
monetdb release google-trace

I then load the data into the database:

mclient ct_trace_events_reduced.sql -d google-trace
pv /mnt/diskA/task_events_cut.csv | mclient -d google-trace -s "COPY INTO task_events_reduced FROM STDIN USING DELIMITERS ',','\\n'" -
mclient ct_trace_usage_reduced.sql -d google-trace
pv /mnt/diskA/task_usage_cut.csv | mclient -d google-trace -s "COPY INTO task_usage_reduced FROM STDIN USING DELIMITERS ',','\\n'" -

using very standard scripts:

$ cat ct_trace_events_reduced.sql ct_trace_usage_reduced.sql
DROP TABLE task_events_reduced;
CREATE TABLE task_events_reduced (
  "job_id"   BIGINT,
  "task_id"  BIGINT,
  "class"    SMALLINT,
  "priority" SMALLINT
);
DROP TABLE task_usage_reduced;
CREATE TABLE task_usage_reduced (
  "job_id"     BIGINT,
  "task_id"    BIGINT,
  "cpu_mean"   FLOAT,
  "cpu_sample" FLOAT
);

These two operations take about 50 minutes, which is very reasonable.

I then use mclient to perform the join:

mclient join.sql -d google-trace

using the script:

$ cat join.sql
COPY (
  SELECT te.job_id, te.task_id, te.class, te.priority,
         tu.cpu_mean, tu.cpu_sample
  FROM (SELECT * FROM task_events_reduced) AS te
  RIGHT JOIN (SELECT * FROM task_usage_reduced) AS tu
    ON (te.job_id = tu.job_id AND te.task_id = tu.task_id)
) INTO '/diskA/join.csv' USING DELIMITERS ',','\n';

This results in more than three hours of data crunching on a Google Compute Engine machine (16 processors, 100 GB RAM), during which disk B fills up steadily until it is completely full (2 TB HDD). At that point the aforementioned error occurs.

I am not expecting MonetDB to perform streaming I/O on the right-hand table of the join, but the disk usage still seems quite high. Is there a way to force MonetDB to do a hash join?

Thanks a lot,
Valentin Reis
University of Warsaw
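
P.S. If it is useful, I can also post the query plan; I assume it can be obtained by prefixing the inner SELECT with EXPLAIN, roughly as sketched below (this is only a sketch using the table names from above; I have not verified that this is the right way to see which join algorithm is chosen):

# sketch, under the assumption that prefixing the SELECT with EXPLAIN prints the generated plan
mclient -d google-trace -s "EXPLAIN SELECT te.job_id, te.task_id, te.class, te.priority, tu.cpu_mean, tu.cpu_sample FROM task_events_reduced AS te RIGHT JOIN task_usage_reduced AS tu ON te.job_id = tu.job_id AND te.task_id = tu.task_id"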