INSERT using multiple cores (inspired by COPY from CSV in LOCKED mode)
Hi there, 1) It looks like INSERTing into tables is bounded by a single CPU. And it kills performance of a query that normally uses multiple CPUs... CREATE TEMP TABLE t1 AS (some 'select query' that uses multiple cores when run standalone) ON COMMIT PRESERVE ROWS; or INSERT into t1 (some 'select query' that uses multiple cores when run standalone); 2) Do you think you could implement INSERT with LOCKED option? (like it was done for COPY from CSV in LOCKED mode) Please let me know whether you think it's doable. Thank you, Anton
Here is a reproducible example. 1) select count(t1.v1) from t1, t2 where t1.v1=t2.v1; # run time=4.6s, 4 CPUs are used 2) create temp table t3 as (select count(t1.v1) from t1, t2 where t1.v1=t2.v1) on commit preserve rows; # run time=15s, 1CPU is used Tables t1, t2 were created from: create or replace function test1() RETURNS TABLE (v0 INT, v1 INT) LANGUAGE PYTHON { import numpy as np nrows=100000000 i_var=np.random.randint(1,10000000, (100000000,)) return [[i for i in range(nrows)],i_var] }; create table t1 as (select * from test1()) with data; create table t2 as (select distinct v1 from t1) with data; I am using MonetDB 5 server v11.29.3 "Mar2018" (64-bit, 128-bit integers) Copyright (c) 1993 - July 2008 CWI Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved Visit https://www.monetdb.org/ for further information Found 31.3GiB available memory, 4 available cpu cores Libraries: libpcre: 8.32 2012-11-30 (compiled with 8.32) openssl: OpenSSL 1.0.2k 26 Jan 2017 (compiled with OpenSSL 1.0.2l 25 May 2017) libxml2: 2.9.1 (compiled with 2.9.1) Compiled by: akravchenko@cent7-1 (x86_64-pc-linux-gnu) Compilation: gcc -std=gnu99 -O3 -fomit-frame-pointer -pipe -D_FORTIFY_SOURCE=2 Linking : /usr/bin/ld -m elf_x86_64 -Wl,-Bsymbolic-functions Thanks, Anton On Thu, Apr 12, 2018 at 9:51 AM, Anton Kravchenko < kravchenko.anton86@gmail.com> wrote:
Hi there,
1) It looks like INSERTing into tables is bounded by a single CPU. And it kills performance of a query that normally uses multiple CPUs...
CREATE TEMP TABLE t1 AS (some 'select query' that uses multiple cores when run standalone) ON COMMIT PRESERVE ROWS; or INSERT into t1 (some 'select query' that uses multiple cores when run standalone);
2) Do you think you could implement INSERT with LOCKED option? (like it was done for COPY from CSV in LOCKED mode)
Please let me know whether you think it's doable.
Thank you, Anton
participants (1)
-
Anton Kravchenko