On Mon, Jun 22, 2020 at 08:11:26PM +0400, Miro Mannino wrote:
Dear Niels Nes,
We would like to insert data column by column, for various reasons.
The problem is that if we create a larger table with columns T1 and T2 and insert 1000 rows into column T1, the next 1000 inserts into T2 are appended, resulting in 1000 rows of (val, NULL) followed by another 1000 rows of (NULL, val). We could do that if there were an option for INSERT INTO to start again from the first row (skipping over the values that are already non-NULL).
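For example, a minimal sketch of the behaviour we see (table name and values are made up):

    CREATE TABLE t (T1 INT, T2 INT);
    INSERT INTO t (T1) VALUES (1), (2);   -- rows become (1, NULL), (2, NULL)
    INSERT INTO t (T2) VALUES (10), (20); -- appended as (NULL, 10), (NULL, 20)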
In general, we are trying to use MonetDB to take advantage of it being a column store, but we cannot really insert a batch of values for one column and then another batch for the next column, because the usual semantics of INSERT INTO work in rows.
Looking forward to hearing from you,
Miro

Column-by-column inserts are indeed a problem, the SQL language being row oriented. We do have a solution, i.e. binary COPY INTO, but that may be a bit too much for you. See https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/BinaryBulkLoad

Niels
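For reference, the binary bulk load Niels points to has roughly this shape (a sketch following the linked recipe; the table and file paths are placeholders, and each file is assumed to hold the raw binary dump of one column):

    CREATE TABLE t (T1 INT, T2 INT);
    COPY BINARY INTO t FROM ('/path/to/T1.bin', '/path/to/T2.bin');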
On Mon, Jun 22, 2020 at 8:02 PM Niels Nes wrote:
Min,
Why don't you just create one larger table?
Niels
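For illustration, a minimal sketch of that suggestion (the column types, rand(), and generate_series mirror Min's own example below, so they are assumptions here): one larger table holds both value columns, and a single insert fills them together, so no join is needed afterwards:

    CREATE TABLE t (rowNumber INT, val1 INT, val2 INT);
    INSERT INTO t
    SELECT ROW_NUMBER() OVER (), rand(), rand()
    FROM generate_series(0, 10000);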
On 22 June 2020 14:59:44 CEST, Min Kyu Jung wrote:
To whom this may concern,
I am Min, and I am currently using MonetDB for a project. I was wondering whether MonetDB has any optimizations for joining two tables that have the same number of rows. For example, consider the following two tables:

    CREATE TABLE t1 (rowNumber INT, val INT);
    WITH series AS (SELECT * FROM generate_series(0, 10000))
    INSERT INTO t1 SELECT ROW_NUMBER() OVER (), rand() FROM series;

    CREATE TABLE t2 (rowNumber INT, val INT);
    WITH series AS (SELECT * FROM generate_series(0, 10000))
    INSERT INTO t2 SELECT ROW_NUMBER() OVER (), rand() FROM series;
In this situation, I currently join the tables as follows:

    CREATE TABLE t3 (rowNumber INT, val1 INT, val2 INT);
    INSERT INTO t3
    SELECT ROW_NUMBER() OVER (), t1.val, t2.val
    FROM t1 INNER JOIN t2 ON t1.rowNumber = t2.rowNumber;
Given that MonetDB is a column store, I was wondering whether there is a simpler, faster, or more optimized way of stitching the two tables into one.
To be clearer: as of right now, I make sure all tables have a rowNumber column. However, is there a way to join multiple tables (all with the same number of rows) without pre-generating a rowNumber column for each table? Something along these lines (pseudo-SQL):

    INSERT INTO t3
    SELECT t1.val, t2.val
    FROM t1 INNER JOIN t2 ON row_number(t1) = row_number(t2);
Sincerely, Min
--
Niels Nes, CTO, MonetDB Solutions
url: https://www.monetdbsolutions.com
e-mail: Niels.Nes@monetdbsolutions.com