large virtual memory spike on BLOB column select

I am using MonetDB v11.27.9 on CentOS 7.

1) 'select v_blob from blob_table' produces a virtual memory spike of ~800GB (adding a 'limit' reproduces the spike too). From 'select * from storage()' it looks like the blob column size is ~25GB.
2) 'create table new_blob_table as (select v_blob from blob_table) with data;' produces only ~50GB of virtual memory.
3) For a table with a non-blob column, both 'select * from nonblob_table' and 'create table new_nonblobtable as (select * from nonblob_table) with data;' produce about the same virtual memory consumption.

P.S. Here are the steps to reproduce this issue.

# to generate data by using Python
    hex_blob = '47' * 2000
    hex_blob_chunk = (hex_blob + '\n') * 1000
    f = open('blob_data.txt', 'w')
    for irow in range(10000):
        f.write(hex_blob_chunk)
    f.close()

# to load data into Monet blob_table
    create table blob_table(v_blob blob);
    COPY 10000000 RECORDS INTO blob_table FROM '/home/blob_data.txt' using delimiters ',','\n' NULL AS '' LOCKED;

# to produce virtual memory spike of ~800GB
    select * from blob_table limit 50000;

# to produce virtual memory of ~50GB
    create table new_blob_table as (select * from blob_table) with data;

Thanks,
Anton
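For anyone who wants to try the reproduction at a smaller scale, here is a minimal sketch of the same generator with the row count as a parameter, plus the size arithmetic for the full run. The function names `write_blob_file` and `expected_sizes` are made up for illustration and are not part of the original script. It assumes each hex pair encodes one byte, so a line of `'47' * 2000` is 4000 hex characters, i.e. a 2000-byte blob.

```python
def write_blob_file(path, rows, bytes_per_blob=2000, hex_byte="47"):
    """Write `rows` lines, each holding one hex-encoded blob.

    Returns the number of characters written. Hypothetical helper,
    parametrized so the test can be run with far fewer rows.
    """
    line = hex_byte * bytes_per_blob + "\n"
    # newline="\n" keeps line endings identical on every platform.
    with open(path, "w", newline="\n") as f:
        for _ in range(rows):
            f.write(line)
    return len(line) * rows


def expected_sizes(rows, bytes_per_blob=2000):
    """Return (text file size, decoded blob payload) in bytes."""
    text_bytes = rows * (2 * bytes_per_blob + 1)  # two hex chars per byte + newline
    payload_bytes = rows * bytes_per_blob
    return text_bytes, payload_bytes


if __name__ == "__main__":
    text_bytes, payload_bytes = expected_sizes(10_000_000)
    print(f"text file: {text_bytes / 1e9:.2f} GB")
    print(f"blob payload: {payload_bytes / 1e9:.2f} GB")
```

Under these assumptions the full 10,000,000-row run writes about 40 GB of text holding about 20 GB of decoded blob payload. That is the same order of magnitude as the ~25GB column size reported by storage(), and far below the ~800GB spike.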

Please file this as a bug at https://bugs.monetdb.org/.

The problem is mitosis, which splits the table up into pieces and then assembles them again before applying the limit. This produces a copy of the data.

On 08/11/17 23:44, Anton Kravchenko wrote:
> I am using MonetDB v11.27.9 on cent7.
> [...]
-- Sjoerd Mullender
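
The explanation in the reply (split the table into pieces, reassemble, then apply the limit) can be illustrated with a toy model. This is only a sketch in plain Python lists, not MonetDB's implementation. The helpers `split`, `limit_after_reassembly` and `limit_per_piece` are hypothetical names. The second strategy is included purely for contrast and is not a claim about what MonetDB does or should do. Each function returns the limited result together with the number of values that were copied into the merged intermediate.

```python
def split(column, pieces):
    """Partition a column into contiguous slices (hypothetical helper)."""
    step = max(1, -(-len(column) // pieces))  # ceiling division, at least 1
    return [column[i:i + step] for i in range(0, len(column), step)]


def limit_after_reassembly(column, pieces, n):
    """Merge every slice back together, then apply the limit.

    The merged intermediate holds a full copy of the column.
    """
    merged = []
    for part in split(column, pieces):
        merged.extend(part)
    return merged[:n], len(merged)


def limit_per_piece(column, pieces, n):
    """Cut each slice to at most n values before merging (for contrast)."""
    merged = []
    for part in split(column, pieces):
        merged.extend(part[:n])
    return merged[:n], len(merged)


if __name__ == "__main__":
    column = list(range(1000))
    print(limit_after_reassembly(column, 8, 5))
    print(limit_per_piece(column, 8, 5))
```

In this toy model both strategies return the same first five values, but the first one copies all 1000 values into the intermediate while the second copies only 40. With multi-kilobyte blobs in place of small integers, a full intermediate copy of that kind is where the extra memory would go.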