
Hello Alberto,

I am looking into this issue and I would like to give you a small update and ask a couple of questions. Some background first: JSON objects are stored in MonetDB as strings. This creates overhead from the object keys and from numeric values that are now stored as strings.

If I understand your issue correctly you don't have the chance to commit the reconstructed JSON column. Is this correct? From some tests I performed, I noticed that even before committing, disk usage goes up slowly. This is probably unrelated to JSON itself, although I have not had a chance to verify this yet.

So I think the most relevant question at this stage is: how long are the keys in the JSON objects? I think that these are the most likely cause of the large overhead you are observing.

Best regards, Panos.

On 2022-09-28 21:17, Alberto Ferrari wrote:
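To make the overhead concrete, here is a rough back-of-the-envelope sketch in Python (illustrative only, not MonetDB internals; the field names and values are made up) comparing a row serialized as a JSON string with the same values stored as native numeric columns:

```python
import json
import struct

# Hypothetical row for illustration: a string-typed JSON column repeats the
# object keys in every row and stores the numbers as text.
row = {"sensor_id": 123456, "temperature": 21.75, "humidity": 48.2}

as_string = json.dumps(row)      # what a string-based JSON column would hold
native = struct.calcsize("qdd")  # one 64-bit int plus two doubles, stored natively

print(len(as_string.encode("utf-8")), "bytes as a JSON string")  # ~60 bytes, mostly keys
print(native, "bytes as native numeric columns")                 # 24 bytes
```

Most of the per-row cost in the string form comes from the repeated keys, which is why the key length matters.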

Hi Panos!

I'm not sure I fully understand your question "*If I understand your issue correctly you don't have the chance to commit the reconstructed JSON column*". We are working with auto-commit, if that is the question, but I don't think the commit would be an issue here.

The problem I saw in the past with JSON fields and space usage is that, as you mentioned, JSON fields are treated as strings. On the other hand, as far as I know (let me know if I'm wrong), MonetDB stores each column on disk based on the DISTINCT values it finds for that column. So, if for whatever reason the JSON keys are saved in a different order, or some keys are missing in some rows, that will result in different string values, which ends up storing "all the rows" on disk because of the many distinct values the table may contain.

BR Ariel

On Thu, Oct 6, 2022 at 1:00 PM Panagiotis Koutsourakis <panagiotis.koutsourakis@monetdbsolutions.com> wrote:
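Ariel's point about key order can be illustrated with a small Python sketch (illustrative only; the normalization shown is a possible client-side workaround, not something MonetDB does): two logically identical objects serialize to different strings, and any string-based store will see them as two distinct values.

```python
import json

# Two logically identical objects, serialized with different key order.
a = '{"id": 1, "status": "ok"}'
b = '{"status": "ok", "id": 1}'

print(json.loads(a) == json.loads(b))  # True: same JSON value
print(a == b)                          # False: two distinct strings

# A possible client-side workaround: normalize before inserting
# (sorted keys, no extra whitespace) so equal objects map to equal strings.
def normalize(s: str) -> str:
    return json.dumps(json.loads(s), sort_keys=True, separators=(",", ":"))

print(normalize(a) == normalize(b))    # True
```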

On 06/10/2022 18.38, Ariel Abadi wrote:
This is only true up to a point. If the combined size of the string values (the strings plus some overhead) is less than 64k, only distinct strings are saved. But if the combined string values grow beyond that, we use a different technique. There is a 1024-entry hash table where each entry points to the last string that was entered that hashes to that bucket. If a new string is added that hashes to a bucket pointing to an identical string, no new value is added; but if the string that the bucket points to is different, the new string is added and the bucket is overwritten to point to the new string. This means that if you have two different strings that hash to the same bucket and you add them in alternating fashion, your string storage will keep growing (if you're past the 64k boundary).
-- Sjoerd Mullender
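Here is a toy Python model of the heuristic Sjoerd describes (an illustrative sketch only, not MonetDB code; the hash function is a stand-in). It shows how two strings that collide in the same bucket, inserted in alternating fashion, keep appending new copies once the past-64k scheme is in effect:

```python
import hashlib

NBUCKETS = 1024  # matches the 1024-entry table described above

def bucket_of(s: str) -> int:
    # Stand-in hash for illustration; MonetDB uses its own internal string hash.
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % NBUCKETS

class StringHeapModel:
    """Toy model of the past-64k deduplication heuristic (not MonetDB code)."""

    def __init__(self):
        self.heap = []                    # stand-in for the string heap: one entry per stored copy
        self.buckets = [None] * NBUCKETS  # bucket -> index of the last string stored for it

    def add(self, s: str) -> int:
        h = bucket_of(s)
        idx = self.buckets[h]
        if idx is not None and self.heap[idx] == s:
            return idx                    # bucket points at an identical string: reuse it
        self.heap.append(s)               # otherwise store a new copy ...
        self.buckets[h] = len(self.heap) - 1
        return self.buckets[h]            # ... and repoint the bucket at it

# Pathological case: two different strings that happen to land in the same
# bucket, inserted in alternating fashion.
a = "key_0"
b = next(f"key_{i}" for i in range(1, 10**6)
         if bucket_of(f"key_{i}") == bucket_of(a))
m = StringHeapModel()
for _ in range(100):
    m.add(a)
    m.add(b)
print(len(m.heap))  # 200 stored copies for only 2 distinct strings
```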

Hi Ariel,

Let me rephrase the question: does the process that creates the JSON column finish successfully, or does the disk get full before it finishes? The second question is: how many bytes long are the keys in each JSON object? I am trying to understand whether the disk usage you see is normal operation and what, if anything, can be optimized.

Best regards, Panos.

On 2022-10-06 19:38, Ariel Abadi wrote:
participants (4)
- Alberto Ferrari
- Ariel Abadi
- Panagiotis Koutsourakis
- Sjoerd Mullender