MonetDB :: Memory Bloats Up On High Concurrency
Hi, has anyone encountered a memory bloat issue under high concurrency? Also, I don't see the memory coming down when the system is idle, and when it bloats up, the response time slows down. Regards, Sreejith
Hello,
I think that I have faced the same issue. I have a database with a
table that contains a few million entries. I have been trying to
add about ten new entries per second to that table. At the beginning,
the database was able to keep pace with the data but, after a
while, it started to consume more and more RAM and to slow
down. In that situation, it also becomes extremely slow to answer
queries. If I restart the database, everything seems to work fine
again.
I am using Ubuntu 16.04, and the MonetDB instance is running inside a
Docker container (also Ubuntu 16.04).
This is what I have noticed so far:
* It takes a while before the problem starts to appear
* The bigger the table, the sooner the problem appears
* The time required for an insertion stays more or less the same,
while the time required for a query increases
As soon as I can, I will try to prepare a few scripts to check whether
I can reproduce the problem that I have with my application. Then, I
will give you a more accurate description of the problem.
Best regards,
Stefano
On 17 Jul 2017, at 14:18, Stefano Piani wrote:
[...] The time required for an insertion stays more or less the same, while the time required for a query increases.
Hello Stefano, I was just about to suggest that the increasing response time of your queries might be caused by inserting individual tuples into a table with a primary key (that will certainly become slower over time), until I saw this remark of yours. I'll come back with more information on your issues in your other thread. Regards, Jennie
Hello Jennie,
I found your last comment extremely interesting, because inserting
individual tuples into a table with a primary key is exactly what I do
most of the time in my application. Could you please elaborate further
on this point? Do I have any alternatives?
Let me explain exactly what I want to do. In my database I have two
tables created in the following way:
CREATE SEQUENCE "retrievalid_seq" AS BIGINT;

CREATE TABLE retrievals (
    id BIGINT DEFAULT NEXT VALUE FOR "retrievalid_seq",
    created_at TIMESTAMP,
    geolocalized BOOLEAN NOT NULL,
    latitude DECIMAL(8,5),
    longitude DECIMAL(8,5),
    user_id BIGINT NOT NULL,
    data VARCHAR(500),
    PRIMARY KEY (id)
);

CREATE TABLE retrieval_users (
    user_id BIGINT,
    user_name VARCHAR(100),
    user_description VARCHAR(500),
    user_location VARCHAR(300),
    PRIMARY KEY (user_id)
);
I receive about ten retrievals per second, and I want to save them
in my "retrievals" table. Moreover, if a retrieval comes from a user
that is not already registered in the "retrieval_users" table, I
also add one entry to that table.
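Concretely, this is roughly what my script does for each retrieval (a simplified sketch using pymonetdb; the connection details here are illustrative):

import pymonetdb

conn = pymonetdb.connect(database="demo", hostname="localhost",
                         username="monetdb", password="monetdb")
cur = conn.cursor()

def store_retrieval(created_at, geolocalized, lat, lon,
                    user_id, data, name, description, location):
    # First round trip: is this user already registered?
    cur.execute("SELECT 1 FROM retrieval_users WHERE user_id = %s",
                (user_id,))
    if cur.fetchone() is None:
        cur.execute("INSERT INTO retrieval_users VALUES (%s, %s, %s, %s)",
                    (user_id, name, description, location))
    # Second round trip: store the retrieval itself; the "id" column
    # is filled in by the DEFAULT from "retrievalid_seq".
    cur.execute("INSERT INTO retrievals (created_at, geolocalized, "
                "latitude, longitude, user_id, data) "
                "VALUES (%s, %s, %s, %s, %s, %s)",
                (created_at, geolocalized, lat, lon, user_id, data))
    conn.commit()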
What is your advice? Do you think that I should somehow disable the
index on the primary key of the tables (actually, I am not even sure
that this is possible)?
I tried to save the retrievals in batches, so that each transaction
saves one hundred or one thousand retrievals, but it didn't help.
Thank you for your time,
Stefano
On 19 Jul 2017, at 14:39, Stefano Piani wrote:
Hello Jennie, I found your last comment extremely interesting, because inserting individual tuples into a table with a primary key is exactly what I do most of the time in my application. Could you please elaborate further on this point? Do I have any alternatives?
Hai Stefano,

This is not something particular to MonetDB; it is a fairly general situation:

- Short transactions have relatively high overhead in an analytical DBMS such as MonetDB. I once ran a little experiment: executing ~20K individual INSERT INTOs (i.e. each in its own transaction) took 3m53.353s, while wrapping all those INSERT INTOs in a single transaction took 0m0.779s in total to execute.

- If a table has a primary key, then for every INSERT (and UPDATE) that involves the key, MonetDB has to check whether a key with the same value already exists, and this check becomes more expensive as the number of values in the table grows. It is simple to check the "id" key of "retrievals", because it is a sequence, but the "user_id" key of "retrieval_users" might require a full column scan, because the keys might not be sorted, and the INSERT will cause the indices to be discarded (if there were any).

Therefore, in the other e-mail thread ("Add entry in a table only if that entry does not already exist"), I recommended the first option, because both your original Python script and the SQL UDF cause the primary key column to be checked twice. In addition, the Python script does two round trips to the database server for each INSERT.

A general recommendation is to do bulk data loads. For more information, please see:
https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/LoadingBulkData
https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/BinaryBulkLoad
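To illustrate the single-transaction variant, here is a minimal sketch using pymonetdb (the connection details and data are made up for illustration; timings will of course vary):

import time
import pymonetdb

# pymonetdb connections default to autocommit off, so everything
# below runs inside one explicit transaction.
conn = pymonetdb.connect(database="demo", hostname="localhost",
                         username="monetdb", password="monetdb")
cur = conn.cursor()

rows = [("2017-07-19 14:39:00", True, 1.0, 2.0, i % 100, "payload %d" % i)
        for i in range(20000)]

t0 = time.time()
# 20K INSERTs, but only ONE transaction: a single commit at the end
# replaces 20K implicit per-statement commits.
cur.executemany(
    "INSERT INTO retrievals (created_at, geolocalized, latitude, "
    "longitude, user_id, data) VALUES (%s, %s, %s, %s, %s, %s)",
    rows)
conn.commit()
print("batched insert took %.3fs" % (time.time() - t0))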
[...] I receive about ten retrievals per second, and I want to save them in my "retrievals" table. Moreover, if a retrieval comes from a user that is not already registered in the "retrieval_users" table, I also add one entry to that table.
How are the "user_id"-s generated? Ten retrievals per second shouldn't be a problem, even if you do the INSERTs in individual transactions. Unless you expect this rate to grow much higher, or there are many other transactions/queries, your Python script should be sufficient...
What is your advice? Do you think that I should somehow disable the index on the primary key of the tables (actually, I am not even sure that this is possible)? I tried to save the retrievals in batches, so that each transaction saves one hundred or one thousand retrievals, but it didn't help. Thank you for your time, Stefano
Normally, for users who are sure that the to-be-inserted tuples won't violate the primary key constraint, we recommend to: i) drop the primary key constraint; ii) do the batch insert; iii) re-add the primary key constraint. This way, in step iii), MonetDB only conducts the primary key check once for the whole batch.

Yeah, you can drop the primary key with an ALTER TABLE statement:

sql>\d myusers
CREATE TABLE "sys"."myusers" (
    "user_id"     INTEGER NOT NULL,
    "name"        VARCHAR(20),
    "description" VARCHAR(20),
    "location"    VARCHAR(20),
    CONSTRAINT "myusers_user_id_pkey" PRIMARY KEY ("user_id")
);
sql>alter table myusers drop constraint "myusers_user_id_pkey";
operation successful (12.123ms)

To re-enable the primary key:

sql>alter table myusers add primary key (user_id);
operation successful (56.804ms)

But this may not be useful for your application... Based on the information you have given so far, I'd still say the easiest way is to just always execute the INSERT and ignore the error.

Regards, Jennie
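For illustration, both options might look like this in Python (a sketch assuming pymonetdb; it uses the "myusers" example table above, and the connection details are made up):

import pymonetdb

conn = pymonetdb.connect(database="demo", hostname="localhost",
                         username="monetdb", password="monetdb")
cur = conn.cursor()

def batch_insert_without_pk_checks(rows):
    # Option 1: drop the PK, bulk insert, re-add the PK.
    # The constraint is then checked once for the whole batch.
    cur.execute('ALTER TABLE myusers DROP CONSTRAINT "myusers_user_id_pkey"')
    cur.executemany("INSERT INTO myusers VALUES (%s, %s, %s, %s)", rows)
    cur.execute("ALTER TABLE myusers ADD PRIMARY KEY (user_id)")
    conn.commit()

def insert_ignoring_duplicates(row):
    # Option 2: always INSERT and swallow the duplicate-key error.
    try:
        cur.execute("INSERT INTO myusers VALUES (%s, %s, %s, %s)", row)
        conn.commit()
    except pymonetdb.Error:
        conn.rollback()  # the key already existed; nothing to do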
On 19 Jul 2017, at 14:39, Stefano Piani wrote: [...]
Hai Stefano,
I tried to save the retrievals in batches, so that each transaction saves one hundred or one thousand retrievals,
We’d definitely recommend such bulk operations.
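For example, a server-side bulk load with COPY INTO can replace thousands of individual INSERTs (a sketch; it assumes the CSV file already sits on the server, readable by the MonetDB daemon, and the path is illustrative):

import pymonetdb

conn = pymonetdb.connect(database="demo", hostname="localhost",
                         username="monetdb", password="monetdb")
cur = conn.cursor()
# One COPY INTO statement loads the whole file in a single transaction.
cur.execute("COPY INTO retrievals (created_at, geolocalized, latitude, "
            "longitude, user_id, data) "
            "FROM '/tmp/retrievals.csv' USING DELIMITERS ',', '\\n', '\"'")
conn.commit()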
but it didn't help.
What exactly didn't help? The performance, or did you get an error? Jennie
Hello Sreejith,

Can you be a bit more specific about the memory bloat issue you have encountered?
- Which version of MonetDB?
- What hardware / VM environment?
- What (type of) queries?
- What level of concurrency: query rate and number of concurrent users?
- How much data is processed?
- How much memory is consumed by MonetDB, and how does the memory consumption grow?
- How does the response time grow?

Thanks! Jennie
On 15 Jul 2017, at 07:31, Sharma, Sreejith wrote: [...]
participants (3)
- Sharma, Sreejith
- Stefano Piani
- Ying Zhang