MonetDB :: Memory Bloats Up On High Concurrency
Hi, has anyone encountered a memory bloat issue under high concurrency? Also, I don't see the memory coming down when the system is idle, and when it bloats up, the response time slows down. Regards, Sreejith
Hello,
I think that I have faced the same issue. I have a database with a
table that contains a few million entries. I have been trying to
add about ten new entries per second to that table. At the beginning,
the database was able to keep pace with the data but, after a
while, it started to consume more and more RAM and to slow
down. In that situation, it also becomes extremely slow to answer
queries. If I restart the database, everything seems to work fine
again.
I am using Ubuntu 16.04, and the MonetDB instance is running inside a
Docker container (also Ubuntu 16.04).
This is what I have noticed so far:
* It takes a while before the problem starts to appear
* The bigger the table, the sooner the problem appears
* The time required for an insertion stays more or less the same,
while the time required for a query increases
As soon as I can, I will try to prepare a few scripts to check whether
I can reproduce the problem that I have with my application. Then, I
will give you a more accurate description of the problem.
Best regards,
Stefano
On 17 Jul 2017, at 14:18, Stefano Piani wrote:
[...] The time required for an insertion stays more or less the same, while the time required for a query increases.
Hello Stefano, I was just about to suggest that the increasing response time of your queries might be caused by inserting individual tuples into a table with a primary key (that will certainly become slower over time), until I saw this remark of yours. I'll come back with more information on your issues in your other thread. Regards, Jennie
Hello Jennie,
I found your last comment extremely interesting, because inserting
individual tuples into a table with a primary key is exactly what I do
most of the time in my application. Could you please elaborate further
on this point? Do I have any alternatives?
Let me explain exactly what I want to do. In my database I have two
tables created in the following way:
CREATE SEQUENCE "retrievalid_seq" AS BIGINT;

CREATE TABLE retrievals (
    id BIGINT DEFAULT NEXT VALUE FOR "retrievalid_seq",
    created_at TIMESTAMP,
    geolocalized BOOLEAN NOT NULL,
    latitude DECIMAL(8,5),
    longitude DECIMAL(8,5),
    user_id BIGINT NOT NULL,
    data VARCHAR(500),
    PRIMARY KEY (id)
);

CREATE TABLE retrieval_users (
    user_id BIGINT,
    user_name VARCHAR(100),
    user_description VARCHAR(500),
    user_location VARCHAR(300),
    PRIMARY KEY (user_id)
);
I receive about ten retrievals per second, and I want to save them
in my "retrievals" table. Moreover, if a retrieval comes from a user
that is not already registered in the "retrieval_users" table, I
also add one entry to that table.
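Concretely, this is roughly what my script does for each retrieval (a simplified sketch using pymonetdb; the connection details here are illustrative):

import pymonetdb

conn = pymonetdb.connect(database="demo", hostname="localhost",
                         username="monetdb", password="monetdb")
cur = conn.cursor()

def store_retrieval(created_at, geolocalized, lat, lon,
                    user_id, data, name, description, location):
    # First round trip: is this user already registered?
    cur.execute("SELECT 1 FROM retrieval_users WHERE user_id = %s",
                (user_id,))
    if cur.fetchone() is None:
        cur.execute("INSERT INTO retrieval_users VALUES (%s, %s, %s, %s)",
                    (user_id, name, description, location))
    # Second round trip: store the retrieval itself; the "id" column
    # is filled in by the DEFAULT from "retrievalid_seq".
    cur.execute("INSERT INTO retrievals (created_at, geolocalized, "
                "latitude, longitude, user_id, data) "
                "VALUES (%s, %s, %s, %s, %s, %s)",
                (created_at, geolocalized, lat, lon, user_id, data))
    conn.commit()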
What is your advice? Do you think that I should somehow disable the
index on the primary key of the tables (actually, I am not even sure
that this is possible)?
I tried to save the retrievals in batches, so that each transaction
saves one hundred or one thousand retrievals, but it didn't help.
Thank you for your time,
Stefano
On 19 Jul 2017, at 14:39, Stefano Piani wrote:
Hello Jennie, I found your last comment extremely interesting, because inserting individual tuples into a table with a primary key is exactly what I do most of the time in my application. Could you please elaborate further on this point? Do I have any alternatives?
Hai Stefano,

This is not something particular to MonetDB; it is a fairly general situation:

- Short transactions have relatively high overhead in an analytical DBMS such as MonetDB. I once ran a little experiment: executing ~20K individual INSERT INTOs (i.e. each in its own transaction) took 3m53.353s, while wrapping all those INSERT INTOs in a single transaction took 0m0.779s in total to execute.

- If a table has a primary key, then for every INSERT (and UPDATE) that involves the key, MonetDB has to check whether a key with the same value already exists, and this check becomes more expensive as the number of values in the table grows. It is simple to check the "id" key of "retrievals", because it is a sequence, but the "user_id" key of "retrieval_users" might require a full column scan, because the keys might not be sorted, and the INSERT will cause the indices to be discarded (if there were any).

Therefore, in the other e-mail thread ("Add entry in a table only if that entry does not already exist"), I recommended the first option, because both your original Python script and the SQL UDF cause the primary key column to be checked twice. In addition, the Python script does two round trips to the database server for each INSERT.

A general recommendation is to do bulk data loads. For more information, please see:
https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/LoadingBulkData
https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/BinaryBulkLoad
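To illustrate the single-transaction variant, here is a minimal sketch using pymonetdb (the connection details and data are made up for illustration; timings will of course vary):

import time
import pymonetdb

# pymonetdb connections default to autocommit off, so everything
# below runs inside one explicit transaction.
conn = pymonetdb.connect(database="demo", hostname="localhost",
                         username="monetdb", password="monetdb")
cur = conn.cursor()

rows = [("2017-07-19 14:39:00", True, 1.0, 2.0, i % 100, "payload %d" % i)
        for i in range(20000)]

t0 = time.time()
# 20K INSERTs, but only ONE transaction: a single commit at the end
# replaces 20K implicit per-statement commits.
cur.executemany(
    "INSERT INTO retrievals (created_at, geolocalized, latitude, "
    "longitude, user_id, data) VALUES (%s, %s, %s, %s, %s, %s)",
    rows)
conn.commit()
print("batched insert took %.3fs" % (time.time() - t0))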
[...] I receive about ten retrievals per second, and I want to save them in my "retrievals" table. Moreover, if a retrieval comes from a user that is not already registered in the "retrieval_users" table, I also add one entry to that table.
How are the "user_id"-s generated? Ten retrievals per second shouldn't be a problem, even if you do the INSERTs in individual transactions. Unless you expect this rate to grow much higher, or there are many other transactions/queries, your Python script should be sufficient...
What is your advice? Do you think that I should somehow disable the index on the primary key of the tables (actually, I am not even sure that this is possible)? I tried to save the retrievals in batches, so that each transaction saves one hundred or one thousand retrievals, but it didn't help. Thank you for your time, Stefano
Normally, for users who are sure that the to-be-inserted tuples won't violate the primary key constraint, we recommend to: i) drop the primary key constraint; ii) do the batch insert; iii) re-add the primary key constraint. This way, in step iii), MonetDB only conducts the primary key check once for the whole batch.

Yeah, you can drop the primary key with an ALTER TABLE statement:

sql>\d myusers
CREATE TABLE "sys"."myusers" (
    "user_id"     INTEGER NOT NULL,
    "name"        VARCHAR(20),
    "description" VARCHAR(20),
    "location"    VARCHAR(20),
    CONSTRAINT "myusers_user_id_pkey" PRIMARY KEY ("user_id")
);
sql>alter table myusers drop constraint "myusers_user_id_pkey";
operation successful (12.123ms)

To re-enable the primary key:

sql>alter table myusers add primary key (user_id);
operation successful (56.804ms)

But this may not be useful for your application... Based on the information you have given so far, I'd still say the easiest way is to just always execute the INSERT and ignore the error.

Regards, Jennie
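For illustration, both options might look like this in Python (a sketch assuming pymonetdb; it uses the "myusers" example table above, and the connection details are made up):

import pymonetdb

conn = pymonetdb.connect(database="demo", hostname="localhost",
                         username="monetdb", password="monetdb")
cur = conn.cursor()

def batch_insert_without_pk_checks(rows):
    # Option 1: drop the PK, bulk insert, re-add the PK.
    # The constraint is then checked once for the whole batch.
    cur.execute('ALTER TABLE myusers DROP CONSTRAINT "myusers_user_id_pkey"')
    cur.executemany("INSERT INTO myusers VALUES (%s, %s, %s, %s)", rows)
    cur.execute("ALTER TABLE myusers ADD PRIMARY KEY (user_id)")
    conn.commit()

def insert_ignoring_duplicates(row):
    # Option 2: always INSERT and swallow the duplicate-key error.
    try:
        cur.execute("INSERT INTO myusers VALUES (%s, %s, %s, %s)", row)
        conn.commit()
    except pymonetdb.Error:
        conn.rollback()  # the key already existed; nothing to do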
On 19 Jul 2017, at 14:39, Stefano Piani wrote: [...]
Hai Stefano,
I tried to save the retrievals in batches, so that each transaction saves one hundred or one thousand retrievals,
We’d definitely recommend such bulk operations.
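For example, a server-side bulk load with COPY INTO can replace thousands of individual INSERTs (a sketch; it assumes the CSV file already sits on the server, readable by the MonetDB daemon, and the path is illustrative):

import pymonetdb

conn = pymonetdb.connect(database="demo", hostname="localhost",
                         username="monetdb", password="monetdb")
cur = conn.cursor()
# One COPY INTO statement loads the whole file in a single transaction.
cur.execute("COPY INTO retrievals (created_at, geolocalized, latitude, "
            "longitude, user_id, data) "
            "FROM '/tmp/retrievals.csv' USING DELIMITERS ',', '\\n', '\"'")
conn.commit()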
but it didn't help.
What exactly didn't help? The performance, or did you get an error? Jennie
Hello Sreejith,

Can you be a bit more specific about the memory bloat issue you have encountered?
- Which version of MonetDB?
- What hardware / VM environment?
- What (type of) queries?
- What level of concurrency: query rate and number of concurrent users?
- How much data is processed?
- How much memory is consumed by MonetDB, and how does the memory consumption grow?
- How does the response time grow?

Thanks! Jennie
On 15 Jul 2017, at 07:31, Sharma, Sreejith wrote: [...]
participants (3)
- Sharma, Sreejith
- Stefano Piani
- Ying Zhang