Possible problem with C++ based CREATE FUNCTION
I am trying to create a function that works around concurrency issues that are documented, and that we are seeing, when doing simultaneous "create table" statements and bulk insert statements (or selects) off of other tables. To do this, I am implementing a function that uses a boost::shared_mutex that we call around transactions. We are using a custom version of the boost library, so it is installed in a non-standard place: /usr/local/boost-1.54.0 for development, and /opt/xxx/ for production.
From the documentation, it appears that I can use #pragma LDFLAGS and CFLAGS to point to the non-standard directories. It appears to honor the CFLAGS option, but not the LDFLAGS option. If I use the standard /usr/lib64 libraries, calling my function works (but with the wrong library). Adding the #pragma in, I get the following link error:
Failed to open shared library: libboost_thread.so.1.54.0: cannot open shared object file: No such file or directory.
If I modify LD_LIBRARY_PATH before starting MonetDB, the problem goes away. Ideally, we don't want to have to add all of our relevant directories into the environment, but would expect the #pragma to work.
Here is the function definition:
CREATE FUNCTION rwlocker(input INT)
RETURNS INT
LANGUAGE CPP {
#pragma LDFLAGS -L/usr/local/boost-1.54.0/lib -lboost_thread
#pragma CFLAGS -I/opt/tc3/include -I/usr/local/boost-1.54.0/include
#include
Hai Dave, I’ll leave the solution to your missing lib. to someone else, but after having read your email, I’d like to point out two things: i) we do not recommend external “interference” in the transaction management. We’ve seen various disasters caused by such interference in the past. Sometimes, they surface after having been run for months in production systems. What you’re doing here has high potential for deadlocks, because you’re adding more locks next to the MonetDB internal locks. It’s more advisable to first try to resolve the concurrency issues you described here in your application set-up. ii) the SQL C/C++ UDFs are more suitable for prototyping. A more stable alternative to add UDFs is described in this repository: https://dev.monetdb.org/hg/MonetDB-extend Regards, Jennie
On 3 Sep 2019, at 03:16, Gotwisner, Dave
wrote: I am trying to create a function that works around concurrency issues that are documented, and that we are seeing, when doing simultaneous “create table” statements and bulk insert statements (or selects) off of other tables.
To do this, I am implementing a function that uses a boost::shared_mutex that we call around transactions.
We are using a custom version of the boost library, so it is installed in a non-standard place: /usr/local/boost-1.54.0 for development, and /opt/xxx/ for production.
From the documentation, it appears that I can use #pragma LDFLAGS and CFLAGS to point to the non-standard directories. It appears to honor the CFLAGS option, but not the LDFLAGS option. If I use the standard /usr/lib64 libraries, calling my function works (but with the wrong library). Adding the #pragma in, I get the following link error:
Failed to open shared library: libboost_thread.so.1.54.0: cannot open shared object file: No such file or directory.
If I modify LD_LIBRARY_PATH before starting MonetDB, the problem goes away. Ideally, we don’t want to have to add all of our relevant directories into the environment, but would expect the #pragma to work.
Here is the function definition: CREATE FUNCTION rwlocker(input INT) RETURNS INT LANGUAGE CPP { #pragma LDFLAGS -L/usr/local/boost-1.54.0/lib -lboost_thread #pragma CFLAGS -I/opt/tc3/include -I/usr/local/boost-1.54.0/include
#include
// use lock() for schema changes, lock_shared() for inserts and queries static boost::shared_mutex readerWriterMutex;
result->initialize(result, 1); int val = input.data[0]; switch(val) { case 1: // shared lock readerWriterMutex.lock_shared(); break; case 2: // shared unlock readerWriterMutex.unlock_shared(); break; case 3: // unique lock readerWriterMutex.lock(); break; case 4: // unique unlock readerWriterMutex.unlock(); break; default: // Error break; }
result->data[0] = val; };
Invocation is easiest with: select rwlocker(12); which will just output a 12, bypassing the reader/writer locking.
Any thoughts on why #pragma LDFLAGS doesn’t appear to work right?
We are using 11.33.3 on CentOS 7.3.1611
Thanks,
Dave _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
I understand. I would prefer not putting a mutex inside (or outside for that matter) MonetDB.
The problem is, we are seeing concurrency errors at our ingests and on our queries. Because of the high data rates being ingested, we use a series of tables that are one hour long, and we create tables in the future. We never try and query from, or ingest into these "future tables" until they are no longer "future tables". For example, we may create tables 7 days in advance. We typically keep 30 days worth of tables. Each hour as around 4 sets of tables (we ingest 4 with 4 sets of copy commands to different tables, in parallel) and have merge tables connecting them (different hours have different merge tables). For each of these 4 sets of tables, there are maybe 5 companion tables. This is the only way we were able to get the ingest rates we require.
Ingests are done by wrapping all files for one thread in a transaction. Queries are "auto transaction" at present.
With the prior version of MonetDB, we were seeing numerous "COMMIT: transaction is aborted because of concurrency conflicts, will ROLLBACK instead" errors on the COMMIT (or for the select statement). One of the MonetDB documents gives the following comment:
WARNING.
The transaction management scheme is based on optimistic concurrency control.
It provides each transaction with a consistent view on the database, but updates are collected in an addendum processed on transaction commit.
If at commit time it can be assured that the data prepared for update affects tables has not changed in the mean time, the results are merged. Otherwise the transaction is aborted.
This optimistic concurrency scheme is particularly useful for query dominant environments.
It negatively affects long running transactions which concurrently are affected by updates on their underlying tables.
The same holds for applications that tries to perform concurrent updates from multiple threads in a single application. They should be internally serialized by the application to avoid unexpected transaction aborts.
This seems to relate to this error. Also of reference from MonetDB site:
MonetDB does recommend our method of inserting on different threads into different tables (Example 2 in https://www.monetdb.org/Documentation/SQLreference/Permissions) but with different users and without mention of merge tables.
I searched MonetDB's site and I don't see a report of a concurrency problem involving only one thread of inserts and no create tables.
There's a fixed bug that seems similar to our problem: https://www.monetdb.org/bugzilla/show_bug.cgi?id=3210, although that’s with an older version and we are still seeing the problem.
We also tried the latest version 11.33.11, and no longer see this specific problem, but are now seeing a problem identical to that reported in https://www.monetdb.org/bugzilla/show_bug.cgi?id=3853. The problem with this new behavior is we get bad data and don’t know (without looking through the log file) that it failed. The concurrency problem is actually better for us than the new problem, as we can detect it and retry.
We do queries on a different machine then where MonetDB resides, and we do ingests (and table add/drop on the machine where MonetDB resides). Because of this, I need essentially a network wide mutex implementation. Implementing on a network is problematic and will be slow performing (if it is indeed possible). So, the single common point is MonetDB (or a parallel server with another port that all our programs must communicate with). Deadlock risk is identical in either case. So, a function that we call whenever we need to appropriately protect the database is the best way I have come up with. Of course, if someone has found a better way to protect ingest/query from table add/drop, I am open to doing this a different way.
Thanks again,
Dave
-----Original Message-----
From: users-list
On 3 Sep 2019, at 03:16, Gotwisner, Dave
mailto:Dave.Gotwisner@netscout.com> wrote:
I am trying to create a function that works around concurrency issues that are documented, and that we are seeing, when doing simultaneous “create table” statements and bulk insert statements (or selects) off of other tables.
To do this, I am implementing a function that uses a boost::shared_mutex that we call around transactions.
We are using a custom version of the boost library, so it is installed in a non-standard place: /usr/local/boost-1.54.0 for development, and /opt/xxx/ for production.
From the documentation, it appears that I can use #pragma LDFLAGS and CFLAGS to point to the non-standard directories. It appears to honor the CFLAGS option, but not the LDFLAGS option. If I use the standard /usr/lib64 libraries, calling my function works (but with the wrong library). Adding the #pragma in, I get the following link error:
Failed to open shared library: libboost_thread.so.1.54.0: cannot open shared object file: No such file or directory.
If I modify LD_LIBRARY_PATH before starting MonetDB, the problem goes away. Ideally, we don’t want to have to add all of our relevant directories into the environment, but would expect the #pragma to work.
Here is the function definition:
CREATE FUNCTION rwlocker(input INT)
RETURNS INT
LANGUAGE CPP {
#pragma LDFLAGS -L/usr/local/boost-1.54.0/lib -lboost_thread #pragma
CFLAGS -I/opt/tc3/include -I/usr/local/boost-1.54.0/include
#include
// use lock() for schema changes, lock_shared() for inserts and queries
static boost::shared_mutex readerWriterMutex;
result->initialize(result, 1);
int val = input.data[0];
switch(val) {
case 1: // shared lock
readerWriterMutex.lock_shared();
break;
case 2: // shared unlock
readerWriterMutex.unlock_shared();
break;
case 3: // unique lock
readerWriterMutex.lock();
break;
case 4: // unique unlock
readerWriterMutex.unlock();
break;
default: // Error
break;
}
result->data[0] = val;
};
Invocation is easiest with: select rwlocker(12); which will just output a 12, bypassing the reader/writer locking.
Any thoughts on why #pragma LDFLAGS doesn’t appear to work right?
We are using 11.33.3 on CentOS 7.3.1611
Thanks,
Dave
_______________________________________________
users-list mailing list
users-list@monetdb.orgmailto:users-list@monetdb.org
https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_m
ailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41iO
HYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=C9y2sKBY0FF_tBmW1_dN4Rk_UnBgz
nNuMtE6nfUQKHM&s=tUKNv3UB9bzqNF2wZjeBPMfmaHoNsocpJ3S6eZgQibg&e=
_______________________________________________ users-list mailing list users-list@monetdb.orgmailto:users-list@monetdb.org https://urldefense.proofpoint.com/v2/url?u=https-3A__www.monetdb.org_mailman_listinfo_users-2Dlist&d=DwIGaQ&c=Hlvprqonr5LuCN9TN65xNw&r=u41iOHYIJhkr4P2f2CDLh8D6LDUMXy2xJPhXyGKDzns&m=C9y2sKBY0FF_tBmW1_dN4Rk_UnBgznNuMtE6nfUQKHM&s=tUKNv3UB9bzqNF2wZjeBPMfmaHoNsocpJ3S6eZgQibg&e=
participants (2)
-
Gotwisner, Dave
-
Ying Zhang