Using the Monet API to connect to MonetDB if MonetDB comes up AFTER our application
I am currently trying to update my code that uses the Mapi API to retry connections if we start an application when MonetDB is down. Our programs do not run interactively, so detecting the failure, terminating the program, and restarting the application is problematic.

There are a couple of issues here that I can use help with.

What we are doing is calling mapi_connect(). mapi_connect() returns what appears to be a pointer (it's non-zero). So we call mapi_error() with this return value. If we are not connected, we get an MERROR, not an MTIMEOUT, which is what I would expect. With an MERROR failure, we call mapi_error_str() to see what kind of error. We then must do a string compare (actually, a strstr() call looking for a "Connection refused" pattern inside the string). Is there a way to get an enum or #define value instead, with a different function call? Sort of like looking at errno rather than calling strerror(errno) and doing a string compare on the result. This would be way more efficient if we could do this. Yes, errors should be rare, but determining the type of error (or protecting against a string text change on an RPM update) shouldn't be expensive.

We then call mapi_explain() to stderr. Is there a routine that returns the value of the output of explain() to a std::string or a "const char *"? We have a logger that does stuff with the strings before logging (timestamping, __LINE__, __FILE__, putting error codes, etc.). Having the multiline output of the explain() information would be very useful for us.

The above process is executed multiple times until we connect. Since the failed connects return what appears to be a pointer, do we need to do anything to free the pointers up? It doesn't appear that we have a memory leak (I ran valgrind). Note that if the connect fails, calling mapi_destroy() crashes. From the documentation, "mapi_destroy()" does "Free handle resources", so it seems reasonable to call it after a failed (and non-null) mapi_connect(). It should probably do nothing; it should not crash.

Also, it appears that if I connect multiple times without disconnecting (I had a bug in my code), I can't connect more than 65 times (simultaneous connections). Is this an API issue or a server issue? The reason I ask is that if it is a client issue, we will have a single client that potentially does queries against hundreds of servers and aggregates the results. If it is a server issue, is there a parameter that we can tweak to increase this? We do multiple queries in parallel, each on their own connections.

Thanks,
Dave
Hello Dave, On 6/21/19 1:16 AM, Gotwisner, Dave wrote:
> I am currently trying to update my code that uses the Mapi API to retry connections if we start an application when MonetDB is down. Our programs do not run interactively, so detecting the failure, terminating the program, and restarting the application is problematic.
>
> There are a couple of issues here that I can use help with.
>
> What we are doing is calling mapi_connect(). Mapi_connect returns what appears to be a pointer (it's non-zero). So we call mapi_error with this return value. If we are not connected, we get an MERROR, not an MTIMEOUT, which is what I would expect. With an MERROR failure, we call mapi_error_str() to see what kind of error. We then must do a string compare (actually, a strstr() call looking for a "Connection refused" pattern inside the string). Is there a way to get an enum or #define value instead, with a different function call? Sort of like looking at errno rather than calling strerror(errno) and doing a string compare on the result. This would be way more efficient if we could do this. Yes, errors should be rare, but determining the type of error (or protecting against a string text change on an RPM update) shouldn't be expensive.
Unfortunately, right now there is no way to differentiate between errors in the Mapi API other than string handling. This is mostly due to historical reasons and should be improved, but the change is not so easy because it will potentially affect every MonetDB client library/program. As far as I can tell, the logic behind this is that in the various clients we care less about what kind of error we get than about the fact that we got an error at all. Here are the main uses of mapi_connect in the MonetDB codebase:

https://github.com/MonetDB/MonetDB/blob/ba1ab45148c6ab9041198dc29ccaef48a22c...
https://github.com/MonetDB/MonetDB/blob/93d0d6c0f295ccd1432a6cf03aeedaba7aaa...
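Since string matching is the only option today, one way to contain it is to do the match in a single place and hand the rest of the program an enum. The following is a minimal sketch, not part of the Mapi API: the enum, the function names, and the self-check are hypothetical, and the only pattern it knows is the "Connection refused" substring mentioned in this thread, which could change between releases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error categories; these are NOT defined by the Mapi API. */
typedef enum {
    CONN_ERR_NONE = 0,  /* no error text available */
    CONN_ERR_REFUSED,   /* server not listening: a candidate for retrying */
    CONN_ERR_OTHER      /* anything else: treat as fatal */
} conn_err_kind;

/* Classify the text obtained from mapi_error_str().
 * Assumption: a refused connection is reported with the substring
 * "Connection refused", as observed in this thread. */
conn_err_kind classify_connect_error(const char *msg)
{
    if (msg == NULL || *msg == '\0')
        return CONN_ERR_NONE;
    if (strstr(msg, "Connection refused") != NULL)
        return CONN_ERR_REFUSED;
    return CONN_ERR_OTHER;
}

/* Optional startup self-check, so that a changed pattern table is
 * noticed early rather than in the middle of an outage. */
void classify_connect_error_selfcheck(void)
{
    assert(classify_connect_error("connect failed: Connection refused")
           == CONN_ERR_REFUSED);
    assert(classify_connect_error("some other failure") == CONN_ERR_OTHER);
}
```

Keeping the pattern in one function means that a wording change after an RPM update needs a fix in one spot only, and the retry logic can switch on the enum instead of repeating strstr() calls.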
> We then call mapi_explain() to stderr. Is there a routine that returns the value of the output of explain() to a std::string or a "const char *"? We have a logger that does stuff with the strings before logging (timestamping, __LINE__, __FILE__, putting error codes, etc.). Having the multiline output of the explain() information would be very useful for us.
mapi_explain prints a string that is built from information contained in the Mapi structure itself. Specifically, if mid is a variable of type MapiStruct * (what mapi_connect returns), then

    mid->hostname
    mid->username
    mid->port
    mid->action

are the relevant fields that you can use to build a meaningful error message. Take a look at the definition of mapi_explain to see how those fields are used:

https://github.com/MonetDB/MonetDB/blob/3a0b8a65925167fc89d943a66984b14afcdc...

and also at the definition of MapiStruct itself to see what fields are there:

https://github.com/MonetDB/MonetDB/blob/3a0b8a65925167fc89d943a66984b14afcdc...
> The above process is executed multiple times until we connect. Since the failed connects return what appears to be a pointer, do we need to do anything to free the pointers up? It doesn't appear that we have a memory leak (I ran valgrind). Note that if the connect fails, calling mapi_destroy() crashes. From the documentation, "mapi_destroy()" does "Free handle resources", so it seems reasonable to call it after a failed (and non-null) mapi_connect(). It should probably do nothing; it should not crash.
Based on the above discussion, you should use mapi_destroy. If this crashes, something has gone wrong. I will perform some tests on our end to see if there are any non-obvious constraints when using mapi_connect that might result in a crash.
> Also, it appears that if I connect multiple times without disconnecting (I had a bug in my code), I can't connect more than 65 times (simultaneous connections). Is this an API issue or a server issue? The reason I ask is that if it is a client issue, we will have a single client that potentially does queries against hundreds of servers and aggregates the results. If it is a server issue, is there a parameter that we can tweak to increase this? We do multiple queries in parallel, each on their own connections.
There is indeed a parameter that you can specify, and its default value is 64. If you are starting mserver5 by hand you can specify --set max_clients <n>, e.g.

    mserver5 --dbpath=/path/to/database --set max_clients 128

whereas if you start the server through the MonetDB daemon you can run

    monetdb set nclients=<n> <dbname>

before you start the server. Generally you should be careful with this setting, since it might create issues with performance if too many clients are served concurrently.

I hope this helps.

Best regards,
Panos.
Thanks,
Dave _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Thanks. A few comments. We take the pre-built RPMs and use them. We do not build from source.
See below to preserve comment flow.
Dave
-----Original Message-----
From: users-list
I am currently trying to update my code that uses the Mapi API to retry connections if we start an application when MonetDB is down. Our programs do not run interactively, so detecting the failure, terminating the program, and restarting the application is problematic.
There are a couple of issues here that I can use help with.
What we are doing is calling mapi_connect(). Mapi_connect returns what appears to be a pointer (it's non-zero). So we call mapi_error with this return value. If we are not connected, we get an MERROR, not an MTIMEOUT, which is what I would expect. With an MERROR failure, we call mapi_error_str() to see what kind of error. We then must do a string compare (actually, a strstr() call looking for a "Connection refused" pattern inside the string). Is there a way to get an enum or #define value instead, with a different function call? Sort of like looking at errno rather than calling strerror(errno) and doing a string compare on the result. This would be way more efficient if we could do this. Yes, errors should be rare, but determining the type of error (or protecting against a string text change on an RPM update) shouldn't be expensive.
Unfortunately right now there is no way to differentiate between errors in the Mapi API, other than string handling. This is mostly due to historical reasons and should be improved, but the change is not so easy because it will potentially affect every MonetDB client library/program. As far as I can tell, the logic behind this is that in the various clients we do not care that much what kind of error we get, but in the fact that we got an error at all. Here are the main uses of mapi_connect in the MonetDB codebase: https://github.com/MonetDB/MonetDB/blob/ba1ab45148c6ab9041198dc29ccaef48a22c... https://github.com/MonetDB/MonetDB/blob/93d0d6c0f295ccd1432a6cf03aeedaba7aaa...

[dg] We link to libmapi.so using libdl. We do it this way because we have deployments that are not MonetDB based, and we don't want to require that any part of MonetDB be installed in those deployments. Because we are providing a product that our customers use (and MonetDB is a component of our product), we must manage MonetDB crashes without having to restart our programs. Neither of your examples does retries; they simply connect and log an error. We also have a deployment scenario where a single instance of our application may connect to hundreds of servers spread throughout a country (for a nationwide view of our data), so if one node goes down, we must still handle this case. Thus we need to look at the error. Non-connection errors are bad, so we fail. Connection errors are actually pretty normal in some of the countries we deploy to, so we must handle that. If we are stuck dealing with string compares for error conditions (which, I admit, are rare), we will do so.

[dg] As a suggestion, if you ever revamp your structure, stick a version number in it (protocol version), and then you can change the structure, as long as you append to the end and have some version-awareness in the API. In fact, assuming the MapiStruct is what gets passed back and forth, you have a "mapiversion" string that should be able to handle an incompatibility between versions.
We then call mapi_explain() to stderr. Is there a routine that returns the value of the output of explain() to a std::string or a "const char *"? We have a logger that does stuff with the strings before logging (timestamping, __LINE__, __FILE__, putting error codes, etc.). Having the multiline output of the explain() information would be very useful for us.
mapi_explain prints a string that is built from information contained in the Mapi structure itself. Specifically if mid is a variable of type MapiStruct * (what mapi_connect returns) mid->hostname mid->username mid->port mid->action are the relevant fields that you can use to build a meaningful error message. Take a look at the definition of mapi_explain to see how those fields are used: https://github.com/MonetDB/MonetDB/blob/3a0b8a65925167fc89d943a66984b14afcdc... and also at the definition of MapiStruct itself to see what fields are there: https://github.com/MonetDB/MonetDB/blob/3a0b8a65925167fc89d943a66984b14afcdc...

[dg] As I said, we run from your pre-built RPM. If I look in the files in /usr/include/monetdb for MapiStruct, the only thing I see is the typedef in mapi.h. I don't have access to the structure. I would also prefer not to dummy up a file with a copy of the structure, in case the structure changes release to release. So, would it be possible to get a function that takes a Mapi and returns the 4 things you output (as references)? That would solve the problem. Again, it's not worth our building from source and managing that process internally here for just this minor change.
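Because mapi_explain() writes to a stream (stderr in the usage described above), a possible workaround that needs no access to MapiStruct is to point it at a temporary stream and read the text back. The sketch below is an illustration under assumptions: capture_stream_output(), stream_writer, and demo_writer() are hypothetical names, and the suggestion that a wrapper would call mapi_explain(mid, out) assumes the function accepts a FILE * as its output argument. It uses only standard C (tmpfile()), so it runs without MonetDB installed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Callback that writes a report to a stream. For Mapi this would be a thin
 * wrapper that calls mapi_explain(mid, out) (assumed signature). */
typedef void (*stream_writer)(FILE *out, void *ctx);

/* Run `writer` against a temporary file and return everything it wrote as
 * a NUL-terminated, malloc'd string. Returns NULL on failure. The caller
 * must free() the result. */
char *capture_stream_output(stream_writer writer, void *ctx)
{
    FILE *tmp;
    long size;
    size_t got;
    char *buf;

    assert(writer != NULL);

    tmp = tmpfile();
    if (tmp == NULL)
        return NULL;

    writer(tmp, ctx);

    /* After writing, the file position equals the number of bytes written. */
    if (fflush(tmp) != 0 || (size = ftell(tmp)) < 0) {
        fclose(tmp);
        return NULL;
    }
    rewind(tmp);

    buf = malloc((size_t)size + 1);
    if (buf == NULL) {
        fclose(tmp);
        return NULL;
    }
    got = fread(buf, 1, (size_t)size, tmp);
    buf[got] = '\0';
    fclose(tmp);
    return buf;
}

/* Stand-in writer used only for illustration; it does not imitate the
 * real mapi_explain() output format. */
void demo_writer(FILE *out, void *ctx)
{
    fprintf(out, "first line for %s\nsecond line\n", (const char *)ctx);
}
```

The returned buffer holds the full multi-line text, so a logger can timestamp it or split it on newlines before writing it out. On POSIX systems open_memstream() would avoid the temporary file; tmpfile() is used here only because it is plain standard C.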
The above process is executed multiple times until we connect. Since the failed connects return what appears to be a pointer, do we need to do anything to free the pointers up? It doesn't appear that we have a memory leak (I ran valgrind). Note that if the connect fails, calling mapi_destroy() crashes. From the documentation, "mapi_destroy()" does "Free handle resources", so it seems reasonable to call it after a failed (and non-null) mapi_connect(). It should probably do nothing, it should not crash.
Based on the above discussion, you should use mapi_destroy. If this crashes, something has gone wrong. I will perform some tests on our end to see if there are any non obvious constraints when using mapi_connect, that might result in a crash.

[dg] The problem was I called mapi_destroy twice. Once in the error handling code, and once in the reconnect code. Operator error here.
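The double mapi_destroy() described here is easy to reintroduce when error handling and reconnect logic live in different places. One way to rule it out is to give a single retry function sole ownership of failed handles. The sketch below is generic and hypothetical: conn_ops, connect_with_retry(), and the fake_* helpers are illustration names, and the comments naming mapi_connect(), mapi_error(), and mapi_destroy() only indicate what real callbacks might wrap. It relies on the behaviour reported in this thread, that a failed connect still returns a non-null handle that must be destroyed.

```c
#include <assert.h>
#include <stddef.h>

/* Callbacks that hide the actual client library (hypothetical interface). */
typedef struct {
    void *(*connect)(void *ctx);              /* e.g. wraps mapi_connect() */
    int   (*failed)(void *handle);            /* e.g. mapi_error() != MOK  */
    void  (*destroy)(void *handle);           /* e.g. wraps mapi_destroy() */
    void  (*backoff)(void *ctx, int attempt); /* optional sleep; may be NULL */
} conn_ops;

/* Try to connect up to max_attempts times. Every failed handle is destroyed
 * exactly once, here and nowhere else. Returns a connected handle owned by
 * the caller, or NULL if all attempts failed. */
void *connect_with_retry(const conn_ops *ops, void *ctx, int max_attempts)
{
    int attempt;

    assert(ops != NULL && ops->connect != NULL);
    assert(ops->failed != NULL && ops->destroy != NULL);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        void *h = ops->connect(ctx);
        if (h != NULL && !ops->failed(h))
            return h;                 /* success: ownership passes to caller */
        if (h != NULL)
            ops->destroy(h);          /* the only place a failed handle dies */
        if (ops->backoff != NULL && attempt < max_attempts)
            ops->backoff(ctx, attempt);
    }
    return NULL;
}

/* Fake backend for illustration: connecting fails until attempt number
 * `succeed_on`. The server record doubles as the handle. */
typedef struct {
    int attempts;    /* connect calls so far */
    int destroyed;   /* destroy calls so far */
    int succeed_on;  /* first attempt that succeeds */
    int live_ok;     /* whether the most recent handle is connected */
} fake_server;

void *fake_connect(void *ctx)
{
    fake_server *s = ctx;
    s->attempts++;
    s->live_ok = (s->attempts >= s->succeed_on);
    return s;
}

int fake_failed(void *handle)
{
    return !((fake_server *)handle)->live_ok;
}

void fake_destroy(void *handle)
{
    ((fake_server *)handle)->destroyed++;
}
```

With this shape, the error-handling path never touches the handle itself; it only inspects the result of connect_with_retry(), so the destroy call cannot be duplicated by accident.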
Also, it appears that if I connect multiple times without disconnecting (I had a bug in my code), I can't connect more than 65 times (simultaneous connections). Is this an API issue or a server issue? The reason I ask is that if it is a client issue, we will have a single client that potentially does queries against hundreds of servers and aggregates the results. If it is a server issue, is there a parameter that we can tweak to increase this? We do multiple queries in parallel, each on their own connections.
There is indeed a parameter that you can specify, and its default value is 64. If you are starting mserver5 by hand you can specify --set max_clients <n> (e.g. mserver5 --dbpath=/path/to/database --set max_clients 128) whereas if you start the server through the MonetDB daemon you can: monetdb set nclients=<n> <dbname> before you start the server. Generally you should be careful with this setting, since it might create issues with performance if too many clients are served concurrently.

[dg] Thanks for the clarification. We don't need the setting. We will have lots of MonetDB server instances, and a few client instances. Clients connect to multiple MonetDB servers; a server should not have a lot of clients connecting, although we will have to manage our threading model with threads connecting.

I hope this helps.

Best regards,
Panos.
Thanks,
Dave
participants (2)
- Gotwisner, Dave
- Panagiotis Koutsourakis