Question on communications path between mserver5 and the MonetDB API (libmapi)
We are transferring a lot of data as the result of some of our queries (20-30 columns of integers and floats) for about 50,000 records, aggregating these from multiple MonetDB instances (on different machines).
For this pattern of query, the largest amount of time spent on the query (from the network onwards, ignoring mserver5 lookup time) is on the data conversion.
When running "perf top" on our application, we notice we are spending a significant amount of time (the second most samples per perf top item, at 14%) converting the data returned by libmapi from "const char *" to int64_t, int32_t, double, etc. We know the type of the data, and we have very few string columns in the database.
Is there an option in MonetDB that allows it to pass binary data rather than textual data back and forth on the wire? Is there an API that will give the data back in the native type rather than in the on-the-wire string type? Both client and server are always running on 64-bit x86 boxes running Linux (meaning little endian is guaranteed). We use serialized protobufs whenever we send data on the wire when communicating between our various programs, and the data stream is significantly smaller than sending integers back and forth in ASCII. If not, would switching to ODBC / JDBC give us better performance in this area?
Thanks,
Dave
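For readers who want to see the conversion step being discussed, here is a minimal sketch in C of turning a textual field into a native value. It assumes each field arrives as a NUL-terminated string (as with the `char *` that libmapi's `mapi_fetch_field` returns) and that a NULL pointer stands for an SQL NULL. The helper names `parse_int64` and `parse_double` are hypothetical and not part of libmapi.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse one textual field into an int64_t.
 * Returns false for NULL input, empty text, trailing garbage, or
 * an out-of-range value; *out is left untouched in those cases. */
static bool parse_int64(const char *field, int64_t *out)
{
    if (field == NULL || *field == '\0')
        return false;
    char *end = NULL;
    errno = 0;
    long long v = strtoll(field, &end, 10);
    if (errno == ERANGE || end == field || *end != '\0')
        return false;
    *out = (int64_t)v;
    return true;
}

/* Hypothetical helper: same contract as parse_int64, for doubles. */
static bool parse_double(const char *field, double *out)
{
    if (field == NULL || *field == '\0')
        return false;
    char *end = NULL;
    errno = 0;
    double v = strtod(field, &end);
    if (errno == ERANGE || end == field || *end != '\0')
        return false;
    *out = v;
    return true;
}
```

With 20-30 numeric columns and about 50,000 rows, a loop of this kind runs well over a million times per result set, which is consistent with the conversion showing up prominently in a profile.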
Some work has been done to analyse and improve the speed of the client/server communication. See the article "Don't hold my data hostage - A case for client protocol redesign", available from https://ir.cwi.nl/pub/26415
So your problem and the redesigned solution are known.
The server side of the new protocol (internally called PROTOCOL_10) is implemented in MonetDB, including optional compression using snappy or LZ4. Its status is "experimental", as it is currently not actively tested or used. See: https://dev.monetdb.org/hg/MonetDB/file/Apr2019/common/stream/stream.h#l213
However, the new PROTOCOL_10 is *not* yet implemented in the client-side MAPI interface. :-((
Also, it is *not* yet used by ODBC (which relies on mapilib), JDBC, or the other (Python, Ruby, PHP) MonetDB client APIs.
We need some volunteer(s) to work on this. Anybody interested?
On 02-07-19 01:53, Gotwisner, Dave wrote:
We are transferring a lot of data as the result of some of our queries (20-30 columns of integers and floats) for about 50,000 records, aggregating these from multiple MonetDB instances (on different machines).
For this pattern of query, the largest amount of time spent on the query (from the network onwards, ignoring mserver5 lookup time) is on the data conversion.
When running “perf top” on our application, we notice we are spending a significant amount of time (the second most samples per perf top item at 14%) converting the data returned by libmapi from “const char *” to int64_t, int32_t, double, etc. We know the type of the data, and we have very few string columns in the database.
Is there an option in MonetDB that allows it to pass binary data rather than textual data back and forth on the wire? Is there an API that will give the data back in the native type rather than in the on-the-wire string type? Both client and server are always running on 64-bit x86 boxes running Linux (meaning little endian is guaranteed). We use serialized protobufs whenever we send data on the wire when communicating between our various programs, and the data stream is significantly smaller than sending integers back and forth in ASCII. If not, would switching to ODBC / JDBC give us better performance in this area?
Thanks,
Dave
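To illustrate why a binary column transfer would avoid the conversion cost raised in this thread, here is a rough sketch in C. The layout is purely hypothetical (a buffer of consecutive little-endian int32 values) and is *not* the actual PROTOCOL_10 wire format. The function name `decode_int32_column` is likewise made up for illustration. It relies on the stated guarantee that both ends are little-endian x86-64 hosts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: `len` bytes holding consecutive little-endian
 * int32 values. On a little-endian host the bytes already match the
 * in-memory representation, so one bulk copy replaces per-value text
 * parsing. Returns the number of values written to `out`. */
static size_t decode_int32_column(const unsigned char *buf, size_t len,
                                  int32_t *out, size_t max_out)
{
    size_t count = len / sizeof(int32_t);   /* ignore a trailing partial value */
    if (count > max_out)
        count = max_out;                    /* never overrun the caller's array */
    memcpy(out, buf, count * sizeof(int32_t));
    return count;
}
```

A real protocol would also need to carry column types, row counts, and NULL markers, so this only shows the decoding step, not a complete client.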
_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list
participants (2)
- Gotwisner, Dave
- martin van dinther