monetdbe, fetching date, daytime, timestamp in their MonetDB native memory storage format
Hello,

Since the first version of monetdblite (monetdbe now), passing/fetching the date and time columns (in monetdbe or capi code for instance) is done using intermediary structures, instead of native types (int, __int64).

    typedef struct { unsigned char day; unsigned char month; short year; } monetdbe_data_date;
    typedef struct { unsigned int ms; unsigned char seconds; unsigned char minutes; unsigned char hours; } monetdbe_data_time;
    typedef struct { monetdbe_data_date date; monetdbe_data_time time; } monetdbe_data_timestamp;

    typedef struct { unsigned char day; unsigned char month; int year; } cudf_data_date;
    typedef struct { unsigned int ms; unsigned char seconds; unsigned char minutes; unsigned char hours; } cudf_data_time;
    typedef struct { cudf_data_date date; cudf_data_time time; } cudf_data_timestamp;

I am wondering if this conversion is truly necessary, considering that this effort is usually doubled by either converting back to native format (for append operations, for instance) or extra conversion in other flavours of date/time structures some other C code might need. The very few macros required for time information extraction from the native types (is_nil, extract) can be done on the client side when needed.

Related to this, a low level function "monetdbe_result_scan" alternative to "monetdbe_result_fetch" would avoid extra memory allocation in case no storage is required or if a different storage for the result (user list, compressed, straight on disk, ...) is considered by a developer. For integers, doubles, float, date, daytime, timestamp types passing the data address Tloc(b, 0) in the same way GENERATE_BAT_INPUT is doing it, looks good enough, assuming the developer knows to use that memory address for reading purposes. For the rest of the types (str, blob, others), invoking a pointer to a provided user function (void *data, size_t data_size, size_t j) inside the BATloop would suffice.
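The scan idea described above could look roughly like the following sketch. It is not monetdbe code: scan_cb, scan_fixed, scan_var and the helper names are hypothetical, and plain C arrays stand in for BAT storage. Only the callback parameter list (void *data, size_t data_size, size_t j) is taken from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type; the parameter list follows the message. */
typedef void (*scan_cb)(void *data, size_t data_size, size_t j);

/* Fixed-width column: return a read-only view of the existing buffer.
 * Nothing is allocated or copied; the caller must not write through it. */
static const void *scan_fixed(const void *base, size_t count, size_t *out_count)
{
    assert(out_count != NULL);
    *out_count = count;
    return base;
}

/* Variable-width column (strings here): call the user function once per
 * row, so the caller decides whether and where to store each value. */
static void scan_var(const char *const *values, size_t count, scan_cb cb)
{
    for (size_t j = 0; j < count; j++)
        cb((void *)values[j], strlen(values[j]), j);
}

/* Example user callback: accumulate the total byte size of the column. */
static size_t total_bytes = 0;

static void count_bytes(void *data, size_t data_size, size_t j)
{
    (void)data;
    (void)j;
    total_bytes += data_size;
}

/* Usage: sum an int column through the zero-copy view. */
static long demo_sum_ints(void)
{
    static const int32_t ints[] = {10, 20, 30};
    size_t n = 0;
    const int32_t *view = scan_fixed(ints, 3, &n);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += view[i];
    return sum;
}

/* Usage: walk a string column without materialising a result copy. */
static size_t demo_count_string_bytes(void)
{
    static const char *const strs[] = {"ab", "cde", ""};
    total_bytes = 0;
    scan_var(strs, 3, count_bytes);
    return total_bytes;
}
```

In this sketch the fixed-width path costs one pointer hand-off, while the callback path leaves storage decisions (user list, compression, writing to disk) entirely to the caller.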
This "monetdbe_result_scan" part can be adjusted by a custom development, sure, but passing date-time types columns in native format (which can come from a remote connection as well) should be uniform. Having a version ID for the monetdbe_result structure would also help in case a remote connection can be instructed to provide results in different ways.

Dan
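As an illustration of the client-side extraction mentioned in the message above, here is a minimal sketch of decoding a date held in a packed 32-bit integer. The bit layout (day in the low 5 bits, a month counter above it, a year offset of 4712) and the nil sentinel are assumptions made for this example and should be checked against the gdk_time sources of the MonetDB version in use. All names except monetdbe_data_date are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout, for illustration only: verify against gdk_time
 * in your MonetDB version before relying on it. */
#define DAY_BITS        5           /* low bits hold the day, 1..31 */
#define YEAR_OFFSET     4712        /* keeps the month counter non-negative */
#define NATIVE_DATE_NIL INT32_MIN   /* assumed nil sentinel */

typedef int32_t native_date;        /* hypothetical alias for the packed value */

/* Intermediary struct as quoted in the message. */
typedef struct { unsigned char day; unsigned char month; short year; } monetdbe_data_date;

static int date_is_nil(native_date v) { return v == NATIVE_DATE_NIL; }

/* Pack year/month/day into the assumed layout. */
static native_date pack_date(int y, int m, int d)
{
    assert(y >= -YEAR_OFFSET && m >= 1 && m <= 12 && d >= 1 && d <= 31);
    uint32_t months = (uint32_t)((y + YEAR_OFFSET) * 12 + (m - 1));
    return (native_date)((months << DAY_BITS) | (uint32_t)d);
}

/* Extraction helpers: the "very few macros" a client would need. */
static int date_day(native_date v)
{
    return (int)((uint32_t)v & ((1u << DAY_BITS) - 1));
}

static int date_month(native_date v)
{
    return (int)(((uint32_t)v >> DAY_BITS) % 12) + 1;
}

static int date_year(native_date v)
{
    return (int)(((uint32_t)v >> DAY_BITS) / 12) - YEAR_OFFSET;
}

/* The conversion the library performs today, done only when a client
 * actually wants the struct form. */
static monetdbe_data_date to_struct(native_date v)
{
    monetdbe_data_date out;
    out.day = (unsigned char)date_day(v);
    out.month = (unsigned char)date_month(v);
    out.year = (short)date_year(v);
    return out;
}
```

With helpers like these on the client side, a value read from a column can be appended back unchanged, and the struct conversion happens only where a caller asks for it.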
Hi Daniel,

We're definitely interested in working further on the API (extensions, improvements, etc.). Can you please open an "enhancement" ticket on https://github.com/monetdb/monetdb/issues?

Thanks!

Jennie
On 26 Feb 2021, at 12:57, Daniel Zvinca wrote:
> [...]
Participants (2): Daniel Zvinca, Ying Zhang