Can I expect to be able to use BBP code without GDKinit'ing?
Bottom-line question: BBPinit() is not exposed in gdk_bbp.h; it is only called from GDKinit() (which sees it through gdk_private.h). I just want to load data from persisted BAT files - not into MonetDB and not within the mserver5 process. So, conceptually speaking, I do not want to initialize the GDK, but I do want to initialize the BBP.

Questions:

* Should BBPinit() work outside the scope of a GDKinit()?
* If not, is it adaptable so as to allow this?
* More generally, how much of GDK do I need to have running just to load persisted data into memory, using the code in gdk_bbp (or a slight variation thereof)?

Now for the introduction and the motivation:

As I mentioned a few MADADMs ago, I'll need to load persisted MonetDB columns into my GPU kernel testbench. This is relatively simple for numeric columns, given some scripting work to extract catalog data in a parsable format or to build named symlinks to columns (e.g. /path/to/dbfarm/tpch-sf-1/named_bats/lineitem/l_shipdate -> /path/to/dbfarm/tpch-sf-1/bat/12/34.tail). But it won't do for other kinds of data, most importantly strings - which I do want to work on. Plus, writing a wider-scope loader will let me avoid depending on a running MonetDB for access to persisted columns.

So, I'm writing a (selective) loader of persisted MonetDB columns, or a BBP loader if you will. My strategy is the following:

1. Copy the files in the GDK codebase which are necessary for building a binary that can call all code in gdk_bbp.h without unmet dependencies (this is mostly done).
2. Get that leaner slice of code, with a small main(), to actually work, i.e. not fail due to weird errors or result in junk data.
3. Peel away functions from the code I've copied from the repository which are not actually used.
4. Peel away the parts of the code which are not necessary for the actual loading - from within functions. In this initial stage this may involve work that becomes unnecessary when you have auxiliary data obtained from querying the DB (such as a table-column name to BAT filename mapping).
5. Expand functionality and/or optimize performance after the peeling, and/or C++ify for integration with my code.
6. If other people / MDBS are interested, collaborate on making this support different BBP versions - so that the result is an inter-version export/forensics library.
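To make the "relatively simple for numeric columns" case concrete, here is a minimal sketch of what I mean. It is not GDK code: load_int_tail() is a hypothetical helper, and it ASSUMES that the .tail file of a fixed-width 32-bit column is a headerless, packed, native-endian array, and that the number of values is already known from elsewhere (e.g. from the catalog data extracted by the scripts mentioned above). Both assumptions would need checking against the BBP version in use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of GDK.
 * Reads `count` 32-bit values from the file at `path`, ASSUMING the file is
 * a headerless, packed, native-endian array of int32_t.
 * Returns a malloc'ed buffer that the caller must free, or NULL on error. */
int32_t *load_int_tail(const char *path, size_t count)
{
    assert(path != NULL);

    /* Reject empty requests and sizes that would overflow the allocation. */
    if (count == 0 || count > SIZE_MAX / sizeof(int32_t))
        return NULL;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    int32_t *values = malloc(count * sizeof *values);
    if (values != NULL &&
        fread(values, sizeof *values, count, f) != count) {
        /* Short read: the file holds fewer values than expected. */
        free(values);
        values = NULL;
    }

    fclose(f);
    return values;
}
```

With a named symlink as above, a call would look like load_int_tail("/path/to/dbfarm/tpch-sf-1/named_bats/lineitem/l_shipdate", n), where n is the (externally obtained) row count. This approach has nothing to say about variable-width data such as strings, which is exactly why I am asking about BBPinit().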
I suggest you look at the MonetDBLite init code; we worked for a long time to make it minimal.
https://github.com/hannesmuehleisen/MonetDBLite/blob/master/src/embedded/emb...
Hannes
participants (2): Eyal Rozenberg, Hannes Mühleisen