Adding columns to tables via binary bulk loading
All, Is it possible to add a column to an existing table via binary bulk loading? We have a table with ~1 billion rows and we would like to add columns to the table, but would prefer bulk loads for speed. Also, does anyone have a snippet of code to create the binary file format that bulk loading expects. I've tried a couple versions (simple int and double), but the server wouldn't recognize the files. Thanks, Dave
Dave, MonetDB does not (yet?) support growing tables "horizontally" / in width, i.e., adding columns, using binary bulk loads. In fact, not even growing tables "horizontally" / in width (i.e., adding columns) using bulk load from CSV files is possible. One reason for this is that the system could (formally/semantically) not judge if, let alone ensure that, the newly loaded values/tuples match the correct existing tuples in the DB (relational tables are (multi-)sets, hence, order-oblivious). (Yes, in practice in MonetDB with both CSV and binary bluk loads tuples end up in the DB in the same order as provided ...). A standard "ALTER TABLE ADD COLUMN" statement fills the values for existing tuples with the given default value (or NULL if not other specified). A subsequent "COPY INTO" extends the table "vertically" / in length by adding tuples. Of course, you can consider loading the new/extra columns into an extra table, and then joining both tables, or adding columns with default values via "ALTER TABLE ADD COLUMNS" and then using "UPDATE" statement(s) to fill in the bulk loaded new/extra values. In either case, matching tuples correctly (i.e., the join predicate) is your own responsibility. For adding columns with data, one could envision, say, a "ALTER TABLE ADD COLUMN ... WITH DATA ..." or alike syntax, but MonetDB does not provide that, yet (to the best of my knowledge). Please feel free to file a respective feature request via https://bugs.monetdb.org/ . Regarding the file format for ("standard") binary bulk load, please refer to https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/BinaryBulkLoad For fixed-width types (mainly all numerical types), the file should simply be a sequence of values of the respective C type. If the server does not "recognize the files", it most probably gives an error message explaining why not, or what it would expect, instead. Best, Stefan ----- On Nov 12, 2018, at 6:58 PM, Anderson, David B david.b.anderson@citi.com wrote:
All,
Is it possible to add a column to an existing table via binary bulk loading? We have a table with ~1 billion rows and we would like to add columns to the table, but would prefer bulk loads for speed. Also, does anyone have a snippet of code to create the binary file format that bulk loading expects. I’ve tried a couple versions (simple int and double), but the server wouldn’t recognize the files.
Thanks,
Dave
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
participants (2)
-
Anderson, David B
-
Stefan Manegold