Re: [MonetDB-users] Information request

On Fri, Dec 16, 2011 at 10:51:18PM +0800, dj zhang wrote:
Hello Jennie,
Glad to hear from you! Thanks for your reply! I'm sorry, I didn't realize that you are all on New Year vacation.
Hello Dongjie, I'd recommend that you subscribe to the MonetDB users mailing list: http://www.monetdb.org/Developers/Mailinglists , and send your future questions to the mailing list. This way, your questions will reach the whole developer team at once, and you'll most probably get answers faster. I have taken the liberty of forwarding this e-mail to the mailing list.
Maybe I didn't express myself very clearly. I've generated a table file in which each line represents a record, and the fields of a record are separated by "|", e.g. "abc|1.2.3.3|www.monetdb.org|3". I can bulk load this file into MonetDB with the SQL COPY INTO command.
Unfortunately, you can't load this table using the binary bulk load feature of MonetDB, since this feature only handles numerical data and does not work with strings. How large (in GB) is your total dataset? Maybe you don't have to use the binary bulk load feature...
Now I want to change the table file into column files, where each file contains just one column's values, also stored line by line. I want to know whether I can load this table by loading these column files.
From your information, can I say that I have to convert these column files to the binary format that MonetDB recognises (BAT storage), and then use the binary bulk load command to load the table into MonetDB? And if I convert the files to any other binary format, it will not work. Is that right?
That's right. MonetDB just works with one binary format. Regards, Jennie
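As a rough illustration of the conversion discussed above, the following Python sketch splits a "|"-separated table file into one text file per column and then writes an integer column as a binary file. It rests on an assumption that should be checked against the binary bulk load documentation: that a numeric column file is a raw array of fixed-width values (here 32-bit integers) in the machine's native byte order. The function names and file names are hypothetical and only serve the example.

```python
import struct
from contextlib import ExitStack
from pathlib import Path


def split_columns(table_path, out_dir, n_cols, delimiter="|"):
    """Split a delimited table file into one text file per column.

    Hypothetical helper: writes col0.txt, col1.txt, ... into out_dir,
    one value per line, in the same record order as the input.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"col{i}.txt" for i in range(n_cols)]
    with ExitStack() as stack:
        src = stack.enter_context(open(table_path, encoding="utf-8"))
        outs = [stack.enter_context(open(p, "w", encoding="utf-8")) for p in paths]
        for line_no, line in enumerate(src, 1):
            fields = line.rstrip("\n").split(delimiter)
            if len(fields) != n_cols:
                raise ValueError(f"line {line_no}: expected {n_cols} fields, got {len(fields)}")
            for out, field in zip(outs, fields):
                out.write(field + "\n")
    return paths


def write_int_column_binary(text_path, bin_path):
    """Write a one-value-per-line integer column as raw 32-bit integers.

    ASSUMPTION: the target format is a plain array of 4-byte signed
    integers in native byte order ("=i" in struct notation).
    Returns the number of values written.
    """
    count = 0
    with open(text_path, encoding="utf-8") as src, open(bin_path, "wb") as dst:
        for line in src:
            dst.write(struct.pack("=i", int(line)))
            count += 1
    return count
```

With the example record above, only the last field ("3") is numeric; the three string columns would still have to be loaded some other way, for instance with COPY INTO.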
Thank you so much! Once again happy new year!
Best wishes,
Dongjie -------------------------------------
On Fri, Dec 16, 2011 at 8:55 PM, Ying Zhang wrote:
Hello Zhang Dongjie,
First of all, thank you very much for your interest in MonetDB.
I have learned from the website documentation that I can load data using column files when migrating between MonetDB instances, but the files have to be in the binary version of the BAT storage format, if I didn't misunderstand.
If each column of a table is already stored in binary format in a file, the table can be loaded using the "binary bulk load" feature of MonetDB, as described here: http://www.monetdb.org/Documentation/Cookbooks/SQLrecipies/BinaryBulkLoad
If you are migrating your existing database to an identical machine, you can just copy the dbfarm sub-directory for your database. If you are migrating to a machine with different hardware, you need to dump the database and reload it on the destination machine, as described here: http://www.monetdb.org/Documentation/UserGuide/DumpRestore However, in this case, the dumped data cannot be loaded using the binary bulk load feature, because the file containing the dumped data is an ASCII file.
So my question is: is there any way to load data into MonetDB by column, as long as the records are stored by column? Or do I have to first convert the column files to the binary version of the BAT format and then use the binary bulk load commands?
You can use the binary bulk load feature only if the data of each column is stored in a separate file in binary format. Hence, yes, you'll have to generate those binary files if you don't have them yet. However, if your data is already in, e.g., CSV format, you don't have to create binary files for the columns, since you can just load the CSV file. If your (CSV) data is not too large, the standard SQL COPY INTO feature should just work.
And if I have to change the format, is there any way to use Google protobuf to serialize the column files to the BAT format, or is there any solution like HandlerSocket for MySQL?
Unfortunately, I'm not familiar with protobuf and HandlerSocket. Maybe someone else in our group can answer these questions.
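For the plain-text route, a small Python sketch of how the COPY INTO statement for a "|"-separated file might be put together is given below. The table name, file path and helper name are hypothetical, and the exact clause syntax (USING DELIMITERS with a field and a record separator) should be verified against the SQL reference of the MonetDB release in use.

```python
def copy_into_statement(table, path, field_sep="|", record_sep="\\n"):
    """Build a COPY INTO statement for a delimited text file.

    Hypothetical helper: `table` is used as given (no identifier
    quoting), and `path` is the file location as seen by the server.
    The default record separator is the two-character SQL escape \\n.
    """
    def quote(value):
        # SQL string literal: wrap in single quotes, double embedded quotes.
        return "'" + value.replace("'", "''") + "'"

    return (
        f"COPY INTO {table} FROM {quote(path)} "
        f"USING DELIMITERS {quote(field_sep)},{quote(record_sep)};"
    )


if __name__ == "__main__":
    # The resulting text would be run through an SQL client.
    print(copy_into_statement("mytable", "/data/table.txt"))
```

The statement only makes sense if the target table already exists with one column per field of the file.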
Hope this helps a bit.
With kind regards,
Jennie Zhang
By the way, I think MonetDB is a great project, and I hope it finds its way to commercialization very soon!
Thank you very much, and Merry Christmas!
Best regards,
Dongjie ------------------ Dongjie Zhang
Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
华中科技大学计算机科学与技术学院 服务计算技术与系统教育部重点实验室 集群与网格计算湖北省重点实验室 湖北 武汉 430074
_______________________________________________ Info mailing list Info@monetdb.org http://mail.monetdb.org/mailman/listinfo/info