Hello everyone on the mailing list, Please find some time to go through my queries. I created a table with some fields out of which 1 is timestamp. I wanted to have indexing on this table. While going through the available documentation for monetdb, I found out that it creates its own indexing and might ignore user provided indexing. I went ahead and started pushing in data into the table with single inserts (10000 of them accumulated in a transaction). My table is pretty big now and it contains about 250 million records (over 2 months of data). Even simple queries to find records between specified timestamps takes a lot of time(12-13 mins in some cases) We really feel the need of indexing on the table(don't know if that can help). I happened to stumble upon an article on monetdb mailers: http://mail.monetdb.org/pipermail/users-list/2008-March/001726.html http://mail.monetdb.org/pipermail/users-list/2008-March/001726.html According to this article if I insert data into the table using - INSERT INTO table_name (...) ORDER BY X,Y. The order by here will enforce indexing on X, Y. I would like to confirm whether this really enforce indexing? I have a few other questions pertaining to my case: 1) Is there a predefined way of speeding up such queries in monetdb? 2) If the insert with an order by clause is a work around then what if going further, I start inserting with the insert query with the order by clause. a)Will that enforce indexing on the whole table or on only those records which were inserted using such inserts? b) Or should I just move the data from my table with 250 million records into another table using the insert with order by clause? 3)One more question regarding the setup of monetdb - I have initiated monetdb using the M5server.bat on my windows 2008 R2 64 bit server. This makes it run as a service and starts off at the server start-up. This is all nice and acceptable. The problem comes when I need to restart the monetdb server. The only way I was able to do this was to kill the process (mServer5.exe) since I was not able to figure out the service (in the windows services) which starts monetdb. So my simple question is how to restart monetdb server if it gets installed by M5server.bat? Can somebody please help me with the queries? Regards, Anshuman
Hi Anshuman,
Hello everyone on the mailing list,
Please find some time to go through my queries.
I created a table with some fields out of which 1 is timestamp. I wanted to have indexing on this table. While going through the available documentation for monetdb, I found out that it creates its own indexing and might ignore user provided indexing.
Indeed. In fact, all explicit CREATE INDEX statements are NO-OPs in MonetDB.
I went ahead and started pushing in data into the table with single inserts (10000 of them accumulated in a transaction).
If you care about loading performance, you might want to consider bulkloading via COPY INFO, if that suits your scenario; cf., http://www.monetdb.org/Documentation/Manuals/SQLreference/CopyInto, http://www.monetdb.org/Documentation/Cookbooks/SQLrecipies/LoadingBulkData
My table is pretty big now and it contains about 250 million records (over 2 months of data). Even simple queries to find records between specified timestamps takes a lot of time(12-13 mins in some cases)
If you timestamp are not ordered (i.e., inserted in sort order), MonetDB's Feb2013 release (and older) will generally resort in a scan to evaluate range predicates. (For point predicates, MonetDB will transparently create a hash index with the first query and then use this index as long as the server runs, or until the data is updated.) In the (not yet released) development code base (as available from our Mercurial repository), there is also transparent support for a secondary index to speed up range queries; cf., "Column Imprints: A Secondary Index Structure" at http://homepages.cwi.nl/~lsidir/publications.html .
We really feel the need of indexing on the table(don't know if that can help). I happened to stumble upon an article on monetdb mailers:
http://mail.monetdb.org/pipermail/users-list/2008-March/001726.html http://mail.monetdb.org/pipermail/users-list/2008-March/001726.html
According to this article if I insert data into the table using - INSERT INTO table_name (...) ORDER BY X,Y.
The order by here will enforce indexing on X, Y. I would like to confirm whether this really enforce indexing?
It does not enforce indexing but creates a physical sort order on X that is then exploited by using binary search (rather than scan) for range predicates (on X).
I have a few other questions pertaining to my case:
1) Is there a predefined way of speeding up such queries in monetdb?
What exactly do you refer to with "such queries"? Single (or multi?) column range queries? In the Feb2013 release, creating a sorted copy as you did is one way. Trying the (transparent) "column imprints" of the development code base (see above) is an other alternative.
2) If the insert with an order by clause is a work around then what if going further, I start inserting with the insert query with the order by clause.
a)Will that enforce indexing on the whole table or on only those records which were inserted using such inserts? b) Or should I just move the data from my table with 250 million records into another table using the insert with order by clause?
Only if the entire table is sorted, MonetDB can exploit binary search. There is a possible work around using merged tables to use binary search also on chunk-wise sorted tables, but there is no documentation for this, yet. I might be able to share my ideas (hopefully) within the next two weeks ...
3)One more question regarding the setup of monetdb - I have initiated monetdb using the M5server.bat on my windows 2008 R2 64 bit server. This makes it run as a service and starts off at the server start-up. This is all nice and acceptable. The problem comes when I need to restart the monetdb server. The only way I was able to do this was to kill the process (mServer5.exe) since I was not able to figure out the service (in the windows services) which starts monetdb.
If the server indeed runs as a service in the background, killing it is the only option to stop it (AFAIK). However, I doubt that M5server.bat indeed starts mserver5.exe as a service. Rather, it IMHO starts the server and provides you with a server console window. Qutting that console window stops the server -- as far as I know, but I'm a Windows illiterate ... Stefan
So my simple question is how to restart monetdb server if it gets installed by M5server.bat?
Can somebody please help me with the queries?
Regards,
Anshuman
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
participants (2)
-
Anshuman Kumar
-
Stefan Manegold