I have been using bulk loads and that has been great (although performance
seems to decrease as more data accumulates in the database). Update speed is
not the problem; it's simple queries on the data that are my main problem. I
first thought maybe MonetDB needed some time to build an index, but after
waiting a while and running more queries, then waiting again and running more,
it never seems to improve.
Example: a table with a structure like the one sketched below.
I use a single table and bulk-load updates to it in scheduled batches.
I never use INSERTs in column stores like this.
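A minimal sketch of what I mean, assuming a typical one-row-per-symbol-per-minute bar layout (the column names, table name, and file path are placeholders, not taken from your post):

  -- Assumed bar-data layout; adjust names and types to your data
  CREATE TABLE bars (
      symbol      VARCHAR(16),
      ts          TIMESTAMP,
      open_price  DOUBLE,
      high_price  DOUBLE,
      low_price   DOUBLE,
      close_price DOUBLE,
      volume      BIGINT
  );

  -- Scheduled batch load from a CSV file instead of row-by-row INSERTs
  COPY INTO bars
  FROM '/data/bars_batch.csv'
  USING DELIMITERS ',', '\n';

The point is that data for all symbols lands in one table and arrives via COPY INTO, never via per-row INSERT statements.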
Sent from my iPhone
On Apr 1, 2014, at 7:25 AM, Dino Veritas wrote:
Hi,
My firm is investigating the use of MonetDB as a potential market data store. I am interested in the opinion of those on this list as to how best to structure the data for optimum performance.
Basically, our data consists of ~50,000 instruments. One use we are looking at is the storage of bar data, one bar per minute. The resulting data has one observation per minute per symbol. While not all symbols will have an entry for each minute, many observations will share the same Timestamp. Is this an appropriate use for MonetDB? One schema I tried was to put all symbols into the same table and to separate the tables by month, so for 5 years that would be 5*12 = 60 tables. A query then looks along the lines of SELECT * FROM "200801" WHERE symbol='IBM' UNION ALL SELECT * FROM "200802" WHERE symbol='IBM' UNION ALL ... The reads from a single table were quite good; however, when reading across multiple tables I encountered a lot of swap space usage and disk IO, which drastically slowed down the query.
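Spelled out a little more, the cross-month query I mean looks roughly like this (only the first few monthly tables are shown; the remaining branches continue the same pattern):

  -- One branch per monthly table, one table per month of data
  SELECT * FROM "200801" WHERE symbol = 'IBM'
  UNION ALL
  SELECT * FROM "200802" WHERE symbol = 'IBM'
  UNION ALL
  SELECT * FROM "200803" WHERE symbol = 'IBM';
  -- ...and so on, one branch for each of the 60 monthly tables

Each branch on its own reads fine; it is the combination across many monthly tables that pushes the machine into swapping.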
I also tried one table per symbol but found this to be simply untenable, as INSERTs became extremely slow (on the order of minutes) and, once again, caused massive swap space usage.
How would you partition this data? We will be inserting once per day in bulk, and I am not overly concerned with insert performance. However, query performance is paramount, with some users querying data very frequently over long time periods for single symbols or groups of symbols.
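As a concrete example of the kind of query that matters most to us (the table name, column names, symbols, and dates here are only illustrative placeholders for whatever schema we end up with):

  -- Long time range for a small group of symbols
  SELECT symbol, ts, close_price, volume
  FROM bars
  WHERE symbol IN ('IBM', 'MSFT', 'AAPL')
    AND ts BETWEEN TIMESTAMP '2008-01-01 00:00:00'
               AND TIMESTAMP '2012-12-31 23:59:00'
  ORDER BY symbol, ts;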
The machine I am testing on has 16GB of RAM and is running Windows. However, the machine MonetDB would ultimately be deployed on would have much more RAM. Even so, the results of a single query are less than 16GB and should fit into memory without issue.
So please help me, MonetDB gurus, with any suggestions as to how you would partition and store this data. I would appreciate any input. Thank you!
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list