I have been using bulk loads and that has been great (although performance
seems to decrease as more data accumulates in the database). Update speed is
not the problem; it's simple queries on the data that are my main problem. I
first thought maybe MonetDB needed some time to build an index, but after
waiting a while and running more queries, then waiting again and running more,
it never seems to improve.
Example: a table with a structure like the one sketched below.
I use a single table and bulk-load updates to it in scheduled batches.
I never use INSERTs in column stores like this.
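A minimal sketch of what I mean, assuming a typical one-row-per-symbol-per-minute bar layout (the column names, table name, and file path are placeholders, not taken from your post):

  -- Assumed bar-data layout; adjust names and types to your data
  CREATE TABLE bars (
      symbol      VARCHAR(16),
      ts          TIMESTAMP,
      open_price  DOUBLE,
      high_price  DOUBLE,
      low_price   DOUBLE,
      close_price DOUBLE,
      volume      BIGINT
  );

  -- Scheduled batch load from a CSV file instead of row-by-row INSERTs
  COPY INTO bars
  FROM '/data/bars_batch.csv'
  USING DELIMITERS ',', '\n';

The point is that data for all symbols lands in one table and arrives via COPY INTO, never via per-row INSERT statements.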
Sent from my iPhone
On Apr 1, 2014, at 7:25 AM, Dino Veritas wrote:
Hi,
My firm is investigating the use of MonetDB as a potential market data store. I am interested in the opinion of those on this list as to how best to structure the data for optimum performance.
Basically, our data consists of ~50,000 instruments. One use we are looking at is the storage of bar data, one bar per minute. The resulting data has one observation per minute per symbol. While not all symbols will have an entry for each minute, many observations will share the same Timestamp. Is this an appropriate use for MonetDB? One schema I tried was to put all symbols into the same table and to separate the tables by month, so for 5 years that would be 5*12 = 60 tables. A query then looks along the lines of SELECT * FROM "200801" WHERE symbol='IBM' UNION ALL SELECT * FROM "200802" WHERE symbol='IBM' UNION ALL ... The reads from a single table were quite good; however, when reading across multiple tables I encountered a lot of swap space usage and disk IO, which drastically slowed down the query.
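Spelled out a little more, the cross-month query I mean looks roughly like this (only the first few monthly tables are shown; the remaining branches continue the same pattern):

  -- One branch per monthly table, one table per month of data
  SELECT * FROM "200801" WHERE symbol = 'IBM'
  UNION ALL
  SELECT * FROM "200802" WHERE symbol = 'IBM'
  UNION ALL
  SELECT * FROM "200803" WHERE symbol = 'IBM';
  -- ...and so on, one branch for each of the 60 monthly tables

Each branch on its own reads fine; it is the combination across many monthly tables that pushes the machine into swapping.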
I also tried one table per symbol but found this to be simply untenable, as INSERTs became extremely slow (on the order of minutes) and, once again, caused massive swap space usage.
How would you partition this data? We will be inserting once per day in bulk, and I am not overly concerned with insert performance. However, query performance is paramount, with some users querying data very frequently over long time periods for single symbols or groups of symbols.
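As a concrete example of the kind of query that matters most to us (the table name, column names, symbols, and dates here are only illustrative placeholders for whatever schema we end up with):

  -- Long time range for a small group of symbols
  SELECT symbol, ts, close_price, volume
  FROM bars
  WHERE symbol IN ('IBM', 'MSFT', 'AAPL')
    AND ts BETWEEN TIMESTAMP '2008-01-01 00:00:00'
               AND TIMESTAMP '2012-12-31 23:59:00'
  ORDER BY symbol, ts;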
The machine I am testing on has 16GB of RAM and is running Windows. However, the machine MonetDB would ultimately be deployed on would have much more RAM. Even so, the results of a single query are less than 16GB and should fit into memory without issue.
So please help me, MonetDB gurus, with any suggestions as to how you would partition and store this data. I would appreciate any input. Thank you!
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list