Re: [MonetDB-users] Performance vs PostgreSQL (Stefan Manegold)
On Sat, Aug 22, 2009 at 02:02:16PM +0530, jatin patni wrote:
Hi Stefan, Thanks for your reply.
Well, I understand that a simple dump of the table wouldn't be that fast in MonetDB, because it was not built for that purpose.
I am evaluating MonetDB 5 and Postgres Plus Standard Server from EnterpriseDB.
Which release of MonetDB? The previous May2009-SP2, or the latest Aug2009?
I have not yet tested the performance of complex queries on either system. I am testing on 32-bit Windows, with a Core 2 Duo processor at 2.8 GHz and 3 GB of RAM.
I hope this configuration will work for backtesting against databases of around 100 GB.
A 100 GB database on a 32-bit machine will surely not work with MonetDB. By design (a design it owes much of its performance to), MonetDB needs all data that is accessed at any point during query processing (or bulk loading) to fit in the address space. On 32-bit systems, this is obviously limited to 4 GB; in practice, particularly on Windows, the actual limit is 3 GB or even 2 GB. For a 100 GB database, you would have to go for a 64-bit system with at least 8 to 16 GB of RAM (more, if possible) to exploit the full potential of MonetDB. If that's an option for you, I'd also personally recommend considering a Unix-like system (e.g., Linux) when building a large database server ...
I will post detailed results for both PostgreSQL and MonetDB later.
Looking forward to it!
Let me go into some detail: my database would have 4 columns (sno, price, name, volume), and my backtesting would generally involve querying this dataset and checking for crossovers of the price against its last-n-days moving averages; it may grow into more complex querying involving volumes. My concern is: if I need all the columns for my queries, will MonetDB still perform better, or should I stick with Postgres Plus Standard?
Good question --- it depends on many factors, such as the detailed structure of your queries, the characteristics of your data (and its distribution), your hardware and OS, etc. --- your benchmark will give the answer for your particular case.
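(For illustration, a minimal sketch of the kind of moving-average crossover query described above. The column names come from the 4-column schema mentioned earlier; the table name "prices", the use of sno as the per-name ordering key, and the availability of standard SQL window functions are assumptions --- support for these varies by system and version, so this is a sketch rather than a statement about either product.)

    -- hypothetical schema, following the 4 columns described above
    CREATE TABLE prices (sno INT, name VARCHAR(32), price DECIMAL(12,4), volume BIGINT);

    -- 20-day vs. 50-day moving averages per name; a crossover occurs wherever
    -- the sign of (ma20 - ma50) changes between consecutive rows
    SELECT name, sno, price,
           AVG(price) OVER (PARTITION BY name ORDER BY sno
                            ROWS BETWEEN 19 PRECEDING AND CURRENT ROW) AS ma20,
           AVG(price) OVER (PARTITION BY name ORDER BY sno
                            ROWS BETWEEN 49 PRECEDING AND CURRENT ROW) AS ma50
    FROM prices;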
I'll post the performance results once I have completed populating the databases in PostgreSQL. But I also want to try out MonetDB; I know that PostgreSQL's (interpreted) query execution may turn out to be quite slow.
We are curious to hear about them!
Anyway, thanks for this amazing product.
you are more than welcome ;-) --- thanks for the compliments!
Also, when will X100 be available as an extension to MonetDB?
For X100, please see http://www.vectorwise.com/

Regards,
Stefan
---------------------------------------------------------
Hi Jatin,
thank you very much for your interest in MonetDB!
We are glad to hear that you found MonetDB "way faster" than PostgreSQL at bulk loading your 250 MB dataset --- by the way, which versions of MonetDB and PostgreSQL are you comparing, and on what kind of system (OS, hardware: CPU, RAM, I/O subsystem)?
With a simple select * from tablename, you basically evaluate the ability of both server and client to pump your 250 MB of data from the server to the client. That is probably an interesting measurement in case you mostly run select * from table queries. However, we do not consider such queries the "major challenge" for a DBMS, and we did not design MonetDB for plainly pumping all data from the server to the client in the most efficient way. Rather, we assume that the major part of data management, i.e., processing (more) complex queries including selections, joins, aggregations, etc., is done by the DBMS (server), and only the rather small results of such (analytical) queries are then sent to the client.

Moreover, as opposed to "classical" row stores, column stores in general are not necessarily designed with a major focus on re-constructing the whole table as efficiently as possible. They rather assume that most (analytical) queries access only a subset of a table's columns. Being able to touch only the requested data, and hence not having to carry around the excess luggage of unused columns during query processing, is one of the key design differences of column stores over row stores.
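(To make the column-store point concrete: in an aggregate query such as the sketch below, only the name and price columns have to be read, whereas select * forces every column to be re-assembled and shipped to the client. The table name "prices" is the hypothetical one used earlier in this thread.)

    SELECT name, AVG(price) AS avg_price
    FROM prices
    GROUP BY name;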
Hence, to speed up simple select * from tablename queries in MonetDB, you would probably (at least) need to redesign and reimplement MonetDB's client-server protocol (MAPI).
We are curious, though, to also hear about your experiences with more complex queries that return much smaller (i.e., "more reasonable"?) result sets.
Stefan
ps: It is interesting to hear that the "vacuum" command in PostgreSQL had a significant impact on a simple select * from table query over bulk-loaded data, i.e., without any updates having been performed --- I wouldn't know what kind of "garbage" bulk loading leaves behind that needs to be cleaned up ...
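(One plausible explanation, not confirmed in this thread: in PostgreSQL, the first scan after a bulk COPY still has to check and record the visibility status ("hint bits") of every freshly loaded row, which turns that first read into a write-heavy operation; VACUUM does this work up front, and the customary post-load command also refreshes the planner statistics:)

    -- typical post-bulk-load maintenance in PostgreSQL; "tablename" is a placeholder
    VACUUM ANALYZE tablename;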
On Fri, Aug 21, 2009 at 07:18:46PM +0530, jatin patni wrote:
Hi, I am in the process of evaluating two major databases, MonetDB and PostgreSQL, for building a backend database system for backtesting on huge historical data sets of around 100 GB. I copied some test data (a 250 MB table) using bulk copy into both, and MonetDB was way faster. But when I ran a simple query, "SELECT * FROM tablename", it took around 3 minutes for 100,000 (1 lakh) rows via mclient, and a GUI client (Aqua Data Studio, which runs on the JVM) simply reported out of memory. The bulk-copy file was a text file with columns separated by pipes. With PostgreSQL it was slow at first, but once I ran VACUUM (a PostgreSQL command), the select * from tablename was much faster than in MonetDB.
So, what am I missing here? Do I need to do something to improve performance in MonetDB? I have high hopes for MonetDB, so please help. Thanks
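(For reference, the pipe-delimited bulk loads described above are typically issued as follows; the file path and table name are placeholders, and the exact options differ per version:)

    -- MonetDB
    COPY INTO tablename FROM '/path/to/data.txt' USING DELIMITERS '|', '\n';

    -- PostgreSQL
    COPY tablename FROM '/path/to/data.txt' WITH DELIMITER '|';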
--
Jatin Patni
Tel: 91 9911 649 657
jatinpatni@gmail.com
www.jatinpatni.co.nr
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O. Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |
participants (2):
- jatin patni
- Stefan Manegold