Hi,
I've been doing performance tests on MondetDB to see how it scales. I've been running tests on Amazon EC2 (7GB of memory, 64-Bit, CentOS 5.2) The effect of running MonetDB on a cloud machine is new to us.
The table I am using has 60 columns with most being just simple
Hi, Martin, When you state that index will be maintained during the session only, does that mean the index built by the optimizer will not benefit the other users connect to the same database and run the same query? Regards Allen -----Original Message----- From: Martin Kersten [mailto:Martin.Kersten@cwi.nl] Sent: Thursday, August 06, 2009 2:21 AM To: Communication channel for MonetDB users Subject: Re: [MonetDB-users] MonetDB Scaleability Hi Calum, Thank you very much for sharing this information. Calum Miller wrote: the table contains (primary) keys?
characters of length 2, here is the query: select CollectionDEPOTID as c0, CollectionCountryID as c1, sum(ParcelCount) as m0 from parcel_stage_short as parcel_stage_short group by c0, c1;
Here are the load time in batches of 8,000,000 using the following command: COPY 8000000 RECORDS INTO parcel_stage_short FROM '/mnt/parcel_stage_short.txt' USING DELIMITERS '\t','\n';
Rows Load Time Increase % Increase 8000000 240434.17 8000000 424937.54 184503.37 76.74% 8000000 599621.98 359187.81 149.39% 8000000 769276.70 528842.53 219.95% 8000000 889600.24 889600.24 209.35% msecs msecs
As can be seen the load time increases by around 200,000 msec for each
additional 8Million rows added. Any tips on how to reduce the load speed? The copy command in the latest release was not running in parallel on multicore machines due to a design error discovered in that code. Parallel loading alone would give you another factor >=2
Would it be faster to load the table in one go (tried loading in bigger batches but loader fails)? It should save. Because each time you run a batch it has to make room for the newly tuples, which implies a copying of what you already had. The counter effect is that you stress the operating system with lot of dirty memory, which it has to swap out.
If it failed, then we would like to know the error message it generated.
Here are the performance times for each additional batch of 8,000,000
rows:
Rows 1st Query 1st Increase % 1st Increase 2nd Query 2nd Increase %
Increase
08000000 1176.53 16000000 4838.52 3661.99 311.25% 2239.61 1168.30 109.05% 24000000 33947.84 32771.31 2785.43% 9323.29 8251.98 770.27% 32000000 837114.13 835937.61 71051.29% 5456.96 4385.65 409.37% 40000000 668380.26 667203.73 56709.60% 8538.02 7466.71 696.97% msecs msecs msecs msecs
The performance of the 1st query is unsurprisingly poor just after the
data has been copied in MonetDB decides itself when a (hash) index might speed up the query. This index will be maintained during the session only. You see here that effect of self-management. The performance of the same query a second time yields much faster results The performance of the monetDb with 1st/2nd query degrades as load increases. yes, building a hash index is linear in the size of the table.
Are these results typical? Is there anything I can do to increase performance on the 1st or 2nd queries? no, traditional systems would use a persistent index build *before* you run any query.
regards, Martin
Thanks in advance
Calum
------------------------------------------------------------------------
------------------------------------------------------------------------ ------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
------------------------------------------------------------------------
_______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
------------------------------------------------------------------------ ------ Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users