[MonetDB-users] MonetDB vs MySQL naive benchmark.
Hi, I just downloaded and started experimenting with MonetDB today. We have a large PHP/mysql app that runs millions of small/simple queries to generate various reports. (It needs to be refactored, but that is hard). Anyway, for starters, I wrote a very small/stupid benchmark script for both monetdb and mysql. I expected monetdb to be slower at populating the table, and faster at selecting from it, but it was slower at both. So below I am presenting my results, and my php scripts, and I would be very interested to know if there is a problem in my methodology, or if I should be running certain types of queries that monetdb will excel at. Mysql Results: -------------- $ php testmysql.php MySQL Test Dropping previous table (if present) Dropped in 0.000669 seconds creating table Created in 0.007732 seconds populating table Populated in 1.163141 seconds selecting from table queried 10000 records in 1.446646 seconds MonetDB Results: ---------------- $ php testmonet.php MonetDB Test Dropping table Dropped in 0.039597 seconds creating table Created in 0.001858 seconds populating table Populated in 7.513073 seconds selecting from table Queried 10000 records in 2.680834 seconds One interesting thing I noticed is that if I did not include the primary key index on the where-clause column, then mysql took ~80 seconds, and Monet still took about ~2.5 seconds. I guess this is because Monet just uses user-specified indexes as hints, so it had really created one anyway, yes? Anyway, here are the PHP scripts I used: testmonet.php ---------------- <?php echo "MonetDB Test\n"; $db = @monetdb_connect('sql', 'localhost', 50000, 'monetdb', 'monetdb') or die('Failed to connect to MonetDB<br>'); echo "Dropping table\n"; $start_time = microtime(true); $sql = "drop table big_table;"; $res = monetdb_query($sql); echo sprintf( "Dropped in %f seconds\n\n", microtime(true) - $start_time); echo "creating table\n"; $start_time = microtime(true); $sql = "create table big_table( col1 int primary key, col2 varchar(20), col3 int, col4 varchar(20) );"; $res = monetdb_query($sql) or die ("failed to create table" ); echo sprintf( "Created in %f seconds\n\n", microtime(true) - $start_time); echo "populating table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "insert into big_table values ($i, 'col2 $i', $i, 'col4 $i' )"; $res = monetdb_query( $sql ) or die ("failed to insert at row $i " ); } echo sprintf( "Populated in %f seconds\n\n", microtime(true) - $start_time); echo "selecting from table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "select col1 from big_table where col1 = $i"; $res = monetdb_query($sql); while ( $row = monetdb_fetch_assoc($res) ) { // print_r($row); } } echo sprintf( "Queried 10000 records in %f seconds\n\n", microtime(true) - $start_time); ?> testmysql.php ---------------- <?php echo "MySQL Test\n"; $db = @mysql_connect('localhost', 'root', '', 'monetdb') or die('Failed to connect to MySQL<br>'); mysql_select_db( "perftest" ) or die ("Failed to select DB" ); echo "Dropping previous table (if present)\n"; $start_time = microtime(true); $sql = "drop table big_table"; $res = mysql_query($sql); echo sprintf( "Dropped in %f seconds\n\n", microtime(true) - $start_time); echo "creating table\n"; $start_time = microtime(true); $sql = "create table big_table( col1 int primary key, col2 varchar(20), col3 int, col4 varchar(20) );"; $res = mysql_query($sql) or die ("failed to create table" ); echo sprintf( "Created in %f seconds\n\n", microtime(true) - $start_time); echo "populating table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "insert into big_table values ($i, 'col2 $i', $i, 'col4 $i' )"; $res = mysql_query( $sql ) or die ("failed to insert at row $i " ); } echo sprintf( "Populated in %f seconds\n\n", microtime(true) - $start_time); echo "selecting from table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "select col1 from big_table where col1 = $i"; $res = mysql_query($sql); while ( $row = mysql_fetch_assoc($res) ) { // print_r($row); } } echo sprintf( "queried 10000 records in %f seconds\n\n", microtime(true) - $start_time); ?> </body> </html>
On Fri, Sep 14, 2007 at 09:55:21PM -0600, Dan Libby wrote:
Hi,
I just downloaded and started experimenting with MonetDB today. We have a large PHP/mysql app that runs millions of small/simple queries to generate various reports. (It needs to be refactored, but that is hard).
Anyway, for starters, I wrote a very small/stupid benchmark script for both monetdb and mysql. I expected monetdb to be slower at populating the table, and faster at selecting from it, but it was slower at both. So below I am presenting my results, and my php scripts, and I would be very interested to know if there is a problem in my methodology, or if I should be running certain types of queries that monetdb will excel at.
First thanks for trying out MonetDB. It seems your benchmark is mostly measuring the communication overhead of both database systems. Indeed thats where MySql excels. Add some more complexitity to the queries (joins and/or aggregates) and MonetDB should behave just fine. Maybe even changing your query from "select col1 from big_table where col1 = $i"; into "select count(*) from big_table where col1 = $i"; may allready change the picture. For sure your big_table isn't that big. Niels
Mysql Results: --------------
$ php testmysql.php MySQL Test Dropping previous table (if present) Dropped in 0.000669 seconds
creating table Created in 0.007732 seconds
populating table Populated in 1.163141 seconds
selecting from table queried 10000 records in 1.446646 seconds
MonetDB Results: ----------------
$ php testmonet.php MonetDB Test Dropping table Dropped in 0.039597 seconds
creating table Created in 0.001858 seconds
populating table Populated in 7.513073 seconds
selecting from table Queried 10000 records in 2.680834 seconds
One interesting thing I noticed is that if I did not include the primary key index on the where-clause column, then mysql took ~80 seconds, and Monet still took about ~2.5 seconds. I guess this is because Monet just uses user-specified indexes as hints, so it had really created one anyway, yes?
Anyway, here are the PHP scripts I used:
testmonet.php ----------------
echo "MonetDB Test\n";
$db = @monetdb_connect('sql', 'localhost', 50000, 'monetdb', 'monetdb') or die('Failed to connect to MonetDB<br>');
echo "Dropping table\n"; $start_time = microtime(true); $sql = "drop table big_table;"; $res = monetdb_query($sql); echo sprintf( "Dropped in %f seconds\n\n", microtime(true) - $start_time);
echo "creating table\n"; $start_time = microtime(true); $sql = "create table big_table( col1 int primary key, col2 varchar(20), col3 int, col4 varchar(20) );"; $res = monetdb_query($sql) or die ("failed to create table" ); echo sprintf( "Created in %f seconds\n\n", microtime(true) - $start_time);
echo "populating table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "insert into big_table values ($i, 'col2 $i', $i, 'col4 $i' )"; $res = monetdb_query( $sql ) or die ("failed to insert at row $i " ); } echo sprintf( "Populated in %f seconds\n\n", microtime(true) - $start_time);
echo "selecting from table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "select col1 from big_table where col1 = $i"; $res = monetdb_query($sql); while ( $row = monetdb_fetch_assoc($res) ) { // print_r($row); } } echo sprintf( "Queried 10000 records in %f seconds\n\n", microtime(true) - $start_time);
?>
testmysql.php ----------------
echo "MySQL Test\n";
$db = @mysql_connect('localhost', 'root', '', 'monetdb') or die('Failed to connect to MySQL<br>');
mysql_select_db( "perftest" ) or die ("Failed to select DB" );
echo "Dropping previous table (if present)\n"; $start_time = microtime(true); $sql = "drop table big_table"; $res = mysql_query($sql); echo sprintf( "Dropped in %f seconds\n\n", microtime(true) - $start_time);
echo "creating table\n"; $start_time = microtime(true); $sql = "create table big_table( col1 int primary key, col2 varchar(20), col3 int, col4 varchar(20) );"; $res = mysql_query($sql) or die ("failed to create table" ); echo sprintf( "Created in %f seconds\n\n", microtime(true) - $start_time);
echo "populating table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "insert into big_table values ($i, 'col2 $i', $i, 'col4 $i' )"; $res = mysql_query( $sql ) or die ("failed to insert at row $i " ); } echo sprintf( "Populated in %f seconds\n\n", microtime(true) - $start_time);
echo "selecting from table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "select col1 from big_table where col1 = $i"; $res = mysql_query($sql); while ( $row = mysql_fetch_assoc($res) ) { // print_r($row); } } echo sprintf( "queried 10000 records in %f seconds\n\n", microtime(true) - $start_time); ?> </body> </html>
------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
Hi Niels, Thanks for your answers. On Saturday 15 September 2007 14:53, Niels Nes wrote:
It seems your benchmark is mostly measuring the communication overhead of both database systems. Indeed thats where MySql excels.
Are you saying the MySQL communication overhead is generally lower (eg more efficient protocol and/or client libs) or just that their PHP extension is better optimized (for this use case)? Do you know if MonetDB supports unix domain sockets, as postgres and mysql do? On a local machine, that should prover faster.
Add some more complexitity to the queries (joins and/or aggregates) and MonetDB should behave just fine. Maybe even changing your query from
"select col1 from big_table where col1 = $i";
into
"select count(*) from big_table where col1 = $i";
I'll try both an aggregate and a join, separately. thanks for the suggestion.
may allready change the picture. For sure your big_table isn't that big.
Do you mean "not big" in terms of number of rows, or columns, or both? With these very simple select queries, my expectation would be that I would see the same linear trend if the number of rows were 10000 or 10,000,000. regards, -- Dan Libby Open Source Consulting San Jose, Costa Rica http://osc.co.cr phone: 011 506 223 7382 Fax: 011 506 223 7359
Hi Niels,
Thanks for your answers.
On Saturday 15 September 2007 14:53, Niels Nes wrote:
It seems your benchmark is mostly measuring the communication overhead of both database systems. Indeed thats where MySql excels.
Are you saying the MySQL communication overhead is generally lower (eg more efficient protocol and/or client libs) or just that their PHP extension is better optimized (for this use case)? Indeed the PHP extension of MySQL seems very optimized. Simply wrapped our default client library (libMapi) and didn't spend time on optimizing this. So I'm not sure were the php extention of MonetDB stands.
Do you know if MonetDB supports unix domain sockets, as postgres and mysql do? In deed we do support unix named sockets (didn't test it recently though) on
On Sat, Sep 15, 2007 at 04:05:07PM -0600, Dan Libby wrote: the Mapi layer. I guess the php extention doesn't however.
On a local machine, that should prover faster.
Add some more complexitity to the queries (joins and/or aggregates) and MonetDB should behave just fine. Maybe even changing your query from
"select col1 from big_table where col1 = $i";
into
"select count(*) from big_table where col1 = $i";
I'll try both an aggregate and a join, separately. thanks for the suggestion.
may allready change the picture. For sure your big_table isn't that big.
Do you mean "not big" in terms of number of rows, or columns, or both? Both.
With these very simple select queries, my expectation would be that I would see the same linear trend if the number of rows were 10000 or 10,000,000. This completely depends on the amount of physical memory and in case of Mysql also on the sizes of your io buffers. In such small sizes (number of rows) the overhead of the protocol is a much bigger part of the total cost then with larger database sizes.
Niels
regards,
-- Dan Libby
Open Source Consulting San Jose, Costa Rica http://osc.co.cr phone: 011 506 223 7382 Fax: 011 506 223 7359
------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
On 16-09-2007 09:48:38 +0200, Niels Nes wrote:
On Sat, Sep 15, 2007 at 04:05:07PM -0600, Dan Libby wrote:
Hi Niels,
Thanks for your answers.
On Saturday 15 September 2007 14:53, Niels Nes wrote:
It seems your benchmark is mostly measuring the communication overhead of both database systems. Indeed thats where MySql excels.
Are you saying the MySQL communication overhead is generally lower (eg more efficient protocol and/or client libs) or just that their PHP extension is better optimized (for this use case)? Indeed the PHP extension of MySQL seems very optimized. Simply wrapped our default client library (libMapi) and didn't spend time on optimizing this. So I'm not sure were the php extention of MonetDB stands.
I think our PHP implementation is not so bad at all. MonetDB is just known to be not the best player in the area of small and ultra-simple queries. It's communication protocol is also not suited for handling those fast. MonetDB pays off on complex queries, i.e. those queries your application at the moment probably avoids doing because it makes it slow when using MySQL.
Do you know if MonetDB supports unix domain sockets, as postgres and mysql do? In deed we do support unix named sockets (didn't test it recently though) on the Mapi layer. I guess the php extention doesn't however.
If I recall correctly, the unix domain sockets are used by using a file as hostname, and port 0. Since PHP just throws over these arguments to libMapi it should just work (tm). In any case, since libMapi has support for it, it is trivial to add to the PHP module, IMO. My $0.02
participants (3)
-
Dan Libby
-
Fabian Groffen
-
Niels Nes