[MonetDB-users] MonetDB vs MySQL naive benchmark.
Hi, I just downloaded and started experimenting with MonetDB today. We have a large PHP/mysql app that runs millions of small/simple queries to generate various reports. (It needs to be refactored, but that is hard). Anyway, for starters, I wrote a very small/stupid benchmark script for both monetdb and mysql. I expected monetdb to be slower at populating the table, and faster at selecting from it, but it was slower at both. So below I am presenting my results, and my php scripts, and I would be very interested to know if there is a problem in my methodology, or if I should be running certain types of queries that monetdb will excel at. Mysql Results: -------------- $ php testmysql.php MySQL Test Dropping previous table (if present) Dropped in 0.000669 seconds creating table Created in 0.007732 seconds populating table Populated in 1.163141 seconds selecting from table queried 10000 records in 1.446646 seconds MonetDB Results: ---------------- $ php testmonet.php MonetDB Test Dropping table Dropped in 0.039597 seconds creating table Created in 0.001858 seconds populating table Populated in 7.513073 seconds selecting from table Queried 10000 records in 2.680834 seconds One interesting thing I noticed is that if I did not include the primary key index on the where-clause column, then mysql took ~80 seconds, and Monet still took about ~2.5 seconds. I guess this is because Monet just uses user-specified indexes as hints, so it had really created one anyway, yes? Anyway, here are the PHP scripts I used: testmonet.php ---------------- <?php echo "MonetDB Test\n"; $db = @monetdb_connect('sql', 'localhost', 50000, 'monetdb', 'monetdb') or die('Failed to connect to MonetDB<br>'); echo "Dropping table\n"; $start_time = microtime(true); $sql = "drop table big_table;"; $res = monetdb_query($sql); echo sprintf( "Dropped in %f seconds\n\n", microtime(true) - $start_time); echo "creating table\n"; $start_time = microtime(true); $sql = "create table big_table( col1 int primary key, col2 varchar(20), col3 int, col4 varchar(20) );"; $res = monetdb_query($sql) or die ("failed to create table" ); echo sprintf( "Created in %f seconds\n\n", microtime(true) - $start_time); echo "populating table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "insert into big_table values ($i, 'col2 $i', $i, 'col4 $i' )"; $res = monetdb_query( $sql ) or die ("failed to insert at row $i " ); } echo sprintf( "Populated in %f seconds\n\n", microtime(true) - $start_time); echo "selecting from table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "select col1 from big_table where col1 = $i"; $res = monetdb_query($sql); while ( $row = monetdb_fetch_assoc($res) ) { // print_r($row); } } echo sprintf( "Queried 10000 records in %f seconds\n\n", microtime(true) - $start_time); ?> testmysql.php ---------------- <?php echo "MySQL Test\n"; $db = @mysql_connect('localhost', 'root', '', 'monetdb') or die('Failed to connect to MySQL<br>'); mysql_select_db( "perftest" ) or die ("Failed to select DB" ); echo "Dropping previous table (if present)\n"; $start_time = microtime(true); $sql = "drop table big_table"; $res = mysql_query($sql); echo sprintf( "Dropped in %f seconds\n\n", microtime(true) - $start_time); echo "creating table\n"; $start_time = microtime(true); $sql = "create table big_table( col1 int primary key, col2 varchar(20), col3 int, col4 varchar(20) );"; $res = mysql_query($sql) or die ("failed to create table" ); echo sprintf( "Created in %f seconds\n\n", microtime(true) - $start_time); echo "populating table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "insert into big_table values ($i, 'col2 $i', $i, 'col4 $i' )"; $res = mysql_query( $sql ) or die ("failed to insert at row $i " ); } echo sprintf( "Populated in %f seconds\n\n", microtime(true) - $start_time); echo "selecting from table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "select col1 from big_table where col1 = $i"; $res = mysql_query($sql); while ( $row = mysql_fetch_assoc($res) ) { // print_r($row); } } echo sprintf( "queried 10000 records in %f seconds\n\n", microtime(true) - $start_time); ?> </body> </html>
Hi, Thanks for the feedback. We will certainly look at your results in detail, but a quick scan already raises a number of issues. 1) Given the DVD benchmark results we know that the MySQL/PHP coupling has been well-optimized. An area we haven't spent much time on. (MonetDB/JDBC was faster then MySQL/JDBC) 2) What are the transaction settings? It makes a huge difference if you run in auto-commit mode or without transaction control. Also the finalization of a transaction commit should not be forgotten in the equation. 3) Running such query sequences would benefit from prepare() calls and batching 4) Integrity checking is another major source of differences 5) Data distributions may lead to (hash) collisions, or are too regular to measure anything So overall, comparing systems to understand their characteristics is a difficult job. Naive tests should be replaced by simple tests and a deed analysis. Their are a plethora of reasons why systems differ and there is certainly no single answer. The database complexity you measure is covered by (old) benchmarks like the Wisconsin and AS3AP benchmarks. Success with your experiments. A clear application focus will help to determine which route to take is best. regards, Martin To illustrate Dan Libby wrote:
Hi,
I just downloaded and started experimenting with MonetDB today. We have a large PHP/mysql app that runs millions of small/simple queries to generate various reports. (It needs to be refactored, but that is hard).
Anyway, for starters, I wrote a very small/stupid benchmark script for both monetdb and mysql. I expected monetdb to be slower at populating the table, and faster at selecting from it, but it was slower at both. So below I am presenting my results, and my php scripts, and I would be very interested to know if there is a problem in my methodology, or if I should be running certain types of queries that monetdb will excel at.
Mysql Results: --------------
$ php testmysql.php MySQL Test Dropping previous table (if present) Dropped in 0.000669 seconds
creating table Created in 0.007732 seconds
populating table Populated in 1.163141 seconds
selecting from table queried 10000 records in 1.446646 seconds
MonetDB Results: ----------------
$ php testmonet.php MonetDB Test Dropping table Dropped in 0.039597 seconds
creating table Created in 0.001858 seconds
populating table Populated in 7.513073 seconds
selecting from table Queried 10000 records in 2.680834 seconds
One interesting thing I noticed is that if I did not include the primary key index on the where-clause column, then mysql took ~80 seconds, and Monet still took about ~2.5 seconds. I guess this is because Monet just uses user-specified indexes as hints, so it had really created one anyway, yes?
Anyway, here are the PHP scripts I used:
testmonet.php ----------------
echo "MonetDB Test\n";
$db = @monetdb_connect('sql', 'localhost', 50000, 'monetdb', 'monetdb') or die('Failed to connect to MonetDB<br>');
echo "Dropping table\n"; $start_time = microtime(true); $sql = "drop table big_table;"; $res = monetdb_query($sql); echo sprintf( "Dropped in %f seconds\n\n", microtime(true) - $start_time);
echo "creating table\n"; $start_time = microtime(true); $sql = "create table big_table( col1 int primary key, col2 varchar(20), col3 int, col4 varchar(20) );"; $res = monetdb_query($sql) or die ("failed to create table" ); echo sprintf( "Created in %f seconds\n\n", microtime(true) - $start_time);
echo "populating table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "insert into big_table values ($i, 'col2 $i', $i, 'col4 $i' )"; $res = monetdb_query( $sql ) or die ("failed to insert at row $i " ); } echo sprintf( "Populated in %f seconds\n\n", microtime(true) - $start_time);
echo "selecting from table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "select col1 from big_table where col1 = $i"; $res = monetdb_query($sql); while ( $row = monetdb_fetch_assoc($res) ) { // print_r($row); } } echo sprintf( "Queried 10000 records in %f seconds\n\n", microtime(true) - $start_time);
?>
testmysql.php ----------------
echo "MySQL Test\n";
$db = @mysql_connect('localhost', 'root', '', 'monetdb') or die('Failed to connect to MySQL<br>');
mysql_select_db( "perftest" ) or die ("Failed to select DB" );
echo "Dropping previous table (if present)\n"; $start_time = microtime(true); $sql = "drop table big_table"; $res = mysql_query($sql); echo sprintf( "Dropped in %f seconds\n\n", microtime(true) - $start_time);
echo "creating table\n"; $start_time = microtime(true); $sql = "create table big_table( col1 int primary key, col2 varchar(20), col3 int, col4 varchar(20) );"; $res = mysql_query($sql) or die ("failed to create table" ); echo sprintf( "Created in %f seconds\n\n", microtime(true) - $start_time);
echo "populating table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "insert into big_table values ($i, 'col2 $i', $i, 'col4 $i' )"; $res = mysql_query( $sql ) or die ("failed to insert at row $i " ); } echo sprintf( "Populated in %f seconds\n\n", microtime(true) - $start_time);
echo "selecting from table\n"; $start_time = microtime(true); for( $i = 0; $i < 10000; $i ++ ) { $sql = "select col1 from big_table where col1 = $i"; $res = mysql_query($sql); while ( $row = mysql_fetch_assoc($res) ) { // print_r($row); } } echo sprintf( "queried 10000 records in %f seconds\n\n", microtime(true) - $start_time); ?> </body> </html>
------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi Martin, On Saturday 15 September 2007 15:01, Martin Kersten wrote:
Hi,
Thanks for the feedback. We will certainly look at your results in detail, but a quick scan already raises a number of issues.
Great! :)
1) Given the DVD benchmark results we know that the MySQL/PHP coupling has been well-optimized. An area we haven't spent much time on. (MonetDB/JDBC was faster then MySQL/JDBC)
mmm. Any idea what sort of optimizations are needed/possible in Monet's PHP extension?
2) What are the transaction settings? It makes a huge difference if you run in auto-commit mode or without transaction control. Also the finalization of a transaction commit should not be forgotten in the equation.
It's all in the scripts I sent. There were no begin/commit statements, so the default mode of both DBs was used, which I assume to be auto-commit. I'll try wrapping the whole thing in a begin/commit.
3) Running such query sequences would benefit from prepare() calls and batching
Ok. I'll see if that changes the results as well.
4) Integrity checking is another major source of differences
Well, I am really only concerned with read performance. I wouldn't think integrity checking would come into play there...
5) Data distributions may lead to (hash) collisions, or are too regular to measure anything
So overall, comparing systems to understand their characteristics is a difficult job. Naive tests should be replaced by simple tests and a deed analysis. Their are a plethora of reasons why systems differ and there is certainly no single answer.
Can you give an example of a "simple test" that would be worth doing? I thought this was rather simple actually.
The database complexity you measure is covered by (old) benchmarks like the Wisconsin and AS3AP benchmarks.
Are those published, with results for MonetDB and MySQL? I don't see them linked on the monetdb site.
Success with your experiments. A clear application focus will help to determine which route to take is best.
Yes, I'd like to import all of our mysql data and run some real app tests. I haven't tackled that project yet. Do you know if there is a document anywhere that discusses mysql to monetdb migration? I did see that monet can handle the mysql auto_increment keyword, which should be helpful. -- Dan Libby Open Source Consulting San Jose, Costa Rica http://osc.co.cr phone: 011 506 223 7382 Fax: 011 506 223 7359
Dan Libby wrote:
Hi Martin,
On Saturday 15 September 2007 15:01, Martin Kersten wrote:
Hi,
4) Integrity checking is another major source of differences
Well, I am really only concerned with read performance. I wouldn't think integrity checking would come into play there... Your primary key is an integrity constraint
The database complexity you measure is covered by (old) benchmarks like the Wisconsin and AS3AP benchmarks.
Are those published, with results for MonetDB and MySQL? I don't see them linked on the monetdb site.
In sql/src/benchmarks you find a serie.
Do you know if there is a document anywhere that discusses mysql to monetdb migration? I did see that monet can handle the mysql auto_increment keyword, which should be helpful.
MySQL/SQL is not standard compliant. MonetDB supports the SEQUENCE construct, i.e. auto-increment See website and SQL documentation. A migration document does not exist. Would be helpful. But better is to find a tool in the open-source that can migrate between any database
participants (2)
-
Dan Libby
-
Martin Kersten