[MonetDB-users] Load times expected
Is this reasonable :

$ time ( echo "copy 587321890 records into sample from STDIN using delimiters '\t','\n','';" ; zcat logs20090511.txt.gz ) | recode iso8859-1..utf-8 | mclient -lsql -ukb -Pkb
[ 587321890 ]

real 618m29.030s

618m is a bit excessive, and well over what I had expected! Background on the machine :

it's a dual cpu opteron 252, 6 GB ram, running centos5.3/x86_64 stock - storage is from a 2 disk raid0, local to the machine ( sata 7200 disks, md-raid ). Size of this logs file, unzipped, is : 56854417129 bytes

There is almost nothing else running on the machine ( load avg from pre-kickoff was ~ 0 0 0 ).

An aggregate query that I kicked off last night ( about 14 hrs back ) is still running...

Are these levels of performance in line with what most people would expect ? Is there a generic tuning guide or a best-practices doc that I should be looking at about now ?

As something to compare against, pgsql-8.3 on the same machine was able to do its bulk load in under 58 min, and then took just over 10 hours to run the aggregate query ( no indexes on data ) :

- KB
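For scale, the figures in the post can be turned into rough throughput numbers. The following is a minimal sketch that only redoes the arithmetic. It assumes the unzipped size is in bytes, drops the fractional seconds, and treats "under 58 min" as an upper bound for the PostgreSQL load. The function name load_stats is made up for illustration.

```shell
#!/bin/sh
# Rough throughput arithmetic from the numbers quoted in the post.
# Assumptions: file size is in bytes; 58 min is an upper bound for PostgreSQL.
load_stats() {
    rows=587321890                  # records loaded
    bytes=56854417129               # unzipped size of the log file
    monet_secs=$((618 * 60 + 29))   # 618m29s, fractional part dropped
    pg_secs=$((58 * 60))            # "under 58 min"

    awk -v r="$rows" -v b="$bytes" -v m="$monet_secs" -v p="$pg_secs" 'BEGIN {
        printf "avg row size    : %.1f bytes\n", b / r
        printf "MonetDB load    : %.0f rows/s, %.2f MB/s\n", r / m, b / m / 1e6
        printf "PostgreSQL load : at least %.0f rows/s, %.2f MB/s\n", r / p, b / p / 1e6
        printf "ratio           : %.1fx\n", m / p
    }'
}

load_stats
```

A sustained rate of this order is far below what a two-disk SATA raid0 can normally stream, which suggests the time is going somewhere other than raw disk reads.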
On Wed, May 13, 2009 at 02:11:29PM +0100, Karanbir Singh wrote:
> Is this reasonable :
>
> $ time ( echo "copy 587321890 records into sample from STDIN using delimiters '\t','\n','';" ; zcat logs20090511.txt.gz ) | recode iso8859-1..utf-8 | mclient -lsql -ukb -Pkb
> [ 587321890 ]
>
> real 618m29.030s
First, to see where time actually goes, try

$ time zcat logs20090511.txt.gz | recode iso8859-1..utf-8 > logs20090511.utf8
$ time ( echo "copy 587321890 records into sample from STDIN using delimiters '\t','\n','';" ; cat logs20090511.utf8 ) | mclient -lsql -ukb -Pkb

or

$ time zcat logs20090511.txt.gz | recode iso8859-1..utf-8 > logs20090511.utf8
$ time echo "copy 587321890 records into sample from '$PWD/logs20090511.utf8' using delimiters '\t','\n','';" | mclient -lsql -ukb -Pkb

or even

$ time zcat logs20090511.txt.gz | recode iso8859-1..utf-8 | gzip -c > logs20090511.utf8.gz
$ time echo "copy 587321890 records into sample from '$PWD/logs20090511.utf8.gz' using delimiters '\t','\n','';" | mclient -lsql -ukb -Pkb

Stefan
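The staged measurement suggested above could be wrapped in a small script along these lines. This is only a sketch: time_stage, decode and load are hypothetical helper names, the file names and mclient options are copied from the thread, and it uses the variant that loads from a plain file. It does nothing unless called with the argument "run".

```shell
#!/bin/bash
# Sketch: time each stage of the load separately (helper names are made up).

# Run a command, then print its wall-clock time and exit status.
time_stage() {
    local label=$1
    shift
    local start=$SECONDS
    "$@"
    local status=$?
    echo "$label: $((SECONDS - start))s (exit $status)"
    return "$status"
}

# Stage 1: decompress and convert to UTF-8, writing a plain file.
decode() {
    zcat logs20090511.txt.gz | recode iso8859-1..utf-8 > logs20090511.utf8
}

# Stage 2: bulk load from that file instead of from STDIN.
load() {
    echo "copy 587321890 records into sample from '$PWD/logs20090511.utf8' using delimiters '\t','\n','';" |
        mclient -lsql -ukb -Pkb
}

# Only touch the data when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    time_stage decode decode && time_stage load load
fi
```

If the decode stage alone accounts for most of the 618 minutes, the bottleneck is zcat/recode rather than the database. If the load stage dominates, the COPY itself is the place to look.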
> 618m is a bit excessive, and well over what I had expected! Background on the machine :
>
> it's a dual cpu opteron 252, 6 GB ram, running centos5.3/x86_64 stock - storage is from a 2 disk raid0, local to the machine ( sata 7200 disks, md-raid ). Size of this logs file, unzipped, is : 56854417129 bytes
>
> There is almost nothing else running on the machine ( load avg from pre-kickoff was ~ 0 0 0 ).
>
> An aggregate query that I kicked off last night ( about 14 hrs back ) is still running...
>
> Are these levels of performance in line with what most people would expect ? Is there a generic tuning guide or a best-practices doc that I should be looking at about now ?
>
> As something to compare against, pgsql-8.3 on the same machine was able to do its bulk load in under 58 min, and then took just over 10 hours to run the aggregate query ( no indexes on data ) :
>
> - KB
_______________________________________________
MonetDB-users mailing list
MonetDB-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |