Dear Martin,
Can you specify the system you are working on, e.g. hardware/OS?
The hardware is a dual-core 2.2GHz AMD64 with 4 GB RAM and an 8-disk RAID 0 (256 KB chunk size) with an XFS filesystem for the database. The OS is 64-bit Debian Etch.
You are really pushing the system. I don't know of a system that can handle TPC-H SF100 out of the box. This size usually requires a multi-stage process and careful tuning of the system parameters. But let's see what you have done and learn from your experience.
I didn't expect sub-millisecond response times for the TPC-H queries out of the box, so I am open to any system or database setting modifications. I have noticed that MonetDB's initial memory usage is slightly lower when I change the kernel parameter vm.overcommit_memory from 2 to 0.
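For reference: on Linux, 0 selects the heuristic overcommit policy, while 2 enforces strict accounting capped by vm.overcommit_ratio, so a lower apparent footprint with 0 is plausible. A minimal read-only check of the current setting (assumes a Linux /proc filesystem; changing the value needs root, e.g. via sysctl):

    # Print the current Linux overcommit policy:
    # 0 = heuristic overcommit, 1 = always overcommit,
    # 2 = strict accounting limited by vm.overcommit_ratio.
    with open("/proc/sys/vm/overcommit_memory") as f:
        print("vm.overcommit_memory =", f.read().strip())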
Did you stop and restart the server between the loads? If not, then from a recovery point of view all 420M rows are stored in a single log file and become the target of a single reload. It behaves as if you had loaded the 7x60M rows as a single batch.
I didn't try to restart the server after every slice load, so that was the problem. Is there any way to control the checkpoint process? Stopping and restarting the server is, well... not the most elegant way to trigger a checkpoint.
Killing a database process is of course a harsh measure. In that case, the recovery process has to reload the data and enters a really expensive part of TPC-H: ensuring the correctness of the integrity relationships. Protection against this is hard, because it requires that integrity rule enforcement be disabled (the approach pursued in MySQL).
Is there any formula that can predict the recovery time after killing the database process, based on the log sizes or other parameters? It would be good to have a bulk data load method without integrity rule enforcement, similar to pg_bulkload [1] in PostgreSQL. Using the same database schema as in MonetDB, I loaded the whole TPC-H dataset into PostgreSQL within 90 minutes (~19 MB/s, which is still pretty slow IMHO).
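In case it helps for comparison, here is a minimal sketch of the kind of constraint-free load I mean: plain COPY into tables that have no keys or indexes yet, with the keys added only afterwards. The connection string, file paths, and the assumption that the trailing '|' dbgen writes per line has already been stripped are all illustrative, not a description of my actual run:

    # Illustrative loader: COPY pipe-delimited dbgen .tbl files into PostgreSQL
    # tables that have no primary/foreign keys yet, then add the keys once at
    # the end so the load itself runs without integrity checks.
    import psycopg2

    DSN = "dbname=tpch"                      # hypothetical connection string
    TABLES = ["region", "nation", "supplier", "customer",
              "part", "partsupp", "orders", "lineitem"]

    conn = psycopg2.connect(DSN)
    cur = conn.cursor()

    for table in TABLES:
        # Assumes the trailing '|' that dbgen emits has been stripped already.
        with open("/data/tpch/%s.tbl" % table) as f:   # hypothetical path
            cur.copy_expert("COPY %s FROM STDIN WITH DELIMITER '|'" % table, f)
        conn.commit()

    # Pay for integrity only once, after the bulk load is done.
    cur.execute("ALTER TABLE lineitem ADD PRIMARY KEY (l_orderkey, l_linenumber)")
    cur.execute("ALTER TABLE lineitem "
                "ADD FOREIGN KEY (l_orderkey) REFERENCES orders (o_orderkey)")
    conn.commit()
    cur.close()
    conn.close()

As far as I know, pg_bulkload goes further by also bypassing the WAL and shared buffers, but even plain COPY with deferred keys keeps the load itself free of integrity checks.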
I then tried to restart the server process, but after three hours of intensive processing the SQL module had still not started. Please note that the initialization process allocated nearly all memory and swap.
This is what we expect. Your tables require a lot of space, because MonetDB does not partition them automatically. (That's scheduled for an upcoming release ;-))
Hmm, looks pretty interesting. Any plans for reasonably sized (e.g. 64M) table slice partitioning across several machines? I am trying to build a scalable log analysis framework (append-only, read-mostly data flow), so this feature would help a lot.
Regards, J.
[1] http://pgbulkload.projects.postgresql.org/