Re: [Monetdb-developers] [MonetDB-users] SQL server startup time problem (undo/redo problem?)
Dear James,
Can you specify the system you are working on, e.g. hardware/os. You are really beating the system. I don't know a system that can handle a TPC-H SF100 out of the box. This size often requires a multistage process and careful setting of the system parameters. But, let's see what you have done and learn from your experience.
James Laken wrote:
Dear MonetDB Developers and Users,
My TPC-H SF100 test is still producing interesting problems. After several hours of work I have managed to import nearly 420 million records into the lineitem table (7 x 60-million-record slices).
Ok, you performed a sliced base load, i.e. 7 x an SF-10.
Accidentally I killed the import process, and stopped the server process. I have tried to restart the server process, but after three hours of intensive processing the SQL module had still not started.
Did you stop and restart the server between the loads? If not, then from a recovery point of view all 420M are stored in a single log file and become the target of a single reload. It behaves as if you loaded the 7x60M as a single batch. Killing a database process is of course hard. In that case, the recovery process has to reload the data and enters a really expensive part of TPC-H: ensuring correctness of the integrity relationships. Protection against this is hard, because it requires that integrity rule enforcement be disabled (the method pursued in MySQL).
Please note that the initialization process allocated nearly all memory and swap.
This is what we expect. Your tables require a lot of space, because MonetDB does not automatically partition them. (That's scheduled for an upcoming release ;-))
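For readers not familiar with MonetDB bulk loading: such a sliced base load is essentially a series of COPY INTO statements along the lines sketched below. The paths, slice size, and delimiters are invented for illustration; the stop/restart between slices is the step that was skipped here and that keeps each recovery log small.

-- slice 1 of 7, pre-split from the dbgen output
COPY 60000000 RECORDS INTO lineitem
  FROM '/data/tpch/lineitem.tbl.1'
  USING DELIMITERS '|', '\n';

-- Stopping and restarting the server (mserver5) at this point forces the
-- committed slice out of the write-ahead log, so a later crash only has
-- to replay the slice that was in flight.

COPY 60000000 RECORDS INTO lineitem
  FROM '/data/tpch/lineitem.tbl.2'
  USING DELIMITERS '|', '\n';

-- ... repeat for the remaining five slices.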
I have attached gdb to the server process and the execution stacks look like this:
Program received signal SIGINT, Interrupt.
0x00002b240cb1c2a4 in file_read () from /usr/lib/libstream.so.0
(gdb) where
#0  0x00002b240cb1c2a4 in file_read () from /usr/lib/libstream.so.0
#1  0x00002b240cb1b744 in stream_readLngArray () from /usr/lib/libstream.so.0
#2  0x00002b240c67a27f in lngRead () from /usr/lib/libbat.so.0
#3  0x00002b240c788e90 in logger_readlog () from /usr/lib/libbat.so.0
#4  0x00002b240c789b0c in logger_create () from /usr/lib/libbat.so.0
#5  0x00002aaaaab42658 in store_init () from /usr/lib/MonetDB5/lib/lib_sql.so
#6  0x00002aaaaab179b3 in mvc_init () from /usr/lib/MonetDB5/lib/lib_sql.so
#7  0x00002aaaaaacf9f4 in SQLinit () from /usr/lib/MonetDB5/lib/lib_sql.so
#8  0x00002b240bf89e3a in initScenario () from /usr/lib/libmal.so.0
#9  0x00002aaaaaacf968 in SQLsession () from /usr/lib/MonetDB5/lib/lib_sql.so
#10 0x00002b240bf6536e in runMALsequence () from /usr/lib/libmal.so.0
#11 0x00002b240bf6697b in runMAL () from /usr/lib/libmal.so.0
#12 0x00002b240bf5f3f3 in MALengine () from /usr/lib/libmal.so.0
#13 0x00002b240bf5e3ab in callString () from /usr/lib/libmal.so.0
#14 0x0000000000402a65 in main ()
(gdb) c
Continuing.
Program received signal SIGINT, Interrupt.
0x00002b240d6c31a0 in malloc () from /lib/libc.so.6
(gdb) where
#0  0x00002b240d6c31a0 in malloc () from /lib/libc.so.6
#1  0x00002b240c675c04 in GDKmallocmax () from /usr/lib/libbat.so.0
#2  0x00002b240c675da9 in GDKmalloc () from /usr/lib/libbat.so.0
#3  0x00002b240c6787ee in strRead () from /usr/lib/libbat.so.0
#4  0x00002b240c788e90 in logger_readlog () from /usr/lib/libbat.so.0
#5  0x00002b240c789b0c in logger_create () from /usr/lib/libbat.so.0
#6  0x00002aaaaab42658 in store_init () from /usr/lib/MonetDB5/lib/lib_sql.so
#7  0x00002aaaaab179b3 in mvc_init () from /usr/lib/MonetDB5/lib/lib_sql.so
#8  0x00002aaaaaacf9f4 in SQLinit () from /usr/lib/MonetDB5/lib/lib_sql.so
#9  0x00002b240bf89e3a in initScenario () from /usr/lib/libmal.so.0
#10 0x00002aaaaaacf968 in SQLsession () from /usr/lib/MonetDB5/lib/lib_sql.so
#11 0x00002b240bf6536e in runMALsequence () from /usr/lib/libmal.so.0
#12 0x00002b240bf6697b in runMAL () from /usr/lib/libmal.so.0
#13 0x00002b240bf5f3f3 in MALengine () from /usr/lib/libmal.so.0
#14 0x00002b240bf5e3ab in callString () from /usr/lib/libmal.so.0
#15 0x0000000000402a65 in main ()
Any idea?
Regards, J.
Dear Martin,
Can you specify the system you are working on, e.g. hardware/os.
The hardware is a dual-core 2.2GHz AMD64, 4G RAM, 8 disk raid0 (256k chunk size) with XFS filesystem for the database. The OS is a 64bit Debian Etch.
You are really beating the system. I don't know a system that can handle a TPC-H SF100 out of the box. This size often requires a multistage process and careful setting of the system parameters. But, let's see what you have done and learn from your experience.
I didn't expect sub-millisecond response times for the TPC-H queries out of the box, so I am open to any system or database setting modification. I have noticed that the MonetDB initial memory usage is slightly lower when I changed the kernel parameter vm/overcommit_memory from 2 to 0.
Did you stop and restart the server between the loads? If not, then from a recovery point of view all 420M are stored in a single log file and become the target of a single reload. It behaves as if you loaded the 7x60M as a single batch.
I didn't try to restart the server on every slice load - so that was the problem. Is there any way to control the checkpoint process? Stopping and restarting the server is, well... not the most elegant way to trigger a checkpoint.
Killing a database process is of course hard. In that case, the recovery process has to reload the data and enters a really expensive part of TPC-H: ensuring correctness of the integrity relationships. Protection against this is hard, because it requires that integrity rule enforcement be disabled (the method pursued in MySQL).
Is there any formula that can predict the recovery time after killing the database process, based on the log sizes or other parameters? It would be good to have a bulk data load method without integrity rule enforcement - similar to pg_bulkload [1] in postgres. Using the same database schema as in MonetDB, I have loaded the whole TPC-H dataset into postgres within 90 minutes (~19 MB/sec, which is still pretty slow imho).
I have tried to restart the server process, but after three hours of intensive processing the SQL module had still not started. Please note that the initialization process allocated nearly all memory and swap.
This is what we expect. Your tables require a lot of space, because MonetDB does not automatically partition them. (That's scheduled for an upcoming release ;-))
Hmm, looks pretty interesting. Any plan for reasonably sized (e.g. 64M) table slice partitioning across several machines? I am trying to build a scalable log-analysis framework (append-only, read-mostly data flow), so this feature would help a lot.
Regards, J.
[1] http://pgbulkload.projects.postgresql.org/
Dear James, thanks for the info and suggestions.
James Laken wrote:
Dear Martin,
Can you specify the system you are working on, e.g. hardware/os.
The hardware is a dual-core 2.2GHz AMD64, 4G RAM, 8 disk raid0 (256k chunk size) with XFS filesystem for the database. The OS is a 64bit Debian Etch.
You are really beating the system. I don't know a system that can handle a TPC-H SF100 out of the box. This size often requires a multistage process and careful setting of the system parameters. But, let's see what you have done and learn from your experience.
I didn't expect sub-millisecond response times for the TPC-H queries out of the box, so I am open to any system or database setting modification.
I have noticed that the MonetDB initial memory usage is slightly lower when I changed the kernel parameter vm/overcommit_memory from 2 to 0.
Wow, you went deep. And useful information.
Did you stop and restart the server between the loads? If not, then from a recovery point of view all 420M are stored in a single log file and become the target of a single reload. It behaves as if you loaded the 7x60M as a single batch.
I didn't try to restart the server on every slice load - so that was the problem. Is there any way to control the checkpoint process? Stopping and restarting the server is, well... not the most elegant way to trigger a checkpoint.
This is indeed a useful feature to add. You can add it to the corresponding list. We probably need it internally shortly as well, pushing it up our development priority list.
Killing a database process is of course hard. In that case, the recovery process has to reload the data and enters a really expensive part of TPC-H: ensuring correctness of the integrity relationships. Protection against this is hard, because it requires that integrity rule enforcement be disabled (the method pursued in MySQL).
Is there any formula that can predict the recovery time after killing the database process based on the log sizes or other parameters?
We haven't studied this area so far. The way it is done after a COPY is to put the backup BATs in place at practically no cost. What delays the situation is that upon recovery you always have to check integrity. We are aware that this is overly expensive in some cases and a bug report has been put in place for attention.
It would be good to have a bulk data load method without integrity rule enforcement - similar to pg_bulkload [1] in postgres. Using the same database schema as in MonetDB, I have loaded the whole TPC-H dataset into postgres within 90 minutes (~19 MB/sec, which is still pretty slow imho).
Indeed, in sql/src/benchmarks/tpch you'll notice different load scripts with/without integrity enforcement. A separate load facility is relatively easy to construct if you know the mapping of an SQL table to the underlying BATs. But it somehow violates the basic assumptions about what the system should do.
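For readers without the source tree at hand, the "without integrity" variant presumably boils down to something like the following. This is a sketch, not the actual benchmark script: the column types follow the TPC-H specification, the path and record count are invented, and whether the keys are declared afterwards at all is up to the script.

-- 1. Create the table without PRIMARY KEY / FOREIGN KEY constraints.
CREATE TABLE lineitem (
    l_orderkey      INTEGER        NOT NULL,
    l_partkey       INTEGER        NOT NULL,
    l_suppkey       INTEGER        NOT NULL,
    l_linenumber    INTEGER        NOT NULL,
    l_quantity      DECIMAL(15,2)  NOT NULL,
    l_extendedprice DECIMAL(15,2)  NOT NULL,
    l_discount      DECIMAL(15,2)  NOT NULL,
    l_tax           DECIMAL(15,2)  NOT NULL,
    l_returnflag    CHAR(1)        NOT NULL,
    l_linestatus    CHAR(1)        NOT NULL,
    l_shipdate      DATE           NOT NULL,
    l_commitdate    DATE           NOT NULL,
    l_receiptdate   DATE           NOT NULL,
    l_shipinstruct  CHAR(25)       NOT NULL,
    l_shipmode      CHAR(10)       NOT NULL,
    l_comment       VARCHAR(44)    NOT NULL
);

-- 2. Bulk load; with no constraints declared there is no integrity
--    checking to pay for during the COPY.
COPY 600000000 RECORDS INTO lineitem
  FROM '/data/tpch/lineitem.tbl'
  USING DELIMITERS '|', '\n';

-- 3. Optionally declare the keys afterwards (checked once over the
--    loaded data), or leave them out entirely.
ALTER TABLE lineitem ADD CONSTRAINT lineitem_pk
    PRIMARY KEY (l_orderkey, l_linenumber);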
I have tried to restart the server process, but after three hours of intensive processing the SQL module had still not started. Please note that the initialization process allocated nearly all memory and swap.
This is what we expect. Your tables require a lot of space, because MonetDB does not automatically partition them. (That's scheduled for an upcoming release ;-))
Hmm, looks pretty interesting. Any plan for reasonably sized (e.g. 64M) table slice partitioning across several machines? I am trying to build a scalable log-analysis framework (append-only, read-mostly data flow), so this feature would help a lot.
We are aiming for adaptive segmentation based on the query load. See e.g. the papers on database cracking.
Regards, J.
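Until such adaptive segmentation is available, one crude workaround for the append-only, read-mostly case described above is to keep each slice in its own table and glue the slices together with a view. The sketch below is plain SQL with an invented log-table layout, not a MonetDB-specific feature; spreading slices over several machines would still need something on top of this.

-- One table per slice (e.g. per day) of the append-only log;
-- each slice can be loaded independently.
CREATE TABLE weblog_20070601 (
    ts     TIMESTAMP,
    host   VARCHAR(64),
    url    VARCHAR(256),
    status INTEGER
);
CREATE TABLE weblog_20070602 (
    ts     TIMESTAMP,
    host   VARCHAR(64),
    url    VARCHAR(256),
    status INTEGER
);

-- A view presents the slices as one logical table for querying.
CREATE VIEW weblog AS
    SELECT * FROM weblog_20070601
    UNION ALL
    SELECT * FROM weblog_20070602;

-- Adding a new slice means creating one more table and recreating the
-- view with an extra UNION ALL branch.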
participants (2)
- James Laken
- Martin Kersten