[MonetDB-users] Add primary key performance
I've loaded a 27GB (~165M rows) file into a table in MonetDB and now I am trying to add a primary key which consists of 4 fields - 2 varchars and 2 ints. It has been running for a few days (since Friday of last week). I've followed the instructions laid out on this post: http://old.nabble.com/MonetDB-SQL:-bulk-loading-large-data-sets---adding-key... which involves setting the optimizer variable to "nov2009_pipe". I am unable to log in via mclient - I am able to put in my credentials but it just hangs at the "Welcome to mclient, the MonetDB/SQL interactive terminal (Jun2010-SP2)" line. So I have a few questions: 1. Is there any way I can see the status of the alter table ... add primary key command? 2. Is it possible to stop the alter table gracefully? Or will I have to shut the db down? 3. Is there anything I should do in future to speed up adding of primary keys? Thanks. -- View this message in context: http://old.nabble.com/Add-primary-key-performance-tp30397515p30397515.html Sent from the monetdb-users mailing list archive at Nabble.com.
On Tue, Dec 07, 2010 at 07:44:14AM -0800, paulr135 wrote:
I've loaded a 27GB (~165M rows) file into a table in MonetDB and now I am trying to add a primary key which consists of 4 fields - 2 varchars and 2 ints. It has been running for a few days (since Friday of last week).
I've followed the instructions laid out on this post: http://old.nabble.com/MonetDB-SQL:-bulk-loading-large-data-sets---adding-key... which involves setting the optimizer variable to "nov2009_pipe".
Using the nov2009_pipe optimizer pipeline for adding constraint only holds for the Feb2010 release of MonetDB. It should not be required anymore for later releases as of Jun2010-SP1, incl. the latest Oct2010 release. Which version of MonetDB are you using? Which OS are you running on? What kind of manchine/hardware are you running on, in particular how much memory does it have? Can you check what the machine is doing? Is it running out of physical memory and swapping, i.e., is the virtual size of the mserver5 (much) larger than the machine's physical memory size?
I am unable to log in via mclient - I am able to put in my credentials but it just hangs at the "Welcome to mclient, the MonetDB/SQL interactive terminal (Jun2010-SP2)" line.
So I have a few questions: 1. Is there any way I can see the status of the alter table ... add primary key command?
You can use MonetDB's stethoscope tool to "listen" what the server is doing; however, while it is busy with a single (MAL-)operation, there will not be any output; hence, you might have to wait for some time to see some output. Unless very familiar with MonetDB internals, the output of stethoscope might not be too informative for you. But you're welcome to send us the output to have a look at it. In addition, you can check what the mserver5 is doing as suggested above.
2. Is it possible to stop the alter table gracefully? Or will I have to shut the db down?
No, unfortunately there is no way to gracefully stop a running query. You'd have to way until it finishes (successfully or with an error), or shut down the DB.
3. Is there anything I should do in future to speed up adding of primary keys?
In priciple not. We are aware that with data sizes (i.e., total size of all columns that the primary keys consists of) that significantly exceed main memory, adding the key (i.e., checking whether the data fulfills the key condition) does not work as efficiently as it could/should, in particular with multi-column keys --- we're working on that ... You could, though, run the ALTER TABLE statement in trace mode, i.e., `TRACE ALTER TABLE ...`, collect the output and send it to us for analysis; that could help us to improve the creation of large multi-column primary keys. Regards, Stefan
Thanks. -- View this message in context: http://old.nabble.com/Add-primary-key-performance-tp30397515p30397515.html Sent from the monetdb-users mailing list archive at Nabble.com.
------------------------------------------------------------------------------ What happens now with your Lotus Notes apps - do you make another costly upgrade, or settle for being marooned without product support? Time to move off Lotus Notes and onto the cloud with Force.com, apps are easier to build, use, and manage than apps on traditional platforms. Sign up for the Lotus Notes Migration Kit to learn more. http://p.sf.net/sfu/salesforce-d2d _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Stefan.Manegold @ CWI.nl | DB Architectures (INS1) | | http://CWI.nl/~manegold/ | Science Park 123 (L321) | | Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Apologies for the delay. Stefan Manegold wrote:
Which version of MonetDB are you using?
Database: MonetDB v5.20.5 Stefan Manegold wrote:
Which OS are you running on?
$ cat /etc/redhat-release CentOS release 5.5 (Final) Stefan Manegold wrote:
What kind of manchine/hardware are you running on, in particular how much memory does it have?
Intel(R) Core(TM)2 Quad CPU Q8400 @ 2.66GHz $ free -m total used free shared buffers cached Mem: 2983 2839 143 0 207 2425 -/+ buffers/cache: 206 2776 Swap: 4991 56 4935 (output of this command NOT taken while monetdb was adding primary keys) Stefan Manegold wrote:
Can you check what the machine is doing? Is it running out of physical memory and swapping, i.e., is the virtual size of the mserver5 (much) larger than the machine's physical memory size?
Memory usage from top reported 130% usage for memory. I'm assuming that means it was using most physical memory and swapping. Stefan Manegold wrote:
3. Is there anything I should do in future to speed up adding of primary keys?
In priciple not. We are aware that with data sizes (i.e., total size of all columns that the primary keys consists of) that significantly exceed main memory, adding the key (i.e., checking whether the data fulfills the key condition) does not work as efficiently as it could/should, in particular with multi-column keys --- we're working on that ... You could, though, run the ALTER TABLE statement in trace mode, i.e., `TRACE ALTER TABLE ...`, collect the output and send it to us for analysis; that could help us to improve the creation of large multi-column primary keys.
I'll issue this command over the weekend when I don't have anything planned for this machine. Thanks! -- View this message in context: http://old.nabble.com/Add-primary-key-performance-tp30397515p30474207.html Sent from the monetdb-users mailing list archive at Nabble.com.
paulr135 wrote:
I'll issue this command over the weekend when I don't have anything planned for this machine.
Here is the output from this command:
sql>trace ALTER TABLE table1 ADD CONSTRAINT table1_pkey PRIMARY KEY
(opc,session_id,session_start,domain_id);
operation successful
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| ticks | stmt
|
+===============+==========================================================================================================================================+
| 2 | _2 := sql.mvc();
|
| 31639286 | sql.catalog(29,"sys",46912720421368,0);
|
| 17078 | _50:bat[:oid,:int]
participants (2)
-
paulr135
-
Stefan Manegold