[MonetDB-users] Dual instances of MonetDB for 1 dbfarm - is it possible?
Hi Matthew,
On 02-09-2008 21:56:40 -0400, McKennirey.Matthew wrote:
While we look forward to the availability of a new real-time replication strategy for MonetDB, we were wondering if it would be plausible to configure two instances of MonetDB, on different machines, to point to the same dbfarm.
I assume you mean not only using the same dbfarm, but also using the same databases. MonetDB locks the database it is using, so unless file locking (over NFS, for instance) is malfunctioning, you should find that this doesn't work.
Only one instance would be used at a time; one instance would be the primary instance and the second instance would only be used if the first failed to respond (at which time we would stop sending requests to the primary instance and raise an alert for the system administrator).
Sounds like a "failover".
We can provide for the replication of the database data at the file system layer (ZFS) but are still susceptible to a failure of MonetDB or the machine it is running on.
The danger of the ZFS solution here is that you get a copy that doesn't include locks. We once had some thoughts of supporting read-only databases (think of a LiveCD), but for your use that sounds not quite like what you need either.
I'm wondering why you actually want to "failover" from another machine. Does it mean that merovingian isn't able to cover up mserver crashes? Does mserver take the entire machine down? Or are there other conditions (network?) that lead to this multi-machine failover strategy?
On Wed, Sep 03, 2008 at 09:38:46AM +0200, Fabian Groffen wrote:
Hi Matthew,
On 02-09-2008 21:56:40 -0400, McKennirey.Matthew wrote:
While we look forward to the availability of a new real-time replication strategy for MonetDB, we were wondering if it would be plausible to configure two instances of MonetDB, on different machines, to point to the same dbfarm.
I assume you mean not only using the same dbfarm, but also using the same databases. MonetDB locks the database it is using, so unless NFS locking or something is malfunctioning you should see this doesn't work.
At different points in time (i.e., not concurrently) two different instances of MonetDB can (technically) very well share the same dbfarm --- provided the two instances of MonetDB are binary compatible. In fact, multiple instances of MonetDB can even concurrently share the same dbfarm, provided they all use a different database (dbname). MonetDB locks the database such that only a single instance can use a particular database at a time.
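The per-database exclusivity described here can be illustrated with a minimal sketch. It assumes advisory `flock` locks on a hypothetical `.lock` file inside each database directory. The file name, the directory layout, the database names, and the helper `try_lock_database` are illustrative assumptions, not taken from MonetDB's source, and the sketch runs on a local Unix filesystem only.

```python
import fcntl
import os
import tempfile


def try_lock_database(dbfarm, dbname):
    """Try to take an exclusive, non-blocking lock for one database.

    Returns an open file object that holds the lock, or None if some
    other holder already has it. The ".lock" file name is hypothetical.
    """
    dbdir = os.path.join(dbfarm, dbname)
    os.makedirs(dbdir, exist_ok=True)
    handle = open(os.path.join(dbdir, ".lock"), "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


dbfarm = tempfile.mkdtemp()
first = try_lock_database(dbfarm, "sales")      # first instance, database "sales"
second = try_lock_database(dbfarm, "sales")     # second instance, same database
other = try_lock_database(dbfarm, "inventory")  # different database, same dbfarm
```

Under these assumptions, `second` is `None` because "sales" is already held, while `other` succeeds because it names a different database in the same dbfarm.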
Only one instance would be used at a time; one instance would be the primary instance and the second instance would only be used if the first failed to respond (at which time we would stop sending requests to the primary instance and raise an alert for the system administrator).
Sounds like a "failover".
We can provide for the replication of the database data at the file system layer (ZFS) but are still susceptible to a failure of MonetDB or the machine it is running on.
The danger of the ZFS solution here is that you get a copy that doesn't include locks. We once had some thoughts of supporting read-only databases (think of a LiveCD), but for your use that sounds not quite like what you need either.
I'm wondering why you actually want to "failover" from another machine. Does it mean that merovingian isn't able to cover up mserver crashes? Does mserver take the entire machine down? Or are there other conditions (network?) that lead to this multi machine failover strategy?
I do share Fabian's concerns.
Stefan
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079  | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |
Fabian and Stefan,
Thank you both for the time you took to reply.
We are trying to create a deployment with as much redundancy (failover) as we can. We assume hardware will fail (drives, hardware network interfaces, memory, etc.) and we may lose a machine, a switch, and so on.
The issue is not the ability of the merovingian daemon to restart mserver5 processes on the same machine; the issue is what happens if we lose the machine or we can't connect to the merovingian daemon. (As an aside, my understanding is that merovingian starts and monitors mserver5 processes on the same machine. I do not see a way to configure merovingian to start mserver5 processes on other machines.)
Our plan is to use multiple instances of MonetDB (running on multiple machines), each serving an architecturally distinct portion of the system's data, such that the failure of one instance would not prevent other parts of the system from functioning. However, we would dearly like to have a failover capability on each instance of MonetDB. Again, only one instance of MonetDB would be interacting with a specific dbfarm and dbname at a time, but if it (or the machine it is on) failed to respond, we would redirect the work to a 'backup' instance on another machine. The merovingian daemon of the 'backup' would be started but there would be no activity until needed.
So I guess the question is: when instance 1 of merovingian or an mserver5 process locks a dbfarm and dbname, when does it release the lock? And if it fails (software or hardware failure), I presume the locks still exist, preventing instance 2 from using that dbfarm and dbname? In which case we are out of luck.
On Wednesday 03 September 2008 04:31:06 Stefan Manegold wrote:
On Wed, Sep 03, 2008 at 09:38:46AM +0200, Fabian Groffen wrote:
Hi Matthew,
On 02-09-2008 21:56:40 -0400, McKennirey.Matthew wrote:
While we look forward to the availability of a new real-time replication strategy for MonetDB, we were wondering if it would be plausible to configure two instances of MonetDB, on different machines, to point to the same dbfarm.
I assume you mean not only using the same dbfarm, but also using the same databases. MonetDB locks the database it is using, so unless NFS locking or something is malfunctioning you should see this doesn't work.
At different points in time (i.e., not concurrently) two different instances of MonetDB can (technically) very well share the same dbfarm --- provided the two instances of MonetDB are binary compatible. In fact, multiple instances of MonetDB can even concurrently share the same dbfarm, provided they all use a different database (dbname). MonetDB locks the database such that only a single instance can use a particular database at a time.
Only one instance would be used at a time; one instance would be the primary instance and the second instance would only be used if the first failed to respond (at which time we would stop sending requests to the primary instance and raise an alert for the system administrator).
Sounds like a "failover".
We can provide for the replication of the database data at the file system layer (ZFS) but are still susceptible to a failure of MonetDB or the machine it is running on.
The danger of the ZFS solution here is that you get a copy that doesn't include locks. We once had some thoughts of supporting read-only databases (think of a LiveCD), but for your use that sounds not quite like what you need either.
I'm wondering why you actually want to "failover" from another machine. Does it mean that merovingian isn't able to cover up mserver crashes? Does mserver take the entire machine down? Or are there other conditions (network?) that lead to this multi machine failover strategy?
I do share Fabian's concerns.
Stefan
On 04-09-2008 12:11:49 -0400, McKennirey.Matthew wrote:
We are trying to create a deployment with as much redundancy (failover) as we can. We assume hardware will fail (drives, hardware network interfaces, memory, etc, etc) and we may lose a machine, a switch, etc.
understandable
(As an aside my understanding is that merovingian starts and monitors mserver5 processes on the same machine - I do not see a way to configure merovingian to start mserver5 processes on other machines.)
(it can't start, but it *does* discover neighbour databases)
Our plan is to use multiple instances of MonetDB (running on multiple machines) each serving an architecturally distinct portion of the system's data such that the failure of one instance would not prevent other parts of the system from functioning. However, we would dearly like to have a failover capability on each instance of MonetDB. Again, only one instance of MonetDB would be interacting with a specific dbfarm and dbname at a time, but if it (or the machine it is on) failed to respond, we would redirect the work to a 'backup' instance on another machine. The merovingian daemon of the 'backup' would be started but there would be no activity until needed.
An interesting opportunity here is the merovingian "network". Each merovingian announces its own databases and listens to the announcements of others, which makes remote databases known to the local merovingian. The current branch has code to list this remote information as well (instead of having to peek in merovingian's logs). Currently, the idea is very simple: a database is announced, and every merovingian that receives the message stores it. Each database received this way can be redirected to, and merovingian will do that transparently when a remote database name is requested. The rules of "resolving" are simple: always look for a local database first, and if there is none, look in the remote list. This remote list can be in any order and can contain duplicates; the first match is taken. Currently no priorities are encoded in it. However, it is not impossible to think of a priority scheme (like DHCP authority, or WINNT PDC master negotiations) in this picture. It would allow the same database to be installed on several machines, with the primary always first in merovingian's remote list. A stand-alone merovingian could then perform the failover step once the primary drops out.
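The resolution rules described above can be sketched in a few lines. This is only an illustration of the stated rules (local first, then the first match in the remote list), not merovingian's actual code; the function name `resolve`, the data shapes, and the host names and ports are all hypothetical.

```python
def resolve(dbname, local_databases, remote_announcements):
    """Return a (kind, location) pair for dbname, or None if unknown.

    local_databases: set of database names served on this host.
    remote_announcements: list of (dbname, location) pairs in the order
    the announcements were received; duplicates are allowed.
    """
    # Rule 1: a local database always wins.
    if dbname in local_databases:
        return ("local", dbname)
    # Rule 2: otherwise take the first matching remote announcement.
    for announced_name, location in remote_announcements:
        if announced_name == dbname:
            return ("redirect", location)
    return None


local = {"sales"}
remote = [
    ("inventory", "host-b:50000"),
    ("inventory", "host-c:50000"),  # duplicate name, never reached
    ("sales", "host-b:50000"),      # ignored because a local "sales" exists
]
```

In this model, a priority scheme of the kind mentioned above would amount to keeping the remote list ordered so that the primary's announcement comes first; the sketch itself does not model that ordering.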
So I guess the question is, when instance 1 of merovingian or a mserver5 process locks a dbfarm and dbname when does it release the lock? and if it fails (software or hardware failure) I presume the locks still exist preventing instance 2 from using that dbfarm and dbname? In which case we are out of luck.
The operating system should release all locks as soon as the program is terminated. The lock is only active as long as the file descriptor is held open, and the OS closes all file descriptors when it cleans up a terminated or crashed process. Locks cannot be "stored", so that should be safe too.
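The behaviour described above can be checked on a single machine with a small experiment. It is a sketch under assumptions: it uses advisory `flock` locks on a local Unix filesystem with a throwaway lock file, which is not necessarily how MonetDB takes its lock, and it does not model NFS or the loss of a whole machine. The lock path and the helper `can_lock` are hypothetical.

```python
import fcntl
import os
import signal
import subprocess
import sys
import tempfile

lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")

# Child process: take the lock, report readiness, then sleep until killed.
child_code = (
    "import fcntl, sys, time\n"
    "f = open(sys.argv[1], 'w')\n"
    "fcntl.flock(f, fcntl.LOCK_EX)\n"
    "print('locked', flush=True)\n"
    "time.sleep(3600)\n"
)
child = subprocess.Popen(
    [sys.executable, "-c", child_code, lock_path],
    stdout=subprocess.PIPE,
    text=True,
)
child.stdout.readline()  # block until the child reports it holds the lock


def can_lock(handle):
    """Try a non-blocking exclusive lock; True on success, False if held."""
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False


handle = open(lock_path, "w")
blocked_while_alive = not can_lock(handle)

child.send_signal(signal.SIGKILL)  # simulate a crash: no cleanup code runs
child.wait()                       # the OS has closed the child's descriptors

acquired_after_crash = can_lock(handle)
```

The child never unlocks or closes the file itself, so if the second attempt succeeds, the release came from the operating system's process cleanup.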
participants (3): Fabian Groffen, McKennirey.Matthew, Stefan Manegold