Rerunning rsync did, indeed, resolve the issue. That is great news. I asked my question because I had already re-run it and it failed, but then we did some additional maintenance on the DB and ran it again. The 3rd time worked.
But there are some possibly better options mentioned in this thread that we will be exploring, specifically LVM snapshots. Our configuration is complex, but we may be able to make snapshots work.
At some point we will also explore BTRFS, but we don’t deem it quite ready for our needs.
For what it is worth, our rsync command line options are: -a --delete
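
For illustration only, here is a minimal sketch of how those options could fit into the stop / lock / rsync / release / start sequence quoted further down in this thread. The function name backup_db, the MONETDB and RSYNC variables, and the three arguments are hypothetical and are not taken from the actual backup script.

```shell
#!/bin/sh
# Sketch of the stop / lock / rsync / release / start sequence.
# MONETDB, RSYNC, backup_db and its arguments are hypothetical names.

MONETDB="${MONETDB:-monetdb}"
RSYNC="${RSYNC:-rsync}"

backup_db() {
    db=$1      # database name
    src=$2     # dbfarm directory to copy from
    dest=$3    # backup location to copy to

    # Stop the server, then lock it so users cannot regain access.
    "$MONETDB" stop "$db" || return 1
    if ! "$MONETDB" lock "$db"; then
        # Locking failed: bring the database back up and give up.
        "$MONETDB" start "$db"
        return 1
    fi

    # Copy with the options mentioned above and remember the exit status.
    rsync_status=0
    "$RSYNC" -a --delete "$src" "$dest" || rsync_status=$?

    # Release and restart the database regardless of how rsync ended.
    "$MONETDB" release "$db" || echo "warning: release of $db failed" >&2
    "$MONETDB" start "$db" || echo "warning: start of $db failed" >&2

    # A nonzero status means the backup copy should not be trusted yet.
    return "$rsync_status"
}
```

It could be called as `backup_db mydb /path/to/dbfarm/ /path/to/backup/` (placeholder names). A nonzero return value would mean the copy should not be trusted until a later run succeeds.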
Thanks,
Vince
On 1/25/17, 12:58 AM, "Sjoerd Mullender" wrote:
On 25/01/17 02:30, Vincent Sheffer wrote:
> Martin,
>
> The sequence of actions you specify is exactly what the backup script does.
>
> The question you did not answer is really the most important one, so I will restate it:
>
> If there is a single glitch in any given rsync, for whatever reason, can we expect another incremental rsync to repair the damage? I ask, because if the answer is no, then MonetDB does not have anything close to a viable incremental backup strategy. There is no way we can expect no problem to ever occur. And we can’t have our system offline for hours while we do a complete copy of TBs worth of data every time a glitch does occur.
Depending on the glitch in rsync, you should be able to rerun it and
rsync will fix the backup. Do look at the rsync options, though. In
particular, there are options that tell rsync to not just look at the
timestamp and size of a file, but also at the contents. You may need to
enable that option in the second run. It will slow down the rsync
process considerably, though.
If the failure is because of a full disk, you first need to make space
(obviously). And if the failure is due to a broken disk, you need to
replace the disk.
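
A rough sketch of such a rerun, assuming the first pass used -a --delete as mentioned elsewhere in this thread: if it exits nonzero, a second pass adds --checksum so that file contents are compared as well as timestamp and size. The names sync_dbfarm and RSYNC are made up for the example.

```shell
#!/bin/sh
# Sketch: rerun rsync with content comparison after a failed pass.
# RSYNC and sync_dbfarm are hypothetical names used for illustration.

RSYNC="${RSYNC:-rsync}"

sync_dbfarm() {
    src=$1     # dbfarm directory to copy from
    dest=$2    # backup location to copy to

    # First pass: rsync's quick check (size and modification time only).
    first_status=0
    "$RSYNC" -a --delete "$src" "$dest" || first_status=$?
    if [ "$first_status" -eq 0 ]; then
        return 0
    fi

    # Second pass: also compare file contents. This is much slower.
    echo "rsync exited with status $first_status; rerunning with --checksum" >&2
    "$RSYNC" -a --delete --checksum "$src" "$dest"
}
```

The function returns the exit status of the last rsync it ran, so a nonzero result after the second pass still means the backup should not be trusted.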
> Without a solution to this problem MonetDB is just not viable for us.
>
> Thanks,
> Vince
>
>
> On 1/24/17, 1:32 PM, "users-list on behalf of Martin Kersten" wrote:
>
> On 24/01/2017 21:41, Vincent Sheffer wrote:
> > Martin,
> >
> > Thanks for the details, particularly the exit status list for rsync.
> >
> You should stop MonetDB using the monetdb command and also lock it. Otherwise, users can again gain access.
>
> monetdb stop <databasename>
> monetdb lock <databasename>
> rsync
> monetdb release <databasename>
> monetdb start <databasename>
>
> All transactions are properly finished.
>
> > More questions:
> > 1. When you say “restart rsync” that means start a new, non-incremental, rsync, right? Or do you think it should be possible to attempt another incremental rsync?
> > 2. Any idea if we can recover from the error message I sent over? !FATAL: logger_load: BBPrename to sql_snapshots_bid failed
> If it reads a broken backup, then the system likely encounters an unexpected situation somewhere and stops.
>
> > 3. Is stopping and locking MonetDB sufficient to ensure a consistent state? Does stopping allow running queries to complete, or at least modifications complete? Or are the sys.sessions query and sys.shutdown also required?
> >
> Queries will complete or receive a soft termination signal.
>
> Never simply stop the server using, e.g., a kill command!
>
> regards, Martin
>
> > Thanks,
> > Vince
> >
> > On 1/24/17, 11:11 AM, "users-list on behalf of Martin Kersten" wrote:
> >
> > Hi Vincent
> >
> > On 24/01/2017 17:51, Vincent Sheffer wrote:
> > > I have been attempting to create a viable periodic backup strategy for MonetDB and have had little success to date.
> > >
> > > Our database is way too big to use msqldump, so our only option is the one outlined here: https://www.monetdb.org/Documentation/UserGuide/FastDumpRestore.
> > >
> > > I follow those steps and use rsync to do an incremental backup of the dbfarm to another device (from fast SSD to RAID5 array). That works fine, until it doesn’t, in which case the backup gets corrupted somehow and MonetDB won’t start from the backup. That is the state I am currently in.
> > >
> >
> > The key issue is that the database server must indeed be stopped before you start an rsync.
> > Furthermore, rsync is an elaborate program with lots of options.
> > If an rsync fails, for whatever system error, the backup data cannot be trusted.
> > The possible exit values to cope with are:
> > 1 Syntax or usage error
> > 2 Protocol incompatibility
> > 3 Errors selecting input/output files, dirs
> > 4 Requested action not supported: an attempt was made to manipulate 64-bit files on a platform that cannot support them; or an option was specified that is supported by the client and not by the server.
> > 5 Error starting client-server protocol
> > 6 Daemon unable to append to log-file
> > 10 Error in socket I/O
> > 11 Error in file I/O
> > 12 Error in rsync protocol data stream
> > 13 Errors with program diagnostics
> > 14 Error in IPC code
> > 20 Received SIGUSR1 or SIGINT
> > 21 Some error returned by waitpid()
> > 22 Error allocating core memory buffers
> > 23 Partial transfer due to error
> > 24 Partial transfer due to vanished source files
> > 25 The --max-delete limit stopped deletions
> > 30 Timeout in data send/receive
> > 35 Timeout waiting for daemon connection
> >
> > For any of these exit values, the backup cannot be trusted and rsync should be restarted.
> > Recurring errors can indicate a broken disk.
> >
> > Corruption of a backup (disk) cannot be detected by a DBMS.
> >
> > regards, Martin
> >
> > > This is the error that I am getting:
> > >
> > > !FATAL: logger_load: BBPrename to sql_snapshots_bid failed
> > >
> > > Any help on
> > >
> > > 1. Fixing the existing backup, and/or
> > > 2. Helping me understand a better way to do periodic, incremental backups.
> > >
> > > Thanks,
> > > Vince
> > >
> > >
> > > _______________________________________________
> > > users-list mailing list
> > > users-list@monetdb.org
> > > https://www.monetdb.org/mailman/listinfo/users-list
> > >
> >
>
--
Sjoerd Mullender