Backup

Backup giulia Mon, 05/04/2020 - 14:15

MonetDB offers different options for backup and dump of the database

SQL dump and restore

SQL dump and restore mk Sun, 03/28/2010 - 13:56

An SQL dump of a database is a common method to safely store away a snapshot of the database for archival purposes or to migrate data between database instances, e.g. between two major system releases. The content of an SQL dump is a large collection of SQL statements as text. Running the script will recreate the database in the same state as it was when the dump was created. Since MonetDB does not provide global locking schemes, a time-consuming dump operation may become invalidated by a concurrent update query.

The primary tool to consider for making a database SQL dump is msqldump, which includes a wide variety of options:.

Usage: msqldump [ options ] [ dbname ]

Options are:
 -h hostname | --host=hostname    host to connect to
 -p portnr   | --port=portnr      port to connect to
 -u user     | --user=user        user id
 -d database | --database=database  database to connect to
 -f          | --functions        dump functions
 -t table    | --table=table      dump a database table
 -D          | --describe         describe database
 -N          | --inserts          use INSERT INTO statements
 -q          | --quiet            don't print welcome message
 -X          | --Xdebug           trace mapi network interaction
 -?          | --help             show this usage message
--functions and --table are mutually exclusive

msqldump program acts as a normal MonetDB client application, e.g. it runs concurrently with all other user sessions, and dumps the generated SQL statements onto standard output. Safe this to a file (by using redirection of stdout) for a later restore session by calling mclient with the saved file as input argument. Details on the various arguments can be found in the manual page for mclient.

For example, consider you have already installed the SQL tutorial database voc on a Linux platform and wishes to transport it to another machine. Then the following command line option generates the dump file.

shell> msqldump --database=voc >/tmp/voc.sql

You can inspect the file /tmp/voc.sql to confirm that indeed a readable SQL database dump is available. If storage space is a problem, you can readily pipe the output of this command to a compression tool, e.g. gzip, or sent it directly to another machine in your environment using conventional Linux tools, e.g. scpy. As a baseline for recovery, move the file over to the new machine. Then the monetdb tool can be used to create the database on the recipient machine. Once done, it suffices to feed the dump file to mclient to populate the database.

Creation of the dump respects your MonetDB server login credentials, which means you only can dump the tables you have access to.

Migration of a database from other database systems follow the same route, but be aware that SQL dialects often differ. A manual or scripted patch of a foreign SQL dump is often needed. An overview of the various SQL dialects can be found in the SQL dialects book and general background on database compatibility on Wikipedia.

shell> monetdb release demo

Windows platforms

On Windows platforms we advise to always dump the old database into a mydbdump.sql file before upgrading to a new MonetDB software release. Subsequently stop the MonetDB server and rename/remove the dbfarm/demo. After upgrade of MonetDB software and restart of MonetDB server, the dump can be restored.

To dump the SQL database, start the MonetDB SQL Client program (mclient.bat) and type the command:

\>C:\...\mydbdump.sql
\D
\>

The path after \> should be an absolute path name (i.e. start with a drive letter) and be in a save location. By default the database is located in %APPDATA%\MonetDB5. After having made a database dump it can be removed. This folder is located inside the dbfarm\demo folder.

Restoring an SQL database can be done using the MonetDB SQL Client program (mclient.bat) with the following command

\<C:\...\mydbdump.sql

Caveats

  • When the --echo or --interactive option is used, mclient sends its input file(s) line-by-line and waits for confirmation before sending the next line. Therefore, when using mclient to restore large SQL dump files, these options should not be used since they can slow down the process 10s or even 100s times depending on the file size.
  • By default (i.e. without --echo and --interactive), mclient sends an input file in large chunks of 10240 bytes to the MonetDB server. On rare occasions, a block ends exactly at a semicolon (';') which closes a COPY INTO statement. This causes the MonetDB server to complain about invalid input data for this COPY INTO (with "Failed to import table '<table name>', line 1: column 1: Column value missing") even though valid data records will be sent in the next block. This is because the MonetDB server regards the semicolon as the marker for the end-of-input and continues to process the query without asking for more input, but it finds no value for this COPY INTO. A good workaround for this problem is to add a white-space ('\ ') before the semicolon. For large files on Linux, one can use this command: sed -i 's/INTO <table name>/ &/' $file.

Hot backup

Hot backup dinther Thu, 07/30/2020 - 20:21

Creating a full database backup online

Use system procedure: sys.hot_snapshot(full_path_to_tar_file string)
to quickly create a binary copy of all persisted data files of the current connected database into one tar file.
To activate it, simply execute SQL statement:

call sys.hot_snapshot('/tmp/db_backup_2020_07_27.tar');

or on MS Windows (note the R string value prefix to interpret the string characters as Raw):

call sys.hot_snapshot(R'D:\Temp\db_backup_2020_07_27.tar');

This will create a tar file to the given location and name on the server.

Warning: the resulting tar file may become very big (TB's) depending on the size of your database on the file system. Check your database size (in the dbfarm directory) and assure your file system has enough free disk space to store a copy of it as tar file.

Warning: always include the .tar part in the file name as done in the examples above as some tools (on Windows) use it to associate the proper program to open it.

Note: Procedure sys.hot_snapshot() is introduced in release Jun2020 (11.37.7) and not available in older releases.

 

Optionally the tar file can be compressed by adding an extra file name suffix: .lz4 or .gz or .bz2 or .xz.

call sys.hot_snapshot('/tmp/db_backup_2020_07_27.tar.lz4');

or on MS Windows:

call sys.hot_snapshot(R'D:\Temp\db_backup_2020_07_27.tar.gz');

It is recommended to use no or fast lz4 compression so the tar file is written as fast as possible. This is desired to keep the db-lock as short as possible during the copying of the data files.
Compressions gz and bz2 take (much) more time. Compression xz compresses best but also takes most time to complete and thus keeps the database locked the longest. It can be useful when you want the smallest backup file size and no other users are connected to the database server.

Note: for releases Jun2020 and Jun2020-SP1 on MS Windows, only .gz and .bz2 compression formats were supported. With Oct2020 release now also .xz and .lz4 compression formats are supported on MS Windows. We recommend to use no or gz or xz compression on Windows.

 

By default only monetdb user (the admin) may execute this system procedure. If needed the monetdb user may grant execute privilege to other users of the database, for instance to user mr_backup via SQL:

grant execute on procedure sys.hot_snapshot to mr_backup;

 

Restoring a database from the (compressed) tar file

The tar file expands to a single directory with the same name as the database that was snapshotted.
This directory can be passed directly as the --dbpath startup argument of mserver5 or it can be copied into an existing dbfarm and started from monetdbd.
For example, on the machine with the database:

$ monetdb status
name  state   health               remarks
tpch  R  8m  100%  0s  mapi:monetdb://hank:50000/tpch
$ mclient -d tpch -s "call sys.hot_snapshot('/tmp/tpch.tar.gz')"
$ ls -l /tmp/tpch.tar.gz
-rw------- 1 jvr jvr 13337600 Dec  4 11:16 /tmp/tpch.tar.gz

On the other machine:

$ monetdbd create $HOME/myfarm
$ monetdbd start $HOME/myfarm
$ tar -C $HOME/myfarm -axf /tmp/tpch.tar.gz
$ monetdb status
name  state   health               remarks
tpch  S                 mapi:monetdb://hank:50000/tpch
$ mclient -d tpch -s "select count(*) from lineitem"
+-------+
| %1    |
+=======+
| 60175 |
+-------+
1 tuple

To expand the tar file into a directory with a different name, for example because there already is a database with the original name, use the `--strip-components` option of (GNU) tar:

$ mkdir $HOME/myfarm/tpch2
$ tar -C $HOME/myfarm/tpch2 --strip-components=1 -axf /tmp/tpch.tar.gz
$ monetdb status
name   state   health               remarks
tpch   R 11m  100%  0s  mapi:monetdb://hank:50000/tpch
tpch2  S                mapi:monetdb://hank:50000/tpch2

In this example, we manually create the directory tpch2 in the dbfarm. The -C option makes tar unpack in this directory. The --strip-components=1 option removes the original directory name tpch/ from the extract files.
Alternatively, we can avoid the mkdir using the even more nonstandard --one-top-level option.

$ tar -C $HOME/myfarm --one-top-level=tpch3 --strip-components=1 -axf /tmp/tpch.tar.gz

 

On MS Windows we recommend users to install 7-zip utlity to open the (compressed) tar file and extract the database files.

 

File system level backup

File system level backup mk Mon, 03/24/2014 - 22:14

Aside from the SQL dumps, one can also take the database server out of production using the following commands. Be careful in using shutdown(), because there may be users running transactions. This can be seen using the sys.sessions table or the system monitor.

shell> mclient -d demo
select * from sys.sessions;
call sys.shutdown(10);
<cntrl-d>
shell> monetdb lock demo
shell> monetdb stop demo

The "lock" step is necessary to prevent monetdbd from automatically restarting the database "demo" upon receiving another client connection request.

After the database has been securely stopped, we can create a copy of the database directory (including all subdirectories) in the dbfarm and put it aside in a safe place. You may want to use a archiver (e.g. tar) and compression program (gzip, 7-zip) to reduce the size of the copied directory. Alternatively, incremental file-system dumps can be used to reduce the time and storage space for a recovery point. Finally, the database is released for production again using:

shell>monetdb release <dbname>