MonetDB is slow to back up

Hello,

I usually back up MonetDB using msqldump:

msqldump --database=base --table=table > backup.sql

This command takes 30 minutes to back up 6 GB, but only 4 minutes to restore. How can I increase backup speed?

Thanks,
Pierre

--
1G6
52 route de bischwiller
67300 Schiltigheim
Société de Services et de Formations en Logiciels Libres
http://1g6.biz
Tel: 06 64 63 70 35

Hello,

Perhaps this is an option: https://www.monetdb.org/Documentation/UserGuide/FastDumpRestore

Regards,
Martin
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list

Hello,
Thank you very much for your reply.
I can't use this method because it is a cold, full-database backup, and it doesn't allow restoring a single table.
Pierre

Hello,

Have you deleted a significant number of rows before the backup? A delete only marks rows as deleted, and the backup scans the whole table, including the deleted rows. I would check the table sizes using the sys.storage view.

What is the on-disk size of the database you are backing up? It should be roughly the same as the size of the backup.

Radovan

Hi,
Here are the queries I use to load and back up the database:
sql>COPY INTO data_100 FROM '/root/data_maria100_utf.txt' DELIMITERS '\t' ;
103704336 affected rows (3m 10s)
The imported file:
ls -lh /root/data_maria100_utf.txt
-rw-r--r-- 1 root root 6,4G juin 10 18:22 /root/data_maria100_utf.txt
The backup command:
date ; msqldump --database=test --table=data_100 > data_100.sql ; date
Thu 24 Jul 2014, 17:40:51 (UTC+0200)
Thu 24 Jul 2014, 18:09:29 (UTC+0200)
ls -lh data_100.sql
-rw-r--r-- 1 root users 7,0G juil. 24 18:09 data_100.sql
If I use a COPY ... INTO file query to back up instead of msqldump:
sql>COPY SELECT * FROM data_100 INTO '/home/pcoustillas/dump/data_100_coyp.csv' USING DELIMITERS '\t','\n','"';
103704336 affected rows (17m 53s)
That noticeably reduces the time needed (from about 29 minutes to about 18), but I don't know whether it is good practice.
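Put differently, the two methods' throughput can be estimated from the numbers above. A quick back-of-the-envelope sketch (the 6.4 GB file size and both timings are taken from this thread; nothing here touches MonetDB itself):

```python
# Back-of-the-envelope throughput of the two dump methods,
# using the sizes and timings reported in this thread.

SIZE_MB = 6.4 * 1024  # ~6.4 GB source file, in MB

timings_s = {
    "msqldump": 28 * 60 + 38,        # 17:40:51 -> 18:09:29 (28m 38s)
    "COPY INTO file": 17 * 60 + 53,  # reported as 17m 53s
}

throughput = {name: SIZE_MB / s for name, s in timings_s.items()}
for name, mbps in throughput.items():
    print(f"{name}: {mbps:.1f} MB/s")
# msqldump: 3.8 MB/s
# COPY INTO file: 6.1 MB/s
```

Both figures are far below what a local disk can sustain, which suggests the bottleneck is in formatting/serialization rather than I/O.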
For reference, here is the output of sys.storage for this table:
sql>SELECT * FROM sys.storage WHERE "table"='data_100';
+--------+----------+----------------------+---------+----------+-----------+-----------+------------+------------+---------+--------+
| schema | table | column | type | location | count | typewidth | columnsize | heapsize | indices | sorted |
+========+==========+======================+=========+==========+===========+===========+============+============+=========+========+
| sys | data_100 | id_visite | int | 62/6262 | 103704336 | 4 | 414817344 | 0 | 0 | false |
| sys | data_100 | id_visiteur_unique | varchar | 60/6071 | 103704336 | 19 | 414817344 | 1286078464 | 0 | false |
| sys | data_100 | visiteur_identifie | int | 63/6316 | 103704336 | 4 | 414817344 | 0 | 0 | false |
| sys | data_100 | site_niveau1 | int | 62/6270 | 103704336 | 4 | 414817344 | 0 | 0 | false |
| sys | data_100 | site_niveau2 | varchar | 07/764 | 103704336 | 12 | 207408672 | 10240 | 0 | false |
| sys | data_100 | chapitre_niv1 | int | 60/6016 | 103704336 | 4 | 414817344 | 0 | 0 | false |
| sys | data_100 | chapitre_niv2 | int | 56/5605 | 103704336 | 4 | 414817344 | 0 | 0 | false |
| sys | data_100 | indicateur_global | varchar | 53/5335 | 103704336 | 1 | 207408672 | 20480 | 0 | false |
| sys | data_100 | id_cmp_autopromotion | int | 17/1700 | 103704336 | 4 | 414817344 | 0 | 0 | false |
| sys | data_100 | nb_pages_vues | int | 56/5652 | 103704336 | 4 | 414817344 | 0 | 0 | false |
| sys | data_100 | temps_passe | int | 62/6264 | 103704336 | 4 | 414817344 | 0 | 0 | false |
+--------+----------+----------------------+---------+----------+-----------+-----------+------------+------------+---------+--------+
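As a sanity check on Radovan's point that the on-disk size should roughly match the backup size, the columnsize and heapsize figures above can simply be added up (a sketch hard-coding the values from the table; against a live database one would query sys.storage directly):

```python
# Sum columnsize + heapsize from the sys.storage output above
# to estimate the table's footprint on disk.

# (column, columnsize, heapsize) as reported in this thread
columns = [
    ("id_visite",            414817344, 0),
    ("id_visiteur_unique",   414817344, 1286078464),
    ("visiteur_identifie",   414817344, 0),
    ("site_niveau1",         414817344, 0),
    ("site_niveau2",         207408672, 10240),
    ("chapitre_niv1",        414817344, 0),
    ("chapitre_niv2",        414817344, 0),
    ("indicateur_global",    207408672, 20480),
    ("id_cmp_autopromotion", 414817344, 0),
    ("nb_pages_vues",        414817344, 0),
    ("temps_passe",          414817344, 0),
]

total_bytes = sum(cs + hs for _, cs, hs in columns)
print(f"~{total_bytes / 2**30:.1f} GiB on disk")
# ~5.1 GiB
```

That is in the same ballpark as the 6.4 GB input file and the 7.0 GB dump, so a large backlog of deleted-but-not-reclaimed rows does not appear to be the problem here.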
Pierre
--
1G6
52 route de bischwiller
67300 Schiltigheim
Société de Services et de Formations en Logiciels Libres
http://1g6.biz
Tél : 06 64 63 70 35
----- Mail original -----
De: "Radovan Bičiště"

Hi,

The only thing that comes to my mind is the cache. Do you run the tests in the exact order shown below? It could be that msqldump first has to load the table into memory, and that is what takes the time. Right after that, when you run the COPY command, the table is already in memory and dumping is much faster.

You can try the following:

CASE 1
1. Load the table using COPY.
2. Dump the table using COPY.
3. Check the time.

CASE 2
1. Load the table using COPY.
2. Dump the table using msqldump.
3. Check the time.

CASE 3
1. Load the table using COPY.
2. Run SELECT * FROM table LIMIT 1000 (or something else that forces a full scan of the table).
3. Dump the table using msqldump.
4. Check the time.

Regards,
Radovan
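The three cases above all boil down to timing the same dump step after different warm-up steps. A minimal, generic timing helper for running them could look like this (a sketch; the commented-out commands are hypothetical adaptations of the msqldump and mclient invocations from this thread and would need adjusting to the actual setup):

```python
import subprocess
import time

def timed(label, fn):
    """Run fn, print how long it took, and return the elapsed seconds."""
    start = time.perf_counter()
    fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f}s")
    return elapsed

def run(cmd):
    """Run a shell command, raising if it fails."""
    subprocess.run(cmd, shell=True, check=True)

# Sketch of CASE 2 vs. CASE 3 (hypothetical commands, adjust to your setup):
# timed("msqldump, cold", lambda: run("msqldump --database=test --table=data_100 > /dev/null"))
# timed("warm-up scan",   lambda: run("mclient -d test -s 'SELECT * FROM data_100 LIMIT 1000'"))
# timed("msqldump, warm", lambda: run("msqldump --database=test --table=data_100 > /dev/null"))
```

If the "warm" dump is markedly faster than the "cold" one, the cache theory holds; if the times are similar, the bottleneck is elsewhere (e.g. in serializing rows to SQL text).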
participants (3)
- Martin Kersten
- Pierre-Adrien Coustillas
- Radovan Bičiště