pg_rman -- manages backup and recovery of PostgreSQL.
pg_rman [ OPTIONS ] { init | backup | restore | show [ DATE | timeline ] | validate [ DATE ] | delete DATE }
pg_rman has the features below:
DATE is the start time of the target backup in ISO format (YYYY-MM-DD HH:MI:SS). Prefix matching is used to compare DATE with the start times of backups.
$ pg_rman show 2009-12 # show backups taken in December 2009
$ pg_rman validate # validate all unvalidated backups
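The prefix matching of DATE can be pictured with a small shell sketch. The function name matches_date_prefix is hypothetical and is not part of pg_rman. It only illustrates how a partial date selects backups by their start time.

```shell
# Hypothetical helper (not part of pg_rman): succeed when a backup's
# start time begins with the DATE prefix given by the user.
matches_date_prefix() {
    prefix=$1
    start_time=$2
    case "$start_time" in
        "$prefix"*) return 0 ;;   # start time begins with the prefix
        *)          return 1 ;;   # no match
    esac
}
```

With this sketch, the prefix 2009-12 matches a start time such as '2009-12-24 10:00:00' but not '2010-01-01 00:00:00'.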
pg_rman supports the following commands. See also Options for details of OPTIONS.
init
backup
restore
show
validate
delete
pg_rman is a utility program for backing up and restoring a PostgreSQL database. It takes a physical online backup of the whole database cluster, archived WALs, and server logs.
pg_rman supports taking backups from a standby site with PostgreSQL 9.0 or later.
pg_rman also supports storage snapshot backups.
First, you need to create a "backup catalog" to store backup files and their metadata.
$ pg_rman init -B <a backup catalog path>
It is recommended to set log_directory, archive_mode, and archive_command in postgresql.conf before initializing the backup catalog. If these variables are set, pg_rman can adjust its configuration file to match them. In this case, you have to specify the database cluster path for PostgreSQL, either in the PGDATA environment variable or with the -D/--pgdata option.
The backup mode can be one of the following types. Server logs can also be added.
Backups taken by pg_rman need to be validated. pg_rman validates them with a file size check and CRC.
It is recommended to validate backup files as soon as possible after a backup. An unvalidated backup cannot be used for a restore or for an incremental backup.
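One way to follow this recommendation is to run validate immediately after every backup. The sketch below assumes that BACKUP_PATH and PGDATA are already set in the environment. The function name backup_and_validate is hypothetical.

```shell
# Sketch: take a backup and validate it right away, so that it becomes
# usable for restore and for later incremental backups.
# Assumes BACKUP_PATH and PGDATA are set in the environment.
backup_and_validate() {
    mode=${1:-full}                       # backup mode, full by default
    pg_rman backup --backup-mode="$mode" || return $?
    pg_rman validate                      # validate all unvalidated backups
}
```

For example, backup_and_validate incremental takes an incremental backup and then validates it. If the backup step fails, its exit code is returned and validate is not run.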
The show command outputs a list of backups.
$ pg_rman show
============================================================================
Start Time Total Data WAL Log Backup Status
============================================================================
2015-03-10 13:26:37 0m ---- ---- 16MB ---- 26kB OK
2015-03-10 13:26:06 0m ---- 16kB 33MB ---- 54kB OK
2015-03-10 13:25:05 0m 28MB ---- 838MB 150B 6549kB OK
The fields are:
With the timeline argument, you can see the timeline of each backup in the backup catalog.
$ pg_rman show timeline
============================================================
Start Mode Current TLI Parent TLI Status
============================================================
2011-11-27 19:16:37 INCR 1 0 RUNNING
2011-11-27 19:16:20 INCR 1 0 OK
2011-11-27 19:15:45 FULL 1 0 OK
In addition, when you specify the date shown in the “Start” field, you can see detailed information about that backup.
$ pg_rman show '2011-11-27 19:15:45'
# configuration
BACKUP_MODE=FULL
WITH_SERVERLOG=false
COMPRESS_DATA=false
# result
TIMELINEID=1
START_LSN=0/08000020
STOP_LSN=0/080000a0
START_TIME='2011-11-27 19:15:45'
END_TIME='2011-11-27 19:19:02'
RECOVERY_XID=1759
RECOVERY_TIME='2011-11-27 19:15:53'
TOTAL_DATA_BYTES=1242420184
READ_DATA_BYTES=25420184
READ_ARCLOG_BYTES=32218912
WRITE_BYTES=242919520
BLOCK_SIZE=8192
XLOG_BLOCK_SIZE=8192
STATUS=OK
pg_rman restores the backed-up data into the target database cluster path.
The PostgreSQL server should be stopped before a restore. In addition, do not erase the original database cluster, because pg_rman has to check the timeline ID or data checksum status from it. The restore command saves unarchived transaction logs and deletes all database files. You can retry recovery until a new backup is taken. After restoring files, pg_rman creates recovery.conf in $PGDATA. This file contains the parameters for recovery, and you can modify it if needed.
It is recommended to take a full backup as soon as possible after recovery succeeds.
If --recovery-target-timeline is not specified, the last checkpoint’s TimeLineID in the control file ($PGDATA/global/pg_control) is the restore target. If pg_control is not present, the TimeLineID of the full backup used by the restore is the restore target.
The delete command deletes backup files that are not required for recovery to points after the specified date. The following example deletes backup files that are not needed for recovery to 12:00 on September 11, 2009.
$ pg_rman show
============================================================================
Start Time Total Data WAL Log Backup Status
============================================================================
2009-09-11 20:00:01 0m ---- ---- 0B ---- 0B OK
2009-09-11 15:00:53 0m ---- 8363B 16MB ---- 2346kB OK
2009-09-11 10:00:48 0m ---- ---- 0B ---- 0B OK
2009-09-11 05:00:06 0m 40MB ---- 16MB ---- 5277kB OK
2009-09-11 00:00:02 0m ---- 8363B 16MB ---- 464kB OK
2009-09-10 20:30:12 0m ---- ---- 0B ---- 0B OK
2009-09-10 15:00:06 0m ---- ---- 16MB ---- 16kB OK
2009-09-10 10:00:02 0m ---- 8363B 16MB ---- 16kB OK
2009-09-10 05:00:08 0m 40MB ---- 150MB ---- 13MB OK
$ pg_rman delete 2009-09-11 12:00:00
$ pg_rman show
============================================================================
Start Time Total Data WAL Log Backup Status
============================================================================
2009-09-11 20:00:01 0m ---- ---- 0B ---- 0B OK
2009-09-11 15:00:53 0m ---- 8363B 16MB ---- 2346kB OK
2009-09-11 10:00:48 0m ---- ---- 0B ---- 0B OK
2009-09-11 05:00:06 0m 40MB ---- 16MB ---- 5277kB OK
If you use the replication feature of PostgreSQL 9.0 or later, you can take backups from a standby site.
Taking a backup from a standby site requires different options from normal use. Specify the database cluster on the standby site with the -D/--pgdata option, and specify the master site with the connection options (-d/--dbname, -h/--host, -p/--port). In addition, specify how to connect to the standby site with the standby connection options (--standby-host, --standby-port).
$ pg_rman init -B <a backup catalog path> -D <the database cluster path (on standby-site)>
Here is an example with the environment below.
Then, the backup from the standby site can be taken with the following command:
$ pg_rman backup --pgdata=/home/postgres/pgdata_sby --backup-mode=full --host=master --standby-host=localhost --standby-port=5432
In this example, consider a PostgreSQL server with the following configuration.
postgres=# SHOW log_directory ;
log_directory
---------------
pg_log
(1 row)
postgres=# SHOW archive_command ;
archive_command
--------------------------------------------
cp %p /home/postgres/arclog/%f
(1 row)
PGDATA and BACKUP_PATH are set as environment variables.
$ echo $PGDATA
/home/postgres/pgdata
$ echo $BACKUP_PATH
/home/postgres/backup
Initialize a backup catalog.
$ pg_rman init
INFO: ARCLOG_PATH is set to '/home/postgres/arclog'
INFO: SRVLOG_PATH is set to '/home/postgres/pgdata/pg_log'
This creates the configuration file for pg_rman, named pg_rman.ini.
All pg_rman commands load their default settings from this file.
For this example, we use the following configuration.
$ cat $BACKUP_PATH/pg_rman.ini
ARCLOG_PATH = /home/postgres/arclog
SRVLOG_PATH = /home/postgres/pgdata/pg_log
BACKUP_MODE = F
COMPRESS_DATA = YES
KEEP_ARCLOG_FILES = 10
KEEP_ARCLOG_DAYS = 10
KEEP_DATA_GENERATIONS = 3
KEEP_DATA_DAYS = 120
KEEP_SRVLOG_FILES = 10
KEEP_SRVLOG_DAYS = 10
Then, take a backup. The first backup should be a full backup. Here, we also back up the server log files.
$ pg_rman backup --backup-mode=full --with-serverlog
INFO: database backup start
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
Check the result with the show command.
$ pg_rman show
============================================================================
Start Time Total Data WAL Log Backup Status
============================================================================
2015-03-10 13:25:05 0m 28MB ---- 838MB 150B 6549kB DONE
The status of the backup we have just taken is DONE.
This is because we have not validated it yet.
So, run the validate command next.
$ pg_rman validate
INFO: validate: 2015-03-10 13:25:05 backup and archive log files by CRC
$ pg_rman show
============================================================================
Start Time Total Data WAL Log Backup Status
============================================================================
2015-03-10 13:25:05 0m 28MB ---- 838MB 150B 6549kB OK
Now the status has been changed to OK.
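To check this status change from a script, the show output can be filtered by start time. The helper below is a hypothetical sketch and is not part of pg_rman. It assumes that the status is the last whitespace-separated field of each row, as in the listings above.

```shell
# Hypothetical helper: read "pg_rman show" output on stdin and print the
# status (assumed to be the last field) of the backup whose start date
# and start time match the two arguments.
backup_status() {
    awk -v d="$1" -v t="$2" '$1 == d && $2 == t { print $NF }'
}
```

For example, pg_rman show | backup_status 2015-03-10 13:25:05 would print the status of that backup. Header and separator rows are skipped because their first two fields never match a date and a time.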
Let's try to restore the backup data. Before doing so, the PostgreSQL server should be stopped.
$ pg_ctl stop -m immediate
$ pg_rman restore
pg_rman has created recovery.conf. If necessary, modify it as you want. In this example, we use it without modification and perform PITR to the latest database state.
$ cat $PGDATA/recovery.conf
# recovery.conf generated by pg_rman 1.2.11
restore_command = 'cp /home/postgres/arclog/%f %p'
recovery_target_timeline = '1'
$ pg_ctl start
pg_rman accepts the following command line parameters. Some of them can also be specified as environment variables. See also Parameters for details.
As a general rule, paths for data location need to be specified as absolute paths; relative paths are not allowed.
-D PATH / --pgdata=PATH
-A PATH / --arclog-path=PATH
-S PATH / --srvlog-path=PATH
-B PATH / --backup-path=PATH
-c / --check
--verbose option to verify the operation.
-v / --verbose
-b { full | incremental | archive } / --backup-mode={ full | incremental | archive }
full backup, incremental backup, and archive backup. Abbreviated forms (prefix match) are also available. For example, -b f means full backup.
-s / --with-serverlog
-Z / --compress-data
-C / --smooth-checkpoint
pg_start_backup().
--keep-data-generations / --keep-data-days
--keep-data-generations is the number of backup generations to keep, and --keep-data-days is the number of days to keep them. To use these options, you have to specify both. Only files that exceed both settings are deleted.
--keep-arclog-files / --keep-arclog-days
--keep-arclog-files is the number of files to keep, and --keep-arclog-days is the number of days to keep them. When you take a backup, only files that have already been backed up and that exceed both settings are deleted from the archive log directory ($ARCLOG_PATH). To use these options, you have to specify both --keep-arclog-files and --keep-arclog-days.
--keep-srvlog-files / --keep-srvlog-days
--keep-srvlog-files is the number of files to keep, and --keep-srvlog-days is the number of days to keep them. When you take a backup, only files that exceed both settings are deleted from the server log directory (log_directory). These options work when you specify the --with-serverlog and --srvlog-path options in the backup command. To use these options, you have to specify both --keep-srvlog-files and --keep-srvlog-days.
The parameters that start with --recovery are the same as the parameters in recovery.conf. See also “Recovery Configuration” for details.
--recovery-target-timeline TIMELINE
If not specified, the last checkpoint’s TimeLineID in the control file ($PGDATA/global/pg_control) is used.
--recovery-target-time TIMESTAMP
--recovery-target-xid XID
--recovery-target-inclusive
--hard-copy
-a / --show-all
Parameters for connecting to the PostgreSQL server.
-d DBNAME / --dbname=DBNAME
-h HOSTNAME / --host=HOSTNAME
-p PORT / --port=PORT
-U USERNAME / --username=USERNAME
-w / --no-password
-W / --password
Parameters for connecting to the standby server. They are used only when you take a backup from the standby site.
--standby-host
--standby-port
--help
-V / --version
-! / --debug
Some parameters can be specified as command line arguments, environment variables, or in the configuration file, as follows:
| Short | Long | Environment variable | Conf file | Description | Remarks |
|---|---|---|---|---|---|
| -h | --host | PGHOST | | database server host or socket directory | |
| -p | --port | PGPORT | | database server port | |
| -d | --dbname | PGDATABASE | | database to connect | |
| -U | --username | PGUSER | | user name to connect as | |
| | | PGPASSWORD | | password used to connect | |
| -w | --no-password | | | never prompt for password | |
| -W | --password | | | force password prompt | |
| -D | --pgdata | PGDATA | Yes | location of the database storage area | |
| -B | --backup-path | BACKUP_PATH | Yes | location of the backup storage area | |
| -A | --arclog-path | ARCLOG_PATH | Yes | location of archive WAL storage area | |
| -S | --srvlog-path | SRVLOG_PATH | Yes | location of server log storage area | |
| -b | --backup-mode | BACKUP_MODE | Yes | backup mode (full, incremental, or archive) | |
| -s | --with-serverlog | WITH_SERVERLOG | Yes | also backup server log files | specify as a boolean in the environment variable or configuration file |
| -Z | --compress-data | COMPRESS_DATA | Yes | compress data backup with zlib | specify as a boolean in the environment variable or configuration file |
| -C | --smooth-checkpoint | SMOOTH_CHECKPOINT | Yes | do smooth checkpoint before backup | specify as a boolean in the environment variable or configuration file |
| | --standby-host | STANDBY_HOST | Yes | standby server host or socket directory | |
| | --standby-port | STANDBY_PORT | Yes | standby server port | |
| | --keep-data-generations | KEEP_DATA_GENERATIONS | Yes | keep GENERATION of full data backup | |
| | --keep-data-days | KEEP_DATA_DAYS | Yes | keep enough data backup to recover to DAY days ago | |
| | --keep-srvlog-files | KEEP_SRVLOG_FILES | Yes | keep NUM of serverlogs | |
| | --keep-srvlog-days | KEEP_SRVLOG_DAYS | Yes | keep serverlog modified in DAY days | |
| | --keep-arclog-files | KEEP_ARCLOG_FILES | Yes | keep NUM of archived WAL | |
| | --keep-arclog-days | KEEP_ARCLOG_DAYS | Yes | keep archived WAL modified in DAY days | |
| | --recovery-target-timeline | RECOVERY_TARGET_TIMELINE | Yes | recovering into a particular timeline | |
| | --recovery-target-xid | RECOVERY_TARGET_XID | Yes | transaction ID up to which recovery will proceed | |
| | --recovery-target-time | RECOVERY_TARGET_TIME | Yes | time stamp up to which recovery will proceed | |
| | --recovery-target-inclusive | RECOVERY_TARGET_INCLUSIVE | Yes | whether we stop just after the recovery target | |
| | --hard-copy | HARD_COPY | Yes | how to restore archive WAL | specify as a boolean in the environment variable or configuration file |
This utility, like most other PostgreSQL utilities, also uses the environment variables supported by libpq (see Environment Variables).
pg_rman has the following restrictions.
When taking a backup from a standby site, pg_rman has the following restrictions too.
When using storage snapshot, pg_rman has the following restrictions too.
pg_rman can recover to a point in time if a timeline, transaction ID, or timestamp is specified for recovery. pg_xlogdump (9.3 or later) or xlogdump (9.2 or earlier) is a useful tool for checking the contents of WAL files and determining the point to recover to. See Continuous Archiving and Point-in-Time Recovery (PITR) for details.
Parameters can be specified in the form “name=value” in the configuration file. Quotes are required if the value contains whitespace. Comments start with “#”. Spaces and tabs are ignored except inside values.
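As an illustration of this format, the following sketch reads one value from a file such as $BACKUP_PATH/pg_rman.ini. It is not pg_rman's own parser. The function name ini_get is hypothetical, and the handling of quotes (one surrounding pair is stripped) is an assumption based on the description above.

```shell
# Hypothetical reader for "name = value" lines; not pg_rman's own parser.
# Comment lines starting with "#" never match because they do not begin
# with the parameter name. If the name appears more than once, the last
# value is printed.
ini_get() {
    name=$1
    file=$2
    # Keep only the text after "name =", dropping whitespace around "=".
    sed -n "s/^[[:space:]]*${name}[[:space:]]*=[[:space:]]*//p" "$file" |
        # Trim trailing whitespace, then strip one pair of quotes.
        sed -e 's/[[:space:]]*$//' -e "s/^'\(.*\)'\$/\1/" -e 's/^"\(.*\)"$/\1/' |
        tail -n 1
}
```

For example, ini_get ARCLOG_PATH "$BACKUP_PATH/pg_rman.ini" prints the archive log path stored in the configuration file.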
pg_rman returns exit codes for each error status.
Code | Name | Description |
---|---|---|
0 | SUCCESS | Succeeded. |
1 | HELP | Print a help, then exit. |
2 | ERROR | Generic error. |
3 | FATAL | Exit because of repeated errors. |
4 | PANIC | Unknown critical condition. |
10 | ERROR_SYSTEM | I/O or system error. |
11 | ERROR_NOMEM | Out of memory. |
12 | ERROR_ARGS | Invalid input parameters. |
13 | ERROR_INTERRUPTED | Interrupted by user. (Ctrl+C etc.) |
14 | ERROR_PG_COMMAND | SQL error. |
15 | ERROR_PG_CONNECT | Cannot connect to PostgreSQL server. |
20 | ERROR_ARCHIVE_FAILED | Cannot archive WAL files. |
21 | ERROR_NO_BACKUP | Backup file not found. |
22 | ERROR_CORRUPTED | Backup file is broken. |
23 | ERROR_ALREADY_RUNNING | Cannot start because another pg_rman is running. |
24 | ERROR_PG_INCOMPATIBLE | Version conflicted with PostgreSQL server. |
25 | ERROR_PG_RUNNING | Cannot restore because PostgreSQL server is running. |
26 | ERROR_PID_BROKEN | postmaster.pid file is broken. |
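In a wrapper script it can help to log the name of an exit code rather than the number. The sketch below maps each code in the table above to its name. The function name pg_rman_exit_name and the UNKNOWN fallback are assumptions introduced for illustration.

```shell
# Map a pg_rman exit code to its name from the table above.
# UNKNOWN is a fallback label used by this sketch only.
pg_rman_exit_name() {
    case "$1" in
        0)  echo SUCCESS ;;
        1)  echo HELP ;;
        2)  echo ERROR ;;
        3)  echo FATAL ;;
        4)  echo PANIC ;;
        10) echo ERROR_SYSTEM ;;
        11) echo ERROR_NOMEM ;;
        12) echo ERROR_ARGS ;;
        13) echo ERROR_INTERRUPTED ;;
        14) echo ERROR_PG_COMMAND ;;
        15) echo ERROR_PG_CONNECT ;;
        20) echo ERROR_ARCHIVE_FAILED ;;
        21) echo ERROR_NO_BACKUP ;;
        22) echo ERROR_CORRUPTED ;;
        23) echo ERROR_ALREADY_RUNNING ;;
        24) echo ERROR_PG_INCOMPATIBLE ;;
        25) echo ERROR_PG_RUNNING ;;
        26) echo ERROR_PID_BROKEN ;;
        *)  echo UNKNOWN ;;
    esac
}
```

A caller would typically save the status first (rc=$? right after the pg_rman command) and then pass it to pg_rman_exit_name "$rc".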
The outer script is the script that takes snapshots and mounts file systems. If you want to add your own outer script, write it to conform to the outer script interface, referring to the manuals of your storage. See Interface Specification for what the script must implement.
The outer script performs the operations needed to take several snapshots in a single execution.
To use an outer script, place it in the backup catalog directory and name it “snapshot_script”.
A sample outer script is provided for LVM (Logical Volume Manager).
$ ${BACKUP_PATH}/snapshot_script { split | resync | mount | umount | freeze | unfreeze } [cleanup]
{ split | resync | mount | umount | freeze | unfreeze }
[cleanup]
cleanup, error occurring doesn’t stop the process. just output warning messages.
resync, umount, unfreeze only.
split operation.
freeze operation.
mount operation.
split operation.
mount operation. The template is <tablespace name>=<path to directory for the tablespace>
SUCCESS”, otherwise output nothing. If the command is split or mount, output in last line.
split
PG-DATA”.
resync [cleanup]
split operation.
cleanup is specified and occurring errors, output warning messages and continue to get rest snapshots.
mount
split operation to the filesystem.
PG-DATA”.
umount [cleanup]
mount operation.
cleanup is specified and occurring errors, output warning messages and continue to unmount rest snapshots.
freeze
unfreeze [cleanup]
freeze operation.
split
Runs the lvcreate command as root against the volume to take a snapshot.
$ sudo /usr/sbin/lvcreate --snapshot --size=2G --name snap00 /dev/VolGroup00/LogVolume00
The above example takes a snapshot of the logical volume “LogVolume00”.
resync
Runs the lvremove command as root to remove the snapshot volume.
$ sudo /usr/sbin/lvremove -f /dev/VolGroup00/snap00
mount
Runs the mount command as root against the snapshot volume.
$ sudo /bin/mount /dev/VolGroup00/snap00 /mnt/snapshot_lvm/pgdata
The above example mounts the snapshot volume made by the split operation on “/mnt/snapshot_lvm/pgdata”.
umount
Runs the umount command as root against the mounted snapshot volume.
$ sudo /bin/umount /mnt/snapshot_lvm/pgdata
freeze
unfreeze
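The LVM commands above could be tied together by a dispatcher along the following lines. This is only a sketch and not the sample script shipped with pg_rman. The volume, snapshot, and mount point names are the placeholders from the examples, freeze and unfreeze are left as no-ops, and the output format and the cleanup argument required by the interface are not implemented.

```shell
# Sketch of a snapshot_script dispatcher built from the LVM commands above.
# Names such as snap00 and /mnt/snapshot_lvm/pgdata are placeholders taken
# from the examples. The interface's required output and the optional
# cleanup argument are intentionally left out of this sketch.
snapshot_script() {
    case "$1" in
        split)  sudo /usr/sbin/lvcreate --snapshot --size=2G --name snap00 /dev/VolGroup00/LogVolume00 ;;
        resync) sudo /usr/sbin/lvremove -f /dev/VolGroup00/snap00 ;;
        mount)  sudo /bin/mount /dev/VolGroup00/snap00 /mnt/snapshot_lvm/pgdata ;;
        umount) sudo /bin/umount /mnt/snapshot_lvm/pgdata ;;
        freeze|unfreeze) : ;;   # no-op in this sketch
        *)
            echo "usage: snapshot_script { split | resync | mount | umount | freeze | unfreeze } [cleanup]" >&2
            return 1
            ;;
    esac
}
```

An unknown operation prints the usage line to standard error and returns a non-zero status.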
pg_rman can be installed in the same way as standard contrib modules.
The module can be built with pgxs.
$ cd pg_rman
$ make
$ make install
There is no need to register it in databases.