Backing up and restoring a PostgreSQL database is an essential task for any system administrator. Fortunately, the built-in pg_dump and pg_restore utilities make these tasks easier to complete.
Using these utilities, administrators can create full logical backups of a single database or an entire cluster, either locally or remotely.
PostgreSQL is a widely used, open-source relational database system that offers a wide array of tools to accomplish tasks quickly.
To help you understand more about these processes, we will cover what they are and work through some examples.
It's important to note that running these commands assumes that you already have a server up and running the Linux operating system, and have PostgreSQL installed. There will also have to be a root password set up on your server.
pg_dump
The pg_dump command extracts a PostgreSQL database into a script file or another archive file. This utility is for backing up databases. It makes consistent backups even if the database is being used concurrently: readers, writers, and other users won't be blocked from using the database while pg_dump runs.
Only a single database will be dumped. If there is a cluster of databases that needs to be dumped, scroll down to the pg_dumpall command.
The outputs are either script or archive files.
A script dump is a plain-text file that contains the SQL commands needed to reconstruct the database to the exact state it was in when it was saved.
Feeding the file to psql restores the database. Script files work on other databases and can be used with different machines and architectures.
With a few modifications to the options, this file type can even be used with other SQL database products.
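For example, a plain-text dump can be replayed into another database with psql. This is a minimal sketch; the database names mydb and mydb_restore are placeholders, and mydb_restore must already exist:
pg_dump mydb > mydb.sql
psql -d mydb_restore -f mydb.sql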
Archive file formats aren't as universal and must be restored with pg_restore. While not as generic, this format allows the user to be selective about what gets restored. The user can even reorder items before restoring them, which makes this format ideal for porting a database across architectures.
Output format of pg_dump
There is also flexibility when it comes to archiving and transferring information. Using an archive format with pg_dump and bringing it back in with pg_restore allows for more specific restoration: pg_dump backs up the entire database, and pg_restore then selects which parts of the database to restore.
Using output file formats like "custom" and "directory" allows for the greatest flexibility. These formats support reordering archived items and parallel restoration, and they are compressed by default. If you want to run parallel dumps, the "directory" format is your only choice.
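As an illustration, a parallel dump and restore in the directory format might look like the following sketch, where mydb, mydb_restore, and the backup directory name are placeholders:
pg_dump -F d -j 4 -f mydb_backup mydb
pg_restore -j 4 -d mydb_restore mydb_backup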
The PostgreSQL statistics collector is a subsystem within PostgreSQL designed to collect and provide information about server activity.
This includes a wide range of data, such as the number of rows fetched or affected by queries, the number of index scans and table scans, the number of tuples inserted, updated, or deleted, and more.
If pg_dump has been run, be sure to take a look at any errors printed to standard error.
Running pg_dump will internally execute SELECT statements. You will have to be able to select information from the database using psql for the operations to work properly. Any default connection settings and environment variables used by front-end libraries like libpq will also apply when running the utility.
The statistics collector in PostgreSQL typically gathers information about the activities performed by pg_dump. However, if you prefer not to have this data collected, you can disable it by setting the track_counts parameter to false. This can be done either through the PGOPTIONS environment variable or by using the ALTER USER command.
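For instance, either of the following approaches would turn off that collection for dump sessions. This is a sketch: mydb is a placeholder database, and backup_user is a hypothetical role used for backups:
PGOPTIONS='-c track_counts=false' pg_dump mydb > mydb.sql
psql -c "ALTER USER backup_user SET track_counts = false;"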
A standard command will follow this format:
pg_dump [connection-option…] [option…] [dbname]
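The connection options let you dump a database on another host, for example. The hostname, port, and database name below are placeholders:
pg_dump -h db.example.com -p 5432 -U postgres mydb > mydb.sql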
PostgreSQL has online documentation that covers all of the options available for pg_dump; you'll find a few of the most common options below.
The pg_dump command in PostgreSQL offers a variety of options to control the content and format of the output when backing up a database. Here are some of the key options available:
Basic Options:
- dbname: Specifies the name of the database to be dumped.
- -a, --data-only: Dump only the data, not the schema.
- -b, --large-objects, --blobs (deprecated): Include large objects in the dump.
- -B, --no-large-objects, --no-blobs (deprecated): Exclude large objects from the dump.
- -c, --clean: Output commands to DROP all the dumped database objects before creating them.
Formatting and Filtering Options:
- -C, --create: Begin the output with a command to create the database itself.
- -e pattern, --extension=pattern: Dump only extensions matching the pattern.
- -E encoding, --encoding=encoding: Set the character set encoding for the dump.
- -F format, --format=format: Selects the format of the output (plain, custom, directory, tar).
- -j njobs, --jobs=njobs: Run the dump in parallel by dumping multiple tables simultaneously.
Schema and Table Selection:
- -n pattern, --schema=pattern: Dump only schemas matching the pattern.
- -N pattern, --exclude-schema=pattern: Exclude schemas matching the pattern.
- -t pattern, --table=pattern: Dump only tables with names matching the pattern.
- -T pattern, --exclude-table=pattern: Exclude tables matching the pattern.
Advanced and Specialized Options:
- --inserts, --column-inserts: Dump data as INSERT commands (with explicit column names in the case of --column-inserts).
- --disable-triggers: Include commands to temporarily disable triggers on the target tables.
- --enable-row-security: Dump the contents of a table with row security enabled.
- --include-foreign-data=foreignserver: Dump the data for foreign tables with a matching foreign server.
Dumping a database called dangerousdb into an SQL file:
pg_dump dangerousdb > db.sql
Backing up dangerousdb in the tar format:
pg_dump -U postgres -F t dangerousdb > dangerousdb.tar
Saving dangerousdb in the directory format runs this command (the directory format writes to a directory named with -f rather than to standard output):
pg_dump -U postgres -F d -f db1_backup dangerousdb
For large databases where smaller files are desired, the utility can be combined with a compression tool like gzip when running the backup.
pg_dump -U postgres dangerousdb | gzip > dangerousdb.gz
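To restore a gzip-compressed plain-text dump like this one, the stream can be piped back into psql. The target database name here is a placeholder and must already exist:
gunzip -c dangerousdb.gz | psql -d dangerousdb_restore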
Reloading a script into a newly created database called nowdb:
psql -d nowdb -f db.sql
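If nowdb does not exist yet, it can be created first with the bundled createdb utility:
createdb nowdb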
Dumping a single PostgreSQL table is possible by using the -t option and specifying the database and table names (here: dangerousdb and tallytab):
pg_dump -t tallytab dangerousdb > db.sql
Selected PostgreSQL tables can be dumped by using patterns. Here the command will dump all tables that start with "sam" in the "ple" schema, but will exclude the table "samson":
pg_dump -t 'ple.sam*' -T ple.samson dangerousdb > db.sql
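These options can also be combined. As a sketch, the following would produce a plain-text dump of only the ple schema, prefixed with the DROP and CREATE commands needed to rebuild it; the output file name is arbitrary:
pg_dump -U postgres -c -C -n ple dangerousdb > ple_only.sql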
pg_restore
The pg_restore command takes an archive file created by a pg_dump command and restores the selected PostgreSQL database.
When pg_dump is used with one of the non-plain-text formats, the restore command will bring the database back. The utility knows how to issue commands in the proper order to make sure the database is reconstructed to the exact state it was in when the save occurred.
Since this format is supposed to be portable across architectures, the user may select what is restored and in what order.
The options for what can be done with the data depend on how the source file was generated. The command can't regenerate data that isn't there, and it can't change the nature of the commands used to create the file.
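One way to use this selectivity is the list-file workflow: dump the archive's table of contents with -l, edit or reorder the entries, then restore just those items with -L. This is a sketch with placeholder file and database names:
pg_restore -l mydb.dump > mydb.list
# edit mydb.list: comment out or reorder entries as needed
pg_restore -L mydb.list -d mydb_restore mydb.dump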
The pg_restore command in PostgreSQL is used to restore databases from an archive created by pg_dump. Here are some of the most important options for this command:
Basic Options:
- filename: Specifies the location of the archive file to be restored.
- -a, --data-only: Restore only the data, not the schema.
- -c, --clean: Drop all objects before restoring them, useful for overwriting an existing database.
- -C, --create: Create the database before restoring into it.
- -d dbname, --dbname=dbname: Connect to a specific database and restore directly into it.
Performance Options:
- -j number-of-jobs, --jobs=number-of-jobs: Run the most time-consuming steps concurrently, reducing the time to restore a large database.
Filtering Options:
- -n schema, --schema=schema: Restore only objects in the specified schema.
There are two modes the utility uses. If the user specifies a database name, pg_restore connects to that database and restores content directly into it. If the name is left out, the process creates a script with the SQL commands to rebuild the database.
In the second mode, the script is written to standard output and will look similar to the plain-text script that pg_dump would generate.
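For example, omitting -d writes the rebuild script to standard output, where it can be redirected into a file (file names here are placeholders):
pg_restore mydb.dump > mydb.sql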
The command will follow this format:
pg_restore [connection-option…] [option…] [filename]
For a full list of options for these commands, you may check out the PostgreSQL documentation.
Restoring a backup from a .tar file requires that the user consider whether the database already exists and what the format of the backup is.
If the database dbcooper already exists, the following command will restore it:
pg_restore -U postgres -Ft -d dbcooper < dbcooper.tar
If the database doesn't yet exist, add the -C flag and connect to an existing database such as postgres; pg_restore will then create dbcooper and restore into it:
pg_restore -U postgres -Ft -C -d postgres < dbcooper.tar
The following command will restore a backup from a plain-text backup file, here named back_it_on_up.sql:
psql -f back_it_on_up.sql
pg_dumpall
This PostgreSQL utility extracts an entire database cluster into a single script file.
Using pg_dumpall, one command allows the user to back up an entire cluster of databases and dump them into one script file. The file works the same way as pg_dump output, meaning that the script will use SQL commands to restore all databases.
In fact, this command will call pg_dump for each database in the cluster. Some parts of the architecture, like global objects, are saved too. Database roles, tablespaces, and any information that is common to all databases will be saved by pg_dumpall, something that pg_dump will never touch.
Requirements
To effectively use the tool, you will likely have to be logged into the database as a superuser to get a complete dump. Superuser privileges will also be useful to execute the saved script so that you can add roles and create databases.
The final file will use the standard SQL script output. Running this utility will require connecting to the PostgreSQL server once per database while performing the dump. If you use password authentication, you will have to provide the password for each database in the cluster.
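To avoid typing the password once per database, credentials can be stored in a ~/.pgpass file. This is a sketch with placeholder values; each line uses the hostname:port:database:username:password format, and the file must be made readable only by its owner (chmod 600 ~/.pgpass):
localhost:5432:*:postgres:secretpassword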
The command for the pg_dumpall utility will be structured like the following:
pg_dumpall [connection-option…] [option…]
PostgreSQL has extensive documentation that covers all options available for using the tool if you are looking for something specific.
The following command will dump all databases:
pg_dumpall > db.out
This command will reload databases from file:
psql -f db.out postgres
This command will dump all databases and create a single file called back_it_on_up.sql:
pg_dumpall -f back_it_on_up.sql
Many of the error messages that pop up will refer to pg_dump because pg_dumpall runs this utility internally. Some errors will inevitably come up but don't indicate a real problem. The script issues CREATE ROLE for every role existing in the cluster, so roles that already exist on the target, like the bootstrap superuser, will produce a "role already exists" error.
Databases will retain any previous contents and database-level properties. If you want to be sure that the databases are restored exactly as they were, using the --clean option may be useful.
The option authorizes the script to recreate the built-in databases and makes sure that each database has the same properties it had previously in the cluster. Using this option will kick back some errors about non-existent objects, but these errors can be ignored.
Adding the --if-exists option will take those errors out if they are too distracting.
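A cluster dump that includes both options might look like this (the output file name is arbitrary):
pg_dumpall --clean --if-exists -f cluster_backup.sql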
Running "analyze" on each database will give the optimizer useful statistics to determine how a restore went.
By using these utilities, many of the backup tasks for a PostgreSQL database can be accomplished with little effort.
The right command depends on whether the user wants to dump or restore and whether they want to back up everything or just a single database. Adding in the options that are available for the database system allows the user to customize how the backup happens and what the final result looks like.
SimpleBackups is a database and website backup automation tool that offloads all your backup tasks.
It automates PostgreSQL backups and sends them to the cloud storage of your choice.
SimpleBackups helps you make sure your backups are running well, on time, securely, with all the bells and whistles.