PostgreSQL pg_dump & pg_restore Guide

SimpleBackups founder

Laurent Lemaire

Co-founder, SimpleBackups

December 1st, 2020

Backing up and restoring a PostgreSQL database is an essential task for any system administrator. Fortunately, there are built-in pg_dump and pg_restore utilities to make these tasks easier to complete.

Looking for simple service to manage your PostgreSQL backups?
→ Try SimpleBackups for free

PostgreSQL Backup & Restore Guide

Table of Contents

Introduction

Using these utilities, administrators are able to create a full, incremental, or continuous backup either locally or remotely.

PostgreSQL is a relational database system that is widely used. The system is open-source and offers a wide array of tools to accomplish tasks quickly.

To help you understand more about these processes we will cover what they are and work through some examples.

It's important to note that running these commands assumes that you already have a server up and running the Linux operating system, and have PostgreSQL installed. There will also have to be a root password setup on your server.

Don't want to maintain you backup scripts?

Try SimpleBackups Now →

The pg_dump command

The pg_dump command extracts a PostgreSQL database into a script file or another archive file. This utility is for backing up databases. The utility makes consistent backups even if the database is being used concurrently. Readers, writers, and other users won't be blocked from using the database while using pg_dump.

Only a single database will be dumped. If there is a cluster of databases that need to be dumped, scroll down to the pg_dumpall command.

The output of pg_dump

The outputs are either script or archive files. A script dump is a plain text file that contains SQL commands which can reconstruct the database to the exact state it was in when it was saved. Feeding the file to psql will restore these scripts. Script files work on other databases, and can be used with different machines and architectures.

A few modifications in the options even allows this file type to be used with other SQL database products.

Archive file formats aren't as universal and must be used by pg_restore to get the database back. While not as generic, this format allows the user to be selective of what gets restored. The user can even reorder items before restoring them, which makes this format ideal for porting the database across architectures.

Output format

There is also flexibility when it comes to archiving and transferring information. Using the archive format with pg_dump and bringing it back in with pg_restore allows for more specific restoration. pg_dump will back up the entire database, then pg_restore will then select which parts of the database to archive or restore.

Using output file formats like "custom" and "directory" allows for the greatest flexibility. These formats support reordering archived items, parallel restoration, and automatically come compressed. If you want to run parallel dumps, then the "directory" format will be your only choice.

PostgreSQL statistics collector

If pg_dump has been run, be sure to take a look at the output of any standard errors that get printed.
Running pg_dump will internally execute "select" statements. You will have to be able to select information from the database using psql for the operations to work properly. Any front-end library variables, like for libpq, will continue to be turned on when running the utility.

The statistics collector is in charge of collecting database activity for pg_dump. If you don't want to use this, you can use the "alter user" command or go to "pgoptions" and change the parameter track_counts.

Command structure

A standard command will follow this format:

pg_dump [connection-option…] [option…] [dbname]`

Postgresql has online documentation that covers all of the options available for pg_dump including how to write them and what the option does.

See a pg_dump example

Dumping a database called dangerousdb into an SQL file:

pg_dump dangerousdb > db.sql

Backing up the dangerousdb in with a tar format:

pg_dump -U postgres -F c dangerousdb > dangerousdb.tar

Saving the dangerousdb in a directory format runs this command:

pg_dump -U postgres -F d dangerousdb > db1_backup

Large databases that want smaller file formats may use the utility with a compression tool like gzip when running the backup.

pg_dump -U postgres dangerousdb | gzip > dangerousdb.gz

Reloading a script into a newly created database called nowdb:

psql -d nowdb -f db.sql

Dumping a single table is possible by using the -t option and specifying the database and tab name (here: dangerousdb; tallytab)

pg_dump -t tallytab dangerousdb > db.sql

Dumping selected tables by putting in conditions. Here the command will dump all tables that start with "sam" in the "ple" schema, but will exclude the table "samson."

pg_dump -t 'ple.sam*' -T ple.samson dangerousdb > db.sql

Summary of the pg_restore command

The pg_restore command takes an archive file created by a pg_dump command and restores the selected PostgreSQL database.

When pg_dump is used with one of the non-plain text formats, the restore command will bring the database back. The utility knows how to issue commands in the proper order to make sure the database is reconstructed to the exact state it was in when the save occurred.

Since this format is supposed to be portable across architectures, the user may select what is restored and in what order.

The options for what can be done with the data depends on how the source file was generated. The command can't regenerate data that isn't there, and it can't change the nature of commands used to create the file.

pg_restore modes

There are two modes the utility uses. If the user specifies a database name, pg_restore connects to that database and restores content directly on it. If the name is left out, the process creates a script with the SQL commands to rebuild the database.

For the second choice, the result is a file with a standard output and will look similar to the standard text script that pg_dump would generate.

Command structure and format

The command will follow this format:

pg_restore [connection-option…] [option…] [filename]`

For a full list of options, for these commands, you may check out the PostgreSQL documentation.

See a pg_restore example

Restoring a backup with a .tar file name requires that the user consider whether the database already exists, and what the format of the backup is.

If the database dbcooper already exists, the following command will restore it:

pg_restore -U postgres -Ft -d dbcooper < dbcooper.tar

If the database doesn't yet exist, tweaking the command like the following will restore it:

pg_restore -U postgres -Ft -C -d dbcooper < dbcooper.tar

The following command will restore a backup from a backup file i.e. name: back_it_on_up.sql

psql -f back_it_on_up.sql

Summary of the pg_dumpall command

The PostgreSQL utility extracts a database cluster into a script file

Using pg_dumpall, one command allows the user to back up an entire cluster of databases and dump them out into one script file. The file works the same as the pg_dump command, meaning that the script will use SQL commands to restore all databases.

In fact, this command will call pg_dump for each database in the cluster. Some parts of the architecture, like global objects, are saved too. Database roles, tablespaces, and any information that is common to all databases will be saved by pg_dumpall, something that pg_dump will never touch.

Using pg_dumpall

To effectively use the tool, you will likely have to be logged into the database as a superuser to get a complete dump. Superuser privileges will also be useful to execute the saved script so that you can add roles and create databases.

The final file will use the standard SQL script output. Running this utility will require connecting to the PostgreSQL server once per database while performing the dump. If you use password authentication, you will have to provide the password for each database in the cluster.

Error messages

Many of the error messages that pop up will refer to pg_dump because the command runs this utility internally. Some errors will inevitably come up, but won't mean anything. The script will "create roles" for every role existing in the cluster you are using.

Roles like the bootstrap superuser, will likely get an error that says, "role already exists."

Using the --clean option

Databases will retain any previous contents and database-level properties. If you want to be sure that the databases are restored exactly as they are, using the --clean option may be useful.

The option authorized the script to recreate the built-in databases, and makes sure that each database will have the same properties they had previously in the cluster. Using this option will kick back some errors about non-existent objects, but these errors can be ignored.

Using the --if-exists option

Adding the --if-exists option will take those errors out if they are too distracting.

Running "analyze" on each database will give the optimizer useful statistics to determine how a restore went.

Command structure and format

The command for the pg_dumpall utility will be structured like the following:

pg_dumpall [connection-option…] [option…]

PostgreSQL has extensive documentation that covers all options available for using the tool if you are looking for something specific.

See a pg_dumpall example

The following command will dump all databases:

pg_dumpall > db.out

This command will reload databases from file:

psql -f db.out postgres

This command will dump all files and create a single file called back_it_on_up.sql

pg_dumpall -f back_it_on_up.sql

Conclusion

By using these utilities many of the backup features for a PostgreSQL can be accomplished with little effort.

The command that is right depends on whether the user wants to dump or restore a file and whether they want to back up everything or just a single database. Adding in the options that are available for the database system allows the user to customize how the backup happens and what the final result looks like.

Automated PostgreSQL Backups as a service

SimpleBackups is a database and website backup automation tool that offloads all the backups tasks.
It automates PostgreSQL backups and sends them to the cloud storage of your choice.
SimpleBackups helps you make sure your backups are running well, on time, securely, with all the bells and whistles.

→ Set up your first PostgreSQL backup for free



Back to blog

Stop worrying about your backups.
Focus on building amazing things!

Free 7-day trial. No credit card required.

Not convinced yet? Need help?
Get in touch via chat or at [email protected]

Customer support with experts
Security & privacy first
Service that you'll love using