The Ultimate GitLab Backup Script

Nour Sofanati
Developer, SimpleBackups

February 16, 2024

Why GitLab Backups Matter

GitLab is often the heart of your development workflow. Having reliable backups ensures you can recover your code repositories, issues, and project data in case of unexpected problems or data loss.
This guide focuses on a simple, customizable Bash script for automating your GitLab backups.

Prerequisites

  • GitLab API Access: You'll need a private access token from your GitLab instance to interact with the API. See GitLab documentation for how to generate this token.
  • git Command: Ensure the git command-line tool is installed on your system. It's used to clone repositories from the GitLab instance.
  • jq Command: If not already installed, get the jq command-line JSON processor. It helps easily parse data from the GitLab API responses.
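To see what jq will be doing in the backup script, here is a minimal sketch run against a hand-written sample of a GitLab projects API response. The JSON values below are illustrative, not real API output:

```shell
# Hand-written sample of one element of the GitLab projects API response
sample='[{"id": 42, "name": "demo", "http_url_to_repo": "https://gitlab.example.com/group/demo.git", "path_with_namespace": "group/demo"}]'

# Extract the clone URL the same way the backup script does
clone_url=$(echo "$sample" | jq -r '.[0].http_url_to_repo')
echo "$clone_url"
# prints: https://gitlab.example.com/group/demo.git
```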

The Backup Script

Let's dive into the core Bash script for backing up your GitLab repositories.


#!/bin/bash

# Configuration (replace the placeholder values with your own)
GITLAB_URL="https://gitlab.example.com"   # Base URL of your GitLab instance
PRIVATE_TOKEN="your_private_token"        # Personal access token with API access
GITLAB_USERNAME="your_username"           # Username used for HTTPS cloning
BACKUP_DIR="/path/to/backup"              # Local directory that receives the backups
TIMESTAMP=$(date +%Y%m%d_%H%M%S)          # Timestamp used in backup file names

# Ensure backup directory exists and is empty
rm -rf "$BACKUP_DIR"
mkdir -p "$BACKUP_DIR"

# Improved Error Handling (print error message and exit)
handle_error() {
    echo "ERROR: $1"
    exit 1
}

# Pagination variables
page=1       # Initial page number
per_page=100 # Number of projects per page (maximum is 100)

# Main loop for fetching and backing up projects
while true; do
    echo "Fetching page $page"

    # Fetch project data using the GitLab API; response headers are
    # dumped to a file so the pagination Link header can be inspected
    HEADER_FILE=$(mktemp)
    API_RESPONSE=$(curl --silent --fail --dump-header "$HEADER_FILE" \
        --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
        "$GITLAB_URL/api/v4/projects?membership=true&per_page=$per_page&page=$page")

    # Error handling
    if [ $? -ne 0 ]; then
        handle_error "Error fetching projects from GitLab API (page $page): curl command failed."
    fi
    API_HEADERS=$(cat "$HEADER_FILE")

    # Store paginated response directly
    echo "$API_RESPONSE" > "$BACKUP_DIR/projects_page_${page}_$TIMESTAMP.json"

    # Iterate through projects within the page and clone repositories
    jq -c '.[]' "$BACKUP_DIR/projects_page_${page}_$TIMESTAMP.json" | while read -r project; do

        PROJECT_ID=$(echo "$project" | jq -r '.id')
        PROJECT_NAME=$(echo "$project" | jq -r '.name')
        CLONE_URL=$(echo "$project" | jq -r '.http_url_to_repo')
        PATH_WITH_NAMESPACE=$(echo "$project" | jq -r '.path_with_namespace')

        # Check if values are extracted correctly
        if [ -z "$CLONE_URL" ]; then
            echo "Error: Empty clone URL for project $PROJECT_NAME"
            continue # Skip to the next project
        fi

        # Construct authenticated URL
        AUTH_CLONE_URL=$(echo "$CLONE_URL" | sed "s|https://|https://$GITLAB_USERNAME:$PRIVATE_TOKEN@|")

        # Derive repository directory name (flatten "group/project" to "group_project.git")
        REPO_DIR="${PATH_WITH_NAMESPACE//\//_}.git"

        echo "Backing up project (mirror): $PROJECT_NAME ($PROJECT_ID)"
        git clone --mirror "$AUTH_CLONE_URL" "$BACKUP_DIR/$REPO_DIR"
    done

    # The API sends a 'rel="next"' link header while more pages remain;
    # fetch the next page, or stop once the last page has been processed
    if [[ $API_HEADERS == *'rel="next"'* ]]; then
        page=$((page + 1))
    else
        echo "All projects backed up."
        break # Exit the loop
    fi
done

exit 0
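One detail worth spelling out from the script is how the authenticated clone URL is built: sed injects the username and token into the HTTPS URL. Here is that step in isolation, with placeholder credentials:

```shell
# Placeholder values, standing in for the script's configuration
CLONE_URL="https://gitlab.example.com/group/demo.git"
GITLAB_USERNAME="alice"
PRIVATE_TOKEN="glpat-EXAMPLE"

# sed rewrites the https:// prefix to include user:token credentials
AUTH_CLONE_URL=$(echo "$CLONE_URL" | sed "s|https://|https://$GITLAB_USERNAME:$PRIVATE_TOKEN@|")
echo "$AUTH_CLONE_URL"
# prints: https://alice:glpat-EXAMPLE@gitlab.example.com/group/demo.git
```

Note that the token ends up embedded in the remote URL of each mirrored repository, so keep the backup directory itself access-controlled.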

Want to trust that your GitLab backups are running smoothly, without the hassle?

Try SimpleBackups Now →

How to restore your GitLab Backup

To restore a repository from a backup, you can use the git clone command to create a new repository from the backup. For example:

git clone /path/to/backup/repo.git /path/to/new/repo.git

The restored clone works as usual: you can inspect the repository and push it back to your GitLab instance.
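If you want to rehearse the restore path without touching a real GitLab instance, the whole round trip can be sketched with local repositories only (all paths here are temporary and the commit is a stand-in for real project history):

```shell
set -e
work=$(mktemp -d)

# 1. Create a source repository with one commit (stands in for GitLab)
git init -q "$work/source"
git -C "$work/source" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"

# 2. Back it up as a mirror, exactly what the backup script stores
git clone -q --mirror "$work/source" "$work/backup.git"

# 3. Restore by cloning from the backup into a fresh working repository
git clone -q "$work/backup.git" "$work/restored"
restored_msg=$(git -C "$work/restored" log -1 --pretty=%s)
echo "$restored_msg"
# prints: initial commit
```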

Customizing the Script

The script is designed to be easily customizable. You can modify the GITLAB_URL, PRIVATE_TOKEN, BACKUP_DIR, and GITLAB_USERNAME variables to suit your environment. You can also adjust the per_page variable to control the number of projects fetched per API request.
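One way to make those variables configurable without editing the script is to fall back to a default only when the corresponding environment variable is unset. The variable names and defaults below are illustrative:

```shell
# Clean slate for this demo, so the defaults below are what take effect
unset GITLAB_URL PER_PAGE

# ${VAR:-default} keeps an existing environment value, else uses the default
GITLAB_URL="${GITLAB_URL:-https://gitlab.example.com}"
per_page="${PER_PAGE:-100}"

echo "Fetching from $GITLAB_URL, $per_page projects per page"
```

With this pattern, a one-off run against another instance is just `GITLAB_URL=https://gitlab.other.com ./backup.sh`.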

Automating the Backup

To automate the backup process, you can use a cron job to run the script at regular intervals. For example, to run the script every day at 3 AM, you can add the following line to your crontab:

0 3 * * * /path/to/

Store your GitLab backup on S3

You can also sync the backup directory to cloud storage services like Amazon S3, Google Cloud Storage, or Azure Blob Storage. This ensures your backups are stored offsite and protected from local data loss.

To set up a sync with Amazon S3, use the aws s3 sync command. Here's an example script that syncs the backup directory to an S3 bucket:


# Configuration (adjust to your environment)
BACKUP_DIR="/path/to/backup"        # Local backup directory
S3_BUCKET="s3://your-backup-bucket" # Destination bucket (placeholder name)

# Sync the local backup directory to the S3 bucket
aws s3 sync "$BACKUP_DIR" "$S3_BUCKET"

# (optional) print the sync status
echo "Backup sync to S3 completed with status: $?"
exit 0


With this script, you can automate your GitLab backups and keep your repositories protected. Customize it to fit your environment, schedule it to run at regular intervals, and sync the results offsite, and you can have peace of mind knowing your GitLab data is always recoverable.
