Automated Backups with CloudNativePG for PostgreSQL on Kubernetes

CloudNativePG makes it rather easy to set up and operate a highly available PostgreSQL cluster on Kubernetes. See this post for more details.
Setting up is a good start - but operating PostgreSQL in production obviously requires one additional component: backups.
- pgBarman - Overview
- Prerequisites
- Setting up Barman backups with CloudNativePG
- Executing backups
- A note on how often to schedule backups
- Recovery
- Summary
The TL;DR can be found in the Summary.
There are a ton of backup solutions for PostgreSQL - pgBarman, pgBackRest, WAL-G, and plain pg_dump, to mention just four of the more popular options.
Now - if we look at how we deployed our cluster - using CloudNativePG - it's clear that we could use any of these methods. Nevertheless, the CloudNativePG operator comes with first-class pgBarman support - which is no surprise, as both pgBarman and CloudNativePG come from the awesome people at EnterpriseDB.
pgBarman - Overview
pgBarman - or Barman for short - was developed to implement backup and disaster recovery for PostgreSQL databases. One of its main design goals is to maximize business continuity in case of a disaster.
Some of its more exciting features are:
- Point in time recovery
- Remote backup
- WAL archiving and streaming
- Synchronous WAL streaming (meaning zero data loss in case of failures)
- Incremental and parallel backups
- Backup catalog
This guide will show us some of these features in action.
Prerequisites
To follow this guide, you need a running, CloudNativePG-operated PostgreSQL cluster on Kubernetes (as set up in the previous post) and access to a supported object store - we'll use an Azure Blob Storage account here.
Setting up Barman backups with CloudNativePG
First, we will add Barman to our cluster object and configure it to store its backups in an Azure Blob Storage container.
Setting up Barman with CloudNativePG is as simple as adding a `backup` configuration block to the PostgreSQL cluster object and creating a k8s secret for accessing the backup backend.
CloudNativePG currently supports Azure Blob Storage, Google Cloud Storage, and AWS S3 as backup backends. In this guide we'll use Azure Blob Storage - but the steps are very similar for the other backends. See the CloudNativePG documentation for more details.
First, let's create our backup backend secret - the secret to access the cloud backend (Azure Blob Storage in our case). Please replace `<base64-encoded-connection-string>` with the base64-encoded connection string of your Azure Blob Storage account. You can find the connection string in the Azure Console on your storage account page -> Access keys -> Connection string.
(Screenshot: Azure Blob Storage Access Keys in the Azure Console)

Create the following secret and apply it to your cluster:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: backup-creds
  namespace: postgres-namespace
data:
  AZURE_CONNECTION_STRING: <base64-encoded-connection-string>
```
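Alternatively, if you'd rather not base64-encode the connection string by hand, kubectl can create the same secret imperatively and handle the encoding for you. A minimal sketch - the connection string shown is a redacted placeholder:

```bash
# kubectl base64-encodes --from-literal values automatically.
kubectl create secret generic backup-creds \
  -n postgres-namespace \
  --from-literal=AZURE_CONNECTION_STRING='DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net'
```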
Next, update the Cluster k8s object. In the summary of how we set up our cluster, we already had our full cluster manifest. To now add backups to the cluster, simply add the highlighted `backup` section below to that manifest:
```yaml
superuserSecret:
  name: example-superuser
backup:
  barmanObjectStore:
    destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
    azureCredentials:
      connectionString:
        name: backup-creds
        key: AZURE_CONNECTION_STRING
    wal:
      compression: gzip
      maxParallel: 8
      encryption: AES256
    data:
      compression: gzip
      encryption: AES256
      immediateCheckpoint: false
      jobs: 2
  retentionPolicy: "30d"
storage:
  pvcTemplate:
```

Afterward, apply the manifest to your cluster.
The settings explained
- destinationPath: Destination path of the Microsoft Azure Blob Storage container. Format: `<http|https>://<account-name>.<service-name>.core.windows.net/<container>/<blob>`. Note that `<container>` refers to the name of your Blob Storage container and `<blob>` to the name of the blob inside the container. The blob will be created automatically, with the name you set here (see the annotated example after this list).
- connectionString: Reference to the secret which stores the connection string.
- wal: Defines the WAL archiving/recovery behavior:
  - maxParallel: Number of WAL files to be archived or restored in parallel.
  - compression: Whether to compress the WAL files. Options are: gzip, bzip2, snappy. Off by default.
  - encryption: Whether to encrypt the WAL files. Options are: AES256 or aws:kms. Leave empty to use the backup backend storage policy.
- data: Defines the data backup behavior:
  - immediateCheckpoint: If set to true, an immediate checkpoint will be used, meaning PostgreSQL will complete the checkpoint as soon as possible.
  - compression: Whether to compress the backups. Options are: gzip, bzip2, snappy. Off by default.
  - encryption: Whether to encrypt the backups. Options are: AES256 or aws:kms. Leave empty to use the backup backend storage policy.
  - jobs: The number of parallel jobs to be used to upload the backup.
- retentionPolicy: Defines when old backups should be deleted.
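To make the destinationPath format concrete, here is how the path used throughout this guide decomposes:

```
https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose

protocol:     https
account-name: devopsandmorebackups
service-name: blob
container:    postgres-backups
blob:         generalpurpose
```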
NOTE: That's actually all we need. Apply the cluster manifest and your CloudNativePG-operated PostgreSQL cluster is ready to take its first backup. WAL archiving, by the way, starts directly after applying these changes.
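To verify that WAL archiving really is running, you can query the cluster status. A small sketch - the second command assumes you have the optional cnpg kubectl plugin installed:

```bash
# The first recoverability point is only populated once WAL archiving
# and the first base backup have succeeded.
kubectl get cluster example-cluster -n postgres-namespace \
  -o jsonpath='{.status.firstRecoverabilityPoint}'

# With the cnpg plugin, the status view includes a continuous backup section.
kubectl cnpg status example-cluster -n postgres-namespace
```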
Executing backups
There are two ways to execute a backup.
- On-demand and
- Scheduled
While on-demand backups are fine if you need a backup immediately - e.g. because you are about to attempt complex maintenance - scheduled backups are what you use during everyday operation.
On-demand backups
For on-demand backups, simply create a `Backup` object by creating the following yaml manifest and applying it:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: general-purpose-backup
  namespace: postgres-namespace
spec:
  cluster:
    name: example-cluster
```
Directly after applying the manifest, the operator will attempt to initiate the backup. You can check progress (and potential errors) by running `kubectl describe backups -n postgres-namespace general-purpose-backup`.
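To see all backups at a glance instead of describing a single one, a plain list works as well; recent operator versions print the owning cluster and the backup phase as columns:

```bash
# List all Backup objects in the namespace, including their current phase.
kubectl get backups -n postgres-namespace
```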
Scheduled backups
Scheduled backups are no more complex to set up. Create a yaml manifest and apply it:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: general-purpose-scheduled-backup
  namespace: postgres-namespace
spec:
  # Note that this cron dialect has 6 fields - an additional one for seconds
  schedule: "1 0 0 * * 0"
  # Set this to true if you want to suspend the backup for now
  suspend: false
  # Determines if the first backup should be done immediately
  immediate: true
  # Indicates which ownerReference should be put inside the created backup resources:
  # - none: no owner reference for created backup objects (same behavior as before the field was introduced)
  # - self: sets the ScheduledBackup object as owner of the backup
  # - cluster: sets the cluster as owner of the backup
  backupOwnerReference: self
  cluster:
    name: example-cluster
```
After applying the manifest, the backup will run on the defined schedule. If `immediate` is set to true, the first backup will execute immediately.
Check the backup state by running `kubectl describe backups -n postgres-namespace general-purpose-scheduled-backup`.
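The six-field cron dialect with its leading seconds field is easy to get wrong, so here are a few illustrative `schedule` values (not recommendations):

```yaml
# Field order: seconds minutes hours day-of-month month day-of-week
schedule: "0 0 0 * * *"    # every day at midnight
schedule: "0 30 2 * * 1-5" # at 02:30 on weekdays (Monday to Friday)
schedule: "0 0 0 1 * *"    # at midnight on the first day of every month
```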
A note on how often to schedule backups
How often to schedule backups is determined mainly by how fast you need to recover after a disaster. For a point in time between two backups, Barman has to replay WAL files from the WAL archive, which takes longer than restoring from a base backup alone. That being said, backing up more often than once per week is rarely needed and simply leads to unnecessary load and costs.
Recovery
While taking backups is nice - we need to be able to recover a cluster from backup in case the unthinkable happens.
To recover a cluster from backup, we can bootstrap a new cluster by referencing the backup data. This means we can't recover our backup into an existing Cluster.
Recover from an existing Backup
If there is a `Backup` object inside the same namespace as the cluster you want to recover into, simply add the following snippet to your cluster yaml manifest (in the `spec` section):
```yaml
bootstrap:
  recovery:
    backup:
      name: general-purpose-scheduled-backup
```
Applying the manifest will create a cluster and recover from the data referenced in the backup.
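To put the snippet in context, here is a minimal sketch of what the new cluster's manifest could look like. The name `example-cluster-restored`, the instance count, and the storage size are illustrative placeholders, not values from the original post:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-cluster-restored # hypothetical name - recovery needs a fresh cluster
  namespace: postgres-namespace
spec:
  instances: 3 # placeholder - match your original setup
  bootstrap:
    recovery:
      backup:
        name: general-purpose-scheduled-backup
  storage:
    size: 10Gi # placeholder
```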
Recover from backup object storage
If there is NO `Backup` object inside the same namespace as the cluster you want to recover into, add the following `externalClusters` configuration to the `spec` section of your cluster manifest.
Replace `<your-previous-cluster-name>` with the name your cluster previously had.
If you do not recall the name of your cluster, you can inspect the Azure Blob Storage path (which is `https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose`) in the Azure Blob Storage console. The first subdirectory found in this folder is the name of the server (see the CLI sketch below the manifest).
```yaml
bootstrap:
  recovery:
    source: clusterBackup

externalClusters:
  - name: clusterBackup
    barmanObjectStore:
      serverName: "<your-previous-cluster-name>"
      destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
      azureCredentials:
        connectionString:
          name: backup-creds
          key: AZURE_CONNECTION_STRING
      wal:
        maxParallel: 8
```
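If you prefer the command line over the Azure console for the server-name lookup, listing the blobs under the path works too. A sketch using the Azure CLI, where the connection string is the same one stored in our secret:

```bash
# The first path segment below the "generalpurpose" prefix is the server
# name Barman used for the old cluster.
az storage blob list \
  --container-name postgres-backups \
  --prefix generalpurpose/ \
  --connection-string "$AZURE_CONNECTION_STRING" \
  --query "[].name" \
  --output tsv
```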
Point in time recovery
Point in time recovery means not replaying all the WALs up to the latest one, but only up to a certain point in time. This comes in handy if you messed up your database and want to restore its state of, e.g., yesterday.
Compare with the chapter Setting up Barman backups with CloudNativePG, as the `backup` configuration section needs to match the `externalClusters` configuration during recovery.
While this process is rather complex behind the scenes, CloudNativePG and Barman help us tremendously. We again simply need to define how we want to bootstrap a new cluster. As with normal recovery, we can choose to either recover from a `Backup` object or from a backup object store. As the process is similar to the chapters above, we will only demonstrate point in time recovery from the object store.
Add the following snippet to the `spec` section of your cluster yaml:
```yaml
bootstrap:
  recovery:
    source: clusterBackup
    recoveryTarget:
      targetTime: "2020-11-26 15:22:00.00000+00"

externalClusters:
  - name: clusterBackup
    barmanObjectStore:
      destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
      azureCredentials:
        connectionString:
          name: backup-creds
          key: AZURE_CONNECTION_STRING
      wal:
        maxParallel: 8
```
As you can see, it's literally the same configuration as for normal backup recovery, just with the additional `targetTime` setting.
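Besides `targetTime`, CloudNativePG's `recoveryTarget` supports other ways of pinning the recovery point. A quick sketch of the available target types - use exactly one per recovery; the values shown are placeholders:

```yaml
recoveryTarget:
  targetTime: "2020-11-26 15:22:00.00000+00" # recover up to this timestamp
  # targetXID: "12345"             # recover up to this transaction ID
  # targetName: "before-migration" # recover to a restore point created with pg_create_restore_point()
  # targetLSN: "0/3000000"         # recover up to this WAL location
  # targetImmediate: true          # stop as soon as a consistent state is reached
```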
Important to note for recovery
- You need to bootstrap a fresh cluster
- Use a different blob configuration for your recovery object and the backup of the new cluster. E.g. if the "old" cluster you want to recover from had a `<blob>` name of `postgres-backup`, use a different blob name in the `backup` section of your new cluster. You can reuse the same container though - just use a different blob name.
- The operator does NOT attempt to back up (and recover) the underlying secrets. Make sure to back them up with your regular k8s backups (see the sketch below).
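As a simple starting point for that last bullet, you can export the relevant secrets with plain kubectl; `example-superuser` is the superuser secret from our manifest:

```bash
# Dump the backup credentials and the superuser secret to a file so they
# can be re-applied before recovering into a fresh cluster.
kubectl get secret backup-creds example-superuser \
  -n postgres-namespace -o yaml > postgres-secrets-backup.yaml
```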
Summary
Adding backups to a CloudNativePG operated, highly available PostgreSQL cluster is rather easy.
- Add the following sections to your Cluster yaml manifest and apply it:
```yaml
superuserSecret:
  name: example-superuser
backup:
  barmanObjectStore:
    destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
    azureCredentials:
      connectionString:
        name: backup-creds
        key: AZURE_CONNECTION_STRING
    wal:
      compression: gzip
      maxParallel: 8
      encryption: AES256
    data:
      compression: gzip
      encryption: AES256
      immediateCheckpoint: false
      jobs: 2
  retentionPolicy: "30d"
storage:
  pvcTemplate:
```
- Add a scheduled backup by creating a yaml manifest and applying it:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: general-purpose-scheduled-backup
  namespace: postgres-namespace
spec:
  # Note that this cron dialect has 6 fields - an additional one for seconds
  schedule: "1 0 0 * * 0"
  # Set this to true if you want to suspend the backup for now
  suspend: false
  # Determines if the first backup should be done immediately
  immediate: true
  # Indicates which ownerReference should be put inside the created backup resources:
  # - none: no owner reference for created backup objects (same behavior as before the field was introduced)
  # - self: sets the ScheduledBackup object as owner of the backup
  # - cluster: sets the cluster as owner of the backup
  backupOwnerReference: self
  cluster:
    name: example-cluster
```
- (Optional) Run an on-demand backup by creating a yaml manifest and applying it:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: general-purpose-backup
  namespace: postgres-namespace
spec:
  cluster:
    name: example-cluster
```
To check the status of your backups, simply run `kubectl describe backups -n postgres-namespace <name-of-your-backup>`.