Back up and restore

Back up and restore the Konvoy cluster

For production clusters, regular maintenance should include routine backup operations to ensure data integrity and reduce the risk of data loss due to unexpected events. Backup operations should include the cluster state, application state, and the running configuration of both stateless and stateful applications in the cluster.

As a production-ready solution, Konvoy provides the Velero add-on by default, to support backup and restore operations for your Kubernetes cluster and persistent volumes.

For on-premises deployments, Konvoy deploys Velero integrated with Minio, operating inside the same cluster. For production use cases, you should provide an external storage volume for Minio to use.

NOTE If you intend to use the cluster without an external storage volume for Minio, you should fetch the latest backup and store it in a known, secured location at a regular interval. For example, if you aren’t using an external storage volume, you should back up and archive the cluster on a weekly basis.

Install the Velero command-line interface

Although installing the Velero command-line interface is optional and independent of deploying a Konvoy cluster, having access to the command-line interface provides several benefits. For example, you can use the Velero command-line interface to back up or restore a cluster on demand, or to modify certain settings without changing the Velero platform service configuration.

By default, Konvoy sets up Velero to use Minio over TLS using a self-signed certificate. Currently, the Velero command-line interface does not handle self-signed certificates. Until an upstream fix is released, please use our patched 1.0.0 version of Velero.

Regular backup operations

For production clusters, you should be familiar with the following basic administrative functions Velero provides:

Set a backup schedule

By default, Konvoy configures Velero to back up the cluster’s state automatically on a regular schedule. The default settings do the following:

  • create backups on a daily basis
  • save the data from all namespaces

These default settings take effect after the cluster is created. If you install Konvoy with the default platform services deployed, the initial backup starts after the cluster is successfully provisioned and ready for use.

The Velero CLI provides an easy way to create alternate backup schedules. For example:

velero create schedule thrice-daily --schedule="@every 8h"

To change the default backup service settings:

  1. Check the backup schedules currently configured for the cluster by running the following command:

    velero get schedules
    
  2. Delete the velero-kubeaddons-default schedule by running the following command:

    velero delete schedule velero-kubeaddons-default
    
  3. Replace the default schedule with your custom settings by running the following command:

    velero create schedule velero-kubeaddons-default --schedule="@every 24h"
    

You can also create backup schedules for specific namespaces. Creating a backup for a specific namespace can be useful for clusters running multiple apps operated by multiple teams. For example:

velero create schedule system-critical --include-namespaces=kube-system,kube-public,kubeaddons --schedule="@every 24h"
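Schedules can also be given as a standard cron expression, and the `--ttl` flag controls how long each backup is kept. The following is a minimal sketch that wraps both in a helper for per-team schedules. The function name `create_team_schedule`, the example names, and the default values are hypothetical; they are not part of Konvoy or Velero.

```shell
# Sketch only: create_team_schedule is a hypothetical wrapper around
# `velero create schedule`. All names and defaults are illustrative.
create_team_schedule() (
  schedule_name="$1"            # for example: team-a-nightly
  namespaces="$2"               # comma-separated, for example: team-a,team-a-staging
  cron_expr="${3:-0 1 * * *}"   # assumed default: every day at 01:00
  ttl="${4:-168h0m0s}"          # assumed default: keep each backup for 7 days

  if [ -z "$schedule_name" ] || [ -z "$namespaces" ]; then
    echo "usage: create_team_schedule NAME NAMESPACES [CRON] [TTL]" >&2
    exit 1
  fi

  velero create schedule "$schedule_name" \
    --include-namespaces="$namespaces" \
    --schedule="$cron_expr" \
    --ttl "$ttl"
)
```

For example, `create_team_schedule team-a-nightly team-a` would request a nightly backup of the `team-a` namespace only, kept for one week.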

The Velero command-line interface provides many more options worth exploring. You can also find tutorials for disaster recovery and cluster migration on the Velero community site.

Fetch a backup archive

To list the available backup archives in your cluster, run the following command:

velero backup get

To download a selected archive to your current working directory on your local workstation, run a command similar to the following:

velero backup download BACKUP_NAME
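If you run without an external storage volume for Minio, you will want downloaded archives to land in a known, secured location rather than the current directory. The following is a minimal sketch of such a helper. The function name `archive_backup` and the default directory are hypothetical, and it assumes the `--output` flag of `velero backup download` to set the target file.

```shell
# Sketch only: archive_backup and its default directory are hypothetical,
# not something Konvoy or Velero ships.
archive_backup() (
  backup_name="$1"
  dest_dir="${2:-$HOME/velero-archives}"

  if [ -z "$backup_name" ]; then
    echo "usage: archive_backup BACKUP_NAME [DEST_DIR]" >&2
    exit 1
  fi

  mkdir -p "$dest_dir" || exit 1

  # --output sets the file path; without it, Velero writes the archive
  # to the current working directory.
  velero backup download "$backup_name" \
    --output "$dest_dir/$backup_name.tar.gz"
)
```

You could call this from a weekly job, for example `archive_backup BACKUP_NAME /mnt/secure-backups`, where the destination path is whatever secured location you have chosen.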

Back up on demand

In some cases, you might find it necessary to create a backup outside of the regularly scheduled interval. For example, if you are preparing to upgrade any components or modify your cluster configuration, you should perform a backup immediately before taking that action.

To create a backup on demand, run a command similar to the following:

velero backup create BACKUP_NAME
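Before an upgrade, it helps to give the backup a recognizable, unique name and to wait until it has finished before you proceed. The following is a minimal sketch, assuming the `--wait` flag of `velero backup create` and the `velero backup describe` subcommand. The function name `pre_change_backup` and the naming scheme are hypothetical.

```shell
# Sketch only: pre_change_backup and the "<label>-<timestamp>" naming
# scheme are hypothetical conventions, not Konvoy defaults.
pre_change_backup() (
  label="${1:-pre-upgrade}"
  # Backup names must be lowercase alphanumerics and dashes.
  name="$label-$(date +%Y%m%d-%H%M%S)"

  # --wait blocks until the backup completes, so the next step only
  # runs after Velero has finished processing it.
  velero backup create "$name" --wait || exit 1

  # Show the final status and any warnings or errors.
  velero backup describe "$name"
)
```

Running `pre_change_backup pre-upgrade` immediately before the change gives you a named restore point that is easy to find later with `velero backup get`.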

Restore a cluster

Before attempting to restore the cluster state using the Velero command-line interface, you should verify the following requirements:

  • The backend storage, Minio, is still operational.
  • The Velero platform service in the cluster is still operational.
  • The Velero platform service is set to restore-only mode, so that no backups run during the restore.

To list the available backup archives for your cluster, run the following command:

velero backup get

To set Velero to restore-only mode, modify the Velero add-on in the ClusterConfiguration section of the cluster.yaml file:

addons:
...
- name: velero
  enabled: true
  values: |-
    configuration:
      restoreOnlyMode: true
...

Then apply the configuration change by running the following command:

konvoy deploy addons -y

Finally, verify that the configuration change was applied correctly by running the following command:

helm get values velero-kubeaddons

To restore cluster data on demand from a selected backup available in the cluster, run a command similar to the following:

velero restore create --from-backup BACKUP_NAME
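A restore is easier to track if you give it an explicit name and wait for it to finish. The following is a minimal sketch, assuming the Velero 1.0 command-line syntax in which the source backup is selected with `--from-backup` and the positional argument is the name of the restore. The function name `restore_from_backup` and the restore naming scheme are hypothetical.

```shell
# Sketch only: restore_from_backup and the "<backup>-restore-<timestamp>"
# naming scheme are hypothetical.
restore_from_backup() (
  backup_name="$1"

  if [ -z "$backup_name" ]; then
    echo "usage: restore_from_backup BACKUP_NAME" >&2
    exit 1
  fi

  restore_name="$backup_name-restore-$(date +%Y%m%d%H%M%S)"

  # --wait blocks until the restore has finished processing.
  velero restore create "$restore_name" \
    --from-backup "$backup_name" --wait || exit 1

  # Review the restore phase, warnings, and errors.
  velero restore describe "$restore_name"
)
```

After the restore completes, remember to set `restoreOnlyMode` back to its previous value so that scheduled backups resume.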

Enable or disable the backup add-on

You can enable or disable the Velero platform service add-on in the ClusterConfiguration section of the cluster.yaml file. For example, the following settings enable the Velero add-on:

addons:
- name: velero
  enabled: true
...

If you want to replace the Velero platform service add-on with a different backup add-on service, you can disable the Velero add-on by modifying the ClusterConfiguration section of the cluster.yaml file as follows:

addons:
- name: velero
  enabled: false
...

Before disabling the Velero platform service add-on, however, be sure you have a recent backup that you can use to restore the cluster if there is a problem switching to the new backup service.

After making changes to your cluster.yaml, you must run konvoy up to apply them to the running cluster.

Backup service diagnostics

You can check whether the Velero service is currently running on your cluster through the operations portal, or by running the following kubectl command:

kubectl get all -n velero

If the Velero platform service add-on is currently running, you can generate diagnostic information about Velero backup and restore operations. For example, you can run the following commands to retrieve backup and restore information that you can use to assess the overall health of Velero in your cluster:

velero get schedules
velero get backups
velero get restores
velero get backup-locations
velero get snapshot-locations
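When you need to share this information, for example in a support request, it can be convenient to capture all of it in one file. The following is a minimal sketch that runs the commands above in sequence. The function name `collect_velero_diagnostics` and the default output file name are hypothetical.

```shell
# Sketch only: collect_velero_diagnostics and the default file name
# are hypothetical.
collect_velero_diagnostics() (
  out_file="${1:-velero-diagnostics.txt}"

  # Start with an empty file; stop if it cannot be written.
  : > "$out_file" || exit 1

  for resource in schedules backups restores backup-locations snapshot-locations; do
    {
      echo "== velero get $resource =="
      # Keep going if one command fails, and record the failure.
      velero get "$resource" 2>&1 || echo "(command failed)"
      echo
    } >> "$out_file"
  done

  echo "wrote $out_file"
)
```

Each section of the output file is headed by the command that produced it, so a failure in one area, such as an unavailable backup location, is easy to spot.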