CephFS Backup - cback
CephFS backups are currently added on demand and are backed up automatically by our backup tool, cback:
- Stored in S3 Nethub cluster (Prevessin, FR)
- Backups are not consistent (there is no mount freeze or similar mechanism)
- Snapshot-based, with the following default retention (configurable per job):
  - Last 7 daily snapshots
  - Last 5 weekly snapshots
  - Last 6 monthly snapshots
- Backup repositories are encrypted (AES-256)
- Backups are periodically verified and pruned.
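The default retention above can be read as three independent keep rules whose results are combined. The following is a minimal Python sketch of that idea, assuming semantics similar to restic's `--keep-daily`, `--keep-weekly` and `--keep-monthly` options (the newest snapshot per day, ISO week, or month is kept). `select_kept` is a hypothetical helper for illustration, not part of cback, and cback's exact behaviour may differ.

```python
from datetime import datetime
from typing import Iterable, Set


def select_kept(snapshots: Iterable[datetime],
                daily: int = 7, weekly: int = 5, monthly: int = 6) -> Set[datetime]:
    """Return the snapshot timestamps that a keep-daily/weekly/monthly policy retains.

    Assumed semantics (not confirmed for cback): for each rule, walk the
    snapshots from newest to oldest and keep the newest snapshot of each of
    the last N distinct buckets (calendar day, ISO week, calendar month).
    """
    rules = [
        (daily, lambda t: t.date()),                 # one bucket per calendar day
        (weekly, lambda t: t.isocalendar()[:2]),     # (ISO year, ISO week)
        (monthly, lambda t: (t.year, t.month)),      # (year, month)
    ]
    ordered = sorted(snapshots, reverse=True)        # newest first
    kept = set()
    for limit, bucket_of in rules:
        seen = []
        for snap in ordered:
            bucket = bucket_of(snap)
            if bucket in seen:
                continue                             # an older snapshot of a bucket already covered
            if len(seen) == limit:
                break                                # this rule has filled all its buckets
            seen.append(bucket)
            kept.add(snap)
    return kept
```

With one snapshot per day for 60 days ending on 2024-02-29, this sketch keeps 11 snapshots: the last 7 days, 3 further end-of-week snapshots, and the last snapshot of January. Everything else would be a candidate for pruning.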
Add new backup job
Only jobs from
levinson can be added currently.
Log in to cback-backup.cern.ch and run the following command:
cback backup add [--repository s3_bucket_path] [--force] --instance instance --group ceph NAME SOURCE
NAME: name that identifies the backup
SOURCE: path of the share to back up
--instance: informational only; the name of the instance where the source data resides.
--group: the group of backups to which this backup belongs. Backups in the same group share common configuration, S3 credentials, etc. Always use the value shown in the synopsis above.
--force: run the backup every time, whether or not there were changes. If not specified, the backup is only triggered if the recursive mtime of the volume path is newer than the last backup snapshot.
--repository: override the default repository name, which is generated as cbackceph-<NAME>. The value must be a fully qualified S3 URL.
Example:
cback backup add --instance flax --group ceph --force alfa /cephfs-flax/volumes/_nogroup/xxxxxxxx
This prints a summary of the newly created backup, including its backup_id. The backup is still disabled at this point; to enable it, run:
cback backup enable <backup_id>
Note that once enabled, the first backup starts right away if a backup agent is free. Each subsequent backup starts 24h after the previous one finishes.
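Whether a given run actually has work to do depends on the `--force` rule described earlier: without it, a backup is only triggered if the recursive mtime of the source is newer than the last backup snapshot. The following minimal Python sketch illustrates that check. It is not cback's code: `recursive_mtime` and `needs_backup` are hypothetical names, and walking the tree is only a portable stand-in. CephFS also exposes a recursive change time through the `ceph.dir.rctime` virtual extended attribute, but whether cback relies on it is not stated here.

```python
import os


def recursive_mtime(root: str) -> float:
    """Return the newest mtime (seconds since the epoch) found under `root`, root included."""
    newest = os.lstat(root).st_mtime
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # lstat: look at the entry itself, do not follow symlinks
            entry_mtime = os.lstat(os.path.join(dirpath, name)).st_mtime
            newest = max(newest, entry_mtime)
    return newest


def needs_backup(root: str, last_snapshot_ts: float, force: bool = False) -> bool:
    """Decide whether a run should back up `root`.

    With force=True the backup always runs; otherwise it runs only when
    something under `root` was modified after the last snapshot.
    """
    return force or recursive_mtime(root) > last_snapshot_ts
```

For example, `needs_backup("/some/share", last_snapshot_ts)` returns `False` for an untouched share, while passing `force=True` makes it return `True` regardless of modification times.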
Enable prune. This will enable the purging of old backups using the retention policy indicated above.
cback prune enable <backup_id>
Specify the desired backup start time.
For more control over when a backup is performed, set a desired start time:
cback backup modify <backup_id> --desired-start-time 20:00
Note: this is a desired time, not an exact time. The backup will start when there is a free agent after that time.
Note: having many backups start at the same time could put load on the backend, so the recommendation is to use the default scheduling unless a start time is specifically requested.
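To make the "desired, not exact" semantics concrete, here is a small Python sketch. It is an illustration under stated assumptions, not cback's scheduler: `earliest_start` is a hypothetical helper, and the rule "next occurrence of HH:MM at or after a reference time" is an assumption. The real start additionally depends on when an agent becomes free.

```python
from datetime import datetime, time, timedelta


def earliest_start(reference: datetime, desired: str = "20:00") -> datetime:
    """Return the first moment at or after `reference` whose clock time is `desired` (HH:MM).

    This is only a lower bound for the start of the backup: the job would
    begin at this time or later, once a backup agent is available.
    """
    hour, minute = map(int, desired.split(":"))
    candidate = datetime.combine(reference.date(), time(hour, minute))
    if candidate < reference:
        # The desired time has already passed today, so move to tomorrow
        candidate += timedelta(days=1)
    return candidate
```

Under these assumptions, a job that becomes eligible at 18:30 with a desired start time of 20:00 could start no earlier than 20:00 the same day, and one that becomes eligible at 21:00 would wait until 20:00 the next day.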
Restore data - TBD
There are several ways to restore data from a backup repository:
- Using cback asynchronous restore jobs
  - Fast restore. Ideal for big restores.
- Mounting the backup repository
  - Slow restore. Ideal for single files or small sets, or for checking the status of the backups when it is not clear what to look for.
- Using vanilla restic.
Future work will allow users to interact with the backup by themselves.