Getting Started

Note

This chapter assumes that your systems meet all required prerequisites and you have abgleich installed. For details see chapter on prerequisites and installation.

Workflow

The fundamental idea is to synchronize portions of two zpools on potentially, but not necessarily, two different physical computers. The first one is called the source, the second one is called the target. A typical workflow has three steps:

  1. snap: Creating snapshots on the source side for datasets/volumes containing changes since last snapshot.

  2. sync: Copying those snapshots currently not present on the target from the source to the target.

  3. free: Removing all but the last n snapshots per dataset/volume on the source side.

Example ZFS deployment

Two machines are assumed to exist: The sourcebox and the targetbox. abgleich runs on the sourcebox. Each machine has a zpool, conveniently named sourcepool and targetpool. The sourcebox might be a workstation or some kid of a server holding data for e.g. virtual machines. This is where “data is created/changed”. The target might be a hot spare machine or simply a remote backup machine. This is where data is synchronized to. Everything below the dataset sourcepool should be synced below the dataset targetpool/sourcebackup.

The setup on sourcebox might look as follows:

NAME                                                 USED  AVAIL     REFER  MOUNTPOINT
sourcepool                                          35.5G  95.4G      112K  /home
sourcepool/user_a                                   1.09M  95.4G      532K  /home/user_a
sourcepool/user_b                                   35.5G  95.4G     13.8G  /home/user_b
sourcepool/user_b/CACHE                             8.82G  95.4G     8.82G  /home/user_b/.cache
sourcepool/user_b/DESKTOP                            2.7G  95.4G      2.7G  /home/user_b/Desktop
sourcepool/user_b/SSH                              120.7K  95.4G      5.2M  /home/user_b/.ssh

The setup on targetbox might look as follows:

NAME                                                 USED   AVAIL     REFER  MOUNTPOINT
targetpool                                            64k  570.1G      173K  /backups

Preparation

For simplicity, this introduction assumes that user root is used to perform ZFS operations. While abgleich is supposed to run as a regular user, it can delegate operations via sudo or sudo -u user assuming that this does not require a passsword. The simple-most, though not the most secure option on both source and target boxes is to edit /etc/sudoers and add a line as follows:

myusername ALL=root NOPASSWD: /usr/sbin/zfs

This allows the user myusername to run zfs without password via sudo.

The zpool is used to back the /home directory. Snapshots or backups of /home itself are not required, but basically almost everything below. While user_a simply sits on a single dataset, user_b has a more fine-grained structure. The .cache is meaningless and is not supposed to be included in backups. On the other hand, .ssh is critical and always requires snaphots and backups, regardless of changes. The rest should only receive snapshots and backups if there actually are changes.

On the source box, abgleich needs to be configured accordingly:

sudo zfs set abgleich:snap=never sourcepool/user_b/CACHE  # no snapshots
sudo zfs set abgleich:sync=off sourcepool/user_b/CACHE  # no sync, no free
sudo zfs set abgleich:snap=always sourcepool/user_b/SSH

On the target box, the stage needs to be prepared:

sudo zfs create -o mountpoint=none targetpool/sourcepool

Setting the mountpoint to none is a convenient choice for a backup as it is not meant to be modified and it avoids mount-permission problems with non-root users on Linux.

Backup Cycle

A full cycle may now look as follows:

abgleich snap root%sourcepool/
abgleich sync root%sourcepool/ targetbox:root%targetpool/sourcepool/
abgleich free root%sourcepool/ targetbox:root%targetpool/sourcepool/

The tailing slash, /, indicates that abgleich should work on datasets below a given dataset, but not on the dataset itself.

The cycle can be repeated as many times as required.