Occasion:
It's nice to have a backup solution with remote offsite sync, but when your onsite backup server goes offline, you have no backup for that day. A sysadmin sleeps better knowing there is a highly available onsite backup solution for PBS. My zfs-over-iscsi HA storage solution inspired me to build such an HA PBS.
If you are interested in having such a solution directly from Proxmox, vote for this Bugzilla enhancement!
IMPORTANT INFORMATION:
This solution is based on my zfs-over-iscsi HA storage solution, so the information there is also relevant for this solution. I have tested it only in a test environment, not in production.
Use case for such a solution:
This solution is especially for companies that want a highly available Proxmox Backup Server.
Available functions:
It's a normal PBS installation, so you get a fully functional PBS. Only tape backup is a bit more difficult, because you need a tape drive with two SAS ports to attach both nodes to the same drive. As far as I know, only two vendors on the market offer that, so for tape backup users a single-node PBS is the better choice.
Key configuration of this solution:
Three PBS folders must be made available to the active node over shared storage.
HW-Requirements:
- 2 nodes with 2 high-bandwidth network ports in a bond, 1 management port
- min. 1 Dual-Controller JBOD Shelf
Originally I wanted to install PBS 8 on top of a normal Debian installation, but after installing the PBS packages the node hung while rebooting. So I reversed the order and installed the additionally needed clustering packages on top of a standard PBS 8 ISO installation.
My test environment base setup:
The same as in the HA zfs-over-iscsi project. So have a look there.
Additional used pakets:
pacemaker, corosync, pcs, network-manager, sbd, watchdog
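These are regular Debian packages and can be installed on top of the PBS installation, for example with:
Code:
# apt update
# apt install pacemaker corosync pcs network-manager sbd watchdog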
Needed resource agents:
- zfs
- [filesystem] -> if you want to go with BTRFS
- ipaddr2
- systemd
Fencing solution:
The same as in the HA zfs-over-iscsi project. So have a look there.
Tuning:
The same as in the HA zfs-over-iscsi project. So have a look there.
Configuration:
The configuration steps for building the cluster and the pacemaker configuration for ZFS and the cluster IP are the same as in the HA zfs-over-iscsi project, so have a look there.
Only the resource agent for the iSCSI service and the resource group are not needed. The additional steps for the PBS HA configuration follow.
- Copy the folders "/etc/proxmox-backup", "/var/lib/proxmox-backup" and "/var/log/proxmox-backup" from one node to the ZFS pool on the shared JBOD and delete the source folders on both nodes, as sketched below.
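A minimal sketch of the copy step, assuming the shared pool is mounted at /zpool1 with the datasets pbs-etc, pbs-lib and pbs-log used by the symlinks below (rsync -a preserves owners and permissions; run the rsync commands on one node only, the deletion on both nodes):
Code:
# rsync -a /etc/proxmox-backup/ /zpool1/pbs-etc/
# rsync -a /var/lib/proxmox-backup/ /zpool1/pbs-lib/
# rsync -a /var/log/proxmox-backup/ /zpool1/pbs-log/
# rm -rf /etc/proxmox-backup /var/lib/proxmox-backup /var/log/proxmox-backup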
- Then, on both nodes, create symlinks from the original paths to your shared ZFS pool. Pay attention to the permissions! They must be exactly the same as before.
# ln -s /zpool1/pbs-etc/ /etc/proxmox-backup
# ln -s /zpool1/pbs-log/ /var/log/proxmox-backup
# ln -s /zpool1/pbs-lib/ /var/lib/proxmox-backup
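If in doubt, the ownership and permissions on the pool can be checked with
# ls -ld /zpool1/pbs-etc /zpool1/pbs-lib /zpool1/pbs-log
and corrected with chown/chmod where they differ from the originals.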
------> all following steps must be done on both nodes
- disable the following PBS services (see the example command after this list)
- proxmox-backup.service
- proxmox-backup-proxy.service
- proxmox-backup-banner.service
- pbs-network-config-commit.service (network config only over network-manager)
- proxmox-backup-daily-update.timer
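For example:
Code:
# systemctl disable proxmox-backup.service proxmox-backup-proxy.service proxmox-backup-banner.service pbs-network-config-commit.service proxmox-backup-daily-update.timer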
- To make sure that neither of the first two services actually starts, you have to write your own systemd service.
- create a script file with
# nano /root/stop-pbs.sh
and the content:
Code:
#!/bin/bash
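# stop the PBS services that are still started at boot despite being disabled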
systemctl stop proxmox-backup
systemctl stop proxmox-backup-proxy
- make the script executable with
# chmod 710 /root/stop-pbs.sh
- create a new systemd service file with
# nano /lib/systemd/system/proxmox-stop-PBS.service
- give this file the following content:
Code:
[Unit]
Description=Stop the PBS services after boot
Wants=proxmox-backup.service proxmox-backup-proxy.service
Before=corosync.service pacemaker.service

[Service]
# Type=oneshot makes systemd wait for the script, so the stops complete before corosync/pacemaker start
Type=oneshot
ExecStart=/root/stop-pbs.sh
[Install]
WantedBy=multi-user.target
- enable this new service
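For example:
# systemctl enable proxmox-stop-PBS.service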
- configure watchdog to use the softdog module (see point sources)
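A minimal sketch, assuming the Debian defaults of the watchdog package: in /etc/default/watchdog set
Code:
watchdog_module="softdog"
and in /etc/watchdog.conf uncomment
Code:
watchdog-device = /dev/watchdog
then enable the service with
# systemctl enable watchdog.service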
- comment out the line "
#After=multi-user.target
" of the file /lib/systemd/system/watchdog.service - add the service "watchdog.service" to the line
After=systemd-modules-load.service iscsi.service watchdog.service
in the file /lib/systemd/system/sbd.service
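Note: direct edits under /lib/systemd/system are overwritten on package updates. For the sbd dependency, a drop-in created with "systemctl edit sbd.service" survives upgrades and achieves the same, since After= entries are additive:
Code:
[Unit]
After=watchdog.service
(The watchdog.service change cannot be done this way, because After= entries cannot be removed via a drop-in.)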
- add the PBS resources to the pacemaker configuration:
Code:
# pcs resource create res_proxmox-backup systemd:proxmox-backup
# pcs resource create res_proxmox-backup-proxy systemd:proxmox-backup-proxy
# pcs resource create res_proxmox-backup-banner systemd:proxmox-backup-banner
# pcs resource create res_proxmox-backup-daily-update_timer systemd:proxmox-backup-daily-update.timer
# pcs resource group add grp_pbs_cluster res_zpool1 res_cluster-ip res_cluster-ip_MGMT res_proxmox-backup res_proxmox-backup-proxy res_proxmox-backup-banner res_proxmox-backup-daily-update_timer
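A pacemaker group implies both ordering and colocation: the pool is imported first, then the cluster IPs are brought up, then the PBS services are started, and everything fails over to the other node as one unit.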
How the cluster status could look:
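With the configuration above, pcs status on either node should show the group grp_pbs_cluster with all resources started on the active node:
# pcs status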

Errors and solutions:
- In contrast to a standard Debian installation, the softdog module is blacklisted in the PBS ISO installation. I didn't find where the entries for that configuration are located, so it was time for a workaround: using the watchdog package.
- If you use the watchdog package in combination with the sbd package, you have to define a boot dependency for sbd so that the watchdog service starts before the sbd service. See above.
- If you make that change, the system runs into a problem on the next boot: it shows a dependency-cycle error and pacemaker etc. are not started. The reason is the unfortunate default dependency "After=multi-user.target" in watchdog.service; you have to comment out that line. See above.
- Although the services "proxmox-backup" and "proxmox-backup-proxy" were disabled, they still started at boot. I think they were initiated by another task, but I could not find out which one. So the workaround was to create my own systemd service that stops both services after booting. See above.
Tested failure scenario:
Of course I tried a failover while a backup job was running, and of course that could not go well: the whole HTTP/TCP stream breaks and the connection is reset. That is not a problem, though, because surviving a running backup was not the goal of this project. The next VM backup after the failover runs without problems.
Used information sources:
The same as in the HA zfs-over-iscsi project, so have a look there. In addition, the following:
- https://www.supertechcrew.com/watchdog-keeping-system-always-running/
- https://www.digitalocean.com/commun...ing-systemd-units-and-unit-files#introduction
Everyone can also send a direct message to floh8 if there are questions about this solution.