Wise to use PBS as VM in HA?

apfel

New Member
Jan 13, 2024
For an HA cluster with 3 nodes, what's a good practice for setting up PBS?

1. one dedicated bare metal server for pbs?
2. pbs as vm on one of the cluster nodes?
 
There are several approaches, as usual. And @Dunuin is right, as usual :)

Probably you have a Homelab. I do it this way:
  1. My "main" PBS is installed bare metal in parallel to PVE, so this hardware is an additional cluster member. It is not meant to run any VMs, only PBS. Nevertheless I gain some advantages, like a more stable quorum and the possibility to migrate some LXCs/VMs to this machine. This one runs 24/7.
  2. Additionally I have PBS instances which are turned on only once a week or once a month, respectively. One is "synced" off the primary, another PBS gets fully independent backups. This is old hardware with relatively high electricity consumption, but it is ideal for this job as it runs only a few hours per week/month.
This is my homelab, so I opted for low energy: compute nodes are on MinisForum devices, the mentioned "additional node" is an ODROID...
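
To illustrate why an extra cluster member helps here: a minimal sketch, assuming every node has exactly one vote and the cluster needs a strict majority of votes to stay quorate. The function names are made up for illustration.

```python
def quorum(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    if total_votes < 1:
        raise ValueError("a cluster needs at least one vote")
    return total_votes // 2 + 1


def tolerated_failures(total_votes: int) -> int:
    """How many single-vote nodes may be down while the rest stay quorate."""
    return total_votes - quorum(total_votes)


if __name__ == "__main__":
    for nodes in (2, 3, 4):
        print(f"{nodes} nodes: quorum={quorum(nodes)}, "
              f"may lose {tolerated_failures(nodes)}")
```

Under these assumptions, going from two compute nodes to two compute nodes plus the PBS machine is what lets one node fail without the cluster losing quorum.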

YMMV!
 
Unfortunately you always have to cut some corners in a homelab.

The problem with PBS virtualized on a PVE node, or even installed bare metal in parallel to PVE, is what happens if a PVE node gets compromised. If there is ever a massive vulnerability in PVE and an attacker gets root access, you are screwed. If your nodes are unclustered but all on the same PVE version, they are all attackable the same way. It is even worse with a cluster, as root access to a single node gives the attacker root access to all cluster nodes.
Once a PVE host gets compromised, the attacker has full access to all its hardware and guests. So he could destroy your guests as well as all your backups of those guests.
The idea of a bare metal PBS (without PVE in parallel) is that such a PVE vulnerability may not affect PBS. So even if all your PVE nodes get compromised and all guests destroyed, you still have that uncompromised PBS host with all your backups. You then also set the PBS privileges so the PVE nodes are only allowed to create or restore backups, but not to prune or delete them. That way you get ransomware protection, and even a compromised PVE host won't be able to destroy your backups.
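
The privilege idea can be sketched as a tiny permission model. This is only an illustration, not the PBS API: the role names are the ones I understand the PBS documentation to use (DatastoreBackup for backup and restore, DatastorePowerUser for backup, restore and prune), so treat the mapping as an assumption and check it against your PBS version. `is_allowed` is a hypothetical helper.

```python
# Simplified, illustrative model of "PVE may back up and restore,
# but may not prune". Role names and their action sets are assumptions
# taken from my reading of the PBS docs, not queried from a real server.
ROLE_ACTIONS = {
    "DatastoreBackup": {"backup", "restore"},
    "DatastorePowerUser": {"backup", "restore", "prune"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the (modelled) role permits the action."""
    return action in ROLE_ACTIONS.get(role, set())


if __name__ == "__main__":
    # The account the PVE nodes use should only get the restricted role.
    for action in ("backup", "restore", "prune"):
        print(action, is_allowed("DatastoreBackup", action))
```

The point of the design is that pruning is then done by PBS itself (or by a separate account), so stolen PVE credentials alone are not enough to remove old backups.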

But I'm doing it similarly to UdoB.
My PBSs are running as LXCs (wouldn't do that again... I'm just too lazy to turn them into VMs...) on PVE nodes. I make sure no SSH/webUI/IPMI of the PVE nodes is publicly accessible, and ideally you also isolate PVE management on a separate network so even a compromised client in your LAN or a compromised guest OS won't be able to reach the PVE management. So hopefully the PVE hosts won't get compromised in the first place. And one of my PVE nodes is only for backup purposes and only runs once per week. So if a PVE node gets compromised, it hopefully happens on one of the days when the PVE node with the backups is powered down and not attackable.
My two PBS LXCs sync with each other, so I have two local copies of all backups, and one of those copies can't be pruned or deleted by the PVE nodes.
Then there is a third PBS offsite, where a third copy of the backups of the most important guests is stored.

While not that safe, running PBS virtualized (or bare metal in parallel to PVE) has some big advantages if you are on a low budget:
- it saves space, as you don't need to find room for an additional server
- it saves electricity if you run it on a PVE node that is running 24/7 anyway, instead of running an additional dedicated PBS 24/7
- PBS is great for backing up VMs and LXCs but lacks clients to back up other devices. So for backing up your Windows gaming PCs, Windows/Mac laptops, smartphones, tablets, ... you probably want to host some additional backup solutions like Bacula, Veeam, UrBackup, ... or simply a NAS for FTP/Rsync/SMB/NFS/Syncthing/Nextcloud to sync your data to. Here it comes in very handy to run a PVE with all those things virtualized, so all of them can share the same hardware and disks as backup storage.
- when running PBS virtualized (or bare metal in parallel to PVE), backups might be way faster: when PVE backs up to a PBS on the same server, the packets don't have to leave the server, so you are only limited by your CPU and storage performance and not bottlenecked by the NIC.
- when running PBS virtualized (or bare metal in parallel to PVE) and one of your other PVE nodes fails, you can temporarily restore your guests locally while you fix the broken PVE node. That is not so important when running a cluster. But think of the case where you can only afford 2 servers: "PVE A running all guests + PBS" or "PVE A running all guests + PVE B with no guests except for a PBS VM". In the first case, if PVE A fails you are pretty screwed, with no way to run your important services until you fix that single PVE node. In the second case you can restore your daily backups from the PBS VM to the empty PVE B node and later move them back to PVE A once it is fixed. This is also the reason why I over-dimensioned the PVE node which I only use for storing backups. It doesn't need all that RAM and CPU power, but it is nice to have so I can temporarily restore several VMs there.
- you should test your backups. When running PBS virtualized (or bare metal in parallel to PVE) you can do some fast local restores to an isolated subnet on the same host using another VMID, test if the restored guest is working as expected and destroy it again without affecting any of the other PVE nodes.
- it's not that great to put all your bets on the same horse. So maybe you want to store some additional VZDump backups on an NFS/SMB share in case there is a major bug in PBS. Then, even when losing all your PBS backups, there are some old VZDump backups as a last resort. Here it would also be handy to have a NAS and PBS virtualized on the same server. Otherwise you would need to set up an SMB/NFS server yourself via CLI on a bare metal PBS. That in turn makes user error more likely, and you never know if your changes might negatively affect your PBS installation.
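
The NIC bottleneck argument from the list above, as back-of-the-envelope arithmetic. A rough sketch: the throughput figures below are invented example values, and real PBS backups are incremental and deduplicated, so actual transfer sizes will differ. `backup_hours` is a hypothetical helper.

```python
def backup_hours(size_gb: float, *stage_mb_per_s: float) -> float:
    """Estimate transfer time when the slowest stage limits throughput.

    size_gb        -- amount of data to move, in GB (1 GB = 1000 MB here)
    stage_mb_per_s -- sustained throughput of each stage in MB/s,
                      e.g. source disk, NIC, target disk
    """
    if not stage_mb_per_s:
        raise ValueError("give at least one stage throughput")
    bottleneck = min(stage_mb_per_s)
    return size_gb * 1000 / bottleneck / 3600


if __name__ == "__main__":
    disk = 400.0   # MB/s, assumed storage throughput
    nic = 117.0    # MB/s, roughly what a 1 Gbit/s link delivers (assumed)
    print(f"over the network: {backup_hours(500, disk, nic):.2f} h")
    print(f"on the same host: {backup_hours(500, disk):.2f} h")
```

With these made-up numbers the network path is limited by the NIC, while the same-host path is limited only by storage.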
 
It is important to have something like the famous 3-2-1 backup rule = (at least) three backups on two different media and one offline/offsite. And it is important that it mostly runs automatically - nobody creates backups every day manually. Additionally I differentiate between "the VMs" and actual data like family photos - you can't have too many backups of those.

The very first step is to realize this :cool:
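
The 3-2-1 rule can be written down as a simple check. A minimal sketch; the `BackupCopy` type and the example inventory are hypothetical and only mirror the wording of the rule above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    medium: str               # e.g. "hdd", "ssd", "tape", "cloud"
    offsite_or_offline: bool  # physically elsewhere or powered off/air-gapped


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """At least three copies, on at least two media, one offsite/offline."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.medium for copy in copies}) >= 2
    has_remote = any(copy.offsite_or_offline for copy in copies)
    return enough_copies and enough_media and has_remote


if __name__ == "__main__":
    inventory = [
        BackupCopy("main PBS", "ssd", False),
        BackupCopy("weekly PBS", "hdd", True),
        BackupCopy("offsite PBS", "hdd", True),
    ]
    print(satisfies_3_2_1(inventory))
```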
 
+ Use a tape drive to get air-gapped backups. Don't skip this.
In my opinion that would be a big mistake. LTO drives and tapes only have limited compatibility; if the drive breaks, you may have to buy expensive old scrap. It is better to back up encrypted data to a cloud using the pull principle. The PBS, for example, can also be set up so that backups cannot be deleted from the PVE side. In principle, different systems should always be used for such backups and, if possible, with the pull principle. This means that a possible attacker cannot see that there are additional backups.

Otherwise, I would prefer to rotate several HDDs or SSDs. If you use the storage media regularly, you will quickly notice defects.
 
A tape drive won't hurt, right? It's not meant to replace disk-to-disk backups. It's meant to give you a full copy of your data (maybe weekly) that you can physically make inaccessible and really take offline (and optionally relocate to a secure vault) in case something goes terribly wrong.
Having the right ACL set in Proxmox won't save you in case Proxmox/PBS has a security flaw or an attacker breaks into your PBS through lateral movement to your management network or IPMI. Then you will really pray for the tapes that will ultimately save you. It is the only way to really make you ransomware resilient. Even if you use WORM HDDs (which exist), you rely on the vendor having implemented the WORM mechanism correctly and without any backdoor. That is something you will never be able to verify yourself.

Next thing, regarding hard disks as a backup medium: hard disks are insanely complex, and there have been numerous cases where firmware errors caused drives to fail and led to silent, unfixable data corruption.

Regarding cloud backups: sure, you can do that, but in terms of confidentiality a lot of customers have requirements that rule out handing your data, even if it's encrypted, to a third party. Today's strong encryption is tomorrow's weak encryption. Think 5-10 years ahead.
 
Think 5-10 years ahead.
This is exactly what you should keep in mind with your tapes if your modern systems no longer have a SCSI connector in 10 years or BackupExec no longer exists or can no longer be installed on Windows Server 2030 (?).
I don't see that these problems can be easily mitigated with tapes. I will still be able to read a SAS disk or SATA disk with more modern systems in a few years. With the introduction of LTO 8 drives in 2017, I was no longer able to use an LTO 6 tape from 2012 or even read it. How many tapes, drives and old systems with software and licenses should I keep in stock so that I can still read the tapes within a period of at least 10 years? X-rays must even be kept for 30 years. Pension matters are kept for over 100 years. In my opinion, tape drives have simply become uncontrollable and require a lot of maintenance.
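
The LTO-6 tape in an LTO-8 drive case can be expressed as a small compatibility check. This is a sketch based on the LTO read-compatibility rule as I understand it (drives up to generation 7 read two generations back, LTO-8 and LTO-9 drives only one generation back). The function names are made up, and generations beyond 9 are deliberately not modelled.

```python
def readable_tape_generations(drive_gen: int) -> range:
    """Tape generations an LTO drive of the given generation can read.

    Assumed rule: generations 1-7 read up to two generations back,
    generations 8 and 9 read only one generation back.
    """
    if not 1 <= drive_gen <= 9:
        raise ValueError("only LTO generations 1-9 are modelled")
    back = 2 if drive_gen <= 7 else 1
    lowest = max(1, drive_gen - back)
    return range(lowest, drive_gen + 1)


def can_read(drive_gen: int, tape_gen: int) -> bool:
    """True if a drive of drive_gen can read a tape of tape_gen."""
    return tape_gen in readable_tape_generations(drive_gen)


if __name__ == "__main__":
    print("LTO-8 drive, LTO-6 tape:", can_read(8, 6))
    print("LTO-7 drive, LTO-5 tape:", can_read(7, 5))
```

So for every drive generation you buy, you have to decide which older tapes need to be migrated or which older drives need to stay in stock.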

Having the right ACL set in Proxmox won't save you in case Proxmox/PBS has a security flaw or an attacker breaks into your PBS through lateral movement to your management network or IPMI.
Nobody said the PBS had to be on my network. Nobody said that it had to run constantly. Nobody said that I wouldn't replicate it to other locations if necessary. There are so many solutions that make tapes unnecessary, and that's exactly what I use; I never even thought about getting such an inflexible backup medium. I simply create proper concepts for how my backups work and how I separate the networks and access in such a way that lateral movement is not possible (it's not even that difficult: just don't join it to the AD, use separate credentials and the corresponding ACLs, and you've achieved a lot with a little - of course that's not everything).

Next thing, regarding hard disks as a backup medium: hard disks are insanely complex, and there have been numerous cases where firmware errors caused drives to fail and led to silent, unfixable data corruption.
Of course, hard drives can also have errors, which is why you don't buy 200 of them directly from one supplier but instead buy from several different ones. You can also swap HDDs from time to time even when you don't have to, to limit wear and tear and to mitigate any bugs. By rotating several disks, preferably from different manufacturers, you avoid being hit by a single firmware bug or not noticing that a disk is broken. The media must of course also be tested regularly as part of a recovery test and, where possible, verified to confirm that no bitrot has occurred. The hard drives must also be transported and stored in the boxes provided for this purpose.
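
One way to do the "no bitrot" check on rotated disks is a checksum manifest. A minimal sketch using only the Python standard library; the manifest layout and the function names are my own illustration, not something a particular backup tool prescribes.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backup files don't fill the RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file below root (relative path) to its SHA-256 digest."""
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Files from the manifest that are now missing or have different content."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

The idea is to build the manifest right after writing a disk, store it somewhere else, and run `changed_files` each time the disk comes back into rotation. An empty list means every recorded file still has the content it had when the manifest was built.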

Regarding cloud backups: sure, you can do that, but in terms of confidentiality a lot of customers have requirements that rule out handing your data, even if it's encrypted, to a third party. Today's strong encryption is tomorrow's weak encryption.
That's why you should also follow the development of post-quantum cryptography. I can also easily change the encryption on a hard drive. For long-term archiving on hard drives, for example, I can create a new pool in parallel and encrypt new data differently than the old. At night when there is little load, I can simply move the existing ones there and be up to date again. Doing this with tapes from the last 30 years would mean a lot of work and effort. And if you don't put your backups encrypted in a cloud, you won't save them unencrypted on tapes - you can't tell me that either! :)

But in the end it's up to each individual and I remain true to my position of not using tapes.
 
Please don't pull my remarks out of context.



I don't keep tapes for 10 years for archiving. My primary motivation is to have a solid air gap for 6-12 months as an additional layer of security. I didn't talk about long-term archiving, and I am not sure what made you assume that.



With cloud backups, the point is that you hand your data to a third party; after that it is out in the wild. You can't change the crypto on data you have already given to someone else. You can't remotely make it quantum-safe. That would require a time machine. :)
But in the end it's up to each individual and I remain true to my position of not using tapes.

That's perfectly okay if your use case doesn't require this. And you don't have to justify your position to me. I am not responsible for your use case :)

Still, for a lot of use cases there are good reasons to include a real offline medium in your backup concept. That is all I wanted to point out, and I still don't get why this should be a mistake, as you stated.

regards
 
