What do you think of my backup strategy?

.n3

Member
Mar 19, 2023
Hey,

Currently I use the PVE backup to save my VMs/LXCs to my NAS on the same network. I have started using my ZFS pool for Nextcloud, and before I use it in production I want to adjust my backup.

Current setup
8-12x LXC
1-2x VM

Backup: VMs/LXC --> NAS via NFS

For Nextcloud
Code:
# create the data pool (referred to as data_tank in the commands below)
zpool create -f -o ashift=12 <pool> raidz1 <device1> <device2> <device3> <device4> <device5>

# create the dataset, enable lz4 compression and disable access-time updates
zfs create data_tank/nextcloud-data
zfs set compression=lz4 data_tank/nextcloud-data
zfs set atime=off data_tank/nextcloud-data

# bind-mount the dataset into the Nextcloud container (ID 115)
pct set 115 -mp0 /data_tank/nextcloud-data,mp=/mnt/nextcloud-data

Goal
  • Complies with the 3-2-1 backup strategy
  • Protection against hardware failure, ransomware and site loss
  • Scalable with growing storage needs
  • Cost-efficient
Target architecture

Onsite
  • Proxmox
    • ZFS Dataset: data_tank/nextcloud-data
    • VMs / LXC
  • PBS Onsite (LXC on Proxmox)
    • Storage: NAS (1x 8TB, no mirror)
Offsite (Intel NUC)
  • PBS (bare metal, 1x 8 TB)
    • Datastore: backup_pool/pbs-datastore
  • ZFS Pool
    • backup_pool/nextcloud-data
  • syncoid (Pull)
Data Flow (On-Off over Wireguard)

VM/LXC

Proxmox → PBS Onsite → PBS Offsite
  • Backups are first performed locally
  • Offsite PBS synchronizes automatically
Nextcloud Data (ZFS)

Offsite NUC pulls data: NUC → Proxmox:data_tank/nextcloud-data → NUC:backup_pool/nextcloud-data
  • Job runs on the NUC
  • No push from Proxmox

Security
  • Pull instead of push (ZFS)
  • Versioning (PBS + ZFS snapshots)
  • Separate systems
  • No direct offsite write access
  • Enable retention
    • 24 hourly
    • 7 daily
    • 4 weekly
  • Keep multiple backup versions
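To illustrate the daily part of the retention above on the ZFS side, here is a minimal sketch. It assumes snapshots follow the `@auto-YYYY-MM-DD-HHMM` naming from the snapshot command in this post. `daily_keep` is a hypothetical helper, not part of ZFS or sanoid, and it only prints the snapshots that would be kept; it destroys nothing.

```shell
# Keep the newest snapshot of each of the most recent N days.
# Input (stdin): snapshot names such as
#   data_tank/nextcloud-data@auto-2025-01-03-1300
# Output: the names that a "keep N daily" policy would retain.
daily_keep() {
    keep_days="$1"
    # Reverse sort puts the newest snapshot first, because the date
    # format sorts lexicographically in chronological order.
    sort -r | awk -v n="$keep_days" '
        {
            # The day is the YYYY-MM-DD part right after "@auto-".
            split($0, parts, "@auto-")
            day = substr(parts[2], 1, 10)
            if (!(day in seen)) {
                seen[day] = 1
                if (++count <= n) print
            }
        }'
}
```

A possible use would be `zfs list -H -o name -t snapshot -r data_tank/nextcloud-data | daily_keep 7`. In practice a policy tool such as sanoid would normally handle pruning, so treat this only as an illustration of what the rule means.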
Decision made
  • No RAID necessary
  • Multiple locations preferred over mirroring
  • ZFS = Data
  • PBS = Systems
Results
  • 3 copies (Proxmox + PBS Onsite + PBS Offsite)
  • 2 media types (not quite, but zpool/raidz1 + NAS)
  • 1 offsite location

Code:
#fast local restores
zfs snapshot data_tank/nextcloud-data@auto-$(date +%F-%H%M)

#offsite pbs
zfs set readonly=on backup_pool/nextcloud-data

zfs set readonly=off backup_pool/nextcloud-data
syncoid root@proxmox:data_tank/nextcloud-data backup_pool/nextcloud-data
zfs set readonly=on backup_pool/nextcloud-data
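The pull sequence above could be wrapped so that the target is switched back to read-only even when syncoid fails. This is only a sketch: `run` and `pull_nextcloud` are hypothetical helper names, the dataset names are the ones from the post, and `DRY_RUN=1` just prints the commands. As far as I know, `readonly=on` does not block `zfs receive`, so the toggle may be unnecessary; it is kept here to mirror the commands above.

```shell
SRC="root@proxmox:data_tank/nextcloud-data"
DST="backup_pool/nextcloud-data"

# Print commands instead of running them when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pull the dataset from Proxmox onto the NUC.
pull_nextcloud() {
    run zfs set readonly=off "$DST" || return 1
    run syncoid "$SRC" "$DST"
    rc=$?
    # Always switch the target back to read-only, even if syncoid failed.
    run zfs set readonly=on "$DST"
    return "$rc"
}
```

Calling `pull_nextcloud` from a cron job or systemd timer on the NUC keeps the job on the pulling side, as described in the data flow.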

What do you think?
 
Hi .n3,

Looks solid to me, given the hardware. From a data recovery perspective, it is better to have one disk at two locations than mirror at a single location, just like you propose.

I'm not familiar with syncoid. Have you looked into the native PBS remote functionality?
 
I'm not familiar with syncoid. Have you looked into the native PBS remote functionality?
As far as I can see, @.n3 uses syncoid only for the backup of his Nextcloud bulk data. I do something similar myself with my NAS data, although the NAS OS is a VM. Storage on cloud vservers can get quite expensive, so it makes sense to use cheaper storage options for bulk data (like NAS/Nextcloud data or the stuff on my notebook) with a dedicated tool for it, and PBS for the relatively small amount of data in the VM/LXC OS/application images.
Another reason to separate this would be if you want to be able to restore the stuff in your NAS or Nextcloud without PBS.

@.n3 Your approach is well thought out and I don't see any obvious faults in it. I would consider adding another offsite storage at a cloud provider (e.g. Hetzner Storage Box, or some S3 target like Backblaze or AWS) so you still have another copy if some bad actor somehow manages to take over both of your servers.
For example, on your offsite server you could use rclone or restic to back up your Nextcloud data to cloud storage like S3 or a Storage Box. For the data in PBS you could use the new S3 feature. It's still a technology preview, so I wouldn't rely on it as the sole backup provider. But since you already have this covered with your backup server as the primary backup, this wouldn't be too bad.
Another option would be to create additional vma-backups with the native backup feature of ProxmoxVE (so without PBS) from time to time and rclone/sync them to some s3 or another cloud storage. Since they need more space than PBS you wouldn't do this every day, but maybe weekly or monthly (depending on your paranoia and monetary/storage budgets).
One interesting feature of rclone is the ability to have a restic-server for the restic backup tool in an append-only mode: https://rclone.org/commands/rclone_serve_restic/
So you could set up ProxmoxVE in parallel to your PBS (it's possible, although not recommended, to install both bare metal on the same machine) with a small rclone LXC which would regularly back up your Nextcloud data (and the vzdump/vma backups, if you make them) to some S3 storage.
Another option would be to rent a cheap vserver (for 5-10 euro), set up rclone in serve-restic mode and use it only as a "sole proxy" for uploading stuff to S3 storage.
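The append-only proxy idea could look roughly like the sketch below. The remote name `s3remote:nextcloud-backup`, the host `proxy.example` and port 8080 are placeholders, and `run` is a hypothetical helper that only prints the commands when `DRY_RUN=1`. A real setup would also need authentication and TLS on the rclone side, which is left out here.

```shell
# Print commands instead of running them when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the proxy: expose an rclone remote as an append-only restic REST server.
# "s3remote:nextcloud-backup" and the port are placeholders.
serve_append_only() {
    run rclone serve restic --append-only --addr ":8080" "s3remote:nextcloud-backup"
}

# On the machine holding the data: back up a path through the proxy.
# "proxy.example" is a placeholder host name.
backup_via_proxy() {
    run restic -r "rest:http://proxy.example:8080/" backup "$1"
}
```

With `--append-only`, a compromised client can add new snapshots but should not be able to delete or rewrite existing ones, which is the point of putting the proxy in between.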
 
Hi .n3,

Looks solid to me, given the hardware. From a data recovery perspective, it is better to have one disk at two locations than mirror at a single location, just like you propose.

I'm not familiar with syncoid. Have you looked into the native PBS remote functionality?
Thanks for the response. It is good to hear that my thinking is valid.
As far as I know, PBS can back up a ZFS dataset. syncoid uses zfs send and zfs receive to replicate the dataset along with its snapshots.
As far as I can see, @.n3 uses syncoid only for the backup of his Nextcloud bulk data. I do something similar myself with my NAS data, although the NAS OS is a VM. Storage on cloud vservers can get quite expensive, so it makes sense to use cheaper storage options for bulk data (like NAS/Nextcloud data or the stuff on my notebook) with a dedicated tool for it, and PBS for the relatively small amount of data in the VM/LXC OS/application images.
Another reason to separate this would be if you want to be able to restore the stuff in your NAS or Nextcloud without PBS.
I noticed your like. Thank you for it and for the extended answer. I don't need PBS because for my LXC/VMs I'm using the native backup functionality of Proxmox. It is simple, but the backup is approx. 400 GB. With PBS I get better compression, bitrot protection etc. But I think it is a one-time effort to set it up, and I hope it is not too complicated.

@.n3 Your approach is well thought out and I don't see any obvious faults in it. I would consider adding another offsite storage at a cloud provider (e.g. Hetzner Storage Box, or some S3 target like Backblaze or AWS) so you still have another copy if some bad actor somehow manages to take over both of your servers.
For example, on your offsite server you could use rclone or restic to back up your Nextcloud data to cloud storage like S3 or a Storage Box. For the data in PBS you could use the new S3 feature. It's still a technology preview, so I wouldn't rely on it as the sole backup provider. But since you already have this covered with your backup server as the primary backup, this wouldn't be too bad.
Hm, I checked Hetzner before, and 20 TB for PBS would cost approx. 60 €/month. Realistically I need 5-10 TB, but that would still be 12-25 €.
Another option would be to create additional vma-backups with the native backup feature of ProxmoxVE (so without PBS) from time to time and rclone/sync them to some s3 or another cloud storage. Since they need more space than PBS you wouldn't do this every day, but maybe weekly or monthly (depending on your paranoia and monetary/storage budgets).
One interesting feature of rclone is the ability to have a restic-server for the restic backup tool in an append-only mode: https://rclone.org/commands/rclone_serve_restic/
So you could set up ProxmoxVE in parallel to your PBS (it's possible, although not recommended, to install both bare metal on the same machine) with a small rclone LXC which would regularly back up your Nextcloud data (and the vzdump/vma backups, if you make them) to some S3 storage.
Another option would be to rent a cheap vserver (for 5-10 euro), set up rclone in serve-restic mode and use it only as a "sole proxy" for uploading stuff to S3 storage.
Maybe I can use my "old" 2x 4 TB HDDs for it, or use the S3 option only for my LXC/VMs. But the important thing is the Nextcloud ZFS dataset.
 
@.n3 , sorry for hijacking your thread ;-)
@Johannes S : I'm somewhat in the same boat, allow me to ask for clarification

a) Storage on cloud vservers can get quite expensive ... b) use cheaper storage options for bulk data .... c) a dedicated tool for it

a) What do you mean with cloud vserver? Not the linux-vserver project I suppose ;-) but rather a 'fluid' cloud-based offer than a regular VPS or a homeserver?
b) Do you mean 'hosted storage services' as cheaper as opposed to NVMe-backed services, or private storage capacity?
c) In this case, back up the data separately from the system in which it is kept? Do you untangle Nextcloud data, configuration and database to back them up separately, with scripted restore to put it back together?

I don't need PBS because for my LXC/VMs I'm using the native backup functionality of Proxmox. It is simple, but the backup is approx. 400 GB. With PBS I get better compression, bitrot protection etc. But I think it is a one-time effort to set it up, and I hope it is not too complicated.
I think I misunderstand: do you set up PBS, but you don't need it because you use the PVE native backups instead?

Maybe I can use my "old" 2x 4 TB HDDs for it, or use the S3 option only for my LXC/VMs. But the important thing is the Nextcloud ZFS dataset.
You could run a self-hosted S3 service like garage on it, and do a file-based pull as a non-ZFS dependent backup (or is that what the 8 TB NAS is already doing?)
 
a) What do you mean with cloud vserver? Not the linux-vserver project I suppose ;-) but rather a 'fluid' cloud-based offer than a regular VPS or a homeserver?
A VPS like the ones from Contabo, netcup or Hetzner ;) I host my offsite PBS that way, but the cost per GB means they are not really suited to large amounts of data (at least with my budget) compared to some cloud storage.
b) Do you mean 'hosted storage services' as cheaper as opposed to NVMe-backed services, or private storage capacity?
I mean as an offsite location if you don't have a private site B, or don't want to live with the risk that something bad happens at the family member's or friend's place that hosts your private backup server. For example, Hetzner's Storage Box costs around 11 Euro for 5 TB of storage space (https://www.hetzner.com/de/storage/storage-box/ ), and Backblaze starts at 6 Euro per TB for their S3-compatible storage: https://www.backblaze.com/cloud-storage/pricing
A cheap vserver at netcup starts at 5 Euro but won't have much storage. You can add storage, but this costs around 12 Euro per TB. So netcup is still quite affordable (that's the reason I run my offsite PBS on it), but I can't back up my NAS and notebook data on it since it's way too much. For my (relatively small) VMs and LXCs, where it's mainly about being able to get services running again quickly without wasting time reconfiguring everything, it's perfect.
If, on the other hand, I ran a company, I would probably reconsider and get a managed PBS service like the ones provided by tuxis.nl or inett, plus two or three different S3 providers for data backups (so if one of them fails I still have the backups on the others).
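To compare these options for a given amount of data, a rough calculation like the following can help. It is a sketch under a strong assumption: it treats the quoted prices as linear per-TB rates (about 11/5 Euro per TB for the Storage Box, 6 Euro per TB for Backblaze, 12 Euro per TB for netcup add-on storage), whereas real plans are tiered and may add traffic or base fees. `monthly_cost` is a hypothetical helper name.

```shell
# Rough monthly cost in Euro: size in TB times an assumed per-TB rate.
monthly_cost() {
    awk -v tb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", tb * rate }'
}

SIZE_TB=8   # example size, matching the 8 TB disks mentioned in this thread

echo "Storage Box (assumed 2.2 Euro/TB): $(monthly_cost "$SIZE_TB" 2.2)"
echo "Backblaze   (assumed 6 Euro/TB):   $(monthly_cost "$SIZE_TB" 6)"
echo "netcup      (assumed 12 Euro/TB):  $(monthly_cost "$SIZE_TB" 12)"
```

The numbers only show the order of magnitude; check the providers' current price lists before deciding.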

You could argue that you shouldn't rely on VM or LXC backups, because with a configuration-management tool like Ansible, Chef or Puppet/OpenVox you should be able to set everything up again automatically without needing to restore VMs and LXCs. It's something I still have on the to-do list for my homelab. But even then I can imagine some scenarios where it might be useful to have backups of everything.

c) In this case, back up the data separately from the system in which it is kept? Do you untangle Nextcloud data, configuration and database to back them up separately, with scripted restore to put it back together?
I don't use Nextcloud. I use TrueNAS in a VM though. Since I passed through a storage controller for the storage discs, the regular backup with ProxmoxVE and PBS only gets the OS data. So for the NAS data I would need to use the Proxmox backup client inside the VM to do the backup. Since I don't want to use PBS for it anyhow (for the reasons given), I instead call restic via resticprofile.
I do the same separation on my VMs. For example, my paperless instance is hosted with Docker in a Debian VM. The VM has two virtual discs:
- One for the operating system and one for the application installs (aka all docker images and configuration of the applications I host)
- The actual data is hosted on the TrueNAS instance as samba and nfs network shares. These shares are mounted on the vm and configured in docker as storage volumes.
- If for some reason I would want to backup stuff on some virtual disc without putting it on the NAS in a network share I would put it on a dedicated virtual disc and exclude the disc from the backup jobs (there is a checkbox in the VM or lxc configuration to do this)

I think I misunderstand : do you set up PBS, but you don't need it because you use the PVE native backups instead?

I can't speak for @.n3, but I do it this way: I use PBS for VM/LXC backups and also do additional native backups, so I can still restore an older state in case I can't access the PBS backups anymore. But since native backups need a lot of space, I don't do them as often as my backups to PBS: I do at least daily backups to PBS (for important VMs/LXCs even every hour or two, depending on their importance), but I also have weekly or monthly native backups as additional safety.
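The occasional native backup described above could be driven by something like this sketch. The container ID 115 is taken from the first post; the storage name `nas-nfs` is a placeholder for whatever backup storage is defined in PVE, and `run` is a hypothetical helper that only prints the command when `DRY_RUN=1`.

```shell
# Print commands instead of running them when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Native (non-PBS) backup of one guest to a PVE storage.
# $1 = guest ID, $2 = PVE storage name (placeholder in the example below).
native_backup() {
    run vzdump "$1" --storage "$2" --mode snapshot --compress zstd
}
```

For example, `native_backup 115 nas-nfs` from a weekly cron job or timer; the same schedule can also be configured as a backup job in the PVE GUI instead.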

You could run a self-hosted S3 service like garage on it, and do a file-based pull as a non-ZFS dependent backup (or is that what the 8 TB NAS is already doing?)

Then .n3 would still be relying on his offsite or NAS never failing.
 
I think I misunderstand : do you set up PBS, but you don't need it because you use the PVE native backups instead?
Currently I'm using the PVE native backups, so I don't need PBS. I need a PC for my offsite backup. A Pi would work, but I want to work with external storage etc., so I bought a used mini PC. Now I can install PBS offsite and use it to get a little more security.


Then .n3 would still be relying on his offsite or NAS never failing.
But not in my scenario, because my offsite can fail and I still have my onsite and production data.
 