I would not rule out Proxmox, as it writes quite a lot more logs (plus data for graphs) than plain Debian (or other Linux). And maybe this has been reduced recently, but pmxcfs also used to have a lot of write amplification.
But I agree that VMs with a lot of (synchronous) I/O, especially when the block sizes don't match, can easily become the major cause when the number of running VMs increases.
This is an interesting point that I need to do more research on. I'd like to set up a cluster, so I'd already planned to do the following to lessen boot disk writes:
1. Shared storage via NFS or ZFS over iSCSI (I'm still figuring out if I want to deal with ZFS over iSCSI … all the tutorial videos I watch are LVM-based Proxmox installs, so they all use NFS. I'm not sure whether ZFS over iSCSI is even considered a rock-solid feature or still more in-development/experimental like Ceph); and
2. Logging to an external syslog server.
I had not considered the need to handle logging specific to clusters, so I need to look into whether that's included in the general offload to a remote syslog server.
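For the "more research" part, one way to see how much the host actually writes to the boot disk is to compare two snapshots of /proc/diskstats. This is only a rough sketch: the device name `nvme0n1` and the 60-second interval are placeholders I made up, and it doesn't tell you which process did the writing.

```python
import time
from typing import Dict

SECTOR_BYTES = 512  # /proc/diskstats counts in 512-byte sectors


def parse_bytes_written(diskstats_text: str) -> Dict[str, int]:
    """Map device name -> bytes written since boot, from /proc/diskstats text."""
    written = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or malformed lines
        # fields[2] is the device name, fields[9] is "sectors written"
        written[fields[2]] = int(fields[9]) * SECTOR_BYTES
    return written


def bytes_written_between(before: str, after: str, device: str) -> int:
    """Bytes written to `device` between two /proc/diskstats snapshots."""
    return parse_bytes_written(after)[device] - parse_bytes_written(before)[device]


if __name__ == "__main__":
    device = "nvme0n1"  # placeholder: replace with your boot disk
    interval = 60       # placeholder: seconds between snapshots

    with open("/proc/diskstats") as f:
        first = f.read()
    time.sleep(interval)
    with open("/proc/diskstats") as f:
        second = f.read()

    delta = bytes_written_between(first, second, device)
    print(f"{device}: {delta / 1024:.1f} KiB written in {interval} s")
```

Running it once on an idle node and once with the VMs up should at least give a ballpark of how much comes from the host versus the guests.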
I don't see that as such an obstacle. What I would do in your situation:
1. Use a small M.2 drive. 128 GB?
2. From time to time (every kernel update etc.), make a compressed dd image of that boot drive to the storage area on the server.
3. Something goes wrong, write that dd image back to the boot drive (or a new one).
4. Make as few changes to the host OS as possible & document all those changes.
5. Document & back up all Proxmox host settings. (Search these forums for which files/folders to back up.)
6. You could always re-install Proxmox & redo 4 & 5 above.
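Steps 2 and 3 above could be sketched roughly like this in Python, doing the same job as piping dd through gzip. It's a minimal sketch, not a tested recovery procedure: the device and image paths in the usage comments are placeholders, it needs root to read a real block device, and imaging a disk while it is mounted and in use can give an inconsistent image, so a rescue/live environment is the safer place to run it.

```python
import gzip
import shutil

CHUNK = 4 * 1024 * 1024  # copy in 4 MiB chunks


def image_device(device_path: str, image_path: str) -> None:
    """Stream a device (or any file) into a gzip-compressed image."""
    with open(device_path, "rb") as src, gzip.open(image_path, "wb") as dst:
        shutil.copyfileobj(src, dst, CHUNK)


def restore_image(image_path: str, device_path: str) -> None:
    """Decompress an image and write it back over the target device."""
    with gzip.open(image_path, "rb") as src, open(device_path, "wb") as dst:
        shutil.copyfileobj(src, dst, CHUNK)


# Hypothetical usage (paths are placeholders, double-check the device first):
# image_device("/dev/nvme0n1", "/mnt/storage/pve-boot.img.gz")
# restore_image("/mnt/storage/pve-boot.img.gz", "/dev/nvme0n1")
```

`restore_image` overwrites whatever is on the target, so the device name is worth checking twice.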
For all your VMs & LXCs you must have full & restorable backups anyway.
@gfngfn256's advice on making backups is absolutely spot on. For something more automated, there's a PVE host backup script. I'm aware of it but haven't deployed it or read it to see exactly what it does; it's recommended a lot, though:
https://community-scripts.github.io/ProxmoxVE/scripts?id=host-backup
That's also an interactive script; I'm not sure how to automate it (yet).
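In the meantime, a non-interactive stopgap could be as simple as tarring up a list of config paths. To be clear, this is not what the community script does (I haven't read it), and the path list in the usage comment is just my guess at a starting point: /etc/pve is where pmxcfs mounts the cluster config, and the others are ordinary Debian files. The function name and destination directory are made up for the example.

```python
import tarfile
import time
from pathlib import Path
from typing import Iterable


def archive_paths(paths: Iterable[str], dest_dir: str, prefix: str = "pve-host") -> Path:
    """Write a timestamped .tar.gz of the given paths; missing paths are skipped."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{prefix}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for raw in paths:
            p = Path(raw)
            if p.exists():
                # Store without the leading slash so extraction stays relative
                tar.add(str(p), arcname=str(p).lstrip("/"))
    return archive


# Hypothetical usage (path list is an assumption, adjust after checking the forums):
# archive_paths(
#     ["/etc/pve", "/etc/network/interfaces", "/etc/hosts"],
#     "/mnt/storage/host-backups",
# )
```

Since it takes no input, something like this could be run from cron or a systemd timer.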
It's worth noting that host (PVE node) backup is on the roadmap for Proxmox Backup Server: https://pbs.proxmox.com/wiki/index.php/Roadmap
At the time I'm writing this, it's the last major server-side item on the roadmap, but we don't have any information on when it might land. So, hopefully, a standardized, easily automated official solution will be coming soon.