[...]
In practice, you have to take into account any system differences, hardware resources, different configurations, customizations made, changes made to the VMs after the backup, etc.
[...]
From what I've seen, when restoring to a clean system you have to be careful to first recreate the basic storage configuration (the parts outside Proxmox, adapting them if needed) before restoring the Proxmox configuration. You also have to take into account any changes made after the last backup and manage them (which is why I do manual backups after any change, to decrease this risk), because there could have been changes to the VM configurations and/or disks that would create problems.
[..]
Important note: always check for hardware issues first and manage them. As with any other server, it can be counterproductive to do a restore right away (as can happen when restores are made simple and fast) without checking and managing any hardware issues.
You bring up exactly the points I am concerned about. Thanks a lot for the write-up. I am totally in agreement with you. You nailed it.
When setting up a new system to replace an old, damaged one, it's crucial to consider storage mapping, naming, special configurations, and other hardware-dependent factors. My replacement system might have different NICs, resulting in different names (like the recent Broadcom issue). I might also have disks of different sizes or from different vendors. Does it make a difference? I'm not sure.
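To make the NIC naming concern concrete, here is a minimal sketch (not tested on a real node) that compares the interface names referenced in an ifupdown-style `/etc/network/interfaces` file with the names actually present on the machine. The function names are made up for illustration, and the rule for what counts as a "virtual" interface (`lo`, `vmbrN`, `bondN`, VLAN names containing a dot) is a simplifying assumption.

```python
import os
import re

# ASSUMPTION: names matching this pattern are created by the config itself,
# so they do not need to exist as hardware on the replacement system.
VIRTUAL = re.compile(r"^(lo|vmbr\d+|bond\d+)$")

def referenced_interfaces(config_text):
    """Collect interface names mentioned in an ifupdown-style config."""
    names = set()
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if parts[0] == "iface" and len(parts) >= 2:
            names.add(parts[1])
        elif parts[0] in ("bridge-ports", "bridge_ports", "bond-slaves", "bond_slaves"):
            names.update(p for p in parts[1:] if p != "none")
    return names

def missing_interfaces(config_text, present):
    """Return referenced physical-looking names that are not in `present`."""
    wanted = {
        n for n in referenced_interfaces(config_text)
        if not VIRTUAL.match(n) and "." not in n
    }
    return sorted(wanted - set(present))

def present_interfaces():
    """List interface names the running Linux kernel currently knows about."""
    return os.listdir("/sys/class/net")
```

On a restored node one could call `missing_interfaces(open("/etc/network/interfaces").read(), present_interfaces())`. A non-empty result would suggest the restored config still points at NIC names from the old hardware.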
The fact is, over time, a replacement system will have different hardware. So, bringing back a damaged node with different hardware should be possible. But where are the traps? For example, if I use a cluster, will it easily reintegrate into that cluster? If the new system's disk layout is different, what about syncing VMs, etc.?
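One trap of this kind, a guest config that refers to a storage ID the new node does not define, can at least be detected before starting anything. The sketch below is only an illustration under assumptions: it treats every unindented `type: id` line in `storage.cfg` as a storage definition, and it only looks at a hand-picked set of disk-like keys in a guest config (`scsiN`, `virtioN`, `ideN`, `sataN`, `efidiskN`, `tpmstateN`, `unusedN`, `mpN`, `rootfs`). The function names are hypothetical and the key list may not be complete.

```python
import re

# ASSUMPTION: these are the guest config keys that can carry a storage volume.
DISK_KEY = re.compile(r"^(ide|sata|scsi|virtio|efidisk|tpmstate|unused|mp)\d+$|^rootfs$")

def defined_storages(storage_cfg):
    """Storage IDs from unindented 'type: id' header lines of storage.cfg."""
    ids = set()
    for line in storage_cfg.splitlines():
        m = re.match(r"^(\w+):\s*(\S+)\s*$", line)
        if m:
            ids.add(m.group(2))
    return ids

def referenced_storages(guest_cfg):
    """Storage IDs used by disk-like keys of one VM or container config."""
    ids = set()
    for line in guest_cfg.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not DISK_KEY.match(key.strip()):
            continue
        # The first comma-separated field is the volume, e.g. 'local-lvm:vm-100-disk-0'.
        volume = value.strip().split(",", 1)[0]
        # Skip 'none' (empty CD drive) and absolute device paths.
        if ":" in volume and not volume.startswith("/"):
            ids.add(volume.split(":", 1)[0])
    return ids

def missing_storages(storage_cfg, guest_cfg):
    """Storage IDs the guest needs but the node's storage.cfg does not define."""
    return sorted(referenced_storages(guest_cfg) - defined_storages(storage_cfg))
```

Feeding it the text of the new node's `storage.cfg` and of each restored guest config would list the storage IDs that need to be recreated or remapped first. It says nothing about whether the underlying disks are large enough.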
I wish there was a guide (not just a bunch of links to community articles with long discussions) to help recover a Proxmox system in various scenarios without having to become "The Guru in Proxmox Management."
Questions like the following might be addressed by such a guide (this list might become huge, but bear with me... I'm just a simple engineer who had to take over a role):
- How can I get a Proxmox system up and running again if the following happens:
  - A Proxmox upgrade fails because the hardware unexpectedly rebooted during the process. This is what happened to me and triggered these questions.
  - The disk where Proxmox boots from fails.
  - The Proxmox system does not have network connectivity after an upgrade, even though the hardware hasn't changed.
- How can I get that new system into a cluster, replacing the failed node?
  - What needs to be done to make the cluster work fine again?
- Just a standard, here-is-all-you-need-to-recover-the-Proxmox-node guide...
  - For example, have a recovery boot stick ready and know how to use it.
- What means of preparation are necessary to be ready if something happens?
  - What needs to be documented and in what detail?
    - Some very good engineers "do not need stinking documentation," I know... but what happens if that engineer leaves or just falls off a train?
- What needs to be backed up and how?
  - There are lots of sources that list files and folders. Interestingly, not all of those sources list the same files/folders.
- How do I get the backup back onto a new system?
  - Seems trivial, but consider name or disk mapping changes...
- Who knows, one can wish... a Terraform import functionality for a Proxmox system to get the setup code for the existing system.
  - Or someone wrote a Terraform/module/code/config which is simple to use, and I would use it for all new Proxmox systems (just, I have a bunch of old systems still having the same issue...)...
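For the "what needs to be backed up and how" question, a minimal sketch of the mechanical part could look like the following. The path list is only an illustrative assumption (as noted, the sources do not agree on a list): `/etc/pve` is where Proxmox keeps its cluster-wide configuration, and the rest are ordinary Debian host files. The function name and defaults are hypothetical.

```python
import tarfile
from pathlib import Path

# ASSUMPTION: illustrative starting list, not an authoritative or complete one.
DEFAULT_PATHS = [
    "/etc/pve",
    "/etc/network/interfaces",
    "/etc/hosts",
    "/etc/hostname",
    "/etc/fstab",
]

def archive_paths(paths, archive_file):
    """Write every existing path into a gzip tarball.

    Returns (saved, skipped) so missing paths are reported instead of
    silently ignored.
    """
    saved, skipped = [], []
    with tarfile.open(archive_file, "w:gz") as tar:
        for p in map(Path, paths):
            if p.exists():
                # Store without the leading slash so extraction is relative.
                tar.add(str(p), arcname=str(p).lstrip("/"))
                saved.append(str(p))
            else:
                skipped.append(str(p))
    return saved, skipped
```

Run as root, something like `archive_paths(DEFAULT_PATHS, "/root/node-config.tar.gz")` would produce one file to copy off the node. The archive only captures configuration; restoring it onto different hardware still needs the name and storage mapping checks discussed above.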
Probably I could answer all those (and similar) questions myself if I had the time to become the Proxmox specialist.
I was looking around a bit, starting with the link from @Gilou above, but then digging a bit deeper. Not with the goal of writing something up myself (except that little script I've asked Copilot (or was it ChatGPT) to do for me), but to find something that can be used by someone like me too.
So far, I have found some interesting stuff. I still need to look at it in more detail, but if you have additional ideas and/or could look into the links below and give your feedback...
Is there more? Better?
Dan