[SOLVED] PVE reinstallation

showiproute

Hello there,
I plan to set up my 2nd PVE server in the near future, as my current boot drive is a consumer SSD which I would like to replace with an enterprise one (Micron 7400 Pro).

Currently I have two PVE servers + a QDevice on a Raspberry Pi.

My plan would be to migrate all VM disks off the boot drive so that they stay on storage which will still be there after the reinstallation (all ZFS pools).


Is there anything I need to be aware of/keep in mind?
 
Hi,
this should be a rather standard procedure; you can also find a section about recovery in the docs, see https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_recovery

Since you have a cluster, an option would be to migrate all VMs to the other node before removing the node you want to perform the hardware upgrade on, then reinstall and rejoin the node to the cluster, and finally migrate the VMs back to the original node.
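As a rough sketch, that could look like this on the command line (the VMID, node names and IP are placeholders, not taken from this thread):

# on the node that will be reinstalled: move the guests away first
# (guests on local storage may additionally need --with-local-disks)
qm migrate 100 pve1 --online
# on a remaining cluster node: remove the node once it is powered off
pvecm delnode pve2
# on the freshly reinstalled node: join it back via the IP of a cluster member
pvecm add 192.0.2.10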

In any case, tested backups (ideally on a different storage medium) of all of your VMs are strongly recommended before starting any of these procedures.
 
@Chris
Is single-disk ZFS supported?

I had selected this within the installer, but unfortunately my system is unable to find any bootable media. It drops to the EFI shell as the last possible boot option...
Will try LVM now instead.
 
Found the error... I had to enable EFI options for the PCIe slots (my NVMe is attached to an M.2 -> PCIe adapter).
After that it worked well.
 
Okay, conclusion of today's boot SSD exchange: it's not so easy.
ZFS on root did not work for me. The server did not boot at all - LVM, on the other hand, worked without any issues.

After installation I copied back the cluster.db file mentioned here, but that is where my headache started.
My server ended up in a "no longer clustered" state:
on the remaining server I had removed the old PVE node, but since I copied the cluster.db file back onto the new installation, that one still thought it was part of the cluster.
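For reference, the copy-back step itself, assuming the cluster database lives at its default location /var/lib/pve-cluster/config.db and using an example backup path, was roughly:

# stop the cluster filesystem before replacing the database
systemctl stop pve-cluster
# restore the backed-up database (the backup path is just an example)
cp /mnt/backup/config.db /var/lib/pve-cluster/config.db
systemctl start pve-cluster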

Maybe someone can explain to me whether it wouldn't be a better approach to first remove the node from the cluster (before backing up/copying the cluster.db), do the maintenance, and rejoin the cluster afterwards.

Additionally I had the issue that one of my ZFS pools could not be imported.
Nothing I found on the internet worked.
I had to reinsert the old OS SSD, where the "faulty" ZFS pool was still usable. After a lot of trial and error, the solution was to remove all special devices, logs and caches from the pool and export it.
This worked and I was able to import it on the new PVE installation.
After that I reconfigured the log, cache and special devices.
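For the record, the removal and export steps went roughly like this (pool and device names are placeholders):

# list the vdevs to identify the special, log and cache devices
zpool status <pool_name>
# remove the special, log and cache vdevs
zpool remove <pool_name> <special_device>
zpool remove <pool_name> <log_device>
zpool remove <pool_name> <cache_device>
# export the pool so it can be imported on the new installation
zpool export <pool_name>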
 
I am sorry to hear that you encountered such issues during your migration.

Copying the cluster backing filesystem to the new host is only required if you are running without a cluster, as then there is no other way for the host to get the configuration back.

In a clustered setup, however, this is not necessary: you only need to migrate the VMs/CTs to other nodes in the cluster, remove the node from the cluster, perform your maintenance/re-installation and join the node to the cluster again. Once the node has joined the cluster, it will be synced to the current cluster state. Once this is completed, the VMs/CTs can be migrated back to their initial node.

Regarding the ZFS pool, you probably would have needed to force an import by running zpool import -f <pool_name>, as ZFS probably complained that the pool may be active on another system or the like.
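As a minimal sketch (the pool name is a placeholder):

# list pools that are available for import without importing them
zpool import
# force the import if ZFS reports the pool as potentially active elsewhere
zpool import -f <pool_name>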
 
My major problem with migrating VMs is that my servers have different hardware configurations:
Server 1 contains 2.5" drives with fast but limited storage
Server 2 contains 3.5" drives with slow but more storage

Therefore migrating a VM does not work due to the limited storage capacity.

zpool import -f <pool_name> did work for all other pools except the "faulty" one.
There I only received an "I/O error - destroy and restore from backup" message.

zpool import -F showed the ZFS pool but reported the drive as "degraded".
The only solution was to boot from the old OS and remove all log, cache and special devices.


I think my solution would be to copy the cluster.db, not remove the node, reinstall the OS / do the maintenance, and copy back the cluster.db.
 
Therefore migrating a VM does not work due to the limited storage capacity.
In that case a backup/restore route might be the easiest way to bring the VMs/CTs back to the reinstalled node, meaning of course downtime for them while you upgrade the node (which is also the case for the route you chose).
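A minimal sketch of that route for a VM (the VMID, storage names and archive path are placeholders):

# back up the guest to a storage that survives the reinstallation
vzdump 100 --storage <backup_storage> --mode stop
# on the reinstalled node, restore the guest from the backup archive
qmrestore /path/to/vzdump-qemu-100.vma.zst 100 --storage <target_storage>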
There I only received an "I/O error - destroy and restore from backup" message.
In that case I strongly urge you to check your disks' SMART values, as this indicates that at least one disk in that pool might not be fine anymore.
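For example, with smartmontools installed (device names are placeholders):

# show SMART health and attributes for a SATA/SAS disk
smartctl -a /dev/sdX
# the same for an NVMe device
smartctl -a /dev/nvme0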
I think my solution would be to copy the cluster.db, not remove the node, reinstall the OS / do the maintenance, and copy back the cluster.db.
While totally valid, such an approach requires more in-depth knowledge of the system, and you might more easily end up in unexpected/non-trivial situations. Therefore, removing and reinstalling the node remains the more straightforward approach.
 
In that case I strongly urge you to check your disks' SMART values, as this indicates that at least one disk in that pool might not be fine anymore.
SMART values for all drives show good states.
Additionally, something that puzzled me yesterday: zpool export Storage worked and zpool list was empty, but after a few minutes the same zpool list command showed the pool again.

Could it be that the pool was not exported successfully?
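Just a thought, assuming the pool is also configured as a ZFS storage in PVE: the storage layer may re-activate (and thereby re-import) it. One way to check (the storage ID is a placeholder and may differ from the pool name):

# list configured storages and their status
pvesm status
# temporarily disable the ZFS storage entry pointing at the pool, then export
pvesm set <storage_id> --disable 1
zpool export Storage
zpool list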
 
