VM offline migration from cluster to cluster using NetApp storage

Sep 16, 2025
Hello,

we are planning an offline migration of several hundred VMs from three Proxmox clusters (which will be retired afterwards) to two Proxmox clusters we are keeping. Some difficulties showed up while testing this procedure with a test VM, and we would like your opinion on the steps needed.

Our environment:
  • Proxmox PVE 8.4.12
  • Each cluster uses storage LUNs from one NetApp storage system (with individual LUNs per cluster)
  • Connectivity is via FC (two paths with multipath)
  • Storage is provided to PVE as LVM logical volumes
  • cluster1, cluster2 and cluster3 are to be retired
  • cluster4 and cluster5 will stay and should take over the VMs from the other clusters

We'd like to migrate all VMs from one cluster ("cluster1") to another ("cluster5"). This is the plan (a rough command sketch follows the list):
  • Shut down all VMs on cluster1
  • Copy the VM configuration files from /etc/pve/nodes/node[1-4]/qemu-server to cluster5
  • Remove the storage from Proxmox configuration ("pvesm remove storage1_cluster1")
  • Take the LUN "storage1_cluster1" offline on the NetApp storage
  • Remove the mapping of the LUN "storage1_cluster1" for cluster1
  • Create a new mapping of the LUN "storage1_cluster1" for cluster5
  • Add the storage "storage1_cluster1" to cluster5
  • Modify VMIDs where needed (to avoid duplicates)
  • Start all VMs on cluster5
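For the configuration move itself we were thinking of something along these lines (node names, the VMID 101 and the /root/migration path are just examples):

# on a cluster1 node: copy the config of each VM to be moved, e.g. VMID 101
scp /etc/pve/nodes/node1/qemu-server/101.conf root@cluster5-node1:/root/migration/
# on the target cluster5 node: once the storage is visible there, placing the
# config file under /etc/pve registers the VM on that node
mv /root/migration/101.conf /etc/pve/qemu-server/101.conf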
First tests show that after taking the LUN offline, PVE seems to be somehow irritated by the missing storage (while VMs on other storage in the cluster were fine). It looks like some additional step is needed to "release" the storage from the cluster (without deleting the data).

Is there any migration expert out there with know-how on central NetApp storage who can assist with this issue?

Thanks in advance for your assistance.

br, Gregor
 
Hi @Budgreg, welcome to the forum.

First tests show that after taking the LUN offline, PVE seems to be somehow irritated by the missing storage
Is this how you would describe a system in a ticket you open with NetApp? :-) What does it mean for software to be irritated? :-)

  • Remove the storage from Proxmox configuration ("pvesm remove storage1_cluster1")
This just removes a storage pool definition from PVE; the OS/kernel is still very much aware of the LVM structure and the LUN's presence. If you use multipath, it also does not simply forget about a device that disappeared.

Keep in mind that PVE is based on Debian with an Ubuntu-derived kernel. Treat it as a Linux system: on a Linux host with multipath/LVM/SAN you would not just yank a LUN out of a live system as part of scheduled maintenance. At a high level: remove the multipath maps and deactivate/remove the LVM structures first.
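Roughly, something like this on every node of cluster1 before the LUN is unmapped - the VG name and WWID below are placeholders, not your actual values:

vgchange -an <vg_of_storage1_cluster1>   # deactivate the VG; do NOT vgremove/pvremove if the data must survive
multipath -ll                            # note the WWID/alias of the LUN
multipath -f <wwid_or_alias>             # flush the multipath map for that LUN
echo 1 > /sys/block/sdX/device/delete    # remove each underlying SCSI path (sdX, sdY, ...)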

Is there any migration expert out there with know-how on central NetApp storage who can assist with this issue?
I don't think your questions/issues are NetApp-specific. That said, the scale of your environment suggests that you may be in a position to engage a Proxmox Partner: https://www.proxmox.com/en/partners/find-partner/explore


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Is this how you would describe a system in a ticket you open with NetApp? :-) What does it mean for software to be irritated? :-)
Hi,

sorry for being so imprecise (on my first post :rolleyes:). The PVE GUI showed question marks on every VM (even the ones with disks on storage other than the one that went offline), and lvdisplay/vgdisplay did not return any output either.

Anyway thank you for your support. I agree that we have to dig into the multipath/lvm configuration to fix this behavior.

best regards, Gregor
 
The PVE GUI showed question marks on every VM (even the ones with disks on storage other than the one that went offline)
This is due to the stats collector being in a hung state. Make sure there are no VMs still referencing the missing datastore; if the question marks are still there (example commands after the list):
1. Check pvesm status; there should be no unknown datastores
2. systemctl restart pvestatd
3. systemctl restart pveproxy (may not be needed)
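For example (assuming the removed storage was named storage1_cluster1):

grep -r "storage1_cluster1" /etc/pve/nodes/*/qemu-server/   # any VM configs still referencing it?
pvesm status                                                # no unknown/inactive leftovers expected
systemctl restart pvestatd
systemctl restart pveproxy                                  # usually only needed if the GUI stays stuck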

Otherwise, your procedure should work, EXCEPT that modifying VMIDs in the configs would likely be insufficient, since the logical volumes would then be misnamed.
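For example, if VM 101 from cluster1 becomes VMID 501 on cluster5 and the backing volume group is named storage1_cluster1 (both just illustrative), the volume needs renaming as well:

lvrename storage1_cluster1 vm-101-disk-0 vm-501-disk-0
# and the disk reference in /etc/pve/qemu-server/501.conf adjusted to match, e.g.
#   scsi0: storage1_cluster1:vm-501-disk-0,size=32G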
 
This is due to the stats collector being in a hung state. Make sure there are no VMs still referencing the missing datastore
Not only that: if there are dead DM devices, the lvs, pvs, and other scan commands used by PVE will hang. This in turn will cause the stats daemon to hang.
Having dead devices on the system will lead to unpredictable instability.
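Some quick ways to spot such leftovers with standard Linux tooling (nothing PVE-specific):

multipath -ll        # paths marked "failed faulty" point to a dead LUN
dmsetup ls --tree    # device-mapper maps whose underlying devices are gone
lsblk                # overview of the remaining block devices and what is stacked on them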


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
This is due to the stats collector being in a hung state. Make sure there are no VMs still referencing the missing datastore
Hi,

thanks for your reply. No, there was only a single VM (shut down) using this specific storage. And although this cluster handles the least critical systems of all, I'd like to keep the impact of the migration tests to a minimum, so I quickly put the LUN back online again, which resolved the issue for now.

I will work on stabilizing the situation after taking the LUN offline (and removing the mapping for the cluster). I appreciate any further information on how to cleanly remove the storage - thanks.

Otherwise, your procedure should work, EXCEPT that modifying VMIDs in the configs would likely be insufficient, since the logical volumes would then be misnamed.
Got it. Keeping the original names of the logical volumes could lead to duplicate volume names, so the logical volumes have to be renamed, too.