I managed to move the VM to another node and move the disk to the SAN. Then I could move the VM to a third node.
But I still get this error when I try to move it back to the original node where I first encountered it.
So that server is the only one with these issues.
I still get the same error. I moved the drive to another storage for now. The issue seems to be related to iSCSI having something stuck somewhere.
pveversion output (we use the enterprise repos):
# pveversion -v
proxmox-ve: 6.1-2 (running kernel: 5.3.13-3-pve)
pve-manager: 6.1-7 (running version...
We have a 3-node cluster and a SAN that is connected via iSCSI (using LVM over iSCSI).
I had to do some maintenance and move machines around, and I noticed that one specific machine could not be migrated between nodes because of errors related to its storage.
Initially it gave me some errors...
Well, I just checked and we are still on QEMU 4.0.0-5, so I cannot tell you anything for sure.
In these cases we just shut down the machines and started up again.
But the issue pretty much went away after we moved the cluster traffic to a separate dedicated network, without any package update.
1. Ideally, you would have to reinstall them. Strictly speaking, it is possible to remove and re-add them, but you would have to make really sure everything is:
- cleaned thoroughly from the distributed cluster config
- not stuck in the cluster config
- not cached on the node.
I don't know for sure but...
It seems to me that the new corosync using unicast is more finicky than the old one that used multicast. We have a 3-node cluster and we had corosync-related issues after upgrading to PVE 6. What we did was split the management network (4x 1 Gbit links) into two sets of 2x 1 Gbit, one for management and...
You can use an LE certificate internally too. The browser doesn't care how the DNS name was resolved or what IP it points to (so you can use the hosts file for this too; you don't even need an internal DNS to handle it).
But for generating the certificate you need an internet connection and DNS...
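As a rough sketch of the hosts-file approach, assuming a hypothetical node name pve1.example.com (the name the certificate was issued for) and a hypothetical internal address 192.168.1.10, the entry on the client machine could look something like this:

```
# /etc/hosts on the client (name and address are made-up examples)
192.168.1.10    pve1.example.com
```

The browser then connects to the internal address but still validates the certificate against the name it was asked for.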
Ok, good. I marked this as solved.
Interestingly, after we separated the cluster network from the management network, we did not have any issues like this anymore.
This happened the day I submitted this thread or a day after that. Since then I have seen no problematic VMs and we did not update...
1. If a node fails, it means you lost connection to it for whatever reason. This makes LIVE migration of any kind impossible, because live migration needs both the source and target servers running and communicating. HA is handled differently: on Proxmox you will have VMs respawning on live nodes after...