Move disk causes cluster to crash

Ron Gage

New Member
Aug 7, 2019
Proxmox 6.0-4 running in a 3 node cluster, storage on NFS and on Ceph. Networking is 2 subnets - storage/corosync and management/virtual machines. Storage network is minimum 2 links per host, 4 links to NFS storage - all on LACP, all on MTU 9000.

I created a test Linux virtual machine on node 1 with its disk on NFS storage. I then moved the disk from NFS to Ceph (with "delete source" unchecked) while the virtual machine was running. At the end of the copy phase, the entire management UI crashed: the web interface became unresponsive and I could not log into the management UI on any host.

I created a little script on each host called pverestart.sh with the following content:
killall -9 corosync
systemctl restart pve-cluster
systemctl restart pvedaemon
systemctl restart pveproxy
systemctl restart pvestatd

I had to run this script multiple times on each host to get the management interface to return and become usable again.

One positive from this: the virtual machines stayed operating and responsive the entire time.

How can I help you help me solve this problem?

Ron
 
Are you sure the network link is not overloaded? (Storage and corosync are on the same subnet.)

(Note that with LACP, a single TCP connection always uses the same link, so when QEMU reads from NFS, it uses only one link. Writes to Ceph use multiple links/TCP connections.)
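
To illustrate the point about LACP pinning one TCP connection to one link, here is a minimal sketch of per-flow link selection. It assumes a simple XOR hash over addresses and ports, loosely modeled on a layer3+4 transmit hash policy; the actual formula depends on the bond mode and the switch, and the addresses and ports in the usage comments are hypothetical.

```shell
#!/bin/sh
# Simplified illustration only: real bonding drivers and switches use their
# own hash formulas. This just shows why one flow always maps to one link.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Pick a link index for a flow.
# args: src_ip src_port dst_ip dst_port link_count
pick_link() (
    src=$(ip_to_int "$1")
    dst=$(ip_to_int "$3")
    hash=$(( ($2 ^ $4) ^ $src ^ $dst ))
    hash=$(( $hash ^ ($hash >> 16) ))
    hash=$(( $hash ^ ($hash >> 8) ))
    echo $(( $hash % $5 ))
)

# Usage (hypothetical addresses and ports):
#   pick_link 10.0.0.11 801 10.0.0.50 2049 2     # one NFS connection, one link
#   pick_link 10.0.0.11 45000 10.0.0.21 6800 2   # one of several Ceph connections
```

Calling pick_link repeatedly with the same four values always returns the same index, which is why a single NFS read stream cannot exceed one link's bandwidth. Connections that differ in address or port can hash to different links.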
 

While I do not have MRTG running on the port set/switch, I have no reason to believe that I am hitting saturation on the switch. Up until the point where the "cutover" happens after the copy cycle, there are no indications of trouble. Indeed, all the VMs remain running without any indication of problems - both on NFS and on Ceph.

Do you have any suggestions of things to specifically check?

Ron
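
In the absence of MRTG, one rough way to check for saturation would be to sample the kernel's per-interface byte counters on each host while the disk move runs. The sketch below assumes the standard Linux counters under /sys/class/net; the interface name in the usage comment is hypothetical, and counter wrap-around is not handled.

```shell
#!/bin/sh
# Rough throughput check from kernel byte counters (a sketch, not a monitor).

# Convert two byte-counter samples into whole megabits per second.
# args: first_sample second_sample interval_seconds
rate_mbps() {
    echo $(( ($2 - $1) * 8 / $3 / 1000000 ))
}

# Sample one interface twice and print its receive and transmit rates.
# args: interface interval_seconds
sample_iface() (
    stats="/sys/class/net/$1/statistics"
    rx1=$(cat "$stats/rx_bytes"); tx1=$(cat "$stats/tx_bytes")
    sleep "$2"
    rx2=$(cat "$stats/rx_bytes"); tx2=$(cat "$stats/tx_bytes")
    echo "$1 rx=$(rate_mbps "$rx1" "$rx2" "$2")Mbit/s tx=$(rate_mbps "$tx1" "$tx2" "$2")Mbit/s"
)

# Usage (interface name is hypothetical; run once per bond member):
#   sample_iface eno1 5
```

Running this against each physical member of the bond, rather than the bond itself, would show whether a single link approaches line rate near the end of the copy phase.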
 
