Added 6th node to cluster; migration from node 2 to node 6 failed with a disk-full message, and now all VM and LXC data is gone from the GUI

marcmcmillin
I had a 5-node cluster with many VMs and LXCs.

I successfully added a 6th node with a 256 GB boot drive and a three-drive ZFS pool (approx. 10 TiB).
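
For reference, this is roughly how I'd been checking capacity on the new node ("proxmox6" is just my shorthand for node 6 here, substitute the real hostname):

# Quick capacity check on node 6: pool usage, dataset usage, and the
# 256 GB boot drive. The hostname "proxmox6" is an assumption on my part.
ssh root@proxmox6 'zpool list; zfs list -o name,used,avail; df -h /'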

I attempted to migrate a 600 GB VM running Docker workloads from node 2 to node 6.

The migration ran for a while, then the task failed with a "not enough space on device" message. Right after that I was logged out of the GUI and could not log back in, although I can still log in via SSH to each of the nodes.
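
If it helps, this is the sort of check I can run on each node; happy to post the output if someone tells me what to look for:

# GUI-facing services plus root-disk space on the node. My working
# assumption (unconfirmed) is that something filled up and broke logins.
df -h /
systemctl status pveproxy pvedaemon
journalctl -u pveproxy -u pvedaemon --since today --no-pager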

The VMs and LXCs themselves are still running: I can SSH into them, and the workloads, including a 9-node k3s cluster, are all accessible.

It feels like corosync got messed up and I have lost all of the .conf files for the VMs and LXCs.
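
I haven't dug into corosync itself yet; my plan was something along these lines (standard commands as far as I know, corrections welcome):

# Quorum and membership as PVE sees it
pvecm status
# Corosync link state plus recent log lines
corosync-cfgtool -s
journalctl -u corosync -b --no-pager | tail -n 50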

Has anyone seen this behavior before, and/or do you have any recommendations?

Here's some info from node 2:


root@proxmox2:~# qm list
root@proxmox2:~# ls /etc/pve/nodes/proxmox2/qemu-server/
root@proxmox2:~# lvs | grep vm-
root@proxmox2:~# zfs list
NAME                     USED  AVAIL  REFER  MOUNTPOINT
zfsa                     290G   609G    96K  /zfsa
zfsa/base-997-disk-0    3.07M   609G    76K  -
zfsa/base-997-disk-1    33.5G   642G  1.00G  -
zfsa/vm-1572-cloudinit     6M   609G    76K  -
zfsa/vm-1572-disk-0     28.9G   623G  14.6G  -
zfsa/vm-1574-cloudinit     6M   609G    76K  -
zfsa/vm-1574-disk-0     64.5G   634G  39.8G  -
zfsa/vm-1577-cloudinit     6M   609G    76K  -
zfsa/vm-1577-disk-0      131G   697G  42.4G  -
zfsa/vm-7600-cloudinit     6M   609G    72K  -
zfsa/vm-7600-disk-0        3M   609G    92K  -
zfsa/vm-7600-disk-1     32.5G   640G  1.17G  -
zfsa/vm-997-cloudinit      6M   609G    72K  -
zfsc                     803G   120G    96K  /zfsc
zfsc/vm-703-disk-0       193G   184G   129G  -
zfsc/vm-706-disk-0       609G   333G   396G  -
root@proxmox2:~# systemctl status pve-cluster
* pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/usr/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
     Active: active (running) since Thu 2025-12-25 13:01:50 CST; 3 weeks 3 days ago
 Invocation: b71c852093d441609f2ce69004095238
   Main PID: 2015 (pmxcfs)
      Tasks: 13 (limit: 76935)
     Memory: 47.6M (peak: 92.5M, swap: 16M, swap peak: 21.1M)
        CPU: 1h 5min 37.508s
     CGroup: /system.slice/pve-cluster.service
             `-2015 /usr/bin/pmxcfs

Jan 19 09:20:38 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:20:54 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:22:55 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:25:59 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:25:59 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:35:54 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:46:04 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:46:51 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:49:13 proxmox2 pmxcfs[2015]: [status] notice: received log
Jan 19 09:49:13 proxmox2 pmxcfs[2015]: [status] notice: received log
root@proxmox2:~# grep -R "1572" /etc/pve/nodes/*/qemu-server/
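
One thing I'm considering, since /etc/pve is really pmxcfs backed by an SQLite database: checking whether the configs still exist in the backing store. The table and column names below (tree, name) are my reading of the pmxcfs docs, so treat this as a sketch rather than gospel, and I'd only ever poke at a copy:

# Copy the pmxcfs backing database first, then look for any surviving
# VM/LXC config filenames in it. The schema (table "tree", column "name")
# is my assumption from the docs, not verified on this cluster.
cp /var/lib/pve-cluster/config.db /root/config.db.bak
sqlite3 /root/config.db.bak "SELECT name FROM tree WHERE name LIKE '%.conf';"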


Thanks in advance for assistance.

Marc
 

Attachments

  • 2026-01-19 09.47.31 pve.mmchomelab.com 1719346be92c.jpg (191.6 KB)