Not sure how I managed this, but I have two nodes in a cluster. I tried to migrate a container from one node to the other, but it failed because the nodes have no shared storage, which is fair enough. I tried replication and that didn't work either. I gave up on all that and left the two nodes as they were. Tonight I powered down the main node (proxmox) to replace an HDD, and while I was doing that the container in question (CT 101, my DNS server) somehow migrated over to the second node (proxmox2), even though its root storage is not available on that node. Now my main node is back up, but the container refuses to move back:
Code:
root@proxmox2:~# ha-manager status
quorum OK
master proxmox (active, Tue Feb 4 00:42:23 2025)
lrm proxmox (idle, Tue Feb 4 00:42:23 2025)
lrm proxmox2 (idle, Tue Feb 4 00:42:24 2025)
service ct:101 (proxmox2, disabled)
root@proxmox2:~# ha-manager crm-command relocate ct:101 proxmox
root@proxmox2:~# journalctl -f
[...]
Feb 04 00:43:04 proxmox2 pve-ha-lrm[998]: successfully acquired lock 'ha_agent_proxmox2_lock'
Feb 04 00:43:04 proxmox2 pve-ha-lrm[998]: watchdog active
Feb 04 00:43:04 proxmox2 pve-ha-lrm[998]: status change wait_for_agent_lock => active
Feb 04 00:43:04 proxmox2 pve-ha-lrm[5948]: <root@pam> starting task UPID:proxmox2:0000173D:000205DD:67A1D318:vzmigrate:101:root@pam:
Feb 04 00:43:04 proxmox2 pve-ha-lrm[5949]: migration aborted
Feb 04 00:43:04 proxmox2 pve-ha-lrm[5948]: <root@pam> end task UPID:proxmox2:0000173D:000205DD:67A1D318:vzmigrate:101:root@pam: migration aborted
Feb 04 00:43:04 proxmox2 pve-ha-lrm[5948]: service ct:101 not moved (migration error)
And this shows up if I try to use replication to somehow trick it into going back:
Code:
101-0: got unexpected replication job error - zfs error: cannot open 'nvme_zfs/subvol-101-disk-0': dataset does not exist
I can't back up the container or boot it to copy files off it, because its root storage is inaccessible from the second node. I'm still baffled about how it managed to migrate given that, but here I am. What do I need to do to get it back onto the main node?