Relative Proxmox and Linux newbie here. I was playing around with HA on a 3-node cluster and seem to have hosed things up for one of my VMs. When I simulated a node failure to check HA, the migration failed with a "TASK ERROR: volume 'SSD:110/vm-110-disk-0.qcow2' does not exist". I also got an error that the VM wouldn't start: "TASK ERROR: command 'ha-manager set vm:110 --state started' failed: exit code 255". No big deal, I was going to move the VM back to the original node and start investigating what I did wrong.
The problem now is that I cannot move the VM to either of the other two nodes, nor can I start it on the 3rd node. When moving it back to the original node it came from, I get:
2022-05-20 14:28:41 ssh: connect to host 192.168.1.30 port 22: Connection refused
2022-05-20 14:28:41 ERROR: migration aborted (duration 00:00:00): Can't connect to destination address using public key
TASK ERROR: migration aborted.
I had changed my SSH port earlier, and I'm not sure how to tell Proxmox about the new port so that I don't get this error.
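For reference, the only SSH change I'd made on that node was the Port line in /etc/ssh/sshd_config (the port number below is just an example, not my actual one):

```
# /etc/ssh/sshd_config -- custom port shown here is illustrative
Port 2222
```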
However, after changing the SSH port back to the default 22 on the 2nd node, I get this error when trying to migrate the VM there:
ERROR: migration aborted (duration 00:00:00): storage migration for 'SSD:110/vm-110-disk-0.qcow2' to storage '' failed - no storage ID specified
TASK ERROR: migration aborted.
So it's not the SSH problem, but something else. Being a newbie, I'm not sure how to fix this. "SSD" is a directory storage set up on each of the 3 nodes.
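In case it matters, the "SSD" entry in /etc/pve/storage.cfg looks roughly like this on each node (the path below is an example, not necessarily my exact one):

```
# /etc/pve/storage.cfg -- directory storage entry (path is illustrative)
dir: SSD
        path /mnt/ssd
        content images,rootdir
```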
When I simply try to start the VM on the node it got moved to during the HA failover, I get:
TASK ERROR: volume 'SSD:110/vm-110-disk-0.qcow2' does not exist
So it's a storage/disk issue, but where do I go from here? All three nodes are fully up-to-date on updates.
I appreciate the help!
--James