VM migration errors

royalj7

New Member
Aug 11, 2020
Relative Proxmox and Linux newbie here. I was playing around with HA on a 3-node cluster and seem to have hosed one of my VMs. When I simulated a node failure to test HA, the migration failed with "TASK ERROR: volume 'SSD:110/vm-110-disk-0.qcow2' does not exist". I also got an error saying the VM wouldn't start: "TASK ERROR: command 'ha-manager set vm:110 --state started' failed: exit code 255". No big deal, I thought: I would move the VM back to the original node and start investigating what I did wrong.

The problem now is that I cannot move the VM to either of the other two nodes, nor can I start it on the 3rd node. When moving it back to the node it originally came from, I get:
2022-05-20 14:28:41 ssh: connect to host 192.168.1.30 port 22: Connection refused
2022-05-20 14:28:41 ERROR: migration aborted (duration 00:00:00): Can't connect to destination address using public key
TASK ERROR: migration aborted.
I changed my SSH port, but I'm not sure how to adjust that in Proxmox so that I don't get this error.
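
The "Connection refused" line above means nothing accepted a TCP connection on port 22 of 192.168.1.30, which fits the changed SSH port. A small probe like the Python sketch below can confirm from the source node which port is actually reachable. `port_open` is a hypothetical helper name, and it only tests TCP reachability, not key-based login.

```python
import socket


def port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_open("192.168.1.30", 22)` and then the same call with the custom port would show which of the two the node is listening on.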

However, I changed the SSH port back to the default 22 on the 2nd node, and I get this error when trying to migrate the VM there:
ERROR: migration aborted (duration 00:00:00): storage migration for 'SSD:110/vm-110-disk-0.qcow2' to storage '' failed - no storage ID specified
TASK ERROR: migration aborted.
So it's not the SSH problem, but something else. Being a newbie, I'm not sure how to fix this. "SSD" is a directory storage set up on each of the 3 nodes.

When I just try to start the VM on the node it got moved to during the HA failure, I get:
TASK ERROR: volume 'SSD:110/vm-110-disk-0.qcow2' does not exist
So, a storage/disk issue, but where do I go from here? All three machines are fully updated.
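
One possible explanation, not confirmed in this thread: if "SSD" is a plain directory storage that is local to each node, the disk image exists only on the node where the VM was created. HA recovery moves the VM config to another node, but it cannot bring the disk along from a failed node. The Python sketch below checks whether a storage entry in `/etc/pve/storage.cfg` carries the `shared 1` flag. The sample config, including the `/mnt/ssd` path, is made up for illustration, and `parse_storage_cfg` and `is_marked_shared` are hypothetical helper names.

```python
# Hypothetical sample; the real file is /etc/pve/storage.cfg.
SAMPLE_CFG = """\
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

dir: SSD
        path /mnt/ssd
        content images,rootdir
"""


def parse_storage_cfg(text):
    """Parse storage.cfg text into {storage_id: {option: value}}."""
    storages = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not raw[0].isspace():
            # Section header, e.g. "dir: SSD"
            stype, _, sid = line.partition(":")
            current = {"type": stype.strip()}
            storages[sid.strip()] = current
        elif current is not None:
            # Indented option line, e.g. "shared 1"
            parts = line.split(None, 1)
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return storages


def is_marked_shared(storages, storage_id):
    """True only if the entry has an explicit 'shared 1' flag.

    Simplification: network storage types may count as shared without
    this flag; this check is aimed at 'dir' entries such as SSD.
    """
    return storages[storage_id].get("shared", "0") == "1"
```

With the sample above, `is_marked_shared(parse_storage_cfg(SAMPLE_CFG), "SSD")` is false, so the storage would be treated as local to each node.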

I appreciate the help!
--James
 
I changed the SSH port on the 1st node back to 22, and I get the same "no storage ID specified" error as I got on node 2. Doing some research on that error, I came across this forum thread. However, I have already updated all my nodes and I'm on pve-container version 4.2-1, which is newer than the version required to fix the "no storage ID specified" error in that instance. So I can't move the VM to either of the other cluster nodes, and I can't start it on the node it's shown on in Proxmox's web GUI. Does anyone have any ideas on a fix? This was my DNS/ad-blocking VM. Luckily, I have a backup DNS running on an RPi4, but I'd like to get this solved just in case.
 
Does anyone have any ideas on how to fix or diagnose this? My google-fu doesn't seem to be up to the task...

Thanks!
 
Please post:
- pveversion -v
- /etc/pve/storage.cfg
- VM config
- full task logs of the failed tasks
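
The first three items can be gathered in one go on the node where VM 110 currently sits. Below is a minimal Python sketch. It assumes `qm config <vmid>` is an acceptable way to show the VM config, and `diagnostic_commands` and `collect` are hypothetical helper names. The task logs are not covered here and can be copied from the task viewer in the web GUI.

```python
import shutil
import subprocess


def diagnostic_commands(vmid):
    """Commands matching the first three requested items."""
    return [
        ["pveversion", "-v"],
        ["cat", "/etc/pve/storage.cfg"],
        ["qm", "config", str(vmid)],
    ]


def collect(vmid):
    """Run each command and return one text blob suitable for pasting."""
    chunks = []
    for argv in diagnostic_commands(vmid):
        header = "$ " + " ".join(argv)
        if shutil.which(argv[0]) is None:
            # Not a Proxmox node, or the tool is not on PATH.
            chunks.append(header + "\n(command not found on this host)")
            continue
        result = subprocess.run(argv, capture_output=True, text=True)
        chunks.append(header + "\n" + result.stdout + result.stderr)
    return "\n\n".join(chunks)
```

Running `print(collect(110))` as root on the node would print all three outputs with the command shown above each one.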
 
