Help needed - failed node

xlnt

Member
Nov 17, 2011
Sweden
Hi
After having to reboot my 2-node cluster yesterday, I ran into a problem where one of the nodes (pm02) won't boot any more. It gets stuck with "Kernel panic - not syncing".
As a result, all the VMs that were running on pm02 are now inaccessible. How do I start or recover the VMs from the failed node? The image files are on shared storage. And later on, how do I remove and replace the pm02 node within the cluster?

[Attachment: pm_error.jpg]
 

If all VM disks are on shared storage, you can simply move the config files to the other node:

# mv /etc/pve/nodes/<OLDNODE>/qemu-server/<VMID>.conf /etc/pve/nodes/<NEWNODE>/qemu-server

Or you simply replace the damaged hardware, and re-add the new node using the same hostname (use --force flag for add).
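
As a minimal sketch of the first option, assuming several VMs have to be moved at once: the loop below moves every VM config from the failed node's directory into the surviving node's directory. The function name `move_vm_configs` and its optional third argument (a base directory, defaulting to `/etc/pve/nodes`) are made up for illustration and are not part of Proxmox; only the `mv` of each `<VMID>.conf` comes from the reply above.

```shell
#!/bin/sh
# Hypothetical helper (not a Proxmox tool): move all VM configs from a failed
# node's directory to a surviving node's directory.
# Usage: move_vm_configs OLDNODE NEWNODE [BASEDIR]
move_vm_configs() {
    old_node=$1
    new_node=$2
    base_dir=${3:-/etc/pve/nodes}   # default taken from the paths in this thread

    src="$base_dir/$old_node/qemu-server"
    dst="$base_dir/$new_node/qemu-server"

    # Refuse to run if either directory is missing.
    [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
    [ -d "$dst" ] || { echo "missing target: $dst" >&2; return 1; }

    for conf in "$src"/*.conf; do
        # If the glob matched nothing, the pattern itself is returned; skip it.
        [ -e "$conf" ] || continue
        name=$(basename "$conf")
        # Never overwrite a config that already exists on the target node.
        if [ -e "$dst/$name" ]; then
            echo "skipping $name: already present on $new_node" >&2
            continue
        fi
        mv "$conf" "$dst/" || return 1
        echo "moved $name to $new_node"
    done
}
```

With the node names from this thread, the call would be `move_vm_configs pm02 pm01`. It should only be run while the failed node is really powered off, so that no VM can end up started on both nodes against the same shared disk image.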
 
Thanks, dietmar, for the quick reply. I'll try this later today.
If I later want to replace pm01, which is the first node in the cluster, would it be just as simple: shut it down, install a new node with the same hostname, and re-add it using --force?
 

Hi dietmar
Even though I'm running this as root, I get "Permission denied" when trying to move those files.
root@pm01:~# mv /etc/pve/nodes/pm02/qemu-server/100.conf /etc/pve/nodes/pm01/qemu-server/
mv: cannot move `/etc/pve/nodes/pm02/qemu-server/100.conf' to `/etc/pve/nodes/pm01/qemu-server/100.conf': Permission denied

What am I missing here?
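
One possible cause, offered as an assumption and not confirmed anywhere in this thread: the cluster filesystem mounted at `/etc/pve` turns read-only when the cluster loses quorum, which is the situation in a two-node cluster with one node down. The sketch below probes whether a directory accepts writes before any move is attempted. The function name `dir_is_writable` is hypothetical. `pvecm status` and `pvecm expected 1` are the commands commonly used to inspect quorum and lower the expected vote count; they appear only as comments because the second one changes cluster state.

```shell
#!/bin/sh
# Hypothetical check (not a Proxmox tool): report whether a directory accepts
# writes by creating and removing a scratch file in it.
# Usage: dir_is_writable DIR
dir_is_writable() {
    probe="$1/.write-probe.$$"
    # The subshell keeps a failed redirection from aborting the calling shell.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Example use on the surviving node. Left commented out because the last
# command changes cluster state, and because lost quorum is only an assumed
# cause of the "Permission denied" error:
#
#   if ! dir_is_writable /etc/pve/nodes/pm01/qemu-server; then
#       pvecm status        # inspect quorum and vote counts
#       pvecm expected 1    # let the single remaining node become quorate
#   fi
```

If the probe succeeds, the `mv` from the earlier reply should work as well. If it fails as root, the problem lies with the filesystem state and not with ordinary file permissions.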
 
Thanks for the excellent support. With these instructions I've now been able to start all the critical VMs on the working node. I have been running Proxmox in a small test environment for a while, and after this I feel confident enough to migrate my other virtual production environment to Proxmox. ;)
 
