Serious problem with Proxmox cluster losing sync

oeginc

Member
Mar 21, 2009
I had 3 nodes up and running just fine. I added two new nodes, which seemed
to work just fine. I migrated some servers to the 4th and 5th nodes, and
everything ran fine... for about 2 hours. Then the 5th node started having
problems with its hard drive. I tried migrating the server, but it kept giving
me I/O errors. So I copied the backup to another node and tried to restore it,
but it keeps saying that the conf file already exists. I checked the path and
the file does NOT exist; I've checked several different nodes and didn't see
the conf file on any of them. It refuses to vzrestore, even with the force
option... I tried rebooting the node I was restoring to, and I am still getting
the same error. I have since tried removing the 5th node (the one with the
hard drive problems) from the cluster, but 1) it is still displayed in the
list of servers in the cluster, and 2) it didn't make any difference; I am
still getting errors restoring the backup.

HELP!?

I have no idea how to debug this with the funky corosync stuff...
 
All VM configuration files are stored on the Proxmox Cluster file system (pmxcfs), see http://pve.proxmox.com/wiki/Proxmox_Cluster_file_system_(pmxcfs)

If you have a valid backup, you should be able to restore it; just use another VMID for the tests.

To finally remove a node from the list shown in the web interface, remove the corresponding node directory - see /etc/pve/nodes/NODENAME/...
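
Since the restore complains about a conf file that cannot be found on the local node, it may help to search every node directory in pmxcfs for that VMID, including the directory of the node that was removed. Below is a minimal sketch for that. It assumes the configs sit at /etc/pve/nodes/<node>/<guest-type>/<vmid>.conf (for example under openvz or qemu-server); that layout is an assumption here, as the thread only mentions the per-node directory. The function name `find_vmid_configs` is made up for illustration.

```python
import sys
from pathlib import Path


def find_vmid_configs(vmid, pve_root="/etc/pve"):
    """Return every <vmid>.conf found under <pve_root>/nodes/<node>/<type>/.

    Assumed layout: one directory level for the node name and one for the
    guest type (e.g. openvz, qemu-server). Adjust the pattern if your
    installation differs.
    """
    nodes_dir = Path(pve_root) / "nodes"
    if not nodes_dir.is_dir():
        return []
    return sorted(nodes_dir.glob(f"*/*/{vmid}.conf"))


if __name__ == "__main__":
    # Usage: python find_vmid.py VMID [VMID ...]
    for vmid in sys.argv[1:]:
        for path in find_vmid_configs(vmid):
            print(path)
```

If this prints a path under the directory of the failed node, that leftover config is a likely reason the restore reports the VMID as already existing. Restoring to an unused VMID, as suggested above, sidesteps the conflict.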
 
