Proxmox/Ceph cluster failed; nodes running, but VMs offline.

Jason King · Oct 30, 2016

I have a 3-node Proxmox/Ceph cluster at OVH. Everything had been running normally for the past month until this morning, when I found that all the VMs on this cluster had gone down. The HA VMs were down and the restart process seemed to be locked up. Recovery took only minor intervention: I logged in to the web GUI, stopped some of the restart tasks, and then restarted them, after which all the VMs came back online. They had been down for 5 hours, though. Now I need to find out what happened, but I'm not sure which logs to review. I know this isn't much to go on, but if someone could point me to the logs I should review to find the cause of the failure, I'd greatly appreciate the assistance.

dietmar · Oct 30, 2016

/var/log/syslog is a good start
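
As a rough illustration of how one might narrow down /var/log/syslog, here is a minimal Python sketch that keeps only lines mentioning cluster-related services. The keyword list (ceph, corosync, pve-ha, pmxcfs, watchdog) and the helper names `filter_lines` and `scan` are assumptions made for this example, not something stated in the thread; adjust them to whatever services run on your nodes.

```python
import re
from typing import Iterable, List, Sequence

# Assumed keywords: names of services one would expect on a Proxmox/Ceph node.
# This list is illustrative only; extend or trim it for your own setup.
KEYWORDS = ("ceph", "corosync", "pve-ha", "pmxcfs", "watchdog")


def filter_lines(lines: Iterable[str], keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the lines that mention any keyword (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan(path: str = "/var/log/syslog") -> List[str]:
    """Read a log file and return only the cluster-related lines."""
    # errors="replace" avoids a crash on stray non-UTF-8 bytes in the log.
    with open(path, errors="replace") as fh:
        return filter_lines(fh)
```

Calling `scan()` on each node and comparing the timestamps around the start of the outage should show which service complained first. Reading the log usually requires root, and older entries may have been rotated into /var/log/syslog.1 or compressed files.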
