When I started troubleshooting this issue, we had a 2-node cluster:
- proxmox01 (pve 3.2-4)
- proxmox02 (pve 3.3-1)
While I was away, I got a call from one of my techs onsite saying that proxmox02 was offline in the web UI and that all the machines running on it were unavailable. He did not try to SSH into proxmox02 at that point; he just hard reset it at the server. It booted back up, all the containers and VMs started, and all was well in the web UI. It ran fine for 2 days and then it happened again. This time he hard reset it right away and then called me.
I had a new node that was ready to go out and join the cluster at our colo datacenter. I had him grab it off the bench, install it in the rack, join it to the cluster, and migrate all of the containers and VMs that were on proxmox02 to the new node, named proxmox03. So at this point our cluster looks as follows:
- proxmox01 (pve 3.2-4)
- proxmox02 (pve 3.3-1)
- proxmox03 (pve 3.3-1)
Has anyone experienced anything like this before? Any suggestions on where to start? I had originally suspected a hardware issue when it happened to just the one server, but now, with both experiencing it at the same time, I am convinced that it is a configuration problem. I suspect something with quorum, but I am not sure where to start looking.
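To explain why quorum is on my mind, here is a rough sketch of the majority-vote rule as I understand it. It assumes every node carries exactly one vote, which I have not verified against our actual cluster configuration, and the function names are made up for illustration; it is not output from the cluster itself.

```python
def quorum_threshold(total_votes: int) -> int:
    """Votes needed for a strict majority of the cluster."""
    return total_votes // 2 + 1


def has_quorum(votes_online: int, total_votes: int) -> bool:
    """True if the nodes still online hold a strict majority."""
    return votes_online >= quorum_threshold(total_votes)


# Assumption: one vote per node, as in a default setup.
for total in (2, 3):
    for online in range(total, 0, -1):
        print(f"{total} nodes, {online} online -> quorate: {has_quorum(online, total)}")
```

If that assumption holds, a 2-node cluster loses quorum as soon as either node drops out, while a 3-node cluster should survive the loss of one node but not two.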
Any help would be greatly appreciated.