I've been running Proxmox stably on a single machine for nearly a year. I decided to add two more machines to create a cluster and, eventually, HA.
Steps taken on original machine (metal-01):
- Datacenter -> Cluster -> Create Cluster

Steps taken on metal-02 and metal-03:
- Installed PVE 8.03 -> RAID-Z1 on 2x Micron 5400 960GB
- Minor NIC configurations for static IP and SAN networking
- Edited /etc/apt/sources.list to contain the pve free repo
- Commented out the enterprise repos from /etc/apt/sources.list.d/*
- Ran apt update + upgrade
- Datacenter -> Cluster -> Join Cluster
- Edited a few Storage items as they only exist on metal-01 but defaulted to All Nodes
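
For anyone reproducing the repository steps, here is a rough sketch of what they look like from a shell. It assumes PVE 8 on Debian bookworm and that the enterprise entries point at enterprise.proxmox.com; the helper name disable_enterprise_repos is made up for illustration and is not part of any Proxmox tooling.

```shell
# Assumed free-repo line for /etc/apt/sources.list (PVE 8 / bookworm):
#   deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription

# Hypothetical helper: comment out active "deb" lines that point at the
# enterprise host, in every *.list file under the given directory.
# Defaults to /etc/apt/sources.list.d when no directory is passed.
disable_enterprise_repos() {
    dir="${1:-/etc/apt/sources.list.d}"
    for f in "$dir"/*.list; do
        [ -e "$f" ] || continue   # the glob matched nothing
        # Only uncommented lines match, so re-running changes nothing.
        sed -i -E '/enterprise\.proxmox\.com/ s/^[[:space:]]*deb[[:space:]]/# &/' "$f"
    done
}

# Usage (as root), followed by the update from the steps above:
#   disable_enterprise_repos
#   apt update && apt full-upgrade
```

Passing a directory argument makes it easy to try the helper on a copy of the files before touching the real ones.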

Behavior that is repeatedly observed:
New nodes become unhealthy overnight and by the next day only metal-01 is still green. The nodes metal-02 and metal-03 are unresponsive via SSH as well; they are completely locked up. A physical restart brings the node back into the cluster.

I've done some googling and checked out logs with commands such as journalctl -b -u pve-cluster -u corosync but nothing is jumping out here.
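
One thing worth noting about that command: -b on its own only shows the current boot, i.e. the one that started after the physical restart. A possible variation that looks at the boot in which the node locked up is sketched below. It assumes the journal is persisted across reboots (for example, /var/log/journal exists), and the function name prev_boot_logs is just an illustrative wrapper.

```shell
# Hypothetical wrapper: show cluster-service and kernel logs for an earlier
# boot. The argument is a journalctl boot offset; -1 (the default) is the
# boot before the current one.
prev_boot_logs() {
    boot="${1:--1}"
    # Same units as the original command, but for the chosen boot.
    journalctl -b "$boot" -u pve-cluster -u corosync --no-pager
    # Kernel messages from that boot, in case the lockup is not cluster-related.
    journalctl -k -b "$boot" --no-pager
}

# Usage:
#   journalctl --list-boots   # see which boots the journal still has
#   prev_boot_logs            # previous boot
#   prev_boot_logs -2         # the boot before that
```

The last lines before the journal stops are usually the most interesting part, since the node was unresponsive from that point until the restart.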