Hello all,
[Background] We recently set up our first Proxmox cluster with four new HP ProLiant DL360 Gen9 servers and an HP-2530-48G (J9775A) switch. The servers' NICs are configured to form LACP bonds to the switch, and we have set up CEPH and HA in Proxmox.
[Issue 1] We were getting random reboots of individual nodes or the whole cluster, and put a band-aid on it by creating a job that updates the time/date from a public NTP server every minute. Clearly this isn't ideal, so we were hoping for some advice on how to properly fix this issue.
[Issue 2] The second issue is that when we try to add a fifth server (on a second switch) to the cluster and CEPH pool, nodes spontaneously reboot.
Any leads would be greatly appreciated.