All nodes restart automatically at the same time

zhaohl

At 14:29 and 14:46, all nodes restarted automatically at the same time:
last | grep -i boot
reboot system boot 5.3.10-1-pve Mon Aug 30 14:46 still running
reboot system boot 5.3.10-1-pve Mon Aug 30 14:29 still running


Aug 30 14:23:22 node1 pmxcfs[2262]: [status] notice: received all states
Aug 30 14:23:22 node1 pmxcfs[2262]: [status] notice: all data is up to date
Aug 30 14:23:22 node1 pmxcfs[2262]: [status] notice: dfsm_deliver_queue: queue length 31
Aug 30 14:23:23 node1 pve-ha-lrm[2615]: successfully acquired lock 'ha_agent_node1_lock'
Aug 30 14:23:23 node1 pve-ha-lrm[2615]: status change lost_agent_lock => active
Aug 30 14:23:31 node1 pve-ha-crm[2606]: status change wait_for_quorum => slave
Aug 30 14:24:01 node1 systemd[1]: Starting Proxmox VE replication runner...
Aug 30 14:24:01 node1 systemd[1]: pvesr.service: Succeeded.
Aug 30 14:24:01 node1 systemd[1]: Started Proxmox VE replication runner.
Aug 30 14:25:01 node1 systemd[1]: Starting Proxmox VE replication runner...
Aug 30 14:25:01 node1 systemd[1]: pvesr.service: Succeeded.
Aug 30 14:25:01 node1 systemd[1]: Started Proxmox VE replication runner.
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 5 link: 0 is down
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 4 link: 0 is down
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 3 link: 0 is down
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 2 link: 0 is down
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 5 has no active links
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 4 has no active links
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 3 has no active links
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 2 has no active links
Aug 30 14:25:58 node1 corosync[2384]: [TOTEM ] Token has not been received in 2212 ms
Aug 30 14:25:59 node1 corosync[2384]: [TOTEM ] A processor failed, forming new configuration.
Aug 30 14:26:01 node1 systemd[1]: Starting Proxmox VE replication runner...
Aug 30 14:26:03 node1 corosync[2384]: [TOTEM ] A new membership (1.9b8) was formed. Members left: 2 3 4 5
Aug 30 14:26:03 node1 corosync[2384]: [TOTEM ] Failed to receive the leave message. failed: 2 3 4 5
Aug 30 14:26:03 node1 corosync[2384]: [CPG ] downlist left_list: 4 received
Aug 30 14:26:03 node1 pmxcfs[2262]: [dcdb] notice: members: 1/2262
Aug 30 14:26:03 node1 corosync[2384]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Aug 30 14:26:03 node1 corosync[2384]: [QUORUM] Members[1]: 1
Aug 30 14:26:03 node1 pmxcfs[2262]: [status] notice: node lost quorum
Aug 30 14:26:03 node1 corosync[2384]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 30 14:26:03 node1 pmxcfs[2262]: [status] notice: members: 1/2262
Aug 30 14:26:03 node1 pmxcfs[2262]: [dcdb] crit: received write while not quorate - trigger resync
Aug 30 14:26:03 node1 pmxcfs[2262]: [dcdb] crit: leaving CPG group
 

Attachments

  • kern.log (601.1 KB)
  • daemon.txt (521.4 KB)
  • syslog.txt (520.5 KB)
  • debug.txt (31.6 KB)
It seems your network was down:

Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 5 link: 0 is down
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 4 link: 0 is down
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 3 link: 0 is down
Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 2 link: 0 is down

Check your switch and network hardware.
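
To confirm that every peer link dropped in the same second (which points at something shared, such as a switch, rather than a single host), the KNET lines can be grouped by timestamp. This is only a rough sketch: the regular expression is written against the log format quoted above, and the helper name `link_down_by_second` is made up for illustration, not part of any Proxmox or corosync tool.

```python
import re
from collections import defaultdict

# Matches lines such as:
# Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 5 link: 0 is down
LINK_DOWN = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+corosync\[\d+\]:\s+"
    r"\[KNET\s*\]\s+link: host: (?P<host>\d+) link: (?P<link>\d+) is down"
)


def link_down_by_second(lines):
    """Map each timestamp to the sorted peer host IDs whose link went down then."""
    events = defaultdict(set)
    for line in lines:
        match = LINK_DOWN.match(line.strip())
        if match:
            events[match.group("ts")].add(int(match.group("host")))
    return {ts: sorted(hosts) for ts, hosts in events.items()}


# Sample taken from the syslog excerpt in this thread.
sample = [
    "Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 5 link: 0 is down",
    "Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 4 link: 0 is down",
    "Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 3 link: 0 is down",
    "Aug 30 14:25:57 node1 corosync[2384]: [KNET ] link: host: 2 link: 0 is down",
    "Aug 30 14:25:57 node1 corosync[2384]: [KNET ] host: host: 5 has no active links",
]

print(link_down_by_second(sample))
```

If the same check on the other nodes' logs shows all peers vanishing at the same moment, the shared network path is the first thing to inspect.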

If you have HA enabled (I guess you did), the nodes that are not in a quorate partition will fence themselves, which would explain all of them rebooting at the same time.
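
The quorum arithmetic behind this can be sketched as follows. It assumes a five-node cluster with one vote per node (node IDs 1 to 5 appear in the log, but the vote configuration is not shown in the thread), and the function names are hypothetical helpers, not a Proxmox API.

```python
def votes_needed(total_votes: int) -> int:
    """Strict majority of the expected votes."""
    return total_votes // 2 + 1


def is_quorate(visible_votes: int, total_votes: int) -> bool:
    """A partition is quorate when it can see a strict majority of votes."""
    return visible_votes >= votes_needed(total_votes)


# Assumption: five nodes, one vote each.
TOTAL_VOTES = 5

print(votes_needed(TOTAL_VOTES))   # majority threshold for five votes
print(is_quorate(1, TOTAL_VOTES))  # node1 alone, as in "Members[1]: 1"
print(is_quorate(3, TOTAL_VOTES))  # three nodes that can still see each other
```

With the cluster network down, each node only sees its own vote, so under these assumptions no partition reaches the majority and every node with active HA ends up in the non-quorate case.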
 
