My cluster has 33 nodes running Ceph. The cluster reboots randomly after any of the following operations:
1. systemctl restart corosync
2. add a new node to the cluster
3. reboot one of the nodes
How can I stop the servers from rebooting automatically? This is a production environment, and I really have no idea what to do.
What I have done so far:
1. Removed all HA groups, since I heard that HA can reboot servers. The HA service is still running.
2. Set the corosync totem token to 10000 ms, but it seems to have no effect!
What I want is no more reboots. I can accept having no HA, but the servers must not reboot automatically!
My servers are: HP DL380 Gen8 and Gen9, IBM x3650 M4, with a 10G SFP+ network.
Ceph: 269 OSDs, over 100 TB of data.
Switch: Cisco 4506; every server is connected through a single trunk port.
I also have another vSAN cluster on the same switches, also with a single 10G SFP+ network and the same server hardware. The network is stable, I am sure!
pvecm status
root@g8kvm04:~# pvecm status
Cluster information
-------------------
Name:             AW-G8-KVM
Config Version:   40
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Wed Aug 19 21:42:12 2020
Quorum provider:  corosync_votequorum
Nodes:            33
Node ID:          0x00000002
Ring ID:          1.1f38
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   33
Highest expected: 33
Total votes:      33
Quorum:           17
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.0.141.1
0x00000002          1 10.0.141.2 (local)
0x00000003          1 10.0.141.5
0x00000004          1 10.0.141.6
0x00000005          1 10.0.141.3
0x00000006          1 10.0.141.4
0x00000007          1 10.0.141.7
0x00000008          1 10.0.141.21
0x00000009          1 10.0.141.22
0x0000000a          1 10.0.141.23
0x0000000b          1 10.0.141.24
0x0000000c          1 10.0.141.8
0x0000000d          1 10.0.141.25
0x0000000e          1 10.0.141.26
0x0000000f          1 10.0.141.31
0x00000010          1 10.0.141.9
0x00000011          1 10.0.141.10
0x00000012          1 10.0.141.27
0x00000013          1 10.0.141.28
0x00000014          1 10.0.141.29
0x00000015          1 10.0.141.16
0x00000016          1 10.0.141.18
0x00000017          1 10.0.141.20
0x00000018          1 10.0.141.17
0x00000019          1 10.0.141.19
0x0000001a          1 10.0.141.15
0x0000001b          1 10.0.141.14
0x0000001c          1 10.0.141.32
0x0000001d          1 10.0.141.13
0x0000001e          1 10.0.141.30
0x0000001f          1 10.0.141.33
0x00000020          1 10.0.141.11
0x00000021          1 10.0.141.12