HA-Computing Nodes Reboots during File copy

kev1904

Well-Known Member
Feb 11, 2019
61
4
48
33
We run a cluster with with 5 Computing nodes and 6 Storage nodes.
After we start a Copy from the backup node to a nfs mounted NAS, all computing nodes reboots at same time. The Traffic to the NAS goes over different Network Interfaces as the management from the Nodes (quorum, corsyn).
The NAS is mounted at every Node nad get laggy during the Copy but that does not sould reboot the HA Computing Nodes.
We get the Message "The node 'XXXXXX' failed and needs manual intervention."

{
"manager_status" : {
"master_node" : "proxstore01",
"node_status" : {
"prox01" : "unknown",
"prox02" : "unknown",
"prox03" : "unknown",
"prox04" : "unknown",
"prox06" : "unknown",
"proxback01" : "online",
"proxstore01" : "online",
"proxstore02" : "online",
"proxstore11" : "online",
"proxstore12" : "online",
"proxstore13" : "online"
},
 
Are you certain that your corosync and storage traffic is physically separated (no VLAN)?