Beware: if you are already experiencing issues, the steps taken to diagnose the problem may make it worse in the short term!
You already have high latency to the node 10.123.1.188:
10.123.1.188 : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.072/0.132/7.539/0.145...
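To compare such summary lines across many nodes, they can be parsed instead of read by eye. The following is only a minimal sketch: it assumes the summary format shown above, and the names `parse_omping_summary` and `looks_high`, as well as the 5 ms limit on the maximum, are hypothetical choices for illustration, not values recommended by corosync or omping.

```python
import re

# Matches an omping summary line in the format quoted above.
# "[^,]*" tolerates any extra text between the loss figure and the comma.
SUMMARY_RE = re.compile(
    r"^(?P<host>\S+)\s*:\s*(?P<kind>\w+),\s*"
    r"xmt/rcv/%loss = (?P<xmt>\d+)/(?P<rcv>\d+)/(?P<loss>\d+)%[^,]*,\s*"
    r"min/avg/max/std-dev = "
    r"(?P<min>[\d.]+)/(?P<avg>[\d.]+)/(?P<max>[\d.]+)/(?P<std>\d+\.\d+|\d+)"
)


def parse_omping_summary(line):
    """Return a dict of the fields in one summary line, or None if no match."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "kind": match.group("kind"),
        "xmt": int(match.group("xmt")),
        "rcv": int(match.group("rcv")),
        "loss_pct": int(match.group("loss")),
        "min_ms": float(match.group("min")),
        "avg_ms": float(match.group("avg")),
        "max_ms": float(match.group("max")),
        "std_ms": float(match.group("std")),
    }


def looks_high(stats, max_limit_ms=5.0):
    """Flag packet loss or a maximum above the limit.

    The 5 ms default is an assumed example threshold, not an official one.
    """
    return stats["loss_pct"] > 0 or stats["max_ms"] > max_limit_ms


line = ("10.123.1.188 : unicast, xmt/rcv/%loss = 10000/10000/0%, "
        "min/avg/max/std-dev = 0.072/0.132/7.539/0.145")
stats = parse_omping_summary(line)
```

With the line quoted above, the average is low, but the maximum of 7.539 ms is what trips the assumed limit.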
You have to install omping on all the machines you want to test.
Then you have to run
omping -c 10000 -i 0.001 -F -q node1 node2 node3 nodeX
on all the listed nodes (node1, node2, node3, nodeX) at the same time; otherwise the nodes will not get any responses from each other.
In your case start it (best at the...
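Because the command has to start on every node at roughly the same moment, it can help to launch it from one place. This is just a rough sketch under stated assumptions: it assumes passwordless SSH as root to every node, and the helper names `build_omping_command` and `start_on_all` are made up for illustration. The omping flags are the ones from the command above.

```python
import subprocess


def build_omping_command(nodes, count=10000, interval=0.001):
    """Build the omping argument list used in the post for the given nodes."""
    if len(nodes) < 2:
        raise ValueError("omping needs at least two nodes to test")
    return ["omping", "-c", str(count), "-i", str(interval), "-F", "-q", *nodes]


def start_on_all(nodes, ssh_user="root"):
    """Start omping on every node via ssh (assumed to work without a password).

    All processes are started first and only then waited for, so the
    tests overlap in time. Returns a dict mapping node to its omping output.
    """
    cmd = build_omping_command(nodes)
    procs = [
        subprocess.Popen(
            ["ssh", f"{ssh_user}@{node}", *cmd],
            stdout=subprocess.PIPE,
            text=True,
        )
        for node in nodes
    ]
    return {node: proc.communicate()[0] for node, proc in zip(nodes, procs)}
```

Calling `start_on_all(["node1", "node2", "node3"])` would then return the quiet-mode summary per node. Keep in mind the warning that such a test adds load to a network that may already be struggling.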
I think a single link for all traffic is not the best way of doing it.
You can run into problems, and obviously now have, as the cluster becomes bigger. If all nodes talk to each other, the latency becomes too high and fencing starts.
Also, even if it worked with 14 nodes, in a situation with high network...
Corosync cares more about latency than bandwidth. As a rule, the corosync link should always be separated from other network traffic.
Especially as the cluster becomes bigger, links that are not separated can end up with latency that is too high for corosync. It is also dangerous to have only one link, because...
Hello agapitox,
could you describe your corosync network in more detail? Is there just one link for corosync, and is this link separated from other traffic? Are the other nodes on a different switch?
How are the switches connected to each other?
Hi ysyldur,
Can you tell us a little bit more about your corosync network connection (physical and virtual setup: interfaces, switches, connection speed, etc.)?
We had similar messages, like: corosync[29232]: [TOTEM ] Token has not been received in 380 ms
on a smaller cluster. Our...
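To see how often and how badly the token is delayed, messages like the one quoted above can be pulled out of a log. A small sketch, assuming the message format is exactly as quoted; the helper name `token_delays` is hypothetical, and where the log lines come from (journal, syslog file) is left open.

```python
import re

# Matches the corosync TOTEM message quoted in the post.
TOKEN_RE = re.compile(
    r"corosync\[(?P<pid>\d+)\]:\s*\[TOTEM\s*\]\s*"
    r"Token has not been received in (?P<ms>\d+) ms"
)


def token_delays(lines):
    """Return the reported delay in ms for every matching log line."""
    delays = []
    for line in lines:
        match = TOKEN_RE.search(line)
        if match:
            delays.append(int(match.group("ms")))
    return delays


sample = [
    "corosync[29232]: [TOTEM ] Token has not been received in 380 ms",
    "some unrelated log line",
]
delays = token_delays(sample)
```

Counting these values per hour, or looking at the largest one, gives a rough picture of whether the delays line up with periods of high network load.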