Marking ringid 0 interface FAULTY

itvietnam · Oct 28, 2018

Hi,

Our corosync often logs this message and recovers after 1 second. May I know how to debug it?

I tested omping for 5 minutes and it was OK. No packets were lost.

This is the test on ringid 1, which has the same problem as ringid 0.

Error of ringid 1 (10.20.30.0/24)

itvietnam · Oct 28, 2018

This is the test I performed on ringid 0, with the command history:

Stoiko Ivanov · Oct 29, 2018

Does omping with a high packet rate work?
omping -c 10000 -i 0.001 -F -q NODE1-IP NODE2-IP ... (from https://pve.proxmox.com/pve-docs/chapter-pvecm.html)

Otherwise, I would start recording the traffic on the ring with tcpdump and take a look at the dump with Wireshark.

itvietnam · Oct 29, 2018

Can you provide the commands? Then I will test on the server.

Thanks,

Stoiko Ivanov · Oct 29, 2018

The `omping` command invocations are explained in our documentation.

For the tcpdump, I'd do something like
`tcpdump -s0 -w corosyncproblem.pcap -ni $IFACE`, where `$IFACE` is replaced by the interface your corosync network is bound on.

After gathering enough data (until the log shows that the ring became faulty and OK again), stop tcpdump with Ctrl-C, then open the resulting corosyncproblem.pcap in Wireshark.
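
If the interface carries a lot of other traffic, it may help to restrict the capture to corosync packets. A minimal sketch, assuming the ring uses the default `mcastport` of 5405 (corosync also uses `mcastport - 1`); `corosync_filter` is a helper name made up for this example, not part of any tool:

```shell
#!/bin/sh
# Build a tcpdump capture filter for corosync totem traffic.
# Assumption: the ring uses the default mcastport 5405; corosync also uses
# mcastport - 1, so both UDP ports are included. Pass another port if
# your corosync.conf sets a different mcastport for this ring.
corosync_filter() {
    port="${1:-5405}"
    echo "udp port $((port - 1)) or udp port $port"
}

# Capture only corosync traffic on the ring interface (run as root):
#   tcpdump -s0 -w corosyncproblem.pcap -ni "$IFACE" $(corosync_filter)
#
# Read the capture back with absolute timestamps, to line packets up
# with the time of the FAULTY message in the log:
#   tcpdump -tttt -nr corosyncproblem.pcap $(corosync_filter)
```

Gaps in the packet flow around the time of the FAULTY log line would be the first thing to look for in the readback.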

VitoHoang · Nov 1, 2018

Here is our result of omping with a high packet rate: no packet loss, and the ring did not become faulty during the test. We have run this test many times.

We attach the tcpdump result.
https://drive.google.com/file/d/1S7gqhpTJ-ysUT42rqU-9Ynf-JQegi22O/view?usp=sharing

Stoiko Ivanov · Nov 6, 2018

You could try setting corosync to log debug messages and see whether the logs show anything more informative.
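
A rough sketch of how that could look, with the config change shown as comments and a small filter for pulling the ring state changes out of the log. `ring_events` is a made-up helper name, and the recovery pattern is an assumption about the log wording, so adjust it to what your log prints:

```shell
#!/bin/sh
# Debug logging is switched on in the logging section of corosync.conf:
#   logging {
#     debug: on
#   }
# (On Proxmox VE the cluster-wide file is /etc/pve/corosync.conf, and
# config_version in the totem section has to be increased on every edit.)

# Filter log text on stdin down to ring state changes.
# Assumption: the recovery line contains "recovered ring"; adjust the
# pattern to whatever your log actually prints.
ring_events() {
    grep -E 'Marking ringid [0-9]+ interface.*FAULTY|recovered ring'
}

# Usage (hypothetical time window):
#   journalctl -u corosync --since "1 hour ago" | ring_events
```

The timestamps of the matching lines can then be compared against the debug messages logged just before each FAULTY event.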
