Corosync how to fix

gerdnl

Member
Oct 21, 2013
Hi,

How can I fix this? I run a 3-node cluster and all nodes are on gigabit ports.

root@pve01:/var/lib/vz/dump# ethtool eth1
Settings for eth1:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supported pause frame use: No
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
MDI-X: on
Supports Wake-on: pumbg
Wake-on: g
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
root@pve01:/var/lib/vz/dump#



Mar 2 19:41:34 pve03 pvedaemon[3777]: worker 524838 finished
Mar 2 19:41:34 pve03 pvedaemon[3777]: starting 1 worker(s)
Mar 2 19:41:34 pve03 pvedaemon[3777]: worker 525960 started
Mar 2 19:41:41 pve03 corosync[3452]: [TOTEM ] Retransmit List: 3483ed
Mar 2 19:42:01 pve03 corosync[3452]: [TOTEM ] Retransmit List: 348429 34842a
Mar 2 19:42:16 pve03 pvedaemon[3777]: worker 525140 finished
Mar 2 19:42:16 pve03 pvedaemon[3777]: starting 1 worker(s)
Mar 2 19:42:16 pve03 pvedaemon[3777]: worker 526019 started
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:21 pve03 corosync[3452]: [TOTEM ] Retransmit List: 34846e
Mar 2 19:42:31 pve03 corosync[3452]: [TOTEM ] Retransmit List: 348491
Mar 2 19:42:39 pve03 corosync[3452]: [TOTEM ] Retransmit List: 3484a8
Mar 2 19:42:49 pve03 corosync[3452]: [TOTEM ] Retransmit List: 3484c9
Mar 2 19:42:49 pve03 corosync[3452]: [TOTEM ] Retransmit List: 3484c9
 
Some more info about my cluster:




NODE 1
--------

interfaces

auto lo
iface lo inet loopback


auto eth0
iface eth0 inet manual


auto eth1
iface eth1 inet manual


auto vmbr159
iface vmbr159 inet manual
bridge_ports eth0
bridge_stp off
bridge_fd 0


auto vmbr907
iface vmbr907 inet static
address 10.90.7.10
gateway 10.90.7.1
netmask 255.255.255.0
bridge_ports eth1
bridge_stp off
bridge_fd 0
dns-nameservers 8.8.8.8

cluster.conf


<?xml version="1.0"?>
<cluster name="nbhosting" config_version="4">


<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>


<clusternodes>
<clusternode name="pve01" votes="1" nodeid="1"/>
<clusternode name="pve02" votes="1" nodeid="2"/>
<clusternode name="pve03" votes="1" nodeid="3"/></clusternodes>
</cluster>



NODE 2
---------

interfaces


auto lo
iface lo inet loopback


auto eth0
iface eth0 inet manual


auto eth1
iface eth1 inet manual


auto vmbr159
iface vmbr159 inet manual
bridge_ports eth0
bridge_stp off
bridge_fd 0


auto vmbr907
iface vmbr907 inet static
address 10.90.7.11
netmask 255.255.255.0
gateway 10.90.7.1
bridge_ports eth1
bridge_stp off
bridge_fd 0

cluster.conf


<?xml version="1.0"?>
<cluster name="nbhosting" config_version="4">


<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>


<clusternodes>
<clusternode name="pve01" votes="1" nodeid="1"/>
<clusternode name="pve02" votes="1" nodeid="2"/>
<clusternode name="pve03" votes="1" nodeid="3"/></clusternodes>
</cluster>



NODE 3

interfaces


auto lo
iface lo inet loopback


auto eth1
iface eth1 inet manual


auto eth0
iface eth0 inet manual


auto vmbr159
iface vmbr159 inet manual
bridge_ports eth0
bridge_stp off
bridge_fd 0


auto vmbr907
iface vmbr907 inet static
address 10.90.7.12
netmask 255.255.255.0
gateway 10.90.7.1
bridge_ports eth1
bridge_stp off
bridge_fd 0

cluster.conf


<?xml version="1.0"?>
<cluster name="nbhosting" config_version="4">


<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>


<clusternodes>
<clusternode name="pve01" votes="1" nodeid="1"/>
<clusternode name="pve02" votes="1" nodeid="2"/>
<clusternode name="pve03" votes="1" nodeid="3"/></clusternodes>
</cluster>
 
dmesg on the nodes:


NODE 3
---------
fwln126i0: no IPv6 routers present
fwbr104i0: port 2(tap104i0) entering disabled state
fwbr104i0: port 1(fwln104i0) entering disabled state
device fwln104i0 left promiscuous mode
fwbr104i0: port 1(fwln104i0) entering disabled state
vmbr159: port 4(fwpr104p0) entering disabled state
device fwpr104p0 left promiscuous mode
vmbr159: port 4(fwpr104p0) entering disabled state
fwbr140i0: port 2(tap140i0) entering disabled state
fwbr140i0: port 1(fwln140i0) entering disabled state
device fwln140i0 left promiscuous mode
fwbr140i0: port 1(fwln140i0) entering disabled state
vmbr159: port 14(fwpr140p0) entering disabled state
device fwpr140p0 left promiscuous mode
vmbr159: port 14(fwpr140p0) entering disabled state
device tap140i0 entered promiscuous mode
ADDRCONF(NETDEV_UP): fwln140i0: link is not ready
ADDRCONF(NETDEV_CHANGE): fwln140i0: link becomes ready
device fwln140i0 entered promiscuous mode
fwbr140i0: port 1(fwln140i0) entering forwarding state
device fwpr140p0 entered promiscuous mode
vmbr159: port 4(fwpr140p0) entering forwarding state
fwbr140i0: port 2(tap140i0) entering forwarding state
fwpr140p0: no IPv6 routers present
fwbr140i0: no IPv6 routers present
tap140i0: no IPv6 routers present
fwln140i0: no IPv6 routers present
fwbr126i0: port 2(tap126i0) entering disabled state
fwbr126i0: port 1(fwln126i0) entering disabled state
device fwln126i0 left promiscuous mode
fwbr126i0: port 1(fwln126i0) entering disabled state
vmbr159: port 5(fwpr126p0) entering disabled state
device fwpr126p0 left promiscuous mode
vmbr159: port 5(fwpr126p0) entering disabled state
device tap126i0 entered promiscuous mode
ADDRCONF(NETDEV_UP): fwln126i0: link is not ready
ADDRCONF(NETDEV_CHANGE): fwln126i0: link becomes ready
device fwln126i0 entered promiscuous mode
fwbr126i0: port 1(fwln126i0) entering forwarding state
device fwpr126p0 entered promiscuous mode
vmbr159: port 5(fwpr126p0) entering forwarding state
fwbr126i0: port 2(tap126i0) entering forwarding state
fwpr126p0: no IPv6 routers present
tap126i0: no IPv6 routers present
fwbr126i0: no IPv6 routers present
fwln126i0: no IPv6 routers present
fwbr126i0: port 2(tap126i0) entering disabled state
fwbr126i0: port 1(fwln126i0) entering disabled state
device fwln126i0 left promiscuous mode
fwbr126i0: port 1(fwln126i0) entering disabled state
vmbr159: port 5(fwpr126p0) entering disabled state
device fwpr126p0 left promiscuous mode
vmbr159: port 5(fwpr126p0) entering disabled state
device tap126i0 entered promiscuous mode
ADDRCONF(NETDEV_UP): fwln126i0: link is not ready
ADDRCONF(NETDEV_CHANGE): fwln126i0: link becomes ready
device fwln126i0 entered promiscuous mode
fwbr126i0: port 1(fwln126i0) entering forwarding state
device fwpr126p0 entered promiscuous mode
vmbr159: port 5(fwpr126p0) entering forwarding state
fwbr126i0: port 2(tap126i0) entering forwarding state
fwbr126i0: no IPv6 routers present
fwpr126p0: no IPv6 routers present
fwln126i0: no IPv6 routers present
tap126i0: no IPv6 routers present


NODE 2
---------


fwbr103i0: no IPv6 routers present
fwln103i0: no IPv6 routers present
device tap110i0 entered promiscuous mode
ADDRCONF(NETDEV_UP): fwln110i0: link is not ready
ADDRCONF(NETDEV_CHANGE): fwln110i0: link becomes ready
device fwln110i0 entered promiscuous mode
fwbr110i0: port 1(fwln110i0) entering forwarding state
device fwpr110p0 entered promiscuous mode
vmbr159: port 6(fwpr110p0) entering forwarding state
fwbr110i0: port 2(tap110i0) entering forwarding state
fwpr103p0: no IPv6 routers present
fwln105i0: no IPv6 routers present
tap105i0: no IPv6 routers present
fwpr105p0: no IPv6 routers present
fwbr105i0: no IPv6 routers present
tap105i1: no IPv6 routers present
fwpr107p0: no IPv6 routers present
tap107i0: no IPv6 routers present
fwbr107i0: no IPv6 routers present
fwln107i0: no IPv6 routers present
fwbr110i0: no IPv6 routers present
fwln110i0: no IPv6 routers present
tap110i0: no IPv6 routers present
fwpr110p0: no IPv6 routers present
kvm: emulating exchange as write
usb 2-2: USB disconnect, device number 2
dlm: closing connection to node 3
md: data-check of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 522944k.
md: delaying data-check of md1 until md0 has finished (they share one or more physical units)
md: md0: data-check done.
md: data-check of RAID array md1
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 976106304k.
md: md1: data-check done.
root@pve02:~#


NODE 1
--------


fwbr109i1: port 1(fwln109i1) entering forwarding state
device fwpr109p1 entered promiscuous mode
vmbr907: port 2(fwpr109p1) entering forwarding state
fwbr109i1: port 2(tap109i1) entering forwarding state
fwln109i0: no IPv6 routers present
fwbr109i0: no IPv6 routers present
fwpr109p0: no IPv6 routers present
fwpr109p1: no IPv6 routers present
tap109i0: no IPv6 routers present
fwln109i1: no IPv6 routers present
tap109i1: no IPv6 routers present
fwbr109i1: no IPv6 routers present
dlm: closing connection to node 2
dlm: closing connection to node 3
hrtimer: interrupt took 5898 ns
device tap106i0 entered promiscuous mode
ADDRCONF(NETDEV_UP): fwln106i0: link is not ready
ADDRCONF(NETDEV_CHANGE): fwln106i0: link becomes ready
device fwln106i0 entered promiscuous mode
fwbr106i0: port 1(fwln106i0) entering forwarding state
device fwpr106p0 entered promiscuous mode
vmbr159: port 3(fwpr106p0) entering forwarding state
fwbr106i0: port 2(tap106i0) entering forwarding state
tap106i0: no IPv6 routers present
fwpr106p0: no IPv6 routers present
fwbr106i0: no IPv6 routers present
fwln106i0: no IPv6 routers present
md: data-check of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 522944k.
md: delaying data-check of md1 until md0 has finished (they share one or more physical units)
md: md0: data-check done.
md: data-check of RAID array md1
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 976106304k.
md: md1: data-check done.
 
How can I fix this? I run a 3-node cluster and all nodes are on gigabit ports.

The problem, I suppose, is that the cluster does not work.

Check the following

Code:
pvecm node
pvecm status
clustat
cat /etc/hosts
cat /etc/hostname
 
root@pve01:~# pvecm node
Node Sts Inc Joined Name
1 M 408 2015-01-30 20:30:45 pve01
2 M 512 2015-02-06 03:15:09 pve02
3 M 520 2015-02-19 07:47:39 pve03
root@pve01:~#



root@pve01:~# pvecm status
Version: 6.2.0
Config Version: 4
Cluster Name: nbhosting
Cluster Id: 54423
Cluster Member: Yes
Cluster Generation: 520
Membership state: Cluster-Member
Nodes: 3
Expected votes: 3
Total votes: 3
Node votes: 1
Quorum: 2
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: pve01
Node ID: 1
Multicast addresses: 239.192.212.108   ??? this multicast address looks like an old IP from the previous network
Node addresses: 10.90.7.10
root@pve01:~#



root@pve01:~# clustat
Cluster Status for nbhosting @ Fri Mar 6 12:40:37 2015
Member Status: Quorate


Member Name ID Status
------ ---- ---- ------
pve01 1 Online, Local
pve02 2 Online
pve03 3 Online



root@pve01:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
10.90.7.10 pve01
10.90.7.11 pve02
10.90.7.12 pve03


# The following lines are desirable for IPv6 capable hosts


::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts



root@pve01:~# cat /etc/hostname
pve01
 
Everything is fine - what problem do you really have?

If it's just about these [TOTEM] Retransmit List messages: they appear because you have network interruptions and/or multicast does not work perfectly. The problems are not too severe, but nevertheless check your network!
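A quick way to verify multicast between the nodes is omping; it only answers if it is started on all three nodes at roughly the same time, otherwise you will only see "waiting for response msg". A minimal test, reusing the multicast address from pvecm status:

Code:
# start this on pve01, pve02 AND pve03 at (roughly) the same time,
# let it run for a minute, then stop it with Ctrl-C and compare the loss figures
omping -m 239.192.212.108 pve01 pve02 pve03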
 
root@pve01:~# omping -m 239.192.212.108 pve01 pve02 pve03
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
pve02 : waiting for response msg
pve03 : waiting for response msg
^C
pve02 : response message never received
pve03 : response message never received
root@pve01:~#
 
root@pve01:~# asmping 239.192.212.108 pve02
asmping joined (S,G) = (*,239.192.212.234)
pinging 10.90.7.11 from 10.90.7.10
recv failed: Connection refused
errno=111
recv failed: Connection refused
errno=111
recv failed: Connection refused
errno=111
recv failed: Connection refused
errno=111
 
Just found this:
http://www.ehow.com/how_7479325_enable-multicast-hp-switch.html

Code:
Type "vlan 1 ip igmp" to enable IGMP Snooping on the default VLAN 1 of the switch. Repeat the command, replacing "1" with another VLAN number, for each VLAN configured on the switch that must support multicast traffic.


type "vlan 1 ip igmp querier" to enable IGMP Querier on the default VLAN 1 of the switch. Only enable IGMP Querier if the switch is not connected to a router or switch that has multicast routing enabled. Repeat the command, replacing "1" with another VLAN number, for each VLAN configured on the switch that must support multicast traffic and is not connected to a router or switch with multicast routing enabled.
 
I saw that one. Would that give any issues? I have only the default VLAN, and it is connected to my ISP router with 2 uplinks.
I couldn't find how to do it in the web management, though.