Corosync logging

HellHound

Hi all,

Playing with Proxmox, and it's been a blast.

Had a small question: is it normal for Corosync to log the following lines every 2 seconds, on all nodes?

Code:
Jan 15 02:18:09 kvm4 corosync[24830]: notice  [TOTEM ] A new membership (192.168.220.20:765924) was formed. Members
Jan 15 02:18:09 kvm4 corosync[24830]:  [TOTEM ] A new membership (192.168.220.20:765924) was formed. Members
Jan 15 02:18:09 kvm4 corosync[24830]: notice  [QUORUM] Members[2]: 1 3
Jan 15 02:18:09 kvm4 corosync[24830]: notice  [MAIN  ] Completed service synchronization, ready to provide service.
Jan 15 02:18:09 kvm4 corosync[24830]:  [QUORUM] Members[2]: 1 3
Jan 15 02:18:09 kvm4 corosync[24830]:  [MAIN  ] Completed service synchronization, ready to provide service.
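
For reference, these can be followed live with:
Code:
# follow the corosync unit's log output live
journalctl -u corosync -f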

Some details, the output of pvecm status:

Code:
Quorum information
------------------
Date:             Mon Jan 15 02:23:15 2018
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1/766492
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.220.20 (local)
0x00000003          1 192.168.220.24

Code:
kvm1# omping -c 10000 -i 0.001 -F -q kvm1-cs kvm4-cs
kvm4-cs :   unicast, xmt/rcv/%loss = 9366/9366/0%, min/avg/max/std-dev = 0.066/0.135/3.716/0.055
kvm4-cs : multicast, xmt/rcv/%loss = 9366/9366/0%, min/avg/max/std-dev = 0.076/0.126/3.676/0.053
Code:
kvm4# omping -c 10000 -i 0.001 -F -q kvm1-cs kvm4-cs
kvm1-cs :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.080/0.144/0.648/0.039
kvm1-cs : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.084/0.152/0.653/0.038

Version details (pveversion -v):

Code:
proxmox-ve: 5.1-32 (running kernel: 4.13.13-2-pve)
pve-manager: 5.1-41 (running version: 5.1-41/0b958203)
pve-kernel-4.13.13-2-pve: 4.13.13-32
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-19
qemu-server: 5.0-18
pve-firmware: 2.0-3
libpve-common-perl: 5.0-25
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-5
pve-container: 2.0-18
pve-firewall: 3.0-5
pve-ha-manager: 2.0-4
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-1
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
 
Had a small question: is it normal for Corosync to log those lines every 2 seconds, on all nodes?

No, it's not. It looks like an unstable cluster network.
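
A healthy cluster logs those messages only when the membership actually changes (a node joining or leaving). You can count how often the membership re-formed, for instance:
Code:
# count today's membership re-forms
journalctl -u corosync --since today | grep -c "A new membership"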
 
Hi Richard,

Thanks for the reply.

Unstable, in what way? Everything is up and running, and we can multicast-ping all the machines, etc.

Any way to resolve it without reinstalling all servers?
 
Unstable, in what way?

You're dropping multicast packets, although not a large amount in this particular test. Can you also run the second, longer-running omping test from https://pve.proxmox.com/pve-docs/chapter-pvecm.html#cluster-network-requirements?
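
That is, roughly (with your node names filled in):
Code:
# the ~10 minute variant: 600 packets at 1 per second
omping -c 600 -i 1 -q kvm1-cs kvm4-cs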

Any way to resolve it without reinstalling all servers?

Hard to tell until we find the real problem. Can you post the corosync config (/etc/pve/corosync.conf) and describe the cluster's network environment in greater detail? Thanks.
 
I've disabled IGMP snooping for the 'kvm' VLAN on the switch (a Juniper EX4200). That VLAN contains only the KVM servers, and ping and omping now work without any issues, as seen below.

Changing things on the router is harder, which is why I did it this way (without an IGMP querier on the router). I'm guessing that is the problem?

So an IGMP querier on the router, or on the switch (if using an RVI), is a requirement? If so, I'd try something like the sketch after my switch config below.

Below are the details.

Network: all Proxmox servers are connected to a separate VLAN on the switch (ports ge-0/0/30 and ge-0/0/31), and no other servers are on that VLAN. So I simply allowed unrestricted IGMP traffic by disabling igmp-snooping on the kvm VLAN.

I'm guessing each Proxmox server was receiving its own multicast messages back and ended up in a continuous loop, but I'm not sure; I still need to read up on corosync.

Juniper EX4200 VLAN/IGMP config:
Code:
interfaces {
[..snip..]
    ge-0/0/30 {                        
        unit 0 {                      
            family ethernet-switching {
                vlan {                
                    members kvm;      
                }                      
            }                          
        }                              
    }                                  
    ge-0/0/31 {                        
        unit 0 {                      
            family ethernet-switching {
                vlan {                
                    members kvm;      
                }                      
            }                          
        }                              
    }
[..snip..]
}
protocols {
    igmp;
    igmp-snooping {
        vlan all;
        vlan kvm {
            disable;
        }
    }
    rstp;
    lldp {
        interface all;
    }
    lldp-med {
        interface all;
    }
}
ethernet-switching-options {
    storm-control {                    
        interface all;
    }
}
vlans {
    kvm {
        vlan-id 220;
    }
    mgmt {
        vlan-id 100;
    }
}
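
If a querier on the switch is the way to go, I assume it would look something like this (untested sketch on my side; the RVI unit and its address are guesses):
Code:
set vlans kvm l3-interface vlan.220
set interfaces vlan unit 220 family inet address 192.168.220.1/24
set protocols igmp interface vlan.220
As far as I understand, the switch would then send the IGMP queries itself, so snooping could stay enabled on the kvm VLAN.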

omping for 10 minutes:
Code:
kvm1# omping -c 600 -i 1 -q kvm1-cs kvm4-cs                                                                   
kvm4-cs : waiting for response msg
kvm4-cs : joined (S,G) = (*, 232.43.211.234), pinging
kvm4-cs : given amount of query messages was sent

kvm4-cs :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.094/0.163/0.386/0.036
kvm4-cs : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.111/0.155/0.370/0.030


# omping -c 600 -i 1 -q kvm1-cs kvm4-cs
kvm1-cs : waiting for response msg
kvm1-cs : waiting for response msg
kvm1-cs : waiting for response msg
kvm1-cs : waiting for response msg
kvm1-cs : joined (S,G) = (*, 232.43.211.234), pinging
kvm1-cs : given amount of query messages was sent

kvm1-cs :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.131/0.193/0.345/0.032
kvm1-cs : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.139/0.202/0.355/0.033

corosync.conf:
Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: kvm1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.220.20
  }
  node {
    name: kvm4
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.220.24
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: MYKVM
  config_version: 4
  interface {
    bindnetaddr: 192.168.220.20
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
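
If it helps, I can also post the totem ring status from each node:
Code:
# print the status of the configured totem ring(s)
corosync-cfgtool -s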

Let me know if you need more details.
 
