Corosync alerts and errors (unicast)

Discussion in 'Proxmox VE: Installation and configuration' started by Cameron L, Apr 20, 2018.

  1. Cameron L

    Cameron L New Member

    Joined:
    Apr 20, 2018
    Messages:
    9
    Likes Received:
    1
    I have a new proxmox cluster undergoing testing.

    Nodes are connected via 10GbT switch with multiple VLANs trunked to each node.

    Because multicast does not work correctly on the switches the networking group uses (I don't control the network), I specified "transport: udpu" in /etc/pve/corosync.conf during cluster creation, and I have verified that the setting is present in /etc/corosync/corosync.conf on all nodes.

    The cluster is quorate:
    Quorum information
    ------------------
    Date: Thu Apr 19 19:02:32 2018
    Quorum provider: corosync_votequorum
    Nodes: 7
    Node ID: 0x00000001
    Ring ID: 1/548
    Quorate: Yes

    Votequorum information
    ----------------------
    Expected votes: 7
    Highest expected: 7
    Total votes: 7
    Quorum: 4
    Flags: Quorate

    I haven't noticed any problems with nodes going rogue and losing communication with the cluster. However, syslog on each host is filling with one error and two alerts *each second*:

    Apr 19 18:31:44 node03 corosync[33602]: error [TOTEM ] Digest does not match
    Apr 19 18:31:44 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
    Apr 19 18:31:44 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
    Apr 19 18:31:45 node03 corosync[33602]: error [TOTEM ] Digest does not match
    Apr 19 18:31:45 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
    Apr 19 18:31:45 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
    Apr 19 18:31:46 node03 corosync[33602]: error [TOTEM ] Digest does not match
    Apr 19 18:31:46 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
    Apr 19 18:31:46 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
    Apr 19 18:31:47 node03 corosync[33602]: error [TOTEM ] Digest does not match
    Apr 19 18:31:47 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
    Apr 19 18:31:47 node03 corosync[33602]: alert [TOTEM ] Invalid packet data
    Apr 19 18:31:48 node03 corosync[33602]: error [TOTEM ] Digest does not match
    Apr 19 18:31:48 node03 corosync[33602]: alert [TOTEM ] Received message has invalid digest... ignoring.
    Apr 19 18:31:48 node03 corosync[33602]: alert [TOTEM ] Invalid packet data

    This is generating ~660K log lines each day, which seems a bit... wrong.
    # grep TOTEM /var/log/syslog.1 | wc -l
    664758

    I tried turning logging to syslog off in corosync.conf, but it still logs to syslog.

    Proxmox is the only place I use corosync. How should I diagnose this problem?

    # cat /etc/corosync/corosync.conf
    logging {
      debug: off
      to_syslog: no
    }

    nodelist {
      node {
        name: node01
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 192.168.1.45
      }
      node {
        name: node02
        nodeid: 2
        quorum_votes: 1
        ring0_addr: 192.168.1.46
      }
      node {
        name: node03
        nodeid: 3
        quorum_votes: 1
        ring0_addr: 192.168.1.47
      }
      node {
        name: node04
        nodeid: 4
        quorum_votes: 1
        ring0_addr: 192.168.1.48
      }
      node {
        name: node05
        nodeid: 5
        quorum_votes: 1
        ring0_addr: 192.168.1.49
      }
      node {
        name: node06
        nodeid: 6
        quorum_votes: 1
        ring0_addr: 192.168.1.50
      }
      node {
        name: node07
        nodeid: 7
        quorum_votes: 1
        ring0_addr: 192.168.1.51
      }
    }

    quorum {
      provider: corosync_votequorum
    }

    totem {
      cluster_name: TESTING
      config_version: 7
      interface {
        bindnetaddr: 192.168.1.45
        ringnumber: 0
      }
      ip_version: ipv4
      secauth: on
      transport: udpu
      version: 2
    }
     
  2. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,456
    Likes Received:
    310
    I guess you run different clusters in the same network using the same ports?
     
  3. Cameron L

    Hi, Dietmar. Thanks for the reply.

    There is another cluster on the same VLAN, also using unicast packets for corosync, but using a different cluster name and different IPs (of course).

    Are you suggesting that there is some other option in corosync.conf which should be changed from the default if there's another proxmox cluster on the same VLAN?

    Thanks!
     
  4. dietmar

    The error messages indicate that someone is sending messages to the same IP/port ...
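
    One way to narrow down who that sender is: compare the source addresses of UDP packets arriving on the corosync port (5405 by default, unless mcastport is set) against the ring0_addr entries in the nodelist. The following is only a minimal sketch of that comparison, not a Proxmox or corosync tool. The function names, the shortened sample config, and the address 192.168.1.60 are hypothetical, and the list of observed senders is assumed to come from a packet capture taken separately.

```python
import re


def nodelist_addresses(conf_text):
    """Return the set of ring0_addr values found in corosync.conf text."""
    pattern = r"^\s*ring0_addr:\s*(\S+)\s*$"
    return set(re.findall(pattern, conf_text, flags=re.MULTILINE))


def foreign_senders(source_ips, members):
    """Return, sorted as strings, the senders that are not in the nodelist."""
    return sorted(set(source_ips) - set(members))


if __name__ == "__main__":
    # Shortened sample in the style of the corosync.conf posted above.
    sample_conf = """
    nodelist {
      node {
        name: node01
        ring0_addr: 192.168.1.45
      }
      node {
        name: node02
        ring0_addr: 192.168.1.46
      }
    }
    """
    # Hypothetical source addresses, e.g. taken from a capture on the
    # corosync UDP port. 192.168.1.60 is made up for illustration.
    seen = ["192.168.1.45", "192.168.1.46", "192.168.1.60"]
    print(foreign_senders(seen, nodelist_addresses(sample_conf)))
```

    Any address this reports is a host outside the cluster that sends to the corosync port. Since its packets would be signed with a different authkey, that would be consistent with the "Digest does not match" messages.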
     