[SOLVED] Small problem after corosync upgrade

r4a5a88

Renowned Member
Jun 15, 2016
Hi Proxmox community,

I am currently preparing a cluster to be upgraded to Proxmox 6.
Today I performed the upgrade of corosync from 2 to 3.
One of my servers is now no longer in sync:

# pvecm status
Quorum information
------------------
Date: Mon Jun 22 09:24:21 2020
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1.76f
Quorate: No

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 1
Quorum: 4 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 IP-Address (local)

Is there a way to bring the server back into the cluster without a reboot?
I have already tried pvecm expected 1 on the server and restarting corosync.
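For reference, here is a minimal sketch of what was tried on the isolated node (standard PVE/systemd commands):

Code:
# temporarily lower the expected votes so this single node becomes quorate again
pvecm expected 1

# restart the cluster communication stack on this node
systemctl restart corosync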

On one of the servers I get the message

[status] crit: cpg_send_message failed: 6
 
[Screenshot attachment: 1592890154187.png]
On node pro-07 it looks the other way around.

This is the current situation.
I am running Proxmox Virtual Environment 5.4-15 with, as you can see, 7 nodes.
Has anyone run into something like this before and knows what can be done?
I am trying to fix it without a reboot.
 
Is HA active?
What do the corosync and pve-cluster logs say on all nodes ('systemctl status corosync pve-cluster' and 'journalctl --since 2020-06-21 -u pve-cluster -u corosync')?
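Spelled out as commands, to run on every node (copied from the request above):

Code:
systemctl status corosync pve-cluster
journalctl --since 2020-06-21 -u pve-cluster -u corosync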
 
HA is not active. I cannot post the full logs here, since the files are too large and the messages would otherwise become too long.

Code:
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 08:52:57 CEST; 1 day 4h ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
 Main PID: 17484 (corosync)
    Tasks: 9 (limit: 4915)
   Memory: 5.0G
      CPU: 3h 29min 15.965s
   CGroup: /system.slice/corosync.service
           └─17484 /usr/sbin/corosync -f

Jun 23 13:12:35 pro-04-dmed corosync[17484]:   [MAIN  ] Completed service synchronization, ready to provide service.
Jun 23 13:12:40 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba2f) was formed. Members
Jun 23 13:12:46 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba33) was formed. Members
Jun 23 13:12:51 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba37) was formed. Members
Jun 23 13:12:56 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba3b) was formed. Members
Jun 23 13:13:01 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba3f) was formed. Members
Jun 23 13:13:06 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba43) was formed. Members
Jun 23 13:13:11 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba47) was formed. Members
Jun 23 13:13:17 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba4b) was formed. Members
Jun 23 13:13:22 pro-04-dmed corosync[17484]:   [TOTEM ] A new membership (2.2ba4f) was formed. Members

● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 08:52:56 CEST; 1 day 4h ago
 Main PID: 17451 (pmxcfs)
    Tasks: 9 (limit: 4915)
   Memory: 53.7M
      CPU: 1min 21.607s
   CGroup: /system.slice/pve-cluster.service
           └─17451 /usr/bin/pmxcfs

Jun 23 13:13:16 pro-04-dmed pmxcfs[17451]: [status] crit: cpg_send_message failed: 6
Jun 23 13:13:17 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 10
Jun 23 13:13:18 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 20
Jun 23 13:13:19 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 30
Jun 23 13:13:20 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 40
Jun 23 13:13:21 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 50
Jun 23 13:13:22 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 60
Jun 23 13:13:23 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 70
Jun 23 13:13:24 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 80
Jun 23 13:13:25 pro-04-dmed pmxcfs[17451]: [status] notice: cpg_send_message retry 90


pro-06-dmed:~# systemctl status corosync pve-cluster
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 14:43:39 CEST; 22h ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
 Main PID: 10603 (corosync)
    Tasks: 9 (limit: 6144)
   Memory: 2.7G
      CPU: 1h 15min 54.729s
   CGroup: /system.slice/corosync.service
           └─10603 /usr/sbin/corosync -f

Jun 23 13:14:08 pro-06-dmed corosync[10603]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:14:08 pro-06-dmed corosync[10603]:   [QUORUM] Members[4]: 1 3 5 7
Jun 23 13:14:08 pro-06-dmed corosync[10603]:   [MAIN  ] Completed service synchronization, ready to provide service.
Jun 23 13:14:13 pro-06-dmed corosync[10603]:   [TOTEM ] A new membership (1.2ba77) was formed. Members
Jun 23 13:14:13 pro-06-dmed corosync[10603]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:14:13 pro-06-dmed corosync[10603]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:14:13 pro-06-dmed corosync[10603]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:14:13 pro-06-dmed corosync[10603]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:14:13 pro-06-dmed corosync[10603]:   [QUORUM] Members[4]: 1 3 5 7
Jun 23 13:14:13 pro-06-dmed corosync[10603]:   [MAIN  ] Completed service synchronization, ready to provide service.

● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 08:55:41 CEST; 1 day 4h ago
 Main PID: 21167 (pmxcfs)
    Tasks: 9 (limit: 6144)
   Memory: 72.6M
      CPU: 2min 50.782s
   CGroup: /system.slice/pve-cluster.service
           └─21167 /usr/bin/pmxcfs

Jun 23 13:11:02 pro-06-dmed pmxcfs[21167]: [dcdb] notice: all data is up to date
Jun 23 13:11:02 pro-06-dmed pmxcfs[21167]: [dcdb] notice: dfsm_deliver_queue: queue length 4
Jun 23 13:11:07 pro-06-dmed pmxcfs[21167]: [status] notice: received all states
Jun 23 13:11:07 pro-06-dmed pmxcfs[21167]: [status] notice: all data is up to date
Jun 23 13:11:07 pro-06-dmed pmxcfs[21167]: [status] notice: dfsm_deliver_queue: queue length 255
Jun 23 13:11:18 pro-06-dmed pmxcfs[21167]: [status] notice: cpg_send_message retried 6 times
Jun 23 13:12:38 pro-06-dmed pmxcfs[21167]: [status] notice: cpg_send_message retry 10
Jun 23 13:12:39 pro-06-dmed pmxcfs[21167]: [status] notice: cpg_send_message retry 20
Jun 23 13:12:40 pro-06-dmed pmxcfs[21167]: [status] notice: cpg_send_message retry 30
Jun 23 13:12:40 pro-06-dmed pmxcfs[21167]: [status] notice: cpg_send_message retried 39 times

pro-07-dmed:~#  systemctl status corosync pve-cluster
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 16:00:39 CEST; 21h ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
 Main PID: 31442 (corosync)
    Tasks: 9 (limit: 6144)
   Memory: 5.7G
      CPU: 46min 27.871s
   CGroup: /system.slice/corosync.service
           └─31442 /usr/sbin/corosync -f

Jun 23 13:14:18 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba7b) was formed. Members
Jun 23 13:14:24 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba7f) was formed. Members
Jun 23 13:14:29 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba83) was formed. Members
Jun 23 13:14:34 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba87) was formed. Members
Jun 23 13:14:39 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba8b) was formed. Members
Jun 23 13:14:44 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba8f) was formed. Members
Jun 23 13:14:49 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba93) was formed. Members
Jun 23 13:14:55 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba97) was formed. Members
Jun 23 13:15:00 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba9b) was formed. Members
Jun 23 13:15:05 pro-07-dmed corosync[31442]:   [TOTEM ] A new membership (2.2ba9f) was formed. Members

● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 14:51:32 CEST; 22h ago
  Process: 1056 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
  Process: 1015 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
 Main PID: 1034 (pmxcfs)
    Tasks: 10 (limit: 6144)
   Memory: 58.9M
      CPU: 2min 43.226s
   CGroup: /system.slice/pve-cluster.service
           └─1034 /usr/bin/pmxcfs

Jun 23 13:15:00 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retry 40
Jun 23 13:15:01 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retry 50
Jun 23 13:15:02 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retry 60
Jun 23 13:15:03 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retry 70
Jun 23 13:15:04 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retry 80
Jun 23 13:15:05 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retry 90
Jun 23 13:15:06 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retry 100
Jun 23 13:15:06 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retried 100 times
Jun 23 13:15:06 pro-07-dmed pmxcfs[1034]: [status] crit: cpg_send_message failed: 6
Jun 23 13:15:07 pro-07-dmed pmxcfs[1034]: [status] notice: cpg_send_message retry 10
 
Code:
pro-08-dmed:~# systemctl status corosync pve-cluster
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 15:08:36 CEST; 22h ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
 Main PID: 4596 (corosync)
    Tasks: 9 (limit: 7372)
   Memory: 2.4G
      CPU: 36min 26.281s
   CGroup: /system.slice/corosync.service
           └─4596 /usr/sbin/corosync -f

Jun 23 13:16:17 pro-08-dmed corosync[4596]:   [TOTEM ] Failed to receive the leave message. failed: 1
Jun 23 13:16:17 pro-08-dmed corosync[4596]:   [TOTEM ] Retransmit List: 1
Jun 23 13:16:18 pro-08-dmed corosync[4596]:   [TOTEM ] A new membership (1.2bb4b) was formed. Members joined: 1 3 5 left: 1 3 5
Jun 23 13:16:18 pro-08-dmed corosync[4596]:   [TOTEM ] Failed to receive the leave message. failed: 1 3 5
Jun 23 13:16:18 pro-08-dmed corosync[4596]:   [TOTEM ] A new membership (1.2bb4f) was formed. Members joined: 1 left: 1
Jun 23 13:16:18 pro-08-dmed corosync[4596]:   [TOTEM ] Failed to receive the leave message. failed: 1
Jun 23 13:16:18 pro-08-dmed corosync[4596]:   [CPG   ] downlist left_list: 1 received
Jun 23 13:16:18 pro-08-dmed corosync[4596]:   [CPG   ] downlist left_list: 1 received
Jun 23 13:16:18 pro-08-dmed corosync[4596]:   [TOTEM ] A new membership (1.2bb57) was formed. Members joined: 1 3 left: 1 3
Jun 23 13:16:18 pro-08-dmed corosync[4596]:   [TOTEM ] Failed to receive the leave message. failed: 1 3
Jun 23 13:16:23 pro-08-dmed corosync[4596]:   [TOTEM ] A new membership (1.2bb5b) was formed. Members
Jun 23 13:16:23 pro-08-dmed corosync[4596]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:16:23 pro-08-dmed corosync[4596]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:16:23 pro-08-dmed corosync[4596]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:16:23 pro-08-dmed corosync[4596]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:16:23 pro-08-dmed corosync[4596]:   [QUORUM] Members[4]: 1 3 5 7
Jun 23 13:16:23 pro-08-dmed corosync[4596]:   [MAIN  ] Completed service synchronization, ready to provide service.

● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 08:56:27 CEST; 1 day 4h ago
 Main PID: 895 (pmxcfs)
    Tasks: 10 (limit: 7372)
   Memory: 95.3M
      CPU: 3min 29.748s
   CGroup: /system.slice/pve-cluster.service
           └─895 /usr/bin/pmxcfs

Jun 23 13:15:49 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retry 20
Jun 23 13:15:50 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retry 30
Jun 23 13:15:51 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retry 40
Jun 23 13:15:52 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retry 50
Jun 23 13:15:52 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retried 51 times
Jun 23 13:15:58 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retry 10
Jun 23 13:15:59 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retry 20
Jun 23 13:16:00 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retry 30
Jun 23 13:16:01 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retry 40
Jun 23 13:16:02 pro-08-dmed pmxcfs[895]: [status] notice: cpg_send_message retried 49 times

root@pro-01-dmed:~# systemctl status corosync pve-cluster
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2020-06-23 11:59:04 CEST; 1h 18min ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
 Main PID: 28222 (corosync)
    Tasks: 9 (limit: 9830)
   Memory: 416.6M
      CPU: 3min 41.713s
   CGroup: /system.slice/corosync.service
           └─28222 /usr/sbin/corosync -f

Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [QUORUM] Members[3]: 2 4 6
Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [MAIN  ] Completed service synchronization, ready to provide service.
Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [TOTEM ] A new membership (2.2bc4f) was formed. Members
Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [CPG   ] downlist left_list: 0 received
Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [QUORUM] Members[3]: 2 4 6
Jun 23 13:17:15 pro-01-dmed corosync[28222]:   [MAIN  ] Completed service synchronization, ready to provide service.
Jun 23 13:17:20 pro-01-dmed corosync[28222]:   [TOTEM ] A new membership (2.2bc53) was formed. Members

● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-06-22 08:51:35 CEST; 1 day 4h ago
 Main PID: 26221 (pmxcfs)
    Tasks: 8 (limit: 9830)
   Memory: 21.6M
      CPU: 1min 55.734s
   CGroup: /system.slice/pve-cluster.service
           └─26221 /usr/bin/pmxcfs

Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [status] notice: received sync request (epoch 2/17451/00000235)
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [status] crit: ignore sync request from wrong member 4/26221
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [status] notice: received sync request (epoch 4/26221/000001CB)
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [dcdb] notice: received all states
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [dcdb] notice: leader is 2/17451
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [dcdb] notice: synced members: 2/17451, 4/26221, 6/1034
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [dcdb] notice: all data is up to date
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [status] notice: received all states
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [status] notice: all data is up to date
Jun 23 13:17:15 pro-01-dmed pmxcfs[26221]: [status] notice: dfsm_deliver_queue: queue length 191
 
Can you also add the corosync.conf?
 
Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pro-01-dmed
    nodeid: 4
    quorum_votes: 1
    ring0_addr: pro-01-dmed
  }
  node {
    name: pro-03-dmed
    nodeid: 1
    quorum_votes: 1
    ring0_addr: pro-03-dmed
  }
  node {
    name: pro-04-dmed
    nodeid: 2
    quorum_votes: 1
    ring0_addr: pro-04-dmed
  }
  node {
    name: pro-05-dmed
    nodeid: 5
    quorum_votes: 1
    ring0_addr: pro-05-dmed
  }
  node {
    name: pro-06-dmed
    nodeid: 3
    quorum_votes: 1
    ring0_addr: pro-06-dmed
  }
  node {
    name: pro-07-dmed
    nodeid: 6
    quorum_votes: 1
    ring0_addr: pro-07-dmed
  }
  node {
    name: pro-08-dmed
    nodeid: 7
    quorum_votes: 1
    ring0_addr: 129.206.229.186
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: vm-cluster-02
  config_version: 49
  interface {
    bindnetaddr: 129.206.229.185
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
 
I suggest starting journalctl -f -u pve-cluster -u corosync > $(hostname).log on all nodes, then systemctl restart corosync and wait a few minutes, then post the generated log files.
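As a sketch of that sequence per node (note that journalctl -f keeps following the journal, so run it in a separate shell or background it; the log file name is just the one suggested above):

Code:
# capture live corosync and pve-cluster logs into a per-node file
journalctl -f -u pve-cluster -u corosync > $(hostname).log &

# then restart corosync, wait a few minutes, stop journalctl and collect the files
systemctl restart corosync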
 
Here are the logs from the 3 nodes that are out of sync; the fourth one is in sync again.
 

Attachments

  • pro-01-dmed.log (35.5 KB)
  • pro-04-dmed.log (23.4 KB)
  • pro-07-dmed.log (18 KB)
What does corosync-cfgtool -sb say on all nodes?
 
root@pro-01-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 4

pro-03-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 1
LINK ID 0
addr = 129.206.229.185
status = 3333313

pro-07-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 6

pro-06-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 3
LINK ID 0
addr = 129.206.229.173
status = 3333333


pro-08-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 7

root@pro-04-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 2
LINK ID 0
addr = 129.206.229.164
status = 3333333

root@pro-05-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 5
LINK ID 0
addr = 129.206.229.178
status = 3333333
 
Okay, something seems to have gone quite wrong here. Could you collect logs from all nodes again and run the following commands (see the sketch after these steps):

systemctl stop corosync pve-cluster

Wait until the services are stopped on all nodes. Then, node by node:
systemctl start corosync pve-cluster
and after each node, verify that all nodes started so far can see each other (pvecm status / corosync-cfgtool -sb / logs).
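As a sketch, the suggested order is (all commands taken from the steps above):

Code:
# step 1: on ALL nodes, stop the stack and wait until it is stopped everywhere
systemctl stop corosync pve-cluster

# step 2: then start it again, one node at a time
systemctl start corosync pve-cluster

# step 3: after each started node, verify that all nodes started so far see each other
pvecm status
corosync-cfgtool -sb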
 
The good news is that 6 of the 7 are back in the cluster; one is not. I have attached the logs since this morning from 3 of the servers.
 

Attachments

  • pro-06.log (373.3 KB)
  • pro-07.log (392.4 KB)
  • pro-01.log (316.8 KB)
And what does 'pvecm status' say on all 7 nodes?
 
root@pro-01-dmed:~# pvecm status
Quorum information
------------------
Date: Wed Jun 24 09:43:01 2020
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000004
Ring ID: 2.5a612
Quorate: No

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 3
Quorum: 4 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 129.206.229.164
0x00000004 1 129.206.229.187 (local)
0x00000006 1 129.206.229.168

pro-08-dmed:~# pvecm status
Quorum information
------------------
Date: Wed Jun 24 09:47:52 2020
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000007
Ring ID: 1.5a99a
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 4
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 129.206.229.185
0x00000003 1 129.206.229.173
0x00000005 1 129.206.229.178
0x00000007 1 129.206.229.186 (local)

That is from 2 of the nodes.
 
pro-06-dmed:~# pvecm status
Quorum information
------------------
Date: Wed Jun 24 09:49:01 2020
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000003
Ring ID: 1.5a9d2
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 4
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 129.206.229.185
0x00000003 1 129.206.229.173 (local)
0x00000005 1 129.206.229.178
0x00000007 1 129.206.229.186
 
Okay, so now the cluster has split into two parts. Can you post corosync-cfgtool -sb from ALL nodes again? What does the network look like? Are all nodes connected directly to the same switch?
 
pro-07-dmed:~# pvecm status
Quorum information
------------------
Date: Wed Jun 24 09:52:30 2020
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000006
Ring ID: 2.5aaa6
Quorate: No

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 3
Quorum: 4 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 129.206.229.164
0x00000004 1 129.206.229.187
0x00000006 1 129.206.229.168 (local)


I am considering going back to corosync 2 until I do the upgrade to Buster.
 
All nodes are in the same subnet and connected to the same switch. With corosync 2 it still worked.
 
pro-06-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 3
LINK ID 0
addr = 129.206.229.173
status = 3333333

root@pro-01-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 4

pro-03-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 1
LINK ID 0
addr = 129.206.229.185
status = 3333313

root@pro-04-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 2

root@pro-05-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 5
LINK ID 0
addr = 129.206.229.178
status = 3333333

pro-07-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 6

pro-08-dmed:~# corosync-cfgtool -sb
Printing link status.
Local node ID 7
 
