Clusternode/GUI not working

Bububaer

Jan 22, 2024
Hi,

I am running a cluster with 3 Ceph nodes.
After a power failure, one node had a problem with its system disk, and I had to run fsck to get the node to start.
Fortunately, the node came back up with all its Ceph services working.
I got the services running again, but I still have a problem with the cluster/GUI (see the picture).

What can I do to get the cluster and the GUI working normally again?
Where can I find the relevant log files?

Screenshot from 2024-01-22 06-51-50.png

Thanks for your help.
 
Please post your /etc/pve/corosync.conf, your /etc/network/interfaces,
and the output of journalctl -u corosync.

This can have various causes, for example:
  • the network used for the UI is not working: run a ping test
  • the time (NTP) is off: configure a time server (chrony)
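
A rough sketch of both checks, assuming the addresses to test are the ringN_addr entries in corosync.conf (the reply only says "ping test", so that choice is an assumption). The helper names ring_addrs and check_links are made up for this example.

```shell
# Print every ringN_addr value found in a corosync.conf file.
ring_addrs() {
    awk '$1 ~ /^ring[0-9]+_addr:$/ { print $2 }' "$1"
}

# Ping each of those addresses once and report the result.
check_links() {
    for addr in $(ring_addrs "$1"); do
        if ping -c 1 -W 2 "$addr" >/dev/null 2>&1; then
            echo "$addr reachable"
        else
            echo "$addr NOT reachable"
        fi
    done
}

# Usage on any node:
#   check_links /etc/pve/corosync.conf
#   chronyc tracking     # only if chrony is installed
#   timedatectl          # shows whether the system clock is synchronized
```

A single successful ping only shows basic reachability; it says nothing about packet loss or latency on the corosync links.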
 
Hi, thank you for your answers.
Here are the log files from the broken node.
All nodes can reach each other, and the time is synchronized.


logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve-storage01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.13.10
    ring1_addr: 192.168.10.10
  }
  node {
    name: pve-storage02
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.13.11
    ring1_addr: 192.168.10.11
  }
  node {
    name: pve-storage03
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.13.12
    ring1_addr: 192.168.10.12
  }
  node {
    name: pve01
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 192.168.13.13
    ring1_addr: 192.168.10.13
  }
  node {
    name: pve02
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 192.168.13.14
    ring1_addr: 192.168.10.14
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: pve-cluster
  config_version: 5
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}


auto enp3s0
iface enp3s0 inet static
        address 192.168.0.10
        netmask 255.255.255.0
        mtu 9000
        up ip route add 192.168.0.11/32 dev enp3s0
        down ip route del 192.168.0.11/32

auto enp3s0d1
iface enp3s0d1 inet static
        address 192.168.0.10
        netmask 255.255.255.0
        mtu 9000
        up ip route add 192.168.0.12/32 dev enp3s0d1
        down ip route del 192.168.0.12/32

auto vmbr0
iface vmbr0 inet static
        address 192.168.13.10
        netmask 24
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

auto vmbr1
iface vmbr1 inet static
        address 192.168.10.10
        netmask 24
        bridge-ports enp4s0f1
        bridge-stp off
        bridge-fd 0

iface vmbr2 inet manual
        bridge-ports eno2
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr2.5
iface vmbr2.5 inet static
        address 10.28.200.30
        netmask 24
        gateway 10.28.200.70

-- Journal begins at Wed 2023-08-30 00:48:11 CEST, ends at Mon 2024-01-22 11:43:19 CET. --
Jan 19 12:41:55 pve-storage01 corosync[2493]: [SERV ] Service engine loaded: corosync watchdog service [7]
Jan 19 12:41:55 pve-storage01 corosync[2493]: [QUORUM] Using quorum provider corosync_votequorum
Jan 19 12:41:55 pve-storage01 corosync[2493]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Jan 19 12:41:55 pve-storage01 corosync[2493]: [QB ] server name: votequorum
Jan 19 12:41:55 pve-storage01 corosync[2493]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Jan 19 12:41:55 pve-storage01 corosync[2493]: [QB ] server name: quorum
Jan 19 12:41:55 pve-storage01 corosync[2493]: [TOTEM ] Configuring link 0
Jan 19 12:41:55 pve-storage01 corosync[2493]: [TOTEM ] Configured link number 0: local addr: 192.168.13.10, port=5405
Jan 19 12:41:55 pve-storage01 corosync[2493]: [TOTEM ] Configuring link 1
Jan 19 12:41:55 pve-storage01 corosync[2493]: [TOTEM ] Configured link number 1: local addr: 192.168.10.10, port=5406
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 0)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 0)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 3 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 3 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 3 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 4 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 4 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 4 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 5 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 5 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [QUORUM] Sync members[1]: 1
Jan 19 12:41:55 pve-storage01 corosync[2493]: [QUORUM] Sync joined[1]: 1
Jan 19 12:41:55 pve-storage01 corosync[2493]: [TOTEM ] A new membership (1.8648ed001adc2) was formed. Members joined: 1
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 5 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 has no active links
Jan 19 12:41:55 pve-storage01 corosync[2493]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 2 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 2 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 3 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 3 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 3 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 4 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 4 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 4 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 5 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 5 has no active links
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Jan 19 12:41:56 pve-storage01 corosync[2493]: [QUORUM] Members[1]: 1
Jan 19 12:41:56 pve-storage01 corosync[2493]: [MAIN ] Completed service synchronization, ready to provide service.
Jan 19 12:41:56 pve-storage01 corosync[2493]: [KNET ] host: host: 5 has no active links
Jan 19 12:41:56 pve-storage01 systemd[1]: Started Corosync Cluster Engine.
Jan 19 12:41:58 pve-storage01 corosync[2493]: [KNET ] rx: host: 5 link: 0 is up
Jan 19 12:41:58 pve-storage01 corosync[2493]: [KNET ] link: Resetting MTU for link 0 because host 5 joined
Jan 19 12:41:58 pve-storage01 corosync[2493]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Jan 19 12:41:58 pve-storage01 corosync[2493]: [KNET ] rx: host: 4 link: 0 is up
Jan 19 12:41:58 pve-storage01 corosync[2493]: [KNET ] link: Resetting MTU for link 0 because host 4 joined
Jan 19 12:41:58 pve-storage01 corosync[2493]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Jan 19 12:41:59 pve-storage01 corosync[2493]: [KNET ] pmtud: PMTUD link change for host: 5 link: 0 from 469 to 1397
Jan 19 12:41:59 pve-storage01 corosync[2493]: [KNET ] pmtud: PMTUD link change for host: 4 link: 0 from 469 to 1397
Jan 19 12:41:59 pve-storage01 corosync[2493]: [KNET ] pmtud: Global data MTU changed to: 1397
Jan 19 12:42:00 pve-storage01 corosync[2493]: [QUORUM] Sync members[3]: 1 4 5
Jan 19 12:42:00 pve-storage01 corosync[2493]: [QUORUM] Sync joined[2]: 4 5
Jan 19 12:42:00 pve-storage01 corosync[2493]: [TOTEM ] A new membership (1.8648ed001ade5) was formed. Members joined: 4 5
Jan 19 12:42:00 pve-storage01 corosync[2493]: [CMAP ] Received config version (7) is different than my config version (5)! Exiting
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Unloading all Corosync service engines.
Jan 19 12:42:00 pve-storage01 corosync[2493]: [QB ] withdrawing server sockets
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Service engine unloaded: corosync vote quorum service v1.0
Jan 19 12:42:00 pve-storage01 corosync[2493]: [QB ] withdrawing server sockets
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Service engine unloaded: corosync configuration map access
Jan 19 12:42:00 pve-storage01 corosync[2493]: [QB ] withdrawing server sockets
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Service engine unloaded: corosync configuration service
Jan 19 12:42:00 pve-storage01 corosync[2493]: [QB ] withdrawing server sockets
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01
Jan 19 12:42:00 pve-storage01 corosync[2493]: [QB ] withdrawing server sockets
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Service engine unloaded: corosync cluster quorum service v0.1
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Service engine unloaded: corosync profile loading service
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Service engine unloaded: corosync resource monitoring service
Jan 19 12:42:00 pve-storage01 corosync[2493]: [SERV ] Service engine unloaded: corosync watchdog service
Jan 19 12:42:01 pve-storage01 corosync[2493]: [KNET ] link: Resetting MTU for link 0 because host 5 joined
Jan 19 12:42:01 pve-storage01 corosync[2493]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Jan 19 12:42:01 pve-storage01 corosync[2493]: [KNET ] host: host: 5 has no active links
Jan 19 12:42:01 pve-storage01 corosync[2493]: [KNET ] link: Resetting MTU for link 0 because host 4 joined
Jan 19 12:42:01 pve-storage01 corosync[2493]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
Jan 19 12:42:01 pve-storage01 corosync[2493]: [MAIN ] Corosync Cluster Engine exiting normally
Jan 19 12:42:01 pve-storage01 systemd[1]: corosync.service: Control process exited, code=exited, status=1/FAILURE
Jan 19 12:42:01 pve-storage01 systemd[1]: corosync.service: Failed with result 'exit-code'.


Cluster information
-------------------
Name:             pve-cluster
Config Version:   5
Transport:        knet
Secure auth:      on

Cannot initialize CMAP service

     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2024-01-19 12:41:57 CET; 2 days ago
   Main PID: 2750 (pvestatd)
      Tasks: 2 (limit: 57840)
     Memory: 130.3M
        CPU: 2h 6min 40.457s
     CGroup: /system.slice/pvestatd.service
             ├─ 2750 pvestatd
             └─1848062 /sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free,lv_count


thanks
 
Received config version (7) is different than my config version (5)! Exiting

It seems that this node does not have the same config version of corosync.conf as the rest of the cluster.
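
To confirm which versions are involved, something like the following could pull them out of the journal. This is only a sketch: the function name version_mismatch is made up, and the pattern matches only the exact message wording quoted above.

```shell
# Read journal lines on stdin and print the two versions named in
# corosync's config version mismatch message.
version_mismatch() {
    sed -n 's/.*Received config version (\([0-9][0-9]*\)) is different than my config version (\([0-9][0-9]*\)).*/received=\1 local=\2/p'
}

# On the broken node:
#   journalctl -u corosync | version_mismatch
# On every node, to see which version each file carries:
#   grep config_version /etc/pve/corosync.conf /etc/corosync/corosync.conf
```

For the journal posted above this should report that the cluster sent version 7 while the local file is at version 5.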

Try copying the config (/etc/corosync/corosync.conf) with version 7 (the file should contain config_version: 7) from another node to this node, then restart the corosync service.
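
A minimal sketch of that copy step, with a backup and a version check added as a precaution. The helper names and the /tmp staging path are made up for illustration, and 192.168.13.13 is simply pve01 from the posted nodelist, which the journal shows as reachable; any node that already has version 7 should do.

```shell
# Print the config_version value from a corosync.conf file.
conf_version() {
    awk '$1 == "config_version:" { print $2; exit }' "$1"
}

# Replace the local config ($2) with a newer copy ($1), keeping a backup
# next to it. Refuses unless the new file has a higher config_version.
replace_conf() {
    new_file=$1
    local_file=$2
    new_ver=$(conf_version "$new_file")
    old_ver=$(conf_version "$local_file")
    if [ -z "$new_ver" ]; then
        echo "no config_version found in $new_file" >&2
        return 1
    fi
    if [ -n "$old_ver" ] && [ "$new_ver" -le "$old_ver" ]; then
        echo "version $new_ver is not newer than local $old_ver" >&2
        return 1
    fi
    cp -p "$local_file" "$local_file.bak" && cp "$new_file" "$local_file"
}

# Intended use on the broken node, as root:
#   scp root@192.168.13.13:/etc/corosync/corosync.conf /tmp/corosync.conf.new
#   replace_conf /tmp/corosync.conf.new /etc/corosync/corosync.conf
#   systemctl restart corosync
#   journalctl -u corosync -n 50
```

If corosync still exits after the restart, the last journalctl call should show whether the version message is gone or a different error has taken its place.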