Corosync 3 Update - PVE5

ChristianDSH95

New Member
Sep 20, 2019
Hello,

Today I upgraded all nodes to Corosync 3 following https://pve.proxmox.com/wiki/Upgrade_from_5.x_to_6.0#In-place_upgrade; so far that went without errors.
Now, after the update, all 23 nodes (no HA configured) should communicate with each other again, but once 14 or 16 nodes have joined, the first node stays grey.

VMs can then no longer be started or stopped; the operations fail with the message "got timeout".

Code:
pveversion -v
proxmox-ve: 5.4-2 (running kernel: 4.15.18-26-pve)
pve-manager: 5.4-13 (running version: 5.4-13/aee6f0ec)
pve-kernel-4.15: 5.4-14
pve-kernel-4.15.18-26-pve: 4.15.18-54
pve-kernel-4.15.18-24-pve: 4.15.18-52
pve-kernel-4.15.18-20-pve: 4.15.18-46
corosync: 3.0.3-pve1~bpo9
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: not correctly installed
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-12
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-56
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-14
libpve-storage-perl: 5.0-44
libqb0: 1.0.5-1~bpo9+2
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-7
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-38
pve-container: 2.0-41
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-7
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-55
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3


I think something is getting in the way here and blocking everything else. What could it be?

Syslog entry:


Code:
Mar 21 05:26:10 server256 kernel: [ 1088.106832] INFO: task pvesr:5827 blocked for more than 120 seconds.
Mar 21 05:26:10 server256 kernel: [ 1088.106927]       Tainted: G           O     4.15.18-26-pve #1
Mar 21 05:26:10 server256 kernel: [ 1088.107009] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 21 05:26:10 server256 kernel: [ 1088.107073] pvesr           D    0  5827      1 0x00000000
Mar 21 05:26:10 server256 kernel: [ 1088.107079] Call Trace:
Mar 21 05:26:10 server256 kernel: [ 1088.107096]  __schedule+0x3e0/0x870
Mar 21 05:26:10 server256 kernel: [ 1088.107104]  ? path_parentat+0x3e/0x80
Mar 21 05:26:10 server256 kernel: [ 1088.107107]  schedule+0x36/0x80
Mar 21 05:26:10 server256 kernel: [ 1088.107114]  rwsem_down_write_failed+0x228/0x380
Mar 21 05:26:10 server256 kernel: [ 1088.107122]  call_rwsem_down_write_failed+0x17/0x30
Mar 21 05:26:10 server256 kernel: [ 1088.107126]  ? call_rwsem_down_write_failed+0x17/0x30
Mar 21 05:26:10 server256 kernel: [ 1088.107130]  down_write+0x2d/0x40
Mar 21 05:26:10 server256 kernel: [ 1088.107135]  filename_create+0x7e/0x160
Mar 21 05:26:10 server256 kernel: [ 1088.107140]  SyS_mkdir+0x51/0x100
Mar 21 05:26:10 server256 kernel: [ 1088.107148]  do_syscall_64+0x73/0x130
Mar 21 05:26:10 server256 kernel: [ 1088.107154]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mar 21 05:26:10 server256 kernel: [ 1088.107159] RIP: 0033:0x7f6f8ec64687
Mar 21 05:26:10 server256 kernel: [ 1088.107161] RSP: 002b:00007ffee6e582c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
Mar 21 05:26:10 server256 kernel: [ 1088.107166] RAX: ffffffffffffffda RBX: 000055b306ee2010 RCX: 00007f6f8ec64687
Mar 21 05:26:10 server256 kernel: [ 1088.107168] RDX: 000055b304f98ee4 RSI: 00000000000001ff RDI: 000055b30a68cbb0
Mar 21 05:26:10 server256 kernel: [ 1088.107170] RBP: 0000000000000000 R08: 0000000000000200 R09: 000055b306ee2028
Mar 21 05:26:10 server256 kernel: [ 1088.107172] R10: 0000000000000000 R11: 0000000000000246 R12: 000055b3083b3778
Mar 21 05:26:10 server256 kernel: [ 1088.107174] R13: 000055b30a627070 R14: 000055b30a68cbb0 R15: 00000000000001ff
Mar 21 05:26:10 server256 kernel: [ 1088.107181] INFO: task qm:5866 blocked for more than 120 seconds.
Mar 21 05:26:10 server256 kernel: [ 1088.107255]       Tainted: G           O     4.15.18-26-pve #1
Mar 21 05:26:10 server256 kernel: [ 1088.107295] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 21 05:26:10 server256 kernel: [ 1088.107349] qm              D    0  5866   1042 0x00000000
Mar 21 05:26:10 server256 kernel: [ 1088.107352] Call Trace:
Mar 21 05:26:10 server256 kernel: [ 1088.107357]  __schedule+0x3e0/0x870
Mar 21 05:26:10 server256 kernel: [ 1088.107360]  ? path_parentat+0x3e/0x80
Mar 21 05:26:10 server256 kernel: [ 1088.107363]  schedule+0x36/0x80
Mar 21 05:26:10 server256 kernel: [ 1088.107366]  rwsem_down_write_failed+0x228/0x380
Mar 21 05:26:10 server256 kernel: [ 1088.107370]  call_rwsem_down_write_failed+0x17/0x30
Mar 21 05:26:10 server256 kernel: [ 1088.107373]  ? call_rwsem_down_write_failed+0x17/0x30
Mar 21 05:26:10 server256 kernel: [ 1088.107376]  down_write+0x2d/0x40
Mar 21 05:26:10 server256 kernel: [ 1088.107381]  filename_create+0x7e/0x160
Mar 21 05:26:10 server256 kernel: [ 1088.107384]  SyS_mkdir+0x51/0x100
Mar 21 05:26:10 server256 kernel: [ 1088.107389]  do_syscall_64+0x73/0x130
Mar 21 05:26:10 server256 kernel: [ 1088.107392]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
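Both traces above end in SyS_mkdir -> filename_create -> down_write, meaning pvesr and qm are each waiting for a write lock while creating a directory. The trace does not show which path is involved; /etc/pve (the cluster filesystem) would be a plausible candidate, but that is an assumption. To see quickly which tasks are reported as hung in a longer syslog, a minimal Python sketch along these lines could help. The function name and the sample list are hypothetical and only for illustration:

```python
import re

# Matches the kernel hung-task report, e.g.
# "... INFO: task pvesr:5827 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def blocked_tasks(lines):
    """Return (task name, pid, seconds) for every hung-task report in lines."""
    found = []
    for line in lines:
        m = HUNG_TASK.search(line)
        if m:
            found.append((m.group("name"), int(m.group("pid")), int(m.group("secs"))))
    return found

# Three lines copied from the syslog excerpt in this post.
sample = [
    "Mar 21 05:26:10 server256 kernel: [ 1088.106832] INFO: task pvesr:5827 blocked for more than 120 seconds.",
    "Mar 21 05:26:10 server256 kernel: [ 1088.107079] Call Trace:",
    "Mar 21 05:26:10 server256 kernel: [ 1088.107181] INFO: task qm:5866 blocked for more than 120 seconds.",
]

for name, pid, secs in blocked_tasks(sample):
    print(f"{name} (pid {pid}) blocked for more than {secs}s")
```

Any iterable of lines works as input, so an open syslog file can be passed instead of the sample list.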

I can bring up 12 nodes in total; the 13th node stays grey, and after a few minutes all nodes turn grey.
In the Corosync 2 cluster, all 23 nodes were active.
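For context on these numbers: assuming every node carries one vote and no special votequorum options are set (both are assumptions; the post does not show corosync.conf), a cluster is quorate only with a strict majority of votes. The arithmetic for 23 nodes can be sketched as follows; the function name is hypothetical:

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for a strict majority: floor(n / 2) + 1."""
    return expected_votes // 2 + 1

total_nodes = 23  # cluster size from the post, assuming one vote per node
needed = quorum_threshold(total_nodes)
print(f"{total_nodes} nodes -> quorum at {needed} votes")

# 12 is the node count mentioned in the post; 7 and 9 are the membership
# sizes that appear in the daemon.log excerpt of this post.
for members in (7, 9, 12):
    state = "quorate" if members >= needed else "not quorate"
    print(f"{members} members: {state}")
```

Under these assumptions the threshold is 12 votes, so a membership of 7 or 9 nodes would be below quorum.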
 
Here is another excerpt, from daemon.log:

Code:
Mar 21 12:56:59 server256 corosync[32449]:   [KNET  ] link: host: 2 link: 0 is down
Mar 21 12:57:02 server256 corosync[32449]:   [KNET  ] rx: host: 4 link: 0 is up
Mar 21 12:57:14 server256 corosync[32449]:   [KNET  ] rx: host: 2 link: 0 is up
Mar 21 12:57:18 server256 corosync[32449]:   [KNET  ] host: host: 4 (passive) best link: 0 (pri: 1)
Mar 21 12:57:18 server256 corosync[32449]:   [KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)
Mar 21 12:57:18 server256 corosync[32449]:   [KNET  ] host: host: 4 (passive) best link: 0 (pri: 1)
Mar 21 12:57:18 server256 corosync[32449]:   [KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)
Mar 21 12:57:25 server256 corosync[32449]:   [TOTEM ] A new membership (1.1f9f) was formed. Members
Mar 21 12:57:25 server256 corosync[32449]:   [CPG   ] downlist left_list: 1 received
Mar 21 12:57:25 server256 corosync[32449]:   [CPG   ] downlist left_list: 1 received
Mar 21 12:57:25 server256 corosync[32449]:   [CPG   ] downlist left_list: 1 received
Mar 21 12:57:25 server256 corosync[32449]:   [CPG   ] downlist left_list: 1 received
Mar 21 12:57:25 server256 corosync[32449]:   [CPG   ] downlist left_list: 1 received
Mar 21 12:57:25 server256 corosync[32449]:   [CPG   ] downlist left_list: 1 received
Mar 21 12:57:25 server256 corosync[32449]:   [CPG   ] downlist left_list: 1 received
Mar 21 12:57:25 server256 corosync[32449]:   [QUORUM] Members[7]: 1 6 17 18 21 22 23
Mar 21 12:57:25 server256 corosync[32449]:   [MAIN  ] Completed service synchronization, ready to provide service.
Mar 21 12:57:28 server256 corosync[32449]:   [KNET  ] link: host: 11 link: 0 is down
Mar 21 12:57:28 server256 corosync[32449]:   [KNET  ] host: host: 11 (passive) best link: 0 (pri: 1)
Mar 21 12:57:28 server256 corosync[32449]:   [KNET  ] host: host: 11 has no active links
Mar 21 12:57:37 server256 corosync[32449]:   [KNET  ] rx: host: 11 link: 0 is up
Mar 21 12:57:37 server256 corosync[32449]:   [KNET  ] host: host: 11 (passive) best link: 0 (pri: 1)
Mar 21 12:57:46 server256 corosync[32449]:   [KNET  ] link: host: 12 link: 0 is down
Mar 21 12:57:46 server256 corosync[32449]:   [KNET  ] host: host: 12 (passive) best link: 0 (pri: 1)
Mar 21 12:57:46 server256 corosync[32449]:   [KNET  ] host: host: 12 has no active links
Mar 21 12:57:48 server256 corosync[32449]:   [TOTEM ] Retransmit List: c d e f 10 11 13 15 14 1a 1b 1c 1d 1f 20 21 22 23 24 25 26 12
Mar 21 12:57:56 server256 corosync[32449]:   [KNET  ] rx: host: 12 link: 0 is up
Mar 21 12:57:56 server256 corosync[32449]:   [KNET  ] host: host: 12 (passive) best link: 0 (pri: 1)
Mar 21 12:57:57 server256 corosync[32449]:   [TOTEM ] Retransmit List: 1a 1c 1f 20 22 23 24 25
Mar 21 12:57:57 server256 corosync[32449]:   [TOTEM ] Retransmit List: 39 3a 3c 3d 3e 40 42 43 44 45 3b
Mar 21 12:57:57 server256 corosync[32449]:   [TOTEM ] Retransmit List: 3d 3e 40 42 43 44 45 4b 4c 4d 4e 4a
Mar 21 12:58:02 server256 corosync[32449]:   [TOTEM ] Retransmit List: 42 43 45 4b 4d 4a
Mar 21 12:58:02 server256 corosync[32449]:   [TOTEM ] Retransmit List: 4d 4a
Mar 21 12:58:02 server256 corosync[32449]:   [TOTEM ] A new membership (1.1fa7) was formed. Members
Mar 21 12:58:33 server256 corosync[32449]:   [TOTEM ] A new membership (1.1fb3) was formed. Members
Mar 21 12:58:33 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:33 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:33 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:33 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:33 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:33 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:33 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:33 server256 corosync[32449]:   [QUORUM] Members[7]: 1 6 17 18 21 22 23
Mar 21 12:58:33 server256 corosync[32449]:   [MAIN  ] Completed service synchronization, ready to provide service.
Mar 21 12:58:46 server256 corosync[32449]:   [TOTEM ] Retransmit List: 9 a b d f c
Mar 21 12:58:46 server256 corosync[32449]:   [TOTEM ] Retransmit List: 1a 1b 1c 1d 18
Mar 21 12:58:51 server256 corosync[32449]:   [TOTEM ] Retransmit List: 18
Mar 21 12:58:51 server256 corosync[32449]:   [TOTEM ] A new membership (1.1fb7) was formed. Members joined: 5 19
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [CPG   ] downlist left_list: 0 received
Mar 21 12:58:51 server256 corosync[32449]:   [QUORUM] Members[9]: 1 5 6 17 18 19 21 22 23
Mar 21 12:58:51 server256 corosync[32449]:   [MAIN  ] Completed service synchronization, ready to provide service.
Mar 21 12:58:51 server256 corosync[32449]:   [MAIN  ] qb_ipcs_event_send: Transport endpoint is not connected (107)
Mar 21 12:59:01 server256 corosync[32449]:   [TOTEM ] Retransmit List: 4 5 6 7 8 9 a b c d f
Mar 21 12:59:01 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
Mar 21 12:59:01 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
Mar 21 12:59:01 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
Mar 21 12:59:01 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
Mar 21 12:59:01 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
Mar 21 12:59:01 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
Mar 21 12:59:01 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
Mar 21 12:59:01 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
Mar 21
Mar 21 12:59:15 server256 corosync[32449]:   [CPG   ] *** 0x559697f2a6e0 can't mcast to group  state:0, error:12
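The excerpt above mixes three kinds of events: KNET links going down, TOTEM retransmit lists, and QUORUM membership changes. To get an overview of a longer log, they can be tallied with a small Python sketch. The regular expressions are derived only from the line formats visible in this excerpt, and the function name is hypothetical:

```python
import re
from collections import Counter

# Patterns taken from the line formats in the excerpt; the subsystem tag
# is padded with spaces inside the brackets, e.g. "[KNET  ]".
LINK_DOWN = re.compile(r"\[KNET\s*\] link: host: (\d+) link: (\d+) is down")
MEMBERS = re.compile(r"\[QUORUM\s*\] Members\[(\d+)\]:")
RETRANSMIT = re.compile(r"\[TOTEM\s*\] Retransmit List:")

def summarize(lines):
    """Count link-down events per host, collect membership sizes, count retransmit lists."""
    link_down = Counter()
    member_counts = []
    retransmits = 0
    for line in lines:
        m = LINK_DOWN.search(line)
        if m:
            link_down[int(m.group(1))] += 1
            continue
        m = MEMBERS.search(line)
        if m:
            member_counts.append(int(m.group(1)))
            continue
        if RETRANSMIT.search(line):
            retransmits += 1
    return link_down, member_counts, retransmits

# A few lines copied from the daemon.log excerpt in this post.
sample = [
    "Mar 21 12:56:59 server256 corosync[32449]:   [KNET  ] link: host: 2 link: 0 is down",
    "Mar 21 12:57:25 server256 corosync[32449]:   [QUORUM] Members[7]: 1 6 17 18 21 22 23",
    "Mar 21 12:57:28 server256 corosync[32449]:   [KNET  ] link: host: 11 link: 0 is down",
    "Mar 21 12:57:46 server256 corosync[32449]:   [KNET  ] link: host: 12 link: 0 is down",
    "Mar 21 12:57:57 server256 corosync[32449]:   [TOTEM ] Retransmit List: 1a 1c 1f 20 22 23 24 25",
    "Mar 21 12:58:51 server256 corosync[32449]:   [QUORUM] Members[9]: 1 5 6 17 18 19 21 22 23",
]

link_down, member_counts, retransmits = summarize(sample)
print("link-down events per host:", dict(link_down))
print("membership sizes over time:", member_counts)
print("retransmit lists:", retransmits)
```

The function accepts any iterable of lines, so the whole daemon.log can be fed in by opening the file and passing the file object. Hosts that show many link-down events would be the first candidates for a closer look at the network path.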
 
