New upgrade issue

mada

Member
Aug 16, 2017
Hello,

I upgraded the cluster recently, but this issue keeps happening every few hours (screenshots attached): the connection is lost, although I can still see the summary. To fix it I have to restart corosync.
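
(For reference, on a PVE 5 / Debian 9 install both services are managed by systemd, so the restart is roughly:)

Code:
# restart the corosync cluster engine
systemctl restart corosync
# pmxcfs is provided by pve-cluster and often needs a restart as well
systemctl restart pve-cluster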

proxmox-ve: 5.0-25 (running kernel: 4.10.17-4-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1

Proxmox is installed on top of Debian 9 (Stretch).

Thanks
 

Attachments

  • Screen Shot 2017-10-22 at 11.27.54 PM.png (20.2 KB)
  • Screen Shot 2017-10-22 at 11.28.19 PM.png (25.6 KB)
Hi,

please check whether these versions are the same on all nodes.
 
Hi,

please check whether these versions are the same on all nodes.

They are all on the same version and setup. The nodes show up now, but I get

Connection refused (595)

and on most of them the GUI is no longer working!
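
(The 595 error appears to come from the PVE proxy layer; a minimal check on an affected node, assuming the standard systemd units, would be something like:)

Code:
# the web GUI is served by pveproxy, the API backend by pvedaemon
systemctl status pveproxy pvedaemon
# restarting them often brings the GUI back without touching the VMs
systemctl restart pveproxy pvedaemon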
 
Hi mada!

Can you post your upgrade steps? Are there any errors in the syslog? Are you using the test repo? If so, the 4.13 kernel should be running, not the 4.10:

proxmox-ve: 5.0-25 (running kernel: 4.10.17-4-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
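
(One quick way to look for such errors, assuming the default logging setup on Debian 9:)

Code:
# recent messages from the cluster services
journalctl -u corosync -u pve-cluster --since "1 hour ago"
# or grep the classic syslog file
grep -E 'corosync|pmxcfs' /var/log/syslog | tail -n 50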
 
When I ran top -c I saw that pmxcfs and corosync were taking 100% of the CPU, so I restarted corosync. The load is lower now, but the GUI is still down and I am unable to connect to the cluster.
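
(A quick way to watch just those two processes with standard procps tools:)

Code:
# one-shot CPU snapshot of the cluster processes
ps -C corosync,pmxcfs -o pid,%cpu,etime,comm
# or follow them interactively
top -c -p "$(pgrep -d, -x 'corosync|pmxcfs')"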


Oct 23 07:40:31 xx pmxcfs[1259]: [dcdb] notice: received sync request (epoch 1/11341/0000043D)
Oct 23 07:40:31 xx pmxcfs[1259]: [status] notice: received sync request (epoch 1/11341/000004B1)
Oct 23 07:40:31 xx pmxcfs[1259]: [dcdb] notice: received all states
Oct 23 07:40:31 xx pmxcfs[1259]: [dcdb] notice: leader is 1/11341
Oct 23 07:40:31 xx pmxcfs[1259]: [dcdb] notice: synced members: 1/11341, 2/12273, 4/17322, 5/1325, 6/23598, 7/19961, 8/1259, 9/6757, 11/26019
Oct 23 07:40:31 xx pmxcfs[1259]: [dcdb] notice: all data is up to date
Oct 23 07:40:31 xx pmxcfs[1259]: [dcdb] notice: dfsm_deliver_queue: queue length 8
Oct 23 07:40:31 xx pmxcfs[1259]: [status] notice: received all states
Oct 23 07:40:31 xxx pmxcfs[1259]: [status] notice: all data is up to date
Oct 23 07:40:31 lxx pmxcfs[1259]: [status] notice: dfsm_deliver_queue: queue length 65
Oct 23 07:40:31 xx pmxcfs[1259]: [status] notice: received log
Oct 23 07:40:31 xx pmxcfs[1259]: [main] notice: ignore duplicate
Oct 23 07:40:33 xx corosync[10444]: notice [TOTEM ] A new membership (185.215.224.6:128764) was formed. Members joined: 10
Oct 23 07:40:33 xx corosync[10444]: [TOTEM ] A new membership (185.215.224.6:128764) was formed. Members joined: 10
Oct 23 07:40:33 xx pmxcfs[1259]: [dcdb] notice: members: 1/11341, 2/12273, 4/17322, 5/1325, 6/23598, 7/19961, 8/1259, 9/6757, 10/8055, 11/26019
Oct 23 07:40:33 xx pmxcfs[1259]: [dcdb] notice: starting data syncronisation
Oct 23 07:40:33 xx pmxcfs[1259]: [status] notice: members: 1/11341, 2/12273, 4/17322, 5/1325, 6/23598, 7/19961, 8/1259, 9/6757, 10/8055, 11/26019
Oct 23 07:40:33 xx pmxcfs[1259]: [status] notice: starting data syncronisation
Oct 23 07:40:33 xx corosync[10444]: notice [QUORUM] Members[11]: 8 1 10 4 5 3 6 7 9 11 2
Oct 23 07:40:33 xx corosync[10444]: notice [MAIN ] Completed service synchronization, ready to provide service.
Oct 23 07:40:33 xx corosync[10444]: [QUORUM] Members[11]: 8 1 10 4 5 3 6 7 9 11 2
Oct 23 07:40:33 xx corosync[10444]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 23 07:40:33 xx pmxcfs[1259]: [dcdb] notice: received sync request (epoch 1/11341/0000043E)
Oct 23 07:40:33 xx pmxcfs[1259]: [status] notice: received sync request (epoch 1/11341/000004B2)
 
Can you post the output of
Code:
pvecm status
and
Code:
pvecm nodes
Source: https://pve.proxmox.com/wiki/Cluster_Manager

root@xxx:~# pvecm status
Quorum information
------------------
Date: Mon Oct 23 08:08:52 2017
Quorum provider: corosync_votequorum
Nodes: 11
Node ID: 0x00000001
Ring ID: 8/129720
Quorate: Yes

Votequorum information
----------------------
Expected votes: 11
Highest expected: 11
Total votes: 11
Quorum: 6
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000008 1 1xxxx
0x00000001 1 1xxxxx (local)
0x0000000a 1 1xxxx
0x00000004 1 1xxxx
0x00000005 1 1xxxx
0x00000003 1 1xxxx
0x00000006 1 1xxxxxx
0x00000007 1 1xxxx
0x00000009 1 1xxxxx
0x0000000b 1 1xxxx
0x00000002 1 1xxxxxx
root@xxx:~# pvecm nodes

Membership information
----------------------
Nodeid Votes Name
8 1 xx
1 1 xxx (local)
10 1 xx
4 1 xxx
5 1 xx
3 1 xx
6 1 xx
7 1 xx
9 1 xx
11 1 xx
2 1 xx
root@lxxx:~#
 
Can you run
Code:
pveversion -v
on all 11 nodes and post the output? I think you have different kernels loaded, and that would cause a problem with corosync.
proxmox-ve: 5.0-25 (running kernel: 4.10.17-4-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
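
(If root SSH between the nodes is already set up, something like this collects the interesting lines in one go — node1 through node11 are placeholder hostnames:)

Code:
# grab the running kernel and manager version from every node
for h in node1 node2 node3 node4 node5 node6 node7 node8 node9 node10 node11; do
    echo "== $h =="
    ssh root@"$h" 'pveversion -v | head -n 2'
done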
 
I had one node that failed to pull in the new kernel. Since I use ZFS, that caused a real problem. I think the same thing is happening here.
 
Can you run
Code:
pveversion -v
on all 11 nodes and post the output? I think you have different kernels loaded, and that would cause a problem with corosync.

Here they are:

1st -

root@xx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.13.4-1-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@xxx:~#



2nd -

root@lxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-3-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-2-pve: 4.10.17-20
pve-kernel-4.10.17-3-pve: 4.10.17-23
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
openvswitch-switch: 2.7.0-2
root@lxx:~#

3rd -

root@lxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-4-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@lxxx:~#

4th -

root@xx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-3-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-3-pve: 4.10.17-23
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@xxx:~#


5th -

root@xxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.13.4-1-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@xx:~#


6th -

root@lxxxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-3-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-2-pve: 4.10.17-20
pve-kernel-4.10.17-3-pve: 4.10.17-23
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@xx:~#



7th -

root@xxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-3-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-3-pve: 4.10.17-23
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@xx:~#


8th -


root@xxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-3-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-3-pve: 4.10.17-23
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@xx:~#




9th -

root@lxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-4-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@lxxx:~#


10th

root@lxxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-4-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-4-pve: 4.10.17-24
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@xx:~#


11th


root@lxx:~# pveversion -v
proxmox-ve: 5.0-25 (running kernel: 4.10.17-3-pve)
pve-manager: 5.0-34 (running version: 5.0-34/b325d69e)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-2-pve: 4.10.17-20
pve-kernel-4.10.17-3-pve: 4.10.17-23
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-10
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: not correctly installed
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
root@laxx:~#
 
Yeah, different kernels. The newer packages are installed everywhere, but several nodes are still running an older kernel; a node only picks up the new kernel after a reboot.
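
(A minimal per-node check, assuming you can tolerate the reboot:)

Code:
uname -r                  # kernel this node is actually running
dpkg -l 'pve-kernel-*'    # kernel packages installed on disk
# if a newer pve-kernel is installed than is running, reboot into it
reboot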

How is this possible when all nodes are up to date?

root@xx:~# apt-get dist-upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
root@xxx:~#

What do I need to do?
 
The issue still remains after matching all the kernels. Sometimes the noVNC console for a VM just gets stuck loading and never opens, and the connection sometimes drops again; this time it shows a connection loss, but I can still move around the cluster.
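
(For recurring corosync membership drops, the Cluster_Manager wiki page linked above also suggests verifying multicast connectivity between all nodes with omping — node1 node2 node3 below are placeholders; run the same command on every node at the same time:)

Code:
apt-get install -y omping
# each node should report close to 0% loss from every other node
omping -c 10000 -i 0.001 -F -q node1 node2 node3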
 
