These are some of the last logs before the system hangs and becomes completely unresponsive. I have to power-cycle it to get it back online.
From what I gather, corosync reports that the link to host 2 is going down repeatedly. What else should I be looking at to continue troubleshooting?
Code:
Aug 26 03:50:02 pm corosync[7436]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 26 03:50:06 pm corosync[7436]: [QUORUM] Sync members[2]: 1 3
Aug 26 03:50:06 pm corosync[7436]: [TOTEM ] A new membership (1.73b6) was formed. Members
Aug 26 03:50:07 pm corosync[7436]: [KNET ] link: host: 2 link: 0 is down
Aug 26 03:50:07 pm corosync[7436]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 26 03:50:07 pm corosync[7436]: [KNET ] host: host: 2 has no active links
Aug 26 03:50:09 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 10
Aug 26 03:50:10 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 20
Aug 26 03:50:10 pm corosync[7436]: [TOTEM ] Token has not been received in 4527 ms
Aug 26 03:50:11 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 30
Aug 26 03:50:11 pm corosync[7436]: [KNET ] rx: host: 2 link: 0 is up
Aug 26 03:50:11 pm corosync[7436]: [KNET ] link: Resetting MTU for link 0 because host 2 joined
Aug 26 03:50:11 pm corosync[7436]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 26 03:50:11 pm corosync[7436]: [KNET ] pmtud: Global data MTU changed to: 1397
Aug 26 03:50:11 pm pvestatd[7566]: proxmox-backup-client failed: Error: error trying to connect: error connecting to https://pbs001.tuxis.nl:8007/ - dns error: failed to lookup address information: Temporary failure in name resolution
Aug 26 03:50:12 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 40
Aug 26 03:50:13 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 50
Aug 26 03:50:13 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 10
Aug 26 03:50:14 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 60
Aug 26 03:50:14 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 20
Aug 26 03:50:15 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 70
Aug 26 03:50:15 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 30
Aug 26 03:50:16 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 80
Aug 26 03:50:16 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 40
Aug 26 03:50:16 pm corosync[7436]: [KNET ] link: host: 2 link: 0 is down
Aug 26 03:50:16 pm corosync[7436]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 26 03:50:16 pm corosync[7436]: [KNET ] host: host: 2 has no active links
Aug 26 03:50:17 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 90
Aug 26 03:50:17 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 50
Aug 26 03:50:17 pm corosync[7436]: [TOTEM ] Token has not been received in 2737 ms
Aug 26 03:50:18 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 100
Aug 26 03:50:18 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retried 100 times
Aug 26 03:50:18 pm pmxcfs[7284]: [dcdb] crit: cpg_send_message failed: 6
Aug 26 03:50:18 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 60
Aug 26 03:50:19 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 10
Aug 26 03:50:19 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 70
Aug 26 03:50:19 pm corosync[7436]: [KNET ] rx: host: 2 link: 0 is up
Aug 26 03:50:19 pm corosync[7436]: [KNET ] link: Resetting MTU for link 0 because host 2 joined
Aug 26 03:50:19 pm corosync[7436]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 26 03:50:19 pm corosync[7436]: [KNET ] pmtud: Global data MTU changed to: 1397
Aug 26 03:50:20 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 20
Aug 26 03:50:20 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 80
Aug 26 03:50:21 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 30
Aug 26 03:50:21 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 90
Aug 26 03:50:22 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 40
Aug 26 03:50:22 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 100
Aug 26 03:50:22 pm pmxcfs[7284]: [status] notice: cpg_send_message retried 100 times
Aug 26 03:50:22 pm pmxcfs[7284]: [status] crit: cpg_send_message failed: 6
Aug 26 03:50:22 pm corosync[7436]: [TOTEM ] Token has not been received in 7388 ms
Aug 26 03:50:23 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 50
Aug 26 03:50:23 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 10
Aug 26 03:50:24 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 60
Aug 26 03:50:24 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 20
Aug 26 03:50:25 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 70
Aug 26 03:50:25 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 30
Aug 26 03:50:26 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 80
Aug 26 03:50:26 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 40
Aug 26 03:50:26 pm corosync[7436]: [TOTEM ] Token has not been received in 11989 ms
Aug 26 03:50:27 pm pmxcfs[7284]: [dcdb] notice: cpg_send_message retry 90
Aug 26 03:50:27 pm pmxcfs[7284]: [status] notice: cpg_send_message retry 50
Aug 26 03:50:27 pm corosync[7436]: [KNET ] link: host: 2 link: 0 is down
Aug 26 03:50:27 pm corosync[7436]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 26 03:50:27 pm corosync[7436]: [KNET ] host: host: 2 has no active links
Aug 26 03:50:27 pm corosync[7436]: [QUORUM] Sync members[2]: 1 3
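To make the flapping easier to quantify, a rough sketch like the one below could tally the KNET link up/down events per host. The regex and the function name `count_link_events` are my own assumptions, derived only from the log format shown above; this is not an official corosync or Proxmox tool.

```python
import re
from collections import Counter

# Pattern assumed from the corosync lines above, e.g.
#   "[KNET ] link: host: 2 link: 0 is down"
#   "[KNET ] rx: host: 2 link: 0 is up"
KNET_EVENT = re.compile(
    r"\[KNET\s*\] (?:link|rx): host: (\d+) link: (\d+) is (up|down)"
)


def count_link_events(lines):
    """Return a Counter keyed by (host, link, state) for KNET up/down lines."""
    counts = Counter()
    for line in lines:
        match = KNET_EVENT.search(line)
        if match:
            host, link, state = match.groups()
            counts[(int(host), int(link), state)] += 1
    return counts


# A few lines copied from the log above; the last one is not an up/down
# event and should be ignored by the pattern.
sample = [
    "Aug 26 03:50:07 pm corosync[7436]: [KNET ] link: host: 2 link: 0 is down",
    "Aug 26 03:50:11 pm corosync[7436]: [KNET ] rx: host: 2 link: 0 is up",
    "Aug 26 03:50:16 pm corosync[7436]: [KNET ] link: host: 2 link: 0 is down",
    "Aug 26 03:50:11 pm corosync[7436]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)",
]

for (host, link, state), n in sorted(count_link_events(sample).items()):
    print(f"host {host} link {link} {state}: {n}")
```

The same function could be pointed at a saved copy of the full journal (for example the output of `journalctl -u corosync` written to a file) to see how often each link drops over a longer period.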
Code:
pm:~# pveversion -v
proxmox-ve: 7.4-1 (running kernel: 5.15.111-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-5
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.111-1-pve: 5.15.111-1
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.13.19-6-pve: 5.13.19-15
ceph: 17.2.6-pve1
ceph-fuse: 17.2.6-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
Code:
root@pm2:~# pveversion -v
proxmox-ve: 7.4-1 (running kernel: 5.15.111-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-5
pve-kernel-5.15.111-1-pve: 5.15.111-1
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.107-1-pve: 5.15.107-1
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph: 17.2.6-pve1
ceph-fuse: 17.2.6-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
Code:
root@pm3:~# pveversion -v
proxmox-ve: 7.4-1 (running kernel: 5.15.111-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-5
pve-kernel-5.15.111-1-pve: 5.15.111-1
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
ceph: 17.2.6-pve1
ceph-fuse: 17.2.6-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1