One node showing question marks for other nodes.

mattlehrer

I had just updated my 1st node, rebooted it, and was updating some VMs on my other node when suddenly my 1st node started showing question marks for all the other nodes.

I logged into the 2nd node and it can see everything. I rebooted the 1st node again and it still shows question marks for all others.
I logged into the 3rd node and it can also see everything. I'm kind of scared to reboot the 2nd and 3rd nodes after installing that latest kernel update.

Any help would be appreciated.

Output of pveversion -v


proxmox-ve: 7.4-1 (running kernel: 5.15.108-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-4
pve-kernel-5.13: 7.1-9
pve-kernel-5.11: 7.0-10
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.15.107-1-pve: 5.15.107-1
pve-kernel-5.15.104-1-pve: 5.15.104-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.15.60-2-pve: 5.15.60-2
pve-kernel-5.15.60-1-pve: 5.15.60-1
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
pve-kernel-5.15.39-3-pve: 5.15.39-3
pve-kernel-5.15.39-2-pve: 5.15.39-2
pve-kernel-5.15.39-1-pve: 5.15.39-1
pve-kernel-5.15.35-3-pve: 5.15.35-6
pve-kernel-5.15.35-2-pve: 5.15.35-5
pve-kernel-5.15.35-1-pve: 5.15.35-3
pve-kernel-5.15.30-2-pve: 5.15.30-3
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-5-pve: 5.13.19-13
pve-kernel-5.13.19-4-pve: 5.13.19-9
pve-kernel-5.13.19-3-pve: 5.13.19-7
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-1-pve: 5.11.22-2
ceph-fuse: 15.2.13-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
 
Seeing this in journalctl:

Jul 29 11:05:25 m90q pvedaemon[1199]: <root@pam> successful auth for user 'root@pam'
Jul 29 11:05:25 m90q login[6445]: pam_unix(login:session): session opened for user root(uid=0) by root(uid=0)
Jul 29 11:05:25 m90q login[6445]: pam_systemd(login:session): Failed to create session: Transport endpoint is not connected
Jul 29 11:05:25 m90q login[6450]: ROOT LOGIN on '/dev/pts/0'
Jul 29 11:06:53 m90q systemd[1]: pvescheduler.service: Processes still around after SIGKILL. Ignoring.
Jul 29 11:06:53 m90q systemd[1]: pvescheduler.service: Failed with result 'timeout'.
Jul 29 11:06:53 m90q systemd[1]: pvescheduler.service: Unit process 2085 (pvescheduler) remains running after unit stopped.
Jul 29 11:06:53 m90q systemd[1]: pvescheduler.service: Unit process 2086 (pvescheduler) remains running after unit stopped.
Jul 29 11:06:53 m90q systemd[1]: pvescheduler.service: Unit process 5629 (pvescheduler) remains running after unit stopped.
Jul 29 11:06:53 m90q systemd[1]: Stopped Proxmox VE scheduler.
Jul 29 11:06:53 m90q systemd[1]: Stopping PVE guests...
Jul 29 11:07:26 m90q kernel: INFO: task pvescheduler:2085 blocked for more than 1087 seconds.
Jul 29 11:07:26 m90q kernel: Tainted: P O 5.15.108-1-pve #1
Jul 29 11:07:26 m90q kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 29 11:07:26 m90q kernel: task:pvescheduler state:D stack: 0 pid: 2085 ppid: 1 flags:0x00004004
Jul 29 11:07:26 m90q kernel: Call Trace:
Jul 29 11:07:26 m90q kernel: <TASK>
Jul 29 11:07:26 m90q kernel: __schedule+0x34e/0x1740
Jul 29 11:07:26 m90q kernel: ? terminate_walk+0xe4/0xf0
Jul 29 11:07:26 m90q kernel: ? filename_parentat+0xc0/0x1e0
Jul 29 11:07:26 m90q kernel: schedule+0x69/0x110
Jul 29 11:07:26 m90q kernel: rwsem_down_write_slowpath+0x229/0x4e0
Jul 29 11:07:26 m90q kernel: down_write+0x47/0x60
Jul 29 11:07:26 m90q kernel: filename_create+0x9a/0x160
Jul 29 11:07:26 m90q kernel: do_mkdirat+0x4d/0x160
Jul 29 11:07:26 m90q kernel: __x64_sys_mkdir+0x4c/0x70
Jul 29 11:07:26 m90q kernel: do_syscall_64+0x59/0xc0
Jul 29 11:07:26 m90q kernel: ? do_syscall_64+0x69/0xc0
Jul 29 11:07:26 m90q kernel: ? syscall_exit_to_user_mode+0x27/0x50
Jul 29 11:07:26 m90q kernel: ? do_syscall_64+0x69/0xc0
Jul 29 11:07:26 m90q kernel: ? do_syscall_64+0x69/0xc0
Jul 29 11:07:26 m90q kernel: entry_SYSCALL_64_after_hwframe+0x61/0xcb
Jul 29 11:07:26 m90q kernel: RIP: 0033:0x7f9167b4b047
Jul 29 11:07:26 m90q kernel: RSP: 002b:00007fffdd9fc748 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
Jul 29 11:07:26 m90q kernel: RAX: ffffffffffffffda RBX: 00005602afd032a0 RCX: 00007f9167b4b047
Jul 29 11:07:26 m90q kernel: RDX: 0000000000000027 RSI: 00000000000001ff RDI: 00005602b5d50fe0
Jul 29 11:07:26 m90q kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000001ef2
Jul 29 11:07:26 m90q kernel: R10: 00007fffdd9fc710 R11: 0000000000000246 R12: 00005602b5d50fe0
Jul 29 11:07:26 m90q kernel: R13: 00005602afd08ca8 R14: 00005602b1643370 R15: 00000000000001ff
Jul 29 11:07:26 m90q kernel: </TASK>
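
In case it helps anyone triaging a similar log, here is a minimal Python sketch for pulling the kernel's hung-task reports out of journal text. The function name `find_hung_tasks` and the regular expression are my own illustration, not part of any Proxmox tooling, and the pattern is only assumed to match lines formatted like the "blocked for more than" line in the excerpt above.

```python
import re

# Assumed line shape, taken from the journal excerpt above, e.g.
# "kernel: INFO: task pvescheduler:2085 blocked for more than 1087 seconds."
HUNG_TASK = re.compile(
    r"kernel: INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return (task name, pid, seconds blocked) for each hung-task report."""
    hits = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


# Two lines copied from the excerpt above; only the second is a hung-task report.
sample = [
    "Jul 29 11:06:53 m90q systemd[1]: Stopping PVE guests...",
    "Jul 29 11:07:26 m90q kernel: INFO: task pvescheduler:2085 blocked for more than 1087 seconds.",
]
print(find_hung_tasks(sample))
```

To run it against a live system, the same function could be fed the lines of saved journal output instead of the `sample` list. It only locates the reports; it does not explain why the task was stuck in state D.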
 
I pinned the previous kernel and rebooted, and everything came back up as normal. Then I rebooted again into the newer kernel, and now it's working too. I'm confounded.
 
