Hi!
I have a two-node cluster running Proxmox 7.4-17 that has been up for a few years. Node 2 suddenly dropped off the network and kept becoming unreachable after a few minutes. I first suspected a lack of storage space, because I had grown a thin volume a bit too much and thought that might have caused it. I have since installed a new hard drive with sufficient space, reinstalled the node, and added it back to the cluster, but I still get the exact same problem: the node becomes unreachable through both the GUI and SSH. Any tips on next steps for troubleshooting this?
The syslog does not show any error messages that I can see, and simply stops outputting when the node becomes unreachable:
Code:
Oct 18 13:19:33 pve kernel: fwbr104i0: port 1(fwln104i0) entered blocking state
Oct 18 13:19:33 pve kernel: fwbr104i0: port 1(fwln104i0) entered forwarding state
Oct 18 13:19:33 pve kernel: fwbr104i0: port 2(tap104i0) entered blocking state
Oct 18 13:19:33 pve kernel: fwbr104i0: port 2(tap104i0) entered disabled state
Oct 18 13:19:33 pve kernel: fwbr104i0: port 2(tap104i0) entered blocking state
Oct 18 13:19:33 pve kernel: fwbr104i0: port 2(tap104i0) entered forwarding state
Oct 18 13:19:34 pve chronyd[663]: Selected source 185.35.202.197 (2.debian.pool.ntp.org)
Oct 18 13:19:34 pve chronyd[663]: System clock TAI offset set to 37 seconds
Oct 18 13:19:34 pve kernel: FS-Cache: Loaded
Oct 18 13:19:34 pve kernel: FS-Cache: Netfs 'cifs' registered for caching
Oct 18 13:19:34 pve kernel: Key type cifs.spnego registered
Oct 18 13:19:34 pve kernel: Key type cifs.idmap registered
Oct 18 13:19:34 pve kernel: CIFS: Attempting to mount \\192.168.1.100\Backup
Oct 18 13:19:36 pve pve-guests[919]: <root@pam> end task UPID:pve:00000398:00000653:652FBF43:startall::root@pam: OK
Oct 18 13:19:36 pve systemd[1]: Finished PVE guests.
Oct 18 13:19:36 pve systemd[1]: Starting Proxmox VE scheduler...
Oct 18 13:19:37 pve pvescheduler[1028]: starting server
Oct 18 13:19:37 pve systemd[1]: Started Proxmox VE scheduler.
Oct 18 13:19:37 pve systemd[1]: Reached target Multi-User System.
Oct 18 13:19:37 pve systemd[1]: Reached target Graphical Interface.
Oct 18 13:19:37 pve systemd[1]: Starting Update UTMP about System Runlevel Changes...
Oct 18 13:19:37 pve systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Oct 18 13:19:37 pve systemd[1]: Finished Update UTMP about System Runlevel Changes.
Oct 18 13:19:37 pve systemd[1]: Startup finished in 13.572s (firmware) + 5.597s (loader) + 8.737s (kernel) + 13.279s (userspace) = 41.187s.
Oct 18 13:20:40 pve chronyd[663]: Selected source 62.101.228.30 (2.debian.pool.ntp.org)
Oct 18 13:21:17 pve pmxcfs[741]: [status] notice: received log
Oct 18 13:21:17 pve sshd[1629]: Accepted publickey for root from 192.168.1.250 port 58978 ssh2: RSA SHA256:RbAcAWP3yChAIw13FMCz5SQo0qp8eCcTZedb5kuTxP8
Oct 18 13:21:17 pve sshd[1629]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Oct 18 13:21:17 pve systemd[1]: Created slice User Slice of UID 0.
Oct 18 13:21:17 pve systemd[1]: Starting User Runtime Directory /run/user/0...
Oct 18 13:21:17 pve systemd-logind[561]: New session 1 of user root.
Oct 18 13:21:17 pve systemd[1]: Finished User Runtime Directory /run/user/0.
Oct 18 13:21:17 pve systemd[1]: Starting User Manager for UID 0...
Oct 18 13:21:17 pve systemd[1632]: pam_unix(systemd-user:session): session opened for user root(uid=0) by (uid=0)
Oct 18 13:21:17 pve systemd[1632]: Queued start job for default target Main User Target.
Oct 18 13:21:17 pve systemd[1632]: Created slice User Application Slice.
Oct 18 13:21:17 pve systemd[1632]: Reached target Paths.
Oct 18 13:21:17 pve systemd[1632]: Reached target Timers.
Oct 18 13:21:17 pve systemd[1632]: Listening on GnuPG network certificate management daemon.
Oct 18 13:21:17 pve systemd[1632]: Listening on GnuPG cryptographic agent and passphrase cache (access for web browsers).
Oct 18 13:21:17 pve systemd[1632]: Listening on GnuPG cryptographic agent and passphrase cache (restricted).
Oct 18 13:21:17 pve systemd[1632]: Listening on GnuPG cryptographic agent (ssh-agent emulation).
Oct 18 13:21:17 pve systemd[1632]: Listening on GnuPG cryptographic agent and passphrase cache.
Oct 18 13:21:17 pve systemd[1632]: Reached target Sockets.
Oct 18 13:21:17 pve systemd[1632]: Reached target Basic System.
Oct 18 13:21:17 pve systemd[1632]: Reached target Main User Target.
Oct 18 13:21:17 pve systemd[1632]: Startup finished in 184ms.
Oct 18 13:21:17 pve systemd[1]: Started User Manager for UID 0.
Oct 18 13:21:17 pve systemd[1]: Started Session 1 of user root.
-- Reboot --
Oct 18 13:28:42 pve kernel: Linux version 5.15.102-1-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15.102-1 (2023-03-14T13:48Z) ()
Oct 18 13:28:42 pve kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.102-1-pve root=/dev/mapper/pve-root ro quiet
Oct 18 13:28:42 pve kernel: KERNEL supported cpus:
Oct 18 13:28:42 pve kernel: Intel GenuineIntel
Oct 18 13:28:42 pve kernel: AMD AuthenticAMD
Oct 18 13:28:42 pve kernel: Hygon HygonGenuine
Oct 18 13:28:42 pve kernel: Centaur CentaurHauls
Oct 18 13:28:42 pve kernel: zhaoxin Shanghai
Oct 18 13:28:42 pve kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Oct 18 13:28:42 pve kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Oct 18 13:28:42 pve kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Oct 18 13:28:42 pve kernel: x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
Oct 18 13:28:42 pve kernel: x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
Oct 18 13:28:42 pve kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Oct 18 13:28:42 pve kernel: x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64
Oct 18 13:28:42 pve kernel: x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64
Oct 18 13:28:42 pve kernel: x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.
Oct 18 13:28:42 pve kernel: signal: max sigframe size: 2032
Oct 18 13:28:42 pve kernel: BIOS-provided physical RAM map:
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x0000000000000000-0x0000000000057fff] usable
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x0000000000058000-0x0000000000058fff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x0000000000059000-0x000000000009efff] usable
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x0000000000100000-0x000000003fffffff] usable
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x0000000040000000-0x00000000403fffff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x0000000040400000-0x000000006e287fff] usable
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000006e288000-0x000000006e288fff] ACPI NVS
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000006e289000-0x000000006e289fff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000006e28a000-0x0000000079da9fff] usable
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x0000000079daa000-0x000000007a23efff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000007a23f000-0x000000007a284fff] ACPI data
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000007a285000-0x000000007aa5cfff] ACPI NVS
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000007aa5d000-0x000000007af4dfff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000007af4e000-0x000000007affdfff] type 20
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000007affe000-0x000000007affefff] usable
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x000000007afff000-0x000000007fffffff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
Oct 18 13:28:42 pve kernel: BIOS-e820: [mem 0x0000000100000000-0x000000087effffff] usable
Oct 18 13:28:42 pve kernel: NX (Execute Disable) protection: active
Oct 18 13:28:42 pve kernel: efi: EFI v2.70 by American Megatrends
Oct 18 13:28:42 pve kernel: efi: ACPI 2.0=0x7a24d000 ACPI=0x7a24d000 SMBIOS=0x7ae08000 SMBIOS 3.0=0x7ae07000 MEMATTR=0x7834f018 ESRT=0x7ae04418
Oct 18 13:28:42 pve kernel: secureboot: Secure boot disabled
Oct 18 13:28:42 pve kernel: SMBIOS 3.1.1 present.
Oct 18 13:28:42 pve kernel: DMI: /NUC7i5BNB, BIOS BNKBL357.86A.0083.2020.0714.1344 07/14/2020
Oct 18 13:28:42 pve kernel: tsc: Detected 2200.000 MHz processor
Oct 18 13:28:42 pve kernel: tsc: Detected 2199.996 MHz TSC