I'm fairly new to Proxmox and have had everything running rock solid for about 4 months (single node). I've been updating through the GUI every 2-3 weeks, and since my last update on Jul 29 my node has been rebooting 3-6 times a day at random times. I haven't been able to figure out what's causing it. The summary charts don't show anything unusual except a spike in CPU usage after each reboot while my VMs/LXCs start up.
In my initial search, I've tried doing the following:
- Memtest - Passed
- Full shutdown and cold boot - Booted with no errors
- Unseated and reseated my SATA cable
I'm running Proxmox 7.4-16 on kernel 5.15.108 on the following hardware, with the latest BIOS:
Minisforum NAB6 mini PC
- Intel Core i7-12650H
- 64 GB DDR4-3200 CL20 RAM
- 500 GB Samsung SSD
- 2 TB NVMe for VMs and containers
- PC has been plugged into a UPS from the beginning
I've removed some of the log since the post length is limited.
Code:
Aug 06 14:17:01 pve CRON[196143]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Aug 06 14:17:01 pve CRON[196144]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Aug 06 14:17:01 pve CRON[196143]: pam_unix(cron:session): session closed for user root
Aug 06 14:28:41 pve pvedaemon[1720]: <root@pam> successful auth for user 'root@pam'
Aug 06 14:28:44 pve pvedaemon[1720]: <root@pam> successful auth for user 'root@pam'
Aug 06 14:59:17 pve smartd[1332]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 58 to 60
-- Reboot --
Aug 06 15:02:44 pve kernel: Linux version 5.15.108-1-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15.108-2 (2023-07-20T10:06Z) ()
Aug 06 15:02:44 pve kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.108-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on
Aug 06 15:02:44 pve kernel: KERNEL supported cpus:
Aug 06 15:02:44 pve kernel: Intel GenuineIntel
Aug 06 15:02:44 pve kernel: AMD AuthenticAMD
Aug 06 15:02:44 pve kernel: Hygon HygonGenuine
Aug 06 15:02:44 pve kernel: Centaur CentaurHauls
Aug 06 15:02:44 pve kernel: zhaoxin Shanghai
Aug 06 15:02:44 pve kernel: x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
Aug 06 15:02:44 pve kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Aug 06 15:02:44 pve kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Aug 06 15:02:44 pve kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Aug 06 15:02:44 pve kernel: x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
Aug 06 15:02:44 pve kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Aug 06 15:02:44 pve kernel: x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8
Aug 06 15:02:44 pve kernel: x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
Aug 06 15:02:44 pve kernel: signal: max sigframe size: 3632
Aug 06 15:02:44 pve kernel: BIOS-provided physical RAM map:
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x000000000009e000-0x000000000009efff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x000000000009f000-0x000000000009ffff] usable
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000404f3fff] usable
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000404f4000-0x00000000435f3fff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000435f4000-0x00000000436b6fff] ACPI data
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000436b7000-0x00000000437bcfff] ACPI NVS
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000437bd000-0x0000000043e65fff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x0000000043e66000-0x0000000043efefff] type 20
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x0000000043eff000-0x0000000043efffff] usable
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x0000000043f00000-0x0000000049ffffff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x000000004a200000-0x000000004a3fffff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x000000004b000000-0x00000000503fffff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000c0000000-0x00000000cfffffff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000fed20000-0x00000000fed7ffff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
Aug 06 15:02:44 pve kernel: BIOS-e820: [mem 0x0000000100000000-0x00000010afbfffff] usable
Aug 06 15:02:44 pve kernel: NX (Execute Disable) protection: active
Aug 06 15:02:44 pve kernel: efi: EFI v2.80 by American Megatrends
Aug 06 15:02:44 pve kernel: efi: ACPI=0x43739000 ACPI 2.0=0x43739014 TPMFinalLog=0x43708000 SMBIOS=0x43c9e000 SMBIOS 3.0=0x43c9d000 MEMATTR=0x374df018 ESRT=0x398fc818
Aug 06 15:02:44 pve kernel: secureboot: Secure boot disabled
Aug 06 15:02:44 pve kernel: SMBIOS 3.4.0 present.
Aug 06 15:02:44 pve kernel: DMI: Micro Computer(HK) Tech Limited NAB6/AHBNB, BIOS 1.00 03/21/2023
Aug 06 15:02:44 pve kernel: tsc: Detected 2700.000 MHz processor
Aug 06 15:02:44 pve kernel: tsc: Detected 2688.000 MHz TSC
Aug 06 15:02:44 pve kernel: e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
Aug 06 15:02:44 pve kernel: e820: remove [mem 0x000a0000-0x000fffff] usable
Aug 06 15:02:44 pve kernel: last_pfn = 0x10afc00 max_arch_pfn = 0x400000000
Aug 06 15:02:44 pve kernel: x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
Aug 06 15:02:44 pve kernel: last_pfn = 0x43f00 max_arch_pfn = 0x400000000
Aug 06 15:02:44 pve kernel: esrt: Reserving ESRT space from 0x00000000398fc818 to 0x00000000398fc8c8.
Aug 06 15:02:44 pve kernel: e820: update [mem 0x398fc000-0x398fcfff] usable ==> reserved
Aug 06 15:02:44 pve kernel: Using GB pages for direct mapping
Aug 06 15:02:44 pve kernel: Incomplete global flushes, disabling PCID
Aug 06 15:02:44 pve kernel: secureboot: Secure boot disabled
Aug 06 15:02:44 pve kernel: RAMDISK: [mem 0x2ca8e000-0x30545fff]
Aug 06 15:02:44 pve kernel: ACPI: Early table checksum verification disabled
Aug 06 15:02:44 pve kernel: ACPI: RSDP 0x0000000043739014 000024 (v02 ALASKA)
Aug 06 15:02:44 pve kernel: ACPI: XSDT 0x0000000043738728 0000FC (v01 ALASKA A M I 01072009 AMI 01000013)
Aug 06 15:02:44 pve kernel: ACPI: FACP 0x00000000436B4000 000114 (v06 ALASKA A M I 01072009 AMI 01000013)
Aug 06 15:02:44 pve kernel: ACPI: DSDT 0x0000000043649000 06ABF0 (v02 ALASKA A M I 01072009 INTL 20200717)
Aug 06 15:02:44 pve kernel: ACPI: FACS 0x00000000437BC000 000040
Aug 06 15:02:44 pve kernel: ACPI: FIDT 0x0000000043648000 00009C (v01 ALASKA A M I 01072009 AMI 00010013)
Aug 06 15:02:44 pve kernel: ACPI: SSDT 0x00000000436B6000 00038C (v02 PmaxDv Pmax_Dev 00000001 INTL 20200717)
Aug 06 15:02:44 pve kernel: ACPI: SSDT 0x0000000043642000 005D0B (v02 CpuRef CpuSsdt 00003000 INTL 20200717)
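In case it helps anyone compare the other reboots, here is a rough Python sketch of how the last few lines before each `-- Reboot --` marker could be pulled out of a saved journal dump. The function name `lines_before_reboots` and the shortened sample text are made up for illustration, and it assumes the marker appears on its own line exactly as in the excerpt above.

```python
from collections import deque

REBOOT_MARKER = "-- Reboot --"  # marker as it appears in the journal excerpt


def lines_before_reboots(journal_text, context=3):
    """Return, for each reboot marker, the last `context` lines logged before it."""
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    results = []
    for line in journal_text.splitlines():
        if line.strip() == REBOOT_MARKER:
            results.append(list(recent))
            recent.clear()  # start a fresh window for the next boot
        else:
            recent.append(line)
    return results


# Shortened sample taken from the log above (hypothetical input for the sketch)
sample = """\
Aug 06 14:28:44 pve pvedaemon[1720]: <root@pam> successful auth for user 'root@pam'
Aug 06 14:59:17 pve smartd[1332]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 58 to 60
-- Reboot --
Aug 06 15:02:44 pve kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.108-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on
"""

for block in lines_before_reboots(sample, context=2):
    print("\n".join(block))
    print("-----")
```

Running this over the full journal would show whether the same kind of message is always the last one written before the node goes down.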
I've also found this using dmesg:
Code:
[ 0.327302] ACPI: Added _OSI(Module Device)
[ 0.327302] ACPI: Added _OSI(Processor Device)
[ 0.327302] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 0.327302] ACPI: Added _OSI(Processor Aggregator Device)
[ 0.327302] ACPI: Added _OSI(Linux-Dell-Video)
[ 0.327302] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[ 0.327302] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[ 0.440686] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS01], AE_NOT_FOUND (20210730/dswload2-162)
[ 0.440695] fbcon: Taking over console
[ 0.440703] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20210730/psobject-220)
[ 0.440708] ACPI: Skipping parse of AML opcode: Scope (0x0010)
[ 0.440711] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS02], AE_NOT_FOUND (20210730/dswload2-162)
[ 0.440716] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20210730/psobject-220)
[ 0.440719] ACPI: Skipping parse of AML opcode: Scope (0x0010)
[ 0.440722] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS03], AE_NOT_FOUND (20210730/dswload2-162)
[ 0.440726] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20210730/psobject-220)
[ 0.440729] ACPI: Skipping parse of AML opcode: Scope (0x0010)
[ 0.440731] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS04], AE_NOT_FOUND (20210730/dswload2-162)
[ 0.440734] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20210730/psobject-220)
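To summarize which ACPI symbols fail to resolve without reading every line, something like the following Python sketch could be used on saved dmesg output. The helper name `unresolved_acpi_symbols` is hypothetical, and the regex only assumes the `Could not resolve symbol [...]` wording seen above. It just lists the symbols and says nothing about whether they are related to the reboots.

```python
import re

# Matches the symbol path inside "Could not resolve symbol [...]" messages
SYMBOL_RE = re.compile(r"Could not resolve symbol \[([^\]]+)\]")


def unresolved_acpi_symbols(dmesg_text):
    """Return the unique unresolved ACPI symbols in order of first appearance."""
    seen = []
    for match in SYMBOL_RE.finditer(dmesg_text):
        symbol = match.group(1)
        if symbol not in seen:
            seen.append(symbol)
    return seen


# A few lines copied from the dmesg output above (raw string keeps the backslashes)
sample = r"""
[ 0.440686] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS01], AE_NOT_FOUND (20210730/dswload2-162)
[ 0.440703] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20210730/psobject-220)
[ 0.440711] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.TXHC.RHUB.SS02], AE_NOT_FOUND (20210730/dswload2-162)
"""

print(unresolved_acpi_symbols(sample))
```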