[SOLVED] watchdog: watchdog detected hard LOCKUP on cpu all. how solve it?

Hi,
what hardware are you running on? Is your firmware up to date?
 
I'm referring to the firmware of your motherboad manufacturer. They typically provide updates to their custom firmware. Please check their homepage for details on which firmware version is available and how to update to the latest version.
 
Could you provide the output of pveversion -v as well as the output of lscpu.

Have you tried using a different Kernel version to see if the issue still persists?
 
USAGE: pveversion [--verbose]
root@hq:~# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-2-pve)
pve-manager: 7.1-7 (running version: 7.1-7/df5740ad)
pve-kernel-helper: 7.1-6
pve-kernel-5.13: 7.1-5
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-4
libpve-storage-perl: 7.0-15
libqb0: not correctly installed
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-4
pve-cluster: 7.1-2
pve-container: 4.1-2
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-3
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3



##############################
root@hq:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 94
Model name: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Stepping: 3
CPU MHz: 3400.000
CPU max MHz: 4000.0000
CPU min MHz: 800.0000
BogoMIPS: 6799.81
Virtualization: VT-x
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 1 MiB
L3 cache: 8 MiB
NUMA node0 CPU(s): 0-7
Vulnerability Itlb multihit: KVM: Mitigation: Split huge pages
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP condi
tional, RSB filling
Vulnerability Srbds: Mitigation; Microcode
Vulnerability Tsx async abort: Mitigation; Clear CPU buffers; SMT vulnerable
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 cl
flush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm
constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_ts
c cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 sss
e3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_dead
line_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault
epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority
ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid r
tm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves
dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flu
sh_l1d arch_capabilities
 
I would recommend to upgrade your system to the latest version of PVE by running apt update && apt full-upgrade. Make sure you have the corresponding repos configured, for details please refer to https://pve.proxmox.com/wiki/Package_Repositories

how to change kernel?got guide?
will lost data if change other version kernel?
You can boot with a different Kernel by selecting it directly at boot from the grub menu. Booting with a different kernel does not lead to data loss.
 
apt update && apt full-upgrade.
performed this update still no able solve the issue.
Did you also reboot the system after the upgrade, so that the newer kernel version is being used?
 
issue resolved. thanks
Glad to hear that. So for the record, in case others might stumble across this thread. The newer kernel version did resolve the issue or did you do something additional?

Also, please mark the thread as solved, thanks!
 
  • Like
Reactions: dickson lee
For future reference: my problem was faulty or, rather, two incompatible RAM kits. Same brand, same speeds, a 16GB kit and a 32GB kit. Independently they work flawlessly, together I have those lockups after about an hour of uptime.
 
  • Like
Reactions: Chris
For future reference: my problem was faulty or, rather, two incompatible RAM kits. Same brand, same speeds, a 16GB kit and a 32GB kit. Independently they work flawlessly, together I have those lockups after about an hour of uptime.
Hello, I'm having the same issue, how did you find out that the problem is memory?
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!