I hate to ask, but I have tried to understand as much as I could about what is going on with one of my hosts, and I don't get it.
For a few weeks now, the host has been crashing at random with the following message on screen. Sorry for the poor copy and paste; the text comes from a picture taken with an iPhone.
I really don't know where to look. I've tested my ECC RAM and found no errors there. I checked the disks and found no errors either. Do you have any suggestions? The system is up to date.
				
[129908.035705]  ? secondary_startup_64_no_verify+0xc2/0xcb
[129908.035707]  </TASK>
[129908.096691] rcu: rcu_sched kthread timer wakeup didn't happen for 14253 jiffies! g816569729 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[129908.080640] rcu:     Possible timer handling issue on cpu=6 timer-softirq=800692
[129908.081361] rcu: rcu_sched kthread starved for 14254 jiffies! g816563729 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=6
[129908.082098] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[129908.082834] rcu: RCU grace-period kthread stack dump:
[129908.083576] task:rcu_sched state:I stack: 0 pid: 14 ppid: 2 flags:0x00004000
[129908.084328] Call Trace:
[129908.085074]  <TASK>
[129908.085816]  __schedule+0x33d/0x1750
[129908.086564]
[129908.087311]  ? lock_timer_base+0x3b/0xd0
[129908.088055]  mod_timer+0x271/0x440
[129908.088798]  schedule+0x4e/0xc0
[129908.089539]
[129908.090269]
[129908.090991]
[129908.091703]  rcu_gp_kthread+0xa7/0x130
                 ? rcu_gp_init+0x5f0/0x5f0
[129908.092413]  kthread+0x12a/0x150
[129908.093113]  ? set_kthread_struct+0x50/0x50
[129908.093808]  ret_from_fork+0x22/0x30
[129908.094501]  </TASK>
[129908.095163] rcu: Stack dump where RCU GP kthread last ran:
[129908.095808] Sending NMI from CPU 7 to CPUs 6:
[129908.096450] NMI backtrace for cpu 6
[129908.096450] CPU: 6 PID: 7217 Comm: CPU 1/KVM Tainted: I D O 5.15.53-1-pve #1
[129908.096451] Hardware name: ASUS System Product Name/Pro B550M-C, BIOS 2804 06/19/2022
[129908.096452] RIP: 0010:native_queued_spin_lock_slowpath+0x1f5/0x240
[129908.096454] Code: c5 40 19 03 00 49 81 fe ff 1f 00 00 77 49 4e 03 2c f5 e0 ea 8b a0 4d 89 65 00 41 8b 44 24 08 85 c0 75 0b f3 90 41 8b 44 24 08 <85> c0 74 f5 49 8b 0c 24 48 85 c9 0f 84 47 ff ff ff 0f 0d 09
[129908.096455] RSP: 0018:ffffbe02120a79a0 EFLAGS: 00000046
[129908.096455] RAX: 0000000000000000 RBX: ffff9ecbce4f0bc0 RCX: 0000000000000010
[129908.096456] RDX: 00000000001c0000 RSI: 00000000001c0000 RDI: ffff9ecbce4f0bc0
[129908.096457] RBP: ffffbe02120079c8 R08: 0000000000000020 R09: 0000000000000000
[129908.096457] R10: 0000000000000020 R11: fffffffffffff000 R12: ffff9ecbce5b1940
[129908.096458] R13: ffff9ecbce7f1940 R14: 000000000000000f R15: 00000000001c0000
[129908.096459] FS:  00007f8e57fff700(0000) GS:ffff9ecbce580000(0000) knlGS:0000000000000000
[129908.096460] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[129908.096461] CR2: 000000c00005010 CR3: 0000000253018000 CR4: 0000000000350ee0
[129908.096461] Call Trace:
[129908.096462]  <TASK>
[129908.096462]  _raw_spin_lock+0x22/0x30
[129908.096464]  raw_spin_rq_lock_nested+0x17/0x80
[129908.096465]  load_balance+0x550/0x10b0
[129908.096468]
[129908.096470]  newidle_balance+0x2af/0x470
                 pick_next_task_fair+0x40/0x460
[129908.096472]  __schedule+0x195/0x1750
[129908.096473]  ? kvm_get_rflags+0x12/0x30 [kvm]
[129908.096497]
[129908.096499]  schedule+0x4e/0xc0
                 kvm_vcpu_block+0x70/0x3b0 [kvm]
[129908.096520]
[129908.096545]  kvm_arch_vcpu_ioctl_run+0x77e/0x1730 [kvm]
                 ? vcpu_put+0x1b/0x40 [kvm]
[129908.096567]
[129908.096590]  kvm_vcpu_ioctl+0x252/0x6b0 [kvm]
                 ? kvm_vcpu_ioctl+0x2bb/0x6b0 [kvm]
[129908.096611]  ? kvm_on_user_return+0x80/0xf0 [kvm]
[129908.096635]  __fget_files+0x86/0xc0
[129908.096636]  __x64_sys_ioctl+0x95/0xd0
[129908.096638]
[129908.096639]  do_syscall_64+0x5c/0xc0
                 ? exit_to_user_mode_prepare+0x37/0x1b0
[129908.096640]  ? syscall_exit_to_user_mode+0x27/0x50
[129908.096642]  do_syscall_64+0x69/0xc0
[129908.096643]
[129908.096643]  ? do_syscall_64+0x69/0xc0
[129908.096644]  ? do_syscall_64+0x69/0xc0
[129908.096645]  ? do_syscall_64+0x69/0xc0
[129908.096647] Code: 00 00 00 48 8b 05 c9 91 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 28 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 79 01 c9 48 8b 0d 99 91 0c 00 f7 d8 64 89 0.
[129908.096648] RSP: 002b:00007f8e57ffa288 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[129908.096649] RAX: ffffffffffffffda RBX: 000000000000a80 RCX: 00007f8e680f0cc7
[129908.096650] RDX: 0000000000000000 RSI: 000000000000a880 RDI: 0000000000000016
[129908.096651] RBP: 00005634304d8900 R08: 000056343be62c88 R09: 000000000000ffff
[129908.096651] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[129908.096652] R13: 000056943c2c8480 R14: 0000000000000002 R15: 0000000000000000
pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-10
pve-kernel-helper: 7.2-10
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
ceph-fuse: 16.2.9-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.5-1
proxmox-backup-file-restore: 2.2.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
 
	