Issue with latest (4.13.13-5-pve) kernel?

tycoonbob

Hi all,

I recently updated my 3-node cluster as well as a standalone PVE host, and since the reboot I have a specific problem running Splunk Universal Forwarders: starting one triggers what looks like a kernel panic.

This is what I get when I try to start my Splunk Universal Forwarder:
Code:
Jan 24 15:03:23 mjolnir kernel: [71447.073437] PGD 0
Jan 24 15:03:23 mjolnir kernel: [71447.073438] P4D 0
Jan 24 15:03:23 mjolnir kernel: [71447.074111]
Jan 24 15:03:23 mjolnir kernel: [71447.075404] Oops: 0010 [#4] SMP PTI
Jan 24 15:03:23 mjolnir kernel: [71447.076043] Modules linked in: binfmt_misc rpcsec_gss_krb5 ip_set ip6table_filter ip6_tables iptable_filter softdog openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack libcrc32c nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate ipmi_ssif intel_rapl_perf snd_pcm snd_timer snd soundcore mgag200 pcspkr ttm drm_kms_helper joydev input_leds drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt mei_me mei shpchp ioatdma lpc_ich wmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core nfsd iscsi_tcp
Jan 24 15:03:23 mjolnir kernel: [71447.081098]  auth_rpcgss libiscsi_tcp nfs_acl libiscsi lockd scsi_transport_iscsi grace sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ses enclosure hid_generic usbkbd usbmouse usbhid hid mpt3sas raid_class ahci i2c_i801 ixgbe(O) libahci igb(O) isci dca ptp libsas pps_core scsi_transport_sas
Jan 24 15:03:23 mjolnir kernel: [71447.083475] CPU: 31 PID: 19251 Comm: splunkd Tainted: P      D    O    4.13.13-5-pve #1
Jan 24 15:03:23 mjolnir kernel: [71447.084297] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
Jan 24 15:03:23 mjolnir kernel: [71447.085136] task: ffff91710b0b2e80 task.stack: ffffb2dfdb478000
Jan 24 15:03:23 mjolnir kernel: [71447.085987] RIP: 0010:0x5c0
Jan 24 15:03:23 mjolnir kernel: [71447.086791] RSP: 0018:ffffb2dfdb47bf50 EFLAGS: 00010202
Jan 24 15:03:23 mjolnir kernel: [71447.087585] RAX: 000000000000270f RBX: 0000000000000000 RCX: 00007f7eec3c5259
Jan 24 15:03:23 mjolnir kernel: [71447.088370] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000006
Jan 24 15:03:23 mjolnir kernel: [71447.089135] RBP: 0000000000000000 R08: 000000000000000a R09: 00007f7eebeae630
Jan 24 15:03:23 mjolnir kernel: [71447.089880] R10: 00000000000005c0 R11: ffff91710b0b2e80 R12: 0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.090608] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.091319] FS:  00007f7eed79c780(0000) GS:ffff917a1f5c0000(0000) knlGS:0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.092019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 24 15:03:23 mjolnir kernel: [71447.092703] CR2: 00000000000005c0 CR3: 0000002f87608004 CR4: 00000000000606e0
Jan 24 15:03:23 mjolnir kernel: [71447.093407] Call Trace:
Jan 24 15:03:23 mjolnir kernel: [71447.094078]  ? entry_SYSCALL_64_fastpath+0x33/0xa3
Jan 24 15:03:23 mjolnir kernel: [71447.094739] Code:  Bad RIP value.
Jan 24 15:03:23 mjolnir kernel: [71447.096051] CR2: 00000000000005c0
Jan 24 15:03:23 mjolnir kernel: [71447.096712] ---[ end trace 2bf706acbbb22616 ]---
Jan 24 15:03:23 mjolnir kernel: [71447.495106] PGD 0
Jan 24 15:03:23 mjolnir kernel: [71447.495107] P4D 0
Jan 24 15:03:23 mjolnir kernel: [71447.495854]
Jan 24 15:03:23 mjolnir kernel: [71447.497157] Oops: 0010 [#5] SMP PTI
Jan 24 15:03:23 mjolnir kernel: [71447.497817] Modules linked in: binfmt_misc rpcsec_gss_krb5 ip_set ip6table_filter ip6_tables iptable_filter softdog openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack libcrc32c nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate ipmi_ssif intel_rapl_perf snd_pcm snd_timer snd soundcore mgag200 pcspkr ttm drm_kms_helper joydev input_leds drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt mei_me mei shpchp ioatdma lpc_ich wmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core nfsd iscsi_tcp
Jan 24 15:03:23 mjolnir kernel: [71447.502825]  auth_rpcgss libiscsi_tcp nfs_acl libiscsi lockd scsi_transport_iscsi grace sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ses enclosure hid_generic usbkbd usbmouse usbhid hid mpt3sas raid_class ahci i2c_i801 ixgbe(O) libahci igb(O) isci dca ptp libsas pps_core scsi_transport_sas
Jan 24 15:03:23 mjolnir kernel: [71447.505193] CPU: 14 PID: 19257 Comm: splunkd Tainted: P      D    O    4.13.13-5-pve #1
Jan 24 15:03:23 mjolnir kernel: [71447.506035] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
Jan 24 15:03:23 mjolnir kernel: [71447.506869] task: ffff91710b0b2e80 task.stack: ffffb2dfdb53c000
Jan 24 15:03:23 mjolnir kernel: [71447.507690] RIP: 0010:0x20
Jan 24 15:03:23 mjolnir kernel: [71447.508488] RSP: 0018:ffffb2dfdb53ff50 EFLAGS: 00010202
Jan 24 15:03:23 mjolnir kernel: [71447.509277] RAX: 000000000000270f RBX: 0000000000000000 RCX: 00007f0f275e2259
Jan 24 15:03:23 mjolnir kernel: [71447.510080] RDX: 0000000000000000 RSI: 00007f0f2700d120 RDI: 0000000000000004
Jan 24 15:03:23 mjolnir kernel: [71447.510843] RBP: 0000000000000000 R08: 00007f0f270a6970 R09: 00007fff7f686110
Jan 24 15:03:23 mjolnir kernel: [71447.511584] R10: 0000000000000020 R11: ffff91710b0b2e80 R12: 0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.512311] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.513018] FS:  00007f0f289b9780(0000) GS:ffff917a1f380000(0000) knlGS:0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.513740] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 24 15:03:23 mjolnir kernel: [71447.514421] CR2: 0000000000000020 CR3: 00000025b6ca8004 CR4: 00000000000606e0
Jan 24 15:03:23 mjolnir kernel: [71447.515100] Call Trace:
Jan 24 15:03:23 mjolnir kernel: [71447.515766]  ? entry_SYSCALL_64_fastpath+0x33/0xa3
Jan 24 15:03:23 mjolnir kernel: [71447.516425] Code:  Bad RIP value.
Jan 24 15:03:23 mjolnir kernel: [71447.517758] CR2: 0000000000000020
Jan 24 15:03:23 mjolnir kernel: [71447.518420] ---[ end trace 2bf706acbbb22617 ]---
Jan 24 15:03:23 mjolnir kernel: [71447.667074] PGD 0
Jan 24 15:03:23 mjolnir kernel: [71447.667074] P4D 0
Jan 24 15:03:23 mjolnir kernel: [71447.667074]
Jan 24 15:03:23 mjolnir kernel: [71447.667076] Oops: 0010 [#6] SMP PTI
Jan 24 15:03:23 mjolnir kernel: [71447.667077] Modules linked in: binfmt_misc rpcsec_gss_krb5 ip_set ip6table_filter ip6_tables iptable_filter softdog openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack libcrc32c nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate ipmi_ssif intel_rapl_perf snd_pcm snd_timer snd soundcore mgag200 pcspkr ttm drm_kms_helper joydev input_leds drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt mei_me mei shpchp ioatdma lpc_ich wmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core nfsd iscsi_tcp
Jan 24 15:03:23 mjolnir kernel: [71447.667102]  auth_rpcgss libiscsi_tcp nfs_acl libiscsi lockd scsi_transport_iscsi grace sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ses enclosure hid_generic usbkbd usbmouse usbhid hid mpt3sas raid_class ahci i2c_i801 ixgbe(O) libahci igb(O) isci dca ptp libsas pps_core scsi_transport_sas
Jan 24 15:03:23 mjolnir kernel: [71447.667116] CPU: 26 PID: 19267 Comm: splunkd Tainted: P      D    O    4.13.13-5-pve #1
Jan 24 15:03:23 mjolnir kernel: [71447.667117] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
Jan 24 15:03:23 mjolnir kernel: [71447.667118] task: ffff91710b0b0000 task.stack: ffffb2dfdb668000
Jan 24 15:03:23 mjolnir kernel: [71447.667119] RIP: 0010:0xffff
Jan 24 15:03:23 mjolnir kernel: [71447.667119] RSP: 0018:ffffb2dfdb66bf50 EFLAGS: 00010202
Jan 24 15:03:23 mjolnir kernel: [71447.667120] RAX: 000000000000270f RBX: 0000000000000000 RCX: 00007fec64a4e259
Jan 24 15:03:23 mjolnir kernel: [71447.667121] RDX: 0000000000000044 RSI: 0000000000000004 RDI: 0000555a188da5c0
Jan 24 15:03:23 mjolnir kernel: [71447.667121] RBP: 0000000000000000 R08: 000000000000000f R09: 00007ffcbde76c8d
Jan 24 15:03:23 mjolnir kernel: [71447.667122] R10: 000000000000ffff R11: ffff91710b0b0000 R12: 0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.667122] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.667123] FS:  00007fec65e25780(0000) GS:ffff917a1f480000(0000) knlGS:0000000000000000
Jan 24 15:03:23 mjolnir kernel: [71447.667124] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 24 15:03:23 mjolnir kernel: [71447.667124] CR2: 000000000000ffff CR3: 0000001a8b716006 CR4: 00000000000606e0
Jan 24 15:03:23 mjolnir kernel: [71447.667125] Call Trace:
Jan 24 15:03:23 mjolnir kernel: [71447.667129]  ? entry_SYSCALL_64_fastpath+0x33/0xa3
Jan 24 15:03:23 mjolnir kernel: [71447.667130] Code:  Bad RIP value.
Jan 24 15:03:23 mjolnir kernel: [71447.667132] CR2: 000000000000ffff
Jan 24 15:03:23 mjolnir kernel: [71447.667133] ---[ end trace 2bf706acbbb22618 ]---
Jan 24 15:26:25 mjolnir kernel: [72829.323115] sh (15544): drop_caches: 3
Jan 24 15:28:48 mjolnir kernel: [72972.684478] sh (25897): drop_caches: 3

Code:
root@mjolnir:~# pveversion -V
proxmox-ve: 5.1-36 (running kernel: 4.13.13-5-pve)
pve-manager: 5.1-43 (running version: 5.1-43/bdb08029)
pve-kernel-4.4.40-1-pve: 4.4.40-82
pve-kernel-4.4.83-1-pve: 4.4.83-96
pve-kernel-4.13.13-4-pve: 4.13.13-35
pve-kernel-4.4.19-1-pve: 4.4.19-66
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.95-1-pve: 4.4.95-99
pve-kernel-4.13.8-3-pve: 4.13.8-30
pve-kernel-4.13.13-5-pve: 4.13.13-36
pve-kernel-4.4.8-1-pve: 4.4.8-52
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-19
qemu-server: 5.0-20
pve-firmware: 2.0-3
libpve-common-perl: 5.0-25
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-16
pve-qemu-kvm: 2.9.1-6
pve-container: 2.0-18
pve-firewall: 3.0-5
pve-ha-manager: 2.0-4
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-1
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.3-pve1~bpo9
openvswitch-switch: 2.7.0-2

Code:
root@mjolnir:~# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Model name:            Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
Stepping:              7
CPU MHz:               2600.143
CPU max MHz:           3300.0000
CPU min MHz:           1200.0000
BogoMIPS:              5200.28
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts

Any ideas? This is a pretty big deal for us right now.
 
>> Oops: 0010 [#4] SMP PTI

Maybe this is related to the Meltdown protection (PTI)?
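
For reference, the number after "Oops:" is the x86 page-fault error code in hexadecimal, and its low bits can be decoded by hand. The snippet below is a minimal sketch of that decoding; the function name `decode_oops` is made up for illustration and is not part of any Proxmox or kernel tooling.

```shell
#!/bin/sh
# Decode the hexadecimal x86 page-fault error code printed after "Oops:".
# decode_oops is a hypothetical helper name used only for this example.
decode_oops() {
    code=$((0x$1))
    # Bit 0: 0 = page not present, 1 = protection violation
    if [ $((code & 0x01)) -ne 0 ]; then echo "protection violation"; else echo "page not present"; fi
    # Bit 1: set if the faulting access was a write
    if [ $((code & 0x02)) -ne 0 ]; then echo "write access"; else echo "not a write"; fi
    # Bit 2: set if the fault happened in user mode
    if [ $((code & 0x04)) -ne 0 ]; then echo "user mode"; else echo "kernel mode"; fi
    # Bit 3: set if a reserved bit was found in a page-table entry
    if [ $((code & 0x08)) -ne 0 ]; then echo "reserved bit set"; fi
    # Bit 4: set if the fault was caused by an instruction fetch
    if [ $((code & 0x10)) -ne 0 ]; then echo "instruction fetch"; fi
}

decode_oops 0010
```

For the code 0010 in the log, this reports a kernel-mode instruction fetch from a page that is not present, which fits the "Bad RIP value" lines where RIP and CR2 hold the same small address.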

Can you try adding the kernel option nopti in /etc/default/grub?

Then run update-grub as root

and reboot.
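
The steps above could be scripted roughly as follows. This is only a sketch: it assumes the options live on a double-quoted `GRUB_CMDLINE_LINUX_DEFAULT` line and that GNU sed is available (as on Debian-based PVE), and `add_nopti` is a made-up helper name. Note that nopti turns off the Meltdown mitigation, so it is meant as a diagnostic step rather than a permanent setting.

```shell
#!/bin/sh
# Append "nopti" to GRUB_CMDLINE_LINUX_DEFAULT in the given grub defaults file.
# add_nopti is a hypothetical helper name used only for this example.
add_nopti() {
    grub_file="$1"
    # Do nothing if the option is already present on that line.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*nopti' "$grub_file"; then
        return 0
    fi
    # Insert " nopti" before the closing quote of the existing value.
    sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 nopti"/' "$grub_file"
}

# Usage, as root (commented out so the sketch does not touch the system):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_nopti /etc/default/grub
#   update-grub
#   reboot
# After the reboot, check that the option was picked up:
#   grep -w nopti /proc/cmdline
```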
 
Yes. Please update to the -38 kernel (currently available in the repositories up to pve-no-subscription).
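
A rough sketch of checking which kernel package version a host is running, before and after the update. It assumes the `pve-kernel-<release>: <version>` line format shown in the pveversion output earlier in this thread; `kernel_pkg_version` is a made-up helper name, and the apt commands in the comments assume the pve-no-subscription repository is already configured.

```shell
#!/bin/sh
# Print the package version of a given kernel release from pveversion output.
# kernel_pkg_version is a hypothetical helper name used only for this example.
# $1: kernel release (e.g. the output of uname -r); stdin: pveversion -v output
kernel_pkg_version() {
    awk -v k="pve-kernel-$1:" '$1 == k { print $2 }'
}

# Usage on a PVE host (commented out so the sketch runs anywhere):
#   pveversion -v | kernel_pkg_version "$(uname -r)"
# Updating, as root, then rebooting into the new kernel:
#   apt update && apt dist-upgrade
#   reboot
```

On the host from the first post this would print 4.13.13-36 for the running 4.13.13-5-pve kernel; after the update and a reboot the version suffix should read -38.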
 
>> Yes. Please update to the -38 kernel (currently available up to pve-no-subscription).

At first glance, the -38 kernel seems to have resolved my issue. I will report back if that changes, but let's assume it is resolved at this point.

Thanks!
 
