Hi,
I'm running Proxmox VE 5.4-13 on the latest kernel, 4.15.18-19-pve. Overnight the host machine hit a kernel Oops, which froze a process and made the whole system unstable:
Code:
[18118.496585] BUG: unable to handle kernel NULL pointer dereference at 000000000000002c
[18118.496640] IP: iget5_locked+0x9e/0x1f0
[18118.496672] PGD 0 P4D 0
[18118.496703] Oops: 0000 [#1] SMP PTI
[18118.497893] Modules linked in: tcp_diag inet_diag ebtable_filter ebtables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables veth nf_conntrack_netlink xfrm_user xfrm_algo xt_physdev xt_comment xt_mark xt_set xt_addrtype xt_conntrack ip_set_hash_net ip_set aufs ipt_REJECT nf_reject_ipv4 xt_multiport xt_nat xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack overlay iptable_filter wireguard(O) ip6_udp_tunnel udp_tunnel softdog nfnetlink_log nfnetlink xfs dm_crypt dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio snd_hda_codec_realtek snd_hda_codec_generic i915 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass drm_kms_helper crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
[18118.498123] pcbc drm aesni_intel i2c_algo_bit aes_x86_64 crypto_simd fb_sys_fops syscopyarea sysfillrect sysimgblt glue_helper snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm ie31200_edac eeepc_wmi asus_wmi wmi_bmof sparse_keymap cryptd snd_timer snd shpchp intel_cstate intel_rapl_perf lpc_ich soundcore input_leds mac_hid video wmi serio_raw vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi sunrpc scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ahci i2c_i801 libahci e1000e(O) ptp pps_core megaraid_sas
[18118.498326] CPU: 3 PID: 6840 Comm: puma 003 Tainted: G O 4.15.18-19-pve #1
[18118.498372] Hardware name: System manufacturer System Product Name/P8B WS, BIOS 9921 07/18/2018
[18118.498422] RIP: 0010:iget5_locked+0x9e/0x1f0
[18118.498455] RSP: 0018:ffffb9a2c8e93b38 EFLAGS: 00010246
[18118.498489] RAX: 0000000000000000 RBX: ffffffff830085c0 RCX: ffff90fae4ddab68
[18118.498525] RDX: 0000000000000001 RSI: ffff90fae4ddab68 RDI: ffffffff830085c0
[18118.498563] RBP: ffffb9a2c8e93b78 R08: ffff90fae4ddab68 R09: ffff90fbc8ce3220
[18118.498600] R10: 0000000000000002 R11: 0000000000000000 R12: ffff90fc72c31800
[18118.498636] R13: ffffb9a2c2bd1eb0 R14: ffff90fae4ddab68 R15: ffffffffffffff8c
[18118.498673] FS: 00007f49d43fd700(0000) GS:ffff90fc9f2c0000(0000) knlGS:0000000000000000
[18118.498720] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[18118.498756] CR2: 000000000000002c CR3: 0000000748fc4004 CR4: 00000000001626e0
[18118.498792] Call Trace:
[18118.498828] ? ovl_get_origin_fh+0x23/0x140 [overlay]
[18118.498863] ? ovl_inode_test+0x20/0x20 [overlay]
[18118.498900] ? ovl_lock_rename_workdir+0x50/0x50 [overlay]
[18118.498937] ovl_get_inode+0xa5/0x3d0 [overlay]
[18118.498972] ovl_lookup+0x26d/0x750 [overlay]
[18118.499006] path_openat+0x1233/0x15c0
[18118.499038] ? path_openat+0x1233/0x15c0
[18118.499074] do_filp_open+0x99/0x110
[18118.499108] ? __check_object_size+0xb3/0x190
[18118.499142] ? __alloc_fd+0x46/0x170
[18118.499174] do_sys_open+0x135/0x280
[18118.499207] ? do_sys_open+0x135/0x280
[18118.499242] SyS_openat+0x14/0x20
[18118.499275] do_syscall_64+0x73/0x130
[18118.499309] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[18118.499343] RIP: 0033:0x7f49f58c4dae
[18118.499375] RSP: 002b:00007f49d43f88c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[18118.499422] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f49f58c4dae
[18118.499459] RDX: 0000000000080002 RSI: 00007f49d3644d00 RDI: 00000000ffffff9c
[18118.499495] RBP: 00007f49dd9e7748 R08: 0000000000000000 R09: 00007f49f3c1e288
[18118.499531] R10: 0000000000000000 R11: 0000000000000293 R12: 00007f49d43fbe80
[18118.499569] R13: 0000557807cc1900 R14: 00007f49d697d600 R15: 00007f49d43f89b0
[18118.499607] Code: d0 4c 89 e7 4c 89 f1 4c 89 ee e8 2e e9 ff ff 48 89 df 49 89 c7 c6 07 00 0f 1f 40 00 4d 85 ff 74 4e e8 67 f4 72 00 e8 62 f4 72 00 <49> 8b 87 a0 00 00 00 a8 08 74 1d 49 8d bf a0 00 00 00 b9 02 00
[18118.499694] RIP: iget5_locked+0x9e/0x1f0 RSP: ffffb9a2c8e93b38
[18118.499730] CR2: 000000000000002c
[18118.499776] ---[ end trace 2a2f8f2fc8b9efd4 ]---
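In case it helps with triage, here is a minimal Python sketch for pulling the faulting symbol and the verified call-trace frames out of a saved dmesg excerpt. It is my own helper, not a Proxmox or kernel tool: the name `summarize_oops` and both regexes are assumptions based only on the line format of the log above, and other kernel versions may print the trace differently.

```python
import re

# "IP: iget5_locked+0x9e/0x1f0" (the \b keeps this from matching "RIP:")
IP_RE = re.compile(r"\bIP: ([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

# "[18118.498937] ovl_get_inode+0xa5/0x3d0 [overlay]", optionally with a "? " prefix
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def summarize_oops(text):
    """Return the faulting symbol, the frames not marked '?', and their modules."""
    faulting = None
    frames = []
    in_trace = False
    for raw in text.splitlines():
        line = raw.strip()
        if faulting is None:
            m = IP_RE.search(line)
            if m:
                faulting = m.group(1)
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.match(line)
            if m is None:
                in_trace = False  # first non-frame line ends the trace
                continue
            uncertain, symbol, module = m.groups()
            if not uncertain:  # skip frames the unwinder marked with '?'
                frames.append((symbol, module))
    modules = sorted({mod for _, mod in frames if mod})
    return {"faulting": faulting, "frames": frames, "modules": modules}
```

Run against the trace above, this should report `iget5_locked` as the faulting symbol and `overlay` as the only module in the verified frames, which at least suggests the crash happened during an open on an overlayfs mount.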
Installed package versions:
Code:
pve-cluster:amd64/stretch 5.0-37 uptodate
pve-container:all/stretch 2.0-40 uptodate
pve-docs:all/stretch 5.4-2 uptodate
pve-edk2-firmware:all/stretch 1.20190312-1 uptodate
pve-firewall:amd64/stretch 3.0-22 uptodate
pve-firmware:all/stretch 2.0-7 uptodate
pve-ha-manager:amd64/stretch 2.0-9 uptodate
pve-headers:all/stretch 5.4-2 uptodate
pve-headers-4.15:all/stretch 5.4-7 uptodate
pve-headers-4.15.18-17-pve:amd64/stretch 4.15.18-43 uptodate
pve-headers-4.15.18-19-pve:amd64/stretch 4.15.18-45 uptodate
pve-i18n:all/stretch 1.1-4 uptodate
pve-kernel-4.15:all/stretch 5.4-7 uptodate
pve-kernel-4.15.18-19-pve:amd64/stretch 4.15.18-45 uptodate
pve-libspice-server1:amd64/stretch 0.14.1-2 uptodate
pve-manager:amd64/stretch 5.4-13 uptodate
pve-qemu-kvm:amd64/stretch 3.0.1-4 uptodate
pve-xtermjs:amd64/stretch 3.12.0-1 uptodate
libpve-access-control:amd64/stretch 5.1-12 uptodate
libpve-apiclient-perl:all/stretch 2.0-5 uptodate
libpve-common-perl:all/stretch 5.0-54 uptodate
libpve-guest-common-perl:all/stretch 2.0-20 uptodate
libpve-http-server-perl:all/stretch 2.0-14 uptodate
libpve-storage-perl:all/stretch 5.0-44 uptodate
libpve-u2f-server-perl:amd64/stretch 1.0-2 uptodate
Hardware:
- Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz
- 32GB DDR3 ECC RAM
- MegaRAID SAS 2108 RAID1
- Intel 82574L Gigabit Network
- Intel C206 chipset
Has anyone seen this before, and is there a way to prevent it in the future?
Thanks!