Page fault and DMAR errors

Isnubi

Jul 25, 2023
Hello, I have had a problem on my Proxmox server for months.

Several times per day, some of my VMs hit a page fault. Here is an extract of dmesg from one of them:
Code:
[ 2639.523165] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 2639.523298] #PF: supervisor instruction fetch in kernel mode
[ 2639.523380] #PF: error_code(0x0010) - not-present page
[ 2639.523462] PGD 0 P4D 0
[ 2639.523534] Oops: 0010 [#1] SMP PTI
[ 2639.523616] CPU: 1 PID: 364 Comm: dockerd Not tainted 5.10.0-23-amd64 #1 Debian 5.10.179-1
[ 2639.523719] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
[ 2639.523842] RIP: 0010:0x0
[ 2639.523901] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 2639.523984] RSP: 0018:ffffb14bc0423df8 EFLAGS: 00010293
[ 2639.524063] RAX: 0000000000000000 RBX: ffff91d28137e2a0 RCX: ffff91d36bb95d98
[ 2639.524149] RDX: ffffb14bc0423ea0 RSI: ffffb14bc0423e10 RDI: ffff91d28137e240
[ 2639.524263] RBP: ffff91d28137e290 R08: ffff91d282c54f00 R09: ffff91d28137e240
[ 2639.524350] R10: ffff91d28137e290 R11: 0000000000000000 R12: ffffb14bc0423e10
[ 2639.524437] R13: 0000000000000000 R14: ffffb14bc0423ea0 R15: ffff91d28137e240
[ 2639.524526] FS:  00007f0706269700(0000) GS:ffff91d3b7d00000(0000) knlGS:0000000000000000
[ 2639.524617] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2639.524697] CR2: ffffffffffffffd6 CR3: 0000000107882000 CR4: 00000000000006e0
[ 2639.524789] Call Trace:
[ 2639.524872]  ep_scan_ready_list.constprop.0+0xab/0x1e0
[ 2639.524980]  do_epoll_wait+0x247/0x670
[ 2639.525055]  ? ep_read_events_proc+0xe0/0xe0
[ 2639.525132]  ? ep_unregister_pollwait.constprop.0+0xa0/0xa0
[ 2639.525214]  __x64_sys_epoll_pwait+0x49/0xb0
[ 2639.525292]  do_syscall_64+0x33/0x80
[ 2639.525408]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[ 2639.525493] RIP: 0033:0x5604bbeac46e
[ 2639.525565] Code: 48 89 6c 24 38 48 8d 6c 24 38 e8 0d 00 00 00 48 8b 6c 24 38 48 83 c4 40 c3 cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[ 2639.525749] RSP: 002b:00007f07062684d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000119
[ 2639.525838] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00005604bbeac46e
[ 2639.525923] RDX: 0000000000000080 RSI: 00007f0706268598 RDI: 0000000000000005
[ 2639.526007] RBP: 00007f0706268518 R08: 0000000000000000 R09: 0000000000000000
[ 2639.526090] R10: 0000000000000007 R11: 0000000000000246 R12: 00007f07062685a8
[ 2639.526175] R13: 0000000000000001 R14: 000000c000007520 R15: 000000c000078c00
[ 2639.526277] Modules linked in: ip_vs_rr xt_ipvs ip_vs xt_nat veth vxlan ip6_udp_tunnel udp_tunnel xt_policy xt_mark xt_bpf xt_tcpudp xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables libcrc32c nfnetlink br_netfilter bridge stp llc overlay bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper sg evdev joydev cec serio_raw virtio_balloon virtio_console pcspkr qemu_fw_cfg button drm fuse configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic hid_generic usbhid hid sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common virtio_net net_failover virtio_scsi failover uhci_hcd ehci_hcd ata_generic usbcore psmouse ata_piix libata scsi_mod virtio_pci i2c_piix4 virtio_ring virtio usb_common floppy
[ 2639.527177] CR2: 0000000000000000
[ 2639.527340] ---[ end trace e848c583dab10e61 ]---
[ 2639.527462] RIP: 0010:0x0
[ 2639.527553] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 2639.527683] RSP: 0018:ffffb14bc0423df8 EFLAGS: 00010293
[ 2639.527807] RAX: 0000000000000000 RBX: ffff91d28137e2a0 RCX: ffff91d36bb95d98
[ 2639.530404] RDX: ffffb14bc0423ea0 RSI: ffffb14bc0423e10 RDI: ffff91d28137e240
[ 2639.532555] RBP: ffff91d28137e290 R08: ffff91d282c54f00 R09: ffff91d28137e240
[ 2639.534689] R10: ffff91d28137e290 R11: 0000000000000000 R12: ffffb14bc0423e10
[ 2639.536602] R13: 0000000000000000 R14: ffffb14bc0423ea0 R15: ffff91d28137e240
[ 2639.538502] FS:  00007f0706269700(0000) GS:ffff91d3b7d00000(0000) knlGS:0000000000000000
[ 2639.540321] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2639.541942] CR2: ffffffffffffffd6 CR3: 0000000107882000 CR4: 00000000000006e0

At the same time as this error appears in the VM, the following error shows up on the console of my server:
Code:
[ 3381.322269] DMAR: DRHD: handling fault status reg 102
[ 3381.322450] DMAR: [INTR-REMAP] Request device [02:00.0] fault index 0x2f [fault reason 0x26] Blocked an interrupt request due to source-id verification failure
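To make it easier to see which device and fault reason keep coming back, here is a minimal sketch that pulls the fields out of a DMAR INTR-REMAP line. The regular expression is only derived from the single sample line above, so the exact message format on other kernels is an assumption, and `parse_dmar_fault` is a name I made up for illustration.

```python
import re

# Pattern derived from the one INTR-REMAP line quoted above.
# Assumption: other kernel versions may word this message differently.
DMAR_FAULT = re.compile(
    r"DMAR: \[INTR-REMAP\] Request device \[(?P<device>[0-9a-fA-F:.]+)\] "
    r"fault index (?P<index>0x[0-9a-fA-F]+) "
    r"\[fault reason (?P<reason>0x[0-9a-fA-F]+)\] (?P<message>.*)"
)


def parse_dmar_fault(line):
    """Return a dict with device, index, reason and message, or None if no match."""
    match = DMAR_FAULT.search(line)
    if match is None:
        return None
    return match.groupdict()


sample = (
    "[ 3381.322450] DMAR: [INTR-REMAP] Request device [02:00.0] "
    "fault index 0x2f [fault reason 0x26] Blocked an interrupt request "
    "due to source-id verification failure"
)
fault = parse_dmar_fault(sample)
print(fault["device"], fault["reason"])
```

The extracted device address (here `02:00.0`) can then be looked up on the host with `lspci -s 02:00.0` to see which PCI device is raising the fault.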

I have already run a memtest on the server and replaced the defective RAM.
I also disabled intel_iommu in the GRUB config to avoid PCI passthrough.
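To double-check that the setting is really applied after a reboot, here is a small sketch that lists the IOMMU-related parameters on the running kernel's command line. It assumes a Linux host where `/proc/cmdline` is readable. The set of parameter names it looks for (`intel_iommu`, `iommu`, `intremap`) is my own guess at what is relevant here, and `iommu_params` is a hypothetical helper name.

```python
import os


def iommu_params(cmdline):
    """Return the kernel parameters that concern the IOMMU or interrupt remapping."""
    # Assumption: these three parameter names are the ones worth checking.
    keys = ("intel_iommu", "iommu", "intremap")
    return [p for p in cmdline.split() if p.split("=", 1)[0] in keys]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    # Only meaningful on a Linux host; skipped silently elsewhere.
    if os.path.exists("/proc/cmdline"):
        print(iommu_params(read_cmdline()))
```

An empty list would mean that none of these parameters reached the kernel, for example because the GRUB configuration was edited but not regenerated.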

I don't know where this could be coming from.

Here is my configuration:
- Server: HP ProLiant DL580 G7
- Processors: 4x Intel Xeon E7-4820
- RAM: nearly 400 GB
Here's the output of pveversion -v:
Code:
proxmox-ve: 7.4-1 (running kernel: 5.15.111-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-5
pve-kernel-5.11: 7.0-10
pve-kernel-5.15.111-1-pve: 5.15.111-1
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.11.22-7-pve: 5.11.22-12
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
 