A few hours after upgrading to:
Linux 4.15.18-2-pve #1 SMP PVE 4.15.18-20 (Thu, 16 Aug 2018 11:06:35 +0200)
pve-manager/5.2-7/8d88e66a
I got this:
Aug 24 01:45:09 zpm kernel: [ 8639.650825] DMAR: [DMA Read] Request device [00:1f.6] fault addr fe537000 [fault reason 06] PTE Read access is not set
This seems to be related to PCIe passthrough. However, I do not pass through the internal controller at 00:1f.6 (it is the Proxmox management network interface):
root@zpm:/var/log# lspci | grep 00:1f.6
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V
root@zpm:/var/log#
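To see whether other devices also raise DMAR faults, the log can be scanned for lines in the same format as the one above. The following is only a minimal sketch: the function names `parse_dmar_fault` and `count_faults_by_device` are made up for illustration, and the regular expression assumes the exact message layout shown in this log, which may differ on other kernel versions.

```python
import re
from collections import Counter

# Matches the DMAR fault message layout seen in the log above.
# Assumption: other kernels may word this message differently.
DMAR_FAULT = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\] Request device \[(?P<device>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>[0-9a-f]+) \[fault reason (?P<reason>\d+)\]"
)


def parse_dmar_fault(line):
    """Return the fields of one DMAR fault line, or None if the line does not match."""
    match = DMAR_FAULT.search(line)
    if match is None:
        return None
    return {
        "op": match.group("op"),
        "device": match.group("device"),
        "addr": int(match.group("addr"), 16),   # fault address is printed in hex
        "reason": int(match.group("reason")),   # fault reason is printed in decimal
    }


def count_faults_by_device(lines):
    """Count DMAR faults per PCI device address across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        fault = parse_dmar_fault(line)
        if fault is not None:
            counts[fault["device"]] += 1
    return counts
```

Passing the lines of /var/log/syslog (or the output of dmesg) to `count_faults_by_device` would show whether 00:1f.6 is the only device reporting faults.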
The interface then stopped working:
Aug 24 01:45:17 zpm kernel: [ 8647.810267] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
Aug 24 01:45:17 zpm kernel: [ 8647.810267] TDH <54>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] TDT <af>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] next_to_use <af>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] next_to_clean <54>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] buffer_info[next_to_clean]:
Aug 24 01:45:17 zpm kernel: [ 8647.810267] time_stamp <1001fd037>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] next_to_watch <56>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] jiffies <1001fd830>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] next_to_watch.status <0>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] MAC Status <40080083>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] PHY Status <796d>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] PHY 1000BASE-T Status <3800>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] PHY Extended Status <3000>
Aug 24 01:45:17 zpm kernel: [ 8647.810267] PCI Status <10>
Aug 24 01:45:19 zpm kernel: [ 8649.922062] CR2: 00007f47f7dd7b24 CR3: 00000002fd40a004 CR4: 00000000003626f0
Aug 24 01:45:19 zpm kernel: [ 8649.922063] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 24 01:45:19 zpm kernel: [ 8649.922064] <IRQ>
Aug 24 01:45:19 zpm kernel: [ 8649.922068] call_timer_fn+0x32/0x130
Aug 24 01:45:19 zpm kernel: [ 8649.922071] __do_softirq+0x10c/0x2a2
Aug 24 01:45:19 zpm kernel: [ 8649.922086] smp_apic_timer_interrupt+0x79/0x130
Aug 24 01:45:19 zpm kernel: [ 8649.922088] </IRQ>
Aug 24 01:45:19 zpm kernel: [ 8649.922089] RSP: 0018:ffffffff96a03e00 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
Aug 24 01:45:19 zpm kernel: [ 8649.922091] RDX: 000007ddf786b375 RSI: fffffffc9efa3dc1 RDI: 0000000000000000
Aug 24 01:45:19 zpm kernel: [ 8649.922091] R10: ffffffff96a03dd0 R11: 0000000000000173 R12: ffff8f824e62cf20
Aug 24 01:45:19 zpm kernel: [ 8649.922093] cpuidle_enter+0x17/0x20
Aug 24 01:45:19 zpm kernel: [ 8649.922110] do_idle+0x19a/0x200
Aug 24 01:45:19 zpm kernel: [ 8649.922112] rest_init+0xae/0xb0
Aug 24 01:45:19 zpm kernel: [ 8649.922115] x86_64_start_reservations+0x24/0x26
Aug 24 01:45:19 zpm kernel: [ 8649.922117] secondary_startup_64+0xa5/0xb0
Aug 24 01:45:19 zpm kernel: [ 8649.922146] ---[ end trace b4f387dca36e7068 ]---
Aug 24 01:45:19 zpm kernel: [ 8649.922758] vmbr0: port 1(eno1) entered disabled state
I do use PCIe passthrough, but for another network card, not the internal one. I set it up as follows:
grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
/etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
In a VM I use the Realtek card:
/etc/pve/qemu-server/998.conf:
hostpci0: 08:00.0,pcie=1
which is
root@zpm:/etc/pve/qemu-server# lspci | grep 08:00
08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
root@zpm:/etc/pve/qemu-server#
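One thing that may be worth ruling out is whether the Intel NIC and the passed-through Realtek card share an IOMMU group. The sketch below reads the standard sysfs layout under /sys/kernel/iommu_groups; the function names `iommu_groups` and `share_group` are hypothetical helpers, not part of any Proxmox tooling, and this is only a diagnostic aid rather than an explanation of the fault.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses of its devices.

    Returns an empty dict if the directory does not exist (IOMMU not enabled).
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        # Each group is a numbered directory holding a "devices" subdirectory.
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def share_group(groups, dev_a, dev_b):
    """Return True if both PCI addresses appear in the same IOMMU group."""
    return any(dev_a in devs and dev_b in devs for devs in groups.values())
```

On this host the check would be `share_group(iommu_groups(), "0000:00:1f.6", "0000:08:00.0")`, using the full domain-prefixed addresses as they appear in sysfs.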
That card is passed through to the VM and works fine.
The internal network card (eno1) is the one that stopped working.
It was stable on the last few kernel versions; this occurred just hours after upgrading to 4.15.18-2-pve.
Does anybody have any ideas?