Slight threadjack:
What version of what driver are you running now?
With kernel 6.14 and patched 16.9 drivers, when I start up a VM that has my P4 passed through (not using the A5500 patch yet), I get this on the host.
Note: I do not get this with kernel 6.11 on the host (only with 6.14).
Code:
[ 141.872136] ------------[ cut here ]------------
[ 141.872163] WARNING: CPU: 19 PID: 7908 at ./include/linux/rwsem.h:85 remap_pfn_range_internal+0x4af/0x5a0
[ 141.872192] Modules linked in: ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter sctp ip6_udp_tunnel udp_tunnel nf_tables nvme_fabrics nvme_keyring nfnetlink_cttimeout softdog sunrpc binfmt_misc bonding tls openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 psample nfnetlink_log nfnetlink nvidia_vgpu_vfio(OE) xfs nvidia(POE) amd_atl intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel ipmi_ssif crypto_simd cryptd mdev ast rapl pcspkr kvm acpi_ipmi ccp k10temp ipmi_si ptdma ipmi_devintf ipmi_msghandler joydev input_leds mac_hid vhost_net vhost vhost_iotlb tap vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd efi_pstore dmi_sysfs ip_tables x_tables autofs4 zfs(PO) spl(O) btrfs blake2b_generic xor raid6_pq hid_generic usbkbd usbmouse mlx4_ib ib_uverbs usbhid ses hid enclosure ib_core mlx4_en mpt3sas igb xhci_pci nvme raid_class i2c_algo_bit ahci
[ 141.872278] dca mlx4_core scsi_transport_sas libahci nvme_core xhci_hcd i2c_piix4 i2c_smbus nvme_auth
[ 141.872414] CPU: 19 UID: 0 PID: 7908 Comm: CPU 1/KVM Tainted: P OE 6.14.0-2-pve #1
[ 141.872432] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 141.872445] Hardware name: Supermicro Super Server/H11SSL-NC, BIOS 3.0 07/01/2024
[ 141.872460] RIP: 0010:remap_pfn_range_internal+0x4af/0x5a0
[ 141.872473] Code: 31 db c3 cc cc cc cc 48 8b 7d a8 4c 89 fa 4c 89 ce 4c 89 4d c0 e8 81 e2 ff ff 85 c0 75 9c 4c 8b 4d c0 4d 8b 01 e9 aa fd ff ff <0f> 0b e9 d7 fb ff ff 0f 0b 48 8b 7d a8 4c 89 fa 48 89 de 4c 89 45
[ 141.873338] RSP: 0018:ffffae58873774b0 EFLAGS: 00010246
[ 141.873726] RAX: 00000000280200fb RBX: ffff8d25d7b6a0b8 RCX: 0000000000001000
[ 141.874111] RDX: 0000000000000000 RSI: 00007bb81fe00000 RDI: ffff8d25d7b6a0b8
[ 141.874520] RBP: ffffae5887377568 R08: 8000000000000037 R09: 0000000000000000
[ 141.874903] R10: 0000000000000000 R11: ffff8d254bc47380 R12: 000000002000fdf1
[ 141.875297] R13: 00007bb81fe01000 R14: 00007bb81fe00000 R15: 8000000000000037
[ 141.875683] FS: 00007bbc4affd6c0(0000) GS:ffff8d638e980000(0000) knlGS:0000000000000000
[ 141.876074] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 141.876469] CR2: 0000000000b50b10 CR3: 00000001ce5b8000 CR4: 0000000000350ef0
[ 141.876859] Call Trace:
[ 141.877245] <TASK>
[ 141.877620] ? show_regs+0x6c/0x80
[ 141.877994] ? __warn+0x8d/0x150
[ 141.878367] ? remap_pfn_range_internal+0x4af/0x5a0
[ 141.878731] ? report_bug+0x182/0x1b0
[ 141.879091] ? handle_bug+0x6e/0xb0
[ 141.879453] ? exc_invalid_op+0x18/0x80
[ 141.879809] ? asm_exc_invalid_op+0x1b/0x20
[ 141.880172] ? remap_pfn_range_internal+0x4af/0x5a0
[ 141.880530] ? pat_pagerange_is_ram+0x7a/0xa0
[ 141.880886] ? memtype_lookup+0x3b/0x70
[ 141.881244] ? lookup_memtype+0xd1/0xf0
[ 141.881595] remap_pfn_range+0x5c/0xb0
[ 141.881946] ? up+0x58/0xa0
[ 141.882302] vgpu_mmio_fault_wrapper+0x1fa/0x340 [nvidia_vgpu_vfio]
[ 141.882661] __do_fault+0x3a/0x180
[ 141.883016] do_fault+0xca/0x4f0
[ 141.883373] __handle_mm_fault+0x840/0x10b0
[ 141.883717] handle_mm_fault+0x1a5/0x360
[ 141.884056] __get_user_pages+0x1f2/0x15d0
[ 141.884402] get_user_pages_unlocked+0xe7/0x370
[ 141.884732] hva_to_pfn+0x380/0x4c0 [kvm]
[ 141.885127] ? __perf_event_task_sched_out+0x5a/0x4a0
[ 141.885447] kvm_follow_pfn+0x97/0x100 [kvm]
[ 141.885825] __kvm_faultin_pfn+0x5c/0x90 [kvm]
[ 141.886194] kvm_mmu_faultin_pfn+0x19d/0x6e0 [kvm]
[ 141.886576] kvm_tdp_page_fault+0x8e/0xe0 [kvm]
[ 141.886938] kvm_mmu_do_page_fault+0x243/0x290 [kvm]
[ 141.887301] kvm_mmu_page_fault+0x8e/0x6d0 [kvm]
[ 141.887646] ? nv_vgpu_vfio_access+0x2d4/0x450 [nvidia_vgpu_vfio]
[ 141.887915] npf_interception+0xba/0x190 [kvm_amd]
[ 141.888181] svm_invoke_exit_handler+0x182/0x1b0 [kvm_amd]
[ 141.888448] svm_handle_exit+0xa2/0x200 [kvm_amd]
[ 141.888705] vcpu_enter_guest+0x4e8/0x1640 [kvm]
[ 141.889033] ? kvm_arch_vcpu_load+0xac/0x290 [kvm]
[ 141.889359] ? restore_fpregs_from_fpstate+0x3d/0xd0
[ 141.889599] kvm_arch_vcpu_ioctl_run+0x35d/0x750 [kvm]
[ 141.889903] kvm_vcpu_ioctl+0x2c2/0xaa0 [kvm]
[ 141.890195] ? kvm_vcpu_ioctl+0x23e/0xaa0 [kvm]
[ 141.890488] ? nv_vfio_mdev_read+0x23/0x70 [nvidia_vgpu_vfio]
[ 141.890713] __x64_sys_ioctl+0xa4/0xe0
[ 141.890933] x64_sys_call+0xb45/0x2540
[ 141.891148] do_syscall_64+0x7e/0x170
[ 141.891364] ? syscall_exit_to_user_mode+0x38/0x1d0
[ 141.891575] ? do_syscall_64+0x8a/0x170
[ 141.891781] ? arch_exit_to_user_mode_prepare.constprop.0+0xc8/0xd0
[ 141.891989] ? syscall_exit_to_user_mode+0x38/0x1d0
[ 141.892193] ? do_syscall_64+0x8a/0x170
[ 141.892405] ? syscall_exit_to_user_mode+0x38/0x1d0
[ 141.892613] ? do_syscall_64+0x8a/0x170
[ 141.892820] ? sysvec_apic_timer_interrupt+0x57/0xc0
[ 141.893029] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 141.893245] RIP: 0033:0x7bbc53e81d1b
[ 141.893496] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[ 141.893983] RSP: 002b:00007bbc4aff7ee0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 141.894270] RAX: ffffffffffffffda RBX: 00005da48d103680 RCX: 00007bbc53e81d1b
[ 141.894518] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000030
[ 141.894764] RBP: 000000000000ae80 R08: 0000000000000000 R09: 0000000000000000
[ 141.895011] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 141.895264] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[ 141.895508] </TASK>
[ 141.895744] ---[ end trace 0000000000000000 ]---
In addition, I get this in the VM (with either 6.11 or 6.14 running on the host):
Code:
[ 3.407489] ------------[ cut here ]------------
[ 3.407491] WARNING: CPU: 1 PID: 560 at drivers/pci/msi/msi.c:888 __pci_enable_msi_range+0x1b3/0x1d0
[ 3.407500] Modules linked in: overlay lz4 lz4_compress zram zsmalloc binfmt_misc nls_ascii nls_cp437 vfat fat nvidia_drm(POE) nvidia_modeset(POE) intel_rapl_msr intel_rapl_common nvidia(POE) kvm_amd ccp kvm nouveau irqbypass ghash_clmulni_intel sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core mxm_wmi video snd_hwdep wmi snd_pcm iTCO_wdt drm_display_helper aesni_intel cec snd_timer rc_core intel_pmc_bxt crypto_simd iTCO_vendor_support snd cryptd pcspkr hid_generic watchdog i2c_algo_bit virtio_console soundcore button joydev evdev sg serio_raw fuse loop efi_pstore dm_mod configfs efivarfs qemu_fw_cfg ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic usbhid hid sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic virtio_net virtio_scsi net_failover failover virtio_pci ahci virtio_pci_legacy_dev libahci ehci_pci virtio_pci_modern_dev virtio crct10dif_pclmul libata crct10dif_common
[ 3.407572] virtio_ring bochs crc32_pclmul drm_vram_helper uhci_hcd crc32c_intel drm_kms_helper scsi_mod psmouse drm_ttm_helper ttm i2c_i801 scsi_common i2c_smbus lpc_ich ehci_hcd drm usbcore usb_common
[ 3.407585] CPU: 1 PID: 560 Comm: nvidia-gridd Tainted: P W OE 6.1.0-33-amd64 #1 Debian 6.1.133-1
[ 3.407589] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2025.02-3 04/03/2025
[ 3.407590] RIP: 0010:__pci_enable_msi_range+0x1b3/0x1d0
[ 3.407593] Code: 4c 89 ef e8 df fb ff ff 89 c6 85 c0 0f 84 68 ff ff ff 78 0e 39 c5 7f 8c 4d 85 e4 75 cc 41 89 f6 eb d8 41 89 c6 e9 50 ff ff ff <0f> 0b 41 be ea ff ff ff e9 43 ff ff ff 41 be de ff ff ff e9 38 ff
[ 3.407595] RSP: 0018:ffffa973c0813998 EFLAGS: 00010202
[ 3.407597] RAX: 0000000000000010 RBX: 0000000000000001 RCX: 0000000000000000
[ 3.407598] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9b1c40fb7000
[ 3.407599] RBP: 0000000000000001 R08: 0000000000000001 R09: ffff9b1c4b759708
[ 3.407600] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 3.407601] R13: ffff9b1c40fb7000 R14: ffff9b1c4586d3e0 R15: ffff9b1c4586d000
[ 3.407604] FS: 00007fb7dd066040(0000) GS:ffff9b1fafc80000(0000) knlGS:0000000000000000
[ 3.407605] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.407607] CR2: 0000000000b50b10 CR3: 0000000108f32000 CR4: 0000000000350ee0
[ 3.407610] Call Trace:
[ 3.407612] <TASK>
[ 3.407615] ? __warn+0x7d/0xc0
[ 3.407618] ? __pci_enable_msi_range+0x1b3/0x1d0
[ 3.407620] ? report_bug+0xe2/0x150
[ 3.407623] ? handle_bug+0x41/0x70
[ 3.407627] ? exc_invalid_op+0x13/0x60
[ 3.407629] ? asm_exc_invalid_op+0x16/0x20
[ 3.407634] ? __pci_enable_msi_range+0x1b3/0x1d0
[ 3.407636] pci_enable_msi+0x16/0x30
[ 3.407638] nv_init_msi+0x1a/0xe0 [nvidia]
[ 3.408007] nv_open_device+0x843/0x940 [nvidia]
[ 3.408364] nvidia_open+0x361/0x610 [nvidia]
[ 3.408721] ? kobj_lookup+0xf1/0x160
[ 3.408725] nvidia_frontend_open+0x50/0xa0 [nvidia]
[ 3.409109] chrdev_open+0xc1/0x250
[ 3.409113] ? __unregister_chrdev+0x50/0x50
[ 3.409116] do_dentry_open+0x1e2/0x410
[ 3.409119] path_openat+0xb7d/0x1260
[ 3.409122] do_filp_open+0xaf/0x160
[ 3.409126] do_sys_openat2+0xaf/0x170
[ 3.409128] __x64_sys_openat+0x6a/0xa0
[ 3.409131] do_syscall_64+0x55/0xb0
[ 3.409134] ? call_rcu+0xde/0x630
[ 3.409137] ? mntput_no_expire+0x4a/0x250
[ 3.409141] ? kmem_cache_free+0x15/0x310
[ 3.409144] ? do_unlinkat+0xb8/0x320
[ 3.409146] ? exit_to_user_mode_prepare+0x40/0x1e0
[ 3.409149] ? syscall_exit_to_user_mode+0x1e/0x40
[ 3.409150] ? do_syscall_64+0x61/0xb0
[ 3.409153] ? exit_to_user_mode_prepare+0x40/0x1e0
[ 3.409155] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 3.409157] RIP: 0033:0x7fb7dcc36fc1
[ 3.409159] Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 2a 26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 93 00 00 00 48 8b 54 24 28 64 48 2b 14 25
[ 3.409161] RSP: 002b:00007fffd8dede30 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
[ 3.409163] RAX: ffffffffffffffda RBX: 0000000000080002 RCX: 00007fb7dcc36fc1
[ 3.409164] RDX: 0000000000080002 RSI: 00007fffd8dedec0 RDI: 00000000ffffff9c
[ 3.409165] RBP: 00007fffd8dedec0 R08: 0000000000000000 R09: 0000000000000064
[ 3.409166] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fffd8dedfec
[ 3.409167] R13: 0000000000c22560 R14: 0000000000c22560 R15: 0000000000c22560
[ 3.409169] </TASK>
[ 3.409170] ---[ end trace 0000000000000000 ]---
[ 3.409171] NVRM: GPU 0000:01:00.0: Failed to enable MSI; falling back to PCIe virtual-wire interrupts.
Everything seems to be working despite these warnings. I just don't remember seeing them before I started changing kernels and updating the drivers.
Is there any real benefit to using the 17.x drivers and mocking an A5500? To apply the A5500 patch, I guess I need to revert and reinstall vgpu_unlock-rs, but using the GreenDamTan repo and instructions?
Follow-up: this warning in the VM
`WARNING: CPU: 1 PID: 560 at drivers/pci/msi/msi.c:888 __pci_enable_msi_range+0x1b3/0x1d0`
was because I did not have the nouveau driver blacklisted in my VM.
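For anyone who wants to check the same thing in their own guest: the module list in the VM trace shows both `nouveau` and `nvidia(POE)` loaded. Below is a minimal Python sketch that looks for that combination in `/proc/modules`. The function names are mine (hypothetical), and the note about `blacklist nouveau` in the comments is the usual modprobe.d approach, not something taken from this thread.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return module names from /proc/modules-style text (first column of each line)."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def nouveau_conflict(proc_modules_text: str) -> bool:
    """True if both nouveau and the proprietary nvidia module are loaded."""
    mods = loaded_modules(proc_modules_text)
    return "nouveau" in mods and "nvidia" in mods


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        if nouveau_conflict(proc_modules.read_text()):
            # Assumption: the usual fix is a "blacklist nouveau" line in a
            # file under /etc/modprobe.d/, followed by an initramfs rebuild.
            print("nouveau and nvidia are both loaded; consider blacklisting nouveau")
        else:
            print("no nouveau/nvidia conflict detected")
```

Run it inside the VM, not on the host.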
I am, however, still getting this warning on the host:
`WARNING: CPU: 5 PID: 24912 at ./include/linux/rwsem.h:85 remap_pfn_range_internal+0x4af/0x5a0`
I only get that warning when I start up a VM that has the vGPU passed to it. Either the 6.14 kernel or the 17.5 drivers introduced it (or maybe the latest QEMU that was installed with 8.4?), because I didn't see it before I updated everything.
It seems to be working, so I'm not worried.
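If anyone wants to narrow down which update introduced it, one way is to save `dmesg` output from a boot on each kernel and compare the warning sites. Below is a rough Python sketch of that. The regex is based only on the `WARNING:` line format shown in the traces above, and the function name and the two log variables in the usage note are hypothetical.

```python
import re

# Matches lines like:
#   WARNING: CPU: 19 PID: 7908 at ./include/linux/rwsem.h:85 remap_pfn_range_internal+0x4af/0x5a0
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>[^\s+]+)\+"
)


def warning_sites(log_text: str) -> set[tuple[str, int, str]]:
    """Return the set of (source file, line number, function) for each WARNING in the log."""
    return {
        (m["file"], int(m["line"]), m["func"])
        for m in WARNING_RE.finditer(log_text)
    }
```

With the saved logs read into two strings (say `log_611` and `log_614`), `warning_sites(log_614) - warning_sites(log_611)` gives the warnings that only appear on the newer kernel. CPU and PID are ignored on purpose, since they change from run to run.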