Hi all, I am new to PVE. I just installed an NVIDIA Tesla P4 in my PVE host, but I have a problem with the VM config: no mdev type shows up for selection in the VM's hardware settings.
mdevctl returns nothing, even though nvidia-smi used to produce output.
The driver has been installed, but after a reboot (or certain other operations) nvidia-smi returns an error.
Please help, and let me know if more information is needed; I am not sure which details are useful for diagnostics. Thanks!
Code:
root@pve:~# cat /etc/pve/qemu-server/101.conf
args: -uuid 00000000-0000-0000-0000-000000000101
bios: ovmf
boot: order=sata0
cores: 10
cpu: host
efidisk0: local-lvm:vm-101-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
ide2: none,media=cdrom
machine: pc-q35-8.0
memory: 14096
meta: creation-qemu=8.0.2,ctime=1688742827
name: Win11
net0: virtio=56:24:53:CE:23:89,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
sata0: local-lvm:vm-101-disk-3,size=1T
scsihw: virtio-scsi-single
smbios1: uuid=0535d794-2850-4c6f-aa4d-9fa68f58b374
sockets: 1
tpmstate0: local-lvm:vm-101-disk-1,size=4M,version=v2.0
unused0: local-lvm:vm-101-disk-2
usb0: host=2-1.3,usb3=1
usb1: host=3-1.4,usb3=1
usb2: spice
usb3: host=10c4:ea60,usb3=1
vmgenid: 72fa323e-0154-4c6b-a7d2-cb43915ea3bd
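One thing I notice in my own config above: there is no hostpciN entry at all, so no mediated device is attached to the VM. My understanding (please correct me if wrong) is that a vGPU setup needs a line roughly like the following, where nvidia-63 is only a placeholder profile name that would normally come from mdevctl types, which currently returns nothing for me:

```
hostpci0: 0000:03:00.0,mdev=nvidia-63
```
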
Code:
root@pve:~# pveversion -v
proxmox-ve: 8.0.1 (running kernel: 6.2.16-15-pve)
pve-manager: 8.0.3 (running version: 8.0.3/bbf3993334bfa916)
pve-kernel-6.2: 8.0.5
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx2
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-3
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.0
libpve-access-control: 8.0.3
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.5
libpve-guest-common-perl: 5.0.3
libpve-http-server-perl: 5.0.3
libpve-rs-perl: 0.8.3
libpve-storage-perl: 8.0.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 2.99.0-1
proxmox-backup-file-restore: 2.99.0-1
proxmox-kernel-helper: 8.0.2
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.5
pve-cluster: 8.0.1
pve-container: 5.0.3
pve-docs: 8.0.3
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.2
pve-firmware: 3.7-1
pve-ha-manager: 4.0.2
pve-i18n: 3.0.4
pve-qemu-kvm: 8.0.2-3
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
Code:
root@pve:~# mdevctl list
root@pve:~#
Code:
root@pve:~# mdevctl types
root@pve:~#
Code:
root@pve:~# nvidia-smi
Failed to initialize NVML: Unknown Error
root@pve:~#
Relevant excerpt from /usr/share/perl5/PVE/QemuServer.pm (note that cleanup_pci_devices is defined twice, at line 6165 and again at line 8644):
Code:
6165 sub cleanup_pci_devices {
6166 my ($vmid, $conf) = @_;
6167
6168 foreach my $key (keys %$conf) {
6169 next if $key !~ m/^hostpci(\d+)$/;
6170 my $hostpciindex = $1;
6171 my $uuid = PVE::SysFSTools::generate_mdev_uuid($vmid, $hostpciindex);
6172 my $d = parse_hostpci($conf->{$key});
6173 if ($d->{mdev}) {
6174 # NOTE: avoid PVE::SysFSTools::pci_cleanup_mdev_device as it requires PCI ID and we
6175 # don't want to break ABI just for this two liner
6176 my $dev_sysfs_dir = "/sys/bus/mdev/devices/$uuid";
6177
6178 # some nvidia vgpu driver versions want to clean the mdevs up themselves, and error
6179 # out when we do it first. so wait for 10 seconds and then try it
6180 if ($d->{ids}->[0]->[0]->{vendor} =~ m/^(0x)?10de$/) {
6181 sleep 10;
6182 }
6183
6184 PVE::SysFSTools::file_write("$dev_sysfs_dir/remove", "1") if -e $dev_sysfs_dir;
6185 }
6186 }
6187 PVE::QemuServer::PCI::remove_pci_reservation($vmid);
6188 }
6189
6190 sub vm_stop_cleanup {
6191 my ($storecfg, $vmid, $conf, $keepActive, $apply_pending_changes) = @_;
6192
6193 eval {
6194
6195 if (!$keepActive) {
--
8644 sub cleanup_pci_devices {
8645 my ($vmid, $conf) = @_;
8646
8647 foreach my $key (keys %$conf) {
8648 next if $key !~ m/^hostpci(\d+)$/;
8649 my $hostpciindex = $1;
8650 my $uuid = PVE::SysFSTools::generate_mdev_uuid($vmid, $hostpciindex);
8651 my $d = parse_hostpci($conf->{$key});
8652 if ($d->{mdev}) {
8653 # NOTE: avoid PVE::SysFSTools::pci_cleanup_mdev_device as it requires PCI ID and we
8654 # don't want to break ABI just for this two liner
8655 my $dev_sysfs_dir = "/sys/bus/mdev/devices/$uuid";
8656
8657 # some nvidia vgpu driver versions want to clean the mdevs up themselves, and error
8658 # out when we do it first. so wait for 10 seconds and then try it
8659 my $pciid = $d->{pciid}->[0]->{id};
8660 my $info = PVE::SysFSTools::pci_device_info("$pciid");
8661 if ($info->{vendor} eq '10de') {
8662 sleep 10;
8663 }
8664 PVE::SysFSTools::file_write("$dev_sysfs_dir/remove", "1") if -e $dev_sysfs_dir;
8665 }
8666 }
8667 PVE::QemuServer::PCI::remove_pci_reservation($vmid);
8668 }
8669
8670 sub del_nets_bridge_fdb {
8671 my ($conf, $vmid) = @_;
8672
8673 for my $opt (keys %$conf) {
8674 next if $opt !~ m/^net(\d+)$/;
Code:
root@pve:~# lspci -k
03:00.0 3D controller: NVIDIA Corporation GP104GL [Tesla P4] (rev a1)
Subsystem: NVIDIA Corporation GP104GL [Tesla P4]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
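lspci shows the card is currently claimed by vfio-pci rather than the nvidia driver, which might explain the NVML error. As an experiment, I considered rebinding the device via the standard sysfs interface (this is my assumption of the right approach, with the PCI address taken from the lspci output above; it must be run as root and the nvidia module must be loaded):

```shell
# Assumption: hand 0000:03:00.0 back from vfio-pci to the nvidia module.
echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/unbind
echo 0000:03:00.0 > /sys/bus/pci/drivers/nvidia/bind
```
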
Code:
root@pve:~# qm start 101
Subroutine cleanup_pci_devices redefined at /usr/share/perl5/PVE/QemuServer.pm line 8644, <DATA> line 960.
swtpm_setup: Not overwriting existing state file.
Subroutine cleanup_pci_devices redefined at /usr/share/perl5/PVE/QemuServer.pm line 8644, <DATA> line 960.
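The "Subroutine cleanup_pci_devices redefined" warning seems consistent with the excerpt above, where the function appears at both line 6165 and line 8644, as if a patch was appended instead of replacing the original. One could confirm a duplicate definition by counting it; a self-contained demonstration with a stand-in file (on the real host one would grep /usr/share/perl5/PVE/QemuServer.pm instead):

```shell
# Stand-in file reproducing the duplicate-definition situation;
# Perl warns "Subroutine ... redefined" when a sub is defined twice.
cat > /tmp/QemuServer_sample.pm <<'EOF'
sub cleanup_pci_devices { return 1; }
sub other_sub { return 0; }
sub cleanup_pci_devices { return 2; }
EOF
grep -c '^sub cleanup_pci_devices' /tmp/QemuServer_sample.pm
# prints 2, i.e. the sub is defined twice
```
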