Proxmox 4.3: NVIDIA K80 GPU passthrough loses one GPU once the NVIDIA driver is installed

zuni11

I pass one K80 through to a guest VM (CentOS 7).
Everything worked on 4.2, but after I updated to 4.3, nvidia-smi in the guest OS shows only one GPU instead of two. Please help me.

My configuration:
---------------------------------------------------------------------------------------
root@pve2:~# pveversion -v
proxmox-ve: 4.3-72 (running kernel: 4.4.24-1-pve)
pve-manager: 4.3-12 (running version: 4.3-12/6894c9d9)
pve-kernel-4.4.21-1-pve: 4.4.21-71
pve-kernel-4.4.24-1-pve: 4.4.24-72
pve-kernel-4.4.19-1-pve: 4.4.19-66
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-47
qemu-server: 4.0-96
pve-firmware: 1.1-10
libpve-common-perl: 4.0-83
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-68
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.3-17
pve-qemu-kvm: 2.7.0-8
pve-container: 1.0-85
pve-firewall: 2.0-31
pve-ha-manager: 1.0-38
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.6-1
lxcfs: 2.0.5-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve13~bpo80
----------------------------------------------------------------------------------------
root@pve2:~# qm config 100
args: -machine pc,max-ram-below-4g=1G
bios: ovmf
bootdisk: scsi0
cores: 4
efidisk0: local-lvm:vm-106-disk-2,size=128K
hostpci0: 83:00,pcie=1
hostpci1: 84:00,pcie=1
ide2: local:iso/CentOS-7-x86_64-Minimal-1511.iso,media=cdrom
machine: pc-q35-2.6
memory: 4096
name: centos7-uefi
net0: virtio=16:01:BB:33:9A:8D,bridge=vmbr0
numa: 1
ostype: l26
scsi0: local-lvm:vm-106-disk-1,size=20G
scsihw: virtio-scsi-pci
smbios1: uuid=155bd52d-a33d-474c-b157-7d76a3a51019
sockets: 1
------------------------------------------------------------------------------------------------
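A possible host-side check, sketched here as an assumption rather than something taken from the original setup: for every hostpciN entry in the VM config, look up which kernel driver the device is bound to on the host. The helper names (hostpci_addrs, driver_in_use, show_passthrough_drivers) are hypothetical, and the sketch assumes both K80 devices are expected to show vfio-pci while the VM is running.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Proxmox.

# Read `qm config <vmid>` output on stdin and print the PCI address
# of every hostpciN entry, one per line (for example 83:00).
hostpci_addrs() {
    sed -n 's/^hostpci[0-9]*: *\([^,]*\).*/\1/p'
}

# Read `lspci -k` output on stdin and print the bound kernel driver(s).
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: *//p'
}

# For each passed-through address of VM $1, print the address and its driver.
# Assumption: run as root on the Proxmox host.
show_passthrough_drivers() {
    for addr in $(qm config "$1" | hostpci_addrs); do
        drv=$(lspci -k -s "$addr" | driver_in_use)
        echo "$addr: ${drv:-no driver bound}"
    done
}

# Usage on the host (not executed here):
#   show_passthrough_drivers 100
```

If one of the two addresses reports a different driver than the other, that would narrow the problem to the host side before looking at the guest.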
lspci in the guest OS shows 2 GPUs:

[root@localhost ~]# lspci | grep NVIDIA
01:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
02:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
-----------------------------------------------------------------------------------------------
[root@localhost ~]# nvidia-smi -L
GPU 0: Tesla K80 (UUID: GPU-c428959f-b550-bbaf-e26f-a946e8dd7b1f)
Only 1 GPU is shown.
-----------------------------------------------------------------------------------------------
[root@localhost ~]# nvidia-smi
Fri Dec 2 05:30:04 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.99     Driver Version: 352.99         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:01:00.0     Off |                    0 |
| N/A   37C    P0    56W / 149W |     55MiB / 11519MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Only 1 GPU is shown.
----------------------------------------------------------------------------------------------------------------
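The mismatch above (two devices on the guest PCI bus, one GPU in nvidia-smi) can be checked mechanically. The following is a minimal sketch with hypothetical helper names (count_pci_gpus, count_driver_gpus, report_gpu_mismatch); it assumes the lspci and `nvidia-smi -L` line formats shown in this post. When the counts differ, the NVIDIA kernel module's messages in the kernel log (lines prefixed with NVRM) are usually the next place to look.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative only.

# Count NVIDIA 3D controllers in `lspci` output read from stdin.
count_pci_gpus() {
    grep -c '3D controller: NVIDIA' || true
}

# Count GPUs listed by `nvidia-smi -L` output read from stdin.
count_driver_gpus() {
    grep -c '^GPU [0-9][0-9]*:' || true
}

# Compare the two counts: $1 = GPUs on the PCI bus, $2 = GPUs the driver lists.
report_gpu_mismatch() {
    if [ "$1" -eq "$2" ]; then
        echo "ok: driver sees all $1 GPU(s)"
    else
        echo "mismatch: PCI bus shows $1 GPU(s), driver initialized $2"
    fi
}

# Usage inside the guest (not executed here):
#   report_gpu_mismatch "$(lspci | count_pci_gpus)" "$(nvidia-smi -L | count_driver_gpus)"
#   dmesg | grep NVRM    # driver messages about the GPU that failed to initialize
```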

I don't know what differs between 4.2 and 4.3. What can I do now?
 
Today I reinstalled Proxmox 4.2 and repeated the same steps as above.
Everything works: the NVIDIA driver detects both GPUs.

root@ubuntu1510:~$ nvidia-smi
Tue Dec 6 02:12:27 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.99     Driver Version: 352.99         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:01:00.0     Off |                    0 |
| N/A   35C    P0    55W / 149W |     55MiB / 11519MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 0000:02:00.0     Off |                    0 |
| N/A   30C    P0    72W / 149W |     55MiB / 11519MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
-----------------------------------------------------------------------------------------------------------------------------------------------------------

So I am sure this issue occurs only under Proxmox 4.3.