Proxmox 4.3: NVIDIA K80 GPU passthrough loses one GPU when the NVIDIA driver is installed

zuni11

Active Member
Aug 30, 2016
I pass through one K80 (a dual-GPU card, so two PCI devices) to a guest VM (CentOS 7).
Everything worked on 4.2. After updating to 4.3, nvidia-smi in the guest OS shows only one GPU instead of two. Please help.

My configs:
---------------------------------------------------------------------------------------
root@pve2:~# pveversion -v
proxmox-ve: 4.3-72 (running kernel: 4.4.24-1-pve)
pve-manager: 4.3-12 (running version: 4.3-12/6894c9d9)
pve-kernel-4.4.21-1-pve: 4.4.21-71
pve-kernel-4.4.24-1-pve: 4.4.24-72
pve-kernel-4.4.19-1-pve: 4.4.19-66
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-47
qemu-server: 4.0-96
pve-firmware: 1.1-10
libpve-common-perl: 4.0-83
libpve-access-control: 4.0-19
libpve-storage-perl: 4.0-68
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.3-17
pve-qemu-kvm: 2.7.0-8
pve-container: 1.0-85
pve-firewall: 2.0-31
pve-ha-manager: 1.0-38
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 2.0.6-1
lxcfs: 2.0.5-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve13~bpo80
----------------------------------------------------------------------------------------
root@pve2:~# qm config 100
args: -machine pc,max-ram-below-4g=1G
bios: ovmf
bootdisk: scsi0
cores: 4
efidisk0: local-lvm:vm-106-disk-2,size=128K
hostpci0: 83:00,pcie=1
hostpci1: 84:00,pcie=1
ide2: local:iso/CentOS-7-x86_64-Minimal-1511.iso,media=cdrom
machine: pc-q35-2.6
memory: 4096
name: centos7-uefi
net0: virtio=16:01:BB:33:9A:8D,bridge=vmbr0
numa: 1
ostype: l26
scsi0: local-lvm:vm-106-disk-1,size=20G
scsihw: virtio-scsi-pci
smbios1: uuid=155bd52d-a33d-474c-b157-7d76a3a51019
sockets: 1
------------------------------------------------------------------------------------------------
lspci in the guest OS shows 2 GPUs:

[root@localhost ~]# lspci | grep NVIDIA
01:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
02:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
-----------------------------------------------------------------------------------------------
[root@localhost ~]# nvidia-smi -L
GPU 0: Tesla K80 (UUID: GPU-c428959f-b550-bbaf-e26f-a946e8dd7b1f)
Only 1 GPU is shown.
-----------------------------------------------------------------------------------------------
[root@localhost ~]# nvidia-smi
Fri Dec 2 05:30:04 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.99     Driver Version: 352.99         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:01:00.0     Off |                    0 |
| N/A   37C    P0    56W / 149W |     55MiB / 11519MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Only 1 GPU is shown.
----------------------------------------------------------------------------------------------------------------
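To recheck the mismatch quickly after each change, the comparison between what the PCI bus exposes and what the driver enumerates could be scripted. This is only a minimal sketch: the function names are hypothetical helpers, and the grep patterns assume output shaped like the lspci and nvidia-smi -L listings above.

```shell
# Count NVIDIA controllers in `lspci` output supplied on stdin.
# Assumes lines like "01:00.0 3D controller: NVIDIA Corporation ...".
count_pci_gpus() {
    grep -c 'controller: NVIDIA' || true
}

# Count GPUs in `nvidia-smi -L` output supplied on stdin.
# Assumes lines like "GPU 0: Tesla K80 (UUID: ...)".
count_smi_gpus() {
    grep -c '^GPU [0-9]' || true
}

# Compare the two counts: $1 = PCI count, $2 = driver count.
# Returns non-zero when the driver sees a different number of GPUs.
report_gpu_mismatch() {
    if [ "$1" -eq "$2" ]; then
        echo "OK: driver sees all $1 GPU(s)"
    else
        echo "MISMATCH: lspci shows $1 GPU(s), nvidia-smi shows $2"
        return 1
    fi
}

# Usage inside the guest:
#   report_gpu_mismatch "$(lspci | count_pci_gpus)" "$(nvidia-smi -L | count_smi_gpus)"
```

With the outputs posted above, this would report 2 devices on the bus against 1 seen by the driver.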

I don't know what changed between 4.2 and 4.3. What can I do now?
 
Today I reinstalled Proxmox 4.2 and repeated the same steps as above.
Everything works: the NVIDIA driver detects both GPUs.

root@ubuntu1510:~$ nvidia-smi
Tue Dec 6 02:12:27 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.99     Driver Version: 352.99         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:01:00.0     Off |                    0 |
| N/A   35C    P0    55W / 149W |     55MiB / 11519MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 0000:02:00.0     Off |                    0 |
| N/A   30C    P0    72W / 149W |     55MiB / 11519MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
-----------------------------------------------------------------------------------------------------------------------------------------------------------

So I am sure this issue occurs only under Proxmox 4.3.
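Since the same steps work on 4.2 and fail on 4.3, one way to narrow down the cause might be to save the pveversion -v output on each install and list only the packages whose versions differ. The following is a rough sketch; the function name and the two file names are hypothetical, and it assumes each line has the "package: version" shape shown in the listing above.

```shell
# Print packages whose version differs between two saved `pveversion -v` outputs.
# $1 = file saved on the working install, $2 = file saved on the failing install.
# Lines without a "name: value" shape (prompts, separators) are skipped.
diff_pve_versions() {
    awk '
        {
            i = index($0, ": ")
            if (i == 0) next
            pkg = substr($0, 1, i - 1)   # text before the first ": "
            ver = substr($0, i + 2)      # everything after it
        }
        NR == FNR { old[pkg] = ver; next }   # first file: remember versions
        (pkg in old) && old[pkg] != ver {    # second file: report changes
            printf "%s: %s -> %s\n", pkg, old[pkg], ver
        }
    ' "$1" "$2"
}

# Usage (hypothetical file names):
#   pveversion -v > pve42.txt    # on the 4.2 install
#   pveversion -v > pve43.txt    # on the 4.3 install
#   diff_pve_versions pve42.txt pve43.txt
```

Packages such as pve-qemu-kvm, qemu-server, and the running kernel would be the first entries to look at in the result, since they are the ones involved in PCI passthrough.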
 
