Host freezes after GPU passthrough

ProxFuchs

Jul 4, 2017
Hello everybody,
I have some issues with GPU passthrough on my Proxmox system.
Proxmox runs on a Dell T20 (Xeon E3-1225v3), and I want to pass the GPU (AMD Radeon Pro WX5100) through to an Ubuntu VM.
I set up the VM as described in the wiki and first installed Ubuntu 16.04.2 in UEFI mode. But after uncommenting the hostpci0/hostpci1 lines and starting the VM, the host freezes after about 10 seconds and no longer responds to any request. I can see the vCPU running at 100%, but I can't reach the server over either SSH or AMT.
I'm grateful for any help...

Packages on my host:
proxmox-ve: 4.4-92 (running kernel: 4.4.67-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.44-1-pve: 4.4.44-84
pve-kernel-4.4.67-1-pve: 4.4.67-92
pve-kernel-4.4.49-1-pve: 4.4.49-86
pve-kernel-4.4.40-1-pve: 4.4.40-82
pve-kernel-4.4.62-1-pve: 4.4.62-88
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-52
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-95
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-100
pve-firewall: 2.0-33
pve-ha-manager: 1.0-41
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80


Config of GPU-VM:
balloon: 0
bios: ovmf
bootdisk: scsi0
cores: 1
efidisk0: local-lvm:vm-106-disk-2,size=128K
hostpci0: 01:00.0,x-vga=on,pcie=1
hostpci1: 01:00.1
ide2: local:iso/ubuntu-16.04.2-server-amd64.iso,media=cdrom
machine: q35
memory: 2048
name: ubuntu-gpu
net0: virtio=B6:39:31:29:61:CA,bridge=vmbr0
numa: 0
ostype: l26
scsi0: local-lvm:vm-106-disk-1,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=c601f05b-979e-4e75-985a-06bc21ae4b2c
sockets: 1


Output of lspci -v:
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67c7 (prog-if 00 [VGA controller])
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0b0d
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at f0000000 (64-bit, prefetchable) [size=2M]
I/O ports at e000
Memory at f7e00000 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at f7e40000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [200] #15
Capabilities: [270] #19
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] #13
Capabilities: [2d0] #1b
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Capabilities: [370] L1 PM Substates
Kernel driver in use: vfio-pci

01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aaf0
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device aaf0
Flags: bus master, fast devsel, latency 0, IRQ 10
Memory at f7e60000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Kernel driver in use: vfio-pci


GRUB (/etc/default/grub):
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream video=efifb:off"
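
Changes to /etc/default/grub only take effect after running update-grub and rebooting, so it can be worth confirming that the running kernel really received these parameters. A minimal sketch of such a check follows; the helper name cmdline_has is made up for illustration and is not part of any Proxmox tooling:

```shell
# Hypothetical helper: succeed if parameter $2 appears as a whole word
# in the kernel command line string $1.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;   # parameter present as a space-delimited word
        *)        return 1 ;;   # parameter missing
    esac
}
```

On the host it could be called as cmdline_has "$(cat /proc/cmdline)" intel_iommu=on && echo "IOMMU parameter active", and likewise for the other parameters on the GRUB line.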

/etc/modprobe.d/vfio.conf:
options vfio-pci ids=1002:67c7,1002:aaf0 disable_vga=1
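
The IDs in this file have to match the vendor:device pairs of both GPU functions (here 1002 is the AMD vendor ID, with device IDs 67c7 and aaf0 as in the lspci output above). A rough sketch for pulling the IDs out of the options line so they can be compared against lspci -n -s 01:00 is shown below; the function name vfio_ids is an assumption chosen for this example:

```shell
# Hypothetical helper: read modprobe config lines on stdin and print each
# vendor:device pair from the ids= option of "options vfio-pci ..." lines.
vfio_ids() {
    sed -n 's/^[[:space:]]*options[[:space:]]\{1,\}vfio-pci[[:space:]].*ids=\([^[:space:]]*\).*/\1/p' \
        | tr ',' '\n'   # one ID per line
}
```

For example, vfio_ids < /etc/modprobe.d/vfio.conf would list the configured IDs one per line.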

Output of find /sys/kernel/iommu_groups/ -type l :
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/2/devices/0000:00:14.0
/sys/kernel/iommu_groups/3/devices/0000:00:16.0
/sys/kernel/iommu_groups/3/devices/0000:00:16.3
/sys/kernel/iommu_groups/4/devices/0000:00:19.0
/sys/kernel/iommu_groups/5/devices/0000:00:1a.0
/sys/kernel/iommu_groups/6/devices/0000:00:1b.0
/sys/kernel/iommu_groups/7/devices/0000:00:1c.0
/sys/kernel/iommu_groups/8/devices/0000:00:1c.1
/sys/kernel/iommu_groups/9/devices/0000:00:1c.2
/sys/kernel/iommu_groups/10/devices/0000:00:1c.4
/sys/kernel/iommu_groups/11/devices/0000:00:1d.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.0
/sys/kernel/iommu_groups/12/devices/0000:00:1f.2
/sys/kernel/iommu_groups/12/devices/0000:00:1f.3
/sys/kernel/iommu_groups/13/devices/0000:01:00.0
/sys/kernel/iommu_groups/13/devices/0000:01:00.1
/sys/kernel/iommu_groups/14/devices/0000:03:00.0
/sys/kernel/iommu_groups/14/devices/0000:04:02.0
/sys/kernel/iommu_groups/15/devices/0000:05:00.0
/sys/kernel/iommu_groups/16/devices/0000:06:00.0
/sys/kernel/iommu_groups/17/devices/0000:06:00.1
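
For passthrough, the GPU and its audio function should sit in an IOMMU group of their own (group 13 in the listing above). To make that easier to see, the find output can be condensed to one line per group. This is only a sketch, and summarise_iommu_groups is a made-up name:

```shell
# Hypothetical helper: read /sys/kernel/iommu_groups/N/devices/ADDR paths
# on stdin and print "group N: ADDR ADDR ..." in order of first appearance.
summarise_iommu_groups() {
    awk -F/ '
        # Split on "/": $4 = "iommu_groups", $5 = group number, $7 = PCI address
        NF >= 7 && $4 == "iommu_groups" {
            if ($5 in members) {
                members[$5] = members[$5] " " $7
            } else {
                order[++n] = $5
                members[$5] = $7
            }
        }
        END {
            for (i = 1; i <= n; i++)
                printf "group %s: %s\n", order[i], members[order[i]]
        }
    '
}
```

It could be used as find /sys/kernel/iommu_groups/ -type l | sort -V | summarise_iommu_groups (sort -V assumes GNU sort, which the Debian-based host provides).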
 