GPU pass through on HPE Cray XD670

Ignauus

Dec 18, 2024
Hi,

Has anyone tried and successfully enabled GPU passthrough on this beast of a server?
It has 8 x NVIDIA H200 GPUs with Intel Xeon 8570 CPUs.

I've tried various settings and tutorials from the web and this forum, but none of them has worked.

The closest I've got is a failed driver install, with errors like the one below in the CUDA installer log:
NVRM: This PCI I/O region assigned to your NVIDIA device is invalid

With the CPU type set to host, the VM hangs with "NO VNC" in the console, and on the host I see it get stuck on:
vfio-pci-pci 0000:0a:00.0:Enabling HDA controller

I use a UEFI VM with the q35 machine type and have tried every setting I can think of when adding the PCI device. The IOMMU groups look fine.
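
For reference, a minimal sketch of how the IOMMU grouping can be checked on the host. It assumes the standard Linux sysfs layout under /sys/kernel/iommu_groups; the function name list_iommu_groups is only illustrative and not part of any Proxmox tool.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} read from a sysfs-style tree.

    Assumes the usual layout <root>/<group>/devices/<pci_address>.
    An empty dict means the directory is missing, which usually
    indicates that the IOMMU is disabled or not exposed by the kernel.
    """
    groups = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return groups
    for group_dir in root_path.iterdir():
        devices_dir = group_dir / "devices"
        # Skip anything that does not look like a numbered group.
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


if __name__ == "__main__":
    for number, devices in sorted(list_iommu_groups().items()):
        print(f"group {number}: {' '.join(devices)}")
```

Each GPU should ideally sit in its own group, or share it only with functions of the same card, such as its audio function.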

We had similar problems on an older HPE server with 8 x A100 and ended up using libvirt/KVM instead, but I'm not quite ready to give up just yet.
Any suggestions on where to begin?

--Peter
 
