Proxmox 9.x / Strix Halo / GPU Passthrough

dlasher

Having spent the last few days fighting all the subtle parts of getting this working, I put together a quick guide to running Proxmox on a Strix Halo machine with working GPU passthrough into LXC containers. To be clear, this is a "works right now" recipe, subject to kernel changes, ROCm changes, etc.


---------------------------------

Guide: The Strix Halo AI Powerhouse (Proxmox 9.1 + ROCm 7.2)

Target Hardware: AMD Strix Halo box (Minisforum MS-S1 MAX, etc.)
Goal: A local AI server with full hardware acceleration on 128GB of total RAM: 64GB for CPU/applications plus 64GB usable as VRAM.


Introduction

The AMD Strix Halo (RDNA 3.5 / gfx1151) is a game-changer for local AI. Thanks to its high-speed unified memory architecture, this APU can address large amounts of system RAM as video memory. This guide details how to configure Proxmox 9.1 to carve out a 64GB GPU-addressable (GTT) pool and pass the GPU through to an LXC container.


Phase 1: Host BIOS & Kernel Tuning

Unlocking the memory gates to allow the GPU to access 64GB of RAM.

1. BIOS Settings

  • IOMMU: Enabled.
  • UMA Framebuffer: Auto. (The kernel parameters below will override and expand this).
  • Resizable BAR: Enabled.

2. Host Kernel Parameters

Edit /etc/default/grub (or /etc/kernel/cmdline on systemd-boot installs) on your Proxmox host to put the IOMMU in pass-through mode and set the Graphics Translation Table (GTT) size.

Code:
# Edit this line in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=on iommu=pt amdgpu.gttsize=65536 ttm.pages_limit=16777216 video=1024x768@60"

  • amdgpu.gttsize=65536: Sets the GTT size to 65536 MiB, i.e. 64GB of system RAM for GPU use.
  • ttm.pages_limit=16777216: Sets the page limit to exactly 64GB (16777216 pages × 4096 bytes per page).
  • video=1024x768@60: Because console graphics mode autosense is always wrong.
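
The arithmetic behind those two values can be sketched as a small helper. This is only an illustration: gtt_params is a made-up function name, and it assumes 4096-byte pages as in the bullet above.

```bash
#!/usr/bin/env bash
# Sketch: derive both kernel parameters from a target GTT size in GiB.
# gtt_params is a hypothetical helper; it assumes 4096-byte pages.
gtt_params() {
    local gib=$1
    local mib=$(( gib * 1024 ))                           # amdgpu.gttsize is given in MiB
    local pages=$(( gib * 1024 * 1024 * 1024 / 4096 ))    # ttm.pages_limit counts 4 KiB pages
    echo "amdgpu.gttsize=${mib} ttm.pages_limit=${pages}"
}

gtt_params 64   # prints: amdgpu.gttsize=65536 ttm.pages_limit=16777216
```

Calling it with a different size (for example gtt_params 96) gives the matching pair for a larger pool.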
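Apply and Reboot (if you edited /etc/kernel/cmdline on a systemd-boot install, run proxmox-boot-tool refresh in place of update-grub):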
Code:
update-grub && reboot
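
After the reboot it is worth checking that the parameters actually made it onto the kernel command line. A minimal sketch, assuming the parameter values from this guide (cmdline_has is a hypothetical helper, not a standard tool):

```bash
#!/usr/bin/env bash
# Sketch: report which expected parameters are present on the running kernel's command line.
# cmdline_has is a hypothetical helper; the parameter list mirrors this guide.
cmdline_has() {
    local cmdline=$1 param=$2
    # Pad with spaces so only whole parameters match (iommu=on must not match amd_iommu=on)
    [[ " ${cmdline} " == *" ${param} "* ]]
}

cmdline=$(cat /proc/cmdline 2>/dev/null || true)
for p in amd_iommu=on iommu=pt ttm.pages_limit=16777216; do
    if cmdline_has "$cmdline" "$p"; then
        echo "present: $p"
    else
        echo "MISSING: $p"
    fi
done
```

If anything shows as MISSING, the bootloader config was probably not regenerated, or the wrong file was edited for your boot method.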


Phase 2: Host Driver & Firmware Installation

Strix Halo requires the latest firmware blobs and the ROCm 7.2 userspace stack.

1. Update Firmware (Critical for gfx1151)

Run this on the Proxmox Host:

Code:
apt update && apt install -y git
git clone --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
cp linux-firmware/amdgpu/gc_11_5_1* /lib/firmware/amdgpu/
cp linux-firmware/amdgpu/sdma_6_1_1* /lib/firmware/amdgpu/
update-initramfs -u

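2. Install ROCm 7.2 Userspace (Host)

Code:
apt update && apt install -y wget gpg curl
# The "latest" directory moves with each ROCm release; if this URL 404s, use the
# versioned directory and filename instead (a later reply in this thread does so for 7.2.2)
wget -q https://repo.radeon.com/amdgpu-install/latest/ubuntu/noble/amdgpu-install_7.2.70200-1_all.deb -O /tmp/amdgpu.deb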
apt install -y /tmp/amdgpu.deb
amdgpu-install -y --usecase=graphics,rocm
usermod -aG render,video root

3. Host Verification

Confirm the host sees the hardware:

Code:
/usr/bin/rocminfo | grep "gfx1151"
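
Beyond rocminfo, a quick check that the device nodes mapped in Phase 3 exist on the host can save some head-scratching. A rough sketch (missing_nodes is a hypothetical helper, and the renderD128/card0 numbering is an assumption that may differ on your box):

```bash
#!/usr/bin/env bash
# Sketch: list which of the device nodes used later in this guide are missing.
# missing_nodes is a hypothetical helper; node numbering may differ on your hardware.
missing_nodes() {
    local node
    for node in "$@"; do
        [[ -e "$node" ]] || echo "$node"
    done
}

missing=$(missing_nodes /dev/kfd /dev/dri/renderD128 /dev/dri/card0)
if [[ -z "$missing" ]]; then
    echo "All GPU device nodes present"
else
    echo "Missing device nodes:"
    echo "$missing"
fi
```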


Phase 3: LXC Creation & Hardware Mapping

1. Create the Container

  • Privileged: Yes (Required for the KFD driver handshake).
  • Template: Ubuntu 24.04 LTS
  • RAM: 16GB (The GPU will pull from the 64GB GTT pool independently).

2. Map Devices (On Host)

Run these commands on the Proxmox Host to find your GIDs and map the hardware to your LXC (replace 1201 with your actual Container ID):

Code:
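# Find GIDs on the host. Note: gid= below is applied to the device node inside the
# container, so these must match the container's render/video GIDs
# (a later reply in this thread hit exactly this mismatch)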
RENDER_GID=$(getent group render | cut -d: -f3)
VIDEO_GID=$(getent group video | cut -d: -f3)

# Native Proxmox device passthrough
pct set 1201 -dev0 /dev/kfd,gid=$RENDER_GID
pct set 1201 -dev1 /dev/dri/renderD128,gid=$RENDER_GID
pct set 1201 -dev2 /dev/dri/card0,gid=$VIDEO_GID
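
Since the gid= values end up on the device nodes inside the container, it helps to compare the host's group IDs with the container's before testing anything. A minimal sketch, assuming the container is running and using the same example ID 1201 (gid_status is a hypothetical helper):

```bash
#!/usr/bin/env bash
# Sketch: compare a group's GID on the host with the same group inside the container.
# CTID and gid_status are illustrative; pct exec needs the container to be running.
CTID=1201

gid_status() {
    local group=$1 host_gid=$2 ct_gid=$3
    if [[ -z "$ct_gid" ]]; then
        echo "$group: not found in container"
    elif [[ "$host_gid" == "$ct_gid" ]]; then
        echo "$group: match ($host_gid)"
    else
        echo "$group: MISMATCH host=$host_gid container=$ct_gid"
    fi
}

# Only attempt the real comparison on a Proxmox host
if command -v pct >/dev/null 2>&1; then
    for g in render video; do
        host_gid=$(getent group "$g" | cut -d: -f3)
        ct_gid=$(pct exec "$CTID" -- getent group "$g" | cut -d: -f3)
        gid_status "$g" "$host_gid" "$ct_gid"
    done
fi
```

On a mismatch, either pass the container's GID to gid= in the pct set commands, or change the group ID inside the container (for example with groupmod -g), which is roughly what a later reply in this thread ended up doing.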


Phase 4: Container Internal Setup

Inside the Ubuntu 24.04 LTS LXC, install the ROCm stack without kernel modules (--no-dkms) and configure Ollama.

1. Install ROCm 7.2 (LXC)

Code:
apt update && apt install -y wget gpg curl zstd
wget -q 'https://repo.radeon.com/amdgpu-install/latest/ubuntu/noble/amdgpu-install_7.2.70200-1_all.deb' -O /tmp/amdgpu.deb
apt install -y /tmp/amdgpu.deb
amdgpu-install -y --usecase=rocm --no-dkms
usermod -aG render,video root



2. Install & Reconfigure Ollama

Code:
curl -fsSL https://ollama.com/install.sh | sh
systemctl edit ollama.service

Paste the following into the override file:

Code:
[Service]
# Force RDNA 3.5 recognition
Environment="HSA_OVERRIDE_GFX_VERSION=11.5.0"

# Stability fix: disable buggy SDMA for unified memory
Environment="HSA_ENABLE_SDMA=0"

# Enable Ollama's Vulkan backend
Environment="OLLAMA_VULKAN=1"

# Connectivity & Performance
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_KV_CACHE_TYPE=q8_0"
Environment="OLLAMA_KEEP_ALIVE=24h"
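
If you would rather script this than paste into an editor, the drop-in can be written directly. The sketch below is an assumption-laden alternative, not part of the original recipe: write_ollama_override is a hypothetical helper, the default directory is where systemctl edit normally stores override.conf, and only a subset of the settings above is included to keep it short.

```bash
#!/usr/bin/env bash
# Sketch: write an Ollama drop-in without opening an editor.
# write_ollama_override is a hypothetical helper; the default directory is where
# `systemctl edit ollama.service` normally stores its override.conf.
write_ollama_override() {
    local dir=${1:-/etc/systemd/system/ollama.service.d}
    mkdir -p "$dir"
    cat > "$dir/override.conf" <<'EOF'
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.5.0"
Environment="HSA_ENABLE_SDMA=0"
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_KEEP_ALIVE=24h"
EOF
}

# Demo against a scratch directory so nothing on the system is touched
demo_dir=$(mktemp -d)
write_ollama_override "$demo_dir"
cat "$demo_dir/override.conf"
```

To apply it for real, call write_ollama_override with no argument as root inside the container, then run systemctl daemon-reload and systemctl restart ollama.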



Phase 5: Functional Validation

Perform these checks inside the LXC to ensure the stack is operational.

1. Driver Check

Code:
/usr/bin/rocminfo | grep "gfx1151"


Expected Output: Name: gfx1151 and Name: amdgcn-amd-amdhsa--gfx1151.

2. Functional LLM Test

Code:
systemctl daemon-reload
systemctl restart ollama
ollama pull qwen2.5:0.5b
ollama run qwen2.5:0.5b "Why is the sky blue?"

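If the reply is instant and doesn't crash, your 64G/64G Strix Halo workstation is live. To confirm the model is really on the GPU, run ollama ps and check that the PROCESSOR column reads 100% GPU rather than 100% CPU.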

 
OPTIONAL:

If you don't want to blindly copy the firmware over the top, or you want to check whether you're ALREADY on the latest firmware, use this script instead.

Code:
#!/bin/bash
# pve-firmware-sync.sh

# 1. Pull latest firmware blobs from upstream (remove any stale clone first so git clone doesn't fail)
apt update && apt install -y git
rm -rf /tmp/linux-firmware
git clone --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git /tmp/linux-firmware

# Track if ANY file across all patterns was updated
ANY_UPDATED=0

sync_firmware() {
    local src_pattern=$1
    local dest_dir="/lib/firmware/amdgpu/"
    local src_file filename dest_file

    # Expand the pattern into a list of files (left unquoted on purpose so the glob expands)
    for src_file in $src_pattern; do
        # If the glob matched nothing, bash keeps the literal pattern; skip it
        [[ -e "$src_file" ]] || continue
        filename=$(basename "$src_file")
        dest_file="${dest_dir}${filename}"

        # If dest doesn't exist OR hashes differ, copy it
        if [[ ! -f "$dest_file" ]] || [[ "$(md5sum < "$src_file")" != "$(md5sum < "$dest_file")" ]]; then
            echo "Updating: $filename"
            cp "$src_file" "$dest_file"
            ANY_UPDATED=1
        else
            echo "Skipping: $filename (already up to date)"
        fi
    done
}

# 2. Perform the sync
sync_firmware "/tmp/linux-firmware/amdgpu/gc_11_5_1*"
sync_firmware "/tmp/linux-firmware/amdgpu/sdma_6_1_1*"

# 3. Only rebuild if ANY_UPDATED was set to 1
if [ "$ANY_UPDATED" -eq 1 ]; then
    echo "------------------------------------------------"
    echo "Changes detected. Rebuilding initramfs..."
    update-initramfs -u
    echo "Reboot recommended to apply new firmware."
else
    echo "------------------------------------------------"
    echo "Firmware is already synchronized. No rebuild required."
fi

# Cleanup
rm -rf /tmp/linux-firmware


Example run (here the script was saved as sync.firmware.sh):

Code:
# bash ./sync.firmware.sh
Hit:1 http://ftp.us.debian.org/debian trixie InRelease
Hit:2 http://security.debian.org trixie-security InRelease
Hit:3 http://ftp.us.debian.org/debian trixie-updates InRelease
Hit:4 https://repo.radeon.com/amdgpu/30.30/ubuntu noble InRelease
Hit:5 http://download.proxmox.com/debian/pve trixie InRelease
Hit:6 https://repo.radeon.com/rocm/apt/7.2 noble InRelease
Hit:7 https://repo.radeon.com/graphics/7.2/ubuntu noble InRelease
All packages are up to date.
git is already the newest version (1:2.47.3-0+deb13u1).
Summary:
Upgrading: 0, Installing: 0, Removing: 0, Not Upgrading: 0
Cloning into '/tmp/linux-firmware'...
remote: Enumerating objects: 4286, done.
remote: Counting objects: 100% (4286/4286), done.
remote: Compressing objects: 100% (2969/2969), done.
remote: Total 4286 (delta 1672), reused 3272 (delta 1190), pack-reused 0 (from 0)
Receiving objects: 100% (4286/4286), 721.90 MiB | 24.74 MiB/s, done.
Resolving deltas: 100% (1672/1672), done.
Updating files: 100% (4462/4462), done.
Skipping: gc_11_5_1_imu.bin (already up to date)
Skipping: gc_11_5_1_me.bin (already up to date)
Skipping: gc_11_5_1_mec.bin (already up to date)
Skipping: gc_11_5_1_mes1.bin (already up to date)
Skipping: gc_11_5_1_mes_2.bin (already up to date)
Skipping: gc_11_5_1_pfp.bin (already up to date)
Skipping: gc_11_5_1_rlc.bin (already up to date)
Skipping: sdma_6_1_1.bin (already up to date)
------------------------------------------------
Firmware is already synchronized. No rebuild required.
 
Just in case anyone else stumbles across this.
The instructions worked for me. My notes are attached below:

Phase 1:

1. BIOS Settings
-
2. Host Kernel Parameters

I read on the Strix Halo Wiki that the amdgpu.gttsize parameter is deprecated (it cites the deprecation message).
Simply omitting it worked:

Code:
# Edit this line in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=on iommu=pt ttm.pages_limit=25165824 video=1024x768@60"

This was the output on my Minisforum MS-S1 Max:
Code:
root@pve:~# dmesg | grep -i amdgpu | grep memory
[    3.639795] amdgpu 0000:bd:00.0:  512M of VRAM memory ready
[    3.639797] amdgpu 0000:bd:00.0:  98304M of GTT memory ready.
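
As a sanity check, the 98304M reported by dmesg lines up with the ttm.pages_limit value used here. A tiny sketch of the conversion, assuming 4096-byte pages (pages_to_mib is a hypothetical helper):

```bash
#!/usr/bin/env bash
# Sketch: convert a ttm.pages_limit value to MiB, assuming 4096-byte pages.
# pages_to_mib is a hypothetical helper.
pages_to_mib() {
    echo $(( $1 * 4096 / 1024 / 1024 ))
}

pages_to_mib 25165824   # prints 98304
```

So 25165824 pages is 98304 MiB, i.e. a 96GB GTT pool rather than the 64GB used in the original guide.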


Phase 2:

1. Update Firmware
My kernel is/was: Linux pve 7.0.0-3-pve #1 SMP PREEMPT_DYNAMIC PMX 7.0.0-3 (2026-04-21T22:56Z) x86_64 GNU/Linux
I did not have to update the firmware mentioned in Phase 2.

2. Install ROCm 7.2
The latest ROCm version is 7.2.2, so the URL needs a small update. The AMD website also lists a sed command that downgrades one repository (graphics) to 7.2.1 (as far as I understand it; I do not know what I am doing), so I went with these commands:

Code:
apt update && apt install -y wget gpg curl
wget https://repo.radeon.com/amdgpu-install/7.2.2/ubuntu/noble/amdgpu-install_7.2.2.70202-1_all.deb
apt install ./amdgpu-install_7.2.2.70202-1_all.deb
sed -i "s|graphics/7.2.2|graphics/7.2.1|" /etc/apt/sources.list.d/rocm.list
apt update
amdgpu-install -y --usecase=graphics,rocm
usermod -aG render,video root

I got an error with amdgpu-install -y --usecase=graphics,rocm :

Code:
root@pve:~# amdgpu-install  --usecase=graphics,rocm
INFO: i386 architecture has not been enabled with dpkg.
Installation of 32-bit run time has been excluded.
Hit:1 http://deb.debian.org/debian trixie InRelease
Hit:2 http://deb.debian.org/debian trixie-updates InRelease
Hit:3 https://repo.radeon.com/amdgpu/30.30.2/ubuntu noble InRelease
Hit:4 http://security.debian.org/debian-security trixie-security InRelease
Hit:5 https://repo.radeon.com/rocm/apt/7.2.2 noble InRelease
Hit:6 https://repo.radeon.com/graphics/7.2.1/ubuntu noble InRelease
Hit:7 http://download.proxmox.com/debian/pve trixie InRelease
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
amdgpu-lib is already the newest version (1:7.2.70201-2303469.24.04).
rocm is already the newest version (7.2.2.70202-86~24.04).
amdgpu-dkms is already the newest version (1:6.16.13.30300200-2317211.24.04).
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
WARNING: amdgpu dkms failed for running kernel

According to Claude/ChatGPT, the kernel is too new for the amdgpu DKMS module, but that module is apparently not needed; the driver that ships with the kernel is enough.

Code:
# On the HOST: just verify the in-tree driver works
/opt/rocm/bin/rocminfo | grep "gfx1151"

If your in-tree amdgpu driver (from kernel 7.0.0-3-pve) already exposes /dev/kfd and /dev/dri/renderD128, you're good. Check with:

Code:
ls -la /dev/kfd /dev/dri/

If those devices exist, skip the DKMS step on the host and go straight to the LXC setup using --no-dkms:

Code:
# Inside the LXC container only:
amdgpu-install -y --usecase=rocm --no-dkms

The container doesn't need DKMS at all; it just needs the ROCm userspace libraries.
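
To see which amdgpu module the host would actually load, the module's file path is a reasonable hint. This is only a sketch under assumptions: classify_amdgpu_module is a hypothetical helper, and it relies on the usual convention that DKMS builds land under updates/dkms/ while in-tree modules live under kernel/drivers/.

```bash
#!/usr/bin/env bash
# Sketch: guess whether the amdgpu module comes from the kernel tree or from DKMS.
# classify_amdgpu_module is a hypothetical helper; the path conventions are an assumption.
classify_amdgpu_module() {
    case "$1" in
        */updates/dkms/*)   echo "dkms" ;;
        */kernel/drivers/*) echo "in-tree" ;;
        "")                 echo "not found" ;;
        *)                  echo "unknown" ;;
    esac
}

# modinfo -F filename prints the path of the module file for the running kernel
path=$(modinfo -F filename amdgpu 2>/dev/null || true)
echo "amdgpu module: $(classify_amdgpu_module "$path") ${path}"
```

If this reports in-tree and /dev/kfd exists, the DKMS warning above should be safe to ignore for this setup.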

Phases 3-5 just "worked", or at least I thought they did. I then ran ollama ps, which showed:

Code:
root@llmtest:~# ollama ps
NAME            ID              SIZE      PROCESSOR    CONTEXT    UNTIL
qwen2.5:0.5b    a8b0c5157701    492 MB    100% CPU     4096       23 hours from now

The group IDs of the render group in the LXC container and on the host were not identical. I changed the groups in the LXC container, rebooted it, and got a quicker answer, and ollama ps now showed:
Code:
root@llmtest:~# ollama ps
NAME            ID              SIZE      PROCESSOR    CONTEXT    UNTIL
qwen2.5:0.5b    a8b0c5157701    3.2 GB    100% GPU     32768      24 hours from now
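
Since a silent CPU fallback is easy to miss, the check can be scripted. A rough sketch that reads ollama ps output on stdin and assumes the column layout shown above (cpu_bound_models is a hypothetical helper):

```bash
#!/usr/bin/env bash
# Sketch: flag models that `ollama ps` reports as running (fully or partly) on the CPU.
# cpu_bound_models is a hypothetical helper; it assumes the column layout shown above.
cpu_bound_models() {
    # Skip the header row; print the model name when the row mentions CPU
    awk 'NR > 1 && /CPU/ { print $1 }'
}

sample='NAME            ID              SIZE      PROCESSOR    CONTEXT    UNTIL
qwen2.5:0.5b    a8b0c5157701    492 MB    100% CPU     4096       23 hours from now'

printf '%s\n' "$sample" | cpu_bound_models   # prints qwen2.5:0.5b
```

In practice you would pipe the live output instead: ollama ps | cpu_bound_models. Empty output means no loaded model is reported as using the CPU.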

Thank you very much for the instructions.
 