Tutorial: Run LLMs using AMD GPU and ROCm in unprivileged LXC container

Quick-and-dirty snippet for an Arch container:
Code:
nano /etc/pacman.d/mirrorlist   # enable 2-3 servers (https!)
pacman-key --init && pacman-key --populate archlinux && pacman-key --refresh-keys
pacman -Syu openssh sudo && systemctl restart sshd && systemctl enable sshd
visudo
################
## Uncomment to allow members of group wheel to execute any command
 %wheel ALL=(ALL:ALL) ALL      # uncomment this line (x), then save and quit (:wq)

useradd -m -G wheel,users,render,video -s /bin/bash archc01
passwd archc01

*from here on via SSH*

sudo pacman -S --needed base-devel git rocminfo
git clone https://aur.archlinux.org/yay.git && cd yay && makepkg -si && yay --version
yay python312

on the host:
ls -l /dev/dri/by-path/ /dev/kfd /dev/dri && lspci -d ::03xx

/dev/kfd -> mapping to render GID of LXC
/dev/dri/renderD128 -> mapping to video GID of LXC
/dev/dri/renderD129 -> mapping to video GID of LXC

in container:
grep -w 'render\|video' /etc/group
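
The passthrough settings need the numeric GIDs of these groups as they exist inside the container, not on the host. A minimal sketch for reading them out; `group_gid` is a hypothetical helper name, not part of any tool mentioned here:

```bash
# Print the numeric GID of a group, read from a group file.
# Arguments: group name, optional path to a group file (default: /etc/group).
group_gid() {
    local name="$1"
    local file="${2:-/etc/group}"
    # /etc/group fields are name:password:GID:members
    awk -F: -v g="$name" '$1 == g { print $3; exit }' "$file"
}

# Example, run inside the container:
#   group_gid render
#   group_gid video
```

An empty result means the group does not exist in the container yet.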

shut down the container, then restart it after setting up passthrough

git clone https://github.com/vladmandic/sdnext
cd sdnext
python3.12 -m venv venv
./webui.sh --listen --debug --auth "CreativeOne:p4sSw0rD" --use-rocm
or
./webui.sh --listen --debug --auth "CreativeOne:p4sSw0rD" --use-zluda

sit back and wait

browse to http://IP:7860

By the way, there is no need to set these variables; the GPU is autodetected and it just works.
Code:
HSA_OVERRIDE_GFX_VERSION=9.0.0
HSA_ENABLE_SDMA=0
 
I want to add the steps for an LXC container on an AMD 5825U:


**Installing ROCm:**

Download the `amdgpu-install` package (use the latest stable version, e.g., 6.3.4):
```bash
wget https://repo.radeon.com/amdgpu-install/6.3.4/ubuntu/noble/amdgpu-install_6.3.60304-1_all.deb
```

Install the package:
```bash
sudo apt install ./amdgpu-install_6.3.60304-1_all.deb
```

Install ROCm components without DKMS (critical for LXC):
```bash
sudo amdgpu-install --usecase=rocm --no-dkms
```

Install additional ROCm libraries:
```bash
sudo apt install rocm-hip-libraries rocm-dev rocm-core
```

Add the current user to the `render` and `video` groups:
```bash
sudo usermod -a -G render,video $LOGNAME
```

**Important:** Add the following environment variables to your `.profile` (or `.bashrc`) and restart the terminal session/container:
```bash
echo "export HSA_OVERRIDE_GFX_VERSION=9.0.0" >> ~/.profile
echo "export HSA_ENABLE_SDMA=0" >> ~/.profile
```

Verify ROCm installation:
```bash
rocminfo
rocm-smi
```
*(Ensure `rocminfo` displays your gfxID, e.g., gfx90c)*
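
To pick the gfx ID out of the long `rocminfo` output, a filter like the following may help. This is only a sketch: `gfx_ids` is a hypothetical helper, and it assumes the IDs appear in the output as plain `gfx...` tokens.

```bash
# Read rocminfo output on stdin and print each distinct gfx ID once.
gfx_ids() {
    grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Example:
#   rocminfo | gfx_ids
```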

---

**Preparing to compile llama.cpp:**

Install build tools:
```bash
sudo apt update && sudo apt install build-essential cmake clang lld compiler-rt git
```

Clone the llama.cpp repository:
```bash
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
```

Set environment variables for the HIP compiler:
```bash
export HIPCXX="$(hipconfig -l)/clang"
export HIP_PATH="$(hipconfig -R)"
export HIP_DEVICE_LIB_PATH="$(find "${HIP_PATH}" -name "oclc_abi_version_400.bc" -exec dirname {} \; | head -n 1)"
```

---

**Compiling llama.cpp:**

Clean the build directory (if it exists):
```bash
rm -rf build
```

Configure CMake with your GPU architecture (`gfx90c` or `gfx900` in your case) and enable UMA:
```bash
cmake -S. -B build \
-DGGML_HIP=ON \
-DAMDGPU_TARGETS=gfx900 \
-DGGML_HIP_UMA=ON \
-DCMAKE_BUILD_TYPE=Release \
-DGGML_CCACHE=ON
```
*(Note: `gfx900` is the target ID that built and ran successfully in this setup)*
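
The build target and the `HSA_OVERRIDE_GFX_VERSION` value should describe the same architecture. As far as I understand the naming scheme, the override is `major.minor.stepping` in decimal, while the gfx name writes minor and stepping as single hex digits. A sketch of that conversion, with `gfx_target` as a hypothetical helper:

```bash
# Convert an HSA_OVERRIDE_GFX_VERSION value (major.minor.stepping)
# into the matching gfx target name (assumed naming scheme, see above).
gfx_target() {
    local major minor step rest
    major="${1%%.*}"
    rest="${1#*.}"
    minor="${rest%%.*}"
    step="${rest#*.}"
    # minor and stepping are printed as hex digits, e.g. 12 -> c
    printf 'gfx%d%x%x\n' "$major" "$minor" "$step"
}

# Example:
#   gfx_target 9.0.0     # the override used above
```

Under that assumption, the override `9.0.0` used above corresponds to the `gfx900` build target rather than `gfx90c`.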

Compile llama.cpp:
```bash
cmake --build build --config Release -- -j $(nproc)
```

---

**Verification:**

Run llama-bench to test GPU acceleration:
```bash
cd build
./bin/llama-bench -m ../../llama-2-7b.Q4_0.gguf -ngl 100 -fa 0,1
```
*(Verify that llama-bench runs successfully and shows GPU utilization)*
 
Some updates:

I used Debian with the Trixie repositories enabled (and ran apt dist-upgrade -y to update to the latest Podman with ZFS and Quadlet support).
I run Ollama as an OCI container with rootless Podman inside the LXC container. I could not get it to run with 660 permissions, even though my user was in the render and video groups. I had to set the permissions to 666 for Ollama to recognize the GPU.

Note: There is an error in the original post (I cannot edit it):
/dev/kbd should be /dev/kfd

Both /dev/dri/renderD* and /dev/kfd belong to the render group.
It is /dev/dri/card* that belongs to the video group.

No other changes had to be done.
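
For reference, the passthrough entries can be generated rather than typed by hand. This is a sketch only: it assumes the Proxmox `dev[n]` device entries in `/etc/pve/lxc/<vmid>.conf`, `dev_line` is a hypothetical helper, and the GID in the example is a placeholder for the render GID inside your container.

```bash
# Print one Proxmox LXC device passthrough entry.
# Arguments: index, device path, container-side GID, optional mode (default 0660).
dev_line() {
    local idx="$1"
    local path="$2"
    local gid="$3"
    local mode="${4:-0660}"
    printf 'dev%s: %s,gid=%s,mode=%s\n' "$idx" "$path" "$gid" "$mode"
}

# Example with a placeholder render GID of 992 and the 666 mode mentioned above:
#   dev_line 0 /dev/kfd 992 0666
#   dev_line 1 /dev/dri/renderD128 992 0666
```

The printed lines go into the container's config on the host; the container has to be restarted afterwards.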
 
Hi guys!

Really appreciate all the help you're putting in!

I'm close to giving up in frustration: I've tried everything suggested here, but never got it working:


Server Specifications

Before diving into the troubleshooting steps, here are the specifications of my server:

  • CPU: AMD Ryzen 9 9700X
  • Mainboard: GIGABYTE X870E Aorus Elite
  • GPU: AMD Radeon RX 9070 XT
  • RAM: 64GB DDR5
  • Proxmox Version: 9.0.5
  • Kernel Version: 6.14.8-2-pve

What I've Tried So Far

1. I tried all the steps mentioned above: installing ROCm and setting up the devices with both 0660 and 0666 permissions, on a fresh unprivileged Ubuntu 25.04 container. Nothing worked. Despite these changes, the container still fails to access the GPU and shows the error message:
Unable to open /dev/kfd read-write: Invalid argument. Ollama never recognizes the GPU either. I've set export HSA_OVERRIDE_GFX_VERSION=12.0.0, following [this list](https://llvm.org/docs/AMDGPUUsage.html#processors), both in ~/.profile and as an Environment entry in /etc/systemd/system/ollama.service.

video group is gid = 44
render group is gid = 992



It seems to me that it's still something related to permissions, but I could never find out why. Maybe it is the RX 9070 XT that is causing problems. However, passthrough to a Windows VM works fine.
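
One way to narrow this down from inside the container is to check each device node separately. A rough sketch; `check_node` is a hypothetical helper and it only looks at existence, node type and the read/write permission bits:

```bash
# Report the state of a device node as seen by the current user.
check_node() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "missing"
    elif [ ! -c "$path" ]; then
        echo "not-a-char-device"
    elif [ -r "$path" ] && [ -w "$path" ]; then
        echo "ok"
    else
        echo "no-rw-access"
    fi
}

# Example:
#   for n in /dev/kfd /dev/dri/renderD128; do
#       printf '%s: %s\n' "$n" "$(check_node "$n")"
#   done
```

If both nodes report `ok` and rocminfo still fails with "Invalid argument", that would point away from plain file permissions.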

Thanks for any help!
 
I'm not sure if kernel 6.14.8-2 is new enough for the RX 9070 XT, but it would need ROCm 6.4+ inside the container -> https://github.com/vladmandic/sdnext/issues/4030#issuecomment-3090158898
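
A quick way to compare version strings against that 6.4 minimum, as a sketch: `version_ge` is a hypothetical helper built on GNU `sort -V`, and the version file path in the example is an assumption about where the ROCm packages record their version.

```bash
# Succeed if version $1 is greater than or equal to version $2.
version_ge() {
    # sort -V orders version strings; if $2 sorts first (or ties), $1 >= $2
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (path is an assumption, adjust to your install):
#   rocm_ver="$(cat /opt/rocm/.info/version 2>/dev/null)"
#   version_ge "${rocm_ver%%-*}" 6.4 && echo "ROCm is 6.4 or newer"
```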

/card0 is usually not needed (for pure compute), and the default permissions should be sufficient: leaving the mode field empty results in 0660.
Double check your GIDs, I'm pretty sure you want gid=44 for /renderD128 (or the other way around)

compare:
on the host:
ls -l /dev/dri/by-path/ /dev/kfd /dev/dri && lspci -d ::03xx

/dev/kfd -> mapping to render GID of LXC
/dev/dri/renderD128 -> mapping to video GID of LXC
/dev/dri/renderD129 -> mapping to video GID of LXC

in container:
grep -w 'render\|video' /etc/group
 
Hi!

Thanks for the fast reply and help.

I've upgraded to the latest 6.14.11-1, since 6.15 is not available yet (because Proxmox apparently waits for the Ubuntu 25.04 lifecycle):
Code:
$ uname -a
Linux pve 6.14.11-1-pve #1 SMP PREEMPT_DYNAMIC PMX 6.14.11-1 (2025-08-26T16:06Z) x86_64 GNU/Linux

The groups are as follows:

Code:
root@gpu:~# cat /etc/group | grep -w 'render\|\video'
video:x:44:root,ollama
render:x:992:root,ollama

And the devices are set accordingly:

1756673590896.png

Also, to be sure that I'm using the correct GPU:

Code:
/dev/dri/by-path/:
total 0
lrwxrwxrwx 1 root root  8 Aug 31 22:36 pci-0000:03:00.0-card -> ../card0
lrwxrwxrwx 1 root root 13 Aug 31 22:36 pci-0000:03:00.0-render -> ../renderD128
lrwxrwxrwx 1 root root  8 Aug 31 22:13 pci-0000:7a:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 Aug 31 22:13 pci-0000:7a:00.0-render -> ../renderD129
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 48 [RX 9070/9070 XT] (rev c0)
7a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Granite Ridge [Radeon Graphics] (rev c5)

I'm using ROCm version 6.4, as suggested by the installation instructions.

I've set all the devices as mentioned, although I am unsure whether I need to pass through both the iGPU and the dGPU. I've also tried passing through both, without any success.
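
To pass through only one of the two GPUs, the render node that belongs to a given PCI address can be resolved from the by-path links shown above. A minimal sketch; `render_node_for` is a hypothetical helper:

```bash
# Resolve the renderD* node for a PCI address via /dev/dri/by-path.
# Arguments: PCI address (e.g. 0000:03:00.0), optional by-path directory.
render_node_for() {
    local addr="$1"
    local dir="${2:-/dev/dri/by-path}"
    readlink -f "$dir/pci-$addr-render"
}

# Example, on the host (03:00.0 is the RX 9070 XT in the listing above):
#   render_node_for 0000:03:00.0
```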

Additionally, I tried with and without the HSA override, but it made no difference.

I guess overall it's a kernel issue rather than a permissions problem, even though the error message might be misleading here.

Thanks for all the help though!

Hope to get this running soon.
 
These are the outputs:

Code:
root@gpu:~# rocminfo
ROCk module is loaded
Unable to open /dev/kfd read-write: Invalid argument
root is member of render group

After some research, it still seems to be a permission problem.

The /dev/kfd does exist on the host:

Code:
➜  ~ ls -la /dev/kfd
crw-rw-rw- 1 root render 235, 0 Aug 31 22:36 /dev/kfd

Output of nvtop

Code:
 Device 0 [AMD Radeon Graphics] PCIe GEN 5@16x RX: N/A TX: N/A                 Device 1 [AMD Radeon Graphics] Integrated GPU RX: N/A TX: N/A
 GPU 4MHz    MEM 772MHz  TEMP  29°C  FAN   0%   POW  38 / 330 W                GPU 600MHz  MEM 2800MHz TEMP  40°C   CPU-FAN   POW   0 W
 GPU[                               0%] MEM[|                0.507Gi/15.922Gi] GPU[                               0%] MEM[                  0.015Gi/2.000Gi]
   [GPU0 and GPU1 utilization/memory graphs over the last 138 s omitted: both flat at 0%, apart from a short GPU0 spike to about 50% at the very end]
    PID USER DEV     TYPE  GPU        GPU MEM    CPU  HOST MEM Command
1004603 root   1  Graphic N/A       0MiB   0%     0%      5MiB nvtop

Output of dmesg | grep amdgpu

Code:
➜  ~ dmesg | grep amdgpu
[43443.107205] amdgpu 0000:03:00.0: amdgpu: PCIE GART of 512M enabled (table at 0x00000083DAB00000).
[43443.107232] amdgpu 0000:03:00.0: amdgpu: PSP is resuming...
[43443.308877] amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
[43443.308882] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[43443.308884] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[43443.308886] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000002e, smu fw if version = 0x00000032, smu fw program = 0, smu fw version = 0x00684a00 (104.74.0)
[43443.308889] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[43443.330555] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
[43443.330711] amdgpu 0000:03:00.0: amdgpu: program CP_MES_CNTL : 0x4000000
[43443.330713] amdgpu 0000:03:00.0: amdgpu: program CP_MES_CNTL : 0xc000000
[43443.346974] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
[43443.346979] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[43443.346980] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[43443.346982] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[43443.346983] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 6 on hub 0
[43443.346984] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 7 on hub 0
[43443.346985] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 8 on hub 0
[43443.346986] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 9 on hub 0
[43443.346987] amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[43443.346988] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[43443.350330] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes


Code:
➜  ~ apt list | grep "amdgpu"

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

amdgpu-core/noble 1:6.2.60204-2070768.24.04 all
amdgpu-dkms-firmware/noble 1:6.8.5.60204-2070768.24.04 all
amdgpu-dkms-headers/noble 1:6.8.5.60204-2070768.24.04 all
amdgpu-dkms/noble 1:6.8.5.60204-2070768.24.04 all
amdgpu-doc/noble 1:6.2-2070768.24.04 all
amdgpu-install/noble,now 6.2.60204-2070768.24.04 all [installed]
amdgpu-lib32/noble 1:6.2.60204-2070768.24.04 amd64
amdgpu-lib/noble 1:6.2.60204-2070768.24.04 amd64
amdgpu/noble 1:6.2.60204-2070768.24.04 amd64
gst-omx-amdgpu/noble 1:1.0.0.1.60204-2070768.24.04 amd64
libdrm-amdgpu-amdgpu1/noble 1:2.4.120.60204-2070768.24.04 amd64
libdrm-amdgpu-common/noble 1.0.0.60204-2070768.24.04 all
libdrm-amdgpu-dev/noble 1:2.4.120.60204-2070768.24.04 amd64
libdrm-amdgpu-radeon1/noble 1:2.4.120.60204-2070768.24.04 amd64
libdrm-amdgpu-static/noble 1:2.4.120.60204-2070768.24.04 amd64
libdrm-amdgpu-utils/noble 1:2.4.120.60204-2070768.24.04 amd64
libdrm-amdgpu1/stable,now 2.4.124-2 amd64 [installed]
libdrm2-amdgpu/noble 1:2.4.120.60204-2070768.24.04 amd64
libegl1-amdgpu-mesa-dev/noble 1:24.2.0.60204-2070768.24.04 amd64
libegl1-amdgpu-mesa-drivers/noble 1:24.2.0.60204-2070768.24.04 amd64
libegl1-amdgpu-mesa/noble 1:24.2.0.60204-2070768.24.04 amd64
libgbm-amdgpu-dev/noble 1:24.2.0.60204-2070768.24.04 amd64
libgbm1-amdgpu/noble 1:24.2.0.60204-2070768.24.04 amd64
libgl1-amdgpu-mesa-dev/noble 1:24.2.0.60204-2070768.24.04 amd64
libgl1-amdgpu-mesa-dri/noble 1:24.2.0.60204-2070768.24.04 amd64
libgl1-amdgpu-mesa-glx/noble 1:24.2.0.60204-2070768.24.04 amd64
libglapi-amdgpu-mesa/noble 1:24.2.0.60204-2070768.24.04 amd64
libllvm18.1-amdgpu/noble 1:18.1.60204-2070768.24.04 amd64
libvdpau-amdgpu-dev/noble 6.2-2070768.24.04 amd64
libvdpau-amdgpu-doc/noble 6.2-2070768.24.04 all
libvdpau1-amdgpu/noble 6.2-2070768.24.04 amd64
libwayland-amdgpu-bin/noble 1.22.0.60204-2070768.24.04 amd64
libwayland-amdgpu-client0/noble 1.22.0.60204-2070768.24.04 amd64
libwayland-amdgpu-cursor0/noble 1.22.0.60204-2070768.24.04 amd64
libwayland-amdgpu-dev/noble 1.22.0.60204-2070768.24.04 amd64
libwayland-amdgpu-doc/noble 1.22.0.60204-2070768.24.04 all
libwayland-amdgpu-egl-backend-dev/noble 1.22.0.60204-2070768.24.04 amd64
libwayland-amdgpu-egl1/noble 1.22.0.60204-2070768.24.04 amd64
libwayland-amdgpu-server0/noble 1.22.0.60204-2070768.24.04 amd64
libxatracker-amdgpu-dev/noble 1:24.2.0.60204-2070768.24.04 amd64
libxatracker2-amdgpu/noble 1:24.2.0.60204-2070768.24.04 amd64
llvm-amdgpu-18.1-dev/noble 1:18.1.60204-2070768.24.04 amd64
llvm-amdgpu-18.1-runtime/noble 1:18.1.60204-2070768.24.04 amd64
llvm-amdgpu-18.1/noble 1:18.1.60204-2070768.24.04 amd64
llvm-amdgpu-dev/noble 1:18.1.60204-2070768.24.04 amd64
llvm-amdgpu-runtime/noble 1:18.1.60204-2070768.24.04 amd64
llvm-amdgpu/noble 1:18.1.60204-2070768.24.04 amd64
mesa-amdgpu-common-dev/noble 1:24.2.0.60204-2070768.24.04 amd64
mesa-amdgpu-multimedia/noble 1:24.2.0.60204-2070768.24.04 amd64
mesa-amdgpu-omx-drivers/noble 1:24.2.0.60204-2070768.24.04 amd64
mesa-amdgpu-va-drivers/noble 1:24.2.0.60204-2070768.24.04 amd64
mesa-amdgpu-vdpau-drivers/noble 1:24.2.0.60204-2070768.24.04 amd64
ricks-amdgpu-utils/stable 3.9.0-1 all
wayland-protocols-amdgpu/noble 1.34.60204-2070768.24.04 all
xserver-xorg-amdgpu-video-amdgpu/noble 1:22.0.0.60204-2070768.24.04 amd64
xserver-xorg-video-amdgpu/stable 23.0.0-1 amd64

I assume that there is no need to install anything on the host as the amdgpu is already integrated in the kernel, right?

Thanks a lot!
 
no output has been shown with dmesg | grep amdgpu
Uh....that should look like this:
Code:
[   14.897558] [drm] amdgpu kernel modesetting enabled.
[   14.897995] amdgpu: vga_switcheroo: detected switching method \_SB_.PCI0.GP17.VGA_.ATPX handle
[   14.898522] amdgpu: ATPX version 1, functions 0x00000000
[   14.908299] amdgpu: Virtual CRAT table created for CPU
[   14.908659] amdgpu: Topology: Add CPU node
[   14.909116] amdgpu 0000:0c:00.0: enabling device (0000 -> 0002)
[   14.910235] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 0 <soc15_common>
[   14.910471] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 1 <gmc_v9_0>
[   14.910659] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 2 <vega20_ih>
[   14.910849] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 3 <psp>
[   14.911060] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 4 <powerplay>
[   14.911266] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 5 <dm>
[   14.911472] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 6 <gfx_v9_0>
[   14.911669] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 7 <sdma_v4_0>
[   14.911841] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 8 <uvd_v7_0>
[   14.912034] amdgpu 0000:0c:00.0: amdgpu: detected ip block number 9 <vce_v4_0>
[   14.912240] amdgpu 0000:0c:00.0: amdgpu: ACPI VFCT table present but broken (too short #2),skipping
[   14.913398] amdgpu 0000:0c:00.0: amdgpu: Fetched VBIOS from platform
[   14.913713] amdgpu: ATOM BIOS: 113-D1631711-100
[   14.948239] amdgpu 0000:0c:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[   14.948451] amdgpu 0000:0c:00.0: amdgpu: PCIE atomic ops is not supported
[   14.949315] amdgpu 0000:0c:00.0: amdgpu: MEM ECC is active.
[   14.949472] amdgpu 0000:0c:00.0: amdgpu: SRAM ECC is active.
[   14.949767] amdgpu 0000:0c:00.0: amdgpu: RAS INFO: ras initialized successfully, hardware ability[67f7f] ras_mask[67f7f]
[   14.950555] amdgpu 0000:0c:00.0: BAR 2 [mem 0xfc00000000-0xfc001fffff 64bit pref]: releasing
[   14.950750] amdgpu 0000:0c:00.0: BAR 0 [mem 0xf800000000-0xfbffffffff 64bit pref]: releasing
[   14.951068] amdgpu 0000:0c:00.0: BAR 0 [mem 0xf800000000-0xfbffffffff 64bit pref]: assigned
[   14.951250] amdgpu 0000:0c:00.0: BAR 2 [mem 0xfc00000000-0xfc001fffff 64bit pref]: assigned
[   14.951422] amdgpu 0000:0c:00.0: amdgpu: VRAM: 32752M 0x0000008000000000 - 0x00000087FEFFFFFF (32752M used)
[   14.951576] amdgpu 0000:0c:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[   14.952305] [drm] amdgpu: 32752M of VRAM memory ready
[   14.952466] [drm] amdgpu: 63979M of GTT memory ready.
[   14.953849] amdgpu: hwmgr_sw_init smu backed is vega20_smu
[   15.135098] amdgpu 0000:0c:00.0: amdgpu: reserve 0x400000 from 0x87fec00000 for PSP TMR
[   15.220694] amdgpu 0000:0c:00.0: amdgpu: RAP: optional rap ta ucode is not available
[   15.560873] amdgpu: HMM registered 32752MB device memory
[   15.562489] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[   15.562842] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[   15.563366] amdgpu: Virtual CRAT table created for GPU
[   15.563838] amdgpu: Topology: Add dGPU node [0x66a1:0x1002]
[   15.564034] kfd kfd: amdgpu: added device 1002:66a1
[   15.578256] amdgpu 0000:0c:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 16, active_cu_number 60
[   15.578465] amdgpu 0000:0c:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[   15.578706] amdgpu 0000:0c:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[   15.578928] amdgpu 0000:0c:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[   15.579128] amdgpu 0000:0c:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[   15.579327] amdgpu 0000:0c:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[   15.579528] amdgpu 0000:0c:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[   15.579678] amdgpu 0000:0c:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[   15.579819] amdgpu 0000:0c:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[   15.579960] amdgpu 0000:0c:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[   15.580100] amdgpu 0000:0c:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
[   15.580239] amdgpu 0000:0c:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 8
[   15.580376] amdgpu 0000:0c:00.0: amdgpu: ring page0 uses VM inv eng 1 on hub 8
[   15.580518] amdgpu 0000:0c:00.0: amdgpu: ring sdma1 uses VM inv eng 4 on hub 8
[   15.580655] amdgpu 0000:0c:00.0: amdgpu: ring page1 uses VM inv eng 5 on hub 8
[   15.580783] amdgpu 0000:0c:00.0: amdgpu: ring uvd_0 uses VM inv eng 6 on hub 8
[   15.580909] amdgpu 0000:0c:00.0: amdgpu: ring uvd_enc_0.0 uses VM inv eng 7 on hub 8
[   15.581033] amdgpu 0000:0c:00.0: amdgpu: ring uvd_enc_0.1 uses VM inv eng 8 on hub 8
[   15.581155] amdgpu 0000:0c:00.0: amdgpu: ring uvd_1 uses VM inv eng 9 on hub 8
[   15.581276] amdgpu 0000:0c:00.0: amdgpu: ring uvd_enc_1.0 uses VM inv eng 10 on hub 8
[   15.581404] amdgpu 0000:0c:00.0: amdgpu: ring uvd_enc_1.1 uses VM inv eng 11 on hub 8
[   15.581530] amdgpu 0000:0c:00.0: amdgpu: ring vce0 uses VM inv eng 12 on hub 8
[   15.581656] amdgpu 0000:0c:00.0: amdgpu: ring vce1 uses VM inv eng 13 on hub 8
[   15.581776] amdgpu 0000:0c:00.0: amdgpu: ring vce2 uses VM inv eng 14 on hub 8
[   15.593300] amdgpu: Detected AMDGPU DF Counters. # of Counters = 8.
[   15.593524] amdgpu: Detected AMDGPU 2 Perf Events.
[   15.594196] amdgpu 0000:0c:00.0: amdgpu: Runtime PM not available
[   15.594640] amdgpu 0000:0c:00.0: [drm] Registered 6 planes with drm panic
[   15.594811] [drm] Initialized amdgpu 3.61.0 for 0000:0c:00.0 on minor 1
[   15.597977] amdgpu 0000:13:00.0: enabling device (0006 -> 0007)
[   15.600548] amdgpu 0000:13:00.0: amdgpu: detected ip block number 0 <nv_common>
[   15.600679] amdgpu 0000:13:00.0: amdgpu: detected ip block number 1 <gmc_v10_0>
[   15.600805] amdgpu 0000:13:00.0: amdgpu: detected ip block number 2 <navi10_ih>
[   15.600928] amdgpu 0000:13:00.0: amdgpu: detected ip block number 3 <psp>
[   15.601049] amdgpu 0000:13:00.0: amdgpu: detected ip block number 4 <smu>
[   15.601167] amdgpu 0000:13:00.0: amdgpu: detected ip block number 5 <dm>
[   15.601284] amdgpu 0000:13:00.0: amdgpu: detected ip block number 6 <gfx_v10_0>
[   15.601399] amdgpu 0000:13:00.0: amdgpu: detected ip block number 7 <sdma_v5_2>
[   15.601516] amdgpu 0000:13:00.0: amdgpu: detected ip block number 8 <vcn_v3_0>
[   15.601630] amdgpu 0000:13:00.0: amdgpu: detected ip block number 9 <jpeg_v3_0>
[   15.601873] amdgpu 0000:13:00.0: amdgpu: Fetched VBIOS from VFCT
[   15.602103] amdgpu: ATOM BIOS: 102-RAPHAEL-008
[   15.669623] amdgpu 0000:13:00.0: vgaarb: deactivate vga console
[   15.669628] amdgpu 0000:13:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[   15.669666] amdgpu 0000:13:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[   15.669671] amdgpu 0000:13:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[   15.669753] [drm] amdgpu: 512M of VRAM memory ready
[   15.669756] [drm] amdgpu: 63979M of GTT memory ready.
[   15.692535] amdgpu 0000:13:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
[   15.753532] amdgpu 0000:13:00.0: amdgpu: RAS: optional ras ta ucode is not available
[   15.759357] amdgpu 0000:13:00.0: amdgpu: RAP: optional rap ta ucode is not available
[   15.759361] amdgpu 0000:13:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[   15.760960] amdgpu 0000:13:00.0: amdgpu: SMU is initialized successfully!
[   15.763021] snd_hda_intel 0000:13:00.1: bound 0000:13:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[   15.791781] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[   15.791793] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[   15.791998] amdgpu: Virtual CRAT table created for GPU
[   15.792858] amdgpu: Topology: Add dGPU node [0x164e:0x1002]
[   15.792862] kfd kfd: amdgpu: added device 1002:164e
[   15.792871] amdgpu 0000:13:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 2, active_cu_number 2
[   15.792876] amdgpu 0000:13:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[   15.792879] amdgpu 0000:13:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
[   15.792882] amdgpu 0000:13:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
[   15.792886] amdgpu 0000:13:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
[   15.792889] amdgpu 0000:13:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[   15.792892] amdgpu 0000:13:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[   15.792896] amdgpu 0000:13:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[   15.792899] amdgpu 0000:13:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[   15.792902] amdgpu 0000:13:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[   15.792905] amdgpu 0000:13:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[   15.792909] amdgpu 0000:13:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
[   15.792912] amdgpu 0000:13:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
[   15.792915] amdgpu 0000:13:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[   15.792918] amdgpu 0000:13:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[   15.792922] amdgpu 0000:13:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[   15.792925] amdgpu 0000:13:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[   15.793181] amdgpu 0000:13:00.0: amdgpu: Runtime PM not available
[   15.793461] amdgpu 0000:13:00.0: [drm] Registered 4 planes with drm panic
[   15.793465] [drm] Initialized amdgpu 3.61.0 for 0000:13:00.0 on minor 2
[   15.797206] fbcon: amdgpudrmfb (fb0) is primary device
[   15.962407] amdgpu 0000:13:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 6009.779790] amdgpu 0000:0c:00.0: amdgpu: Failed to send message 0x2d, response 0x0
[ 6009.780136] amdgpu: [powerplay] [GetCurrentClkFreq] Attempt to get Current Frequency Failed!

I assume that there is no need to install anything on the host as the amdgpu is already integrated in the kernel, right?
Yes, but then you should see something in dmesg.

After research it seems that it's still a permission problem.
Yes, your user inside the LXC must be a member of the groups that grant access to the device nodes. They are render and video.
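
A quick way to check the group membership from inside the container is sketched below. The helper name missing_gpu_groups is made up for this example; it only compares a list of group names (such as the output of id -nG) against the two groups named above.

```shell
# Print the GPU groups (render, video) that are absent from a
# space-separated list of group names, e.g. the output of `id -nG`.
missing_gpu_groups() {
    for want in render video; do
        case " $1 " in
            *" $want "*) ;;                 # group is present
            *) printf '%s\n' "$want" ;;     # group is missing
        esac
    done
}

# Usage: no output means the current user is in both groups.
missing_gpu_groups "$(id -nG)"
```

Run it as the user that will start the workload, not as root, since the result depends on who is logged in.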
 
Hi!

I have updated my previous post with the amdgpu output.

The user is in the group already:

Code:
root@gpu:~# id
uid=0(root) gid=0(root) groups=0(root),44(video),989(ollama),992(render)

root@gpu:~# groups
root video ollama render

But funnily enough, I can successfully passthrough this GPU to a Windows VM and it works flawlessly.
 
The user is in the group already:
And do you start ollama while logged in as the ollama user?

But funnily enough, I can successfully passthrough this GPU to a Windows VM and it works flawlessly.
This is not the same. That is exclusive passthrough: the GPU is "gone from the host" and available to only one VM at a time.
Passthrough into an LXC shares the GPU with the host and/or with multiple LXCs at the same time.
 
Hi!

Thanks for your answer.

But don't we need to assign /dev/kfd to the video group rather than to ollama? From my perspective, the GID of video is 44 and not 989.

But nevertheless, I've also tried with other gids (989 ollama for /dev/kfd) and it results in the same problems:

Code:
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.218Z level=INFO source=routes.go:1384 msg="Listening on [::]:11434 (version 0.11.8)"
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.218Z level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.221Z level=WARN source=amd_linux.go:61 msg="ollama recommends running the https://www.amd.com/en/support/download/linux-drivers.html" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/ve>
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.222Z level=WARN source=amd_linux.go:379 msg="amdgpu is not supported (supported types:[gfx1010 gfx1012 gfx1030 gfx1100 gfx1101 gfx1102 gfx1151 gfx1200 gfx1201 gfx900 gfx906 gfx908 gfx90a gfx942])" gpu_type=gfx1036 gpu=0 lib>
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.222Z level=WARN source=amd_linux.go:386 msg="See https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides for HSA_OVERRIDE_GFX_VERSION usage"
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.222Z level=INFO source=amd_linux.go:389 msg="amdgpu is supported" gpu=GPU-2f987739463f9020 gpu_type=gfx1201
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.222Z level=ERROR source=amd_linux.go:410 msg="amdgpu devices detected but permission problems block access: failed to check permission on /dev/kfd: open /dev/kfd: invalid argument"
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.222Z level=INFO source=gpu.go:379 msg="no compatible GPUs were discovered"
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.222Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="16.0 GiB" available="15.9 GiB"
Sep 01 08:34:33 gpu ollama[519]: time=2025-09-01T08:34:33.222Z level=INFO source=routes.go:1425 msg="entering low vram mode" "total vram"="16.0 GiB" threshold="20.0 GiB"
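
To make the relevant records easier to spot, a small filter like the following can reduce the log to its WARN and ERROR lines. This is only a sketch: the function name gpu_log_problems is made up, and the unit name ollama in the usage comment is an assumption based on the log prefix above.

```shell
# Keep only WARN and ERROR records from ollama log text read on stdin.
# `|| true` keeps the exit status at zero when nothing matches.
gpu_log_problems() {
    grep -E 'level=(WARN|ERROR)' || true
}

# Usage (assumes the service runs as a systemd unit called "ollama"):
#   journalctl -u ollama -b --no-pager | gpu_log_problems
```

The lines ending in ">" above were cut off by the pager; with --no-pager journalctl should print them in full, which would show the complete error text.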

Code:
root@gpu:~# rocminfo
ROCk module is loaded
Unable to open /dev/kfd read-write: Invalid argument
root is member of ollama group
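
Since the error is "Invalid argument" rather than "Permission denied", it may be worth confirming what the device nodes look like from inside the container. A minimal sketch, with a made-up helper name node_status; the renderD128 path is taken from the first post and may differ on other systems:

```shell
# Classify a device node: missing, not-char-device, no-rw, or ok.
node_status() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -c "$1" ]; then
        echo "not-char-device"
    elif [ ! -r "$1" ] || [ ! -w "$1" ]; then
        echo "no-rw"
    else
        echo "ok"
    fi
}

# Report the state of the nodes that ROCm needs.
for node in /dev/kfd /dev/dri/renderD128; do
    printf '%s: %s\n' "$node" "$(node_status "$node")"
done
```

Note that "ok" only reflects the file type and mode bits. If the node reports "ok" and opening it still fails, the cause is probably somewhere other than plain file permissions.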
 
From my perspective, the GID of video is 44 and not 989.

But nevertheless, I've also tried with other gids (989 ollama for /dev/kfd) and it results in the same problems:
Yes...forget it. I had a short stroke. :D

https://forum.proxmox.com/attachments/1756673590896-png.90107/ <- this is correct.

root is member of ollama group
This should not be the case. You should use a separate user to start ollama inside this LXC (without root privileges, but in the groups video and render). The problem here is likely that you use the VNC console from Proxmox to get into this LXC, which logs you in as root automatically.
Log in as root, add your user with useradd -m -G wheel,users,render,video -s /bin/bash testuser1 (that is for Arch, I don't know Ubuntu well!), then SSH into the LXC as testuser1 and start ollama as this user.
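
Because useradd refuses to run when one of the listed groups does not exist (wheel is an Arch convention), a dry run like this can build the group list from what is actually present. It is a sketch only; existing_groups is a hypothetical helper and testuser1 is the example name from above.

```shell
# Join the names given after the first argument that exist in the
# group file passed as the first argument, separated by commas.
existing_groups() {
    group_file=$1
    shift
    joined=""
    for g in "$@"; do
        if grep -q "^$g:" "$group_file" 2>/dev/null; then
            joined="${joined:+$joined,}$g"
        fi
    done
    printf '%s\n' "$joined"
}

# Dry run: print the command instead of executing it.
echo "useradd -m -G $(existing_groups /etc/group wheel users render video) -s /bin/bash testuser1"
```

Check the printed command, then run it as root if the group list looks right.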
 
Haha all good :D

Okay, I can test it by creating a new ollama user and assigning it to the groups. But rocminfo should also output some information if the GPU passthrough works correctly, even when root is added to the video and render groups, am I right?

Thanks a lot for your efforts and short strokes :D
 
But rocminfo should also output some information if the GPU passthrough works correctly, even when root is added to the video and render groups, am I right?
Yes, if this works I'm sure ollama will start too! Edit: I checked it... rocminfo also works for the root user.

rocminfo as regular user:
Code:
ROCk module is loaded
=====================  
HSA System Attributes  
=====================  
Runtime Version:         1.15
Runtime Ext Version:     1.7
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                            
System Endianness:       LITTLE                            
Mwaitx:                  DISABLED
XNACK enabled:           NO
DMAbuf Support:          YES
VMM Support:             YES

==========              
HSA Agents              
==========              
*******                
Agent 1                
*******                
  Name:                    AMD Ryzen 9 7950X 16-Core Processor
  Uuid:                    CPU-XX                            
  Marketing Name:          AMD Ryzen 9 7950X 16-Core Processor
  Vendor Name:             CPU                              
  Feature:                 None specified                    
  Profile:                 FULL_PROFILE                      
  Float Round Mode:        NEAR                              
  Max Queue Number:        0(0x0)                            
  Queue Min Size:          0(0x0)                            
  Queue Max Size:          0(0x0)                            
  Queue Type:              MULTI                            
  Node:                    0                                
  Device Type:             CPU                              
  Cache Info:            
    L1:                      32768(0x8000) KB                  
  Chip ID:                 0(0x0)                            
  ASIC Revision:           0(0x0)                            
  Cacheline Size:          64(0x40)                          
  Max Clock Freq. (MHz):   5883                              
  BDFID:                   0                                
  Internal Node ID:        0                                
  Compute Unit:            32                                
  SIMDs per CU:            0                                
  Shader Engines:          0                                
  Shader Arrs. per Eng.:   0                                
  WatchPts on Addr. Ranges:1                                
  Memory Properties:      
  Features:                None
  Pool Info:              
    Pool 1                  
      Segment:                 GLOBAL; FLAGS: FINE GRAINED      
      Size:                    131030852(0x7cf5f44) KB          
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                              
      Alloc Recommended Granule:4KB                              
      Alloc Alignment:         4KB                              
      Accessible by all:       TRUE                              
    Pool 2                  
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    131030852(0x7cf5f44) KB          
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                              
      Alloc Recommended Granule:4KB                              
      Alloc Alignment:         4KB                              
      Accessible by all:       TRUE                              
    Pool 3                  
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    131030852(0x7cf5f44) KB          
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                              
      Alloc Recommended Granule:4KB                              
      Alloc Alignment:         4KB                              
      Accessible by all:       TRUE                              
    Pool 4                  
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED    
      Size:                    131030852(0x7cf5f44) KB          
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                              
      Alloc Recommended Granule:4KB                              
      Alloc Alignment:         4KB                              
      Accessible by all:       TRUE                              
  ISA Info:              
*******                
Agent 2                
*******                
  Name:                    gfx906                            
  Uuid:                    GPU-216c78a173497dd3              
  Marketing Name:          AMD Radeon Graphics              
  Vendor Name:             AMD                              
  Feature:                 KERNEL_DISPATCH                  
  Profile:                 BASE_PROFILE                      
  Float Round Mode:        NEAR                              
  Max Queue Number:        128(0x80)                        
  Queue Min Size:          64(0x40)                          
  Queue Max Size:          131072(0x20000)                  
  Queue Type:              MULTI                            
  Node:                    1                                
  Device Type:             GPU                              
  Cache Info:            
    L1:                      16(0x10) KB                      
    L2:                      8192(0x2000) KB                  
  Chip ID:                 26273(0x66a1)                    
  ASIC Revision:           1(0x1)                            
  Cacheline Size:          64(0x40)                          
  Max Clock Freq. (MHz):   1725                              
  BDFID:                   3072                              
  Internal Node ID:        1                                
  Compute Unit:            60                                
  SIMDs per CU:            4                                
  Shader Engines:          4                                
  Shader Arrs. per Eng.:   1                                
  WatchPts on Addr. Ranges:4                                
  Coherent Host Access:    FALSE                            
  Memory Properties:      
  Features:                KERNEL_DISPATCH
  Fast F16 Operation:      TRUE                              
  Wavefront Size:          64(0x40)                          
  Workgroup Max Size:      1024(0x400)                      
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                      
    y                        1024(0x400)                      
    z                        1024(0x400)                      
  Max Waves Per CU:        40(0x28)                          
  Max Work-item Per CU:    2560(0xa00)                      
  Grid Max Size:           4294967295(0xffffffff)            
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)            
    y                        4294967295(0xffffffff)            
    z                        4294967295(0xffffffff)            
  Max fbarriers/Workgrp:   32                                
  Packet Processor uCode:: 472                              
  SDMA engine uCode::      145                              
  IOMMU Support::          None                              
  Pool Info:              
    Pool 1                  
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED    
      Size:                    33538048(0x1ffc000) KB            
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                              
      Alloc Recommended Granule:2048KB                            
      Alloc Alignment:         4KB                              
      Accessible by all:       FALSE                            
    Pool 2                  
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    33538048(0x1ffc000) KB            
      Allocatable:             TRUE                              
      Alloc Granule:           4KB                              
      Alloc Recommended Granule:2048KB                            
      Alloc Alignment:         4KB                              
      Accessible by all:       FALSE                            
    Pool 3                  
      Segment:                 GROUP                            
      Size:                    64(0x40) KB                      
      Allocatable:             FALSE                            
      Alloc Granule:           0KB                              
      Alloc Recommended Granule:0KB                              
      Alloc Alignment:         0KB                              
      Accessible by all:       FALSE                            
  ISA Info:              
    ISA 1                  
      Name:                    amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-
      Machine Models:          HSA_MACHINE_MODEL_LARGE          
      Profiles:                HSA_PROFILE_BASE                  
      Default Rounding Mode:   NEAR                              
      Default Rounding Mode:   NEAR                              
      Fast f16:                TRUE                              
      Workgroup Max Size:      1024(0x400)                      
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                      
        y                        1024(0x400)                      
        z                        1024(0x400)                      
      Grid Max Size:           4294967295(0xffffffff)            
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)            
        y                        4294967295(0xffffffff)            
        z                        4294967295(0xffffffff)            
      FBarrier Max Size:       32                                
    ISA 2                  
      Name:                    amdgcn-amd-amdhsa--gfx9-generic:sramecc+:xnack-
      Machine Models:          HSA_MACHINE_MODEL_LARGE          
      Profiles:                HSA_PROFILE_BASE                  
      Default Rounding Mode:   NEAR                              
      Default Rounding Mode:   NEAR                              
      Fast f16:                TRUE                              
      Workgroup Max Size:      1024(0x400)                      
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                      
        y                        1024(0x400)                      
        z                        1024(0x400)                      
      Grid Max Size:           4294967295(0xffffffff)            
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)            
        y                        4294967295(0xffffffff)            
        z                        4294967295(0xffffffff)            
      FBarrier Max Size:       32                                
*** Done ***
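
Once rocminfo produces output like this, the GPU target can be pulled out and compared with the list of supported types that ollama prints in its log. A minimal sketch; the function name gpu_targets is made up, and it assumes the agent name appears on a "Name:" line starting with gfx, as in the output above.

```shell
# Read rocminfo output on stdin and print the unique gfx agent names.
# Only lines whose first field is exactly "Name:" are considered, so
# "Marketing Name:" and the long ISA names are skipped.
gpu_targets() {
    awk '$1 == "Name:" && $2 ~ /^gfx/ { print $2 }' | sort -u
}

# Usage:
#   rocminfo | gpu_targets
```

For the output above this would print gfx906.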
 
Hmm, but then there is still some misconfiguration or permission issue. Is it because the current kernel doesn't support it yet?

Otherwise, we have tried everything and still have no success with ROCm...

EDIT: I never use the VNC console; I always use plain SSH access to the LXC.