Tutorial: Run LLMs using AMD GPU and ROCm in unprivileged LXC container

Quick and dirty snippet for an Arch container:
Code:
nano /etc/pacman.d/mirrorlist   # uncomment 2-3 servers (https only!)
pacman-key --init && pacman-key --populate archlinux && pacman-key --refresh-keys
pacman -Syu openssh sudo && systemctl restart sshd && systemctl enable sshd
visudo
################
## Uncomment to allow members of group wheel to execute any command
%wheel ALL=(ALL:ALL) ALL      # remove the leading "#" (x in vi), then save with :wq

useradd -m -G wheel,users,render,video -s /bin/bash archc01
passwd archc01

*from here on via SSH*

sudo pacman -S --needed base-devel git rocminfo
git clone https://aur.archlinux.org/yay.git && cd yay && makepkg -si && yay --version
yay python312

on the host:
ls -l /dev/dri/by-path/ /dev/kfd /dev/dri && lspci -d ::03xx

/dev/kfd -> map to the render GID of the LXC container
/dev/dri/renderD128 -> map to the video GID of the LXC container
/dev/dri/renderD129 -> map to the video GID of the LXC container

in the container:
grep -w 'render\|video' /etc/group

shut down the container, set up the passthrough, then start it again
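
The GID lookup and the device mapping above can be sketched as two small shell helpers. This is only a sketch under assumptions: it assumes a Proxmox VE host (8.1 or later) where container device entries take the form `devN: <path>,gid=<gid>`, and the helper names `gid_of` and `passthrough_lines` are made up for illustration. The GIDs in the usage comments are example values, not ones from this setup.

```bash
#!/bin/sh
# Print the numeric GID of a group from an /etc/group-style file.
# usage: gid_of <group> [file]      (run inside the container)
gid_of() {
    awk -F: -v g="$1" '$1 == g { print $3; exit }' "${2:-/etc/group}"
}

# Print device entries for the container config.
# ASSUMPTION: Proxmox VE "devN:" passthrough syntax; adjust for plain LXC.
# usage: passthrough_lines <render_gid> <video_gid>      (run on the host)
passthrough_lines() {
    printf 'dev0: /dev/kfd,gid=%s\n' "$1"
    printf 'dev1: /dev/dri/renderD128,gid=%s\n' "$2"
    printf 'dev2: /dev/dri/renderD129,gid=%s\n' "$2"
}

# Example (GIDs are placeholders, take the real ones from the container):
#   gid_of render          # inside the container
#   gid_of video           # inside the container
#   passthrough_lines 989 985
```

Review the printed lines before adding them to the container's config file; the device node names must match what `ls -l /dev/dri` shows on the host.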

git clone https://github.com/vladmandic/sdnext
cd sdnext
python3.12 -m venv venv
./webui.sh --listen --debug --auth "CreativeOne:p4sSw0rD" --use-rocm
or
./webui.sh --listen --debug --auth "CreativeOne:p4sSw0rD" --use-zluda

sit back and wait

browse to http://IP:7860

By the way, there is no need to set these variables; the GPU is autodetected and it just works.
Code:
HSA_OVERRIDE_GFX_VERSION=9.0.0
HSA_ENABLE_SDMA=0
I want to add my steps for an LXC container on an AMD 5825U:


**Installing ROCm:**

Download the `amdgpu-install` package (use the latest stable version, e.g., 6.3.4):
```bash
wget https://repo.radeon.com/amdgpu-install/6.3.4/ubuntu/noble/amdgpu-install_6.3.60304-1_all.deb
```

Install the package:
```bash
sudo apt install ./amdgpu-install_6.3.60304-1_all.deb
```

Install ROCm components without DKMS (critical for LXC):
```bash
sudo amdgpu-install --usecase=rocm --no-dkms
```

Install additional ROCm libraries:
```bash
sudo apt install rocm-hip-libraries rocm-dev rocm-core
```

Add the current user to the `render` and `video` groups:
```bash
sudo usermod -a -G render,video $LOGNAME
```

**Important:** Add the following environment variables to your `.profile` (or `.bashrc`) and restart the terminal session/container:
```bash
echo "export HSA_OVERRIDE_GFX_VERSION=9.0.0" >> ~/.profile
echo "export HSA_ENABLE_SDMA=0" >> ~/.profile
```
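
Running the two `echo` commands more than once leaves duplicate lines in `.profile`. A minimal sketch of an idempotent variant follows; the helper name `add_line_once` is hypothetical and not part of ROCm or any package.

```bash
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# usage: add_line_once <file> <line>
add_line_once() {
    file=$1
    line=$2
    touch "$file"    # make sure the file exists before grepping it
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage with the two variables from this post:
#   add_line_once ~/.profile 'export HSA_OVERRIDE_GFX_VERSION=9.0.0'
#   add_line_once ~/.profile 'export HSA_ENABLE_SDMA=0'
```

`grep -x` matches whole lines and `-F` treats the text literally, so a commented-out or slightly different line does not count as already present.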

Verify ROCm installation:
```bash
rocminfo
rocm-smi
```
*(Ensure `rocminfo` displays your gfxID, e.g., gfx90c)*
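
`rocminfo` prints a lot of text, so it can help to filter out just the gfx IDs. The following is a small sketch; `gfx_targets` is a made-up helper name, and it simply assumes the IDs appear in the output as `gfx` followed by hex digits.

```bash
#!/bin/sh
# Read rocminfo output on stdin and print each distinct gfx target once.
gfx_targets() {
    grep -oE 'gfx[0-9a-f]+' | sort -u
}

# usage: rocminfo | gfx_targets
```

If this prints nothing, the container most likely cannot see the GPU (check the device passthrough and group membership first).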

---

**Preparing to compile llama.cpp:**

Install build tools:
```bash
sudo apt update && sudo apt install build-essential cmake clang lld compiler-rt git
```

Clone the llama.cpp repository:
```bash
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
```

Set environment variables for the HIP compiler:
```bash
export HIPCXX="$(hipconfig -l)/clang"
export HIP_PATH="$(hipconfig -R)"
export HIP_DEVICE_LIB_PATH="$(find "${HIP_PATH}" -name "oclc_abi_version_400.bc" -exec dirname {} \; | head -n 1)"
```
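
If ROCm is only partly installed, these variables can end up empty or point to paths that do not exist, and CMake then fails later with a less obvious error. A quick sanity check might look like this; `check_hip_env` is a hypothetical helper, not something shipped with ROCm or llama.cpp.

```bash
#!/bin/sh
# Return 0 only if HIPCXX is an executable file and
# HIP_DEVICE_LIB_PATH is an existing directory.
check_hip_env() {
    if [ ! -x "${HIPCXX:-}" ]; then
        echo "HIPCXX is not an executable: '${HIPCXX:-}'" >&2
        return 1
    fi
    if [ ! -d "${HIP_DEVICE_LIB_PATH:-}" ]; then
        echo "HIP_DEVICE_LIB_PATH is not a directory: '${HIP_DEVICE_LIB_PATH:-}'" >&2
        return 1
    fi
    echo "HIP build environment looks usable"
}

# usage (after the three exports above): check_hip_env
```

This only checks that the paths exist; it does not prove that the compiler can actually target your GPU.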

---

**Compiling llama.cpp:**

Clean the build directory (if it exists):
```bash
rm -rf build
```

Configure CMake with your GPU architecture (`gfx90c` or `gfx900` for this APU) and enable UMA:
```bash
cmake -S . -B build \
  -DGGML_HIP=ON \
  -DAMDGPU_TARGETS=gfx900 \
  -DGGML_HIP_UMA=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DGGML_CCACHE=ON
```
*(Note: `gfx900` was the target ID that worked for me; it matches `HSA_OVERRIDE_GFX_VERSION=9.0.0` set earlier)*

Compile llama.cpp:
```bash
cmake --build build --config Release -- -j $(nproc)
```

---

**Verification:**

Run llama-bench to test GPU acceleration:
```bash
cd build
./bin/llama-bench -m ../../llama-2-7b.Q4_0.gguf -ngl 100 -fa 0,1
```
*(Verify that llama-bench runs successfully and shows GPU utilization)*
 