[SOLVED] Ubuntu 22.04 + Ollama + nvidia 3060, gpu passthrough and drivers all looking good - but...

p3ter_b

New Member
Mar 20, 2024
I have followed (almost) all the instructions I've found here on the forums and elsewhere, and have PCI passthrough set up for my GeForce RTX 3060.
The Xubuntu 22.04 VM says it's happily running the NVIDIA CUDA drivers, but I can't get Ollama to make use of the card.

`nvtop` says: 0/0/0% - all waiting.

I don't know where else to look.
Any ideas?

Thanks in advance.
Hope this is the right place, since it feels like the issue may still be in the GPU forwarding...?
 
Inside the VM:

Code:
$ nvidia-smi

Thu Mar 28 23:39:22 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        Off |   00000000:00:10.0 Off |                  N/A |
|  0%   42C    P8              8W /  170W |      11MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                        
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      7950      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+


And nvtop:

Code:
Device 0 [NVIDIA GeForce RTX 3060] PCIe GEN 1@16x RX: 0.000 KiB/s TX: 0.000 KiB/s
 GPU 210MHz  MEM 405MHz  TEMP  42°C FAN   0% POW   9 / 170 W
 GPU[                                 0%] MEM[|                  0.256Gi/12.000Gi]
   ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
100│GPU0 %                                                                                                    │
   │GPU0 mem%                                                                                                 │
   │                                                                                                          │
 75│                                                                                                          │
   │                                                                                                          │
   │                                                                                                          │
 50│                                                                                                          │
   │                                                                                                          │
   │                                                                                                          │
 25│                                                                                                          │
   │                                                                                                          │
   │                                                                                                          │
  0│──────────────────────────────────────────────────────────────────────────────────────────────────────────│
   └──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
    PID USER DEV    TYPE  GPU        GPU MEM    CPU  HOST MEM Command                                         
   7950 root   0 Graphic   0%      4MiB   0%     4%    168MiB /usr/lib/xorg/Xorg -core :0 -seat seat0 -auth /va




F2Setup F6Sort F9Kill F10Quit F12Save Config
 
I found it!

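The VM's CPU type needed to be set to "host" instead of the default "x86-64-v2-AES". The default model does not expose AVX/AVX2 to the guest, and Ollama disables GPU support when it can't find them.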
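For anyone who prefers the CLI over the web UI, here is a minimal sketch of the same change using `qm` on the Proxmox host. The helper name `set_cpu_host` and the VM ID `100` are placeholders of mine, not something from my setup notes.

```shell
#!/bin/sh
# Run on the Proxmox host (not inside the VM).
# set_cpu_host is a hypothetical helper name; the VM ID is a placeholder.
set_cpu_host() {
    vmid="$1"
    # Switch the virtual CPU model to "host" so the guest sees the host's
    # real CPU flags, including AVX/AVX2 if the physical CPU has them.
    qm set "$vmid" --cpu host || return 1
    # Show the resulting CPU line from the VM configuration.
    qm config "$vmid" | grep '^cpu:'
}

# Example (100 is a placeholder VM ID):
#   set_cpu_host 100
```

As far as I know, the new CPU type only takes effect after a full stop and start of the VM; a reboot from inside the guest is not enough.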
Thanks to "journalctl -xeu ollama" for spitting out these error messages:
Code:
ollama[778]: time=2024-03-29T09:51:08.606+01:00 level=INFO source=gpu.go:120 msg="Nvidia GPU detected via cudart"
ollama[778]: time=2024-03-29T09:51:08.606+01:00 level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
ollama[778]: time=2024-03-29T09:51:08.607+01:00 level=WARN source=gpu.go:151 msg="CPU does not have AVX or AVX2, disabling GPU support."
ollama[778]: time=2024-03-29T09:51:08.607+01:00 level=INFO source=routes.go:1141 msg="no GPU detected"

...and a Proxmox forum thread on "AVX2 and AVX flags" with hints towards the VM-CPU type.
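A quick way to check this from inside the VM, without waiting for Ollama to complain, is to look at the CPU flags the guest actually sees. This is only a sketch: `cpu_flag_present` is a helper name I made up, and it assumes a Linux guest with `/proc/cpuinfo`.

```shell
#!/bin/sh
# Report whether the CPU visible to this (guest) system advertises AVX/AVX2.
# cpu_flag_present is a hypothetical helper, not part of any tool.
cpu_flag_present() {
    # $1 = flag name, $2 = cpuinfo file (defaults to /proc/cpuinfo)
    # -w matches whole words only, so "avx" does not match "avx2".
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" | grep -qw "$1"
}

for flag in avx avx2; do
    if cpu_flag_present "$flag"; then
        echo "$flag: present"
    else
        echo "$flag: missing"
    fi
done
```

If both come back as "missing" in the guest while the physical CPU supports them, the VM's CPU type is the likely culprit.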

Now it's running perfectly, and journalctl confirms this:

Code:
ollama[779]: time=2024-03-29T10:21:40.126+01:00 level=INFO source=gpu.go:115 msg="Detecting GPU type"
ollama[779]: time=2024-03-29T10:21:40.126+01:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library libcudart.so*"
ollama[779]: time=2024-03-29T10:21:40.129+01:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: [/tmp/ollama4267936894/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.4.99]"
ollama[779]: time=2024-03-29T10:21:40.150+01:00 level=INFO source=gpu.go:120 msg="Nvidia GPU detected via cudart"
ollama[779]: time=2024-03-29T10:21:40.150+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
ollama[779]: time=2024-03-29T10:21:40.215+01:00 level=INFO source=gpu.go:188 msg="[cudart] CUDART CUDA Compute Capability detected: 8.6"

nvtop etc. all show proper GPU usage now.
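To re-check later (after a reboot or a driver update, say), the relevant journal lines can be filtered out of the rest of the noise. A rough sketch, assuming Ollama runs as the systemd unit "ollama" as it does here; `gpu_log_lines` is just a name I picked, and the patterns are taken from the messages quoted above.

```shell
#!/bin/sh
# gpu_log_lines is a hypothetical helper: it reads log text on stdin and
# keeps only the lines about GPU / AVX detection.
gpu_log_lines() {
    grep -E 'AVX|GPU|vector extensions'
}

# Usage inside the VM (current boot only):
#   journalctl -u ollama -b --no-pager | gpu_log_lines
```

A "disabling GPU support" warning in that output means Ollama has fallen back to CPU-only.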

Case closed. :D
Thanks!
 
