[SOLVED] Ubuntu 22.04 + Ollama + nvidia 3060, gpu passthrough and drivers all looking good - but...

p3ter_b

New Member
Mar 20, 2024
I have followed (almost) all the instructions I've found here on the forums and elsewhere, and have PCI passthrough set up for my GeForce RTX 3060.
The Xubuntu 22.04 VM guest says it's happily running the NVIDIA CUDA drivers, but I can't get Ollama to make use of the card.

`nvtop` says: 0/0/0% - all waiting.

I don't know where else to look.
Any ideas?

Thanks in advance.
Hope this is the right place, since it feels like the issue may still be in the GPU forwarding...?
 
Inside the VM:

Code:
$ nvidia-smi

Thu Mar 28 23:39:22 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        Off |   00000000:00:10.0 Off |                  N/A |
|  0%   42C    P8              8W /  170W |      11MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                        
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      7950      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+


Code:
Device 0 [NVIDIA GeForce RTX 3060] PCIe GEN 1@16x RX: 0.000 KiB/s TX: 0.000 KiB/s
 GPU 210MHz  MEM 405MHz  TEMP  42°C FAN   0% POW   9 / 170 W
 GPU[                                 0%] MEM[|                  0.256Gi/12.000Gi]
   ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
100│GPU0 %                                                                                                    │
   │GPU0 mem%                                                                                                 │
   │                                                                                                          │
 75│                                                                                                          │
   │                                                                                                          │
   │                                                                                                          │
 50│                                                                                                          │
   │                                                                                                          │
   │                                                                                                          │
 25│                                                                                                          │
   │                                                                                                          │
   │                                                                                                          │
  0│──────────────────────────────────────────────────────────────────────────────────────────────────────────│
   └──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
    PID USER DEV    TYPE  GPU        GPU MEM    CPU  HOST MEM Command                                         
   7950 root   0 Graphic   0%      4MiB   0%     4%    168MiB /usr/lib/xorg/Xorg -core :0 -seat seat0 -auth /va




F2Setup F6Sort F9Kill F10Quit F12Save Config
 
I found it!

The VM's CPU type needed to be set to "Host", instead of the default "x86-64-v2-AES".
Thanks to "journalctl -xeu ollama", for spitting out these error messages:
Code:
ollama[778]: time=2024-03-29T09:51:08.606+01:00 level=INFO source=gpu.go:120 msg="Nvidia GPU detected via cudart"
ollama[778]: time=2024-03-29T09:51:08.606+01:00 level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
ollama[778]: time=2024-03-29T09:51:08.607+01:00 level=WARN source=gpu.go:151 msg="CPU does not have AVX or AVX2, disabling GPU support."
ollama[778]: time=2024-03-29T09:51:08.607+01:00 level=INFO source=routes.go:1141 msg="no GPU detected"

...and a Proxmox forum thread on "AVX2 and AVX flags" with hints towards the VM-CPU type.
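
For anyone wanting to confirm the same cause before changing anything, a minimal sketch of a check to run inside the VM follows. It assumes a Linux guest with GNU grep; the function name `avx_flags` is made up for this example, and the optional file argument exists only so the check can be pointed at a saved copy of /proc/cpuinfo.

```shell
#!/bin/sh
# Print the AVX-related CPU flags (avx, avx2) that a cpuinfo file advertises,
# one per line. Prints nothing if neither flag is present.
# avx_flags is a hypothetical helper name; the file defaults to /proc/cpuinfo.
avx_flags() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | tr ' ' '\n' | grep -x -E 'avx|avx2' | sort -u
}

flags=$(avx_flags)
if [ -n "$flags" ]; then
    # $flags is left unquoted on purpose so the flags print on one line
    echo "AVX flags visible in the guest:" $flags
else
    echo "No AVX/AVX2 flag visible in the guest"
fi
```

If this reports no AVX flags while nvidia-smi looks healthy, the CPU type of the VM is a likely suspect, matching the WARN line in the log above.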

Now it's running perfectly, and journalctl confirms this:

Code:
ollama[779]: time=2024-03-29T10:21:40.126+01:00 level=INFO source=gpu.go:115 msg="Detecting GPU type"
ollama[779]: time=2024-03-29T10:21:40.126+01:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library libcudart.so*"
ollama[779]: time=2024-03-29T10:21:40.129+01:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: [/tmp/ollama4267936894/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.4.99]"
ollama[779]: time=2024-03-29T10:21:40.150+01:00 level=INFO source=gpu.go:120 msg="Nvidia GPU detected via cudart"
ollama[779]: time=2024-03-29T10:21:40.150+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
ollama[779]: time=2024-03-29T10:21:40.215+01:00 level=INFO source=gpu.go:188 msg="[cudart] CUDART CUDA Compute Capability detected: 8.6"

nvtop etc. all show proper GPU usage now.

Case closed. :D
Thanks!
 
THANK YOU
 
Oh my, thank you! I was looking for a solution for quite a while. CPU type host did the trick.

Cheers!
 
@p3ter_b's solution worked for me. Explicit steps below for future beginners like myself.
  1. Shut down the VM in question (full shutdown is required, not just a simple reboot)
  2. Select the VM in the Proxmox web interface if not already there, then select Hardware, then select Processors.
  3. Once Processors is selected, click Edit and in the "Type" field enter or select 'host'. It is likely set to 'x86-64-v2-AES' if you originally built the VM using defaults.
  4. Click OK to save
  5. Restart the VM
Following these steps, the VM rebooted and the ollama service (which I had already installed) launched at reboot and was able to use the GPU because the vector instructions are now 'available' on the CPU.
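
The same change can probably be made from the Proxmox node's shell instead of the web interface. The sketch below assumes the standard `qm` tool on the node and uses 100 as a placeholder VM ID; the wrapper function `set_cpu_host` and the `QM` override variable are made up for this example and are not part of Proxmox.

```shell
#!/bin/sh
# Hypothetical helper: switch a Proxmox VM to CPU type "host" from the node's shell.
# QM can be overridden (e.g. QM=echo) to print the commands instead of running them.
QM="${QM:-qm}"

set_cpu_host() {
    vmid="$1"
    "$QM" shutdown "$vmid" &&          # full shutdown, not a reboot from inside the guest
    "$QM" set "$vmid" --cpu host &&    # same setting as Hardware > Processors > Type
    "$QM" start "$vmid"
}

# Example with the placeholder VM ID 100 (uncomment on the Proxmox node):
# set_cpu_host 100
```

Running it once with `QM=echo` first shows exactly which commands would be issued, which is a cheap way to double-check the VM ID.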

If you run 'cat /proc/cpuinfo' inside your VM, you should now see the exact CPU model of the machine your Proxmox server is running on. Prior to the change to 'host', /proc/cpuinfo likely reports "QEMU Virtual CPU version 2.5+". The name itself is not what Ollama objects to: per the log messages earlier in this thread, that virtual CPU type does not expose AVX or AVX2 to the guest, so Ollama disables GPU support.
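
A short way to read just the model name, rather than scrolling through the whole file, is sketched here. The function name `model_name` is made up for this example, and the optional file argument is only there so it can be tried against a saved copy.

```shell
#!/bin/sh
# Print the CPU model name the guest sees (first "model name" entry only).
# model_name is a hypothetical helper; the file defaults to /proc/cpuinfo.
model_name() {
    grep -m1 '^model name' "${1:-/proc/cpuinfo}" | cut -d: -f2- | sed 's/^ *//'
}

echo "Guest CPU model: $(model_name)"
```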

 
Thank you so much for your help, kind person! You've saved my sanity! I was stuck with this issue for three days and couldn't get the model running on my virtual machine. But thanks to your solution, everything is working perfectly now!
 
Thank you so much, OP!!! I spent numerous hours on this and installed any and all packages that have nvidia in the name... Finally, the solution is here. Thanks for digging into this!
 
Thank you very much
 
