Ubuntu Guest with Passthrough GPU does not boot if allocated more than 24GB of RAM

troy.roberts

Member
Jun 24, 2023
I have an Ubuntu 24 VM running Docker with an NVIDIA GPU passed through. It works fine as long as I assign it no more than 24GB of RAM.

If I remove the GPU, I can assign more and it boots happily. With the GPU attached and anything more than about 24GB of RAM, the start just times out with the following error and the VM does not boot:

generating cloud-init ISO

TASK ERROR: start failed: command '/usr/bin/kvm -id 132 -name 'nvidia-docker,debug-threads=on' -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/132.qmp,server=on,wait=off' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/132.pid -daemonize -smbios 'type=1,uuid=f72ef5cc-7c94-417b-bb08-fc58731f0d87' -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,id=drive-efidisk0,format=raw,file=/dev/zvol/rpool/data/vm-132-disk-0,size=131072' -smp '32,sockets=2,cores=16,maxcpus=32' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc 'unix:/var/run/qemu-server/132.vnc,password=on' -cpu host,+kvm_pv_eoi,+kvm_pv_unhalt -m 32000 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=9249d60a-722f-4523-8c33-4c84e380e48d' -device 'vfio-pci,host=0000:42:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' -device 'vfio-pci,host=0000:42:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' -chardev 'socket,id=serial0,path=/var/run/qemu-server/132.serial0,server=on,wait=off' -device 'isa-serial,chardev=serial0' -device 'VGA,id=vga,bus=pcie.0,addr=0x1' -chardev 'socket,path=/var/run/qemu-server/132.qga,server=on,wait=off,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:5826f0d0987' -drive 'file=/dev/zvol/rpool/data/vm-132-cloudinit,if=none,id=drive-ide2,media=cdrom,aio=io_uring' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/zvol/rpool/data/vm-132-disk-1,if=none,id=drive-scsi0,cache=writethrough,discard=on,format=raw,aio=io_uring,detect-zeroes=unmap' -device 
'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,rotation_rate=1,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap132i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=02:1B:E5:C3:86:E3,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256' -rtc 'base=localtime' -machine 'type=q35+pve0'' failed: got timeout

Any thoughts on what could be causing this behavior?
 
Hey,
please post:
- VM config
- VM options

Also check your cloud-init configuration (I don't know its exact name, I've never used it).
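
For anyone following along, here is a minimal sketch of how the requested information could be gathered on the Proxmox VE host. It assumes a root shell on the host; the function name `collect_vm_info` is made up for illustration, and the VMID 132 in the usage note is taken from the error message above.

```shell
#!/bin/sh
# Hypothetical helper: print the details requested above for one VM.
# Assumes it is run as root on the Proxmox VE host, where `qm` is available.
collect_vm_info() {
    vmid="$1"
    echo "== VM config =="
    qm config "$vmid"
    echo "== Generated KVM command line =="
    qm showcmd "$vmid" --pretty
    echo "== Cloud-init user data =="
    qm cloudinit dump "$vmid" user
}
```

Calling `collect_vm_info 132` and pasting the output into the thread should cover both the VM config and the cloud-init part.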
 
Hey,

UPDATE: I tried to reproduce your setup on my own machine. I'm writing this post from an Ubuntu 24.04 LTS VM with an RTX 3060 passed through and 32GB of RAM, and it works.

So the problem seems to be specific to your setup.

Important points of my VM:
- Machine type q35
- UEFI BIOS
- EFI disk & TPM disk
- Processor type "host" with the right flags applied (specifically 1GB + hugepages)

Hope this helps.
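
A rough sketch for comparing a VM against the points listed above. It reads `qm config` output on stdin and reports which settings are present. The function name `check_vm_settings` is hypothetical, and the patterns assume the usual Proxmox config keys (`machine`, `bios`, `efidisk0`, `tpmstate0`, `cpu`, `hugepages`). It only checks that a key is present, not that its value matches the working VM.

```shell
#!/bin/sh
# Hypothetical helper: read a VM config on stdin and report which of the
# settings from the list above are present. Each entry is "label|regex".
check_vm_settings() {
    cfg="$(cat)"
    for entry in \
        'q35 machine type|^machine:.*q35' \
        'OVMF (UEFI) firmware|^bios: *ovmf' \
        'EFI disk|^efidisk0:' \
        'TPM state disk|^tpmstate0:' \
        'CPU type host|^cpu:.*host' \
        'hugepages|^hugepages:'
    do
        label="${entry%%|*}"     # text before the first "|"
        pattern="${entry#*|}"    # text after the first "|"
        if printf '%s\n' "$cfg" | grep -Eq "$pattern"; then
            echo "set:     $label"
        else
            echo "missing: $label"
        fi
    done
}
```

On the host it could be used as `qm config 132 | check_vm_settings`, with 132 being the VMID from the error message.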