New VM Not Starting Troubleshooting Via CLI

Stereoscope

Hi, I'm trying to set up my Proxmox server from the CLI. I don't have it connected to a computer, so I don't have access to the GUI.

I created my first VM:

Code:
balloon: 0
bios: ovmf
boot: order=usb0;scsi0
cores: 2
cpu: host
efidisk0: local-zfs:vm-100-disk-0,efitype=4m,size=1M
hostpci0: 35:00
machine: q35
memory: 512
meta: creation-qemu=9.2.0,ctime=1749528920
name: OS
numa: 1
onboot: 1
ostype: other
scsi0: local-zfs:vm-100-disk-1,iothread=1,size=5G,ssd=1
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=255279c2-dd23-5ac8-c7a8-1f1a533128b3
sockets: 1
usb0: host=13c3:1925
vmgenid: 3cf5a291-822c-8dc2-ca3a-8e262523aca2

lsusb shows:
Bus 002 Device 004: ID 13c3:1925 Card Reader

Which is exactly the USB that I want to use to boot from. I've booted from it before and everything worked fine on another system.

I think I have the settings correct. Took a while troubleshooting, so I'm pretty confident everything is correct.

However, when I type:
qm start 100; qm terminal 100
I get a blinking cursor on a new line and then nothing happens.

Is there some kind of verbose setting I use somewhere to see what's going on?

If I Ctrl+c I see:
^received interrupt
VM 100 not running.
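
With `qm start 100; qm terminal 100`, the terminal is attached even when the start fails or is interrupted, which can hide the real error. A minimal sketch of one way around that, assuming the stock qm subcommands showcmd, start and terminal; the function name start_and_attach and the QM override variable are hypothetical and only there for illustration:

```shell
# Command used to talk to Proxmox; overridable so the function can be dry-run.
QM="${QM:-qm}"

# Show the generated QEMU command line, start the VM, and only attach to the
# serial console if the start actually succeeded.
start_and_attach() {
    vmid="$1"
    echo "QEMU command line for VM $vmid:"
    "$QM" showcmd "$vmid" --pretty || return 1
    if ! "$QM" start "$vmid"; then
        echo "start of VM $vmid failed; not attaching to the serial console" >&2
        return 1
    fi
    "$QM" terminal "$vmid"
}
```

Calling start_and_attach 100 should then leave any error from qm start as the last thing on screen instead of a silent console.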

This is the first time I'm passing through a PCIe device, however, so maybe something is wrong there. I've gone through the PCI(e) Passthrough chapter and everything seems to be hunky-dory, but I've never done this with this hardware, so I can't rule it out.
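
One passthrough check that works entirely from the CLI is to look at which devices share an IOMMU group with the one being passed through. The sketch below assumes the usual sysfs layout under /sys/kernel/iommu_groups and that hostpci0: 35:00 corresponds to the full address 0000:35:00.0 (PCI domain 0000 is an assumption); the function name iommu_group_members is made up for illustration:

```shell
# Print every PCI function that shares an IOMMU group with the given device.
# $1: full PCI address, e.g. 0000:35:00.0
# $2: sysfs root, overridable for testing (default /sys)
iommu_group_members() {
    addr="$1"
    sysfs="${2:-/sys}"
    for dev in "$sysfs"/kernel/iommu_groups/*/devices/"$addr"; do
        # If the glob matched nothing, the device has no IOMMU group.
        if [ ! -e "$dev" ]; then
            echo "no IOMMU group found for $addr (is the IOMMU enabled?)" >&2
            return 1
        fi
        group_dir="${dev%/devices/*}"
        echo "group ${group_dir##*/}:"
        ls "$group_dir/devices"
    done
}
```

If iommu_group_members 0000:35:00.0 lists unrelated devices in the same group, that would be worth mentioning when asking for help.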

I started poking around the logs with:
journalctl -u pveproxy -u pvedaemon
Nothing too useful there it seems.

EDIT: This exercise has uncovered a blind spot. I know nothing about logs in Linux. Researching this further now.
 
I don't have it connected to a computer, so I don't have access to the GUI.

Okay, I am sure you have a reason for this. But really: if you don't have a PC within reach, get a laptop, a tablet or a phone. A phone or tablet is no fun to use for this task (use a "Desktop"-mode browser!), but it does work.

My point is: the WebGUI includes some (probably a lot of) sanity checks, which the manual CLI-method lacks. And not seeing the virtual desktop of a virtual machine would be a showstopper for me...

Good luck!
 
UdoB has given you the correct rundown on your "inaccessible" VM - (I actually think he has been rather kind & forgiving in his assessment - I would have probably used something more like "your VM in your present setup is not only blind but also dead!").

Setting up any OS to boot the first time - sometimes needs a little fine tuning (especially OVMF, EFI settings etc. & SB) - trying it blind with a booting USB passthrough (+ PCI passthrough) is probably close to impossible.

I have no idea what your use-case is with your above setup - but I also notice you have no NW device at all associated with this VM - so I'm rather perplexed as to exactly what your "OS" is actually supposed to be doing even if it actually boots up, & more importantly - how do you expect to interact with it? Maybe this is where your PCI passthrough comes into play? Maybe only with (limited) qm monitor? IDK.

I'm going to bet that whatever the scenario - a bare-metal setup for this is going to be much more appropriate - but I may be wrong - surprise me!
I'd be interested to hear more - I sometimes enjoy oddballs!

Sorry, I missed the serial terminal you are trying to use! This must be set up in the guest as well. See here.
 
Yes, I have serial terminal at my disposal, so full CLI access to VMs. For sanity checks, I've got a Proxmox running in a VM with a web GUI where I do all my testing. I've done all the heavy lifting already. Just a matter of hitting go at this stage. Or so I thought!

I fumbled my way around the logs. Still don't understand the structure of:

1 2 3 4 5 6 7 8 9 active A B C D E F index

Or whatever it is. Where do the newest logs go? I had to recursively check every folder! I've seen:

https://forum.proxmox.com/threads/w...logs-stored-for-shipping-to-log-server.72718/
https://forum.proxmox.com/threads/api-to-read-qemu-vm-creation-time-or-uptime.45092/
https://forum.proxmox.com/threads/forwarding-proxmox-logs-to-graylog.112692/
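
On the question of where the newest logs go: rather than walking every folder by hand, the task logs can be sorted by modification time. A rough sketch, assuming the task logs live under /var/log/pve/tasks in the 0-F subfolders and have names starting with UPID (both are assumptions about the layout described above), and assuming GNU find; the function name newest_task_logs is made up:

```shell
# List the N most recently modified task logs, newest first.
# $1: task log directory (default /var/log/pve/tasks)
# $2: how many entries to show (default 5)
newest_task_logs() {
    task_dir="${1:-/var/log/pve/tasks}"
    count="${2:-5}"
    # Print "mtime path" for each task log, sort newest first, keep the path.
    find "$task_dir" -type f -name 'UPID*' -printf '%T@ %p\n' \
        | sort -rn | head -n "$count" | cut -d' ' -f2-
}
```

Running cat on the first path printed by newest_task_logs should show the log of the latest task. Separately, journalctl -f in a second session follows the whole system journal while the VM starts, instead of only the pveproxy and pvedaemon units.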

Anyway, couldn't find anything relevant. All my logs just showed "TASK ERROR: received interrupt" which was the result of me pressing ctrl+c. So I let the VM just run and hoped for some kind of error and lo and behold I got:

timeout: no zvol device link for 'vm-100-disk-1' found after 300 sec found.
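
The message says Proxmox waited for a device link for the zvol and never saw it appear. A quick way to see which links are expected and which are missing is sketched below. It assumes the links live at /dev/zvol/<pool>/<volume name>, where <pool> is the pool line of the storage in storage.cfg; the function names config_volumes and check_zvol_links are hypothetical:

```shell
# Print the volume names a VM config references on one storage.
# Reads `qm config <vmid>` output on stdin; $1 is the storage ID.
config_volumes() {
    storage="$1"
    # Match lines like "scsi0: local-zfs:vm-100-disk-1,..." and keep the volume name.
    sed -n "s/^[a-z0-9]*: ${storage}:\([^,]*\).*/\1/p"
}

# Report whether a device link exists for each volume name read from stdin.
# $1: ZFS dataset behind the storage
# $2: device root, overridable for testing (default /dev/zvol)
check_zvol_links() {
    pool="$1"
    root="${2:-/dev/zvol}"
    while read -r vol; do
        if [ -e "$root/$pool/$vol" ]; then
            echo "ok      $root/$pool/$vol"
        else
            echo "missing $root/$pool/$vol"
        fi
    done
}
```

For the VM in this thread that would be used as: qm config 100 | config_volumes local-zfs | check_zvol_links rpool/data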

I've already seen:

https://forum.proxmox.com/threads/zfs-issues-with-disk-images-after-reboot.106322/

The solution proposed there was:

for i in $(ls -1 /dev/zd* |grep -v '/dev/zd[0-9]*p[0-9]*'); do udevadm trigger $i; done

Unfortunately, I don't think this applies to me because I don't have an installed VM yet. So when I run that command, I get:

ls: cannot access '/dev/zd*': No such file or directory

I have no idea what that command is trying to do, so I don't know where to even begin troubleshooting.
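
For reference, that one-liner loops over every whole zvol block device (/dev/zd0, /dev/zd16, ...), skips partition nodes such as /dev/zd0p1, and asks udev to re-process each one so the /dev/zvol/... links get recreated. The same idea, written out with comments and without parsing ls output, might look like the sketch below; the function names are made up, and the "ls: cannot access" error above simply means no zd devices exist at all:

```shell
# List whole-zvol block devices (zd0, zd16, ...) in a directory,
# skipping partition nodes such as zd0p1.
# $1: device directory, overridable for testing (default /dev)
list_zvol_devices() {
    dev_dir="${1:-/dev}"
    for dev in "$dev_dir"/zd*; do
        [ -e "$dev" ] || continue      # the glob matched nothing
        case "${dev##*/}" in
            zd*p*) continue ;;         # partition node, not a whole zvol
        esac
        printf '%s\n' "$dev"
    done
}

# Ask udev to re-process each zvol so its device links are recreated.
retrigger_zvols() {
    list_zvol_devices "${1:-/dev}" | while read -r dev; do
        udevadm trigger "$dev"
    done
}
```

If list_zvol_devices prints nothing, there is nothing for udev to re-process, and the problem is more likely that the zvols themselves are not exposed as block devices.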
 
Are you sure that you are in fact using local-zfs at all?

What is the output for:
Code:
cat /etc/pve/storage.cfg
 
/etc/pve/storage.cfg has:
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

zfspool: local-zfs
        pool rpool/data
        sparse
        content images,rootdir

As for the purpose of the VM: yes, that's where the PCI passthrough comes in. It's a NIC for a router.
 
Code:
Name        Type         Status       Total          Used          Available     %
local       dir          active       3624914304     1822208       3623092096    0.05%
local-zfs   zfspool      active       3623092536     320           3623092216    0.00%
 
As for the purpose of the VM: yes, that's where the PCI passthrough comes in. It's a NIC for a router.
So I won my bet:
I'm going to bet that whatever the scenario - a bare-metal setup for this is going to be much more appropriate

From your (latest) output, it is clear that since the VM is not booting, no disk has been created yet.

You probably should add a real working bootable disk or ISO (non-passed-through) to the VM to get it to boot.
Have you tried removing that passthrough to get it to boot?

In the end, I still don't see how you are going to get your system working - blindly!

You need to configure the OS to actually use the serial port and I'm not sure you have one at this point.
As above.
https://pve.proxmox.com/wiki/Serial_Terminal#Configuration_on_the_guest
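
Whether a Linux guest actually sends its console to the serial port can be checked from its kernel command line. A small sketch, assuming the guest uses the first serial port (ttyS0) as the console; the function name has_serial_console is made up:

```shell
# Succeed if a kernel command line routes a console to the first serial port.
# $1: the kernel command line as a single string
has_serial_console() {
    case " $1 " in
        *" console=ttyS0 "*|*" console=ttyS0,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Inside the guest this would be called as: has_serial_console "$(cat /proc/cmdline)" && echo "serial console configured"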
 
From your (latest) output, it is clear that since the VM is not booting, no disk has been created yet.

You probably should add a real working bootable disk or ISO (non-passed-through) to the VM to get it to boot.
Have you tried removing that passthrough to get it to boot?
Can you elaborate?

I decided to try to install another VM with Ubuntu without any fancy PCIe passthrough shenanigans. Same issue. Times out after 300 seconds.

Guys, I've already configured the install media to boot from serial. I can see everything! As I said, I've already tested everything in a Virtualised Proxmox setup. I can install everything just fine.
 
My first post has all the settings I used. qm config 100 seems to produce the same output.

I've enabled NUMA because the documentation said it's a good idea if your hardware supports it. I've got a latest gen Xeon CPU. I will be surprised if it doesn't support NUMA.
 
because the documentation said
Care to elaborate on which documentation you refer to?

On a single non-AMD CPU, which thus uses only one NUMA-node (NUMA=Non-Uniform Memory Access), you probably want it off.

If you have multiple CPUs (physical sockets) (IDK - since you have not provided that info) - you could still try to disable NUMA, which is the default.
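
To see how many NUMA nodes the host actually exposes, the node directories in sysfs can be counted. A minimal sketch, assuming the standard Linux layout /sys/devices/system/node/nodeN; the function name count_numa_nodes is made up:

```shell
# Count the NUMA nodes the kernel exposes.
# $1: node directory, overridable for testing (default /sys/devices/system/node)
count_numa_nodes() {
    node_dir="${1:-/sys/devices/system/node}"
    n=0
    for d in "$node_dir"/node[0-9]*; do
        if [ -d "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

If count_numa_nodes prints 1, the host has a single node, and qm set 100 --numa 0 would switch the option back to its default for the VM.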

My first post has all the settings I used. qm config 100 seems to produce the same output.
So you still have that hostpci & usb passthrough in place.
With what disk are you trying to boot that Ubuntu VM?
 
NUMA is discussed on page 212 of Administration Guide 8.4.0.

I have just 1 socket.
On a single non-AMD CPU, which thus uses only one NUMA-node (NUMA=Non-Uniform Memory Access), you probably want it off.
If you have multiple CPUs (physical sockets) (IDK - since you have not provided that info) - you could still try to disable NUMA, which is the default.
Ok, thanks.
So you still have that hostpci & usb passthrough in place.
With what disk are you trying to boot that Ubuntu VM?
Oh now I follow, you're asking about the Ubuntu VM. Here is the qm config 101 for Ubuntu
balloon: 512
boot: order=scsi0;ide2
cores: 4
cpu: host
ide2: local:iso/ubuntu-24.04.2-live-server-amd64-modified-serial-boot.iso,media=cdrom
memory: 7168
meta: creation-qemu=9.2.0,ctime=1748314320
name: PostgreSQL
numa: 1
ostype: l26
scsi0: local-zfs:vm-101-disk-0,iothread=1,size=100G,ssd=1
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=1a1bd596-3021-91b2-8bfe-52275449712a
sockets: 1
vmgenid: 12bf1acc-275c-2a17-8b42-5bc1ffd97abe
 
I'm less familiar with ZFS on Linux than on FreeBSD, but compared to /etc/pve/storage.cfg on one of my nodes, which has a ZFS-backed pool, yours lacks a mountpoint. See:
Code:
zfspool: nest-2
        pool nest-2
        content rootdir,images
        mountpoint /nest-2
        sparse 0
Could it be that the pool exists but the filesystem doesn't?
zpool list should show the pool and zfs list should show the datasets. Can you check that?
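
To compare what ZFS reports against what the VM configs expect, the listing can be checked for specific dataset names. A sketch, assuming the disks are zvols under rpool/data as in the storage.cfg posted earlier; the function name missing_datasets is made up:

```shell
# Print each expected dataset that is absent from a `zfs list -H -o name`
# listing supplied on stdin. Expected dataset names are passed as arguments.
missing_datasets() {
    listing=$(cat)
    for want in "$@"; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        printf '%s\n' "$listing" | grep -qxF -- "$want" || echo "missing: $want"
    done
}
```

Example use: zfs list -H -o name -t volume -r rpool/data | missing_datasets rpool/data/vm-100-disk-0 rpool/data/vm-100-disk-1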
 
ubuntu-24.04.2-live-server-amd64-modified-serial-boot.iso
You have some non-standard iso there (self-modified?) that you probably can't tell if it is actually bootable. UEFI or legacy?

Can you try a simple & standard VM - for testing purposes - just to see if it actually boots on your blind system.

name: PostgreSQL
Where did that come from on a plain Ubuntu VM? Some copy & paste action? Where from?
 
You have some non-standard iso there (self-modified?) that you probably can't tell if it is actually bootable. UEFI or legacy?

Can you try a simple & standard VM - for testing purposes - just to see if it actually boots on your blind system.
Yes, I modified it similar to:

https://pve.proxmox.com/wiki/Serial_Terminal

It definitely boots, it definitely installs. If you run:

qm start 101; qm terminal 101

You literally see the CLI of the booting Ubuntu instance on the screen instead of the Proxmox CLI. You can press Ctrl+O to exit the terminal view of Ubuntu and get back to the Proxmox CLI. The system is definitely not blind :).
 