KRBD 0 with Ceph prevents VMs from starting

jepper

New Member
Oct 2, 2025
I have a Ceph Squid cluster installed separately (with cephadm) and added to my Proxmox cluster using the Datacenter/Storage pane. Native Proxmox Ceph is not enabled.

I can only boot VMs when KRBD is enabled. I would like to disable it so that I can enable PWL. But when I disable KRBD and start any VM, Proxmox throws:

kvm: -blockdev {"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"raw","file":{"auth-client-required":["cephx"],"cache":{"direct":true,"no-flush":false},"conf":"/etc/pve/priv/ceph/ceph-metal.conf","detect-zeroes":"on","discard":"ignore","driver":"rbd","image":"vm-103-disk-2","node-name":"e6c5a00bb877016fc68a29f1bd66a1b","pool":"pool1","read-only":false,"server":[{"host":"10.112.212.71","port":"6789"},{"host":"10.112.212.72","port":"6789"}],"user":"admin"},"node-name":"f6c5a00bb877016fc68a29f1bd66a1b","read-only":false},"node-name":"drive-virtio0","read-only":false,"throttle-group":"throttle-drive-virtio0"}: error connecting: No such file or directory
TASK ERROR: start failed: QEMU exited with code 1

Using rbd, I can see the storage:
Bash:
root@lab6:~# rbd ls pool1|grep vm-103-disk-2
vm-103-disk-2

I'm at a loss as to where to start troubleshooting this.
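A possible first step, assuming the rbd CLI is available on the PVE node: exercise the userspace (librbd) path directly with the same conf file and user QEMU is given. Since krbd 1 works, the kernel client is fine; this checks the userspace client in isolation. Note that the CLI's keyring search can differ from QEMU's, so it may or may not reproduce the failure.

```shell
# Sketch: probe the librbd/userspace path with the same conf and user
# as QEMU (conf path, pool, and user taken from the error above).
rbd --conf /etc/pve/priv/ceph/ceph-metal.conf --id admin ls pool1
```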
 
Is that the output from the task log?

What output do you get if you try to start the VM from the CLI?
qm start <VMID>
 
The error in the initial post is from the web interface Tasks log.
I get similar output when I run it from the CLI:
root@lab6:~# qm start 103
kvm: -blockdev {"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"raw","file":{"auth-client-required":["cephx"],"cache":{"direct":true,"no-flush":false},"conf":"/etc/pve/priv/ceph/ceph-metal.conf","detect-zeroes":"on","discard":"ignore","driver":"rbd","image":"vm-103-disk-2","node-name":"e6c5a00bb877016fc68a29f1bd66a1b","pool":"pool1","read-only":false,"server":[{"host":"10.112.212.71","port":"6789"},{"host":"10.112.212.72","port":"6789"}],"user":"admin"},"node-name":"f6c5a00bb877016fc68a29f1bd66a1b","read-only":false},"node-name":"drive-virtio0","read-only":false,"throttle-group":"throttle-drive-virtio0"}: error connecting: No such file or directory
start failed: QEMU exited with code 1
 
Please post the output of:

- pveversion -v
- cat /etc/pve/storage.cfg
- cat /etc/pve/priv/ceph/ceph-metal.conf
- cat /etc/ceph/ceph.conf

Please censor any sensitive data, such as auth keys.
 
# pveversion -v
proxmox-ve: 9.1.0 (running kernel: 6.17.2-2-pve)
pve-manager: 9.1.2 (running version: 9.1.2/9d436f37a0ac4172)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.17.2-2-pve-signed: 6.17.2-2
proxmox-kernel-6.17: 6.17.2-2
proxmox-kernel-6.14.11-4-pve-signed: 6.14.11-4
proxmox-kernel-6.14: 6.14.11-4
proxmox-kernel-6.14.8-2-pve-signed: 6.14.8-2
amd64-microcode: 3.20250311.1
ceph-fuse: 19.2.3-pve2
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.4
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.0.7
libpve-cluster-perl: 9.0.7
libpve-common-perl: 9.1.0
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.3
libpve-rs-perl: 0.11.3
libpve-storage-perl: 9.1.0
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-3
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.0-1
proxmox-backup-file-restore: 4.1.0-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.2
pve-cluster: 9.0.7
pve-container: 6.0.18
pve-docs: 9.1.1
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.17-2
pve-ha-manager: 5.0.8
pve-i18n: 3.6.5
pve-qemu-kvm: 10.1.2-4
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.1
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1

# cat /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content images,iso,backup,vztmpl,snippets
shared 0

rbd: ceph-metal
content rootdir,images
krbd 0
monhost 10.112.212.71:6789,10.112.212.72:6789
pool pool1
username admin

# cat /etc/pve/priv/ceph/ceph-metal.conf
[global]
fsid = 1216xx
mon_host = [v2:10.112.212.71:3300/0,v1:10.112.212.71:6789/0] [v2:10.112.212.72:3300/0,v1:10.112.212.72:6789/0]

# cat /etc/ceph/ceph.conf (it's a symlink to /etc/pve/priv/ceph/ceph-metal.conf)
[global]
fsid = 1216xx
mon_host = [v2:10.112.212.71:3300/0,v1:10.112.212.71:6789/0] [v2:10.112.212.72:3300/0,v1:10.112.212.72:6789/0]
 
monhost 10.112.212.71:6789,10.112.212.72:6789
As a first guess, I don't think you should add the ports here. Can you try it with just the IPs in the storage config?

And please use code block formatting; it usually makes reading output a lot nicer :)
 
The configs (storage and Ceph) don't agree on the ports: the first mon has port 3300 in your Ceph config and 6789 in your storage.cfg. Assuming QEMU picks the first monitor, if it is not listening on 6789, that would explain the error ;)
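To make the mismatch concrete, here is a rough sketch (not from the thread) of how each monhost entry maps to the port QEMU is handed: an explicit port is passed through as-is, and without one the v2 messenger default 3300 applies, which matches the two error messages in this thread.

```shell
# Sketch: map a monhost line to the per-monitor port QEMU ends up with.
# An explicit port is passed through; without one, the v2 messenger
# default 3300 applies (as seen once the ports were removed).
monhost="10.112.212.71:6789,10.112.212.72"
for m in ${monhost//,/ }; do
  host=${m%%:*}            # part before the first colon
  if [ "$m" = "$host" ]; then
    port=3300              # no explicit port -> v2 messenger default
  else
    port=${m##*:}          # explicit port, passed through as-is
  fi
  echo "$host:$port"
done
```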
 
I have removed the port numbers and space-separated the IP addresses.

## with krbd enabled
root@lab6:~# grep -E "krbd|mon" /etc/pve/storage.cfg
krbd 1
monhost 10.112.212.71 10.112.212.72
## and qm start works
root@lab6:~# qm start 103
/dev/rbd0

## with krbd disabled
root@lab6:~# grep -E "krbd|mon" /etc/pve/storage.cfg
krbd 0
monhost 10.112.212.71 10.112.212.72
## it now defaults to port 3300 but same error
root@lab6:/tmp# qm start 103
kvm: -blockdev {"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"raw","file":{"auth-client-required":["cephx"],"cache":{"direct":true,"no-flush":false},"conf":"/etc/pve/priv/ceph/ceph-metal.conf","detect-zeroes":"on","discard":"ignore","driver":"rbd","image":"vm-103-disk-2","node-name":"e6c5a00bb877016fc68a29f1bd66a1b","pool":"pool1","read-only":false,"server":[{"host":"10.112.212.71","port":"3300"},{"host":"10.112.212.72","port":"3300"}],"user":"admin"},"node-name":"f6c5a00bb877016fc68a29f1bd66a1b","read-only":false},"node-name":"drive-virtio0","read-only":false,"throttle-group":"throttle-drive-virtio0"}: error connecting: No such file or directory
start failed: QEMU exited with code 1

Would an `strace qm start 103` be useful?
 
I can confirm all the ports are listening (I have a 2-node Ceph cluster, deliberately configured that way, just to be sure):
root@lab6:~# echo >/dev/tcp/10.112.212.71/3300||echo error
root@lab6:~# echo >/dev/tcp/10.112.212.72/3300||echo error
root@lab6:~# echo >/dev/tcp/10.112.212.71/6789||echo error
root@lab6:~# echo >/dev/tcp/10.112.212.72/6789||echo error
 
Could you first try running the "kvm" command printed by `qm showcmd 103`? If it also exits with that error, then run it under strace; it might shed some light.
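The suggested steps can be sketched as follows (the output paths are illustrative, and the grep keeps only the file opens that failed):

```shell
# Save the generated kvm command line, run it under strace, and keep
# only the failed file opens.
qm showcmd 103 > /tmp/kvm-cmd.sh
strace -f -e trace=openat -o /tmp/kvm.trace bash /tmp/kvm-cmd.sh
grep ENOENT /tmp/kvm.trace
```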
 
That just prints the command that you then need to execute ;)
 
Code:
[pid 1373949] openat(AT_FDCWD, "/etc/ceph/ceph-metal.client.admin.keyring", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 1373949] openat(AT_FDCWD, "/etc/ceph/ceph-metal.keyring", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 1373949] openat(AT_FDCWD, "/etc/ceph/keyring", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 1373949] openat(AT_FDCWD, "/etc/ceph/keyring.bin", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
Maybe these are the cause? Does your /etc/pve/priv/ceph/ceph-metal.conf set the keyring option to point at the correct file?
 
You're absolutely right. I had already looked at this particular snippet, but I must have been suffering from severe troubleshooting/afternoon fatigue. I have now symlinked the keyring from /etc/pve/priv/ceph to one of the paths kvm is looking for, and the VM came up. Two heads are better than one, as they say. Thank you so much, guys!
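For anyone landing here later, this is the shape of the fix, sketched with scratch directories standing in for the real paths (on a Proxmox host these would be /etc/pve/priv/ceph and /etc/ceph; the exact keyring filename is whichever of the probed paths you pick, and the keyring contents here are placeholders):

```shell
# Illustration of the fix with scratch dirs standing in for the real
# paths (/etc/pve/priv/ceph and /etc/ceph on the host).
pve_priv=$(mktemp -d)   # stands in for /etc/pve/priv/ceph
ceph_etc=$(mktemp -d)   # stands in for /etc/ceph
printf '[client.admin]\n\tkey = REDACTED\n' > "$pve_priv/ceph-metal.keyring"
# Give librbd one of the paths the strace showed it probing:
ln -s "$pve_priv/ceph-metal.keyring" \
      "$ceph_etc/ceph-metal.client.admin.keyring"
readlink "$ceph_etc/ceph-metal.client.admin.keyring"
```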