KRBD 0 with Ceph prevents VMs from starting

jepper

New Member
Oct 2, 2025
I have a Ceph Squid cluster installed separately (with cephadm) and added to my Proxmox cluster using the Datacenter/Storage pane. Native Proxmox Ceph is not enabled.

I can only boot VMs when KRBD is enabled. I would like to disable it so that I can enable PWL. But when I disable KRBD and start any VM, Proxmox throws:

kvm: -blockdev {"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"raw","file":{"auth-client-required":["cephx"],"cache":{"direct":true,"no-flush":false},"conf":"/etc/pve/priv/ceph/ceph-metal.conf","detect-zeroes":"on","discard":"ignore","driver":"rbd","image":"vm-103-disk-2","node-name":"e6c5a00bb877016fc68a29f1bd66a1b","pool":"pool1","read-only":false,"server":[{"host":"10.112.212.71","port":"6789"},{"host":"10.112.212.72","port":"6789"}],"user":"admin"},"node-name":"f6c5a00bb877016fc68a29f1bd66a1b","read-only":false},"node-name":"drive-virtio0","read-only":false,"throttle-group":"throttle-drive-virtio0"}: error connecting: No such file or directory
TASK ERROR: start failed: QEMU exited with code 1

Using rbd, I can see the storage:
Bash:
root@lab6:~# rbd ls pool1|grep vm-103-disk-2
vm-103-disk-2

I'm at a loss as to where to start troubleshooting this.
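A possible first step, assuming the rbd CLI is available on the PVE node: exercise the userspace (librbd) path directly with the same conf file and user QEMU is given. Since krbd 1 works, the kernel client is fine; this checks the userspace client in isolation. Note that the CLI's keyring search can differ from QEMU's, so it may or may not reproduce the failure.

```shell
# Sketch: probe the librbd/userspace path with the same conf and user
# as QEMU (conf path, pool, and user taken from the error above).
rbd --conf /etc/pve/priv/ceph/ceph-metal.conf --id admin ls pool1
```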
 
Is that the output from the task log?

What output do you get if you try to start the VM from the CLI?
qm start <VMID>
 
The error in the initial post is from the web interface Tasks log.
I get similar output when I run it from the CLI:
root@lab6:~# qm start 103
kvm: -blockdev {"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"raw","file":{"auth-client-required":["cephx"],"cache":{"direct":true,"no-flush":false},"conf":"/etc/pve/priv/ceph/ceph-metal.conf","detect-zeroes":"on","discard":"ignore","driver":"rbd","image":"vm-103-disk-2","node-name":"e6c5a00bb877016fc68a29f1bd66a1b","pool":"pool1","read-only":false,"server":[{"host":"10.112.212.71","port":"6789"},{"host":"10.112.212.72","port":"6789"}],"user":"admin"},"node-name":"f6c5a00bb877016fc68a29f1bd66a1b","read-only":false},"node-name":"drive-virtio0","read-only":false,"throttle-group":"throttle-drive-virtio0"}: error connecting: No such file or directory
start failed: QEMU exited with code 1
 
Please post the output of:

- pveversion -v
- cat /etc/pve/storage.cfg
- cat /etc/pve/priv/ceph/ceph-metal.conf
- cat /etc/ceph/ceph.conf

Please censor any sensitive data, such as auth keys.
 
# pveversion -v
proxmox-ve: 9.1.0 (running kernel: 6.17.2-2-pve)
pve-manager: 9.1.2 (running version: 9.1.2/9d436f37a0ac4172)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.17.2-2-pve-signed: 6.17.2-2
proxmox-kernel-6.17: 6.17.2-2
proxmox-kernel-6.14.11-4-pve-signed: 6.14.11-4
proxmox-kernel-6.14: 6.14.11-4
proxmox-kernel-6.14.8-2-pve-signed: 6.14.8-2
amd64-microcode: 3.20250311.1
ceph-fuse: 19.2.3-pve2
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.4
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.0.7
libpve-cluster-perl: 9.0.7
libpve-common-perl: 9.1.0
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.3
libpve-rs-perl: 0.11.3
libpve-storage-perl: 9.1.0
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-3
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.0-1
proxmox-backup-file-restore: 4.1.0-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.2
pve-cluster: 9.0.7
pve-container: 6.0.18
pve-docs: 9.1.1
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.17-2
pve-ha-manager: 5.0.8
pve-i18n: 3.6.5
pve-qemu-kvm: 10.1.2-4
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.1
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1

# cat /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content images,iso,backup,vztmpl,snippets
shared 0

rbd: ceph-metal
content rootdir,images
krbd 0
monhost 10.112.212.71:6789,10.112.212.72:6789
pool pool1
username admin

# cat /etc/pve/priv/ceph/ceph-metal.conf
[global]
fsid = 1216xx
mon_host = [v2:10.112.212.71:3300/0,v1:10.112.212.71:6789/0] [v2:10.112.212.72:3300/0,v1:10.112.212.72:6789/0]

# cat /etc/ceph/ceph.conf (it's a symlink to /etc/pve/priv/ceph/ceph-metal.conf)
[global]
fsid = 1216xx
mon_host = [v2:10.112.212.71:3300/0,v1:10.112.212.71:6789/0] [v2:10.112.212.72:3300/0,v1:10.112.212.72:6789/0]
 
monhost 10.112.212.71:6789,10.112.212.72:6789
As a first guess, I don't think you should add the ports here. Can you try it with just the IPs in the storage config?

And please use code block formatting; it usually makes reading output a lot nicer :)
 
The configs (storage and Ceph) don't agree on the ports: the first mon has port 3300 in your Ceph config and 6789 in your storage.cfg. Assuming QEMU picks the first monitor, if it is not listening on 6789, that would explain the error ;)
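To make the mismatch concrete, here is a rough sketch (not from the thread) of how each monhost entry maps to the port QEMU is handed: an explicit port is passed through as-is, and without one the v2 messenger default 3300 applies, which matches the two error messages in this thread.

```shell
# Sketch: map a monhost line to the per-monitor port QEMU ends up with.
# An explicit port is passed through; without one, the v2 messenger
# default 3300 applies (as seen once the ports were removed).
monhost="10.112.212.71:6789,10.112.212.72"
for m in ${monhost//,/ }; do
  host=${m%%:*}            # part before the first colon
  if [ "$m" = "$host" ]; then
    port=3300              # no explicit port -> v2 messenger default
  else
    port=${m##*:}          # explicit port, passed through as-is
  fi
  echo "$host:$port"
done
```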
 
I have removed the port numbers and space-separated the IP addresses.

## with krbd enabled
root@lab6:~# grep -E "krbd|mon" /etc/pve/storage.cfg
krbd 1
monhost 10.112.212.71 10.112.212.72
## and qm start works
root@lab6:~# qm start 103
/dev/rbd0

## with krbd disabled
root@lab6:~# grep -E "krbd|mon" /etc/pve/storage.cfg
krbd 0
monhost 10.112.212.71 10.112.212.72
## it now defaults to port 3300 but same error
root@lab6:/tmp# qm start 103
kvm: -blockdev {"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"raw","file":{"auth-client-required":["cephx"],"cache":{"direct":true,"no-flush":false},"conf":"/etc/pve/priv/ceph/ceph-metal.conf","detect-zeroes":"on","discard":"ignore","driver":"rbd","image":"vm-103-disk-2","node-name":"e6c5a00bb877016fc68a29f1bd66a1b","pool":"pool1","read-only":false,"server":[{"host":"10.112.212.71","port":"3300"},{"host":"10.112.212.72","port":"3300"}],"user":"admin"},"node-name":"f6c5a00bb877016fc68a29f1bd66a1b","read-only":false},"node-name":"drive-virtio0","read-only":false,"throttle-group":"throttle-drive-virtio0"}: error connecting: No such file or directory
start failed: QEMU exited with code 1

Would an `strace qm start 103` be useful?
 
I can confirm all the ports are listening (I have a 2-node Ceph cluster, deliberately configured that way, just to be sure):
root@lab6:~# echo >/dev/tcp/10.112.212.71/3300||echo error
root@lab6:~# echo >/dev/tcp/10.112.212.72/3300||echo error
root@lab6:~# echo >/dev/tcp/10.112.212.71/6789||echo error
root@lab6:~# echo >/dev/tcp/10.112.212.72/6789||echo error
 
Could you first try running the "kvm" command printed by `qm showcmd 103`? If it also exits with that error, then run it under strace; it might shed some light.
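The suggested steps can be sketched as follows (the output paths are illustrative, and the grep keeps only the file opens that failed):

```shell
# Save the generated kvm command line, run it under strace, and keep
# only the failed file opens.
qm showcmd 103 > /tmp/kvm-cmd.sh
strace -f -e trace=openat -o /tmp/kvm.trace bash /tmp/kvm-cmd.sh
grep ENOENT /tmp/kvm.trace
```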
 
That just prints the command that you then need to execute ;)
 
Code:
[pid 1373949] openat(AT_FDCWD, "/etc/ceph/ceph-metal.client.admin.keyring", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 1373949] openat(AT_FDCWD, "/etc/ceph/ceph-metal.keyring", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 1373949] openat(AT_FDCWD, "/etc/ceph/keyring", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 1373949] openat(AT_FDCWD, "/etc/ceph/keyring.bin", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
Maybe these are the cause? Does your /etc/pve/priv/ceph/ceph-metal.conf set the keyring option to point at the correct file?
 
You're absolutely right. I had already looked at this particular snippet, but I must have been suffering from severe troubleshooting/afternoon fatigue. I have now symlinked the keyring from /etc/pve/priv/ceph to one of the paths kvm is looking for, and the VM came up. Two heads are better than one, as they say. Thank you so much, guys!
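For anyone landing here later, this is the shape of the fix, sketched with scratch directories standing in for the real paths (on a Proxmox host these would be /etc/pve/priv/ceph and /etc/ceph; the exact keyring filename is whichever of the probed paths you pick, and the keyring contents here are placeholders):

```shell
# Illustration of the fix with scratch dirs standing in for the real
# paths (/etc/pve/priv/ceph and /etc/ceph on the host).
pve_priv=$(mktemp -d)   # stands in for /etc/pve/priv/ceph
ceph_etc=$(mktemp -d)   # stands in for /etc/ceph
printf '[client.admin]\n\tkey = REDACTED\n' > "$pve_priv/ceph-metal.keyring"
# Give librbd one of the paths the strace showed it probing:
ln -s "$pve_priv/ceph-metal.keyring" \
      "$ceph_etc/ceph-metal.client.admin.keyring"
readlink "$ceph_etc/ceph-metal.client.admin.keyring"
```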