Posting this in case it helps others on PVE 9.2 + Ceph Tentacle, and to ask whether the underlying behavior is a bug. Full disclosure, I heavily relied on AI to troubleshoot and resolve (albeit with a workaround) and draft this post.
After a cluster-wide cold boot, all my LXC containers backed by Ceph RBD refused to start with rbd: sysfs write failed / exit status 110. VMs on the same pool were completely unaffected. Root cause turned out to be a messenger-protocol mismatch: my OSDs publish a v1-only public address in the OSDMap, while Tentacle's rbd map now defaults to msgr2, so krbd can't find a v2 address and aborts. A ceph.conf workaround fixes it; I think the v1-only public advertisement may be a Tentacle bug.

My environment is:
- Proxmox VE 9.2.3, kernel 7.0.12-1-pve
- Ceph 20.2.1-pve1 Tentacle (hyperconverged), 5-node cluster, 3 nodes running Ceph
- Separate public (192.168.111.0/24) and cluster (192.168.123.0/24) networks
- Cluster originally built on an earlier Ceph (Squid) release and upgraded to Tentacle
Code:
pct start <vmid>
...
rbd: sysfs write failed
can't map rbd volume vm-<vmid>-disk-0: rbd: sysfs write failed
Script exited with status 110
dmesg showed:
Code:
libceph: mon1 (2)192.168.111.11:3300 session established
libceph: no match of type 2 in addrvec
libceph: corrupt full osdmap (-2) epoch <N> off <X>
libceph: osdc handle_map corrupt msg
The kernel connects to the mon over msgr2 (the (2)…:3300), then fails decoding the OSDMap because it can't find a type-2 (v2) address for an OSD.

The cluster is healthy and the config is standard. The OSDs bind and listen on v2 sockets on the public interface, and the OSD metadata reports v2, but the published addrvec in the OSDMap is v1-only:
Code:
# ceph osd metadata 5 | grep front_addr
"front_addr": "[v2:192.168.111.10:6802/...,v1:192.168.111.10:6803/...]", <-- v2 present
# ceph osd find 5
"addrs": { "addrvec": [ { "type": "v1", "addr": "192.168.111.10:6803", ... } ] } <-- v2 missing
ceph osd dump confirms it for every OSD: the public-network address is a bare v1: entry, while the cluster-network address is a full [v2:…,v1:…] addrvec. ms_bind_msgr2 and ms_bind_ipv4 are true, ms_bind_ipv6 is false, the mons advertise both v1 and v2 correctly, and there are no stray public_addr lines for the OSDs (only the standard per-mon ones).

I confirmed it's purely a messenger mismatch: a manual map with legacy mode connects over v1 and works fine:
Code:
# rbd map <pool>/vm-<vmid>-disk-0 -o ms_mode=legacy
/dev/rbd0 <-- success
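To run the same check across every OSD at once instead of eyeballing ceph osd find one at a time, something like the following could work. It is only a rough sketch: it assumes that ceph osd dump --format json returns an "osds" list whose entries carry "osd", "up" and "public_addrs" -> "addrvec" fields (verify against your own output), and it models the kernel client as needing a v1 entry for ms_mode=legacy and a v2 entry for any other mode, which is my reading of the dmesg lines above rather than anything official. The function name and the sample data (including the second OSD and all nonces) are made up for illustration.

```python
import json


def osds_unreachable(osd_dump, ms_mode="crc"):
    """Return ids of up OSDs whose public addrvec lacks the address type
    that a kernel client mapping with the given ms_mode would look for.

    Assumption: ms_mode=legacy looks for a "v1" entry, every other mode
    (crc, secure, prefer-crc, prefer-secure) looks for a "v2" entry.
    """
    wanted = "v1" if ms_mode == "legacy" else "v2"
    missing = []
    for osd in osd_dump.get("osds", []):
        # Skip OSDs marked down; treat a missing "up" key as up.
        if not osd.get("up", 1):
            continue
        addrvec = osd.get("public_addrs", {}).get("addrvec", [])
        types = {entry.get("type") for entry in addrvec}
        if wanted not in types:
            missing.append(osd["osd"])
    return missing


# Hypothetical sample shaped like the output in this post:
# osd.5 publishes v1 only, osd.6 publishes the full v2+v1 addrvec.
SAMPLE = json.loads("""
{
  "osds": [
    {"osd": 5, "up": 1,
     "public_addrs": {"addrvec": [
       {"type": "v1", "addr": "192.168.111.10:6803", "nonce": 1}
     ]}},
    {"osd": 6, "up": 1,
     "public_addrs": {"addrvec": [
       {"type": "v2", "addr": "192.168.111.10:6804", "nonce": 2},
       {"type": "v1", "addr": "192.168.111.10:6805", "nonce": 2}
     ]}}
  ]
}
""")

print("no v2 public addr:", osds_unreachable(SAMPLE, "crc"))
print("no v1 public addr:", osds_unreachable(SAMPLE, "legacy"))
```

Against a live cluster you would feed it real data, e.g. save ceph osd dump --format json to a file and pass json.load() of that file in place of SAMPLE. An empty list for the msgr2 modes would mean the legacy workaround is no longer needed.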
Why only LXC, and why after a reboot? VMs access their disks via librbd (userspace), which tolerates the v1-only public addrs. LXC uses kernel krbd, which, combined with Tentacle's rbd device map now defaulting to msgr2, strictly requires v2 and aborts. It only surfaced after the cold boot because that's when the OSDs first restarted onto Tentacle and (re)published the v1-only addrvec; already-running containers had been coasting on maps made before the upgrade.

The workaround is to use legacy mode. Add rbd_default_map_options = ms_mode=legacy to /etc/pve/ceph.conf under [client]:
INI:
[client]
keyring = /etc/pve/priv/$cluster.$name.keyring
rbd_default_map_options = ms_mode=legacy
This makes every rbd map (including the ones Proxmox issues for containers) default to msgr1. After this, both a bare rbd map and pct start succeed. No daemon restart needed; it only affects krbd maps (LXC), not VMs.

Is the v1-only public addrvec expected on Tentacle, or a bug? The OSDs clearly bind v2 on the public network and report it in metadata, yet the OSDMap publishes only v1 for the public address, which is what breaks krbd now that rbd map defaults to msgr2.

Has anyone else on PVE 9.2 + Tentacle seen this, and is there a way to make the OSDs publish the full v2+v1 public addrvec so the legacy workaround isn't needed?
Happy to provide full ceph osd dump, ceph mon dump, and ss output if useful.