I recently updated our 3-node Proxmox cluster with Ceph to Ceph Squid (19.2) and PVE 9.
As our previous Ceph pool configuration was not ideal, I created a new erasure coded pool with the new crush rule profile options crush-osds-per-failure-domain and crush-num-failure-domains. To create the new pool with the new profile I had to raise the required client feature set to Squid (ceph osd set-require-min-compat-client squid). Everything seemed to work as expected and the new PG distribution came out as it should.
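For context, the setup was roughly along the following lines (a sketch only; profile and pool names are placeholders, and the k/m and failure-domain values naturally depend on the cluster):
Code:
# placeholder names: ec-msr-profile, tpx-ecpool2
ceph osd erasure-code-profile set ec-msr-profile \
    k=4 m=2 \
    crush-failure-domain=host \
    crush-num-failure-domains=3 \
    crush-osds-per-failure-domain=2
ceph osd pool create tpx-ecpool2 erasure ec-msr-profile
ceph osd pool set tpx-ecpool2 allow_ec_overwrites true
# the rule generated from this profile requires Squid-level clients:
ceph osd set-require-min-compat-client squid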
However, now I can't start VMs with TPM anymore, and the running swtpm processes hang, unkillable, in D (uninterruptible) state:
Code:
root 49682 0.0 0.0 15832 3732 ? D Dec03 0:00 swtpm socket --tpmstate backend-uri=file:///dev/rbd-pve/f975fbb2-5281-4024-b361-1faca8b15e3e/tpx-ecpool2-meta/vm-3033-disk-2,mode=0600 --ctrl type=unixio,path=/var/run/qemu-server/3033.swtpm,mode=0600 --pid file=/var/run/qemu-server/3033.swtpm.pid --terminate --daemon --log file=/run/qemu-server/3033-swtpm.log,level=1,prefix=[id=1764762902] --tpm2
The dmesg log shows libceph errors about missing protocol features:
Code:
[Tue Dec 9 15:46:04 2025] libceph: osd20 (1)10.20.20.153:6820 feature set mismatch, my 2f018fb87aa4aafe < server's 2f018ff87aa4aafe, missing 4000000000
[Tue Dec 9 15:46:04 2025] libceph: osd20 (1)10.20.20.153:6820 missing required protocol features
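The mismatch is also visible from the cluster side with generic commands (nothing here is specific to my setup):
Code:
# minimum client release required by the cluster
ceph osd dump | grep require_min_compat_client
# feature sets reported by the currently connected clients;
# kernel rbd/libceph clients typically show up with an older release name here
ceph features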
It looks to me like the backend files for swtpm are mapped via the kernel rbd module, which doesn't support the newest client feature set of Ceph Squid. IMHO this is a bug: the backend files for swtpm should be exported with the same client feature set as the VM disk image files.
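The asymmetry also shows up when comparing the generated QEMU command line with the swtpm invocation above (3033 is the VM from the ps output; this assumes the RBD storage does not have the krbd flag set): the regular disks are attached as rbd: drives through librbd, while the TPM state points at a kernel-mapped /dev/rbd-pve device.
Code:
# regular disks of the same VM go through librbd (rbd: drive definitions):
qm showcmd 3033 --pretty | grep 'rbd:'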