Hi everyone,
I’m facing a massive performance drop when accessing RBD volumes from inside VMs compared to direct access from the Proxmox host. I’d like your feedback to understand if this is expected or if something can be tuned — without giving up live migration.
Setup details:
- Ceph version: Reef
- Proxmox VE: 8.2.x
- Hardware: 8-node full-NVMe cluster (Micron 7300, 7×7.68TB per host)
- Ceph pool: RBD, replicated, size 4 / min_size 2 (for dual-site failure tolerance)
- Network: 2×40Gbps bonded, low latency (< 0.5ms), VXLAN-based fabric
- Tested clients:
  - Host via rbd map
  - VMs with Proxmox-integrated RBD disks (virtio-scsi + iothread); the storage definition is sketched below
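For reference, the storage definition is along these lines (storage ID and pool name here are placeholders, not the real ones; on a hyperconverged PVE/Ceph cluster the monitor list can be omitted):
Code:
rbd: ceph-rbd
        pool rbd
        content images
        krbd 1
With krbd 1 set, Proxmox maps the image on the host through the kernel RBD client and hands QEMU a block device instead of going through librbd.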
Benchmark results:
1. On the Proxmox host (direct access via rbd map):
Code:
fio --name=randread --filename=/dev/rbd0 --rw=randread --bs=8k --iodepth=32 --numjobs=1 --runtime=30 --ioengine=libaio --group_reporting
Result: ~115k IOPS, avg latency ~270 µs
2. Inside the VM (virtio-scsi + iothread, disk on the RBD storage):
Code:
fio --name=randread --filename=/dev/sdX --rw=randread --bs=8k --iodepth=32 --numjobs=1 --runtime=30 --ioengine=libaio --group_reporting
Result: ~500 IOPS, avg latency > 2 ms
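For completeness, the disk attachment in the VM config is roughly the following (storage ID, VMID and size are placeholders); as far as I understand, the per-disk iothread option is only honoured for SCSI disks when scsihw is virtio-scsi-single:
Code:
scsihw: virtio-scsi-single
scsi0: ceph-rbd:vm-101-disk-0,cache=none,discard=on,iothread=1,size=100G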
Additional notes:
- KRBD is enabled on the RBD storage definition in Proxmox, but the VM disks still show up as /dev/sdX inside the guest
- Changing cache, aio, and controller types (virtio-blk vs scsi) doesn’t improve things
- Using rbd map on the host and exposing a raw LVM volume to the VM gives ~14k IOPS: better, but still well below native (rough commands are sketched after this list)
- This obviously breaks migration, snapshots, etc.
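The rbd map + LVM variant was roughly the following (pool, image, VG/LV names and VMID are placeholders):
Code:
# map the image with the kernel client on the host (shows up as /dev/rbd0)
rbd map rbd/testvol
# build an LVM volume on top of the mapped device
pvcreate /dev/rbd0
vgcreate vg_rbdtest /dev/rbd0
lvcreate -l 100%FREE -n lv_bench vg_rbdtest
# attach the LV to the VM as a raw disk -- this is the part that breaks migration and snapshots
qm set 101 -scsi1 /dev/vg_rbdtest/lv_bench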
My question:
Why is the I/O performance roughly 200× worse inside a VM (~500 IOPS) than from the host directly (~115k IOPS), despite using iothread, virtio-scsi, and KRBD?
Is this a known limitation of:
- QEMU’s librbd integration? (a quick way to check which backend a VM actually uses is sketched after this list)
- Proxmox’s disk layer?
- Something else I’m missing?
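For anyone wanting to double-check which path a given VM actually takes, one way is to inspect the generated QEMU command line (VMID 101 is a placeholder; output details vary by PVE version):
Code:
# print the QEMU command line Proxmox generates for the VM and pull out the drive source
qm showcmd 101 | tr ' ' '\n' | grep '^file='
# file=rbd:<pool>/<image>,... -> userspace librbd
# file=/dev/rbd...            -> kernel client (krbd)
# RBD images currently mapped by the kernel on this host:
rbd showmapped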
Any advice or similar experience welcome!
Thanks in advance,
Florian (8-node Proxmox + Ceph datacenter, NVMe-only, dual-site design)