Proxmox VE Ceph Benchmark 2020/09 - hyper-converged with NVMe

As Ceph uses a 4M "block size" I would rather test around changing the NVMe's blocksize from 512K to 4M.
On my Intel SSDPE2KX080T8 NVMe disks I see two LBA formats:
LBA Format 0 : Metadata Size: 0 bytes - Data Size: 512 bytes - Relative Performance: 0x2 Good
LBA Format 1 : Metadata Size: 0 bytes - Data Size: 4096 bytes - Relative Performance: 0 Best (in use)
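
For reference, a minimal sketch of how these formats can be listed and switched with nvme-cli; the namespace path is an example, and reformatting erases all data on the namespace:

# List the LBA formats supported by the namespace (example path)
nvme id-ns /dev/nvme0n1 --human-readable | grep "LBA Format"

# Switch to the 4096-byte LBA format (index 1 above).
# WARNING: this destroys all data on the namespace.
nvme format /dev/nvme0n1 --lbaf=1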

You wrote 512K to 4M; is that correct, or did you mean 512B to 4KB? I can't find anything about a 4M LBA format.

Thanks again!
 
I only found out about msecli from this ZFS benchmark thread and had not considered it for my benchmarks back then.
So yes, I was wrong - it should be 4KB NVMe block size.
And the default Ceph block size is 4MB - no idea if Proxmox makes any changes to the RBDs here.
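
For what it's worth, the object size of an existing RBD image can be checked directly; a sketch, with the pool and image names as examples only:

# Show the image's object size; the default order 22 corresponds to 4 MiB objects
rbd info vm_nvme/vm-100-disk-0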
 
So yes, I was wrong - it should be 4KB NVMe block size.
Some NVMe drives allow setting a 4 MB allocation size, though not the Micron 9300. And it doesn't make any difference whether it is 4 KB or 512 B, not with Ceph anyway.

And the default Ceph block size is 4MB - no idea if Proxmox makes any changes to the RBDs here.
It's as vanilla as it gets. You can try changing the stripe size to get better write performance at the cost of some read performance though.
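
As an illustration of what changing the striping looks like, here is a sketch using rbd's striping options; the pool/image names and the values are examples, not a recommendation:

# Create an image with 4 MiB objects, striped in 64 KiB units across 8 objects
# (defaults are stripe-unit = object size and stripe-count = 1)
rbd create vm_nvme/test-stripe --size 100G --object-size 4M --stripe-unit 64K --stripe-count 8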
 
How did you benchmark the rados bench reads? Did you create writes on each host with its own run-name and then run rados bench read with that run-name specified? Does each client (read) have its own run-name (writes) to read from?

Does it make a difference if two rados clients (read) use the same run-name (-> the data read is the same for both clients)?
 
IIRC, first the write benchmarks with the --no-cleanup option, then the read benchmarks. Both times with a unique name per node via --run-name.

Not sure if there is much of a difference, but I assume it could have a bit of an effect if two clients use the same data. Since we had benchmarked writes before, we wanted to keep them separate anyway.
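
A minimal sketch of that workflow per node; the pool name and run-name are examples:

# Write phase: keep the benchmark objects around for the read phase
rados bench -p vm_nvme 60 write -b 4M -t 16 --no-cleanup --run-name $(hostname)

# Read phase: sequential reads of the objects written under this node's run-name
rados bench -p vm_nvme 60 seq -t 16 --run-name $(hostname)

# Remove the benchmark objects afterwards
rados -p vm_nvme cleanup --run-name $(hostname)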
 

Seems like it does not really make a difference (we tested it). We did 4x 25 Gbit meshed round robin with 3 nodes and got these values:

6x (2 per node) rados bench 60 seq -t 16 -p vm_nvme --run-name UNIQUE-NAME
SUM: 13.8 GB/s

6x (2 per node) rados bench 60 write -b 4M -t 16 --no-cleanup -p vm_nvme --run-name UNIQUE-NAME
SUM: 4.8 GB/s
 