Hi,
I have recently deployed a 4-node Proxmox Ceph cluster. Each node has the following:
2 x Xeon Gold 6130
384 GB RAM
5 x Samsung PM1733 NVMe
2 x 25GbE dedicated Ceph links, bonded
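The Ceph bond is defined in /etc/network/interfaces roughly like this (interface names, addressing and hash policy below are illustrative rather than my exact config; happy to post the real one if it helps):

auto bond0
iface bond0 inet static
        # illustrative example only: two 25GbE ports bonded for the dedicated Ceph network
        address 10.10.10.11/24
        bond-slaves enp65s0f0 enp65s0f1
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        bond-miimon 100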
I don't seem to be getting the same level of performance as I did previously with vSAN ESA, and I am wondering if there are any performance tweaks I should try.
I am using 4 VMs, 1 on each host, to run fio with:
Random read (rwmixread=100):
fio --filename=/root/fio.bin --size=10GB --direct=1 --rw=randrw --rwmixread=100 --ioengine=libaio --iodepth=16 --runtime=60 --numjobs=16 --time_based --group_reporting --name=iops-test-job --bs=4k
1 VM I get ~70K IOPS
2 VM I get ~135K IOPS
3 VM I get ~150K IOPS
4 VM I get ~160K IOPS
Random write (rwmixread=0):
fio --filename=/root/fio.bin --size=10GB --direct=1 --rw=randrw --rwmixread=0 --ioengine=libaio --iodepth=16 --runtime=60 --numjobs=16 --time_based --group_reporting --name=iops-test-job --bs=4k
1 VM I get ~45K IOPS
2 VM I get ~70K IOPS
3 VM I get ~75K IOPS
4 VM I get ~80K IOPS
Looking at the disk utilisation with iostat, the drives don't look too stressed; they report around 30% utilisation during the tests.
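For reference, I was watching the NVMe devices on each node with something like this (exact flags from memory; %util is the column I was going by):

iostat -x 2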
I have also tested performance after taking 16 disks out of the cluster, leaving just 1 OSD per node, and was able to reach similar performance to the above, so it does feel like there is a bottleneck somewhere other than the disks. Network usage when testing is around 4-5 Gbps total for each node, so the bonded 2 x 25GbE links are nowhere near saturated either.
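That network figure is roughly how I read it from sysstat's per-interface counters (interface/bond name aside):

sar -n DEV 2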
Any ideas would be appreciated.