Very poor storage performance in PVE ceph

Jordan.zhang

New Member
May 13, 2024
I tried to benchmark the storage performance of PVE Ceph, but the results were very low. I also tried a few ways to optimize the test conditions, but nothing made much of a difference. Any suggestions are appreciated.

Hardware configuration:
Node: 4
CPU: 2x 6140, 18 cores, 2.3 GHz
MEM: 128 GB
SSD: 1x 800 GB NVMe
HDD: 2x 4 TB SATA
NIC: 4x 1 Gb + 2x 10 Gb

Attachments

  • Screenshot 2024-06-20 3.46.46 PM.png (237.7 KB)
  • Screenshot 2024-06-20 3.47.37 PM.png (69.8 KB)
  • Screenshot 2024-06-20 3.51.15 PM.png (328.3 KB)
  • Screenshot 2024-06-20 3.53.41 PM.png (44.1 KB)
  • Screenshot 2024-06-20 4.04.15 PM.png (312.6 KB)
  • Screenshot 2024-06-20 4.17.45 PM.png (66.8 KB)
Would you be ok to run lsblk -o tran,name,type,size,vendor,model,label,rota,log-sec,phy-sec on one of the Proxmox servers having the problem?

It'll give us detailed info about the disks themselves and save a lot of back-and-forth questions.

The standard lsblk output (like you've provided) is missing a bunch of important, useful info. :)
 
What is low? The bandwidth or IOPS? You did a FIO bench with 4k block size. This way, you will run into IOPS limits, but not bandwidth limits.
Changing the blocksize to, for example, 4M would cause it to rather run into bandwidth limits.
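As a rough sketch (the file path is just a placeholder for something on your Ceph-backed storage), the two cases could look like this:

# 4k random writes: this run hits IOPS limits, not bandwidth limits
fio --name=iops-test --filename=/mnt/cephtest/fio.bin --size=10G \
    --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio --direct=1 \
    --runtime=60 --time_based

# 4M sequential writes: this run hits bandwidth limits instead
fio --name=bw-test --filename=/mnt/cephtest/fio.bin --size=10G \
    --rw=write --bs=4M --iodepth=16 --ioengine=libaio --direct=1 \
    --runtime=60 --time_based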

Keep in mind that Ceph will usually issue sync write commands and is very latency sensitive. Therefore, using SSDs with power loss protection (PLP) is recommended.
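A quick way to see how a single disk copes with that pattern is a single-threaded sync write test, roughly like this (/dev/sdX is a placeholder, and this writes to the device directly, so only run it against an empty disk):

# Low results here (HDDs often manage only a few hundred IOPS)
# translate directly into slow Ceph writes
fio --name=sync-lat --filename=/dev/sdX --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --direct=1 --sync=1 \
    --runtime=60 --time_based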

Make sure that the network works reliably and can achieve the expected speeds, for example with iperf.
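For example, something like this (10.0.0.1 is a placeholder for the other node's cluster-network IP; a healthy 10 Gbit link should report roughly 9.4 Gbit/s):

# on one node
iperf3 -s

# on another node
iperf3 -c 10.0.0.1 -t 30 -P 4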
 
justinclift said:
> Would you be ok to run lsblk -o tran,name,type,size,vendor,model,label,rota,log-sec,phy-sec on one of the Proxmox servers having the problem?
Hi Justinclift,

Thanks for your reply. I've attached the verbose lsblk output.
 

Attachments

  • Screenshot 2024-06-21 9.26.48 AM.png (233.1 KB)
Cool. I looked through the model numbers of the pieces to work out your exact setup.
I'm having trouble finding the data sheet, or really any kind of detailed information on the 4TB Toshiba drives.

There's product info for the 1TB and 2TB drives in that series (here), but nothing definitive for the 4TB one. I can't even tell whether they're using CMR or SMR, which is of critical importance performance-wise.

IF those 4TB drives use SMR, you'll probably need to go and find some alternative drives that don't. ;)
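If you want to poke at it from the OS side, one rough check is below (sdb is a placeholder for one of the 4TB drives). Caveat: this only catches host-aware/host-managed SMR; drive-managed SMR typically still reports "none" here, so a negative result proves nothing.

# "host-aware" or "host-managed" means SMR; "none" is inconclusive
cat /sys/block/sdb/queue/zoned

# some drives also state the recording technology in their identify data
smartctl -i /dev/sdb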

That being said, I'm not a Ceph guy (not yet anyway), so this is about where my knowledge ends. @aaron (or others) may be able to look at those specs and have helpful suggestions though. :)
 
Hello everyone, after I readjusted the network, performance reached what I think is a reasonable range. Thank you all for the clues and suggestions.
 

Attachments

  • Screenshot 2024-06-21 1.19.42 PM.png (283.8 KB)
