proxmox + ceph = slow performance?

dominiaz

Renowned Member
Sep 16, 2016
I set up a new cluster with Ceph storage on 3 identical nodes (2x Xeon, 4TB RAID5).

When I try to restore a backup from NFS to Ceph storage I get only 5 MB/s. Installing from an ISO image is also very slow.

When I copy the same backup file directly to /var/lib/ceph/osd/ceph-1 I get 500-600 MB/s.

What is wrong?
 
3 OSDs - 1 OSD per server. Each OSD is 4TB (5x 1TB HDD in RAID5).

The pool is:

Size: 3
Min: 1
pg_num: 300

LAN is 10Gb/s
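
For reference, the pool and cluster settings can be verified from any node with the ceph CLI; a minimal sketch, assuming the pool is the ceph_storage pool used later in the thread:

Code:
# replication and placement-group settings of the pool
ceph osd pool get ceph_storage size
ceph osd pool get ceph_storage min_size
ceph osd pool get ceph_storage pg_num

# overall cluster health and OSD layout
ceph -s
ceph osd tree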
 
3 OSDs - 1 OSD per server. Each OSD is 4TB (5x 1TB HDD in RAID5).

That is a particularly bad setup for Ceph. You need many more OSDs when you use slow (spinning) disks (>24).

Note: ceph.com recommended a minimum of 100 OSDs in the past.

If you use fast SSDs only, a reasonable start is 12 OSDs, but IMHO that is the absolute minimum. 3 OSDs will not work well with Ceph.
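
To get a feel for what a single RAID5-backed OSD can deliver on its own, Ceph's built-in OSD bench could be used; a minimal sketch, assuming osd.0 lives on one of these nodes (by default it writes about 1 GiB of test data in 4 MiB blocks through that OSD):

Code:
# write benchmark against a single OSD's backend
ceph tell osd.0 bench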
 
How can I make 100 OSDs if I have only 15 HDDs in 3 servers?

Should I remove the RAID5 and make 15 OSDs instead?
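
If the RAID sets were broken up, each raw disk would become its own OSD. A minimal sketch of how that could look on one Proxmox VE node, assuming /dev/sdb through /dev/sdf are the freed disks (device names are an assumption) and using the pveceph tooling of that era:

Code:
# wipe leftover RAID/partition signatures first (destroys all data on the disk!)
ceph-disk zap /dev/sdb

# create one OSD per raw disk
# (older Proxmox VE: pveceph createosd, newer releases: pveceph osd create)
pveceph createosd /dev/sdb
pveceph createosd /dev/sdc
pveceph createosd /dev/sdd
pveceph createosd /dev/sde
pveceph createosd /dev/sdf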
 
Yes, I have a 10 Gbit LAN.

I made 15 OSDs and the speed is really the same, 5 MB/s.

What is wrong?
 
Hi,
I think two problems are being mixed together here:

1. restore speed to Ceph
2. Ceph single-thread performance

There are threads in this forum for both points.
On point one, there was an update a while ago which sped up the restore, but IMHO it's still quite slow...
Ceph does not have the best single-thread performance (Ceph likes access from multiple (many) VMs to many, many OSDs). If you have only one VM/thread and only a few OSDs, the read performance won't be huge.
You can tune your ceph.conf for better values.
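
As an illustration of the kind of ceph.conf tuning meant here (not settings taken from this thread), a minimal sketch with client-side RBD cache and read-ahead options; the exact values are assumptions and should be tested against your own workload:

Code:
[client]
    # client-side write-back cache for RBD (VM disks)
    rbd cache = true
    rbd cache writethrough until flush = true
    # larger read-ahead helps single-threaded sequential reads
    rbd readahead max bytes = 4194304
    rbd readahead disable after bytes = 0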

For testing run "rados bench" an the host, like this:
Code:
rados bench -p rbd 60 write --no-cleanup     # no-cleanup is important to leave the data for reading

# after that clear the cache on all osd-nodes
echo 3 > /proc/sys/vm/drop_caches

# now reading with 16 threads
rados bench -p rbd 60 seq --no-cleanup

# again clear the cache on all nodes
echo 3 > /proc/sys/vm/drop_caches

# read with one thread only
rados bench -t 1 -p rbd 60 seq --no-cleanup
Udo
 
Now I have 25 OSDs.

There must be some problem. I can get 800 MB/s inside a VM on the same server without Ceph, but on Ceph I get only 100 MB/s inside the VM. WTF?

root@prox-a1:~# rados bench -p ceph_storage 60 write --no-cleanup

Total time run: 61.044912
Total writes made: 750
Write size: 4194304
Bandwidth (MB/sec): 49.1441
Stddev Bandwidth: 28.9756
Max bandwidth (MB/sec): 112
Min bandwidth (MB/sec): 0
Average IOPS: 12
Average Latency(s): 1.3003
Stddev Latency(s): 0.8181
Max latency(s): 5.84128
Min latency(s): 0.0897775

root@prox-a1:~# rados bench -p ceph_storage 60 seq --no-cleanup

Total time run: 17.536572
Total reads made: 750
Read size: 4194304
Bandwidth (MB/sec): 171.071
Average IOPS: 42
Average Latency(s): 0.371262
Max latency(s): 8.25948
Min latency(s): 0.00641081

root@prox-a1:~# rados bench -t 1 -p ceph_storage 60 seq --no-cleanup

Total time run: 42.994229
Total reads made: 750
Read size: 4194304
Bandwidth (MB/sec): 69.7768
Average IOPS: 17
Average Latency(s): 0.0566783
Max latency(s): 0.307723
Min latency(s): 0.00647915




HOST:

hdparm -tT /dev/sdb

/dev/sdb:
Timing cached reads: 20918 MB in 2.00 seconds = 10461.83 MB/sec
Timing buffered disk reads: 2528 MB in 3.00 seconds = 842.40 MB/sec



VM on CEPH (10G LAN)

hdparm -tT /dev/vda

/dev/vda:
Timing cached reads: 12706 MB in 2.00 seconds = 6457.19 MB/sec
Timing buffered disk reads: 298 MB in 3.06 seconds = 97.30 MB/sec
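
hdparm measures only a single sequential reader, so it is closer to the single-thread rados bench case than to the 16-thread one. For a comparison with more parallelism, something like fio could be run inside the VM; a minimal sketch, assuming fio is installed and /dev/vda is the Ceph-backed disk (read-only, so non-destructive):

Code:
# 4M sequential reads, 16 outstanding requests, direct I/O, 60 seconds
fio --name=seqread --filename=/dev/vda --rw=read --bs=4M \
    --ioengine=libaio --iodepth=16 --direct=1 \
    --runtime=60 --time_based --readonly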
 
For a test, try adding iothread: 1 to the VM's disk. (At the moment doing this means you cannot use Proxmox's backup solution for these disks.)
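
As an illustration only, with a hypothetical VMID 100 and disk volume name, the iothread flag could be set via the CLI or directly in the VM config:

Code:
# via the CLI (hypothetical VMID and volume name)
qm set 100 -virtio0 ceph_storage:vm-100-disk-1,iothread=1

# resulting line in /etc/pve/qemu-server/100.conf
virtio0: ceph_storage:vm-100-disk-1,iothread=1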
 
The results are pretty much the same:

Timing buffered disk reads: 446 MB in 3.01 seconds = 148.30 MB/sec