Hello,
Some information about the system:
It's a hyperconverged cluster of 5 Supermicro AS-1114S-WN10RT servers:
4 of the servers have:
CPU: 128 x AMD EPYC 7702P 64-Core Processor (1 Socket)
RAM: 512 GB
1 of the servers has:
CPU: 64 x AMD EPYC 7502P 32-Core Processor (1 Socket)
RAM: 256 GB
Network:
All servers have a Mellanox Technologies MT28800 Family [ConnectX-5 Ex] with 2x 100 GbE for Ceph.
I installed the newest OFED driver version, MLNX_OFED_LINUX-5.1-2.5.8.0-debian10.3-x86_64 (on the server I built a new repo for the new kernel (5.4.106-1-pve) and installed the new packages from the local repo).
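For reference, this is roughly how I would check that the OFED/mlx5 module was actually rebuilt for and loaded by the new kernel (the interface name is just an example):
Code:
# which mlx5_core module the running kernel picks up, and its version
modinfo mlx5_core | grep -E '^(filename|version)'
# driver/firmware as reported by the interface (replace with your interface name)
ethtool -i enp65s0f0
# if the OFED modules were installed via DKMS, this shows whether they were built for 5.4.106-1-pve
dkms status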
Storage:
Ceph storage: each server has 2 x Micron_9300_MTFDHAL3T2TDR NVMe drives with 3.2 TB.
Each drive has 4 OSDs, so the cluster has 40 OSDs in total, with 1025 PGs.
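A quick way to see the pool/PG layout and the per-OSD distribution described above (the pool used in the benchmarks below is CEPHStor):
Code:
ceph -s                   # overall health and PG states
ceph osd df tree          # per-OSD utilization and placement
ceph osd pool ls detail   # pg_num / pgp_num and replication settings per pool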
On Thursday, 08.04.2021, I upgraded from PVE version 6.3-3 to 6.3-6. I also upgraded the kernel from 5.4.78-2-pve to 5.4.106-1-pve. Since then the Ceph cluster is very slow: normally the I/O wait time inside the VMs is about 7 ms, now it has increased to about 750 ms and up to 2000 ms.
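For reference, this kind of I/O wait can be watched inside a VM with iostat from the sysstat package (assuming a Debian-based guest); the await columns are in milliseconds:
Code:
apt install sysstat
# extended per-device statistics, refreshed every 2 seconds
iostat -x 2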
First I did a network benchmark with iperf; with only one thread I got:
0.0- 6.7 sec 26.8 GBytes 34.5 Gbits/sec
Normally, when doing backups, the network reaches 2 Gbit/s on the 100 GbE interfaces.
So the network seems to be OK, I guess, and I assume that Ceph has a problem.
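The iperf test was nothing special, essentially a plain single-stream run between two nodes on the Ceph network, something like this (the IP is a placeholder):
Code:
# on the receiving node
iperf -s
# on the sending node; a single stream is the default
iperf -c 192.168.100.12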
We also ran some benchmarks on the system with rados bench. The two runs below were started within 10 minutes of each other:
1.
Code:
root@__:~# rados bench -p CEPHStor 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_vs5_83744
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 341 325 1299.9 1300 0.00991154 0.0424592
2 16 356 340 679.93 60 0.391409 0.0522247
3 16 363 347 462.616 28 0.0258595 0.0721364
4 16 387 371 370.958 96 1.57218 0.131835
5 16 402 386 308.765 60 0.0113969 0.16646
6 16 408 392 261.303 24 0.0111479 0.178742
7 16 435 419 239.401 108 0.529584 0.226812
8 16 452 436 217.974 68 0.0106614 0.266152
9 16 461 445 197.754 36 0.96406 0.279443
10 16 466 450 179.979 20 0.0126327 0.286548
11 16 466 450 163.617 0 - 0.286548
Total time run: 11.7593
Total writes made: 466
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 158.513
Stddev Bandwidth: 378.314
Max bandwidth (MB/sec): 1300
Min bandwidth (MB/sec): 0
Average IOPS: 39
Stddev IOPS: 94.5785
Max IOPS: 325
Min IOPS: 0
Average Latency(s): 0.401447
Stddev Latency(s): 0.917697
Max latency(s): 5.38532
Min latency(s): 0.00813231
root@___:~# rados bench -p CEPHStor 10 rand
hints = 1
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 105 89 355.93 356 0.00395163 0.0381291
2 16 134 118 235.962 116 0.00667274 0.143368
3 16 160 144 191.972 104 0.00561947 0.25516
4 16 203 187 186.974 172 0.00612447 0.224298
5 16 366 350 279.962 652 0.00742376 0.225647
6 16 386 370 246.633 80 0.00378491 0.217095
7 16 397 381 217.686 44 0.00523762 0.23806
8 16 399 383 191.475 8 0.00763485 0.242517
9 16 516 500 222.194 468 0.00886989 0.251759
10 16 516 500 199.974 0 - 0.251759
11 16 516 500 181.794 0 - 0.251759
12 15 516 501 166.979 1.33333 6.47211 0.264174
13 13 516 503 154.75 8 3.7695 0.278351
14 13 516 503 143.696 0 - 0.278351
15 12 516 504 134.383 2 6.6732 0.291039
16 11 516 505 126.234 4 9.95319 0.310172
Total time run: 16.0488
Total reads made: 516
Read size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 128.607
Average IOPS: 32
Stddev IOPS: 49.1677
Max IOPS: 163
Min IOPS: 0
Average Latency(s): 0.470387
Max latency(s): 11.0187
Min latency(s): 0.00254663
2.
Code:
root@___:~# rados bench -p CEPHStor 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_vs5_87907
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 16 16 0 0 0 - 0
1 16 30 14 55.9724 56 0.492942 0.266121
2 16 36 20 39.9879 24 0.273857 0.340842
3 16 41 25 33.3256 20 2.16337 0.661045
4 16 55 39 38.9924 56 0.492101 0.779455
5 16 61 45 35.9937 24 0.0132808 0.991254
6 16 67 51 33.9946 24 2.99239 1.12742
7 16 69 53 30.2811 8 6.43563 1.22443
8 16 72 56 27.996 12 0.0119884 1.25493
9 16 72 56 24.8852 0 - 1.25493
10 16 72 56 22.3968 0 - 1.25493
11 14 72 58 21.0879 2.66667 7.47855 1.42587
12 10 72 62 20.6637 16 5.5864 1.6983
13 4 72 68 20.9201 24 6.16283 2.28117
14 4 72 68 19.4259 0 - 2.28117
15 4 72 68 18.1309 0 - 2.28117
16 1 72 71 17.7477 4 7.74045 2.66913
Total time run: 16.5578
Total writes made: 72
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 17.3936
Stddev Bandwidth: 18.045
Max bandwidth (MB/sec): 56
Min bandwidth (MB/sec): 0
Average IOPS: 4
Stddev IOPS: 4.54927
Max IOPS: 14
Min IOPS: 0
Average Latency(s): 2.80683
Stddev Latency(s): 3.53249
Max latency(s): 13.6394
Min latency(s): 0.00975463
In older benchmarks on an empty Ceph cluster I reached about 6 GB/s; now the bandwidth is unstable and slow. In addition, I noticed that one server has a high load, with up to 300% CPU usage on its OSD processes. That server also has a higher apply and commit latency. The server with the high load changed once the previous server crashed with a max latency of 88 ms on one OSD.
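For the per-OSD numbers, the commit/apply latency and the CPU usage of the OSD daemons can be checked with something like this (the OSD id is only an example):
Code:
ceph osd perf                       # commit/apply latency in ms per OSD
top -c -p "$(pgrep -d, ceph-osd)"   # CPU usage of the ceph-osd processes
ceph tell osd.12 bench              # write benchmark against a single OSD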
Maybe this output helps:
Code:
:~# pveversion -V
proxmox-ve: 6.3-1 (running kernel: 5.4.106-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
ceph: 15.2.10-pve1
ceph-fuse: 15.2.10-pve1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.13-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-9
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-5
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-10
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1
I hope somebody can help me figure this out.
Is there a possibility to downgrade back to Ceph version 15.2.8?
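I am not sure whether a downgrade within Octopus is supported at all; if the old packages are still in the configured Ceph repository, I guess it would be an apt downgrade along these lines, where the exact package list and version string are assumptions on my part:
Code:
apt-cache policy ceph-osd    # check whether a 15.2.8 build is still offered
apt install ceph=15.2.8-pve1 ceph-base=15.2.8-pve1 ceph-common=15.2.8-pve1 \
    ceph-mon=15.2.8-pve1 ceph-mgr=15.2.8-pve1 ceph-osd=15.2.8-pve1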