Hi,
we have two clusters, each running Ceph on NVMe drives.
On one cluster, when we run performance tests like "ceph daemon osd.X bench",
we receive the following results:
Good Cluster:
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 0.34241843399999999,
"bytes_per_sec": 3135759402.4859071,
"iops": 747.6233011450546
}
Slow Cluster (four consecutive runs):
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 3.7230354280000002,
"bytes_per_sec": 288404943.96713537,
"iops": 68.761096946510165
}
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 3.7165428359999999,
"bytes_per_sec": 288908771.23742104,
"iops": 68.881218728404292
}
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 3.6945760989999998,
"bytes_per_sec": 290626527.9772222,
"iops": 69.290763849549819
}
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 3.7006634480000002,
"bytes_per_sec": 290148466.37304908,
"iops": 69.176785081159849
}
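As a quick cross-check on the figures above: osd bench derives its throughput and IOPS fields from the other values (bytes_per_sec = bytes_written / elapsed_sec, iops = bytes_per_sec / blocksize), so the numbers are internally consistent and the slow cluster really is sustaining only about 275 MiB/s per OSD versus roughly 3 GiB/s on the good one:

```python
# Recompute the derived fields of "ceph daemon osd.X bench" output
# from bytes_written, elapsed_sec and blocksize (values taken from
# the results pasted above).
good = {"bytes_written": 1073741824, "blocksize": 4194304,
        "elapsed_sec": 0.342418434}
slow = {"bytes_written": 1073741824, "blocksize": 4194304,
        "elapsed_sec": 3.723035428}

for name, r in [("good", good), ("slow", slow)]:
    bps = r["bytes_written"] / r["elapsed_sec"]   # bytes per second
    iops = bps / r["blocksize"]                   # 4 MiB writes per second
    print(f"{name}: {bps / 2**20:8.1f} MiB/s, {iops:6.1f} IOPS")
```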
Any ideas what could cause the second cluster to produce such different results?
We started checking OSD performance because some VMs show ICMP latencies ranging from a few milliseconds up to 3000-5000 ms when we ping them.
The network has already been checked, so we suspect the delays are caused by I/O waits.
Any ideas?
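To back up the I/O-wait theory with numbers on the affected nodes, the iowait share can be sampled straight from /proc/stat (field 6 of the aggregate "cpu" line); a minimal sketch, with "iostat -x 1" and "ceph daemon osd.X dump_historic_ops" as the more detailed follow-ups:

```shell
#!/bin/sh
# Sample the aggregate cpu counters from /proc/stat twice, one second
# apart, and report the percentage of time spent in iowait in between.
read_cpu() { awk '/^cpu /{print $2+$3+$4+$5+$6+$7+$8, $6}' /proc/stat; }

set -- $(read_cpu); total1=$1 iowait1=$2
sleep 1
set -- $(read_cpu); total2=$1 iowait2=$2

pct=$(awk -v t=$((total2 - total1)) -v w=$((iowait2 - iowait1)) \
    'BEGIN { printf "%.1f", (t > 0 ? 100 * w / t : 0) }')
echo "iowait over 1s sample: ${pct}%"
```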
Debian 12 with pve-manager 8.4.1 is installed; the cluster has 82 OSDs.
root@pve-node01:~# ceph -s
  cluster:
    id:     ...
    health: HEALTH_WARN
            7 daemons have recently crashed

  services:
    mon: 3 daemons, quorum pve-node01,pve-node03,pve-node05 (age 2w)
    mgr: pve-node05(active, since 95m), standbys: pve-node03, pve-node01
    osd: 82 osds: 82 up (since 17h), 81 in (since 37h)

  data:
    pools:   2 pools, 2049 pgs
    objects: 11.32M objects, 42 TiB
    usage:   110 TiB used, 173 TiB / 283 TiB avail
    pgs:     2048 active+clean
             1    active+clean+scrubbing+deep

  io:
    client: 17 MiB/s rd, 185 MiB/s wr, 1.18k op/s rd, 4.31k op/s wr
Best regards,
Volker