Hi,
I'm running a Proxmox cluster with 5 nodes and pure-SSD Ceph storage (currently about 20 OSDs, all enterprise-grade Intel S3710/S4500, BlueStore). The nodes are connected through a 10Gbit network. Storage is about 50% full. Everything (system, Proxmox, Ceph) is updated to the latest versions. On top of that there are dozens of KVM machines running Linux. During normal hours the load is very light (read/write about 20 MB/s, below 1000 IOPS, Ethernet traffic below 100 MB/s). Ceph has two pools, 576 (512+64) PGs total (so about 30 PGs per OSD).
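For reference, this is roughly how I check the pool layout and the per-OSD PG counts; it is just the standard Ceph CLI, exact output omitted:

# pools, replica size, pg_num
ceph osd pool ls detail

# per-OSD utilization and PG count (PGS column)
ceph osd df tree

# overall health, including any slow request warnings
ceph -s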
I have discovered that there is sometimes a noticeable lag across the whole cluster. I can measure it even during "normal" load (no heavy I/O task running), for example with ioping ("ioping -DW -c 60", run 10 times, so it covers a 10-minute window; the exact loop is shown after the numbers below). As you can see, there are sometimes 5-second latencies:
min/avg/max/mdev = 1.8 ms / 70.0 ms / 906.7 ms / 191.6 ms
min/avg/max/mdev = 1.9 ms / 198.7 ms / 4.4 s / 682.1 ms
min/avg/max/mdev = 1.9 ms / 41.6 ms / 1.9 s / 251.3 ms
min/avg/max/mdev = 1.8 ms / 11.7 ms / 258.3 ms / 47.1 ms
min/avg/max/mdev = 1.5 ms / 251.8 ms / 5.0 s / 956.0 ms
min/avg/max/mdev = 1.9 ms / 11.4 ms / 426.0 ms / 55.7 ms
min/avg/max/mdev = 1.5 ms / 37.1 ms / 1.5 s / 195.4 ms
min/avg/max/mdev = 1.8 ms / 79.4 ms / 1.7 s / 321.2 ms
min/avg/max/mdev = 1.8 ms / 2.3 ms / 4.5 ms / 474 us
min/avg/max/mdev = 1.7 ms / 4.9 ms / 49.1 ms / 8.2 ms
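For completeness, this is more or less the loop that produced the numbers above: 10 runs of 60 direct-I/O write pings, one request per second. The test directory is just a placeholder for a path inside one of the VMs:

# ~10 minute window: 10 x 60 one-second direct write pings, keep only the summary line
for i in $(seq 1 10); do
    ioping -DW -c 60 /some/test/dir | grep mdev
done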
It is even worse during heavy I/O operations like an rsync of millions of small files. Such an operation affects not only the KVM machine it runs in (that one becomes almost unusable) but the whole cluster. While it is running I can see high OSD latencies (both apply and commit) in the Proxmox interface, sometimes way above 100 ms. During normal hours the displayed latency is usually zero (or small numbers like 2 ms).
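To see the same thing outside the GUI, I can watch per-OSD commit/apply latency from the CLI while the rsync runs (again nothing special, just standard Ceph tooling; osd.0 is only an example):

# per-OSD commit/apply latency in ms, refreshed every 2 seconds
watch -n 2 'ceph osd perf'

# recent slow/in-flight operations on a specific OSD (run on the node hosting it)
ceph daemon osd.0 dump_historic_ops | less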
I tried to measure packet loss with ping and omping (everything smooth) and network throughput with iperf (which reached about 9.8 Gbps).
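Roughly what I ran for the network checks (node names are placeholders):

# packet loss / latency between nodes
ping -c 1000 -i 0.2 node2

# multicast check across the cluster nodes (run on each node)
omping -c 600 -i 1 node1 node2 node3 node4 node5

# raw TCP throughput: server on one node, client on another
iperf -s                 # on node1
iperf -c node1 -t 30     # on node2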
Do you have any tips on where to look?
Thanks!