Degraded Windows VM Performance on SSDs with Ceph Cluster

Jan 6, 2020
Hi there,

We have a Proxmox cluster of 6 hosts with SSDs. The SSDs are pooled in a Ceph cluster. Before we used Ceph, the nodes had local storage only and everything worked fine. But now, newly created Windows VMs have a performance problem: every action takes about 3-5 seconds to complete.

We have 5 OSDs per host, 30 in total. pg_num is 1024.
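(For context, 1024 PGs for 30 OSDs at size 3 roughly matches the common target of about 100 PGs per OSD: 30 * 100 / 3 = 1000, rounded up to the next power of two. A quick way to check the pool settings; the pool name "vm-pool" is only a placeholder:)
Code:
ceph osd pool get vm-pool pg_num
ceph osd pool get vm-pool size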

Does someone have an idea? Thank you.
 
What network do you have connecting each of the hosts? 1Gbps / 10Gbps?
 
They're connected with 10Gbps.

You have 30 SSDs being shared over a single 10Gbps network? Do you have a separate private network, so you have 20Gbps of capacity per node?

You are likely running into bottlenecks because of the network.
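(For illustration: a separate Ceph cluster network means distinct subnets for public_network and cluster_network in ceph.conf, roughly like the sketch below; the 10.10.20.0/24 subnet is only an example:)
Code:
[global]
         # client/monitor traffic
         public_network  = 10.10.10.0/24
         # OSD replication and heartbeat traffic on a second link (example subnet)
         cluster_network = 10.10.20.0/24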
 
So we did a few tests with fio in a VM. We used 1, 5 and 10 threads (numjobs). Here are the results:
Host 1: IOPS 322 / 652 / 970, BW 1290KiB/s / 2611KiB/s / 3883KiB/s
Host 2: IOPS 344 / 800 / 1129, BW 1377KiB/s / 3201KiB/s / 4520KiB/s
Host 3: IOPS 316 / 740 / 1035, BW 1265KiB/s / 2960KiB/s / 4141KiB/s
Host 4: IOPS 440 / 939 / 1172, BW 1762KiB/s / 3759KiB/s / 4691KiB/s

If we disable write cache on the hosts, the results are lower (again with 1, 5 and 10 threads):
Host 1: IOPS 310 / 582 / 1000, BW 1242KiB/s / 2332KiB/s / 4002KiB/s
Host 2: IOPS 327 / 776 / 1043, BW 1311KiB/s / 3107KiB/s / 4172KiB/s
Host 3: IOPS 296 / 639 / 910, BW 1186KiB/s / 2558KiB/s / 3643KiB/s
Host 4: IOPS 401 / 867 / 1143, BW 1607KiB/s / 3470KiB/s / 4573KiB/s
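
(A fio invocation of roughly this shape could be used for such a test; the 4k synchronous random-write workload and the file path below are assumptions, not necessarily the exact command we ran:)
Code:
# example only: 4k synchronous random writes, repeated with --numjobs=1, 5 and 10
fio --name=ceph-test --filename=/root/fio.test --size=1G \
    --rw=randwrite --bs=4k --ioengine=libaio --iodepth=1 \
    --direct=1 --sync=1 --numjobs=1 --runtime=60 --time_based \
    --group_reporting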

The network is fast enough; here is a test done with iperf:
Code:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.14, port 51890
[ 5] local 10.10.10.15 port 5201 connected to 10.10.10.14 port 51892
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 989 MBytes 8.29 Gbits/sec
[ 5] 1.00-2.00 sec 1.01 GBytes 8.69 Gbits/sec
[ 5] 2.00-3.00 sec 1.04 GBytes 8.95 Gbits/sec
[ 5] 3.00-4.00 sec 1.07 GBytes 9.16 Gbits/sec
[ 5] 4.00-5.00 sec 1.04 GBytes 8.97 Gbits/sec
[ 5] 5.00-6.00 sec 1.02 GBytes 8.78 Gbits/sec
[ 5] 6.00-7.00 sec 1.04 GBytes 8.90 Gbits/sec
[ 5] 7.00-8.00 sec 1.07 GBytes 9.15 Gbits/sec
[ 5] 8.00-9.00 sec 1.04 GBytes 8.92 Gbits/sec
[ 5] 9.00-10.00 sec 1.03 GBytes 8.89 Gbits/sec
[ 5] 10.00-10.00 sec 693 KBytes 8.73 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 10.3 GBytes 8.87 Gbits/sec receiver
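
(Output like the above comes from a plain iperf3 run; the options shown here are an assumption, not necessarily the exact invocation:)
Code:
# on the receiving host (10.10.10.15)
iperf3 -s
# on the sending host (10.10.10.14)
iperf3 -c 10.10.10.15 -t 10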

Our network interfaces look like this (example from host1):
Code:
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface ens2 inet manual

iface ens2d1 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves ens2 ens2d1
        bond-miimon 100
        bond-mode active-backup

auto vmbr0
iface vmbr0 inet static
        address <publicIP>
        netmask <netmask>
        gateway <gateway>
        bridge-ports bond0.111
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes

auto vmbr2
iface vmbr2 inet static
        address 10.10.10.14
        netmask 255.255.255.0
        bridge-ports bond0.116
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes

auto vmbr1
iface vmbr1 inet manual
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

And this is our ceph.conf:
Code:
[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster_network = 10.10.10.0/24
         fsid = 0854b677-1518-4011-b99f-9b0a940d629f
         mon_allow_pool_delete = true
         mon_host = 10.10.10.16 10.10.10.14 10.10.10.15
         osd_pool_default_min_size = 2
         osd_pool_default_size = 3
         public_network = 10.10.10.0/24

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring
 
Your iperf result is not fast enough... you need to enable jumbo frames on your 10GbE network. It should reach speeds of about 9.6~9.8 Gbps.
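
(A minimal sketch of jumbo frames in /etc/network/interfaces, reusing the bond and bridge names from the config above; the MTU has to be raised on the slave NICs, the bond and the bridge, and the switch ports must allow it as well:)
Code:
# example only: MTU 9000 end-to-end on the Ceph path
iface ens2 inet manual
        mtu 9000

iface ens2d1 inet manual
        mtu 9000

auto bond0
iface bond0 inet manual
        bond-slaves ens2 ens2d1
        bond-miimon 100
        bond-mode active-backup
        mtu 9000

auto vmbr2
iface vmbr2 inet static
        address 10.10.10.14
        netmask 255.255.255.0
        bridge-ports bond0.116
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        mtu 9000

# verify the path afterwards, e.g.:
# ping -M do -s 8972 10.10.10.15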

Also, your bonding mode is active-backup... can you use an LACP bond instead?
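
(Sketch of an LACP bond in /etc/network/interfaces, assuming the switch side is configured for 802.3ad as well; the hash policy is just an example:)
Code:
auto bond0
iface bond0 inet manual
        bond-slaves ens2 ens2d1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4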

You are using a Linux bridge to set up your VLAN... your network config is really not optimized for Ceph... the Ceph traffic should not go through any bridge at all. Does your NIC support SR-IOV?
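
(A minimal sketch of putting the Ceph address on the VLAN interface directly instead of on a bridge; interface and VLAN names are taken from the config above and are only an example:)
Code:
auto bond0.116
iface bond0.116 inet static
        address 10.10.10.14
        netmask 255.255.255.0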
 
