Compellent SC5020 All-Flash Read Latency.

Rabi Hanna

New Member
Jul 24, 2018
Hi Guys,

We have started using Proxmox, migrating from VMware. I currently have 4 Proxmox nodes in a cluster and have already migrated almost 40 VMs.

Everything is working as expected, and migration doesn't take that long.

We are using a Compellent SC5020 as shared storage (VMware also uses the same array), and I have 50TB.

What I have noticed is that the read latency for the Proxmox nodes is always stuck at 10ms. I opened a ticket with Compellent CoPilot support, but they couldn't figure out what is causing it; they suggested changing or disabling delayed ACK, but that setting is documented for Red Hat and I can't find any reference for it for Debian or Ubuntu.

So I'm opening this thread to see if anyone with the same setup has experienced the same issue, and how they solved it, if they did find a solution.

As I said, everything works fine on Proxmox; I'm even getting better performance than on VMware.

Here is some information:

SAN: Compellent SC5020 All-Flash
Servers: Dell R640, each with 4x 10Gb interfaces: 2 for multipath and 2 for VM traffic in an OVS bond
Switches: 2x Dell S4048
Proxmox Package Version:
Code:
proxmox-ve: 5.2-2 (running kernel: 4.15.18-7-pve)
pve-manager: 5.2-9 (running version: 5.2-9/4b30e8f9)
pve-kernel-4.15: 5.2-10
pve-kernel-4.15.18-7-pve: 4.15.18-27
pve-kernel-4.15.17-1-pve: 4.15.17-9
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-40
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-30
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-2
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-20
pve-cluster: 5.0-30
pve-container: 2.0-28
pve-docs: 5.2-8
pve-firewall: 3.0-14
pve-firmware: 2.0-5
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.2-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-36
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.11-pve1~bpo1

So please share your experience and how you solved it.

Best Regards
 
Hi Guys,

Please share your experience with iSCSI SAN performance; I'm still not able to find out why I get lower performance than VMware on an all-flash SAN.

As someone mentioned, Dell cannot figure it out since Proxmox is not an enterprise Linux distro; they have tried to help, but all their recommendations are for Red Hat and don't apply to Debian.
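
The closest Debian equivalent I can think of for the delayed-ACK change is the per-route quickack option (available since kernel 3.11), applied to the two iSCSI subnets shown further down; this is only a sketch of what I would try, not something CoPilot confirmed for this array:
Code:
# quickack on the storage routes effectively bypasses delayed ACK for iSCSI traffic
ip route change 192.168.100.0/24 dev eno2np1 proto kernel scope link src 192.168.100.51 quickack 1
ip route change 192.168.110.0/24 dev enp59s0f1np1 proto kernel scope link src 192.168.110.51 quickack 1
# verify with: ip route show dev eno2np1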

I'm posting some configuration below so that a Proxmox expert can take a look; maybe I will be able to figure out the issue.

Each PVE node has 2x 10Gb interfaces connected to the SAN, with no bridge configuration or anything else in between:
interface 1
Code:
# ifconfig eno2np1
eno2np1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet 192.168.100.51  netmask 255.255.255.0  broadcast 192.168.100.255
        inet6 fe80::20a:f7ff:feb6:7503  prefixlen 64  scopeid 0x20<link>
        ether 00:0a:f7:b6:75:03  txqueuelen 1000  (Ethernet)
        RX packets 2840885195  bytes 9409827480020 (8.5 TiB)
        RX errors 0  dropped 60  overruns 0  frame 0
        TX packets 2589551736  bytes 12115991647884 (11.0 TiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
interface 2
Code:
# ifconfig enp59s0f1np1
enp59s0f1np1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet 192.168.110.51  netmask 255.255.255.0  broadcast 192.168.110.255
        inet6 fe80::20a:f7ff:fea7:2a91  prefixlen 64  scopeid 0x20<link>
        ether 00:0a:f7:a7:2a:91  txqueuelen 1000  (Ethernet)
        RX packets 2837484315  bytes 9412070301404 (8.5 TiB)
        RX errors 0  dropped 60  overruns 0  frame 0
        TX packets 2593685536  bytes 12116405117748 (11.0 TiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
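
Both storage NICs run with MTU 9000, so one basic thing I keep re-checking is that jumbo frames actually pass end to end to the array; a quick sketch (the portal addresses below are placeholders for the SC5020 iSCSI portal IPs):
Code:
# 8972 bytes of ICMP payload + 28 bytes of headers = 9000, with fragmentation prohibited
ping -M do -s 8972 -c 3 -I eno2np1 192.168.100.1
ping -M do -s 8972 -c 3 -I enp59s0f1np1 192.168.110.1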

multipath.conf file
Code:
defaults {
        polling_interval        2
        path_selector           "round-robin 0"
        path_grouping_policy    multibus
        uid_attribute           ID_SERIAL
        rr_min_io               100
        failback                immediate
        no_path_retry           queue
        user_friendly_names     yes
    }

blacklist {
            devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
            devnode "^hd[a-z]"
            #devnode "^sda"
            devnode "^sda[0-9]"
            device {
                    vendor DELL
                    product "PERC|Universal|Virtual"
           }

}

blacklist_exceptions {
wwid "36000d310055b74000000000000000016"
wwid "36000d310055b74000000000000000017"
wwid "36000d310055b74000000000000000018"
wwid "36000d310055b74000000000000000019"
wwid "36000d310055b7400000000000000001a"
wwid "36000d310055b7400000000000000001b"
wwid "36000d310055b7400000000000000001d"
wwid "36000d310055b7400000000000000001e"
wwid "36000d310055b7400000000000000001f"
wwid "36000d310055b74000000000000000020"
}

multipaths {
        multipath {
                    alias "pvelvm1"
                    wwid "36000d310055b74000000000000000016"
                  }
        multipath {
                    alias "pvelvm2"
                    wwid "36000d310055b74000000000000000017"
                  }
        multipath {
                    alias "pvelvm3"
                    wwid "36000d310055b74000000000000000018"
                  }
        multipath {
                    alias "pvelvm4"
                    wwid "36000d310055b74000000000000000019"
                  }
        multipath {
                    alias "pvelvm5"
                    wwid "36000d310055b7400000000000000001a"
                  }
        multipath {
                    alias "pvelvm6"
                    wwid "36000d310055b7400000000000000001b"
                  }
        multipath {
                    alias "pvelvm7"
                    wwid "36000d310055b7400000000000000001d"
                  }
        multipath {
                    alias "pvelvm8"
                    wwid "36000d310055b7400000000000000001e"
                  }
        multipath {
                    alias "pvelvm9"
                    wwid "36000d310055b7400000000000000001f"
                  }
        multipath {
                    alias "pvelvm10"
                    wwid "36000d310055b74000000000000000020"
                  }
}
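
One thing I'm unsure about in the defaults above: as far as I understand, rr_min_io only applies to the old BIO-based multipath, while request-based multipath (the default on 4.x kernels) uses rr_min_io_rq, and a latency-aware path selector is sometimes recommended for all-flash arrays. This is the defaults section I'm considering as an experiment, not something CoPilot confirmed:
Code:
defaults {
        polling_interval        2
        # service-time sends IO to the path with the lowest outstanding service time
        path_selector           "service-time 0"
        path_grouping_policy    multibus
        uid_attribute           ID_SERIAL
        # request-based counterpart of rr_min_io
        rr_min_io_rq            1
        failback                immediate
        no_path_retry           queue
        user_friendly_names     yes
}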

storage.cfg file
Code:
lvm: pvelvm1
    vgname pvelvm1
    content images,rootdir
    shared 1

lvm: pvelvm2
    vgname pvelvm2
    content rootdir,images
    shared 1

nfs: pvenfs
    export /bigvolume/pvenfs
    path /mnt/pve/pvenfs
    server 172.16.100.248
    content vztmpl,backup,iso
    maxfiles 2
    options vers=3

nfs: pvenode
    export /onapp/pvenode
    path /mnt/pve/pvenode
    server 172.16.100.250
    content backup,vztmpl,iso
    maxfiles 2
    options vers=3

dir: local
    path /var/lib/vz
    content rootdir
    maxfiles 0
    shared 0

lvm: pvelvm3
    vgname pvelvm3
    content images,rootdir
    shared 1

lvm: pvelvm4
    vgname pvelvm4
    content images,rootdir
    shared 1

lvm: pvelvm5
    vgname pvelvm5
    content images,rootdir
    shared 1

lvm: pvelvm6
    vgname pvelvm6
    content rootdir,images
    shared 1

lvm: pvelvm7
    vgname pvelvm7
    content images,rootdir
    shared 1

lvm: pvelvm8
    vgname pvelvm8
    content images,rootdir
    shared 1

lvm: pvelvm9
    vgname pvelvm9
    content rootdir,images
    shared 1

lvm: pvelvm10
    vgname pvelvm10
    content images,rootdir
    shared 1

Config for one VM; almost all of the VMs have the same configuration:
Code:
# qm config 122
agent: 1
bootdisk: scsi0
cores: 6
ide2: none,media=cdrom
memory: 32768
name: cpsrv37
net0: virtio=C6:5D:07:0B:A0:B4,bridge=vmbr0,tag=2090
numa: 0
onboot: 1
ostype: l26
scsi0: pvelvm2:vm-122-disk-0,size=550G
scsihw: virtio-scsi-pci
smbios1: uuid=2bdedc2e-a4b9-4d7f-a5b5-80d1d38425ed
sockets: 1

I tried changing the disk cache mode, using both Write through and Direct sync, but that didn't change anything.
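
Something else I might still try on the VM side is switching to virtio-scsi-single with an iothread per disk, which I have seen suggested for iSCSI/LVM setups; a rough sketch for VM 122 (needs a full stop/start to take effect, and I haven't verified it helps with this latency issue):
Code:
# switch to one virtio-scsi controller per disk and enable an iothread for scsi0
qm set 122 --scsihw virtio-scsi-single
qm set 122 --scsi0 pvelvm2:vm-122-disk-0,size=550G,iothread=1,cache=none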

Here is some fio output from two VMs, one running on Proxmox and the other on VMware:

Proxmox VM:
Code:
# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.1
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=104MiB/s,w=34.0MiB/s][r=26.7k,w=8708 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=10332: Fri Dec 14 14:06:26 2018
   read: IOPS=25.0k, BW=97.8MiB/s (103MB/s)(3070MiB/31390msec)
   bw (  KiB/s): min=83008, max=115272, per=100.00%, avg=100245.16, stdev=6936.23, samples=62
   iops        : min=20752, max=28818, avg=25061.24, stdev=1734.05, samples=62
  write: IOPS=8367, BW=32.7MiB/s (34.3MB/s)(1026MiB/31390msec)
   bw (  KiB/s): min=27640, max=39152, per=100.00%, avg=33497.21, stdev=2369.83, samples=62
   iops        : min= 6910, max= 9788, avg=8374.27, stdev=592.44, samples=62
  cpu          : usr=16.30%, sys=58.24%, ctx=29414, majf=0, minf=22
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwt: total=785920,262656,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=97.8MiB/s (103MB/s), 97.8MiB/s-97.8MiB/s (103MB/s-103MB/s), io=3070MiB (3219MB), run=31390-31390msec
  WRITE: bw=32.7MiB/s (34.3MB/s), 32.7MiB/s-32.7MiB/s (34.3MB/s-34.3MB/s), io=1026MiB (1076MB), run=31390-31390msec

Disk stats (read/write):
    dm-0: ios=781828/261337, merge=0/0, ticks=1101346/388284, in_queue=1490292, util=99.75%, aggrios=785920/262716, aggrmerge=0/0, aggrticks=1107273/390659, aggrin_queue=1497889, aggrutil=99.71%
  sda: ios=785920/262716, merge=0/0, ticks=1107273/390659, in_queue=1497889, util=99.71%


VMware VM:
Code:
# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.1
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=128MiB/s,w=41.6MiB/s][r=32.7k,w=10.6k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=2929485: Fri Dec 14 14:16:17 2018
   read: IOPS=35.9k, BW=140MiB/s (147MB/s)(3070MiB/21898msec)
   bw (  KiB/s): min=96456, max=179992, per=100.00%, avg=144365.02, stdev=18827.78, samples=43
   iops        : min=24114, max=44998, avg=36091.30, stdev=4706.93, samples=43
  write: IOPS=11.0k, BW=46.9MiB/s (49.1MB/s)(1026MiB/21898msec)
   bw (  KiB/s): min=32400, max=60240, per=100.00%, avg=48246.70, stdev=6350.39, samples=43
   iops        : min= 8100, max=15060, avg=12061.63, stdev=1587.66, samples=43
  cpu          : usr=9.81%, sys=49.00%, ctx=347665, majf=0, minf=30
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwt: total=785920,262656,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=140MiB/s (147MB/s), 140MiB/s-140MiB/s (147MB/s-147MB/s), io=3070MiB (3219MB), run=21898-21898msec
  WRITE: bw=46.9MiB/s (49.1MB/s), 46.9MiB/s-46.9MiB/s (49.1MB/s-49.1MB/s), io=1026MiB (1076MB), run=21898-21898msec

Disk stats (read/write):
  sda: ios=784173/262669, merge=80/541, ticks=834790/397651, in_queue=1232319, util=99.66%
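
What I plan to do next is watch host-side latency on the multipath device while the fio job runs, to see whether the ~10ms shows up at the host or only in the array statistics; a sketch assuming the test VM's disk is on pvelvm2 and sysstat is installed:
Code:
# check that all paths of the map are active and ready
multipath -ll pvelvm2
# r_await / w_await are the average read/write latencies in ms as seen by the PVE node
iostat -xmN 2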

I would be very grateful if someone could give a recommendation or point me to any other information I can check.

Thanks
 
