Bad iSCSI performance in Proxmox 6. Very good in Proxmox 5.4-3

Aug 25, 2019
I am testing a Proxmox cluster environment with 3 servers and one iSCSI storage.
I have configured the storage, and on the servers I connect the storage with multipath.
In both versions I perform the same tasks and I have the same configuration (except for an additional minor option in the multipath.conf).

blacklist_exceptions {
        wwid "3600a098000fd5aa4000000bc5d6de0b7"
}
multipaths {
        multipath {
                wwid "3600a098000fd5a7c000000c05d5e0f25"
                alias mpath0
        }
}
defaults {
        polling_interval 5
        path_selector "round-robin 0"
        path_grouping_policy multibus
        uid_attribute "ID_SERIAL"
        rr_min_io 100
        failback immediate
        no_path_retry queue
}

*In Proxmox 6 I need to add this to defaults:
find_multipaths "no"

This is the only difference between the two versions.
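
For reference, the way I check which find_multipaths value is actually in effect (assuming multipath-tools is installed and multipathd is running) is:

multipathd show config | grep find_multipaths     # merged runtime configuration as multipathd sees it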

If I run this command:
fio --filename=/dev/mapper/mpath0 --direct=1 --rw=read --bs=1m --size=20G --numjobs=200 --runtime=60 --group_reporting --name=file1

In Proxmox 5.4-3 I get these results:
Run status group 0 (all jobs):
READ: io=83446MB, aggrb=1386.5MB/s, minb=1386.5MB/s, maxb=1386.5MB/s, mint=60187msec, maxt=60187msec

In Proxmox 6 I get around 20 MB/s.

Does anyone have a similar issue?
The same steps can't be wrong for only one of the two versions.
I'm pretty lost.

I had this issue before I even tried to add the iSCSI storage to Proxmox.
So I think the problem must be in the Debian kernel, a driver, or some new configuration that Debian 10 needs for iSCSI multipath.
I use the same layout, IPs, etc. in both cases.
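
When I compare the two installs side by side, these are the commands I use to note the kernel and be2net driver versions on each node (modinfo and ethtool are available on a default Proxmox install):

uname -r                              # running kernel (4.15.x on Proxmox 5, 5.0.x on Proxmox 6)
modinfo be2net | grep -E '^version|^vermagic'
ethtool -i ens2f0                     # driver and firmware actually bound to the storage NIC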

*Proxmox 6 uses driver version 12.something (I need to reinstall everything again with that version to note the exact driver details).
root@Jupiter:~# modinfo be2net
filename: /lib/modules/4.15.18-12-pve/kernel/drivers/net/ethernet/emulex/benet/be2net.ko
license: GPL
author: Emulex Corporation
description: Emulex OneConnect NIC Driver 11.4.0.0
version: 11.4.0.0
srcversion: 0D790BBD653BC12849EED84
alias: pci:v000010DFd00000728sv*sd*bc*sc*i*
alias: pci:v000010DFd00000720sv*sd*bc*sc*i*
alias: pci:v000010DFd0000E228sv*sd*bc*sc*i*
alias: pci:v000010DFd0000E220sv*sd*bc*sc*i*
alias: pci:v000019A2d00000710sv*sd*bc*sc*i*
alias: pci:v000019A2d00000700sv*sd*bc*sc*i*
alias: pci:v000019A2d00000221sv*sd*bc*sc*i*
alias: pci:v000019A2d00000211sv*sd*bc*sc*i*
depends:
retpoline: Y
intree: Y
name: be2net
vermagic: 4.15.18-12-pve SMP mod_unload modversions
parm: num_vfs:Number of PCI VFs to initialize (uint)
parm: rx_frag_size:Size of a fragment that holds rcvd data. (ushort)


Moving a disk from the iSCSI storage to a local LVM volume takes 5 minutes in Proxmox 5. In Proxmox 6 it takes 3 hours.
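
While the move is running I keep an eye on the per-device throughput with iostat (it comes from the sysstat package, which may need to be installed first):

apt install sysstat        # only needed once
iostat -xm 2               # the dm-* lines correspond to the multipath maps (see multipath -ll below)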
 
Can you try the test with other distributions that use a 5.x kernel (e.g. Ubuntu 19.04, CentOS 7)? Also with some other distributions that use a 4.x kernel (e.g. Ubuntu 18.04).
Can you try it without multipath?
 
It's a good test.
Right now I have all 3 servers testing with Proxmox 5.
I would have to remove one of them from the cluster and install another Linux on it.
Perhaps someone has had this issue and can tell us if there is a workaround or some task to do to correct the problem.
But it is a dangerous issue, because someone could try to upgrade Proxmox 5 to 6 with this configuration and I think the issue will appear.

As soon as I can do that test I will post the results.
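
To answer the question about testing without multipath: my plan is to run exactly the same fio job, but directly against one of the underlying /dev/sdX paths instead of the mapper device (the device name below is just an example, the real ones are in the lsscsi output further down; it is a read-only test, so it does not touch the data):

fio --filename=/dev/sdb --direct=1 --rw=read --bs=1m --size=20G --numjobs=200 --runtime=60 --group_reporting --name=singlepath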
 
A VM should be enough to test it. No need to install it directly on the hardware.
Can you provide some information about your storage and your nodes?
 

The storage is on a separate 10Gb network. A VM will not see that network.
I need a server node with two 10Gb ports placed in that network.
I'm moving VMs from VMware ESXi hosts to Proxmox. I will have a free server this weekend and I will run some tests with it.
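
Before blaming the iSCSI stack I also want to rule out the 10Gb links themselves, for example with a plain iperf3 run between two nodes on the storage network (iperf3 has to be installed on both ends; the storage controller itself cannot run it, so this only validates the node-to-switch path):

iperf3 -s                              # on the first node
iperf3 -c 172.16.0.11 -t 30 -P 4       # on the second node, against the first node's vmbr100 address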

* Yesterday I added a second storage (not shown in the picture)

(attached image: 1567677785954.png)

Configuration of Node 1, Proxmox 5.4-3 (vmbr100 and vmbr101 are the storage networks)

8: vmbr100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 172.16.0.11/24 brd 172.16.0.255 scope global vmbr100
       valid_lft forever preferred_lft forever
9: vmbr101: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 172.16.1.11/24 brd 172.16.1.255 scope global vmbr101
       valid_lft forever preferred_lft forever
11: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 10.21.0.78/22 brd 10.21.3.255 scope global vmbr0
       valid_lft forever preferred_lft forever

/etc/network/interfaces
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface eno2 inet manual

iface ens2f0 inet manual

iface ens2f1 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2 eno3 eno4
        bond-miimon 100
        bond-mode balance-alb

auto vmbr0
iface vmbr0 inet static
        address 10.21.0.78
        netmask 255.255.252.0
        gateway 10.21.0.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

auto vmbr100
iface vmbr100 inet static
        address 172.16.0.11
        netmask 255.255.255.0
        bridge-ports ens2f1
        bridge-stp off
        bridge-fd 0

auto vmbr101
iface vmbr101 inet static
        address 172.16.1.11
        netmask 255.255.255.0
        bridge-ports ens2f0
        bridge-stp off
        bridge-fd 0


root@Jupiter:~# cat /etc/multipath.conf
blacklist_exceptions {
        wwid "3600a098000fd5a7c000000c05d5e0f25"
        wwid "3600a098000fd5aa4000000bc5d6de0b7"
}
multipaths {
        multipath {
                wwid "3600a098000fd5a7c000000c05d5e0f25"
                alias mpath0
        }
        multipath {
                wwid "3600a098000fd5aa4000000bc5d6de0b7"
                alias mpath1
        }
}
defaults {
        polling_interval 5
        path_selector "round-robin 0"
        path_grouping_policy multibus
        uid_attribute "ID_SERIAL"
        rr_min_io 100
        failback immediate
        no_path_retry queue
}


root@Jupiter:~# multipath -ll
mpath1 (3600a098000fd5aa4000000bc5d6de0b7) dm-5 LENOVO,DE_Series
size=13T features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 16:0:0:1 sdc 8:32 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
`- 15:0:0:1 sdb 8:16 active ready running
mpath0 (3600a098000fd5a7c000000c05d5e0f25) dm-6 LENOVO,DE_Series
size=13T features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 18:0:0:1 sdh 8:112 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
`- 17:0:0:1 sdd 8:48 active ready running
3600a098000fd5ac2000000205d028628 dm-7 LENOVO,Universal Xport
size=20M features='1 retain_attached_hw_handler' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 15:0:0:7 sde 8:64 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
`- 16:0:0:7 sdf 8:80 active ready running
3600a098000fd58fa000000205d0274da dm-8 LENOVO,Universal Xport
size=20M features='1 retain_attached_hw_handler' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 17:0:0:7 sdg 8:96 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
`- 18:0:0:7 sdi 8:128 active ready running

root@Jupiter:~# lsscsi
[0:2:0:0] disk Lenovo RAID 530-8i 5.07 /dev/sda
[15:0:0:0] disk LENOVO DE_Series 0851 -
[15:0:0:1] disk LENOVO DE_Series 0851 /dev/sdb
[15:0:0:7] disk LENOVO Universal Xport 0851 /dev/sde
[16:0:0:0] disk LENOVO DE_Series 0851 -
[16:0:0:1] disk LENOVO DE_Series 0851 /dev/sdc
[16:0:0:7] disk LENOVO Universal Xport 0851 /dev/sdf
[17:0:0:0] disk LENOVO DE_Series 0851 -
[17:0:0:1] disk LENOVO DE_Series 0851 /dev/sdd
[17:0:0:7] disk LENOVO Universal Xport 0851 /dev/sdg
[18:0:0:0] disk LENOVO DE_Series 0851 -
[18:0:0:1] disk LENOVO DE_Series 0851 /dev/sdh
[18:0:0:7] disk LENOVO Universal Xport 0851 /dev/sdi
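
The four SCSI hosts (15-18) are the four iSCSI sessions; to double-check which portal and network device each session uses, iscsiadm can be queried (open-iscsi should already be present since the iSCSI storage is connected):

iscsiadm -m session          # one line per session with target and portal
iscsiadm -m session -P 3 | grep -E 'Target:|Current Portal|Iface Netdev|Attached scsi disk'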

root@Jupiter:~# ethtool ens2f0
Settings for ens2f0:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: g
Wake-on: g
Current message level: 0x00000000 (0)

Link detected: yes

root@Jupiter:~# ethtool ens2f1
Settings for ens2f1:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 1
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: g
Wake-on: g
Current message level: 0x00000000 (0)

Link detected: yes


 
