[SOLVED] Proxmox ZFS - Unusable

Darren

New Member
Jul 28, 2016
I have 15 Proxmox-on-ZFS installs in production, and we've found that, after a while, the IO slows to a crawl (mostly HGST 6TB in RAID10 with Samsung NVMe caches). We've spent a year troubleshooting to no avail. It seems to be related to this bug:

https://github.com/zfsonlinux/zfs/issues/6171

I'm to the point where switching to Ubuntu or Debian seems like my only option.

Is there a solution?
 
Please add your:

> pveversion -v
 
Here's one of them (although most are late in the 4.x branch):

Code:

proxmox-ve: 5.1-32 (running kernel: 4.13.13-2-pve)
pve-manager: 5.1-42 (running version: 5.1-42/724a6cb3)
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.13.13-2-pve: 4.13.13-33
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.95-1-pve: 4.4.95-99
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-20
qemu-server: 5.0-22
pve-firmware: 2.0-4
libpve-common-perl: 5.0-28
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-8
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-16
pve-qemu-kvm: 2.9.1-9
pve-container: 2.0-19
pve-firewall: 3.0-5
pve-ha-manager: 2.0-5
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-3
lxcfs: 2.0.8-2
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.6-pve1~bpo9
 
I'm to the point where switching to Ubuntu or Debian seems like my only option.

With respect to the ZFS version they use, or how else would a switch help you? You are currently running an Ubuntu LTS kernel on your Debian-based PVE.

We also experienced slow ZFS in the past; here are some hints (see the commands sketched below):
- Check the pool usage; it will start to slow down at around 80% utilization.
- Check the I/O on each disk in the pool with iostat and look out for very slow response times.
- Are the disks all enterprise grade? (They have better error-recovery timing and report an underlying error sooner, so I/O does not stall as long.)
- Is your workload heavy on sync writes? If not, add more disks.
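
A minimal sketch of those checks (the pool name is just an example):

Code:
# pool capacity - expect slowdowns once CAP approaches ~80%
zpool list

# per-disk latency/utilization, refreshed every 5 seconds;
# look for devices whose response times stand out
iostat -x 5

# ZFS's own per-vdev latency view
zpool iostat -v tank01 5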

Are only 4 disks in your ZFS pool or more?
 
Other users here on the forum who were affected by this issue have reported that it is fixed with the current pve-kernel-4.15 and ZFS packages, so maybe give those a try as well.
 
Other users here on the forum who were affected by this issue have reported that it is fixed with the current pve-kernel-4.15 and ZFS packages, so maybe give those a try as well.

How do I do this? I spent last night (7pm to 2am) replacing a RAID card (LSI 3108 to 9400) on the hunch that it might help. I did a dist-upgrade at the same time and it only took me to 4.13.16-2-pve. I did get a new version of ZFS out of it, though:

Code:
proxmox-ve: 5.1-42 (running kernel: 4.13.16-2-pve)
pve-manager: 5.1-51 (running version: 5.1-51/96be5354)
pve-kernel-4.13: 5.1-44
pve-kernel-4.13.16-2-pve: 4.13.16-47
pve-kernel-4.13.13-2-pve: 4.13.13-33
pve-kernel-4.4.95-1-pve: 4.4.95-99
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.4.6-1-pve: 4.4.6-48
corosync: 2.4.2-pve4
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-4
libpve-common-perl: 5.0-30
libpve-guest-common-perl: 2.0-14
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-18
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-2
lxcfs: 3.0.0-1
novnc-pve: 0.6-4
proxmox-widget-toolkit: 1.0-15
pve-cluster: 5.0-25
pve-container: 2.0-21
pve-docs: 5.1-17
pve-firewall: 3.0-8
pve-firmware: 2.0-4
pve-ha-manager: 2.0-5
pve-i18n: 1.0-4
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.1-5
pve-xtermjs: 1.0-2
qemu-server: 5.0-25
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.7-pve1~bpo9

I did resilver the array and it hasn't kicked any disks out yet: fingers crossed. EDIT - I should note that z_null_int no longer dominates iotop with the new RAID card and ZFS version. I know, I'm changing too many variables at once, but this is a production system that can't afford to be down at all.
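
(For anyone who wants to repeat the check, I was just watching accumulated per-process IO, something like:)

Code:
# only processes actually doing IO, accumulated totals, one line per process
iotop -o -a -P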
 
With respect to the ZFS version they use, or how else would a switch help you? You are currently running an Ubuntu LTS kernel on your Debian-based PVE.

We have several other Ubuntu boxes sitting around and they've been clean as a whistle for years. It seems like there are so many users that the bugs and best practices really get worked out. We chose Proxmox because we wanted something that was a little more 'enterprise' grade, but it just feels like Ubuntu Jr: all of the same stuff is there, but none of the good parts.
 
Here are the zpool configs. I don't have a ZIL on tank01 at the moment because it got kicked out. If the new HBA is stable, I will add it back:

Code:
# zpool status
  pool: tank01
 state: ONLINE
  scan: resilvered 132G in 0h52m with 0 errors on Tue Apr 24 20:06:21 2018
config:

    NAME                                                 STATE     READ WRITE CKSUM
    tank01                                               ONLINE       0     0     0
      mirror-0                                           ONLINE       0     0     0
        ata-HGST_HDN726060ALE610_NAH79SUY                ONLINE       0     0     0
        ata-HGST_HDN726060ALE610_NAHU6P2Y                ONLINE       0     0     0
      mirror-1                                           ONLINE       0     0     0
        ata-HGST_HDN726060ALE610_NAHU92SX                ONLINE       0     0     0
        ata-HGST_HDN726060ALE614_K1KB0KLD                ONLINE       0     0     0
      mirror-2                                           ONLINE       0     0     0
        ata-HGST_HDN726060ALE610_K1G0AEPB                ONLINE       0     0     0
        ata-HGST_HDN726060ALE614_NCH4YWXZ                ONLINE       0     0     0
      mirror-3                                           ONLINE       0     0     0
        ata-HGST_HDN726060ALE614_K1KH43VN                ONLINE       0     0     0
        ata-HGST_HDN726060ALE614_K1HTMHRD                ONLINE       0     0     0
    cache
      ata-Samsung_SSD_850_PRO_1TB_S1SRNWAF909526P-part2  ONLINE       0     0     0
      ata-Samsung_SSD_850_PRO_1TB_S1SRNWAF909527J-part2  ONLINE       0     0     0

errors: No known data errors

  pool: tank02
 state: ONLINE
  scan: scrub repaired 0B in 1h2m with 0 errors on Sun Apr  8 01:26:13 2018
config:

    NAME         STATE     READ WRITE CKSUM
    tank02       ONLINE       0     0     0
      mirror-0   ONLINE       0     0     0
        nvme0n1  ONLINE       0     0     0
        nvme1n1  ONLINE       0     0     0
      mirror-1   ONLINE       0     0     0
        nvme2n1  ONLINE       0     0     0
        nvme3n1  ONLINE       0     0     0

errors: No known data errors

  pool: tank03
 state: ONLINE
  scan: none requested
config:

    NAME                              STATE     READ WRITE CKSUM
    tank03                            ONLINE       0     0     0
      mirror-0                        ONLINE       0     0     0
        ata-HUH721010ALE601_2THJBR4D  ONLINE       0     0     0
        ata-HUH721010ALE601_2THS91GD  ONLINE       0     0     0
      mirror-1                        ONLINE       0     0     0
        ata-HUH721010ALE601_2THYEUWD  ONLINE       0     0     0
        ata-HUH721010ALE601_2THZWM5D  ONLINE       0     0     0
      mirror-2                        ONLINE       0     0     0
        ata-HUH721010ALE601_2TJ14JKD  ONLINE       0     0     0
        ata-HUH721010ALE601_2TJ2UJPD  ONLINE       0     0     0

errors: No known data errors
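
For reference, once the HBA proves stable, re-adding the ZIL as a mirrored log vdev should look roughly like this (the partition paths are placeholders, not my actual devices):

Code:
# attach a mirrored SLOG to tank01 - example device IDs only
zpool add tank01 log mirror \
    /dev/disk/by-id/ata-Samsung_SSD_850_PRO_1TB_XXXXXXXXXXXX-part1 \
    /dev/disk/by-id/ata-Samsung_SSD_850_PRO_1TB_YYYYYYYYYYYY-part1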
 
If you're having problems with your HBA or with your disks.... you're not going to get good ZFS performance. Can you test / benchmark the HBA or the disks separately?
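
For example, a rough per-disk baseline with fio (assuming fio is installed; /dev/sdX is a placeholder, run it against each disk in turn and compare the results):

Code:
# 60-second random-read test straight against the block device
# (read-only, so it won't touch the data, but it will add load)
fio --name=baseline --filename=/dev/sdX --rw=randread \
    --bs=4k --direct=1 --ioengine=libaio --iodepth=32 \
    --runtime=60 --time_based --group_reporting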
 
Hi Darren, not to play devil's advocate entirely, but is there a specific need/requirement for you to use ZFS in your Proxmox deployment? (i.e., as opposed to, say, a vanilla setup with HW RAID disks and standard local storage?)

Tim
 
This was solved by switching to the Avago 9400 HBA in lieu of the LSI 3108 (which does not have an HBA mode). Apparently, ZFS does not like it if it cannot 'speak directly' with the disks.
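
A quick sanity check that the disks really are passed through: smartctl can now read them directly, without needing a RAID passthrough option (device names below are just examples):

Code:
# works as-is on a real HBA / passthrough setup
smartctl -a /dev/sda

# behind a MegaRAID controller you would instead have to use something like:
# smartctl -a -d megaraid,0 /dev/sda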

We switched from VMware to Proxmox specifically for ZFS. Without ZFS, there would be no reason to switch.

Proxmox needs an HCL for this reason. I have marked the thread 'SOLVED'.
 
Hi everyone, hi Darren, you are right: ZFS is not the issue, the hardware is.

Maintaining an HCL is going to be tough due to all the setup options offered by PVE. Maybe a tips-and-tricks sticky post? I don't know.

I have seen a lot of issues on the forum lately regarding ZFS and RAID adapters: performance issues and even host crashes.

We ran into the exact same issues on several servers and we'd like to share some insights.

Speaking from our own experience:

Using an LSI/Avago RAID card with JBOD mode on and ZFS RAID-Z gives us very poor performance, lots of IO delay, and hosts can even freeze under reasonably high load, even with all-SSD pools.

This is independent of caching, BBU, etc.

For people using servers from major vendors: many of them ship RAID adapters based on LSI chips, or should we say Avago now, oh no, it's Broadcom, sorry.

For the same vendor there is often an HBA adapter available as well, also with an Avago chip. Example: Lenovo 930-(x)i = RAID and 430-(x)i = HBA.

It seems that only Avago/LSI-based HBA adapters are supported by VMware vSAN, not the RAID/JBOD ones. That goes for Hitachi, Dell, Lenovo, and Cisco UCS boxes as far as we know, maybe others too.

Looking at the Broadcom website, the two card/chip architectures are indeed quite different, and the Linux drivers are not the same either.
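
A rough way to check which family a given box actually has is to look at which kernel driver binds to the controller: the RAID models typically load megaraid_sas, while the true HBAs load mpt3sas (output and device names will vary):

Code:
# identify the controller and the driver in use
lspci -nnk | grep -iA3 'raid\|sas'

# or simply check which module is loaded
lsmod | grep -E 'megaraid_sas|mpt3sas'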

Maybe this is the root cause of much pain for some of you, because most vendors ship LSI RAID adapters by default, with a JBOD option rather than a true HBA mode.

Regards,
 
