ZFS: sudden reboot on heavy disk I/O on a RAID10

Slamdunk

Hi, I would like to share an issue I had with ZFS that I was never able to solve. I am now on a software RAID with ext4, but maybe sharing the issue could help someone else in the future.

Hardware
  • Motherboard: Asus P8P67 PRO
  • CPU: Intel Core i7-2600 CPU @ 3.40GHz
  • RAM: 24 GB (2 x CMZ4GX3M1A1600C9 + 2 x CMZ8GX3M1A1600C10)
  • HDD: 4 x WDC WD40EFRX
  • PSU: Corsair AX850
The four disks are connected to the four 3 Gb/s SATA ports of the Intel P67 (B3) chipset.

Software
Code:
proxmox-ve: 5.2-2 (running kernel: 4.15.17-3-pve)
pve-manager: 5.2-2 (running version: 5.2-2/b1d1c7f4)
pve-kernel-4.15: 5.2-3
pve-kernel-4.15.17-3-pve: 4.15.17-12
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: not correctly installed
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-4
libpve-common-perl: 5.0-33
libpve-guest-common-perl: 2.0-16
libpve-http-server-perl: 2.0-9
libpve-storage-perl: 5.0-23
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-3
lxcfs: 3.0.0-1
novnc-pve: 1.0.0-1
proxmox-widget-toolkit: 1.0-19
pve-cluster: 5.0-27
pve-container: 2.0-23
pve-docs: 5.2-4
pve-firewall: 3.0-12
pve-firmware: 2.0-4
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.1-5
pve-xtermjs: 1.0-5
qemu-server: 5.0-28
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3

VM
  • 1 x Slackware 14.2 clean installation from ISO
  • RAM: 12 GB allocated to the VM, leaving the other 12 GB for the PVE host
  • 1 x 3 TB raw disk dedicated to the VM

The issue

During the installation of Proxmox, I chose a ZFS RAID 10 across the four disks.
I installed the Slackware guest, and everything worked.
At the end of the installation, I attached another SATA disk (a WDC WD20EARS) containing 1 TB of data as a passthrough disk, to be synced into the VM.
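
For context, the "RAID 10" that the installer builds with ZFS is a stripe of two mirror vdevs. Built by hand it would look roughly like the sketch below (pool and device names are placeholders, not the installer's exact command):
Code:
# Rough sketch only: a ZFS "RAID 10" is a stripe of two mirror vdevs.
# Pool and device names are placeholders; the Proxmox installer partitions
# the disks itself and names the pool rpool.
zpool create -o ashift=12 tank \
    mirror /dev/sda /dev/sdb \
    mirror /dev/sdc /dev/sdd

# Verify the layout: two mirror vdevs striped together.
zpool status tank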

I started the rsync between the passed-through disk and the VM, and after about 30 minutes of full disk I/O at ~200 MB/s (the maximum reachable with this system and these disks), the host rebooted.

No kernel panic; the logs of both the host and the guest were empty.

On two more tries, the reboot happened again after about 30 minutes.

So I tried limiting the rsync to 5 MB/s, and everything finished successfully, but obviously this is not a solution.
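
For reference, a throttled rsync of that kind looks roughly like this (paths are placeholders; with older rsync, --bwlimit is in KiB/s, so 5000 is about 5 MB/s):
Code:
# Sketch only: copy from the passed-through disk into the VM, capped at ~5 MB/s.
# Source and destination paths are placeholders.
rsync -aH --progress --bwlimit=5000 /mnt/source-disk/ /mnt/destination/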

Then I tried what is described in https://pve.proxmox.com/wiki/ZFS_on_Linux#_limit_zfs_memory_usage with various swappiness values and several zfs_arc_max values, but the reboots still occurred in the same way.
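
For readers who land here, the ARC limit described in that wiki page is set through the zfs kernel module options; a minimal sketch, with 8 GiB as an arbitrary example value:
Code:
# Example value only: cap the ZFS ARC at 8 GiB (the value is in bytes).
echo "options zfs zfs_arc_max=8589934592" > /etc/modprobe.d/zfs.conf
update-initramfs -u    # needed when the root filesystem is on ZFS

# The same limit can also be applied at runtime, without a reboot:
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max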

No hardware watchdog was active (AFAICT).

The current and voltage draw of the system are fine and well under the limits of the PSU.

The solution

I found no solution to the issue, so I started over with a Debian installation as described in https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch, on top of a software RAID 10.

The same system on Debian + mdadm RAID 10 works flawlessly.
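
For anyone retracing this, the mdadm side is a plain 4-disk RAID 10; a rough sketch (device and partition names are placeholders):
Code:
# Rough sketch only: 4-disk mdadm RAID 10 with ext4 on top; device names are placeholders.
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mkfs.ext4 /dev/md0
mdadm --detail --scan >> /etc/mdadm/mdadm.conf   # persist the array definition
update-initramfs -u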


End of the story
 
You mentioned setting zfs_arc_max; are you also setting zfs_arc_min? I've had similar problems, and they turned out to be caused by arc_min getting so small that ZFS couldn't operate properly and took the whole system down. Best I can tell, Linux still tries to cache filesystem data from ZFS in its own standard cache; ZFS sees that cache as pressure on memory, and so it starts cutting down the size of the ARC.
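
If it helps later readers: zfs_arc_min is set the same way as zfs_arc_max, via the module options; a sketch with illustrative values only (2 GiB min, 8 GiB max):
Code:
# Illustrative values only: keep the ARC between 2 GiB and 8 GiB (values in bytes).
cat > /etc/modprobe.d/zfs.conf <<'EOF'
options zfs zfs_arc_min=2147483648 zfs_arc_max=8589934592
EOF
update-initramfs -u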

A good command to monitor this is arcstat (or arcstat.py on older Linux distros or on FreeBSD). If this is what's happening to you, you will see the "c" field drop to megabytes in size before the system crashes.
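
A minimal monitoring sketch (column names differ slightly between arcstat versions, but "c", the ARC target size, is the one to watch):
Code:
# Print ARC statistics every 5 seconds and watch the "c" (ARC target size) column.
arcstat 5        # arcstat.py on older ZFS-on-Linux releases

# The raw counters can also be read directly from the kernel:
grep -E '^(c|c_min|c_max|size) ' /proc/spl/kstat/zfs/arcstats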
 
You mentioned setting zfs_arc_max, are you also setting zfs_arc_min?
Thank you for your feedback: no, I haven't tried zfs_arc_min because I didn't know about this parameter until now.

I am a newbie in the filesystem universe: what embittered me is the total lack of feedback from such a sensitive and tricky part of the system.
No logs, no warnings, nothing out of the box. Nothing to google for.

From now on I have to consider ZFS on Linux an experimental feature, until proven otherwise.
A good command to monitor this is arcstat
I hope this can be useful for the future reader ;)
 
Have you got swap on ZFS on the Proxmox node?
Try this:
Code:
sysctl -w vm.swappiness=0
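
To make that survive a reboot, the same value can go into a sysctl configuration file, e.g. (the file name is arbitrary):
Code:
# Persist the setting; the sysctl -w change above is lost on reboot.
echo "vm.swappiness = 0" > /etc/sysctl.d/99-swappiness.conf
sysctl --system   # reload all sysctl configuration files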
 
