Proxmox freezes with high IO, maybe ZFS related

arukashi

Member
Jan 21, 2023
Hello.
I'm experiencing freezes that started three days ago and now happen every day, at random times. The hardware is a Hetzner dedicated server (details at the end of the post).
Symptoms: the Proxmox interface shows high IO delay, around 30-50%, while no significant IO operations are happening. Watching iotop, the highest write rate I saw was about 15 MB/s.
A simple reboot fixes the problem until it happens again.
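For reference, this is roughly what I was using to watch disk activity (a sketch; the exact flags I ran may have differed slightly):
Code:
# extended per-device statistics in MB, refreshed every 5 seconds
iostat -xm 5
# show only the processes/threads that are actually doing IO
iotop -o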

In the system logs I can see many records like this:
zed[1407629]: eid=333 class=deadman pool='rpool' vdev=sdd3 size=73728 offset=139804012544 priority=3 err=0 flags=0x80100480 delay=101266124ms
Meanwhile, zpool events -v shows this:
Code:
Apr 13 2026 15:54:06.617975577 ereport.fs.zfs.deadman
        class = "ereport.fs.zfs.deadman"
        ena = 0x819c1944f5406001
        detector = (embedded nvlist)
                version = 0x0
                scheme = "zfs"
                pool = 0xc0300e71306e37ec
                vdev = 0xb685d326aa5b66c8
        (end detector)
        pool = "rpool"
        pool_guid = 0xc0300e71306e37ec
        pool_state = 0x0
        pool_context = 0x0
        pool_failmode = "wait"
        vdev_guid = 0xb685d326aa5b66c8
        vdev_type = "disk"
        vdev_path = "/dev/sdd3"
        vdev_ashift = 0x9
        vdev_complete_ts = 0x4827c2f25336
        vdev_delta_ts = 0xb40f810ebd
        vdev_read_errors = 0x0
        vdev_write_errors = 0x0
        vdev_cksum_errors = 0x0
        vdev_delays = 0x0
        dio_verify_errors = 0x0
        parent_guid = 0x79041166b7a9b905
        parent_type = "mirror"
        vdev_spare_paths =
        vdev_spare_guids =
        zio_err = 0x0
        zio_flags = 0x300080 [CANFAIL DONT_QUEUE DONT_PROPAGATE]
        zio_stage = 0x200000 [VDEV_IO_START]
        zio_pipeline = 0x4e00000 [VDEV_IO_START VDEV_IO_DONE VDEV_IO_ASSESS DONE]
        zio_delay = 0x0
        zio_timestamp = 0x481318fd6e2e
        zio_delta = 0x0
        zio_type = 0x2 [WRITE]
        zio_priority = 0x3 [ASYNC_WRITE]
        zio_offset = 0x14922dd000
        zio_size = 0x2000
        zio_objset = 0x8245
        zio_object = 0x1
        zio_level = 0x0
        zio_blkid = 0x1af7f7c
        time = 0x69dcf57e 0x24d58f19
        eid = 0x64

The freezes come out of the blue: I haven't changed any ZFS configuration recently, and there are no new workloads either.
So I thought it was a faulty /dev/sdd drive and asked support to replace it. That helped, but not for long: 5 hours later the freezes were back.
This time, however, the log messages referred to all of the drives, which makes me think the drives themselves are not the problem.
Apr 12 09:08:24 pve-htznr-6 zed[1466615]: eid=124 class=deadman pool='rpool' vdev=sdb3 size=4096 offset=1687645704192 priority=3 err=0 flags=0x300080>
Apr 12 09:08:24 pve-htznr-6 zed[168287]: Missed 506 events
Apr 12 09:08:24 pve-htznr-6 zed[168287]: Missed 306 events
Apr 12 09:08:24 pve-htznr-6 zed[1466623]: eid=126 class=deadman pool='rpool' vdev=sdc3 size=4096 offset=382222438400 priority=3 err=0 flags=0x300080 >
Apr 12 09:08:24 pve-htznr-6 zed[168287]: Missed 242 events
Apr 12 09:08:24 pve-htznr-6 zed[1466627]: eid=127 class=deadman pool='rpool' vdev=sda3 size=8192 offset=1603562139648 priority=3 err=0 flags=0x300080>
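I also haven't touched any ZFS module parameters. If I understand the docs correctly, a deadman event is reported when a single IO stays outstanding longer than zfs_deadman_ziotime_ms (5 minutes by default), and the current values can be checked like this:
Code:
# print the current deadman tunables (I expect these to still be the OpenZFS defaults here)
grep . /sys/module/zfs/parameters/zfs_deadman_*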

Some diagnostic info
Code:
zpool status -v
  pool: rpool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
    The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(7) for details.
  scan: resilvered 4.47G in 00:00:20 with 0 errors on Mon Apr 13 22:55:05 2026
config:

    NAME                                                  STATE     READ WRITE CKSUM
    rpool                                                 ONLINE       0     0     0
      mirror-0                                            ONLINE       0     0     0
        sdb3                                              ONLINE       0     0     0
        sdc3                                              ONLINE       0     0     0
      mirror-1                                            ONLINE       0     0     0
        ata-Micron_5200_MTFDDAK1T9TDC_18501FD2E0D6-part3  ONLINE       0     0     0
        sda3                                              ONLINE       0     0     0

errors: No known data errors
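(The "features are not enabled" status is just because I haven't run zpool upgrade after updating ZFS; as far as I know, running it without arguments only lists pools with missing features and changes nothing:)
Code:
# list pools whose supported features are not yet enabled
zpool upgrade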
Code:
proxmox-ve: 9.1.0 (running kernel: 6.17.13-2-pve)
pve-manager: 9.1.7 (running version: 9.1.7/16b139a017452f16)
proxmox-kernel-helper: 9.0.4
pve-kernel-5.15: 7.4-15
proxmox-kernel-6.17: 6.17.13-2
proxmox-kernel-6.17.13-2-pve-signed: 6.17.13-2
proxmox-kernel-6.17.4-2-pve-signed: 6.17.4-2
proxmox-kernel-6.8: 6.8.12-17
proxmox-kernel-6.8.12-17-pve-signed: 6.8.12-17
pve-kernel-5.15.158-2-pve: 5.15.158-2
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 19.2.3-pve1
corosync: 3.1.10-pve2
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx12
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.1
libproxmox-backup-qemu0: 2.0.2
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.6
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.1.1
libpve-cluster-perl: 9.1.1
libpve-common-perl: 9.1.9
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.5
libpve-rs-perl: 0.11.4
libpve-storage-perl: 9.1.1
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-4
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.5-1
proxmox-backup-file-restore: 4.1.5-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.9
pve-cluster: 9.1.1
pve-container: 6.1.2
pve-docs: 9.1.2
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.18-2
pve-ha-manager: 5.1.3
pve-i18n: 3.7.0
pve-qemu-kvm: 10.1.2-7
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.6
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.4.1-pve1

Hardware: Hetzner dedicated server
CPU: Intel(R) Xeon(R) W-2295 CPU @ 3.00GHz
Memory: 4 * Samsung M393A4G43AB3-CWE ECC Registered + 4 * Samsung M393A4K40EB3-CWE ECC Registered, Speed: 2933 MT/s
Drives:
Code:
Device Model:     SAMSUNG MZ7LH1T9HMLT-00005
Device Model:     Micron_5200_MTFDDAK1T9TDC
Device Model:     Micron_5200_MTFDDAK1T9TDC
Device Model:     Micron_5200_MTFDDAK1T9TDC
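I can post full SMART data for the drives if that helps; this is roughly how I would collect it (device list assumed to be sda through sdd):
Code:
# SMART health summary and attribute table for each drive
for d in /dev/sd{a..d}; do
    echo "=== $d ==="
    smartctl -H -A "$d"
done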


Any help appreciated, thanks!
 
So to summarize:
  • You have not changed any settings.
  • The issue still occurred even after replacing the SSD (`sdd`).
  • The same issue has also occurred on the other disks (`sda` / `sdb` / `sdc`).

Based on the above, I suspect the problem may be related to a component shared by all SSDs, such as the SATA controller, backplane, or similar shared infrastructure.
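If it is something at the controller or link level, it would usually show up as ATA errors or link resets in the kernel log; something along these lines should surface them (the grep pattern is just a rough filter):
Code:
# kernel messages since boot, filtered for ATA errors, resets and failed commands
journalctl -k | grep -iE 'ata[0-9]+.*(error|reset|failed)'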
 
The freezes have happened again. This is what iostat shows:
Code:
Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sda              6.19      0.03     0.00   0.00    0.13     4.90    6.39      0.20     0.00   0.00    0.25    31.75    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.06
sdb              0.00      0.00     0.00   0.00    0.00     0.00    2.20      0.13     0.00   0.00 2958.27    58.55    0.00      0.00     0.00   0.00    0.00     0.00    0.40 1934.00    7.27  71.88
sdc             35.93      0.20     0.00   0.00    0.18     5.73    2.40      0.16     0.00   0.00    0.42    68.00    0.00      0.00     0.00   0.00    0.00     0.00    0.40    0.00    0.01   0.22
sdd             67.27      0.37     0.20   0.30    0.19     5.57    6.99      0.20     0.40   5.41    0.20    29.03    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.01   0.42


Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sda              2.60      0.01     0.00   0.00    0.23     4.00   13.80      0.68     0.60   4.17    0.29    50.09    0.00      0.00     0.00   0.00    0.00     0.00    0.60    0.00    0.00   0.06
sdb              0.00      0.00     0.00   0.00    0.00     0.00    4.80      0.26     0.00   0.00 2800.71    56.17    0.00      0.00     0.00   0.00    0.00     0.00    0.80 1500.50   14.64 117.36
sdc              0.00      0.00     0.00   0.00    0.00     0.00    7.40      0.70     0.00   0.00    0.73    96.43    0.00      0.00     0.00   0.00    0.00     0.00    0.80    0.25    0.01   0.10
sdd              2.00      0.01     0.20   9.09    0.80     4.40   13.00      0.68     0.80   5.80    0.28    53.17    0.00      0.00     0.00   0.00    0.00     0.00    0.60    0.33    0.01   0.12


Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sda             43.60      0.32     0.00   0.00    0.16     7.60    8.80      1.04     0.20   2.22    1.82   120.55    0.00      0.00     0.00   0.00    0.00     0.00    0.20    1.00    0.02   0.92
sdb              0.00      0.00     0.00   0.00    0.00     0.00    4.40      0.22     0.20   4.35 2933.95    51.27    0.00      0.00     0.00   0.00    0.00     0.00    0.40 1634.00   13.56 114.72
sdc             16.00      0.11     0.20   1.23    0.20     7.30    7.80      1.04     1.40  15.22    0.90   136.82    0.00      0.00     0.00   0.00    0.00     0.00    0.20    0.00    0.01   0.24
sdd             32.40      0.25     0.00   0.00    0.25     7.78    8.20      1.04     0.80   8.89    0.56   129.37    0.00      0.00     0.00   0.00    0.00     0.00    0.20    0.00    0.01   0.50

In the logs I can see this:
zed[1848837]: eid=242 class=deadman pool='rpool' vdev=sdb3 size=4096 offset=1619770630144 priority=3 err=0 flags=0x300080 bookmark=53732:0:5:0
In this case only sdb is causing trouble. Taking sdb3 out of the pool:
zpool offline rpool sdb3
The problem seems to be fixed: the load average drops and IO delay falls to almost zero.

Bringing the sdb drive back into the pool:
zpool online rpool sdb3
IO delay starts rising again.
What is this: another faulty drive? Or faulty ZFS logic that unfairly loads only one drive?
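For completeness, this is how I've been comparing per-vdev latencies while the issue is happening (as I understand the flags: -v per vdev, -l latency columns, -y to skip the since-boot summary):
Code:
# per-vdev latency statistics every 5 seconds
zpool iostat -vly rpool 5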