Critical ZFS crash: VERIFY3(0 == dmu_object_claim_dnsize(zfsvfs->z_os... failed (0 == 28)

Don Daniello

We just experienced a nasty crash that triggered whenever the kernel touched our ZFS pool. It occurred after we replaced a faulty drive and resilvered, but in fact it had nothing to do with that.

The crash occurs when ZFS tries to replay the ZIL after a previous power loss. The issues linked below document the problem in detail, including symptoms and potential solutions. They helped us recover.

Partial trace (similar to the one in https://github.com/zfsonlinux/zfs/issues/7151#issuecomment-518045491):
Code:
 kernel:[  149.314630] VERIFY3(0 == dmu_object_claim_dnsize(zfsvfs->z_os, obj, DMU_OT_PLAIN_FILE_CONTENTS, 0, obj_type, bonuslen, dnodesize, tx)) failed (0 == 28)
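
For anyone matching their own logs against this trace: the trailing `(0 == 28)` is the value the call returned instead of 0. Assuming it is a standard errno number (which is how ZFS functions conventionally report errors), 28 corresponds to ENOSPC, "No space left on device", on Linux. The following is a minimal sketch for decoding that number from such a line; the helper name `decode_verify_failure` is made up for illustration and is not part of any ZFS tooling.

```python
import errno
import os
import re

# Sample line copied from the trace above.
LINE = (
    "kernel:[  149.314630] VERIFY3(0 == dmu_object_claim_dnsize(zfsvfs->z_os, obj, "
    "DMU_OT_PLAIN_FILE_CONTENTS, 0, obj_type, bonuslen, dnodesize, tx)) failed (0 == 28)"
)


def decode_verify_failure(line):
    """Return (code, symbolic name, message) for a 'failed (0 == N)' line, or None.

    Hypothetical helper: it assumes N is a plain errno value.
    """
    match = re.search(r"failed \(0 == (\d+)\)", line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numeric codes to names such as 'ENOSPC'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


if __name__ == "__main__":
    print(decode_verify_failure(LINE))
```

Note that the symbolic name only tells you what the function returned; per the linked issues, the underlying cause here is the ZIL replay bug, not necessarily a full pool.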

Related PRs/issues:
* tuxoko/zfs commit fixing the issue: https://github.com/tuxoko/zfs/commit/698ba75a05b09e2b7dd460e5564e3c9d6a9df1f2 (refers to 3 issues against zfsonlinux/zfs listed below)
* Issue 1: https://github.com/zfsonlinux/zfs/issues/7151
* Issue 2: https://github.com/zfsonlinux/zfs/issues/8910
* Issue 3: https://github.com/zfsonlinux/zfs/issues/9123
* Merged PR in zfsonlinux/zfs: https://github.com/zfsonlinux/zfs/pull/9061
* Helpful workarounds: https://github.com/zfsonlinux/zfs/issues/8910#issuecomment-502381050 and https://github.com/zfsonlinux/zfs/issues/8910#issuecomment-504147847

Since the bug has been fixed in zfsonlinux/zfs, it is up to the Proxmox team to release a kernel version with the patch - probably backported, since zfsonlinux releases are rare.
 
Thanks for the report!
We try to follow ZFS point releases (and the first patchset for 0.8.2 recently hit the ZFS lists) - once they are out we try to include them (after some testing).

You could ask upstream (https://github.com/zfsonlinux/zfs/pull/9161) to include this commit in 0.8.2, so that all other ZFS users benefit from the fix as well.