A long tedious story of XFS and memory corruption

danmac

Dec 6, 2016
WARNING this thread is long and contains zero entertainment value.

So I spent the last few days intermittently looking into this problem, and although I don't care about it anymore, I'm documenting it here in case anyone else may benefit.

I got a new 3TB USB external drive. This will contain mostly porn^H^H^H^Hbig files I'm relieving my main (RAIDZ1, 4TB) storage array of. It won't be the end of the world if I lose these files; they won't change much, and once they're on the disk it will be almost entirely reads. So I chose to format the drive with XFS.

I mount the drive on the Proxmox machine and all is well. I start the copy and leave it running. Drive is USB3 but host is USB2 so I know this will take a while. I already ran badblocks -wsv on the drive off a USB3 host - the drive is good. (and clearly I'm not in a rush)
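For reference, the prep sequence was roughly the below. The device name and mount point are placeholders (I used the whole disk, no partition table); adjust as needed.

# destructive surface test - this WIPES the disk (run off a USB3 host)
badblocks -wsv /dev/sdX

# format as XFS and mount on the Proxmox host
mkfs.xfs /dev/sdX
mount /dev/sdX /mnt/usb3tb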

Fast forward an hour or so and the Proxmox machine is dead. I hook up a screen and the kernel has puked a panic all over it, but the screen is 80x25 and I don't get it all. The arse end of the stack dump makes me suspect ZFS immediately. I repeat this several times and the same thing happens each time, in the same timeframe but on different files.

ZFS scrub always comes back clean. I can hashdeep all the files without any issues, and the hashes always match. Anyway, long story short, I set the machine up with a crashkernel and bla bla bla. One thing I didn't find in the repos was a "dbg" package for the Proxmox kernel?
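In case anyone wants to replicate the crash-dump setup: this is roughly what I ended up with, using the Debian-style kdump-tools (the crashkernel reservation size below is just an example, size it for your RAM).

# reserve memory for the crash kernel - add crashkernel=256M to the kernel
# command line, e.g. GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub
apt-get install kdump-tools
update-grub && reboot

# after the reboot, confirm the crash kernel is actually loaded
cat /sys/kernel/kexec_crash_loaded    # 1 = loaded
kdump-config status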

After all this, I have the full kernel panic text (along with memory dumps I can't do anything with). Here is an example:

[ 4455.460832] Kernel panic - not syncing: buffer modified while frozen!
[ 4455.460852] CPU: 3 PID: 45 Comm: kswapd0 Tainted: P O 4.4.49-1-pve #1
[ 4455.460868] Hardware name: System manufacturer P5Q-E/P5Q-E, BIOS 2101 04/06/2009
[ 4455.460884] 0000000000000086 000000007bcde9a1 ffff880119f479f0 ffffffff813fa693
[ 4455.460907] ffffffffc022b5ef ffff880119f47a88 ffff880119f47a78 ffffffff8118d30d
[ 4455.460929] ffff880100000008 ffff880119f47a88 ffff880119f47a20 000000007bcde9a1
[ 4455.460952] Call Trace:
[ 4455.460964] [<ffffffff813fa693>] dump_stack+0x63/0x90
[ 4455.460993] [<ffffffff8118d30d>] panic+0xd5/0x21c
[ 4455.461064] [<ffffffffc015f814>] arc_cksum_verify.isra.10+0xe4/0xf0 [zfs]
[ 4455.461116] [<ffffffffc0160997>] arc_buf_destroy+0x47/0x170 [zfs]
[ 4455.461167] [<ffffffffc0162b3a>] arc_evict_state+0x48a/0x770 [zfs]
[ 4455.461229] [<ffffffffc01be65e>] ? spa_get_random+0x2e/0x60 [zfs]
[ 4455.461280] [<ffffffffc0162ec1>] arc_adjust_impl.constprop.25+0x31/0x40 [zfs]
[ 4455.461347] [<ffffffffc01630af>] arc_adjust+0x1df/0x460 [zfs]
[ 4455.461397] [<ffffffffc01646f9>] arc_shrink+0x49/0xb0 [zfs]
[ 4455.461447] [<ffffffffc0164a20>] __arc_shrinker_func.isra.24+0xa0/0x130 [zfs]
[ 4455.461513] [<ffffffffc0164ac7>] arc_shrinker_func_scan_objects+0x17/0x30 [zfs]
[ 4455.461558] [<ffffffff811a230a>] shrink_slab.part.40+0x1fa/0x410
[ 4455.461586] [<ffffffff811a6921>] shrink_zone+0x291/0x2d0
[ 4455.461613] [<ffffffff811a7ad3>] kswapd+0x583/0xa40
[ 4455.461640] [<ffffffff811a7550>] ? mem_cgroup_shrink_node_zone+0x1c0/0x1c0
[ 4455.461670] [<ffffffff810a12ea>] kthread+0xea/0x100
[ 4455.461696] [<ffffffff810a1200>] ? kthread_park+0x60/0x60
[ 4455.461725] [<ffffffff818606cf>] ret_from_fork+0x3f/0x70
[ 4455.461751] [<ffffffff810a1200>] ? kthread_park+0x60/0x60

After examining the ZFS source code, I understand the situation. This machine is not using ECC RAM, and to mitigate being a dirty cheap heathen fully deserving of corrupted data, I configured ZFS with zfs_flags=0x10 (ZFS_DEBUG_MODIFY), which enables additional in-memory corruption checks: the ARC checksums its buffers while they're frozen and verifies them again later. That check is detecting a discrepancy and panicking the kernel, I assume because the alternative is to serve up corrupted data.
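For anyone wondering how the flag gets set: zfs_flags is a module parameter, so something like the below does it (the modprobe.d contents are just how I'd persist it, not anything Proxmox-specific).

# flip it at runtime - 0x10 is ZFS_DEBUG_MODIFY, i.e. checksum ARC buffers
# when they're frozen and verify them again before they're freed
echo 0x10 > /sys/module/zfs/parameters/zfs_flags

# or persist it across reboots via /etc/modprobe.d/zfs.conf:
#   options zfs zfs_flags=0x10
# then rebuild the initramfs so it's picked up at boot
update-initramfs -u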

Well, shit. This could be *anything*. CPU? RAM? PSU? Bad driver? Anything between the ZFS array on SATA (flakey ICH9 SATA I gotta limit to 1.5Gbps, no less) and the XFS drive on USB.

The core components are known good (my old desktop + an old server PSU), but I know what you're thinking: known good is only good until it's not. Out comes memtester (the one you run from within Linux), which ran overnight with no errors. This machine (as configured) has previously run memtest86+ for days, and spent several years running fault-free at 3.2GHz instead of the stock 2.4GHz. (Don't sweat, it's been running at stock ever since it stopped being my desktop.) Also, I know 100% I can repeatedly read every file on the source ZFS array reliably with no issues - at this point I'm on my 5th or 6th scrub this week, and I've hashdeep'd all these files at least twice.
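For completeness, the in-OS memory test and the hash checks were along these lines (the size, iteration count and paths are made up for the example; point them at your own RAM and dataset).

# userspace RAM test, sized below total RAM so the box stays usable
memtester 6G 10

# baseline hashes of the source files...
hashdeep -r /tank/bigfiles > bigfiles.hashes

# ...then audit against that baseline later; a clean audit means every
# file still hashes to the same value
hashdeep -r -a -k bigfiles.hashes /tank/bigfiles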

So I'm no nearer to figuring out what's going on here. I'm pretty much out of ideas. I reformat the USB drive as ext4 and retry ... completes successfully. Reformat with btrfs and retry ... completes successfully. Oh well, screw it. I left it as btrfs. Pre-3.16 btrfs was just asking for trouble, but 4.4 isn't horrendous, and I'm not doing anything clever with it.

I wonder if I should bother reporting this upstream or something? I have a couple of crash dumps but no -dbg kernel package, and honestly I'm not sure I'd know what to do with them anyway.

Anyway kids, that's enough storytime for one day. TL;DR: don't use XFS. It shat on my kernel memory with 100% reproducibility, it might shit on yours too, and if you're not paying attention you might never know until you've already overwritten your backups with shit.
 
