PVE 6.1 (ZFS setup) keeps killing disks

alexc

Hello,

I rented a blade server with two 3.5" 4 TB SATA disks and added both as a ZFS mirror. No problems were seen initially, but soon one of the drives failed (physically), so the hosting company replaced it promptly and I resilvered the pool. After 3 days the second disk died physically, and again I had it replaced. Since that moment disks keep dying roughly every 2-3 days (and the hosting company keeps replacing them, they are good guys), and they even replaced the whole server once in the hope that the controller might be faulty. Nothing has helped so far.

You see, these drives are ST4000NM002A (see https://www.seagate.com/enterprise-storage/exos-drives/exos-e-drives/exos-7e8/ ), that is, Exos 7E8 4TB 512e SATA. These are not home-grade disks, so 3 days is far too short for them to last (and the I/O load is quite low too).

I have only one VM on that PVE server; it is used for web development, so the load is small. The disks appear to die at night, when the backup is in progress. The backup is set to run from rpool to the "local" (/var/lib/vz) directory, so it reads from and writes to the same pool physically.
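For reference, the storage definition and the nightly job boil down to roughly the following (the "local" entry is the PVE default directory storage, and the exact vzdump options are approximate, from memory):

Code:
# /etc/pve/storage.cfg -- "local" is the default directory storage, which lives on rpool
dir: local
        path /var/lib/vz
        content backup,iso,vztmpl

# the nightly backup is roughly equivalent to (options approximate):
vzdump 800 --storage local --mode snapshot --compress lzo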

I doubt ZFS is capable of killing disks in any way, nor that these disks can kill themselves with vibration or anything like that. The server itself is this one: https://www.supermicro.com/en/products/system/3U/5039/SYS-5039MC-H12TRF.cfm (SuperServer 5039MC-H12TRF); it is provided by the hosting company and I see nothing suspicious in that choice. The CPU is a Xeon E-2288G, a new and promising one (https://ark.intel.com/content/www/u...eon-e-2288g-processor-16m-cache-3-70-ghz.html).

Please advise how I can save the disks and make the server work without killing them!
 
Hi,

We run Seagate Exos drives with ZFS in our environment without any problems.
Also, I have never heard from anybody about problems with ZFS and spinning disks.

Maybe your hoster got a bad batch of disks.
 
A bad batch may well be the reason they replaced the disks that many times so easily (they are nice, really), but they told me many other clients use the same disks from the same batch without any problems.

They also tried to install disks of the same series but from a different batch to get around that, but no luck.

So they keep asking me what I could be doing on the server to kill the disks. If they remove a killed disk from this server and put it into another one, the disk is really dead, so this is not a controller/driver issue.
 
And yes, the server has no HW RAID, so all I could do is set it up again as an md mirror, but I'd like to try the ZFS replication feature to send the VM state to another server.

We planned to rent several such blades (the same config), so ZFS replication would be good to try and use, but since we are stuck with this issue we decided to sort it out first before renting the other blades.
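What I had in mind for the replication part is roughly this (a sketch only; it assumes a second node, here called pve2, already joined to a cluster, and VM 800 living on ZFS storage):

Code:
# PVE storage replication is driven by pvesr, which uses ZFS send/receive underneath
pvesr create-local-job 800-0 pve2 --schedule "*/15"   # replicate VM 800 to node pve2 every 15 minutes
pvesr status                                          # check when the last run happened and whether it failed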
 
Very curious. We have had many disks of different types and from different vendors running with ZFS for years (RAIDZ2 and mirror). Of course we see disks passing away, but at the normally expected rates (usually < 1 per month out of around 500 disks). So I bet your hoster has simply had no luck with these disks.
 
I would like to believe it is just me having no luck, but I know other clients are happy with the same disks.

I doubt the hoster collects bad-looking disks just to supply them to my server :) ZFS does its best to predict problems and warns me each time: first that there are some checksum errors on one drive, then that the errors have reached a "too many" level so it won't use the disk anymore (at this point the mirror is broken and we run on a single disk, and that single disk runs well, which is strange indeed), and after maybe 8-10 hours the server hangs (there is no hot-swap, it is a blade with simple Intel direct-to-motherboard SATA3 ports), and once I reboot it via IPMI the disk turns out to be dead.
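What I look at each time ZFS starts complaining, before the box finally hangs (the device name below is only an example; I use whichever member is showing the errors):

Code:
zpool status -v rpool                                                # which mirror member shows READ/WRITE/CKSUM errors
smartctl -a /dev/sda | egrep -i 'reallocated|pending|uncorrect|crc'  # SMART counters on the suspect disk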

So what's strange is why a single disk appears to run as intended, while with two disks problems arise, but only on one of them. Another strange thing is that once a disk gets replaced, the next candidate to die is the other one (so first it is sda that dies, next time, after sda has been replaced, it is sdb, then sda again). If it were only sda dying each time, I'd say the hardware is faulty and the sda slot sees extra vibration or something, but the failures keep alternating between the slots, so that theory doesn't hold.

Bad luck, but I need to have this server working!
 
Can you describe your HW in detail?
 
Yes, here it is:

Blade server SYS-5039MC-H12TRF
2x Exos 7E8 4TB 512e SATA disks
1x Samsung M.2 970 Pro
Xeon E-2288G
64 GB of RAM
(onboard IPMI)

I can check the motherboard devices, but there is nothing custom.

The M.2 disk is used as a separate disk (its own ZFS pool), not as a cache for the mirror made of the HDDs.
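Just to make the layout explicit, the NVMe pool was created as its own pool, roughly like this (the by-id path is an example, not the real device name):

Code:
zpool create nvme1Tb /dev/disk/by-id/nvme-Samsung_SSD_970_PRO_1TB_EXAMPLE   # standalone pool on the M.2 disk
# it was NOT attached to rpool, i.e. nothing like: zpool add rpool cache <nvme-device>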
 
Are there any controller cards, like RAID, involved?
 
These are all common HW components, so there should be no problems with drivers.
Have you got any disk corruption since the board was replaced?
 
No and yes. Since we replaced the board and one of the HDDs, the server seems to run well, and the mirror resilvering finished fine too (that was yesterday night, actually, so not long ago), so I can say "no problems so far".

But last night the backup job gave me strange output, and I can reproduce it manually by trying to "move" one of the VM disks from the HDD-based pool to the NVMe-based pool:

Code:
no zvol device link for 'vm-800-disk-3' found after 10 sec found. (500)

This is unknown territory for me, so I tried to read the ZFS docs on this, and it seems to relate to old ZFS modules in PVE, but my setup is the latest and no updates are available:

Code:
pve-manager/6.1-3/37248ce6 (running kernel: 5.3.10-1-pve)

(Any recommendations on this situation?)
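One thing I am considering trying when the link does not appear (just a guess on my side, not something I found in the docs) is to re-trigger udev and look for the symlink:

Code:
udevadm trigger --subsystem-match=block   # re-run the udev rules for block devices
udevadm settle                            # wait for the event queue to drain
ls -l /dev/zvol/nvme1Tb/                  # see whether the vm-800-disk-3 link shows up now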

I don't know if this is disk corruption, but it doesn't look like the storage is OK, right?

The pool itself is OK, like this:

Code:
  pool: rpool
 state: ONLINE
  scan: resilvered 2.74T in 0 days 07:40:06 with 0 errors on Tue Jan 14 23:20:35 2020
config:

    NAME                                        STATE     READ WRITE CKSUM
    rpool                                       ONLINE       0     0     0
      mirror-0                                  ONLINE       0     0     0
        ata-ST4000NM002A-2HZ101_WJG0EP76-part3  ONLINE       0     0     0
        ata-ST4000NM002A-2HZ101_WJG0DJVS-part3  ONLINE       0     0     0
 
Do you have the udev rule for the symlink creation?
It is located at /lib/udev/rules.d/60-zvol.rules.
And are the symlinks present?
Code:
ls -Rhl /dev/zvol
 
Yes, I can see the symlinks:

Bash:
/dev/zvol:
total 0
drwxr-xr-x 2 root root 300 Jan 16 01:03 nvme1Tb
drwxr-xr-x 3 root root  60 Jan 16 01:03 rpool

/dev/zvol/nvme1Tb:
total 0
lrwxrwxrwx 1 root root 11 Jan 16 01:03 vm-800-disk-0 -> ../../zd112
lrwxrwxrwx 1 root root 13 Jan 16 01:03 vm-800-disk-0-part1 -> ../../zd112p1
lrwxrwxrwx 1 root root 10 Jan 16 01:03 vm-800-disk-1 -> ../../zd64
lrwxrwxrwx 1 root root 12 Jan 16 01:03 vm-800-disk-1-part1 -> ../../zd64p1
lrwxrwxrwx 1 root root 10 Jan 16 01:03 vm-800-disk-2 -> ../../zd48
lrwxrwxrwx 1 root root 12 Jan 16 01:03 vm-800-disk-2-part1 -> ../../zd48p1
lrwxrwxrwx 1 root root 10 Jan 16 01:03 vm-800-disk-3 -> ../../zd80
lrwxrwxrwx 1 root root 12 Jan 16 01:03 vm-800-disk-3-part1 -> ../../zd80p1
lrwxrwxrwx 1 root root 10 Jan 16 01:03 vm-800-disk-4 -> ../../zd96
lrwxrwxrwx 1 root root 12 Jan 16 01:03 vm-800-disk-4-part1 -> ../../zd96p1
lrwxrwxrwx 1 root root 10 Jan 16 01:03 vm-800-disk-5 -> ../../zd32
lrwxrwxrwx 1 root root 12 Jan 16 01:03 vm-800-disk-5-part1 -> ../../zd32p1
lrwxrwxrwx 1 root root 12 Jan 16 01:03 vm-800-disk-5-part2 -> ../../zd32p2

/dev/zvol/rpool:
total 0
drwxr-xr-x 2 root root 160 Jan 16 01:03 data

/dev/zvol/rpool/data:
total 0
lrwxrwxrwx 1 root root 12 Jan 16 01:03 vm-100-disk-0 -> ../../../zd0
lrwxrwxrwx 1 root root 14 Jan 16 01:03 vm-100-disk-0-part1 -> ../../../zd0p1
lrwxrwxrwx 1 root root 14 Jan 16 01:03 vm-100-disk-0-part2 -> ../../../zd0p2
lrwxrwxrwx 1 root root 14 Jan 16 01:03 vm-100-disk-0-part3 -> ../../../zd0p3
lrwxrwxrwx 1 root root 13 Jan 16 01:03 vm-800-disk-0 -> ../../../zd16
lrwxrwxrwx 1 root root 15 Jan 16 01:03 vm-800-disk-0-part1 -> ../../../zd16p1

But actually I worked around this (to be on the safe side) by adding another similarly sized disk to the VM and restoring the files that were supposed to live on that zvol from backup, so the list above includes this zvol too.

What's strange is that I have 2 HDDs in a mirror, but I can boot from the 2nd disk and cannot boot from the 1st. When I try to boot from the 1st I see a black screen and nothing happens, while if I set it to boot from the 2nd disk I see the GRUB menu with the kernels and everything boots fine.

I tried a pool scrub; it finished fine with no errors, so it is strange that I cannot boot from either disk, only from one. That is to say, I don't know whether this is connected with the disk failure situations, since I installed GRUB on each new disk as intended by the documentation.
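For completeness, this is roughly what I do on each replacement disk to make it bootable again, following the documented procedure as I understand it (a sketch for a legacy BIOS/GRUB setup; sda is the healthy disk, sdb the freshly replaced one, and all device names here are examples only):

Code:
sgdisk /dev/sda -R /dev/sdb        # copy the partition table from the healthy disk to the new one
sgdisk -G /dev/sdb                 # give the new disk fresh partition GUIDs
zpool replace rpool OLD_MEMBER /dev/disk/by-id/NEW_DISK-part3   # resilver onto the data partition (placeholders)
grub-install /dev/sdb              # reinstall GRUB so either mirror member can boot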
 
