Salvage broken zfs rpool?

Aug 16, 2023
Hello everyone,

First off, my hardware configuration:
Mainboard: ASRock X570 Pro4
CPU: Ryzen 5750G
GPU: Intel A380
RAM: 128GB 3200MHz ECC UDIMM
NIC: X540-T2
SSD: 2x Samsung 980 Pro 2TB for rpool
3x10TB & 3x18TB for HDD-pool

Last weekend I installed Proxmox Backup Server inside an LXC with a bind mount to my HDD-pool on my PVE node.
I wanted to create a backup of an external server, which worked fine, but then I started backing up the VMs/LXCs on my PVE node.
Most worked fine, but I ran into errors with some LXCs/VMs, like:
INFO: Error: error at "root/world/region/r.-16.4.mca": No data available (os error 61)
ERROR: job failed with err -61 - No data available
After looking in syslog I saw messages like these:
Code:
2023-08-13T13:07:05.034645+02:00 pve zed: eid=35 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=131072 offset=140912259072 priority=4 err=61 flags=0x1808b0 delay=253ms bookmark=76013:96729:0:1
2023-08-13T13:07:38.859831+02:00 pve kernel: [  655.716472] critical medium error, dev nvme1n1, sector 1411234976 op 0x0:(READ) flags 0x0 phys_seg 16 prio class 2
2023-08-13T13:07:38.860361+02:00 pve zed: eid=36 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=131072 offset=722014388224 priority=4 err=61 flags=0x40080cb0 delay=308ms
2023-08-13T13:07:38.860478+02:00 pve zed: eid=37 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014502912 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496371
2023-08-13T13:07:38.860684+02:00 pve zed: eid=38 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014494720 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496370
2023-08-13T13:07:38.861065+02:00 pve zed: eid=39 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014470144 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496367
2023-08-13T13:07:38.861231+02:00 pve zed: eid=40 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014453760 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496365
2023-08-13T13:07:38.861348+02:00 pve zed: eid=41 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014445568 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496364
2023-08-13T13:07:38.861834+02:00 pve zed: eid=42 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014511104 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496374
2023-08-13T13:07:38.862249+02:00 pve zed: eid=43 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014486528 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496369
2023-08-13T13:07:38.862515+02:00 pve zed: eid=44 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014478336 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496368
2023-08-13T13:07:38.863063+02:00 pve zed: eid=45 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014437376 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496363
2023-08-13T13:07:38.863366+02:00 pve zed: eid=46 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014388224 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496357
2023-08-13T13:07:38.863638+02:00 pve zed: eid=47 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014461952 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496366
2023-08-13T13:07:38.863831+02:00 pve zed: eid=48 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014429184 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496362
2023-08-13T13:07:38.864352+02:00 pve zed: eid=49 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014412800 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496360
2023-08-13T13:07:38.864754+02:00 pve zed: eid=50 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014404608 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496359
2023-08-13T13:07:38.864963+02:00 pve zed: eid=51 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722014396416 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1496358
2023-08-13T13:07:38.865598+02:00 pve zed: eid=52 class=checksum pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 algorithm=fletcher4 size=8192 offset=722014420992 priority=4 err=52 flags=0x3808b0 bookmark=172:1:0:1496361
2023-08-13T13:07:39.171431+02:00 pve zed: eid=53 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=131072 offset=722041638912 priority=4 err=61 flags=0x40080cb0 delay=599ms
2023-08-13T13:07:39.171578+02:00 pve zed: eid=54 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722041745408 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1444633
2023-08-13T13:07:39.171673+02:00 pve zed: eid=55 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722041753600 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1444634
2023-08-13T13:07:39.171829+02:00 pve kernel: [  656.027934] nvme1n1: I/O Cmd(0x2) @ LBA 1411288200, 256 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR
2023-08-13T13:07:39.171835+02:00 pve kernel: [  656.027944] critical medium error, dev nvme1n1, sector 1411288200 op 0x0:(READ) flags 0x0 phys_seg 16 prio class 2
2023-08-13T13:07:39.172145+02:00 pve zed: eid=56 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722041704448 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1444628
2023-08-13T13:07:39.172202+02:00 pve zed: eid=57 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722041696256 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1444627
2023-08-13T13:07:39.172528+02:00 pve zed: eid=58 class=io pool='rpool' vdev=nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 size=8192 offset=722041679872 priority=4 err=61 flags=0x3808b0 bookmark=172:1:0:1444625
I had a bad feeling about this, and sadly, zpool status greeted me with this:
Code:
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 1.12M in 00:12:57 with 191 errors on Sun Aug 13 13:16:06 2023
config:

        NAME                                                    STATE     READ WRITE CKSUM
        rpool                                                   DEGRADED     0     0     0
          mirror-0                                              DEGRADED 2.77K     0     0
            nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510655P-part3  DEGRADED 2.98K     0   491  too many errors
            nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3  FAULTED     75     0     9  too many errors
errors: Permanent errors have been detected in the following files:

        rpool/data/vm-110-disk-0:<0x1>
        rpool/data/vm-105-disk-0:<0x1>
        <0x70>:<0x7e43f>
        <0x70>:<0x7e4c3>
        <0x70>:<0x7aac5>
        <0x70>:<0x7e4d5>
        /rpool/data/subvol-109-disk-0/root/world/region/r.-16.4.mca
The scrub in the scan line (repaired 1.12M in 00:12:57 with 191 errors on Sun Aug 13 13:16:06 2023) was started manually by me.
Apparently I was stupid and had only implemented scrubbing and mail notifications for my HDD-pool.
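For future reference, the part I was missing for rpool boils down to something like this (just a sketch; paths assume the stock Debian zfsutils-linux and zfs-zed packages):
Code:
# Debian ships a monthly scrub job for all imported pools:
cat /etc/cron.d/zfsutils-linux

# or trigger one manually:
zpool scrub rpool

# mail notifications come from ZED; set the recipient in zed.rc
# (e.g. ZED_EMAIL_ADDR="root") and restart the daemon:
nano /etc/zfs/zed.d/zed.rc
systemctl restart zfs-zed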
About a month ago I made some changes to my server:
- Upgrade to pve8
- installed Intel A380
- Upgraded BIOS
- enabled Resizable Bar
- enabled SR-IOV
Both SSDs' SMART values look fine, showing only 2% and 3% wear, with a 24TB difference in total writes (109TB vs. 133TB). I bought them 2 years ago.
I think the difference in TBW indicates that the problem predates the changes I made a month ago.
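In case it helps, this is roughly how I read those values (device names are examples; smartmontools needs to be installed):
Code:
apt install smartmontools
# shows firmware version, "Percentage Used" (wear) and
# "Media and Data Integrity Errors" per NVMe drive:
smartctl -a /dev/nvme0
smartctl -a /dev/nvme1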
Memtest ran fine multiple times.
Right now everything still works fine.
Is there any way to salvage this?
As I see it there is already data loss, and replacing one drive after the other and resilvering would not get me a clean pool again.
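(Just to be clear what I mean by that, the usual single-drive replacement would be roughly the following, with the new drive name as a placeholder; with corruption on both mirror members it would only resilver the already-damaged data.)
Code:
# replace one mirror member and resilver (drive names are examples;
# on a boot pool the new disk would also need partitioning and
# proxmox-boot-tool setup first):
zpool replace rpool \
    nvme-Samsung_SSD_980_PRO_2TB_S69ENF0R510639Y-part3 \
    /dev/disk/by-id/nvme-NEW_DRIVE-part3
zpool status rpool   # watch the resilver progress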
I could ditch both VMs 105 and 110 and delete the file in LXC 109 as there is no important data.
But I do not understand what these errors mean:

Code:
        <0x70>:<0x7e43f>
        <0x70>:<0x7e4c3>
        <0x70>:<0x7aac5>
        <0x70>:<0x7e4d5>
What would be the best thing to do right now?
I backed up all the VMs and LXCs I care about to my HDD-pool via the Proxmox Backup Server I mentioned earlier.
There is no encryption on my pools or my backup.

The plan I came up with is:
- Back up VMs/LXCs again
- Back up the PVE configs to the PBS LXC
- Remove the old SSDs
- Install Proxmox on new SSDs
- Import the HDD-pool
- Create the PBS LXC with the same config/mount point as before
- Restore the PVE configs
- Restore the VMs/LXCs

What paths are needed for a PVE host backup?
Right now I would back up:
- /etc
- /var
- /root
- /opt
- /boot
- installed apt packages and repos
Will this work or am I missing something?
Or is this idea stupid and there is a better/easier way?
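For the host part, I was thinking of something along these lines (the archive path is just an example; /etc/pve itself is a FUSE mount backed by the config database in /var/lib/pve-cluster):
Code:
# package list and repos
dpkg --get-selections > /root/pkg-selections.txt

# host configs; /var/lib/pve-cluster holds the config.db behind /etc/pve
tar czf /HDD-pool/pve-host-backup.tar.gz \
    /etc /root /opt /boot /var/lib/pve-cluster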

Thanks in advance!
 
SSD: 2x Samsung 980 Pro 2TB for rpool
Did you update the firmware of the Samsung 980 Pros? As far as I know, there is a known issue with those drives that you cannot recover from once they break down.
But I do not understand what these errors mean:

Code:
        <0x70>:<0x7e43f>
        <0x70>:<0x7e4c3>
        <0x70>:<0x7aac5>
        <0x70>:<0x7e4d5>
I think those are ZFS pointers to corrupted parts of deleted files, if I recall correctly. Until you overwrite those blocks, they stay marked as corrupted. You'd need to do quite a bit of calculation to determine which sectors of the drive to overwrite.
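If it were me, I'd delete or restore the affected files/zvols and then let the error log cycle out; as far as I know the list of permanent errors only empties after the counters are cleared and the pool has been scrubbed clean again. Roughly:
Code:
# after removing or restoring the affected files / zvols:
zpool clear rpool       # reset the READ/WRITE/CKSUM counters
zpool scrub rpool       # ZFS keeps two error logs, so it can take
zpool scrub rpool       # two clean scrubs before the list empties
zpool status -v rpool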
What would be the best thing to do right now?
I'm not sure. Maybe try updating the firmware on one of the drives, secure erase it, reinstall Proxmox, and restore from backups. You may need new drives if the new firmware was not installed earlier.
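To at least see where you stand (the exact update path depends on the vendor; fwupd only helps if LVFS carries firmware for that drive, otherwise Samsung's bootable updater ISO is the usual route):
Code:
# current firmware revision per drive:
smartctl -a /dev/nvme0 | grep -i firmware
nvme fw-log /dev/nvme0

# if the drive is supported by LVFS:
fwupdmgr refresh
fwupdmgr get-updates
fwupdmgr update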
I backed up all the VMs and LXCs I care about to my HDD-pool via the Proxmox Backup Server I mentioned earlier.
Glad to hear. I love how quickly PBS can back up VMs every few hours, for situations like this.
 
Thanks for the pointer.
The firmware is probably the issue.
Both are on version 2B2QGXA7, which is even one version earlier than the affected one mentioned in the article.
I will try updating the firmware on the drive marked as FAULTED first and see if it behaves normally again.
If that does not work, I will try the same on the other drive, which is marked as DEGRADED.
I would not have guessed that I would need to update the firmware on my SSDs; I have never done that on any other SSD before.
One more thing to look out for I guess.

Again, thank you!
You may have saved me some time digging into this :)
 
