Hi everyone, I run a small home lab for myself. It hosts a firewall plus some game servers and web apps. One of my drives has started to fail (see the SMART values below), and I'm also getting I/O errors in my VMs (I presume some of their data sits on the bad blocks). How do I go about fixing this? I plan on replacing the drive, but I'm not sure whether that will fix the ZFS errors. Any help is much appreciated. I can also provide any other logs or output you need. Thank you!
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always   -           0
  3 Spin_Up_Time            0x0027   203   178   021    Pre-fail  Always   -           2825
  4 Start_Stop_Count        0x0032   094   094   000    Old_age   Always   -           6125
  5 Reallocated_Sector_Ct   0x0033   129   129   140    Pre-fail  Always   FAILING_NOW 561
  7 Seek_Error_Rate         0x002e   200   185   000    Old_age   Always   -           0
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always   -           36221
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always   -           0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   094   094   000    Old_age   Always   -           6124
192 Power-Off_Retract_Count 0x0032   196   196   000    Old_age   Always   -           3736
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always   -           1935126
194 Temperature_Celsius     0x0022   121   085   000    Old_age   Always   -           29
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always   -           266
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline  -           0
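For anyone skimming the table above: the row I'm worried about is Reallocated_Sector_Ct, where the normalized VALUE (129) has dropped below THRESH (140). Here is a minimal sketch of how I picked that out, assuming the usual smartctl rule that an attribute counts as failed when VALUE is at or below a nonzero THRESH. The names `SMART_ROWS` and `failing_attributes` are just my own illustrative names, and only three rows from the table are included to keep it short.

```python
# A few rows copied from the SMART table above (not the full table).
SMART_ROWS = """\
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0033 129 129 140 Pre-fail Always FAILING_NOW 561
196 Reallocated_Event_Count 0x0032 001 001 000 Old_age Always - 266
"""


def failing_attributes(text):
    """Return (name, value, thresh) for rows whose VALUE is at or below a nonzero THRESH."""
    failing = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not an attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name = fields[1]
        value, thresh = int(fields[3]), int(fields[5])
        # Assumption: a THRESH of 0 means the attribute can never trip.
        if thresh > 0 and value <= thresh:
            failing.append((name, value, thresh))
    return failing


print(failing_attributes(SMART_ROWS))
```

Pasting the whole attribute table into `SMART_ROWS` should work the same way, since every row has the same ten whitespace-separated columns.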
Code:
root@homelab:~# zpool status -x -v
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub in progress since Thu Jun 5 16:02:50 2025
        67.0G / 67.0G scanned, 7.37G / 67.0G issued at 11.8M/s
        0B repaired, 11.00% done, 01:25:53 to go
config:

        NAME                                                STATE     READ WRITE CKSUM
        rpool                                               ONLINE       0     0     0
          mirror-0                                          ONLINE       0     0     0
            ata-WDC_WD10TPVT-00HT5T0_WD-WX51C10X9276-part3  ONLINE       0     0     0
            ata-ST91000640NS_9XG4G240-part3                 ONLINE       0     0     0
          mirror-1                                          ONLINE       0     0     0
            ata-WDC_WD10TPVT-00HT5T0_WD-WXF1AB0W3930-part3  ONLINE       0     0     0
            ata-WDC_WD10TPVT-00HT5T1_WD-WXH1AC0K4108-part3  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        rpool/data/vm-100-disk-0:<0x1>
        rpool/data/vm-101-disk-0:<0x1>