Greetings, I have a problem. I have one disk with a ZFS pool that holds the virtual hard drives of all my VMs, and a second disk for backups. Normally three VMs run from that pool: a TrueNAS machine, a Linux server, and another Linux machine used purely for APIs. After I updated Proxmox to version 7.4-17, the pool started getting suspended. Thinking the disk could be failing, I bought a newer one with more space, but it gives me the same error. I am not sure whether it is just a coincidence that the new disk is also damaged and produces the same errors. What I did was connect the new disk, create a new ZFS pool on it, copy all the VMs one by one to the new pool, and remove the old disk. I am attaching the log, and the SMART test gives me the following:
LOG:
https://sharetxt.live/proxmoxLog1
SMART test
Code:
SMART Error Log Version: 1
ATA Error Count: 5
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 5 occurred at disk power-on lifetime: 75 hours (3 days + 3 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 10 10 0a 00 e0 Device Fault; Error: ABRT 16 sectors at LBA = 0x00000a10 = 2576
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 10 10 0a 00 e0 08 00:00:30.306 READ DMA
ef 10 02 00 00 00 a0 08 00:00:30.298 SET FEATURES [Enable SATA feature]
Error 4 occurred at disk power-on lifetime: 75 hours (3 days + 3 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 10 10 0a 00 e0 Device Fault; Error: ABRT 16 sectors at LBA = 0x00000a10 = 2576
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 10 10 0a 00 e0 08 00:00:30.106 READ DMA
ef 10 02 00 00 00 a0 08 00:00:30.098 SET FEATURES [Enable SATA feature]
Error 3 occurred at disk power-on lifetime: 75 hours (3 days + 3 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 10 10 0a 00 e0 Device Fault; Error: ABRT 16 sectors at LBA = 0x00000a10 = 2576
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 10 10 0a 00 e0 08 00:00:29.906 READ DMA
ef 10 02 00 00 00 a0 08 00:00:29.898 SET FEATURES [Enable SATA feature]
Error 2 occurred at disk power-on lifetime: 75 hours (3 days + 3 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 10 10 0a 00 e0 Device Fault; Error: ABRT 16 sectors at LBA = 0x00000a10 = 2576
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 10 10 0a 00 e0 08 00:00:29.706 READ DMA
ef 10 02 00 00 00 a0 08 00:00:29.699 SET FEATURES [Enable SATA feature]
Error 1 occurred at disk power-on lifetime: 75 hours (3 days + 3 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 10 10 0a 00 e0 Device Fault; Error: ABRT 16 sectors at LBA = 0x00000a10 = 2576
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 10 10 0a 00 e0 08 00:00:29.373 READ DMA
ef 10 02 00 00 00 a0 08 00:00:29.366 SET FEATURES [Enable SATA feature]
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
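For reference, the LBA that smartctl prints in each error entry can be reproduced from the register bytes. This is only a small sketch, assuming the 28-bit LBA layout used by READ DMA (low nibble of DH holds bits 24-27); the variable names are illustrative and simply mirror the register abbreviations in the log.

```shell
#!/bin/sh
# Register bytes copied from the error entries above (hex).
SN=0x10   # Sector Number register  -> LBA bits 0-7
CL=0x0a   # Cylinder Low register   -> LBA bits 8-15
CH=0x00   # Cylinder High register  -> LBA bits 16-23
DH=0xe0   # Device/Head register    -> low nibble is LBA bits 24-27

# Assemble the 28-bit LBA from the four registers.
lba=$(( (($DH & 0x0f) << 24) | ($CH << 16) | ($CL << 8) | $SN ))
printf 'LBA = 0x%08x = %d\n' "$lba" "$lba"
```

All five logged errors carry the same register values, so they all point at the same 16-sector read starting at this LBA, about 30 seconds after power-up.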
root@pve:~# zpool status
  pool: MainStorage
 state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
config:

        NAME                                        STATE     READ WRITE CKSUM
        MainStorage                                 ONLINE       0     0     0
          ata-WDC_WD20EFAX-68B2RN1_WD-WXA2A51HF5LN  ONLINE      33     0    40

errors: List of errors unavailable: pool I/O is currently suspended
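The action line in the status output suggests running zpool clear, and the SMART log notes that no self-test has been logged yet. Below is a minimal sketch of that sequence, assuming the pool name and disk ID shown in the output above and that the disk appears under /dev/disk/by-id/. The run helper and the DRY_RUN variable are hypothetical names for this example: by default the script only prints the commands, and it executes them (as root) only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires root).
DRY_RUN="${DRY_RUN:-1}"

# Names taken from the zpool status output; the by-id path is an assumption.
POOL="MainStorage"
DISK="/dev/disk/by-id/ata-WDC_WD20EFAX-68B2RN1_WD-WXA2A51HF5LN"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run zpool clear "$POOL"          # resume I/O on the suspended pool
run zpool status -v "$POOL"      # list per-device errors once I/O resumes
run smartctl -t short "$DISK"    # start a short self-test (runs in the background)
run smartctl -a "$DISK"          # full report; the self-test result appears after it finishes
```

The short self-test takes a couple of minutes on most drives, so the last command usually has to be repeated later to see the result.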
I have attached some images of what is happening. I have already tried zpool import, which works, and zpool clear as well; however, a few minutes after I start a VM, the pool is suspended again. I am open to any help you can give me. Thank you very much, and have an excellent day.