VM fails to back up (previously OK)

al-kizik

Jan 14, 2025
A couple of weeks ago, one VM started failing to back up. The job gets to about 25% and then reports:
Code:
ERROR: job failed with err -5 - Input/output error
INFO: aborting backup job
INFO: resuming VM again

Looking in the logs, I see many errors (see the attached snippet).
I have also attached the output of "pveversion -v" and "qm config 2000".

I presume it is an SSD media error (that is where the VM's disk lives), but I would welcome advice from an expert on what exactly is going wrong and how to overcome it. Thanks.
 


The kernel is trying to read something from the disk, but fails:
Code:
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: qc timeout after 15000 msecs (cmd 0x2f)
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: Read log 0x10 page 0x00 failed, Emask 0x4
Jul 18 18:06:21 HP-i7-7700 kernel: ata2: failed to read log page 10h (errno=-5)
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: irq_stat 0x40000008
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: failed command: READ FPDMA QUEUED
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: cmd 60/00:00:00:17:4d/01:00:00:00:00/40 tag 0 ncq dma 131072 in
         res 41/40:01:00:00:00/00:00:00:00:00/00 Emask 0x9 (media error)
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: status: { DRDY ERR }
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: error: { UNC }
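
If you want to see which block that failed command was aimed at, the "cmd" line can be decoded by hand. Here is a minimal Python sketch, assuming the usual libata field order (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and that for READ FPDMA QUEUED (0x60) the sector count sits in the feature fields. The function name decode_ncq_read is made up for illustration.

```python
import re

# Twelve two-digit hex fields separated by "/" or ":" after the word "cmd".
# Assumed order (libata error print):
#   command/feature:nsect:lbal:lbam:lbah/
#   hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
TASKFILE_RE = re.compile(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_ncq_read(line):
    """Return (start_lba, sector_count) for a READ FPDMA QUEUED 'cmd' line."""
    match = TASKFILE_RE.search(line)
    if match is None:
        raise ValueError("no ATA taskfile found in line")
    fields = [int(part, 16) for part in re.split(r"[/:]", match.group(1))]
    (command, feature, nsect, lbal, lbam, lbah,
     hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah, device) = fields
    if command != 0x60:
        raise ValueError("not a READ FPDMA QUEUED command")
    # 48-bit LBA: low three bytes first, then the three "hob" (high order) bytes.
    lba = (lbal | (lbam << 8) | (lbah << 16)
           | (hob_lbal << 24) | (hob_lbam << 32) | (hob_lbah << 40))
    # For NCQ reads the transfer length is carried in the feature fields.
    count = feature | (hob_feature << 8)
    return lba, count


line = "ata2.00: cmd 60/00:00:00:17:4d/01:00:00:00:00/40 tag 0 ncq dma 131072 in"
lba, count = decode_ncq_read(line)
print(lba, count, count * 512)
```

For the line from this log, the decoded length times 512 bytes agrees with the "dma 131072" value on the same line, which is a useful sanity check on the assumed field order.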

It retries many times and then, as a last resort, resets the link to the drive:
Code:
Jul 18 18:06:21 HP-i7-7700 kernel: ata2: hard resetting link
Jul 18 18:06:21 HP-i7-7700 kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: ACPI cmd f5/00:00:00:00:00:e0(SECURITY FREEZE LOCK) filtered out
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: ACPI cmd b1/c1:00:00:00:00:e0(DEVICE CONFIGURATION OVERLAY) filtered out
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: ACPI cmd f5/00:00:00:00:00:e0(SECURITY FREEZE LOCK) filtered out
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: ACPI cmd b1/c1:00:00:00:00:e0(DEVICE CONFIGURATION OVERLAY) filtered out
Jul 18 18:06:21 HP-i7-7700 kernel: ata2.00: configured for UDMA/133
Jul 18 18:06:21 HP-i7-7700 kernel: scsi_io_completion_action: 25 callbacks suppressed

But because the errors are unrecoverable, the reset doesn't help. The following lines name the sector on the disk that it can't read:
Code:
Jul 18 18:06:21 HP-i7-7700 kernel: sd 1:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=15s
Jul 18 18:06:21 HP-i7-7700 kernel: sd 1:0:0:0: [sda] tag#0 Sense Key : Medium Error [current] 
Jul 18 18:06:21 HP-i7-7700 kernel: sd 1:0:0:0: [sda] tag#0 Add. Sense: Unrecovered read error - auto reallocate failed
Jul 18 18:06:21 HP-i7-7700 kernel: I/O error, dev sda, sector 5052160 op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 0
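
To get an overview of how many distinct sectors are affected, the "I/O error" lines can be collected from a saved kernel log. The following is a rough Python sketch, not a finished tool: the function name failed_reads is made up, and it assumes the sector numbers in these messages are in the kernel's 512-byte units.

```python
import re

# Matches the block-layer message shown above and captures device and sector.
IO_ERROR_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")

# Assumption: the kernel reports these sector numbers in 512-byte units.
KERNEL_SECTOR_BYTES = 512


def failed_reads(log_text):
    """Return a list of (device, sector, byte_offset) for each I/O error line."""
    hits = []
    for log_line in log_text.splitlines():
        match = IO_ERROR_RE.search(log_line)
        if match:
            device = match.group(1)
            sector = int(match.group(2))
            hits.append((device, sector, sector * KERNEL_SECTOR_BYTES))
    return hits


sample = (
    "Jul 18 18:06:21 HP-i7-7700 kernel: I/O error, dev sda, sector 5052160 "
    "op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 0"
)
for device, sector, offset in failed_reads(sample):
    print(f"{device}: sector {sector} (byte offset {offset})")
```

The same function could be pointed at the text of a saved "journalctl -k" dump. Many different sectors showing up would support the conclusion that the drive itself is failing rather than one isolated block.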

You will need to replace disk sda; it seems to be in bad shape.
 
Thank you for the detailed reply; it helps my understanding. Obviously not the conclusion I wanted, but certainly the one I suspected. I will make sure backups are in place and then replace the drive. Thanks for replying.