ERROR: Backup of VM 100 failed - job failed with err -5 - Input/output error

robbdog21

Member
Aug 21, 2020
Hey,

For the last month I've been having an issue where the backup fails at 37%.
I've changed the backup location and it still fails at 37%.
Can anyone see what I'm missing?

Proxmox version: 7.2-3
Error: Backup of VM 100 failed - job failed with err -5 - Input/output error
Backup size: sits around 81GB
 

Attachments

  • proxmox backup fail 1.JPG
  • proxmox backup fail 2.JPG
  • proxmox backup fail 3.JPG
  • proxmox backup fail 4.JPG
  • proxmox backup fail 5.JPG
Can you also check and post the system log, e.g., using journalctl --since -1d for any additional errors?

"Input/output error" normally indicates a storage problem of some form, possibly an I/O error on the underlying disk.
It would be worth checking the S.M.A.R.T. data of the backing disk too.
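If the journal is long, a small filter can help spot the relevant kernel messages. The following is only a rough sketch: the pattern list is my own assumption about which messages usually point at a failing disk, not an exhaustive catalogue, and the sample lines are kernel messages of the kind posted in this thread.

```python
import re
from collections import Counter

# Message fragments that commonly accompany disk trouble.
# Assumption: this list is illustrative, not complete.
PATTERNS = {
    "io_error": re.compile(r"I/O error, dev (\w+)"),
    "medium_error": re.compile(r"Sense Key : Medium Error"),
    "unrecovered_read": re.compile(r"Unrecovered read error"),
    "ata_exception": re.compile(r"ata\d+(?:\.\d+)?: exception"),
}


def scan_log(lines):
    """Count how often each error pattern appears in an iterable of log lines."""
    hits = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name] += 1
    return hits


def failing_devices(lines):
    """Return the set of block devices named in 'I/O error, dev X' messages."""
    devices = set()
    for line in lines:
        match = PATTERNS["io_error"].search(line)
        if match:
            devices.add(match.group(1))
    return devices


# Sample input: kernel messages as they appear in this thread's journal.
SAMPLE_LINES = [
    "kernel: ata1.00: exception Emask 0x0 SAct 0x80060 SErr 0x40000 action 0x0",
    "kernel: sd 0:0:0:0: [sda] tag#5 Sense Key : Medium Error [current]",
    "kernel: sd 0:0:0:0: [sda] tag#5 Add. Sense: Unrecovered read error - auto reallocate failed",
    "kernel: blk_update_request: I/O error, dev sda, sector 250862184 op 0x0:(READ)",
    "CRON[632387]: pam_unix(cron:session): session opened for user root(uid=0)",
]

for name, count in scan_log(SAMPLE_LINES).items():
    print(f"{name}: {count}")
print("devices:", sorted(failing_devices(SAMPLE_LINES)))
```

To use it on a real system, the output of journalctl --since -1d could be saved to a file and its lines passed to scan_log() instead of SAMPLE_LINES.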
 
Feb 23 12:06:58 ProxmoxServer pvedaemon[630625]: INFO: starting new backup job: vzdump 100 --mailto backup@domain.org.au --mode snapshot --prune-backups 'keep-last=15,keep-monthly=11' --mailnotification always --all 0 --storage D>
Feb 23 12:06:58 ProxmoxServer pvedaemon[630625]: INFO: Starting Backup of VM 100 (qemu)
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: exception Emask 0x0 SAct 0x80060 SErr 0x40000 action 0x0
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: irq_stat 0x40000008
Feb 23 12:14:20 ProxmoxServer kernel: ata1: SError: { CommWake }
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: failed command: READ FPDMA QUEUED
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: cmd 60/80:28:00:d8:f3/02:00:0e:00:00/40 tag 5 ncq dma 327680 in
res 41/40:80:68:da:f3/00:02:0e:00:00/00 Emask 0x409 (media error) <F>

Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: status: { DRDY ERR }
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: error: { UNC }
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: disabling queued TRIM support
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: supports DRM functions and may not be fully accessible
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: disabling queued TRIM support
Feb 23 12:14:20 ProxmoxServer kernel: ata1.00: configured for UDMA/133
Feb 23 12:14:20 ProxmoxServer kernel: sd 0:0:0:0: [sda] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Feb 23 12:14:20 ProxmoxServer kernel: sd 0:0:0:0: [sda] tag#5 Sense Key : Medium Error [current]
Feb 23 12:14:20 ProxmoxServer kernel: sd 0:0:0:0: [sda] tag#5 Add. Sense: Unrecovered read error - auto reallocate failed
Feb 23 12:14:20 ProxmoxServer kernel: sd 0:0:0:0: [sda] tag#5 CDB: Read(10) 28 00 0e f3 d8 00 00 02 80 00
Feb 23 12:14:20 ProxmoxServer kernel: blk_update_request: I/O error, dev sda, sector 250862184 op 0x0:(READ) flags 0x0 phys_seg 3 prio class 0
Feb 23 12:14:20 ProxmoxServer kernel: ata1: EH complete
Feb 23 12:14:24 ProxmoxServer pvedaemon[630625]: ERROR: Backup of VM 100 failed - job failed with err -5 - Input/output error
Feb 23 12:14:24 ProxmoxServer pvedaemon[630625]: INFO: Backup job finished with errors
Feb 23 12:14:24 ProxmoxServer pvedaemon[630625]: job errors
Feb 23 12:14:24 ProxmoxServer postfix/pickup[619486]: 8D0301E0FAE: uid=0 from=<root>
Feb 23 12:14:24 ProxmoxServer postfix/cleanup[631949]: 8D0301E0FAE: message-id=<20230223014424.8D0301E0FAE@ProxmoxServer.domain.local>
Feb 23 12:14:24 ProxmoxServer postfix/qmgr[868]: 8D0301E0FAE: from=<root@ProxmoxServer.domain.local>, size=12394, nrcpt=1 (queue active)
Feb 23 12:14:24 ProxmoxServer pvedaemon[3403609]: <root@pam> end task UPID:WhyallaHeadSt:00099F61:0A40494C:63F6C33A:vzdump:100:root@pam: job errors
Feb 23 12:14:24 ProxmoxServer postfix/smtp[631951]: 8D0301E0FAE: to=<backup@domain.org.au>, relay=central.smtp.domain.local[192.168.100.211]:25, delay=0.05, delays=0.02/0.01/0/0.01, dsn=2.6.0, status=sent (250 2.6.0 <2023>
Feb 23 12:14:24 ProxmoxServer postfix/qmgr[868]: 8D0301E0FAE: removed
Feb 23 12:17:01 ProxmoxServer CRON[632387]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Feb 23 12:17:01 ProxmoxServer CRON[632388]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Feb 23 12:17:01 ProxmoxServer CRON[632387]: pam_unix(cron:session): session closed for user root
Feb 23 12:17:14 ProxmoxServer pvedaemon[3418581]: <root@pam> successful auth for user 'root@pam'
Feb 23 12:22:51 ProxmoxServer smartd[591]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 69 to 66
Feb 23 12:22:51 ProxmoxServer smartd[591]: Device: /dev/sda [SAT], ATA error count increased from 37 to 38
 
Do you think the 26% wear on the SSD could be the issue?
 

Attachments

  • proxmox wearout.JPG
I just googled "ata1.00: exception Emask 0x0 SAct 0x800000 SErr 0x0 action 0x0".
It looks like the SSD is on its way out. Do you reckon that's the case?
 
Yeah, from the ata exception and the
blk_update_request: I/O error, dev sda, sector 250862184 op 0x0:(READ) flags 0x0 phys_seg 3 prio class 0
it does seem hardware-related, especially if it just started happening without any bigger software updates (e.g. kernel, QEMU, PBS).
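As a cross-check, the failed sector can be related to the SCSI command logged just before it. This is a minimal sketch, assuming 512-byte logical sectors (typical for SATA SSDs, but not confirmed for this drive); the function names are my own. It decodes the Read(10) CDB from the journal and checks that sector 250862184 lies inside the requested range. Since the backup log and the later migration log both report that same sector, a single unreadable spot on the disk would likely explain why the job always stops at the same percentage.

```python
import re

SECTOR_SIZE = 512  # bytes per logical sector; an assumption, see lead-in


def parse_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Bytes 2-5 hold the big-endian start LBA, bytes 7-8 the big-endian
    transfer length in blocks. Returns (start_lba, block_count).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    start_lba = int.from_bytes(cdb[2:6], "big")
    block_count = int.from_bytes(cdb[7:9], "big")
    return start_lba, block_count


def failed_sector(line):
    """Extract (device, sector) from a kernel 'I/O error' line, or None."""
    match = re.search(r"I/O error, dev (\w+), sector (\d+)", line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Values copied from the journal excerpt in this thread.
start, count = parse_read10("28 00 0e f3 d8 00 00 02 80 00")
device, bad = failed_sector(
    "blk_update_request: I/O error, dev sda, sector 250862184 "
    "op 0x0:(READ) flags 0x0 phys_seg 3 prio class 0"
)

print(f"request: LBA {start} + {count} blocks ({count * SECTOR_SIZE} bytes)")
print(f"failed sector {bad} on {device}, offset {bad - start} into the request")
print(f"inside request: {start <= bad < start + count}")
print(f"byte offset on disk: {bad * SECTOR_SIZE / 2**30:.1f} GiB")
```

The decoded transfer length of 640 blocks also matches the "dma 327680" byte count in the ata1.00 cmd line, which suggests the decoding is consistent with the log.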

The SMART data fits that assumption. In particular, the "uncorrectable error counter" not being zero, plus the other CRC/ECC error counters also being above zero, makes this a rather clear-cut case of hardware on its way to total failure.
Going by the power-on hours, the drive has been running for about 5 years and 10 months, and in my experience a lot of SSDs older than 5 years, especially consumer-grade ones like your Evo 850 here, are accelerating towards their end of life.
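For anyone who wants to run the same check on their own disk, here is a rough sketch that reads the attribute table printed by smartctl -A and lists error counters with a non-zero raw value. Note that the sample table below is invented for illustration; it is not the data from the screenshots in this thread. The set of watched attribute names is also an assumption and differs between vendors.

```python
# Attribute names to watch; an assumption, names vary between drive vendors.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Uncorrectable_Error_Cnt",
    "ECC_Error_Rate",
    "CRC_Error_Count",
}


def parse_smart_attributes(text):
    """Map attribute name -> raw value from 'smartctl -A' style table output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the tenth column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def nonzero_error_counters(attrs):
    """Return the watched error counters whose raw value is above zero."""
    return {name: raw for name, raw in attrs.items() if name in WATCHED and raw > 0}


def power_on_years(hours):
    """Convert power-on hours to years, using 365.25 days per year."""
    return hours / (24 * 365.25)


# Hypothetical sample output, NOT the poster's actual SMART data.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       51100
187 Uncorrectable_Error_Cnt 0x0032   099   099   000    Old_age   Always       -       38
199 CRC_Error_Count         0x003e   099   099   000    Old_age   Always       -       2
"""

attrs = parse_smart_attributes(SAMPLE)
print("non-zero error counters:", nonzero_error_counters(attrs))
print(f"power-on time: {power_on_years(attrs['Power_On_Hours']):.1f} years")
```

A non-zero counter is a hint rather than proof of a dying disk, so it is best read together with the kernel log.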

So I'd make sure to replace this disk and move all relevant data sooner rather than later.
 
Wait, isn't the Evo 850 a SATA SSD? If so, then 69°C is DAMN hot.
 
That might be because we are using little Intel NUCs; it may be full of dust.

I've got another NUC with a new 500GB SSD. I just joined them as a cluster, and when the migration gets to 37% it fails.
Is there any way around this? VM 100 qmp command 'block-job-cancel' failed - Block job 'drive-sata0' not found

Feb 27 12:43:04 ProxmoxServer kernel: sd 0:0:0:0: [sda] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
Feb 27 12:43:04 ProxmoxServer kernel: sd 0:0:0:0: [sda] tag#8 Sense Key : Medium Error [current]
Feb 27 12:43:04 ProxmoxServer kernel: sd 0:0:0:0: [sda] tag#8 Add. Sense: Unrecovered read error - auto reallocate failed
Feb 27 12:43:04 ProxmoxServer kernel: sd 0:0:0:0: [sda] tag#8 CDB: Read(10) 28 00 0e f3 d8 00 00 02 80 00
Feb 27 12:43:04 ProxmoxServer kernel: blk_update_request: I/O error, dev sda, sector 250862184 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
Feb 27 12:43:04 ProxmoxServer kernel: ata1: EH complete
Feb 27 12:43:04 ProxmoxServer pvedaemon[1575245]: VM 100 qmp command failed - VM 100 qmp command 'block-job-cancel' failed - Block job 'drive-sata0' not found
Feb 27 12:43:05 ProxmoxServer pmxcfs[1569083]: [status] notice: received log
Feb 27 12:43:05 ProxmoxServer pmxcfs[1569083]: [status] notice: received log
Feb 27 12:43:06 ProxmoxServer pmxcfs[1569083]: [status] notice: received log
Feb 27 12:43:06 ProxmoxServer pmxcfs[1569083]: [status] notice: received log
Feb 27 12:43:07 ProxmoxServer pvedaemon[1575245]: migration problems
Feb 27 12:43:07 ProxmoxServer pvedaemon[1572961]: <root@pam> end task UPID:WhyallaHeadSt:0018094D:0C5229A7:63FC0FB6:qmigrate:100:root@pam: migration problems
 

Attachments

  • proxmox migrate.JPG
Still the same error as before. Makes sense as the migration is not much different from backup-restore.
Use ddrescue to clone the broken disk to a new one of at least the same size.
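A common way to run GNU ddrescue is in two passes: first copy everything that reads cleanly, then go back and retry the bad areas. The sketch below only builds and prints the two command lines and does not execute anything. The device paths /dev/sdX and /dev/sdY and the map file name rescue.map are placeholders, and the helper name is my own. Double-check source and target before running anything for real, since the target gets overwritten.

```python
import shlex


def ddrescue_commands(source, target, mapfile):
    """Build a two-pass GNU ddrescue invocation as argument lists.

    Pass 1: -f allows writing to a block device, -n skips the slow
            scraping phase so the readable areas are copied first.
    Pass 2: -d uses direct disc access for the input, -r3 retries the
            remaining bad areas up to three times. The same map file
            lets ddrescue resume where pass 1 left off.
    """
    first_pass = ["ddrescue", "-f", "-n", source, target, mapfile]
    retry_pass = ["ddrescue", "-f", "-d", "-r3", source, target, mapfile]
    return [first_pass, retry_pass]


# Placeholder paths: replace with the failing disk and the new disk.
for cmd in ddrescue_commands("/dev/sdX", "/dev/sdY", "rescue.map"):
    print(shlex.join(cmd))
```

It is usually safer to do this from a live or rescue system with the failing disk not mounted, and to keep the map file on a third disk, not on the source or the target.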
 
