VM backup err -5 - Input/output error

valk

Hi guys. I can't back up a VM, and nothing I've googled has helped, so I need some advice. Here is the log from the backup job:

INFO: Starting Backup of VM 3617 (qemu)
INFO: Backup started at 2023-01-08 08:06:15
INFO: status = running
INFO: VM Name: 3617
INFO: include disk 'sata0' 'local-zfs:vm-3617-disk-0' 50G
INFO: exclude disk 'sata1' 'hddzfs:vm-3617-disk-0' (backup=no)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive '/var/lib/vz/dump/vzdump-qemu-3617-2023_01_08-08_06_15.vma.zst'
INFO: started backup task 'cb27f70b-9ab0-47e5-a510-38d13c8a254a'
INFO: resuming VM again
INFO: 0% (348.0 MiB of 50.0 GiB) in 3s, read: 116.0 MiB/s, write: 73.6 MiB/s
INFO: 2% (1.1 GiB of 50.0 GiB) in 6s, read: 245.3 MiB/s, write: 75.4 MiB/s
INFO: 3% (1.9 GiB of 50.0 GiB) in 10s, read: 225.7 MiB/s, write: 68.9 MiB/s
INFO: 8% (4.0 GiB of 50.0 GiB) in 13s, read: 708.1 MiB/s, write: 76.3 MiB/s
INFO: 8% (4.2 GiB of 50.0 GiB) in 14s, read: 184.0 MiB/s, write: 65.0 MiB/s
ERROR: job failed with err -5 - Input/output error
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 3617 failed - job failed with err -5 - Input/output error
INFO: Failed at 2023-01-08 08:06:29
INFO: Backup job finished with errors
TASK ERROR: job errors

It always stops at 8%. (That's what I get from the 50 GiB disk inside the VM: proxbackup.png)
My backup destination shows 400 GB of free space.
I've tried vzdump.conf with the defaults and with size: 4096:
# vzdump default settings
#tmpdir: DIR
#dumpdir: DIR
#storage: STORAGE_ID
#mode: snapshot|suspend|stop
#bwlimit: KBPS
#performance: max-workers=N
#ionice: PRI
#lockwait: MINUTES
#stopwait: MINUTES
#stdexcludes: BOOLEAN
#mailto: ADDRESSLIST
#prune-backups: keep-INTERVAL=N[,...]
#script: FILENAME
#exclude-path: PATHLIST
#pigz: N
#notes-template: {{guestname}}
size: 4096
root@prox:~# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.83-1-pve)
pve-manager: 7.3-4 (running version: 7.3-4/d69b70d4)
pve-kernel-5.15: 7.3-1
pve-kernel-helper: 7.3-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.2-1
proxmox-backup-file-restore: 2.3.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-2
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.7-pve2
 
Hello,

In general, the “error 5” code points to an issue with the drive.

To narrow down the cause, I would run a backup to a different storage and check the disk using smartctl.
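Something along these lines should do (a rough sketch only; replace other-storage and /dev/sdX with your actual storage ID and disk device):

# one-off backup of the VM to a different storage target
vzdump 3617 --mode snapshot --compress zstd --storage other-storage

# quick health verdict and full SMART report for a disk
smartctl -H /dev/sdX
smartctl -a /dev/sdX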
 
Thanks for the reply. It's worth mentioning that I have 3 containers and one other VM backing up regularly to this storage. Also, if I don't include the 50 GiB disk in the backup, it runs without issues.
 
I did a test using a network drive and got the same result, exactly at 8%:
INFO: starting new backup job: vzdump 3617 --remove 0 --mode snapshot --compress zstd --storage dvaNFS --node prox --notes-template '{{guestname}}'
INFO: Starting Backup of VM 3617 (qemu)
INFO: Backup started at 2023-01-10 08:56:06
INFO: status = running
INFO: VM Name: 3617
INFO: include disk 'sata0' 'local-zfs:vm-3617-disk-0' 50G
INFO: exclude disk 'sata1' 'hddzfs:vm-3617-disk-0' (backup=no)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive '/mnt/pve/dvaNFS/dump/vzdump-qemu-3617-2023_01_10-08_56_06.vma.zst'
INFO: started backup task 'e8d36916-2639-43e0-ad77-7fe9a3a2c36b'
INFO: resuming VM again
INFO: 0% (308.1 MiB of 50.0 GiB) in 3s, read: 102.7 MiB/s, write: 60.9 MiB/s
INFO: 2% (1.1 GiB of 50.0 GiB) in 6s, read: 258.6 MiB/s, write: 88.6 MiB/s
INFO: 3% (1.9 GiB of 50.0 GiB) in 9s, read: 273.1 MiB/s, write: 89.3 MiB/s
INFO: 8% (4.2 GiB of 50.0 GiB) in 12s, read: 792.6 MiB/s, write: 96.9 MiB/s
ERROR: job failed with err -5 - Input/output error
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 3617 failed - job failed with err -5 - Input/output error
INFO: Failed at 2023-01-10 08:56:46
INFO: Backup job finished with errors
TASK ERROR: job errors
 

Sounds like this might not be a write problem on the backup storage, but a read problem on the source storage?

I would check the filesystem inside the VM as well as the whole underlying disk/storage.
 
I like the train of thought, but I have absolutely no idea where to start... What should I check, and how? Should I check for errors in the guest filesystem, or for some kind of access problem from the host? Permissions? Is there some log I can view to get an idea of where to look next?
Maybe someone knows a similar thread on the forums, or some documentation?
 
Yes. How to check depends on the filesystem in use; Google should help here.

For ZFS: Scrub [1]
For the physical disk(s): Long SMART test [2]

[1] https://openzfs.github.io/openzfs-docs/man/8/zpool-scrub.8.html
[2] https://www.thomas-krenn.com/en/wiki/SMART_tests_with_smartctl
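For example (a sketch only; hddzfs is taken from your VM config, rpool is assumed to be the pool behind local-zfs on a default install, and the disk devices need to be substituted):

# scrub the pools backing the VM disks and review the result
zpool scrub hddzfs
zpool scrub rpool
zpool status -v

# long SMART self-test on each member disk, then re-read the report once it finishes
smartctl -t long /dev/sdX
smartctl -a /dev/sdX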
Looks like we found the culprit! The ID of the VM I am trying to back up is 3617.
  pool: hddzfs
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 16K in 05:09:34 with 3 errors on Sun Jan 8 05:33:35 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        hddzfs                                          ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            ata-HGST_HUS724040ALA640_PN2334PCKVV28B     ONLINE       0     0     0
            ata-HGST_HMS5C4040BLE640_PL2331LAH496HJ     ONLINE       0     0     0
            ata-WDC_WD40PURZ-85TTDY0_WD-WCC7K4NYNNP4    ONLINE       0     0     0
            ata-WDC_WD40PURZ-85TTDY0_WD-WCC7K6EKUF02    ONLINE       0     0     0
            ata-WDC_WD4000FYYZ-01UL1B2_WD-WMC130F0YDNE  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        hddzfs/vm-3617-disk-0:<0x1>

  pool: rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 1.62M in 00:03:17 with 2 errors on Wed Jan 11 08:16:30 2023
config:

        NAME                                                  STATE     READ WRITE CKSUM
        rpool                                                 ONLINE       0     0     0
          mirror-0                                            ONLINE       0     0     0
            ata-Micron_5100_MTFDDAK480TBY_172117438175-part3  ONLINE       0     0    10
            ata-Micron_5100_MTFDDAK480TBY_17211743822A-part3  ONLINE       0     0    15

errors: Permanent errors have been detected in the following files:

        rpool/data/vm-3617-disk-0:<0x1>

Is there any way I can save this VM? What is this file? I obviously don't have a backup, and rebuilding this VM from scratch would be a huge pain. Also, what could have caused this?
 

Unfortunately I do not know how to proceed on this, sorry.

Maybe try proceeding from inside the VM (a filesystem and bad-blocks check, and identification of the affected file(s)), as suggested here:
https://forum.proxmox.com/threads/p...ted-in-the-following-files.92865/#post-404523
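A rough sketch of those checks for a Linux guest (assuming ext4 on /dev/sda1; adjust the device names to your layout, and run fsck from a rescue/live system rather than on a mounted filesystem):

# force a full filesystem check
fsck -f /dev/sda1

# non-destructive read-only scan for unreadable blocks on the whole virtual disk
badblocks -sv /dev/sda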
 