ERROR: Backup of VM 168 failed - job failed with err -5 - Input/output error

MG_100

Good evening everyone,

For a long time now, backing up one of our VMs has always failed with the following error:

Code:
INFO: Starting Backup of VM 168 (qemu)
INFO: Backup started at 2021-11-08 02:08:44
INFO: status = running
INFO: VM Name: web
INFO: include disk 'scsi0' 'local:168/vm-168-disk-0.qcow2' 50G
INFO: backup mode: snapshot
INFO: ionice priority: 5
INFO: creating Proxmox Backup Server archive 'vm/168/2021-11-08T01:08:44Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '62f60099-7a36-4335-a852-018b74b8badf'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO:   0% (428.0 MiB of 50.0 GiB) in 3s, read: 142.7 MiB/s, write: 93.3 MiB/s
INFO:   1% (528.0 MiB of 50.0 GiB) in 6s, read: 33.3 MiB/s, write: 33.3 MiB/s
INFO:   2% (1.0 GiB of 50.0 GiB) in 19s, read: 39.7 MiB/s, write: 39.7 MiB/s
INFO:   3% (1.5 GiB of 50.0 GiB) in 32s, read: 38.8 MiB/s, write: 32.6 MiB/s
INFO:   4% (2.0 GiB of 50.0 GiB) in 46s, read: 36.9 MiB/s, write: 36.9 MiB/s
INFO:   5% (2.5 GiB of 50.0 GiB) in 1m 3s, read: 29.4 MiB/s, write: 29.4 MiB/s
INFO:   6% (3.0 GiB of 50.0 GiB) in 1m 34s, read: 16.6 MiB/s, write: 16.6 MiB/s
INFO:   7% (3.5 GiB of 50.0 GiB) in 1m 58s, read: 21.0 MiB/s, write: 21.0 MiB/s
INFO:   8% (4.0 GiB of 50.0 GiB) in 2m 16s, read: 28.7 MiB/s, write: 28.7 MiB/s
INFO:   9% (4.5 GiB of 50.0 GiB) in 2m 36s, read: 25.8 MiB/s, write: 22.4 MiB/s
INFO:  10% (5.0 GiB of 50.0 GiB) in 2m 59s, read: 22.1 MiB/s, write: 16.2 MiB/s
INFO:  11% (5.5 GiB of 50.0 GiB) in 3m 40s, read: 12.4 MiB/s, write: 12.2 MiB/s
INFO:  12% (6.0 GiB of 50.0 GiB) in 4m 34s, read: 9.5 MiB/s, write: 9.5 MiB/s
INFO:  13% (6.5 GiB of 50.0 GiB) in 5m 3s, read: 18.2 MiB/s, write: 17.5 MiB/s
INFO:  14% (7.0 GiB of 50.0 GiB) in 5m 24s, read: 24.4 MiB/s, write: 24.4 MiB/s
INFO:  15% (7.5 GiB of 50.0 GiB) in 5m 57s, read: 15.3 MiB/s, write: 11.3 MiB/s
INFO:  16% (8.0 GiB of 50.0 GiB) in 6m 14s, read: 29.9 MiB/s, write: 17.4 MiB/s
INFO:  17% (8.5 GiB of 50.0 GiB) in 6m 48s, read: 15.1 MiB/s, write: 14.6 MiB/s
INFO:  18% (9.0 GiB of 50.0 GiB) in 7m 34s, read: 11.3 MiB/s, write: 11.3 MiB/s
INFO:  19% (9.5 GiB of 50.0 GiB) in 7m 57s, read: 22.1 MiB/s, write: 10.4 MiB/s
INFO:  19% (9.7 GiB of 50.0 GiB) in 8m 3s, read: 28.0 MiB/s, write: 8.7 MiB/s
ERROR: job failed with err -5 - Input/output error
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 168 failed - job failed with err -5 - Input/output error
INFO: Failed at 2021-11-08 02:16:48

The backups are stored on a Proxmox Backup Server (PBS).
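To see how far into the 50 GiB disk the job gets before it aborts, the progress lines of the task log can be parsed. This is only a minimal sketch: the function name `last_progress` and the regular expression are illustrative and not part of any Proxmox tool, and it assumes the progress format shown in the log above with elapsed times given as `Xm Ys` or `Ys` (no hours).

```python
import re

# Matches vzdump progress lines such as:
#   INFO:  19% (9.7 GiB of 50.0 GiB) in 8m 3s, read: ...
# Assumption: elapsed time is "<min>m <sec>s" or "<sec>s" only.
PROGRESS = re.compile(
    r"INFO:\s+(\d+)% \(([\d.]+) (MiB|GiB) of [\d.]+ GiB\) in (?:(\d+)m )?(\d+)s"
)


def last_progress(log_text):
    """Return (percent, GiB read, elapsed seconds) of the last progress
    line seen before the first ERROR line, or None if there is none."""
    last = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("ERROR:"):
            break
        m = PROGRESS.match(line)
        if m:
            done = float(m.group(2))
            if m.group(3) == "MiB":
                done /= 1024  # normalise to GiB
            seconds = int(m.group(4) or 0) * 60 + int(m.group(5))
            last = (int(m.group(1)), done, seconds)
    return last


# Last lines of the task log above.
SAMPLE = """\
INFO:  18% (9.0 GiB of 50.0 GiB) in 7m 34s, read: 11.3 MiB/s, write: 11.3 MiB/s
INFO:  19% (9.5 GiB of 50.0 GiB) in 7m 57s, read: 22.1 MiB/s, write: 10.4 MiB/s
INFO:  19% (9.7 GiB of 50.0 GiB) in 8m 3s, read: 28.0 MiB/s, write: 8.7 MiB/s
ERROR: job failed with err -5 - Input/output error
"""

if __name__ == "__main__":
    percent, done_gib, seconds = last_progress(SAMPLE)
    print(f"aborted at {percent}% ({done_gib} GiB) after {seconds}s")
```

Running this over the logs of several nights shows whether the job always aborts at roughly the same position, which would point to one fixed region of the image rather than a random failure.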

Pveversion:
Code:
proxmox-ve: 7.0-2 (running kernel: 5.11.22-5-pve)
pve-manager: 7.0-13 (running version: 7.0-13/7aa7e488)
pve-kernel-helper: 7.1-2
pve-kernel-5.11: 7.0-8
pve-kernel-5.4: 6.4-4
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.11.22-4-pve: 5.11.22-9
pve-kernel-5.11.22-2-pve: 5.11.22-4
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph: 15.2.15-pve1
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: not correctly installed
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve1
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-10
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-3
libpve-storage-perl: 7.0-12
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.11-1
proxmox-backup-file-restore: 2.0.11-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.1-1
pve-docs: 7.0-5
pve-edk2-firmware: 3.20210831-1
pve-firewall: 4.2-4
pve-firmware: 3.3-2
pve-ha-manager: 3.3-1
pve-i18n: 2.5-1
pve-qemu-kvm: 6.0.0-4
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-16
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1

lvdisplay:
Code:
  --- Logical volume ---
  LV Path                /dev/pve/swap
  LV Name                swap
  VG Name                pve
  LV UUID                1Lvqaf-dHDi-PRkm-fPPb-J8KW-6NG6-t1s6TZ
  LV Write Access        read/write
  LV Creation host, time proxmox, 2021-02-17 17:51:48 +0100
  LV Status              available
  # open                 2
  LV Size                4.00 GiB
  Current LE             1024
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
 
  --- Logical volume ---
  LV Path                /dev/pve/root
  LV Name                root
  VG Name                pve
  LV UUID                lStzjb-wDoU-rHTT-HBnR-u83h-IngD-geEgiT
  LV Write Access        read/write
  LV Creation host, time proxmox, 2021-02-17 17:51:48 +0100
  LV Status              available
  # open                 1
  LV Size                <495.50 GiB
  Current LE             126847
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

Does anyone have an idea how we can fix this?

We have already run fsck etc. inside the VM itself, and it did not find any bad sectors.
 
What does the syslog/journal look like (on both servers)? An input/output error is usually a disk problem.
 
I found the following in the syslog:

Code:
Nov  7 02:17:45 server kernel: [1693966.634996] sd 2:0:0:0: [sda] tag#53 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=4s
Nov  7 02:17:45 server kernel: [1693966.635020] sd 2:0:0:0: [sda] tag#53 Sense Key : Aborted Command [current]
Nov  7 02:17:45 server kernel: [1693966.635022] sd 2:0:0:0: [sda] tag#53 Add. Sense: I/O process terminated
Nov  7 02:17:45 server kernel: [1693966.635026] sd 2:0:0:0: [sda] tag#53 CDB: Read(10) 28 00 28 39 b1 90 00 07 f0 00
Nov  7 02:17:45 server kernel: [1693966.635037] blk_update_request: I/O error, dev sda, sector 674869648 op 0x0:(READ) flags 0x0 phys_seg 254 prio class 2
Nov  7 02:17:45 server kernel: [1693967.002227] sd 2:0:0:0: [sda] tag#58 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=4s
Nov  7 02:17:45 server kernel: [1693967.002238] sd 2:0:0:0: [sda] tag#58 Sense Key : Aborted Command [current]
Nov  7 02:17:45 server kernel: [1693967.002241] sd 2:0:0:0: [sda] tag#58 Add. Sense: I/O process terminated
Nov  7 02:17:45 server kernel: [1693967.002245] sd 2:0:0:0: [sda] tag#58 CDB: Read(10) 28 00 28 39 c9 88 00 08 00 00
Nov  7 02:17:45 server kernel: [1693967.002248] blk_update_request: I/O error, dev sda, sector 674875784 op 0x0:(READ) flags 0x4000 phys_seg 254 prio class 2
Nov  7 02:17:45 server kernel: [1693967.312491] sd 2:0:0:0: [sda] tag#144 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=4s
Nov  7 02:17:45 server kernel: [1693967.312550] sd 2:0:0:0: [sda] tag#144 Sense Key : Aborted Command [current]
Nov  7 02:17:45 server kernel: [1693967.312565] sd 2:0:0:0: [sda] tag#144 Add. Sense: I/O process terminated
Nov  7 02:17:45 server kernel: [1693967.312571] sd 2:0:0:0: [sda] tag#144 CDB: Read(10) 28 00 28 3a 96 00 00 07 f0 00
Nov  7 02:17:45 server kernel: [1693967.312588] blk_update_request: I/O error, dev sda, sector 674928128 op 0x0:(READ) flags 0x4000 phys_seg 254 prio class 2
Nov  7 02:17:48 server vzdump[4020670]: ERROR: Backup of VM 168 failed - job failed with err -5 - Input/output error
 
That looks as if the disk '/dev/sda' is not in good shape.
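The kernel lines themselves support this: the block address encoded in each Read(10) CDB is the same sector that `blk_update_request` reports as failed, so the errors come from reads on the host disk, not from inside the guest. Below is a minimal sketch for decoding those CDBs. It assumes the standard 10-byte READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); the function name `decode_read10` is only illustrative.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as a hex string, e.g. the part
    after 'CDB: Read(10)' in the kernel log.
    Returns (lba, transfer_length_in_blocks)."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: number of blocks
    return lba, length


# The three CDBs from the syslog excerpt above.
CDBS = [
    "28 00 28 39 b1 90 00 07 f0 00",
    "28 00 28 39 c9 88 00 08 00 00",
    "28 00 28 3a 96 00 00 07 f0 00",
]

if __name__ == "__main__":
    for cdb in CDBS:
        lba, length = decode_read10(cdb)
        print(f"LBA {lba}, {length} blocks")
```

The three CDBs decode to LBAs 674869648, 674875784 and 674928128, which are exactly the sectors named in the `I/O error, dev sda` lines.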