[Backup] ERROR: job failed with err -5 - Input/output error

mfmconsulting

Member
Nov 14, 2021
Good evening everyone,
first of all thanks to those who can help me!

Below is the error:
Code:
INFO: starting new backup job: vzdump 100 --mode snapshot --remove 0 --compress 0 --node pve --storage bck
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2021-11-14 19:20:59
INFO: status = running
INFO: VM Name: MKCDC01
INFO: include disk 'sata0' 'local-lvm:vm-100-disk-0' 180G
INFO: include disk 'sata1' 'vm:100/vm-100-disk-0.qcow2' 600G
INFO: include disk 'sata2' 'vm:100/vm-100-disk-1.qcow2' 600G
INFO: include disk 'sata3' 'vm:100/vm-100-disk-2.qcow2' 500G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating archive '/mnt/md1/dump/vzdump-qemu-100-2021_11_14-19_20_59.vma'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '9a02f7d3-65bb-46c0-a150-cde3f96bb300'
INFO: resuming VM again
INFO: status: 0% (1924136960/2018634629120), sparse 0% (183267328), duration 3, read/write 641/580 MB/s
INFO: status: 1% (20277362688/2018634629120), sparse 0% (1904996352), duration 66, read/write 291/263 MB/s
INFO: status: 1% (31101550592/2018634629120), sparse 0% (2038902784), duration 116, read/write 216/213 MB/s
ERROR: job failed with err -5 - Input/output error
INFO: aborting backup job
ERROR: Backup of VM 100 failed - job failed with err -5 - Input/output error
INFO: Failed at 2021-11-14 19:23:03
INFO: Backup job finished with errors
TASK ERROR: job errors

The VM is a Windows Server 2019 domain controller. Inside the guest I ran a disk analysis and chkdsk /F on the volumes (NOT SSD), and all the results came back clean.

How can I fix this so that I can back up this VM again?

Greetings
 
Hi,

Is there any error in the syslog/journalctl output from the time of the backup?
syslog:

Code:
Nov 14 19:20:59 pve pvedaemon[11587]: INFO: starting new backup job: vzdump 100 --mode snapshot --remove 0 --compress 0 --node pve --storage bck
Nov 14 19:20:59 pve pvedaemon[11587]: INFO: Starting Backup of VM 100 (qemu)
Nov 14 19:21:00 pve systemd[1]: Starting Proxmox VE replication runner...
Nov 14 19:21:00 pve systemd[1]: pvesr.service: Succeeded.
Nov 14 19:21:00 pve systemd[1]: Started Proxmox VE replication runner.
Nov 14 19:21:46 pve pvestatd[1379]: auth key pair too old, rotating..
Nov 14 19:22:00 pve systemd[1]: Starting Proxmox VE replication runner...
Nov 14 19:22:00 pve systemd[1]: pvesr.service: Succeeded.
Nov 14 19:22:00 pve systemd[1]: Started Proxmox VE replication runner.
Nov 14 19:22:59 pve kernel: [103971.821809] blk_update_request: critical medium error, dev nvme0n1, sector 315193016 op 0x0:(READ) flags 0x80700 phys_seg 9 prio class 0
Nov 14 19:22:59 pve kernel: [103971.900123] blk_update_request: critical medium error, dev nvme0n1, sector 315193072 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Nov 14 19:22:59 pve kernel: [103971.901681] Buffer I/O error on dev dm-6, logical block 7593038, async page read
Nov 14 19:22:59 pve kernel: [103971.977964] blk_update_request: critical medium error, dev nvme0n1, sector 315193072 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Nov 14 19:22:59 pve kernel: [103971.979497] Buffer I/O error on dev dm-6, logical block 7593038, async page read
Nov 14 19:23:00 pve systemd[1]: Starting Proxmox VE replication runner...
Nov 14 19:23:00 pve systemd[1]: pvesr.service: Succeeded.
Nov 14 19:23:00 pve systemd[1]: Started Proxmox VE replication runner.
Nov 14 19:23:03 pve pvedaemon[11587]: ERROR: Backup of VM 100 failed - job failed with err -5 - Input/output error
Nov 14 19:23:03 pve pvedaemon[11587]: INFO: Backup job finished with errors
Nov 14 19:23:03 pve pvedaemon[11587]: job errors

Code:
root@pve:~# smartctl -a /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.34-1-pve] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 970 EVO 1TB
Serial Number:                      S5H9NS0N847156H
Firmware Version:                   2B2QEXE7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 1,000,204,886,016 [1.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,000,204,886,016 [1.00 TB]
Namespace 1 Utilization:            110,246,121,472 [110 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 580143d3b9
Local Time is:                      Mon Nov 15 10:24:01 2021 CET
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     85 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.20W       -        -    0  0  0  0        0       0
 1 +     4.30W       -        -    1  1  1  1        0       0
 2 +     2.10W       -        -    2  2  2  2        0       0
 3 -   0.0400W       -        -    3  3  3  3      210    1200
 4 -   0.0050W       -        -    4  4  4  4     2000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        53 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    10,308,376 [5.27 TB]
Data Units Written:                 3,063,752 [1.56 TB]
Host Read Commands:                 81,839,893
Host Write Commands:                124,062,265
Controller Busy Time:               1,013
Power Cycles:                       105
Power On Hours:                     3,028
Unsafe Shutdowns:                   97
Media and Data Integrity Errors:    79
Error Information Log Entries:      79
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               53 Celsius
Temperature Sensor 2:               60 Celsius

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0         79     4  0x0223  0xc502  0x000    315193074     1     -
  1         78     3  0x02a7  0xc502  0x000    315193074     1     -
  2         77     8  0x014c  0x4502  0x000    315193074     1     -
  3         76    18  0x00c2  0xc502  0x000    315193074     1     -
  4         75    16  0x02f0  0xc502  0x000    315193074     1     -
  5         74    15  0x0012  0x4502  0x000    315193074     1     -
  6         73    17  0x0218  0xc502  0x000    315193076     1     -
  7         72     1  0x0090  0xc502  0x000    315193074     1     -
  8         71    17  0x0212  0x4502  0x000    315193074     1     -
  9         70    10  0x022f  0xc502  0x000    350164564     1     -
 10         69    12  0x007f  0xc502  0x000    350164564     1     -
 11         68    10  0x0222  0x4502  0x000    350164564     1     -
 12         67    14  0x0016  0x4502  0x000    267315500     1     -
 13         66    12  0x0064  0xc502  0x000    315193074     1     -
 14         65    11  0x0309  0xc502  0x000    315193074     1     -
 15         64    17  0x0207  0x4502  0x000    315193074     1     -
... (3 entries not shown)

root@pve:~#
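
The error table above repeats a small number of LBAs. As a rough sketch (not part of the smartctl output; the names `ERROR_LOG` and `tally_lbas` are made up for illustration, and only the 16 rows shown are covered, not the 3 hidden ones), the rows can be tallied in Python like this:

```python
from collections import Counter

# The 16 rows shown in the "Error Information" table above, pasted as text.
ERROR_LOG = """\
  0         79     4  0x0223  0xc502  0x000    315193074     1     -
  1         78     3  0x02a7  0xc502  0x000    315193074     1     -
  2         77     8  0x014c  0x4502  0x000    315193074     1     -
  3         76    18  0x00c2  0xc502  0x000    315193074     1     -
  4         75    16  0x02f0  0xc502  0x000    315193074     1     -
  5         74    15  0x0012  0x4502  0x000    315193074     1     -
  6         73    17  0x0218  0xc502  0x000    315193076     1     -
  7         72     1  0x0090  0xc502  0x000    315193074     1     -
  8         71    17  0x0212  0x4502  0x000    315193074     1     -
  9         70    10  0x022f  0xc502  0x000    350164564     1     -
 10         69    12  0x007f  0xc502  0x000    350164564     1     -
 11         68    10  0x0222  0x4502  0x000    350164564     1     -
 12         67    14  0x0016  0x4502  0x000    267315500     1     -
 13         66    12  0x0064  0xc502  0x000    315193074     1     -
 14         65    11  0x0309  0xc502  0x000    315193074     1     -
 15         64    17  0x0207  0x4502  0x000    315193074     1     -
"""


def tally_lbas(table: str) -> Counter:
    """Count how often each LBA appears in the pasted error table."""
    counts = Counter()
    for line in table.splitlines():
        fields = line.split()
        # A data row has 9 columns and starts with the numeric entry index.
        if len(fields) == 9 and fields[0].isdigit():
            counts[int(fields[6])] += 1  # column 7 is the LBA
    return counts


lba_counts = tally_lbas(ERROR_LOG)
for lba, n in lba_counts.most_common():
    print(f"LBA {lba}: {n} entries")
```

The namespace is formatted with 512-byte LBAs, so these LBAs should be in the same unit as the kernel's sector numbers in the syslog. If the failed single-segment read at sector 315193072 was one 4 KiB page (an assumption), it would span sectors 315193072 to 315193079, which includes LBA 315193074.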
 
Thank you for the Syslog and the smartctl!

Code:
Nov 14 19:22:59 pve kernel: [103971.821809] blk_update_request: critical medium error, dev nvme0n1, sector 315193016 op 0x0:(READ) flags 0x80700 phys_seg 9 prio class 0
Nov 14 19:22:59 pve kernel: [103971.900123] blk_update_request: critical medium error, dev nvme0n1, sector 315193072 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Nov 14 19:22:59 pve kernel: [103971.901681] Buffer I/O error on dev dm-6, logical block 7593038, async page read
Nov 14 19:22:59 pve kernel: [103971.977964] blk_update_request: critical medium error, dev nvme0n1, sector 315193072 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Nov 14 19:22:59 pve kernel: [103971.979497] Buffer I/O error on dev dm-6, logical block 7593038, async page read

It looks like the NVMe storage has a problem. Does the backup job work on the "local" storage?
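
For anyone who wants to pull the affected device and sectors out of a longer log, here is a minimal Python sketch. It assumes the kernel lines are available as plain strings in the format shown above; `KERNEL_LINES`, `MEDIUM_ERROR`, and `failing_sectors` are illustrative names, not part of any Proxmox tool.

```python
import re

# Two of the kernel messages quoted above, shortened to the relevant part.
KERNEL_LINES = [
    "kernel: [103971.821809] blk_update_request: critical medium error, "
    "dev nvme0n1, sector 315193016 op 0x0:(READ) flags 0x80700 phys_seg 9 prio class 0",
    "kernel: [103971.900123] blk_update_request: critical medium error, "
    "dev nvme0n1, sector 315193072 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0",
    "kernel: [103971.901681] Buffer I/O error on dev dm-6, "
    "logical block 7593038, async page read",
]

MEDIUM_ERROR = re.compile(r"critical medium error, dev ([^,\s]+), sector (\d+)")
SECTOR_SIZE = 512  # the kernel reports block-layer sectors in 512-byte units


def failing_sectors(lines):
    """Return sorted, de-duplicated (device, sector) pairs with medium errors."""
    found = set()
    for line in lines:
        match = MEDIUM_ERROR.search(line)
        if match:
            found.add((match.group(1), int(match.group(2))))
    return sorted(found)


sectors = failing_sectors(KERNEL_LINES)
for dev, sector in sectors:
    print(f"{dev}: sector {sector}, byte offset {sector * SECTOR_SIZE}")
```

Lines that are not medium errors (such as the dm-6 buffer error) are simply skipped, so the same function can be fed a whole syslog excerpt.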
 
local-lvm is on the NVMe drive, which is where vm-100-disk-0 is stored.

bck is on an 8 TB HDD (a separate storage).

Do you have any idea how to fix the "Buffer I/O error on dev dm-6" error?
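
To get a feel for where that dm-6 error sits, the logical block number from the kernel message can be converted to a byte offset. This is only a sketch: the 4 KiB block size is an assumption (the log does not state it), and `block_to_offset` is a made-up helper name.

```python
BLOCK_SIZE = 4096  # assumption: 4 KiB blocks; the kernel message does not state the size
GIB = 1024 ** 3


def block_to_offset(logical_block: int, block_size: int = BLOCK_SIZE) -> int:
    """Convert a logical block number into a byte offset on the device."""
    return logical_block * block_size


offset = block_to_offset(7593038)  # block number from "Buffer I/O error on dev dm-6"
print(f"byte offset {offset} ({offset / GIB:.2f} GiB into the device)")
```

Under that assumption the offset falls inside a 180 GiB volume and lies close to the 31101550592 bytes the backup had read at its last status line, which would fit dm-6 being the device behind vm-100-disk-0. Which logical volume dm-6 really is should be confirmed on the host, for example with `dmsetup ls` or `lsblk`.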
 
Could you please share the VM config and the output of the lvs command? You can get the VM config with the qm config {VMID} command.
 
here:
Code:
root@pve:~# qm config 100
agent: 1
balloon: 4096
bootdisk: sata0
cores: 6
ide2: iso:iso/virtio-win-0.1.185.iso,media=cdrom,size=402812K
memory: 22528
name: MKCDC01
net0: e1000=C6:06:A1:5E:F3:80,bridge=vmbr0
numa: 0
onboot: 1
ostype: win10
parent: TuttoFunzionante
sata0: local-lvm:vm-100-disk-0,cache=writeback,size=180G
sata1: vm:100/vm-100-disk-0.qcow2,cache=writeback,size=600G
sata2: vm:100/vm-100-disk-1.qcow2,cache=writeback,size=600G
sata3: vm:100/vm-100-disk-2.qcow2,cache=writeback,size=500G
scsihw: virtio-scsi-pci
smbios1: uuid=c38afe04-77c1-4947-97ae-4f83a118a76d
sockets: 1
vmgenid: 35f7b7dc-72a6-4094-bbeb-4b4f3ae5ec20
root@pve:~#
Code:
root@pve:~# lvs
  LV                                  VG  Attr       LSize    Pool Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  data                                pve twi-aotz-- <794.79g                    5.24   0.54                           
  root                                pve -wi-ao----   96.00g                                                           
  snap_vm-100-disk-0_TuttoFunzionante pve Vri---tz-k  180.00g data vm-100-disk-0                                       
  swap                                pve -wi-ao----    8.00g                                                           
  vm-100-disk-0                       pve Vwi-aotz--  180.00g data               22.43                                 
root@pve:~#

N.B. The guest OS is actually Windows Server 2019.
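
To see at a glance which of the VM's disks sit on which storage, the `qm config` output can be grouped by storage name. A small Python sketch, assuming the config is pasted in as text; `QM_CONFIG`, `DISK_KEY`, and `disks_by_storage` are illustrative names only:

```python
import re

# Disk-related lines copied from the qm config output above.
QM_CONFIG = """\
bootdisk: sata0
ide2: iso:iso/virtio-win-0.1.185.iso,media=cdrom,size=402812K
sata0: local-lvm:vm-100-disk-0,cache=writeback,size=180G
sata1: vm:100/vm-100-disk-0.qcow2,cache=writeback,size=600G
sata2: vm:100/vm-100-disk-1.qcow2,cache=writeback,size=600G
sata3: vm:100/vm-100-disk-2.qcow2,cache=writeback,size=500G
scsihw: virtio-scsi-pci
"""

# Keys such as sata0 or scsi1; "scsihw" and "bootdisk" do not match.
DISK_KEY = re.compile(r"^(ide|sata|scsi|virtio)\d+$")


def disks_by_storage(config: str) -> dict:
    """Group disk keys by the storage named before the first colon of the volume."""
    result = {}
    for line in config.splitlines():
        key, _, value = line.partition(": ")
        if not DISK_KEY.match(key) or "media=cdrom" in value:
            continue  # skip non-disk options and CD-ROM drives
        volume = value.split(",")[0]      # e.g. "local-lvm:vm-100-disk-0"
        storage = volume.split(":")[0]    # e.g. "local-lvm"
        result.setdefault(storage, []).append(key)
    return result


grouped = disks_by_storage(QM_CONFIG)
for storage, disks in grouped.items():
    print(f"{storage}: {', '.join(disks)}")
```

Only sata0 is on local-lvm, the storage that lives on the NVMe drive reporting the medium errors; the three qcow2 disks are on the separate "vm" storage.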

Schermata da 2021-11-15 11-20-27.png
 
