Error during backup (err -5, Input/output error), please help

ibbex

New Member
Oct 28, 2024
INFO: starting new backup job: vzdump 100 --notes-template '{{guestname}}' --node promox --mode snapshot --storage test --compress zstd --remove 0 --notification-mode auto
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2024-10-29 00:08:11
INFO: status = running
INFO: VM Name: pos-ubuntu
INFO: include disk 'scsi0' 'local-lvm:vm-100-disk-0' 32G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating vzdump archive '/mnt/pve/test/dump/vzdump-qemu-100-2024_10_29-00_08_11.vma.zst'
INFO: started backup task 'ef4e8a6e-f9fc-4470-a387-f8db92c9a2dc'
INFO: resuming VM again
INFO: 5% (1.9 GiB of 32.0 GiB) in 3s, read: 640.0 MiB/s, write: 269.3 MiB/s
INFO: 8% (2.6 GiB of 32.0 GiB) in 6s, read: 234.8 MiB/s, write: 211.2 MiB/s
INFO: 9% (3.1 GiB of 32.0 GiB) in 8s, read: 266.3 MiB/s, write: 258.8 MiB/s
ERROR: job failed with err -5 - Input/output error
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 100 failed - job failed with err -5 - Input/output error
INFO: Failed at 2024-10-29 00:08:27
INFO: Backup job finished with errors
INFO: notified via target `mail-to-root`
TASK ERROR: job errors
 
Hi,
ERROR: job failed with err -5 - Input/output error
Please check your system logs/journal for more information, and check the physical disks with e.g. smartctl: both the disk where the VM disks reside and the one used for the backup storage.
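A rough sketch of the log check, not an official procedure: the snippet below filters the kernel messages of the current boot for lines that usually accompany disk read/write failures. The function name filter_io_errors and the pattern list are my own assumptions and certainly not exhaustive, so extend them as needed.

```shell
#!/bin/sh
# Keep only log lines that typically show up when a disk returns errors.
# The pattern list is an assumption, not a complete set of kernel messages.
filter_io_errors() {
    grep -iE 'i/o error|medium error|failed command|exception emask'
}

# Kernel messages (-k) from the current boot (-b); usually needs root.
# The guard keeps the script harmless on machines without journalctl.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k -b --no-pager | filter_io_errors || true
fi
```

For the disks themselves, smartctl --all /dev/sdX (with sdX replaced by the actual device) prints the health data and the drive's error log.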
 
root@promox:~# smartctl --all /dev/sda
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.4-2-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: NT-256 2280
Serial Number: 0023367A01131
LU WWN Device Id: 5 000000 000000003
Firmware Version: H190117D
User Capacity: 256,060,514,304 bytes [256 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
TRIM Command: Available, deterministic, zeroed
Device is: Not in smartctl database 7.3/5319
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Oct 29 15:37:46 2024 +05
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (  33) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (   2) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x0031) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       1813
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       16
167 Unknown_Attribute       0x0022   100   100   000    Old_age   Always       -       0
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       21
169 Unknown_Attribute       0x0013   096   096   010    Pre-fail  Always       -       3670030
171 Unknown_Attribute       0x0032   000   000   000    Old_age   Always       -       2
172 Unknown_Attribute       0x0032   000   000   000    Old_age   Always       -       0
173 Unknown_Attribute       0x0012   200   200   000    Old_age   Always       -       8595636277
175 Program_Fail_Count_Chip 0x0022   070   100   010    Old_age   Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   100   100   000    Pre-fail  Always       -       1644
187 Reported_Uncorrect      0x0032   100   000   000    Old_age   Always       -       21
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       16
194 Temperature_Celsius     0x0022   030   030   000    Old_age   Always       -       30 (Min/Max 10/50)
206 Unknown_SSD_Attribute   0x0032   200   200   000    Old_age   Always       -       2
207 Unknown_SSD_Attribute   0x0032   200   200   000    Old_age   Always       -       87
208 Unknown_SSD_Attribute   0x0032   200   200   000    Old_age   Always       -       53
209 Unknown_SSD_Attribute   0x0032   200   200   000    Old_age   Always       -       5
210 Unknown_Attribute       0x0032   200   200   000    Old_age   Always       -       46
211 Unknown_Attribute       0x0032   200   200   000    Old_age   Always       -       25
231 Unknown_SSD_Attribute   0x0023   097   097   005    Pre-fail  Always       -       3
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       28364591744
234 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       228351836
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       19696617340
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       842359733
245 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 1798 hours (74 days + 22 hours)
When the command that caused the error occurred, the device was in an unknown state.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 80 00 92 0b ea Error: UNC 128 sectors at LBA = 0x0a0b9200 = 168530432

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 80 00 92 0b ea 80 17d+05:34:40.504 READ DMA
c8 00 08 00 3a b5 ed 80 17d+05:34:40.504 READ DMA
c8 00 20 00 82 72 ed 80 17d+05:34:40.504 READ DMA
c8 00 08 90 dc 41 ed 80 17d+05:34:40.504 READ DMA
c8 00 00 48 50 36 ed 80 17d+05:34:40.504 READ DMA

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
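
To get a feeling for where the unreadable area from the error log sits on the disk, the LBA can be turned into a byte offset. This is only a small arithmetic sketch: the helper name lba_to_bytes is made up for illustration, and it assumes the 512-byte sector size reported in the information section above.

```shell
#!/bin/sh
# Convert an LBA to a byte offset. The sector size defaults to 512 bytes,
# matching the "Sector Size" line of the smartctl output.
lba_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

lba=$((0x0a0b9200))              # LBA taken from "Error 1" in the error log
offset=$(lba_to_bytes "$lba")

# Prints: LBA 168530432 starts at byte 86287581184, about 80 GiB into the disk
echo "LBA $lba starts at byte $offset, about $((offset / 1073741824)) GiB into the disk"

# The failed READ DMA covered 128 sectors.
# Prints: Failed read length: 64 KiB
echo "Failed read length: $((128 * 512 / 1024)) KiB"
```

Which LVM volume or file lives at that offset depends on the partition and volume layout, which the output above does not show.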
 
The SMART output does show that an error happened: an uncorrectable read error (UNC) on 128 sectors at LBA 168530432, and Reported_Uncorrect has a raw value of 21. You might want to try copying the VM disk to a storage on a different physical disk, e.g. using the Disk Action > Move Storage operation in the UI. If that also fails, you can try qemu-img convert with the --salvage flag.
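
A minimal sketch of the qemu-img fallback, under some assumptions: SRC guesses that local-lvm:vm-100-disk-0 is exposed as /dev/pve/vm-100-disk-0 (the default volume group name, please verify with lvs), and DEST is a purely hypothetical path on a different physical disk. The function name salvage_copy and the RUN switch are made up for this example. By default it only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Both paths are assumptions; adjust them before running anything.
SRC=${SRC:-/dev/pve/vm-100-disk-0}                # assumed LVM path of the VM disk
DEST=${DEST:-/mnt/other-disk/vm-100-disk-0.raw}   # hypothetical target file

salvage_copy() {
    # --salvage: keep going on read errors; unreadable areas are treated
    #            as zeros in the copy.
    # -p:        show progress. -f/-O: source and output format (raw here).
    set -- qemu-img convert --salvage -p -f raw -O raw "$1" "$2"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"          # actually run the copy
    else
        echo "$@"     # dry run: only print the command
    fi
}

salvage_copy "$SRC" "$DEST"
```

Run it with RUN=1 once the paths are correct, ideally with the VM shut down so the source does not change during the copy. Whatever could not be read will be zeroed in the result, so a filesystem check inside the guest afterwards is a good idea.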
 
