Hi, I have a Proxmox host with an mSATA drive for the system and an SSD for the VMs. I also have a 250 GB HDD for backup tasks. For the last two days I have been seeing high IO delay and a load average of about 4.0. The backups run at 4:30 am; it is now 2 pm and a VM is still locked. Yesterday it was a different machine, which I deleted thinking it could be the cause, but that did not help. Today the lock is on another VM.
Could it be a faulty hard disk?
Below are the smartctl results, but I don't know how to interpret them.
Is there any faulty value for this hard disk? If not, where could the issue be?
I hope a user or staff member can help.
This is a copy/paste from the task viewer for the backup job (last lines)
---------------------
Code:
INFO: 88% (88.0 GiB of 100.0 GiB) in 4m 49s, read: 2.6 GiB/s, write: 0 B/s
INFO: 91% (91.8 GiB of 100.0 GiB) in 4m 52s, read: 1.3 GiB/s, write: 230.7 KiB/s
INFO: 97% (97.1 GiB of 100.0 GiB) in 4m 55s, read: 1.8 GiB/s, write: 0 B/s
INFO: 100% (100.0 GiB of 100.0 GiB) in 4m 57s, read: 1.4 GiB/s, write: 0 B/s
INFO: backup is sparse: 95.06 GiB (95%) total zero data
INFO: transferred 100.00 GiB in 297 seconds (344.8 MiB/s)
INFO: stopping kvm after backup task
INFO: archive file size: 1.86GB
INFO: prune older backups with retention: keep-last=2
INFO: removing backup 'backup:backup/vzdump-qemu-108-2022_02_07-09_21_59.vma.zst'
INFO: pruned 1 backup(s) not covered by keep-retention policy
INFO: Finished Backup of VM 108 (00:04:59)
INFO: Backup finished at 2022-02-10 11:29:31
INFO: Starting Backup of VM 109 (lxc)
INFO: Backup started at 2022-02-10 11:29:31
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: CT Name: 109-Ubuntu-Pruebas
INFO: including mount point rootfs ('/') in backup
INFO: creating vzdump archive '/mnt/Disco250Gb/backup/dump/vzdump-lxc-109-2022_02_10-11_29_31.tar.zst'
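As a side note on reading the VM 108 figures above: the reported 344.8 MiB/s is the rate at which the source disk image was read, and 95% of that image was zero data. A minimal sketch of the arithmetic, using only the two summary lines from the log (the function name `summarize` and the dictionary keys are made up for illustration):

```python
import re

# The two summary lines copied from the VM 108 backup log above.
LOG = """INFO: backup is sparse: 95.06 GiB (95%) total zero data
INFO: transferred 100.00 GiB in 297 seconds (344.8 MiB/s)"""


def summarize(log: str) -> dict:
    """Recompute the read rate and estimate the non-zero data rate."""
    sparse_gib = float(re.search(r"backup is sparse: ([\d.]+) GiB", log).group(1))
    m = re.search(r"transferred ([\d.]+) GiB in (\d+) seconds", log)
    total_gib, seconds = float(m.group(1)), int(m.group(2))
    nonzero_gib = total_gib - sparse_gib
    return {
        # Rate at which the source image was read (matches the log line).
        "read_mib_s": total_gib * 1024 / seconds,
        # Data that was not all zeros, before compression.
        "nonzero_gib": nonzero_gib,
        # Average rate of non-zero data over the whole job.
        "nonzero_mib_s": nonzero_gib * 1024 / seconds,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(f"{key}: {value:.2f}")
```

So this particular job only had about 5 GiB of real data to compress and write, which suggests the headline read rate says little about how fast the backup HDD itself can write.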
Thanks!!
Here is the smartctl report of the HDD (the hard disk where I save the backups)
-------------------
Code:
root@pve1:~# smartctl -a /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.19-4-pve] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi Travelstar 5K320
Device Model: Hitachi HTS543225L9A300
Serial Number: 090807FB8D00LJHBR1KA
LU WWN Device Id: 5 000cca 55ed36a5c
Firmware Version: FBEOC40C
User Capacity: 250,059,350,016 bytes [250 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 3f
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Thu Feb 10 13:56:19 2022 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  645) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 102) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   253   253   033    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0012   065   065   000    Old_age   Always       -       55611
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   076   076   000    Old_age   Always       -       10561
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       2002
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       138
193 Load_Cycle_Count        0x0012   089   089   000    Old_age   Always       -       114215
194 Temperature_Celsius     0x0002   203   203   000    Old_age   Always       -       27 (Min/Max 13/52)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
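For anyone who wants to scan an attribute table like the one above without reading every row, here is a rough Python sketch. It flags any attribute whose normalized VALUE has reached its THRESH, plus non-zero raw counts for a few IDs. The watch list (5, 196, 197, 198, 199) is an assumption based on a common rule of thumb, not a vendor specification, and `parse_attributes`, `flag` and `SAMPLE` are names made up for this example. The sample rows are copied from the report above.

```python
# Assumption: IDs often watched for early signs of media or cabling trouble.
WATCH = {5, 196, 197, 198, 199}

# A few rows copied from the smartctl report above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
"""


def parse_attributes(text):
    """Extract attribute rows from `smartctl -a` style output."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have a hex FLAG column.
        if len(parts) >= 10 and parts[0].isdigit() and parts[2].startswith("0x"):
            rows.append({
                "id": int(parts[0]),
                "name": parts[1],
                "value": int(parts[3]),
                "thresh": int(parts[5]),
                # Raw values are not always plain integers; keep None if not.
                "raw": int(parts[9]) if parts[9].isdigit() else None,
            })
    return rows


def flag(rows):
    """Return human-readable notes for rows that deserve a closer look."""
    notes = []
    for r in rows:
        if r["thresh"] > 0 and r["value"] <= r["thresh"]:
            notes.append(f'{r["name"]}: normalized value at or below threshold')
        elif r["id"] in WATCH and r["raw"]:
            notes.append(f'{r["name"]}: raw value {r["raw"]} is non-zero')
    return notes


if __name__ == "__main__":
    for note in flag(parse_attributes(SAMPLE)):
        print(note)
```

On the rows shown, the only note this produces is for Reallocated_Event_Count (raw value 1). No attribute is at or below its threshold, which is consistent with the PASSED overall result.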