Proxmox VE7 Backup Failing with large Mounted Volume in VM

bmcgonag

Good day.

I have a couple of VMs up and running. One of them backs up fine, but the other has failed every time. I have an 8 TB drive connected via USB to the Proxmox server itself and added as storage under Datacenter.

I have set up a backup job for VM 101, which is an Ubuntu 20.04 server with a 3 TB mounted volume for media, etc.

It appears that when the backup runs, it backs up both the VM and the mounted storage, so around 4 TB in total. Unfortunately the backup fails before completing, every time, so it's not a great backup strategy so far. I really like the option to restore a backup if I have to reinstall Proxmox (as I did recently when my Proxmox OS drive failed). I'm just not sure why this isn't completing.

Here is the most recent log from the backup attempt last night:
Code:
WARN: both 'storage' and 'dumpdir' defined in '/etc/vzdump.conf' - ignoring 'dumpdir'
INFO: starting new backup job: vzdump 101 --prune-backups 'keep-last=3' --compress zstd --mailto brian@someemail.com --quiet 1 --mailnotification always --mode snapshot --node cevace --storage PMoxBU
INFO: Starting Backup of VM 101 (qemu)
INFO: Backup started at 2022-01-03 02:00:01
INFO: status = running
INFO: VM Name: Ubuntu-2
INFO: include disk 'scsi0' 'VM_1:vm-101-disk-0' 200G
INFO: include disk 'scsi1' 'VM_8_TB:vm-101-disk-0' 3000G
INFO: include disk 'scsi2' 'FileSync:vm-101-disk-0' 1000G
INFO: backup mode: snapshot
INFO: bandwidth limit: 10000000 KB/s
INFO: ionice priority: 7
INFO: creating vzdump archive '/mnt/PMoxBU/dump/vzdump-qemu-101-2022_01_03-02_00_01.vma.zst'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '11e92cb9-cb7a-4745-8413-c30b8bc8862d'
INFO: resuming VM again
INFO:   0% (170.4 MiB of 4.1 TiB) in 3s, read: 56.8 MiB/s, write: 15.6 MiB/s
INFO:   1% (42.0 GiB of 4.1 TiB) in 23m 6s, read: 31.0 MiB/s, write: 29.0 MiB/s
INFO:   2% (84.0 GiB of 4.1 TiB) in 47m 42s, read: 29.2 MiB/s, write: 28.0 MiB/s
INFO:   3% (126.0 GiB of 4.1 TiB) in 1h 15m 57s, read: 25.4 MiB/s, write: 23.7 MiB/s
INFO:   4% (168.0 GiB of 4.1 TiB) in 1h 45m 48s, read: 24.0 MiB/s, write: 22.1 MiB/s
INFO:   5% (210.0 GiB of 4.1 TiB) in 2h 13m 41s, read: 25.7 MiB/s, write: 24.6 MiB/s
INFO:   6% (252.0 GiB of 4.1 TiB) in 2h 42m 57s, read: 24.5 MiB/s, write: 23.7 MiB/s
INFO:   7% (294.0 GiB of 4.1 TiB) in 3h 12m 15s, read: 24.5 MiB/s, write: 23.7 MiB/s
INFO:   8% (336.0 GiB of 4.1 TiB) in 3h 40m 42s, read: 25.2 MiB/s, write: 24.3 MiB/s
ERROR: VM 101 qmp command 'query-backup' failed - got timeout
INFO: aborting backup job
ERROR: VM 101 qmp command 'backup-cancel' failed - unable to connect to VM 101 qmp socket - timeout after 5976 retries
INFO: resuming VM again
ERROR: Backup of VM 101 failed - VM 101 qmp command 'cont' failed - unable to connect to VM 101 qmp socket - timeout after 450 retries
INFO: Failed at 2022-01-03 06:19:16
INFO: Backup job finished with errors
TASK ERROR: job errors

Any help on how to overcome this is very much appreciated.
 
Just bumping this. Any thoughts on why it would be failing every time?

Additionally, could I manually back up the VM(s) with something like rsync and 7-Zip?
 
Your disk is very slow. Maybe that's not a USB 3 port? Could that 8 TB backup disk be using SMR instead of CMR? SMR HDDs are slow, and when writing large amounts of data they can get so slow that the system thinks the disk is dead because it can't answer in time.
You could also test another USB cable and port.
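
To put a number on how slow that is, here is a rough extrapolation from the last progress line in your log (336.0 GiB of 4.1 TiB in 3h 40m 42s). It is only a sketch: it assumes the average rate so far holds for the whole job, which may well be optimistic, and the function name is just something I made up for illustration.

```python
def estimate_total_hours(done_gib, elapsed_s, total_tib):
    """Extrapolate total backup duration from progress so far.

    Assumes the average rate observed so far stays constant.
    Returns (estimated total hours, average rate in MiB/s).
    """
    rate_mib_s = done_gib * 1024 / elapsed_s
    total_mib = total_tib * 1024 * 1024
    return total_mib / rate_mib_s / 3600, rate_mib_s


# Values taken from the last progress line before the timeout.
elapsed = 3 * 3600 + 40 * 60 + 42  # 3h 40m 42s in seconds
hours, rate = estimate_total_hours(336.0, elapsed, 4.1)
print(f"average rate: {rate:.1f} MiB/s, estimated total: {hours:.0f} h")
```

At that pace the full 4.1 TiB would need close to two days of uninterrupted writing, so the job has plenty of time to hit a stall.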
 
I had noticed it seemed very slow, but I'm not sure why. It may be SMR; it wasn't an overly expensive drive by any means. I wondered about the USB port too, so I'll double-check it and see if it makes any difference.
 
Can you see the disk model? You could for example look at the output of smartctl -a /dev/yourUSBHDD.
 
Code:
=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST8000DM004-2U9188
Serial Number:    ZR11XF45
LU WWN Device Id: 5 000c50 0dc66b8a8
Add. Product Id:  DELL(tm)
Firmware Version: 0001
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Jan  6 09:42:56 2022 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
 
I moved it to a port that is definitely USB 3. It now shows a read speed of 120 MiB/s and a write speed of 35 MiB/s, which still seems quite slow to me.
 
The ST8000DM004 is an SMR drive, so it's slow by design and not really usable for writing more than a few GB at once. That makes it a really bad disk for storing big backups. Writes are fine as long as the DRAM and CMR caches have room. As soon as both caches are full, performance drops to a few KB/s or MB/s and the disk becomes unresponsive.
I've got the 4 TB version of your disk (ST4000DM004), and if I unzip a 50 GB zip file, the whole Windows system freezes and stays unresponsive for one or two hours until the HDD responds again. I monitored the average response time of that HDD and saw values above a minute when the caches were full.
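
If you want to check other disks the same way, here is a minimal sketch that reads the "Key: value" lines from the information section of `smartctl -a` output and compares the device model against a list. The list only holds the two models mentioned in this thread and is nowhere near complete, and the function names are my own, not part of smartmontools.

```python
# Models named in this thread as SMR; NOT an exhaustive list.
KNOWN_SMR_MODELS = {"ST8000DM004", "ST4000DM004"}


def parse_smartctl_info(text):
    """Turn 'Key: value' lines from smartctl's information section into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def is_known_smr(info):
    """Return True if the device model is in the small list above."""
    model = info.get("Device Model", "")
    # Drop the revision suffix, e.g. "ST8000DM004-2U9188" -> "ST8000DM004".
    return model.split("-")[0] in KNOWN_SMR_MODELS


# A few lines copied from the smartctl output posted above.
sample = """\
Model Family:     Seagate BarraCuda 3.5
Device Model:     ST8000DM004-2U9188
Rotation Rate:    5400 rpm
"""
info = parse_smartctl_info(sample)
print(info["Device Model"], "SMR" if is_known_smr(info) else "not in list")
```

A model that is not in the list is not confirmed CMR; it just means you still have to look it up in the manufacturer's documentation.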
 
Wow, ok. Good to know, and thanks for the great information. I'll look for a better drive in the future for sure.
 
