Scheduled backups keep failing, can't figure out why

michael1

Massive Proxmox newb here. I have a Proxmox setup and my scheduled backups keep failing, and I'm trying to figure out whether my storage is in danger.

This is from the task log, where it shows that the backup had errors:
"var/tmp/vzdumptmp395447_110' failed: exit code 11
INFO: Failed at 2022-12-01"
I'm also noticing that my torrents keep failing when I download to the ZFS pool. I get an I/O error from my torrent programs.
It appears my drives are fine (you can see that in the screenshots).

But I'm a bit skeptical of the USB drive that Proxmox boots from.

Any advice on what to test first?
 

Attachments

  • proxmox1.JPG (61.5 KB)
  • proxmox2.JPG (109.9 KB)
You should double-click the failed backup task at the bottom of the web UI to open a popup with more information on why it failed.

When seeing I/O errors I would:
1.) check that all filesystems, LVM thin pools and ZFS pools have enough free space (df -h, lvs and zfs list -o space)
2.) scrub the ZFS pool (zpool scrub YourPool)
3.) run long SMART tests (smartctl -t long /dev/YourDisk)
4.) check the syslog (tail -n 1000 /var/log/syslog)
5.) check the cabling
6.) run fsck on your Linux filesystems
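
The read-only parts of that checklist can be bundled into one small script. This is only a sketch: the function names run_if_present and triage are made up for illustration, and the scrub, the SMART long tests and fsck are deliberately left out because they take a long time or need unmounted filesystems.

```shell
#!/bin/sh
# Read-only triage for I/O errors. Nothing in here modifies data.

# Run a command only if its binary exists, so the script also works on
# hosts that have no LVM or ZFS tools installed.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@"
    else
        echo "== skipped (not installed): $1"
    fi
}

# Hypothetical wrapper that walks through the space and log checks.
triage() {
    run_if_present df -h                        # usage of mounted filesystems
    run_if_present lvs                          # LVM volumes and thin pool usage
    run_if_present zfs list -o space            # ZFS space breakdown per dataset
    run_if_present zpool status -x              # only pools that report problems
    run_if_present tail -n 100 /var/log/syslog  # most recent syslog lines
}
```

Calling triage as root on the Proxmox host prints each section in turn; anything close to 100% in the df -h or lvs output is worth a closer look before moving on to the slower tests.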
 
OK, I'm scrubbing now, on three disks of 10 TB each. This won't harm my data by any chance, will it? It has taken 2.5 hours so far; how long should I expect it to take?
 
You can run zpool status YourPoolName to see the status of your pool. There you will also find an estimate of how long the scrub will run. The scrub reads all your data again, recalculates the checksums and compares them with the stored checksums to find out whether any of your data got corrupted. If it finds corrupted data, it will repair it, as long as there is parity data or a mirror.
And scrubbing won't harm your data. It's the opposite. You should scrub your pools about once a month, because that is how bit rot gets found and repaired before it silently corrupts your data over time.
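
If you only want the progress and the time estimate, you can filter the zpool status output. A minimal sketch, assuming the status output contains a line of the form "... repaired, NN.NN% done, HH:MM:SS to go" while a scrub is running; the function name scrub_progress is made up for illustration.

```shell
# Print the "NN% done, ... to go" part of `zpool status` output read from stdin.
# Prints nothing if no scrub is in progress.
scrub_progress() {
    sed -n 's/.*, \([0-9.]*% done, .* to go\).*/\1/p'
}

# Usage on the host (YourPoolName is a placeholder):
#   zpool status YourPoolName | scrub_progress
```

The estimate tends to jump around in the first minutes of a scrub, so it is more useful once a few percent are done.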
 
Nice...

  scan: scrub in progress since Sat Dec 3 12:46:24 2022
        5.79T scanned at 664M/s, 4.45T issued at 511M/s, 7.48T total
        0B repaired, 59.51% done, 01:43:35 to go
config:

        NAME                                    STATE   READ WRITE CKSUM
        Zpppp                                   ONLINE     0     0     0
          raidz1-0                              ONLINE     0     0     0
            ata-WDC_WD101EFBX-68B0AN0_VCK1UXRP  ONLINE     0     0     0
            ata-WDC_WD101EFBX-68B0AN0_VCK3711P  ONLINE     0     0     0
            ata-WDC_WD101EFBX-68B0AN0_VCK23LRP  ONLINE     0     0     0
 
I ran the scrub and it didn't seem to find any errors.
Here is the backup error log; not sure if it gives any clues that I'm missing:


Task viewer: VM/CT 110 - Backup

INFO: starting new backup job: vzdump --mode snapshot --prune-backups 'keep-last=5' --quiet 1 --mailnotification always --storage ZPool-Storage --compress zstd --notes-template '{{guestname}}' --all 1
INFO: filesystem type on dumpdir is 'zfs' -using /var/tmp/vzdumptmp1338586_110 for temporary files
INFO: Starting Backup of VM 110 (lxc)
INFO: Backup started at 2022-12-06 03:00:00
INFO: status = running
INFO: CT Name: PlexMediaServer
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/mnt/Media') from backup (not a volume)
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: PlexMediaServer
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/mnt/Media') from backup (not a volume)
INFO: starting first sync /proc/1971/root/ to /var/tmp/vzdumptmp1338586_110
ERROR: Backup of VM 110 failed - command 'rsync --stats -h -X -A --numeric-ids -aH --delete --no-whole-file --sparse --one-file-system --relative '--exclude=/tmp/?*' '--exclude=/var/tmp/?*' '--exclude=/var/run/?*.pid' '--exclude=/mnt/Media' /proc/1971/root//./ /var/tmp/vzdumptmp1338586_110' failed: exit code 11
INFO: Failed at 2022-12-06 03:00:26
INFO: Backup job finished with errors
TASK ERROR: job errors
 
Does this give any hints? Does it mean one of my disks is full, maybe the one Proxmox itself is installed on?
 

Attachments

  • ProxMox_disk full.JPG (279.8 KB)
Yup, your root filesystem is nearly full. This can cause trouble, so you should free up some space, or your server will stop working as soon as it fills up completely.
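
For context: the rsync man page lists exit code 11 as "Error in file I/O", and the log shows vzdump staging its temporary copy under /var/tmp, which normally sits on the root filesystem. A quick way to check how full the filesystem behind a given path is could look like this. It is a sketch only: the function names and the 90% default threshold are assumptions for illustration.

```shell
# Print the usage percentage (without the % sign) of the filesystem holding PATH.
fs_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report whether the filesystem behind PATH is above LIMIT percent (default 90).
warn_if_full() {
    path=$1
    limit=${2:-90}
    used=$(fs_use_percent "$path")
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $path is on a filesystem that is ${used}% full"
    else
        echo "OK: $path is on a filesystem that is ${used}% full"
    fi
}

# Usage on the host:
#   warn_if_full /var/tmp
#   warn_if_full / 80
```

If the root filesystem stays tight even after cleaning up, vzdump also has a tmpdir setting that can point the temporary directory somewhere else; check the vzdump documentation before relying on that.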
 
Honestly, I don't even know how to find that drive/folder. Is there a Proxmox cleanup command or something I could run for now?
 
No. You would have to do basic Linux administration on the CLI: browsing through folders, looking at files and their sizes, deciding what can be deleted and what cannot, deleting unneeded stuff and so on. Good folders to start with would be /var/lib/pve, /var/log and /var/tmp. Commands like apt autoremove and fstrim -a might come in handy too.
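
To see where the space actually went, du can sum up each subdirectory. A minimal sketch, with biggest_dirs being a made-up helper name; it only reads sizes and deletes nothing.

```shell
# List the subdirectories directly below DIR by size in KiB, largest last.
# -x keeps du on one filesystem, so other mounts (for example a ZFS pool)
# are not counted against the root filesystem.
# The last line of the output is the total for DIR itself.
biggest_dirs() {
    du -xk -d 1 "$1" 2>/dev/null | sort -n | tail -n "${2:-10}"
}

# Usage on the host, starting broad and then drilling down:
#   biggest_dirs /
#   biggest_dirs /var
#   biggest_dirs /var/lib/pve
```

Repeating the call on whichever directory comes out largest usually leads to the culprit within a few steps.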
 
Cool, apt autoremove seems to have gotten me down from 97% to 94% usage. I have about 750 MB of stuff in /var/log, but I'm not sure what it is or whether I can delete it.
 
You should only delete stuff when you know what it is used for. Failed uploads (files starting with "pveupload-" in "/var/tmp") as well as rotated logfiles (ending with ".1", ".2", ".3", ... or ".gz" in "/var/log") should be safe to delete.
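
Before deleting anything, it helps to list the candidates first. The following sketch only prints matching files and removes nothing; the two function names are made up for illustration, and the patterns simply follow the naming described above.

```shell
# List rotated logfiles under DIR: names ending in .1, .2, ... or .gz.
list_rotated_logs() {
    find "$1" -type f \( -name '*.[0-9]' -o -name '*.[0-9][0-9]' -o -name '*.gz' \) | sort
}

# List leftover upload files directly inside DIR (default: /var/tmp).
list_failed_uploads() {
    find "${1:-/var/tmp}" -maxdepth 1 -type f -name 'pveupload-*' | sort
}

# Usage on the host:
#   list_rotated_logs /var/log
#   list_failed_uploads
```

Once you have looked through the list and are happy with it, the files can be removed with rm, and df -h / will show how much space that freed.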
 
