Backups - Random Time Outs

Kafoof

Hi,

We are testing Proxmox on a 3-node cluster. The configuration is:
  • 3x nodes with 64 GB each
  • 2x 2 TB NVMe per node, used as Ceph OSDs
  • Dedicated 10GbE uplink for storage
  • 1x 10 TB SATA HDD (7200 rpm) connected to one host and exported to all nodes as NFS over 10GbE
  • Version: 5.2-10
So far the cluster has been working perfectly, and I am really impressed. I have nightly backups running for the majority of the VMs, but I recently noticed random timeouts on different VMs.
Code:
110: 2019-02-05 02:00:48 INFO: Starting Backup of VM 110 (qemu)
110: 2019-02-05 02:00:48 INFO: status = running
110: 2019-02-05 02:00:48 INFO: update VM 110: -lock backup
110: 2019-02-05 02:00:48 INFO: VM Name: jd-vpn
110: 2019-02-05 02:00:48 INFO: include disk 'ide0' 'ceph-pool-0_vm:vm-110-disk-0' 20G
110: 2019-02-05 02:00:48 INFO: backup mode: snapshot
110: 2019-02-05 02:00:48 INFO: ionice priority: 7
110: 2019-02-05 02:00:54 INFO: creating archive '/mnt/pve/backups-n-isos/dump/vzdump-qemu-110-2019_02_05-02_00_47.vma.lzo'
110: 2019-02-05 02:01:03 ERROR: got timeout
110: 2019-02-05 02:01:03 INFO: aborting backup job
110: 2019-02-05 02:01:09 ERROR: Backup of VM 110 failed - got timeout
Has anyone else had similar issues, or does anyone have advice on how to troubleshoot?
Thank you in advance!
 
I assume you back up to the NFS share. If so, the most likely cause is that the share is not responding, though heavy I/O could also be responsible.
Take a look at /var/log/syslog and check whether anything was logged at that time.
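
If it helps, here is a rough Python sketch for pulling out the syslog lines around the failure (the timeout above was logged at 2019-02-05 02:01:03). It assumes the traditional syslog timestamp prefix ("Feb  5 02:01:03 ...") and an English/C locale for month names. The function names, the 60-second window and the default year are my own choices for illustration, not anything Proxmox provides.
Code:
```python
from datetime import datetime


def lines_near(lines, target, window_s=60, year=2019):
    """Return the lines whose timestamp is within window_s seconds of target.

    Assumes the traditional syslog prefix "Mon DD HH:MM:SS" in the first
    15 characters. That format has no year, so one is supplied (assumption).
    """
    hits = []
    for line in lines:
        try:
            ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            # Not a timestamped line (or a different log format): skip it.
            continue
        if abs((ts - target).total_seconds()) <= window_s:
            hits.append(line.rstrip("\n"))
    return hits


def scan_syslog(target, path="/var/log/syslog", window_s=60):
    """Read the log file at path and return the lines logged near target."""
    with open(path, errors="replace") as fh:
        return lines_near(fh, target, window_s=window_s, year=target.year)
```
Calling scan_syslog(datetime(2019, 2, 5, 2, 1, 3)) on the node that ran the backup, and on the host exporting the NFS share, should return whatever was logged in the minute either side of the timeout. If the log has already been rotated, point the path argument at the rotated file instead.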