Large difference in backup times between different VMs of similar sizes

kalleyne

New Member
May 22, 2013
Good day.

We are running a 16-node Proxmox VE 2.3 cluster. Over the last few days, automated snapshot backups of some VMs have been taking as long as 10 to 12 hours to complete, while other VMs finish in less than 45 minutes. Most or all of the troublesome VMs seem to reside on a single NFS resource. However, as far as we know, this resource is the least busy in terms of overall usage.

Please see the output of "pveversion -v" below:

Code:
root@proxmox1:~# pveversion -v
pve-manager: 2.3-13 (pve-manager/2.3/7946f1f1)
running kernel: 2.6.32-19-pve
proxmox-ve-2.6.32: 2.3-96
pve-kernel-2.6.32-19-pve: 2.6.32-96
pve-kernel-2.6.32-18-pve: 2.6.32-88
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-4
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-36
qemu-server: 2.3-20
pve-firmware: 1.0-21
libpve-common-perl: 1.0-49
libpve-access-control: 1.0-26
libpve-storage-perl: 2.3-7
vncterm: 1.0-4
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.4-10
ksm-control-daemon: 1.1-1

What can we do to isolate and repair this problem?
Thanks.
 
All VMs are accessed through one of three NFS servers: nfs1 (10.199.0.21), nfs2 (10.199.0.22) and nfs3 (10.199.0.23).
All NFS servers access AoE SAN storage with each mount point representing a single shelf of disks.

So nfs1 is the front-end (NAS) to /coraid0, /coraid1 and /coraid2.
nfs2 front-ends /jwe51.10, /jwe52, /jwe53 and /jwe54.
nfs3 front-ends /coraid55.

Automated snapshot backups are performed on all nodes for all VMs (65 or more) and start at 19:00 daily. All backups are written to the NFS storage with ID jwe52backup5.2TB, which is /jwe52.



The email reports are very mixed. Some show VMs being backed up within 30 to 45 minutes, while other VMs of around the same size can take over 11 hours. The final email report, covering all remaining jobs, seems to arrive by 16:30 the following day.
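One way to make these mixed reports comparable is to convert each job to a transfer rate and group the jobs by the storage the VM lives on. The following is only a sketch: it assumes the figures are typed by hand from the email reports into a plain text file, and the four-column layout and the function name `rate_by_storage` are made up for illustration (this is not a vzdump log format).

```shell
# Input, one backup job per line (hypothetical hand-made format):
#   <vmid> <source-storage> <size-in-GB> <duration-in-minutes>
# Output, one line per storage: number of VMs and the aggregate rate
# (total GB divided by total hours spent on that storage's VMs).
rate_by_storage() {
    awk '
        NF == 4 && $4 > 0 {
            gb[$2]  += $3      # total GB backed up from this storage
            min[$2] += $4      # total minutes spent on it
            n[$2]++            # number of VMs seen
        }
        END {
            for (s in n)
                printf "%s vms=%d rate=%.1f GB/h\n", s, n[s], gb[s] / (min[s] / 60)
        }
    ' "$@" | sort
}

# Usage (file name is a placeholder):
#   rate_by_storage backup-times.txt
```

If one storage shows a much lower rate than the others for VMs of similar size, that points at the read side (that NFS export or its shelf) rather than at the shared backup target.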





What kind of backup node is used for the backup jobs that are fast?
 
I was referring to the hardware: the file system on the servers (which file system does NFS run on?), the disks, hardware RAID or not, etc. Also, what network is between the PVE and the backup nodes, how much traffic is on the network when the backup is performed, what the load is on other VMs on both source and target while the backup runs, and so on...
 
The NFS machines are old IBM x306 1U servers with an Intel Pentium 4 CPU at 3.40GHz, 4GB RAM (3.3 GB usable) and two GigE NICs. They run Ubuntu 12.04.2 i386 Server.
One GigE NIC runs to the dedicated GigE switch just for Proxmox nodes. The other GigE NIC runs to the AoE SAN storage GigE switch. The SAN shelves were all formatted with ext3. The NFS service exports these to the Proxmox VE nodes.
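Since each export corresponds to one shelf, a crude sequential test per mount point, run from a single Proxmox node outside the backup window, may show whether one shelf or one NFS server is markedly slower than the rest. This is a rough sketch using GNU dd, not a proper benchmark. The function names are illustrative, and `/mnt/pve/<storage-id>` is only where PVE normally mounts NFS storage, so please check the real paths on your nodes.

```shell
# Write SIZE_MB of zeros into DIR and print dd's summary line.
# conv=fdatasync makes dd flush the data before it reports a rate,
# so the figure is not just the speed of the local page cache.
nfs_write_probe() {
    dir=$1
    size_mb=${2:-1024}
    testfile="$dir/.nfs_probe.$$"
    dd if=/dev/zero of="$testfile" bs=1M count="$size_mb" conv=fdatasync 2>&1 | tail -n 1
    rm -f "$testfile"
}

# Read an existing file sequentially and print dd's summary line.
# Pick a large file that has not been read recently on this node,
# otherwise the result mostly reflects the client-side cache.
nfs_read_probe() {
    dd if="$1" of=/dev/null bs=1M 2>&1 | tail -n 1
}

# Usage (paths are placeholders):
#   nfs_write_probe /mnt/pve/<storage-id> 1024
#   nfs_read_probe  /mnt/pve/<storage-id>/<some-large-file>
```

Running the same probe against each export in turn gives numbers that can be compared directly, which is more useful here than any single absolute figure.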

All Proxmox VE nodes use standard 2.3 with patches applied by apt-get. We have a mix of hardware serving as Proxmox VE nodes. Most are Supermicro but there are also some IBMs. All have 8 x Intel Xeon (2 sockets), with RAM ranging from 8GB on some systems to 16GB on most. We have one system with 24GB RAM but I guess that we should check on this. All systems use 1 GigE NIC to the dedicated GigE switch.

Traffic from the outside should be fairly minimal, as clients access these VMs via a 100Mbps link which generally only sees Kbps flows. Still, it's difficult to gauge exact network usage, as the total backup job ends around 20 or 21 hours after it is initiated. The loads on the VMs can also vary during this time. What should we use to pinpoint those numbers?

Thanks.
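For the network side, one low-tech option is to sample the interface byte counters in /proc/net/dev on the NFS servers and on a Proxmox node while a slow backup is running. The sketch below is an assumption about how one might do that, not an established tool: the function names and the `PROC_NET_DEV` override (only there so the parser can be pointed at a saved copy) are made up for illustration.

```shell
# Print "<rx_bytes> <tx_bytes>" for interface $1, read from /proc/net/dev.
# Splitting on ":" copes with both "eth0: 123" and "eth0:123" layouts.
read_bytes() {
    awk -v ifc="$1" '
        { sub(/^ +/, ""); split($0, a, ":") }
        a[1] == ifc { split(a[2], f, " "); print f[1], f[9]; found = 1 }
        END { exit !found }
    ' "${PROC_NET_DEV:-/proc/net/dev}"
}

# Print average receive/transmit throughput of IFACE over INTERVAL seconds.
# Keep the interval short: on some 32-bit systems the counters may wrap,
# which would show up here as a negative rate.
if_rate() {
    ifc=$1
    interval=${2:-5}
    before=$(read_bytes "$ifc") || return 1
    sleep "$interval"
    after=$(read_bytes "$ifc") || return 1
    echo "$before $after $interval" | awk -v ifc="$ifc" '{
        printf "%s: rx %.1f MB/s  tx %.1f MB/s\n", ifc,
               ($3 - $1) / $5 / 1048576, ($4 - $2) / $5 / 1048576
    }'
}

# Usage (interface name is a placeholder):
#   while true; do if_rate eth0 5; done
```

A GigE link that sits near its ceiling on one NFS server during the slow jobs, while the others are mostly idle, would narrow things down considerably.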
 
