Serious problem with QEMU/KVM backups in snapshot mode

abkrim

Scenario

QEMU/KVM VPSs on a host machine used only for QEMU VPSs (3 VPSs)
8 GB RAM / 2 SATA disks / 1 Xeon
backup mode: snapshot
Disks (tried both raw and qcow2) on the IDE or SCSI bus

Code:
vzdump --compress --dumpdir=/backup --mailto fenzen@gmail.com 516
INFO: starting new backup job: vzdump --compress --dumpdir=/backup --mailto fenzen@gmail.com 516
INFO: Starting Backup of VM 516 (qemu)
INFO: running
INFO: status = running
INFO: backup mode: snapshot
INFO: bandwidth limit: 10240 KB/s
INFO:   Logical volume "vzsnap-vz003.islaserver.com-0" created
INFO: creating archive '/backup/vzdump-qemu-516-2010_05_02-21_16_10.tgz'
INFO: adding '/backup/vzdump-qemu-516-2010_05_02-21_16_10.tmp/qemu-server.conf' to archive ('qemu-server.conf')
INFO: adding '/mnt/vzsnap0/images/516/vm-516-disk-1.qcow2' to archive ('vm-disk-ide0.qcow2')
At exactly this moment the VPS stops working (Apache, SMTP, ...). It does not even answer ping. Dead. Stopped.

Load on the host is between 1.49 and 2.

Frustrated.
 
With OpenVZ it works perfectly.

pveperf
CPU BOGOMIPS: 22537.14
REGEX/SECOND: 970755
HD SIZE: 9.92 GB (/dev/sda1)
BUFFERED READS: 74.77 MB/sec
AVERAGE SEEK TIME: 7.43 ms
FSYNCS/SECOND: 982.78
DNS EXT: 32.39 ms
DNS INT: 1.37 ms (ovh.net)
 
Looks quite normal. But other users already reported such problems:

http://forum.proxmox.com/threads/434-I-O-scheduler

Maybe you should try to set the IO Scheduler to 'deadline'. Edit /boot/grub/menu.lst and set

Code:
# kopt=root=/dev/XYZ ro elevator=deadline

Then run

# update-grub

and reboot. Does that help?
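
To confirm which scheduler is actually active before and after the change, you can read it from sysfs, where the kernel lists the available schedulers and marks the active one with square brackets. A minimal sketch, assuming that layout; the helper name `active_scheduler` and the device `sda` are illustrative only:

```shell
#!/bin/sh
# active_scheduler is a hypothetical helper (not part of Proxmox).
# It prints the name inside the square brackets of a scheduler line,
# e.g. "noop anticipatory deadline [cfq]" gives "cfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# The device name is an assumption; adjust it to your disk.
dev=sda
if [ -r "/sys/block/$dev/queue/scheduler" ]; then
    active_scheduler "$(cat "/sys/block/$dev/queue/scheduler")"
fi
```

If this prints `deadline` after the reboot, the kernel option was picked up.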
 
Well.

To change the scheduler I don't modify grub. There is no need; I just change it on the running system.

Of course, we tried with the default scheduler and also with
echo deadline > /sys/block/sda/queue/scheduler

http://wiki.openvz.org/I/O_priorities_for_containers

Desperate. We would like to move from OpenVZ to KVM, but we get poor I/O performance and a lot of problems with I/O overload.
 
After several tests:

1.- With the cfq scheduler, the KVM VPS stops working. Very heavy load on the server.
2.- With the deadline scheduler, the backup grows and grows and grows.

Example:
VPS 999
43810819 7,2G -rw-r--r-- 1 root root 7,1G may 10 20:22 vm-999-disk-1.qcow2
43810820 5,1G -rw-r--r-- 1 root root 5,1G may 9 22:26 vm-999-disk-2.qcow2

Backup after 10 hours
34 -rw-r--r-- 1 root root 32G may 10 20:21 vzdump-qemu-999-2010_05_10-15_48_35.dat

Frustrated. Desperate.
 
Nothing in the log... because I did not want to watch the backup size keep growing.



may 11 06:02:56 INFO: Starting Backup of VM 516 (qemu)
may 11 06:02:56 INFO: running
may 11 06:02:56 INFO: status = running
may 11 06:02:56 INFO: backup mode: snapshot
may 11 06:02:56 INFO: bandwidth limit: 10240 KB/s
may 11 06:02:56 INFO: Logical volume "vzsnap-vz003.islaserver.com-0" created
may 11 06:02:57 INFO: creating archive '/backup/vzdump-qemu-516-2010_05_11-06_02_56.tar'
may 11 06:02:57 INFO: adding '/backup/vzdump-qemu-516-2010_05_11-06_02_56.tmp/qemu-server.conf' to archive ('qemu-server.conf')
may 11 06:02:57 INFO: adding '/mnt/vzsnap0/images/516/vm-516-disk-1.raw' to archive ('vm-disk-ide0.raw')
may 11 06:03:53 INFO: Logical volume "vzsnap-vz003.islaserver.com-0" successfully removed
may 11 06:03:53 ERROR: Backup of VM 516 failed - interrupted by signal
 
pveversion -v
pve-manager: 1.5-8 (pve-manager/1.5/4674)
running kernel: 2.6.24-10-pve
proxmox-ve-2.6.24: 1.5-21
pve-kernel-2.6.24-10-pve: 2.6.24-21
qemu-server: 1.1-11
pve-firmware: 1.0-3
libpve-storage-perl: 1.0-10
vncterm: 0.9-2
vzctl: 3.0.23-1pve8
vzdump: 1.2-5
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1dso1
pve-qemu-kvm: 0.11.1-2
 
I'm in a panic. The machine is in production.

There are no backups of the sites.

First I would like to create clones of the VPSs on another machine.

After that I will try changing the kernel.
 
We've got the same problems with snapshots of large KVM images or VMs with more than one virtual disk (raw):

"endlessly growing" backup .dat files

node1:~# pveversion -v
pve-manager: 1.5-5 (pve-manager/1.5/4627)
running kernel: 2.6.24-8-pve
proxmox-ve-2.6.18: 1.5-4
pve-kernel-2.6.24-7-pve: 2.6.24-11
pve-kernel-2.6.24-8-pve: 2.6.24-16
pve-kernel-2.6.18-1-pve: 2.6.18-4
qemu-server: 1.1-11
pve-firmware: 1.0-3
libpve-storage-perl: 1.0-8
vncterm: 0.9-2
vzctl: 3.0.23-1pve8
vzdump: 1.2-5
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm-2.6.18: 0.9.1-5

node1:~# pveperf
CPU BOGOMIPS: 18468.54
REGEX/SECOND: 455959
HD SIZE: 94.49 GB (/dev/pve/root)
BUFFERED READS: 96.25 MB/sec
AVERAGE SEEK TIME: 10.66 ms
FSYNCS/SECOND: 1085.60
DNS EXT: 44.68 ms
DNS INT: 13.88 ms (landesstelle.kjh.de)

System: HP DL185 G5, hardware RAID (Smart Array P400 with BBWC), 7.2k SATA disks in RAID 1
 
Uhmm...

Easy.

1.- Ask for the keys to the machine.
2.- The sysadmin gives you the key.
3.- Log in to the system.
4.- Run the backup command for the VMID XX that has the problem.

Any other way... I don't know.
 
We had the problem with Proxmox 1.4 and now have it with 1.5. The VMs are Windows 2008 R2, on different hardware (now an HP DL185 G5 with 8 GB RAM).

Reproducing the bug seems to be difficult (still no idea how). I'm "happy" to read that the same problem exists on other hardware.

Build a VM with more than 100 GB of used space and, for example, 2 VM disks. Maybe you will also get the problem. Anybody else in the forum with such problems?

Is our I/O performance too low???
 
You are running a quite old 2.6.24 kernel and you are using the KVM module for the 2.6.18 kernel. This leads to a lot of issues.

So make sure you have the right packages. For an upgrade how-to, see: http://pve.proxmox.com/wiki/Downloads
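
The mismatch described here can be spotted directly in the `pveversion -v` output: the running kernel is 2.6.24, while the installed package is `pve-qemu-kvm-2.6.18`. A rough sketch of that comparison, assuming the suffix of the package name names the kernel series it was built for (as the reply above implies); `check_kvm_kernel` is a hypothetical helper, not a Proxmox tool:

```shell
#!/bin/sh
# check_kvm_kernel (hypothetical helper) reads `pveversion -v` output on
# stdin and compares the kernel series of the running kernel with the
# series embedded in the pve-qemu-kvm package name, if there is one.
check_kvm_kernel() {
    awk -F': ' '
        $1 == "running kernel" { split($2, k, "-"); running = k[1] }
        $1 ~ /^pve-qemu-kvm-/  { name = $1; sub(/^pve-qemu-kvm-/, "", name); kvm = name }
        END {
            if (kvm != "" && kvm != running)
                print "mismatch: kernel " running ", kvm package for " kvm
            else
                print "ok"
        }'
}

# Only runs on a host where pveversion is installed.
if command -v pveversion >/dev/null 2>&1; then
    pveversion -v | check_kvm_kernel
fi
```

A package named plain `pve-qemu-kvm` (no kernel suffix) is reported as `ok`, because there is nothing to compare against.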
 
...

Build a VM with more than 100 GB of used space and, for example, 2 VM disks. Maybe you will also get the problem. Anybody else in the forum with such problems?

Is our I/O performance too low???
Hi,
I back up one VM with two raw disks (32 + 145 GB) without trouble... but on a fast RAID:
Code:
cat vzdump-qemu-125-2010_05_29-07_12_51.log
May 29 07:12:51 INFO: Starting Backup of VM 125 (qemu)
May 29 07:12:51 INFO: running
May 29 07:12:51 INFO: status = running
May 29 07:12:52 INFO: backup mode: snapshot
May 29 07:12:52 INFO: bandwidth limit: 10240 KB/s
May 29 07:12:52 INFO:   Logical volume "vzsnap-proxmox2-0" created
May 29 07:12:52 INFO: creating archive '/bckup/vzdump-qemu-125-2010_05_29-07_12_51.tgz'
May 29 07:12:52 INFO: adding '/bckup/vzdump-qemu-125-2010_05_29-07_12_51.tmp/qemu-server.conf' to archive ('qemu-server.conf')
May 29 07:12:52 INFO: adding '/mnt/vzsnap0/images/125/vm-125-disk-1.raw' to archive ('vm-disk-virtio0.raw')
May 29 07:35:18 INFO: adding '/mnt/vzsnap0/images/125/vm-125-disk-2.raw' to archive ('vm-disk-virtio1.raw')
May 29 09:02:01 INFO: Total bytes written: 157522399744 (22.94 MiB/s)
May 29 09:02:01 INFO: archive file size: 56.69GB
May 29 09:02:01 INFO: delete old backup '/bckup/vzdump-qemu-125-2010_05_22-07_14_13.tgz'
May 29 09:02:03 INFO:   Logical volume "vzsnap-proxmox2-0" successfully removed
May 29 09:02:03 INFO: Finished Backup of VM 125 (01:49:12)
What is the output of
Code:
vgdisplay pve | grep Free
especially during the backup process? Only an idea...
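
To capture that output repeatedly while the backup runs, a small loop in a second shell is enough. This is only a sketch: the volume group name `pve` comes from the command above, while the helper name `watch_vg_free`, the interval and the number of samples are assumptions for illustration:

```shell
#!/bin/sh
# watch_vg_free (hypothetical helper) prints a timestamp and the "Free"
# line of vgdisplay for a volume group, a fixed number of times.
#   $1 = volume group, $2 = seconds between samples, $3 = number of samples
watch_vg_free() {
    vg=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s ' "$(date '+%H:%M:%S')"
        vgdisplay "$vg" | grep Free
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example (run as root while vzdump is working):
# one sample per minute for two hours.
#   watch_vg_free pve 60 120
```

Redirecting the output to a file makes it easy to post the numbers here afterwards.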

Udo
 
