vzdump no space on device although plenty of space

adoII

Renowned Member
Jan 28, 2010
Hi,
I am testing Proxmox 1.5 and its backup schedules.

I back up a 100 GB qcow2 KVM domain from a share with 900 GB free to a SAN volume that has 500 GB free.

The backup aborts with a disk-full error on a temporarily generated snapshot volume:

Code:
Feb 03 22:06:01 INFO: Starting Backup of VM 112 (qemu)
Feb 03 22:06:02 INFO: running
Feb 03 22:06:02 INFO: status = running
Feb 03 22:06:02 INFO: backup mode: snapshot
Feb 03 22:06:02 INFO: bandwidth limit: 10000 KB/s
Feb 03 22:06:02 INFO:   Logical volume "vzsnap-san04-0" created
Feb 03 22:06:02 INFO: creating archive '/mnt/pve/san03backupexport/vzdump-qemu-112-2010_02_03-22_06_01.tar'
Feb 03 22:06:02 INFO: adding '/mnt/pve/san03backupexport/vzdump-qemu-112-2010_02_03-22_06_01.tmp/qemu-server.conf' to archive ('qemu-server.conf')
Feb 03 22:06:02 INFO: adding '/mnt/vzsnap0/images/112/vm-112-disk-1.qcow2' to archive ('vm-disk-virtio0.qcow2')
Feb 04 08:31:36 INFO: 383314812928 B 357.0 GB 37534.1 s (10:25 h) 10212450 B/s 9.74 MB/s
Feb 04 08:31:36 INFO: write: No space left on device
Feb 04 08:31:36 INFO: received signal - terminate process
Feb 04 08:31:38 INFO:   Logical volume "vzsnap-san04-0" successfully removed
Feb 04 08:32:17 ERROR: Backup of VM 112 failed - command '/usr/lib/qemu-server/vmtar '/mnt/pve/san03backupexport/vzdump-qemu-112-2010_02_03-22_06_01.tmp/qemu-server.conf' 'qemu-server.conf' '/mnt/vzsnap0/images/112/vm-112-disk-1.qcow2' 'vm-disk-virtio0.qcow2' |cstream -t 10240000 >/mnt/pve/san03backupexport/vzdump-qemu-112-2010_02_03-22_06_01.dat' failed with exit code 2

What exactly is happening there? How can I influence the size of the temporary snapshot volume, and why the hell does a snapshot of a 100 GB volume grow to 350 GB? There were no big changes on the webserver during the 10-hour backup window.


I also noticed a very high disk load during the first 20 minutes of the vzdump backup. Afterwards the bwlimit seemed to take effect and the backup load was no longer noticeable. Can I do anything to reduce the initial high load? Previously, on a manually managed KVM server, I used LVM snapshots on raw images and the snapshot load was not noticeable.
 
What exactly is happening there? How can I influence the size of the temporary snapshot volume, and why the hell does a snapshot of a 100 GB volume grow to 350 GB? There were no big changes on the webserver during the 10-hour backup window.

You can specify the size of the snapshot with the --size parameter. It is best to configure that in /etc/vzdump.conf. This has already been discussed many times in this forum.
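For example, something along these lines in /etc/vzdump.conf should set it globally (a sketch based on 'man vzdump'; the 40960 MB value is only an illustration):

Code:
# /etc/vzdump.conf
# LVM snapshot size in MB (the default is only 1024)
size: 40960
# bandwidth limit in KB/s, as seen in the log above
bwlimit: 10000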
 
Hi,

I still do not understand how big I should make the snapshot. In my old LVM installation with LVM snapshots, a snapshot size of 40 GB was always far more than enough for backing up a 100 GB VM.

Here with vzdump, 350 GB were needed? Where do the 350 GB come from? 350 GB is about the total size of the NFS target directory where all the backups go.

What process causes the initial very high disk load in the first 20 minutes of the backup, and how could I circumvent that?
 
I still do not understand how big I should make the snapshot. In my old LVM installation with LVM snapshots, a snapshot size of 40 GB was always far more than enough for backing up a 100 GB VM.

vzdump uses 1 GB as the default, which seems too small (see 'man vzdump').

Here with vzdump, 350 GB were needed? Where do the 350 GB come from?

The snapshot runs out of space, and the filesystem on the snapshot gets corrupted.
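You can watch how full the snapshot gets while a backup runs (assuming standard LVM2 tools; the snapshot name is taken from the log above):

Code:
# the snapshot usage column (Snap% or Data%, depending on the LVM version)
# shows how much of the COW space is used; at 100% the snapshot is invalidated
watch -n 60 lvs /dev/pve/vzsnap-san04-0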

What process causes the initial very high disk load in the first 20 minutes of the backup, and how could I circumvent that?

How many VMs do you run?
 
I have 10 VMs, some started, some not. They do very little I/O; 8 of them are mostly sleeping with CPU and I/O at zero. This is a test environment.
Altogether the VMs have 210 GB of qcow2 disks.
I think vzdump only snapshots the one qcow2 file it is currently backing up? Or does it behave differently?
 
I think vzdump only snapshots the one qcow2 file it is currently backing up? Or does it behave differently?

It takes a snapshot of the underlying LVM device. It seems all your VM data is on the same LVM logical volume (/dev/pve/data).
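Roughly, what vzdump does in snapshot mode looks like the following (a simplified sketch pieced together from the log above, not the actual vzdump code; the 1024 MB size is the default mentioned earlier):

Code:
# snapshot the whole LV that holds all the qcow2 images
lvcreate --snapshot --size 1024M --name vzsnap-san04-0 /dev/pve/data
# mount the frozen view and archive only the one image from it
mkdir -p /mnt/vzsnap0
mount /dev/pve/vzsnap-san04-0 /mnt/vzsnap0
# ... vmtar reads /mnt/vzsnap0/images/112/vm-112-disk-1.qcow2 ...
umount /mnt/vzsnap0
lvremove -f /dev/pve/vzsnap-san04-0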
 
Ah, I understand. Yes, all my VMs are in one logical volume.

So the whole device is snapshotted just to back up one VM. Of course that is a big overhead. I thought there was a method of snapshotting just one qcow2 file.
So if I bwlimit my backup and it takes 10 hours, then for 10 hours every write to any VM disk is done twice: once as a copy-on-write into the snapshot and once in the VM. Crazy.

What might be an alternative? Will suspend/resume behave differently; is that an alternative? Does suspend/resume work for Windows and SQL Server/Exchange Server? Or should I write my own scripts?
 
So the whole device is snapshotted just to back up one VM. Of course that is a big overhead.

yes

I thought there was a method of snapshotting just one qcow2 file.
So if I bwlimit my backup and it takes 10 hours, then for 10 hours every write to any VM disk is done twice: once as a copy-on-write into the snapshot and once in the VM. Crazy.

yes

What might be an alternative?

Use LVM storage (each VM disk is then its own logical volume).
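A hedged sketch of what such an entry could look like in /etc/pve/storage.cfg (the storage and volume group names are illustrative, not from this thread):

Code:
# LVM-backed storage: every VM disk becomes its own LV,
# so vzdump can snapshot just that one disk
lvm: san_images
        vgname san_vg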

I guess in the future we can use btrfs.

Will suspend/resume behave differently; is that an alternative?

This will have long downtimes.
 
Use LVM storage (each VM disk is then its own logical volume).

I come from a setup where each VM disk had its own LVM volume. It was a hassle to manage all those volumes, each of which took up its maximum size, and I don't want to go back there, with or without a web interface.

I was so glad to find Proxmox as an easy-to-manage alternative, but the problem is that in the simple setup I cannot back up machines in a consistent state with reasonable I/O overhead using vzdump.

Does somebody know if there is an alternative? Could I, for example, make a consistent snapshot of a running VM with the qemu monitor and qcow2 files, and write a few scripts for backups?
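The mechanics might look something like this untested sketch (that 'qm monitor' accepts piped input and the disk path are my assumptions; 'savevm' briefly pauses the guest to store an internal snapshot inside the qcow2 file, and whether a subsequent copy of the live file is really consistent would need testing):

Code:
#!/bin/bash
# untested sketch: internal qcow2 snapshot via the qemu monitor
VMID=112
DISK=/var/lib/vz/images/$VMID/vm-$VMID-disk-1.qcow2

# 'savevm' pauses the guest briefly and writes VM state + disk state
# into the qcow2 file itself as an internal snapshot
echo "savevm vzbackup" | qm monitor $VMID

# verify the snapshot exists
qemu-img snapshot -l "$DISK"

# ... copy the image to the backup target here ...

# delete the internal snapshot again to reclaim space
qemu-img snapshot -d vzbackup "$DISK"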
 
adoII,
I am facing the same issue. I was wondering: instead of using vzdump for backups, would it be a good idea to take snapshots at the datastore itself, with multiple restore points? For example, I am using Openfiler as a datastore over iSCSI; if I took snapshots of the volumes on Openfiler itself, would that be an alternative?
Just wondering.
 
adoII,
I am facing the same issue. I was wondering: instead of using vzdump for backups, would it be a good idea to take snapshots at the datastore itself, with multiple restore points? For example, I am using Openfiler as a datastore over iSCSI; if I took snapshots of the volumes on Openfiler itself, would that be an alternative?
Just wondering.

First: you need plenty of space on the SAN if you do that for several VMs.

Second: performance degrades while snapshots are active (at least when the SAN uses LVM).

Besides, what you are talking about is not a 'backup'. The idea is that you move the images created by vzdump to an offline location, for example to tape. Everything else is quite dangerous.