Move qcow2 from NFS to CEPH - No Sparse?!

fragger
Apr 10, 2014
Hello,

we set up a Ceph storage and want to move all VMs from the NFS storage to the Ceph storage - but after I moved a 265G VM to the Ceph storage I was very surprised: the image now uses 265G and not ~10GB.

We use the qcow2 format on the NFS share and I moved the image over PVE to the Ceph storage, so there is no way to select another format than raw. I tried multiple commands, e.g. resize with shrink, flatten etc., but nothing helped. However, if I restore a VM to Ceph, I see this in the log: "space reduction due to 4K zero blocks 1.99%".

How can we sparsify the raw image on Ceph? We need to do this online or with minimal downtime.
We have 1TB of space on each node and currently all VMs use around 370GB - so there is enough space to grow, but only if we don't lose thin provisioning.

Thanks!
 
What command did you use for your conversion? Did you just use qemu-img convert directly from qcow2 to a Ceph RBD, such as:
Code:
qemu-img convert -f qcow2 -O raw debian_squeeze.qcow2 rbd:data/vm-121-disk-1

If so, that's probably the issue, as I believe qemu-img only writes in RBD format 1. You need to write in format 2 for proper sparse support.
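If in doubt, you can check which format an existing image ended up with (pool and image name taken from the example above); the output includes a format: line:
Code:
rbd info data/vm-121-disk-1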

It's possible that doing this might work:
Code:
qemu-img convert -f qcow2 -O raw debian_squeeze.qcow2 /dev/stdout | rbd --image-format 2 import - data/vm-121-disk-1
but I have seen references that qemu-img does NOT write data sequentially, so that might not work.

So if you want a direct conversion, that may not be possible; you might need to do it in two steps, like:
Code:
qemu-img convert -f qcow2 -O raw debian_squeeze.qcow2 debian_squeeze.raw && 
rbd --image-format 2 import debian_squeeze.raw data/vm-121-disk-1
 
@brad_mssw
>> What command did you use for your conversion?
I moved the image over PVE to the Ceph storage, so there is no way to select another format than raw.
I used the button in the PVE GUI.

>> You need to write in format 2 for proper sparse support
This command gives me the following information:
Code:
rbd --pool ceph info vm-1xx-disk-1
rbd image 'vm-1xx-disk-1':
    size 10000 MB in 2500 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.9abc74b0dc51
    format: 2
    features: layering

I know that this VM currently uses only 1.6G and has a 10GB disk, but the image currently uses 9.8G, even after I ran the shrink command on Ceph.
 
I've never tried using the GUI for that, so I can't say whether it imports properly, but the fact that it says format 2 is a good thing.

Neither rbd info nor rbd ls will show you the on-disk size; they show you the allocated size. You have to do some nasties to query the on-disk size:
Code:
rbd diff ceph/vm-1xx-disk-1 | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'

Check out http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/3684
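If you also want to exclude extents that are allocated but reported as zero, you can filter on the type column of rbd diff; a variant along these lines should work (column layout assumed from the plain-text rbd diff output):
Code:
rbd diff ceph/vm-1xx-disk-1 | awk '$3 == "data" { SUM += $2 } END { print SUM/1024/1024 " MB" }'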
 
The command gives me back a size of 10000MB. Our Ceph storage is currently empty, with only this image on it, so I can say the image uses ~9.8G and not the ~2.8GB used on the NFS storage.

If I use the command "qemu-img info vm-1xx-disk-1.qcow2" on the NFS storage, I see that this VM only needs 2.8G:
Code:
image: vm-1xx-disk-1.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 2.8G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false

So I don't know why the backup restore command has no problem with this but the move command does. After I restore the backup, it shrinks the VM - but that's not an option for a VM that needs up to 2h for the backup and then has to be restored; a downtime of ~5h is too much.
 
Please provide the full cut-and-paste output of the command provided, along with the command you ran, to confirm.

I know for a fact that you cannot simply look at the disk usage of the filesystem Ceph creates to check the used space, as even with ZERO images mine shows 5.6GB used!
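If you want a cluster-level view instead, Ceph itself reports per-pool usage; something like this is more meaningful (exact output columns vary by Ceph version):
Code:
ceph df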

Finally, if you use SCSI for your disk controller, have your SCSI controller set to VirtIO, and enable 'discard' support, then within the guest you can run
Code:
fstrim -v /
to have it release any zeroed disk space back to Ceph, and it WILL release it from disk.
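As a rough sketch of that configuration from the CLI (VM ID 100, storage name 'ceph' and the volume name are just placeholders here; the same can be set in the GUI under the VM's hardware settings):
Code:
# use the VirtIO SCSI controller for the VM
qm set 100 --scsihw virtio-scsi-pci
# attach the disk on the SCSI bus with discard enabled
qm set 100 --scsi0 ceph:vm-100-disk-1,discard=on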

In fact, in all our VMs on Ceph, we set up a cron job that does:
Code:
#!/bin/bash
# trim every mounted xfs/ext4 filesystem so freed blocks are released to the storage
grep ^/ /proc/mounts | grep -e " xfs " -e " ext4 " | awk '{ print $2 }' | while read mountpoint ; do
  fstrim ${mountpoint}
done
Once a week.
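For example, a matching cron entry could look like this (the script path /usr/local/sbin/fstrim-all.sh is just a placeholder):
Code:
# /etc/cron.d/fstrim-all - run the trim script every Sunday at 03:00
0 3 * * 0 root /usr/local/sbin/fstrim-all.sh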

Note that if you use LVM within your VM, you have to set issue_discards = 1 in the devices { } section of /etc/lvm/lvm.conf, otherwise the discards won't be pushed up to Ceph.
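That setting sits in the devices block of /etc/lvm/lvm.conf inside the guest, roughly like this:
Code:
devices {
    # pass discards from removed/resized LVs down to the underlying virtual disk
    issue_discards = 1
}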
 
>> Please provide the full cut-and-paste output of the command provided along with the command you ran to confirm.
I only pasted the full output; I didn't cut anything.

>> if you use SCSI for your disk controller, and have your SCSI controller set to Virtio
I don't know exactly what you mean. We use SAS hard disks for Proxmox; the VM is on the NFS share, which has 6x 2TB SATA disks in RAID6 behind a HW controller. But yes, I try to use VirtIO whenever I can and the OS supports it. No, I don't have discard activated - I think I would have to stop and start the VM, and that is not possible.

After I killed the FS with this command once, I don't want to use it a second time, let alone run it from cron o_O
 
It sounds like you need to test this in a test environment, since you can't do any testing on these VMs.

I really can't help you much further. If your VM isn't imported as sparse, then it sounds like Proxmox's GUI option isn't doing that. In that case your options are either to run fstrim after the import to tell Ceph which data should be sparse, or to use the other import command I gave you on the command line.

However, neither approach would be live/online.
 
I tested the whole day and found out a few things.

1.) If I move the disk from the NFS storage to Ceph via the GUI, the disk is cloned and uses the complete space. No shrinking or flattening helped.
2.) If I create a VM on the NFS storage with qcow2, don't install an OS on it, and move the disk via the GUI to Ceph, it works - the disk uses 0G on Ceph.
3.) If I create a VM directly on the Ceph storage, it is thin provisioned.
4.) If I restore a VM from an LZO backup to Ceph, the VM is restored and sparsified at the end of the restore process.

And the newest thing, which I found out in the last few minutes:
5.) If I move the qcow2 image from the NFS storage to the NFS storage but select "raw" as the format, and afterwards move the disk to the Ceph storage, it is thin provisioned on the Ceph storage (see the sketch below).
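For reference, that two-step workaround can also be scripted with qm move_disk (VM ID 100, disk virtio0 and the storage names 'nfs' and 'ceph' are only placeholders, and the exact option names may differ between PVE versions):
Code:
# step 1: convert in place by moving to the same NFS storage with format raw
qm move_disk 100 virtio0 nfs --format raw --delete
# step 2: move the now-raw disk to Ceph; it arrives thin provisioned
qm move_disk 100 virtio0 ceph --delete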

What's a little bit annoying: if I stop the move process, I can't delete the image via the GUI - I have to delete it over the shell.
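For anyone else hitting that, removing such a leftover target image from the shell looks roughly like this (pool 'ceph' and the image name are placeholders):
Code:
rbd -p ceph rm vm-100-disk-1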

So I think I will move a few VMs and test it in production ;)

But brad_mssw, thanks for your support!

//EDIT:
Regarding trim: I tried a few things, and the command only works if the disk image is attached as SCSI and the controller type is VirtIO (virtio-scsi); with a VirtIO disk on a VirtIO controller it doesn't work.
 
Regarding trim: I tried a few things, and the command only works if the disk image is attached as SCSI and the controller type is VirtIO (virtio-scsi); with a VirtIO disk on a VirtIO controller it doesn't work.
I can confirm this. I therefore wonder what purpose enabling discard has on virtio?
 
I can confirm this. I therefore wonder what purpose enabling discard has on virtio?

AFAIK none. The option shouldn't be checkable in Proxmox when using standard virtio, and supposedly there are no plans to extend the standard/old virtio. It works fine with virtio-scsi though, and it sounds like they (the KVM/QEMU people) want to phase out the old virtio completely at some point.
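A quick way to double-check from inside the guest whether the virtual disk actually advertises discard/TRIM support (device name sda assumed for a virtio-scsi disk) - non-zero DISC-GRAN/DISC-MAX values mean discard is available:
Code:
lsblk --discard /dev/sda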
 
