GlusterFS with RAW and qcow2 images

charon

Renowned Member
Nov 30, 2008
Hi,
I was able to test a storage cluster backend for Proxmox yesterday. Since I only had two servers, which should run a replicated scenario and still allow adding more servers later, I chose GlusterFS.
The setup was really easy, and I could add it using the PVE GUI as well as locally.
I have 2 servers, each running a RAID 10 with 6 SAS disks, directly connected via 1 Gbit Ethernet. Normally I would have done this with DRBD, but I wanted to be able to expand this cluster setup.
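
For reference, here is a rough sketch of how a replica-2 volume for such a test can be created (hostnames and brick paths are placeholders, not my actual ones):

Code:
# on both nodes: prepare a brick directory on the RAID 10 array
mkdir -p /data/brick1
# on node1: probe the peer, then create and start the replicated volume
# (add 'force' if the brick directory sits on the root filesystem)
gluster peer probe node2
gluster volume create glustervolume replica 2 node1:/data/brick1 node2:/data/brick1
gluster volume start glustervolume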

As I understand it, Ceph only makes sense starting with 3 nodes?

So here's my problem: all VMs running raw images run quite well and the speed is OK; I didn't expect a miracle because I don't have a distributed setup. But every time I choose a qcow2 image (during image creation, or when moving a disk from local to remote storage) I get massive I/O wait and the RAID disks run wild. Proxmox reports lock timeouts and I have to wait a couple of minutes for Gluster to fsync all pending writes.
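
To be concrete, the operations that trigger it are nothing special; roughly like this (VM ID, paths and storage name are just placeholders):

Code:
# creating a qcow2 disk directly on the mounted Gluster storage
qemu-img create -f qcow2 /mnt/pve/gluster/images/100/vm-100-disk-1.qcow2 32G
# or moving an existing disk from local storage to the Gluster storage as qcow2
qm move_disk 100 ide0 gluster --format qcow2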

From what I read, GlusterFS 3.4 setups should work very well with QEMU. Should I avoid using the storage as a filesystem mount and instead create the VM manually on Gluster using libgfapi?
Will there be an option in the PVE GUI for this, or is the non-FUSE mount of the storage enough?

Maybe someone could help by adding their Gluster experience to the Proxmox wiki.
 
As I understand it, Ceph only makes sense starting with 3 nodes?
Yes, and generally every active-active cluster needs 3 nodes to have quorum and avoid split-brain. (I think it's the same for GlusterFS.)
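
As a side note, and only as a sketch from the GlusterFS documentation rather than something tested here: with just two replicas you can at least enable client-side quorum, so that writes are blocked when quorum is lost instead of risking divergent writes:

Code:
gluster volume set glustervolume cluster.quorum-type auto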


From what I read, GlusterFS 3.4 setups should work very well with QEMU. Should I avoid using the storage as a filesystem mount and instead create the VM manually on Gluster using libgfapi?
Will there be an option in the PVE GUI for this, or is the non-FUSE mount of the storage enough?

Maybe someone could help by adding their Gluster experience to the Proxmox wiki.

Proxmox uses the FUSE mount to manage disks (create/delete/resize, snapshots, ...),
but QEMU accesses GlusterFS directly through the internal libgfapi.

Simply use the GlusterFS plugin from the latest Proxmox.
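
If you prefer the command line over the GUI, the storage definition can also be added with pvesm; a rough example (storage name, server IP and volume name are placeholders):

Code:
pvesm add glusterfs gluster --server 192.168.0.10 --volume glustervolume --content images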
 
Simply use the GlusterFS plugin from the latest Proxmox.

As I mentioned above, I was using the latest Proxmox plugin and ran into the qcow2 performance issues, which were the main reason for this thread.
So, does anyone have a clue why these are so slow?
 
As I mentioned above, I was using the latest Proxmox plugin and ran into the qcow2 performance issues, which were the main reason for this thread.
So, does anyone have a clue why these are so slow?

qcow2 has a little overhead on writes because of the metadata writes, but the impact shouldn't be that high.

Have you tried qcow2 locally, without GlusterFS, to compare?

What is your hardware setup? (disks/RAID)
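
One more idea, untested on GlusterFS on my side: creating the qcow2 with metadata preallocation, so that most of the metadata writes happen once at creation time rather than during the migration (the path is just an example):

Code:
qemu-img create -f qcow2 -o preallocation=metadata /mnt/pve/gluster/images/100/vm-100-disk-1.qcow2 32G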
 
@spirit

I set up a dedicated storage cluster that is not running Proxmox. qcow2 creation on a local RAID 1 setup ran normally, but when I tried to migrate to the GlusterFS storage with qcow2 I got these I/O hangs. Migrating as raw ran without problems.
My setup is like this:

2x latest CentOS with GlusterFS from the EPEL repo, replica 2 setup on 6x SAS RAID 10 (direct patch cable and management LAN)
1x RAID 1 Proxmox host (management LAN)
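
In case it matters, this is how I check the volume layout and status on the storage nodes (the volume name is whatever was used at creation time):

Code:
gluster volume info glustervolume
gluster volume status glustervolume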


@dietmar


Sorry, it is a holiday today and I have no access to the servers. I just added the remote IP of one Gluster server with the GlusterFS storage plugin in the latest Proxmox.
The same result happened with a local Gluster mount used as a shared directory in Proxmox.
 
It was a FreeBSD VM with an IDE disk and no cache enabled.

No cache is OK. I don't know whether IDE can have an impact with qcow2 on GlusterFS.
Have you tested with a Linux guest and virtio to compare?

We are also going to update the GlusterFS packages from 3.4 to 3.4.1; I see a lot of bugfixes, maybe that can help.
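
For the virtio test, something along these lines should be enough to add an extra test disk to an existing VM and benchmark inside the guest (VM ID, storage name and size are placeholders):

Code:
# add a 10 GB qcow2 disk on the Gluster storage, attached via virtio
qm set 100 --virtio1 gluster:10,format=qcow2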
 
No cache is OK. I don't know whether IDE can have an impact with qcow2 on GlusterFS.
Have you tested with a Linux guest and virtio to compare?

We are also going to update the GlusterFS packages from 3.4 to 3.4.1; I see a lot of bugfixes, maybe that can help.

As I mentioned earlier, it's just a test setup and I was curious about it. I'll test again after your update and add my experience to this thread.
 
I came up with one more question,

Are you planning to support direct Gluster storage access from QEMU using libgfapi, without a mount section?

Like this:

Code:
qemu-img create -f qcow2 gluster://server.domain.com:24007/myglustervol/myvm.img 5G
 
I came up with one more question,

Are you planning to support direct Gluster storage access from QEMU using libgfapi, without a mount section?

Like this:

Code:
qemu-img create -f qcow2 gluster://server.domain.com:24007/myglustervol/myvm.img 5G


Maybe later. Create/delete/snapshot are fine, resize is only implemented in the latest QEMU, and for listing files we would need to use libgfapi inside the Proxmox Perl code.
(And I have only found Python bindings so far.)
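
To illustrate the resize part: assuming a qemu-img built with gluster support (I have not checked exactly which version is required), the direct URL form would look roughly like this, reusing the URL from your example:

Code:
qemu-img resize gluster://server.domain.com:24007/myglustervol/myvm.img +5G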
 
@dietmar
So I would just create the disk on the Gluster storage via the GUI and update /etc/pve/qemu-server/myVM.conf to use gluster://server.domain.com:24007/myglustervol/myvm.img?
Or am I getting it wrong and is it fully supported through the mounted GlusterFS storage plugin? There it looked as if it is using the native mount.glusterfs mountpoint!?

@spirit:

I read something about them planning a RESTful API like Ceph; maybe this could be an easier way instead of rewriting the Python bindings.
 
Here is an example of a vmid.conf (VM 285):

virtio0: gluster:285/vm-285-disk-1.raw,format=raw,size=5G

storage.cfg
-----------
glusterfs: gluster
        path /mnt/pve/gluster
        server 127.0.0.1
        volume glustervolume
        content images
        maxfiles 1


QEMU will use gfapi to access the disk directly, without using the FUSE mountpoint:
gluster://127.0.0.1/glustervolume/images/285/vm-285-disk-1.raw


(The file location on the FUSE mount is /mnt/pve/gluster/images/VMID/vm-VMID-disk-1.raw.)
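
You can verify this on the Proxmox node: the KVM command line generated for the VM references the gluster:// URL rather than the /mnt/pve path, for example:

Code:
qm showcmd 285 | grep -o 'gluster://[^, ]*'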
 
Hi,

Today I could continue my testing. The problem only happens with a 2-node replicated setup; distributed and striped setups across the 2 nodes work for image creation, even with qcow2.

I also noticed that the local FUSE mount on the Proxmox node gives

Code:
dd if=/dev/zero of=/mnt/pve/test/zero1.out bs=1M count=1000 conv=fsync

80 MB/s,

while a local dd inside a CentOS VM with the virtio driver, using both raw and qcow2, gives

Code:
dd if=/dev/zero of=/root/zero1.out bs=1M count=1000 oflag=direct

26 MB/s.
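
Note that the two dd invocations are not strictly comparable (conv=fsync vs. oflag=direct). Assuming the FUSE mount accepts O_DIRECT, a closer comparison on the host would be:

Code:
dd if=/dev/zero of=/mnt/pve/test/zero2.out bs=1M count=1000 oflag=direct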
 