ProxMox v1.5 with Gluster storage system

oeginc
Mar 21, 2009
I am currently facing two problems with my Proxmox installation... I have 4 machines running Proxmox with roughly 50 VMs... Each host is configured with 2 drives, one for /var/lib/vz and the other for /backup.

#1 - I only have enough hard drive space in each host to save 1 day of backups...
#2 - The hosts spend ~20-40% of their time in I/O wait... This is KILLING me!

I'm trying to find the best way to resolve these two problems while not spending a fortune on hardware. I've been keeping an eye on distributed file systems as a possible solution for quite some time. I've only recently had a chance to play with one (Gluster @ http://www.gluster.org) and I've come to be quite fond of the "idea" behind it...

That being said, what is the practicality of it? Is it possible to resolve these issues using a distributed file system, or is that not very likely?

I have several spare 1U servers lying around, and I was thinking about configuring a five-node Gluster system with those and mounting /var/lib/vz from there...
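To make the idea concrete, roughly what I have in mind (an untested sketch; the hostnames and brick paths are made up, and since a "replica 2" volume needs an even number of bricks I'd probably start with four of the five boxes):

import subprocess

nodes = ["gluster1", "gluster2", "gluster3", "gluster4"]
bricks = ["%s:/export/brick1" % n for n in nodes]

# join the peers into one trusted pool (run from gluster1)
for n in nodes[1:]:
    subprocess.run(["gluster", "peer", "probe", n], check=True)

# create a 2-way replicated volume across the bricks and start it
subprocess.run(["gluster", "volume", "create", "vmstore", "replica", "2",
                "transport", "tcp"] + bricks, check=True)
subprocess.run(["gluster", "volume", "start", "vmstore"], check=True)

# then, on each Proxmox host, mount the volume where the VMs live
subprocess.run(["mount", "-t", "glusterfs", "gluster1:/vmstore", "/var/lib/vz"],
               check=True)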

The only other option I see is purchasing several new hard drives and a RAID card, and building a RAID-10 setup in each host (read: very expensive).

I've done some preliminary testing of creating/reading files on the Gluster system, and the throughput doesn't seem that far off from local storage (at least in a single-drive setup). I used the tools and information provided here (http://www.gluster.com/community/documentation/index.php/GlusterFS_2.0_I/O_Benchmark_Results) to test with.

Any thoughts, opinions or real-life results would be greatly appreciated.
 
Hi - currently we use DRBD in primary/secondary mode. As we need to replace one of the servers, I am looking for an alternative such as GFS2 or Gluster. My goal is to increase reliability, availability, and ease of administration.

I'm still at the point of researching which is the best path for us to take.

We mainly use OpenVZ and Debian, with a few KVMs. Our business works around the clock, and we run our own mail/IMAP/database servers.

Some questions -
- how reliable has it been?

- how are the admin tools?

- how do you accomplish high availability?

- any suggestions for good forums / how-tos, etc.?

thanks in advance!
 
Hi,
I'm using Proxmox 1.8 with Gluster; can I give you any pointers?

Hi, I'm testing Proxmox 1.8 with Gluster 3.2.2 as storage for the virtual disks of KVM machines.

What about the problem of self-heal when you stop a Gluster node and then restart it?
I've read that with 3.3 (currently in beta) the problem will be resolved.

Are you using Proxmox + GlusterFS for OpenVZ machines?
 
Thanks for the info.

Have you tried GFS2 instead?

I'm just looking into it and wondered if it is used much in Proxmox production setups.
 

No, I'm not using GFS2... I have only read the documentation :-|

I can give you some information about GlusterFS:
- it does not use a separate metadata server to store filesystem information
- it's a distributed filesystem
- files are stored directly on the local filesystem of every node used as a brick (you can read them directly from the filesystem of any cluster node)
- there are many kinds of volumes (replicated, distributed, striped, ...)
- a simple command-line tool configures all the volumes in "live" mode (see the sketch below)
- the client side uses FUSE and runs in userspace (no kernel space)
- it's scalable
- if you have many servers to dedicate to the cluster, the performance can be very interesting
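For example, the kind of "live" administration I mean looks roughly like this (untested sketch; the volume name, the new host, and the brick path are just placeholders):

import subprocess

def gluster(*args):
    subprocess.run(["gluster"] + list(args), check=True)

gluster("volume", "info", "vmstore")                 # inspect the running volume
gluster("volume", "add-brick", "vmstore",
        "newnode:/export/brick1")                    # grow it while clients stay mounted
gluster("volume", "rebalance", "vmstore", "start")   # redistribute existing files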

At the moment there is a problem with the stable version of GlusterFS 3.2.2; I'll try to explain it here:

I'm using GlusterFS 3.2.2 on two peers with a replicated volume to store the virtual disk used by KVM for a virtual machine.


- Mount the volume using the GlusterFS client
- Fail one of the replica nodes (for example, detach the Ethernet cable!)
** Production is unaffected at this point; the virtual machine works perfectly
- Restart the failed node (reattach the Ethernet cable!)
** The virtual machine freezes until GlusterFS synchronizes the virtual disk to all nodes... then the virtual machine "un-freezes" and continues to work perfectly
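For what it's worth, before 3.3 the self-heal is driven from the client side: walking the mount and stat()-ing every entry forces the replicas back in sync, and healing a multi-GB virtual disk is exactly the phase where the VM freezes. A rough sketch (assuming the volume is mounted at /mnt/gluster, which is just a placeholder):

import os

mountpoint = "/mnt/gluster"
for root, dirs, files in os.walk(mountpoint):
    for name in dirs + files:
        # touching each entry from the client triggers the heal for that file
        os.lstat(os.path.join(root, name))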

The 3.3 beta version was published yesterday... by the end of August, maybe, this problem will be resolved!

sorry for my poor English.
 
We are using GlusterFS 3 in our multi-node Proxmox cluster, though not for VM storage, but as the central replicated, distributed filesystem for several TB of data. The VMs mount the GlusterFS volume. I tried VM storage and found performance very bad (note, I'm using KVM) and even experienced some crashes.

Firstly, if you're stuck in I/O wait, GlusterFS won't fix that; only a better storage subsystem will. GlusterFS does not properly scale - one file will only ever be stored on one (pair of) fileservers, and if that's slow.... It's like trying to build a Formula 1 car by welding 5 Fiats together :)

Secondly, I do have to say I'm happily using GlusterFS since I know of no other *stable* free, distributed replicated FS that is POSIX compliant.

BUT ... I'm not impressed at all with GlusterFS performance; especially the rewrite performance is massively poor (I get a maximum of 10-12 MB/sec rewrite). I guess FUSE does that to you. At the same time, if you can live with the performance, it works great and is stable. Small-file access is OK (since it's cached).
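For reference, a crude way to get a comparable rewrite figure (not the benchmark I used; the path and size are placeholders):

import os, time

path = "/mnt/gluster/rewrite-test.bin"
size_mb = 512
block = b"\0" * (1024 * 1024)

with open(path, "wb") as f:          # write the file once
    for _ in range(size_mb):
        f.write(block)

start = time.time()
with open(path, "r+b") as f:         # then rewrite it in place
    for _ in range(size_mb):
        f.write(block)
    f.flush()
    os.fsync(f.fileno())

print("rewrite: %.1f MB/s" % (size_mb / (time.time() - start)))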

I do believe Gluster's time is coming to an end, soon to be replaced by http://ceph.newdream.net. Just some advantages: the filesystem can be mounted in kernel space; storage nodes actually replicate blocks between themselves (Gluster: self-healing is done by the client); you can have an odd number of storage nodes; you can control the distribution for improved HA (e.g. disks / nodes / racks / datacenters); it's included in the mainline kernel ... There is even a KVM backend planned... First tests are very promising; it's just not yet production-stable.
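For the curious, the RBD/KVM path would look roughly like this (sketch only; the pool and image names are invented, and the exact syntax may well change before it is production-stable):

import subprocess

# create a 10 GB image in the default "rbd" pool
subprocess.run(["rbd", "create", "--size", "10240", "rbd/vm-test-disk"],
               check=True)

# qemu-img can talk to it directly once qemu is built with rbd support
subprocess.run(["qemu-img", "info", "rbd:rbd/vm-test-disk"], check=True)

# the guest then gets it as a raw drive, e.g.:
#   kvm -drive format=raw,file=rbd:rbd/vm-test-disk ...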

I will stick with GlusterFS for the moment, but as soon as I can I will switch to Ceph.

PS: someone was suggesting GFS2 -- AFAIK that's not a distributed FS, just a Cluster FS -- you still need a distribution mechanism (like drbd).
 

Thanks for your reply.
At the moment I'm only testing GlusterFS for VM storage and, for my needs, the performance is acceptable.

BUT, at the moment there are some problems that must still be resolved by Gluster.
I found this: http://gocept.wordpress.com/2011/06/27/no-luck-with-glusterfs/ ; there AB Periasamy (the father of Gluster) says that the next release of Gluster should resolve some of these problems, and then we will use it for VM storage.

I think there is now a lot of good software for a "KVM storage system", but none of it is recommended as production-stable (Ceph, Gluster, Sheepdog, MooseFS)... unfortunately :-|
 
Thank you for the information.... It'll be interesting to see what works best for Proxmox 2...
For now we're using a combo of nested DRBD (http://www.drbd.org/users-guide/s-nested-lvm.html) with http://pve.proxmox.com/wiki/DRBD .
We are using DRBD in single-primary mode with nested LVM. The reason for that is to set up a logical volume for OpenVZ, but use most of the DRBD device for raw LVM storage (rough sketch below).
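Roughly what the layout boils down to (a very rough sketch, not the exact commands we ran; it assumes /dev/drbd0 already exists and is primary on this node, and the VG/LV names and sizes are made up):

import subprocess

def run(*cmd):
    subprocess.run(list(cmd), check=True)

run("pvcreate", "/dev/drbd0")                        # the DRBD device becomes an LVM PV
run("vgcreate", "drbdvg", "/dev/drbd0")              # one volume group on top of it
run("lvcreate", "-L", "100G", "-n", "vz", "drbdvg")  # a logical volume for OpenVZ
run("mkfs.ext3", "/dev/drbdvg/vz")
run("mount", "/dev/drbdvg/vz", "/var/lib/vz")
# the rest of the VG stays free as raw LVM storage for the KVM guests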

We have moved our KVMs over to the new setup.

Next I'll try creating a logical volume for the VZs.

I like the snapshot backups of KVM, and want them on OpenVZ too.
 

So now, after 1 year, is there anything new concerning Ceph? I read on other sites that several people are already using it in production, but I'm not sure if it's stable enough. I also tried MooseFS in the last few days, and it works well with KVM raw images.
 
Yes, the latest kernel update in Proxmox 2.1 has support for Ceph as a storage backend, using RBD. It's basically a technology preview at this point, if I understand it right. Thus, it's not a "supported" feature. Some people on the forums have already implemented it. Search around on the forums and you'll find entries about it. There's also a wiki article on how to set it up. I, myself, am waiting for a node to use as my Proxmox headend for testing with Ceph. I have my Ceph nodes ready to go.
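From what I've read, the wiki setup boils down to adding an "rbd" entry to /etc/pve/storage.cfg, something like the stanza below (untested; the monitor addresses, pool, and storage name are invented, and the exact options may differ between Proxmox versions):

stanza = """
rbd: ceph-store
        monhost 192.168.0.11 192.168.0.12 192.168.0.13
        pool rbd
        username admin
        content images
"""

# append the storage definition on the Proxmox node (a keyring for the "admin"
# user still has to be placed under /etc/pve/priv/ceph/ for it to work)
with open("/etc/pve/storage.cfg", "a") as cfg:
    cfg.write(stanza)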
 

Is it correct that only the Ceph storage cluster (RBD / RADOS) is needed for Proxmox, and not CephFS (plus the metadata cluster)? I guess that only KVM images are supported, and no OpenVZ?
 
@michaeljk Yes, that's my understanding. I don't think CephFS is implemented in Proxmox. Perhaps that is coming.

@dietmar Sounds great. You guys are doing an amazing job. I'm looking forward to more news about Ceph, and maybe Sheepdog too, in Proxmox.
 
Ceph RBD is good enough. :) Looks really promising.

Maybe GlusterFS, which is supposed to be included in KVM 1.2, will be an option too down the road? I guess its speed is the big question mark.
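If the native driver does land, the disk would be addressed with a gluster:// URL instead of going through a FUSE mount; a sketch of what that might look like (host, volume, and image names are placeholders, and the exact URL syntax may change):

import subprocess

subprocess.run(["qemu-img", "create", "-f", "qcow2",
                "gluster://gluster1/vmstore/vm-101-disk-1.qcow2", "32G"],
               check=True)

# and the guest would be started with something like:
#   kvm -drive file=gluster://gluster1/vmstore/vm-101-disk-1.qcow2,if=virtio ...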
 
