Public cloud storage architecture suggestions

wipeout_dude

Hi all,

I am planning to branch out from basic web hosting and start providing specialised VPS solutions to certain customers, and Proxmox VE seems to be a good platform to build it on.

The initial cloud will probably consist of 3-4 nodes linked by GbE. Live migration will be a requirement, and probably HA as well. Hopefully the cluster will grow to more nodes before too long.

What I can't decide is how to provision the storage (I will be using standard servers configured to serve the storage, not high-end NAS/SAN hardware).

Firstly, NFS seems the simplest and most flexible in terms of backups and switching over to a spare storage server should there ever be a problem. I'm just not sure whether the performance and scalability are there, especially once the number of VMs per host starts to increase. What are your thoughts and experiences?

Next is iSCSI. I haven't seen much to suggest it's any more scalable than NFS, and it's more complicated to implement and to move around if there is a problem. I also haven't seen anything to suggest improved I/O performance for the added complexity, although that impression may be inaccurate.

Moving on from that, there are the full-blown SAN options, but those are too expensive at the moment. Maybe later, if things go well and the cost can be justified.

Finally, there are the "new" storage solutions like GlusterFS (very slow currently), Ceph and Sheepdog, with Proxmox VE support for the latter two imminent.

So I would be very interested to hear from anyone running larger clusters, and from anyone who has tested the "new" storage solutions, what your thoughts/experiences are with things like Ceph or Sheepdog. (From an architecture perspective, Sheepdog seems a good solution because it has no need for a metadata server.)

I think a detailed discussion on storage may be useful to others searching for information; I couldn't find anything comprehensive.

Thanks.
 
If you want HA you need shared storage as well as fencing. Container-based VMs are not an option, since OpenVZ does not support shared storage; in that case a live migration means transferring the entire VM over the network from one node to the next, which can, depending on the size of the VM, be very time consuming. For shared storage, a standard server equipped with a hardware RAID controller with battery backup, using either RAID 5(6) or RAID 10, should be sufficient. Forget about NFS. It's too slow (40-50 MB/s), has problems with I/O, and can from time to time cause hanging file locks where only a reboot will fix the problem - not very HA friendly. iSCSI is not difficult to set up and will give you 80-90 MB/s over gigabit Ethernet, provided you use BLOCKIO (thick provisioning; thin-provisioned FILEIO gives the same throughput as NFS). In each node I would also plug in an SSD; 40-60 GB of storage should be sufficient.

IMHO the above should give a decent setup for reasonable commercial hosting.
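To illustrate what I mean by BLOCKIO - this is only a sketch, and the target name and LVM path are examples, not a tested config - an iSCSI Enterprise Target definition in ietd.conf would look roughly like this:

# Export an LVM logical volume as a thick-provisioned (BLOCKIO) LUN
Target iqn.2012-07.com.example:vmstore.lun0
    Lun 0 Path=/dev/vg_storage/vmstore,Type=blockio
    MaxConnections 1

With Type=fileio (a file on a filesystem) you get thin provisioning but roughly NFS-level throughput; blockio hands the raw block device to the initiator and bypasses the page cache.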
 
I've ended up using NFS for a Citrix XenServer storage system, with failover. I use GlusterFS replication to maintain 2 storage nodes, and GlusterFS's NFS server to serve them, with uCARP for IP failover.

My speed tests of GlusterFS NFS vs iSCSI are here: http://majentis.com/2011/09/21/xenserver-iscsi-and-glusterfsnfs/ The tests were done without GlusterFS replication, but with the system currently in production and replication occurring across a separate GigE link, I see no speed degradation unless one NFS server has been down for a period of time and a big sync is taking place.
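For anyone wanting to try something similar, the general shape is roughly the following - the hostnames, IPs, VHID and password below are placeholders, not my actual config:

# replicated 2-node Gluster volume, served by Gluster's built-in NFS server (NFSv3)
gluster volume create vmstore replica 2 stor1:/export/brick1 stor2:/export/brick1
gluster volume start vmstore

# on each storage node, uCARP keeps a floating IP alive for the NFS clients
ucarp -i eth0 -s 10.0.0.11 -v 10 -p secret -a 10.0.0.100 -B \
      --upscript=/etc/ucarp/vip-up.sh --downscript=/etc/ucarp/vip-down.sh

# clients mount the floating IP, so a failed storage node only causes a short stall
mount -t nfs -o vers=3,tcp 10.0.0.100:/vmstore /mnt/vmstore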

My issue with Proxmox and NFS storage is snapshots. Snapshots require VM downtime, which is NOT the case in XenServer. Honestly, if Proxmox can get zero-downtime NFS snapshots working, I would dump XenServer today.

Gerald
 
@Gerald That's interesting. In some basic and unscientific tests I did, I found GlusterFS to be quite slow, but I was using the native GlusterFS client. Perhaps it's worth doing some more testing on the new 3.3 release.

Out of interest, what underlying file system did you use? If I remember correctly they recommend BtrFS.
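To make sure I compare like with like next time, I am thinking of testing both access paths side by side - the server and volume names here are just examples:

# GlusterFS native (FUSE) client
mount -t glusterfs stor1:/vmstore /mnt/gluster-fuse

# Gluster's built-in NFS server (NFSv3 only)
mount -t nfs -o vers=3,tcp stor1:/vmstore /mnt/gluster-nfs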
 
@Gerald How did you configure iSCSI - FILEIO or BLOCKIO? BLOCKIO gives you a 50% increase in performance.
 
Note that native GlusterFS client access in qemu-kvm is coming for KVM 1.2 (no need for NFS). Red Hat seems to be pushing GlusterFS since they bought it :)

About Sheepdog and Ceph: I've got around 40,000 IOPS with Sheepdog and 13,000 IOPS with Ceph.
But I don't think they are production ready yet - some bugs, some hangs, ...
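From the patches posted upstream, the idea is that a VM disk points straight at the distributed store instead of going through a mount - the exact syntax may still change before release, and the server/volume/image names here are only examples:

# today: the image sits on an NFS or FUSE mount of the Gluster volume
-drive file=/mnt/pve/vmstore/images/101/vm-101-disk-1.qcow2,if=virtio

# with the native driver: qemu talks to the volume directly
-drive file=gluster://stor1/vmstore/images/101/vm-101-disk-1.qcow2,if=virtio

# sheepdog and ceph already work this way
-drive file=sheepdog:vm-101-disk-1,if=virtio
-drive file=rbd:rbd/vm-101-disk-1,if=virtio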
 
@spirit You got nearly three times the IOPS with Sheepdog compared to Ceph? What hardware configuration is that on? Interesting no matter what ... :)
 
@wipeout_dude I have no hard evidence for this, but according to what I've read, GlusterFS 3.3 has massive speed improvements and a lot of optimizations, specifically for running as a VM storage backend.
 
@spirit That all sounds very interesting. To me Sheepdog sounds like the best-designed system for running VM disk images, and that seems to be reflected in your I/O numbers. Looking forward to seeing how it develops.

What version of KVM is currently in Proxmox VE?
 
Regarding GlusterFS 3.3, I'll have to do some testing. As I said in a previous post, I think the Sheepdog design is the best I have seen for VMs, but GlusterFS is far more mature by comparison, so if I am going to attempt a clustered storage backend rather than a single SAN/NAS then GlusterFS is probably the more logical option.
 
Proxmox VE currently ships KVM 1.1.1.
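You can confirm the version on a node yourself; for example, either of these should show it:

pveversion -v | grep kvm
kvm --version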
 
Regarding the point that OpenVZ does not support shared storage - can I use OpenVZ containers on FC shared storage with a cluster file system like GFS2? Would HA work with this configuration?
 
When OpenVZ containers reside on shared storage using a distributed file system like GFS2, I assume it will work. I have not tried it yet, so do some tests before going into production.
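If you do test it, the rough shape of the experiment - cluster name, LUN path and journal count below are placeholders, and this is untested - would be something like:

# one journal per node that will mount the filesystem
mkfs.gfs2 -p lock_dlm -t mycluster:vzstore -j 4 /dev/mapper/fc-lun0

# requires the cluster stack (cman/DLM) to be running on every node before mounting
mount -t gfs2 /dev/mapper/fc-lun0 /var/lib/vz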
 
