Virtual SAN

casalicomputers

Mar 14, 2015
Hello everybody,
we're evaluating Proxmox VE as a VMware ESXi replacement, and we'd like to understand whether, in an HA environment, a physical SAN/NAS is REALLY required.
Several commercial KVM-based projects found on the internet use storage replication techniques (a so-called Virtual SAN) which are meant to avoid the requirement of purchasing a physical SAN.

I would like to understand if there's something similar for use with Proxmox, which would allow easily adding/removing nodes when business load changes while still providing good performance. By reading the Storage Models wiki page, I found out that something of that kind could be done using DRBD, Ceph or GlusterFS, but I didn't find sufficient information to do what I'm looking for (the DRBD article describes a two-node configuration and doesn't mention how to scale up, while the Ceph article describes a configuration which requires at least 3 nodes by default).

Is what I'm looking for feasible?
Which technology would best fit our needs?
What requirements should I meet? (disks, RAID level, VLANs, ...)
Could you provide some links with some detailed information?

Thank you.

Regards,
Michele
 
For any sort of HA, 3 nodes is the minimum requirement. Using 2 nodes you can 'hack' it to make it work, but it's not a good idea since you can't get quorum.

I would strongly recommend Ceph for your backend storage; it offers high-performance distributed storage with high reliability and quick recovery, no SAN/NAS required. Some setup info is available here as well: http://www.jaxlug.net/wiki/2014/07/16 (though a lot of that can be ignored, since that install used Mac Minis and some additional hoops had to be jumped through). It also assumes you're using a shared disk for the OS + Ceph, which means you can't use 'pveceph' for OSD creation and instead have to use the ceph utilities directly.
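For a more typical setup with dedicated disks, the pveceph workflow is roughly this sketch (the network and device names are placeholders for your own):

pveceph install                        # install the Ceph packages on each node
pveceph init --network 10.10.10.0/24   # initialize the cluster config (run once)
pveceph createmon                      # create a monitor (run on at least 3 nodes)
pveceph createosd /dev/sdb             # turn a dedicated disk into an OSD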

For disks, Ceph recommends not using RAID at all and instead making each disk its own OSD; Ceph handles the replication and failure detection. Just make sure you monitor it. Personally I still use HW RAID controllers and build a virtual disk to use with Ceph, just because I think it's easier to replace a failed disk in a RAID 5 than it is to remove an OSD from Ceph and add a new one, but I'm technically losing performance and also reducing my total available storage by doing that.
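For reference, the OSD replacement procedure referred to above is roughly this sketch (osd.3 is an example ID; the exact service command depends on your init system):

ceph osd out 3                  # mark the OSD out so data rebalances off it
# wait for the cluster to finish rebalancing, then stop the daemon
service ceph stop osd.3
ceph osd crush remove osd.3     # remove it from the CRUSH map
ceph auth del osd.3             # delete its authentication key
ceph osd rm 3                   # remove the OSD id from the cluster
# wipe/replace the disk, then create a new OSD in its place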

For network, it depends on your infrastructure and budget. I use dual/bonded 10GbE with direct-attach SFP+ cables from my servers to my redundant switches, and Ceph can max that out under load. If your environment is small, you can get away with bonded 1GbE. However, I'd strongly recommend using Open vSwitch with Proxmox, it makes your life a heck of a lot easier in the long run: http://pve.proxmox.com/wiki/Open_vSwitch
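As a sketch, a bonded setup in /etc/network/interfaces with Open vSwitch looks something like this (interface names and bond mode are examples; the wiki above has full details):

allow-vmbr0 bond0
iface bond0 inet manual
    ovs_bridge vmbr0
    ovs_type OVSBond
    ovs_bonds eth0 eth1
    # balance-slb needs no switch config; use bond_mode=balance-tcp lacp=active with LACP-capable switches
    ovs_options bond_mode=balance-slb

auto vmbr0
allow-ovs vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports bond0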

And yes, any sane deployment will use VLANs to properly segment traffic, regardless of whether you're using virtualization or not.
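With Open vSwitch, giving the host itself an address on a tagged VLAN is just one more stanza, for example (the tag and addresses are placeholders):

allow-vmbr0 vlan10
iface vlan10 inet static
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=10
    address 10.10.10.2
    netmask 255.255.255.0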
 
Setting up is not hard if you happen to use Ansible; Inktank, the makers of Ceph, maintain a nice set of roles and playbooks for it. It's worth diving into some theory, but it mostly just works. To add it to your Proxmox cluster, you copy one file, /etc/ceph/ceph.client.admin.keyring, to /etc/pve/priv/ceph/rbd.keyring and add the storage from the storage tab under Datacenter. That's it.
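In other words, something like this on a cluster node (the keyring filename must match the storage ID you configure, assumed here to be 'rbd'):

mkdir -p /etc/pve/priv/ceph
cp /etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/rbd.keyring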
 
Hello,
thank you for the replies.

Most of my customers have just one VMware ESXi node, so the idea was to bring the cluster power to those small environments (where the purchase of a SAN is not an option) by employing open source technologies and keeping costs as low as possible.

I understand that Ceph is suitable for mid-sized to large deployments (3+ nodes), and I will surely dive into its documentation and spend some time playing with it.
But what can you tell me about smaller deployments where I have only two nodes? As brad_mssw said, it seems that having Ceph running on only 2 nodes could be a bit "hackish" and not a good idea at all, so is DRBD the way to go for such deployments? If yes, how do I handle adding new nodes then?

Thank you
Michele
 
For two nodes, DRBD is the way to go. The problem is that it cannot scale up: you can just use "pairs" of DRBD nodes. HA with just two nodes and DRBD will be an issue (you have to use a quorum disk, qdisk).
Otherwise start with 3 nodes and use Ceph.
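For what it's worth, a minimal DRBD resource definition for such a pair looks something like this (hostnames, disks and addresses are placeholders):

resource r0 {
    protocol C;                 # synchronous replication, needed for safe failover
    on pve1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.0.0.1:7788;
        meta-disk internal;
    }
    on pve2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.0.0.2:7788;
        meta-disk internal;
    }
}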
 
I would make a 2-node Ceph cluster and add a third node just to be a monitor. The 3rd monitor doesn't have to be much, just reasonably reliable; it's only there for quorum. Set the replica size to 2 for now; you can change it later.
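Changing the replica count is a one-liner per pool, e.g. for a pool named 'rbd' (the pool name is an example):

ceph osd pool set rbd size 2        # keep two copies of each object
ceph osd pool set rbd min_size 1    # still serve I/O when only one copy is up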
 
@pixel:
which are the requirements of the monitor node?
I mean, does it need to be as powerful as the others running Proxmox? And does it need the same amount of disks as well?

(I was just wondering if I could use some old machine for that)
 
