Hardware support and iSCSI - home clusters

nsheridan

New Member
May 5, 2014
Hi forum, I want to create a "domestic blade product". I was thinking about using a stackable nano-ITX form factor connected to a gigabit switch, using a combination of boot LUNs (iSCSI) and maybe NFS to run VMs. I need to get the hardware together, cost being the driving factor. Has anyone tried this kind of thing?
I'm getting tired of continually upgrading PCs and would love a "stack as you grow" setup. Maybe running Windows in a terminal server session from my tablet instead, etc.?
 
Since you mentioned "home clusters" in the heading, I am assuming you are planning to set up a cluster for home use.

Instead of combining iSCSI and NFS, you can just set up a 2-node Proxmox+Ceph cluster and use RBD for VM storage. Since budget is an issue, I believe the following setup will suffice for your case without a big budget:

CPU: Intel Core i5-4440
Motherboard: Asus B85M-E/CSM (LGA 1150)
RAM: 8 GB or more
SSD: 64 GB
HDD: 2 x 500 GB

Just use this specification for both nodes and you will be set. You could use an AMD platform to further reduce the cost, but AMD sometimes has issues.
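As a rough back-of-envelope (my own illustrative numbers, not from the spec above): assuming both 500 GB HDDs in each node become Ceph OSDs and the pool replica count ("size") is set to 2 for a two-node cluster, the usable space works out roughly like this:

```python
# Rough usable-capacity estimate for the 2-node Ceph setup above.
# Assumptions (not from the thread): both 500 GB HDDs per node become OSDs,
# the pool replica count is 2, and ~20% headroom is kept so the cluster
# never runs near-full.

def ceph_usable_gb(nodes, osds_per_node, osd_size_gb, replicas, headroom=0.20):
    raw_gb = nodes * osds_per_node * osd_size_gb
    usable_gb = raw_gb / replicas * (1 - headroom)
    return raw_gb, usable_gb

raw, usable = ceph_usable_gb(nodes=2, osds_per_node=2, osd_size_gb=500, replicas=2)
print(f"raw: {raw} GB, usable after replication + headroom: {usable:.0f} GB")
# -> raw: 2000 GB, usable after replication + headroom: 800 GB
```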
 
Why not the ASRock Z87 Extreme6? http://www.asrock.com/mb/intel/z87 extreme6/

The price is double, but you also get dual Intel gigabit NICs (bonding is a must if you are going to have cluster, Ceph and VM communication on the same NICs) and a lot of SATA ports for later expansion:
  • 10 x SATA3, 1 x eSATA, 8 x USB 3.0, 7 x USB 2.0
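To make the "bonding is a must" point concrete, here is a rough estimate of how quickly one shared 1 GbE link fills up. All traffic figures are made-up assumptions for illustration, not measurements:

```python
# Illustrative only: why a single 1 Gbit/s link gets tight when Ceph
# replication and VM traffic share the same NICs.

link_mb_s = 125            # ~1 Gbit/s expressed in MB/s
vm_write_mb_s = 60         # assumed aggregate VM write load hitting Ceph
replicas = 2               # each write is also sent to the replica node
vm_lan_mb_s = 30           # assumed "normal" VM network traffic

replication_mb_s = vm_write_mb_s * (replicas - 1)
total_mb_s = vm_write_mb_s + replication_mb_s + vm_lan_mb_s

print(f"total ~{total_mb_s} MB/s vs {link_mb_s} MB/s per link "
      f"-> needs ~{total_mb_s / link_mb_s:.1f} x 1 GbE")
# With these numbers one link is already saturated, which is why a
# 2 x 1 GbE bond (or separate NICs per role) is worth the extra cost.
```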
 
That is indeed a good motherboard. I was just showing what is possible on a very low budget, since his driving factor was cost. I actually have some setups running on the very Asus B85M motherboard I mentioned. For the price of one ASRock Z87 he could have two Asus B85Ms.

@nsheridan, if you can spend the extra money, the Z87 Extreme6 that mir mentioned above is a really good choice.
 
Great advice, guys. I suppose I am looking for something that is kind of 'greater than the sum of its parts'. Regarding the storage back end, I would really like the machines running the hypervisors to be totally stateless, so I could add another machine and distribute the VMs across more machines. I had not heard of Ceph; I may need to read some documentation. Do you guys use FreeNAS for the storage back end?
 
I personally use OmniOS (an Illumos-based Solaris derivative) with ZFS, with both the native ZFS plugin and NFS for VMs and CTs. For backups I use NFS to a QNAP.
 
Do you get the servers (the servers that run the hypervisors) to boot from an iSCSI-based LUN?
 
What do you mean by "the servers that run the hypervisors"?

If you mean whether the VMs boot from iSCSI-based LUNs, the answer is yes. The Proxmox hosts boot from a local SSD; I only have one disk in each host. All VMs boot and run from the shared storage.
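For reference, a small sketch of how one could check which storage a VM's disks live on from the Proxmox host, by parsing the output of "qm config" (the VM id 100 is a placeholder):

```python
# Sketch: list where a VM's disks live (local vs shared storage).
import re
import subprocess

VMID = "100"  # hypothetical VM id

out = subprocess.run(["qm", "config", VMID], check=True,
                     capture_output=True, text=True).stdout

# Disk lines look like "scsi0: mystorage:vm-100-disk-0,size=32G"
for line in out.splitlines():
    if "media=cdrom" in line:
        continue  # skip CD-ROM/ISO entries
    m = re.match(r"(ide|sata|scsi|virtio)(\d+):\s*([^:,]+):", line)
    if m:
        bus, idx, storage = m.groups()
        print(f"{bus}{idx} -> storage '{storage}'")
```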
 
Sorry, I am only familiar with VMware, so I guess I mean the equivalent of the ESXi server. If I can save on a hard disk in the equivalent of the ESXi server, then even better. Apologies for my ignorance of the product. Newbie alert, please bear with me :)
 
In my opinion, two options exist for a "cheap" home solution:
Using Ceph:
2 nodes configured as suggested by symmcom, but change the motherboard to the ASRock Z87 Extreme6
Using ZFS external storage:
2 nodes configured as suggested by symmcom, but without the extra HDDs
1 node with an i3-4130 and a motherboard like the Asus (16 GB RAM is much better than 8 GB) with 3-4 SATA3 1 TB disks as raidz1 or a striped mirror (RAID10). For best network performance add an extra NIC for bonding. If you opt for the Solaris route, only Intel NICs are worth the effort.
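To compare the two ZFS layouts for the storage node, a quick usable-space estimate with 4 x 1 TB disks (simplified, ignoring ZFS metadata and slop overhead):

```python
# Usable-space comparison for the 4 x 1 TB ZFS storage node suggested above.

disks, size_tb = 4, 1.0

raidz1_usable = (disks - 1) * size_tb   # one disk's worth of parity
mirror_usable = (disks / 2) * size_tb   # striped mirrors (RAID10-like)

print(f"raidz1        : {raidz1_usable:.1f} TB usable, survives 1 disk failure")
print(f"striped mirror: {mirror_usable:.1f} TB usable, better random IO for VMs")
```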
 
Great stuff, many thanks!
 
Don't forget that DRBD also exists. With DRBD you can get:

Pros (with only two PCs in the infrastructure):
1- High availability for the VMs.
2- High availability for the storage: DRBD does synchronous replication of volumes between the PVE nodes (like two hardware storage arrays in high availability).
3- Online migration of VMs.
4- Great read speed for the VMs: the same speed as reading from the local HDDs.
5- Only the writes of the VMs are transmitted over the NIC(s) used by DRBD, not the reads, so DRBD needs much less bandwidth for its communication and works very well. For this reason DRBD is obviously quicker than Ceph or any other storage solution for reads, writes and high availability of the storage.
6- In any case, a single NIC is enough for DRBD to do its work.
7- If you have free HDD space on the PVE nodes, you can always do crossed backups, i.e. PVE node 1 backs up to PVE node 2 and PVE node 2 backs up to PVE node 1 (see the sketch after this list). That way you don't need an extra backup server, and you can also test your backups by restoring them on the same PVE host to be sure they work.
8- Only 2 PVE nodes are needed to get all the features explained above.
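A minimal sketch of the crossed-backup idea in point 7. The VM IDs and the storage name "backup-node2" are hypothetical placeholders; adjust the vzdump options to your setup:

```python
# Each PVE node backs up its VMs to a storage that lives on the other node.
import subprocess

VMIDS = [100, 101]               # VMs running on this node (assumed IDs)
TARGET_STORAGE = "backup-node2"  # an NFS/dir storage defined on the peer node

for vmid in VMIDS:
    # vzdump is the standard Proxmox backup tool; snapshot mode keeps the VM running.
    subprocess.run(
        ["vzdump", str(vmid), "--storage", TARGET_STORAGE,
         "--mode", "snapshot", "--compress", "lzo"],
        check=True,
    )
```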

Cons:
1- DRBD isn't supported by the PVE team, but it has been supported in the Linux kernel for many years and has a long track record.
2- Since DRBD isn't supported by the PVE team, expect a little more manual work.

Recommendations:
1- If you don't want slow read/write access to the virtual disks of your VMs, or to lose the PVE cluster communication, two dedicated NICs in a balance-rr bond (doubling the speed) for the data replication are the best way to keep your VMs from being slow on disk access (see the sketch below). In any case, with DRBD you always get more speed because only the writes are replicated over the NIC(s), while with Ceph or any other storage solution the NICs carry reads, writes (and replication, if high availability of the storage is configured), whether you have one or two 1 Gb/s NICs.
2- You should use Intel NICs, not Realtek (unless you know how to compile the latest driver version and install it into the PVE kernel); with the old Realtek driver in the current PVE kernel you will have many disconnection problems.
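To illustrate recommendation 1 with rough numbers (assumed throughputs, not measurements): reads stay local, and synchronous writes are capped by the slower of the local array and the replication link, so a balance-rr bond raises that cap:

```python
# Back-of-envelope for the DRBD write path with a balance-rr bond.
# All throughput figures below are assumptions for illustration.

local_write_mb_s = 200                 # assumed local RAID/HDD write speed
single_gbe_mb_s = 125                  # one 1 Gbit/s NIC
bonded_rr_mb_s = 2 * single_gbe_mb_s   # balance-rr bond of two NICs (ideal case)

for label, link_mb_s in [("1 x GbE", single_gbe_mb_s),
                         ("2 x GbE balance-rr", bonded_rr_mb_s)]:
    cap = min(local_write_mb_s, link_mb_s)
    print(f"{label:>18}: sync writes capped at ~{cap} MB/s")
# Reads are served from the local disks in both cases, so they are
# unaffected by the replication link speed.
```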

Experiences:
I have been working with DRBD successfully for many years and have never had problems (always with two dedicated NICs per DRBD-backed HDD, and with DRBD tuned afterwards).

@experienced people:
Don't forget that for high availability of the VMs there is also "manual fencing", which lets you fence a PVE host manually whenever you want (human intervention is always required in this case).


Best regards
Cesar
 
Are you suggesting that DRBD can coexist on the same Proxmox node without any issue?
I don't have much experience with DRBD. I deal mostly with enterprise-class environments where uptime, scalability and simplicity are the most important focus. Proxmox+Ceph gives me all three of them, plus more. Whether it is as few as 2 nodes or 50 nodes, it just works.
It is true that DRBD has faster performance than Ceph at small scale, but Ceph's performance increases as the number of HDDs/OSDs in the cluster increases. In a large Ceph environment it outperforms DRBD and other storage solutions.
A split-brain disaster, which is a very damaging situation for DRBD, is almost impossible in Ceph. Ceph can easily expand to several petabytes simply by adding new nodes and HDDs to the cluster. I am not sure that is true for DRBD.

One of the reasons why DRBD is not "officially" supported by the Proxmox team is probably that DRBD does not really fall into the massive enterprise class.
@cesarpk, since you mentioned you have been using DRBD for a very long time, I am interested to see what you think about the points above. This is purely based on my experience and opinion; nothing against DRBD users.
 
I must admit that the ability to add a third member or more is highly desirable. I'd like to buy cheaper hardware, and more of it, as money becomes available and when I need to scale up compute, or even sell off older hardware if I don't need it anymore.
 
I don't know how DRBD handles this, but with Ceph, retiring older hardware and replacing it with new hardware is very simple: add the new node, add the new OSDs, and once the data has been distributed to the new OSDs, take the older OSDs offline one at a time. Even an entire node can be replaced without taking the whole cluster offline, regardless of the cluster size. Many tight-budget environments do it this way, and I believe it is the only logical way to do it.
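A sketch of that retire-an-OSD workflow, driven from a Ceph admin node. The OSD id is a placeholder; handle one OSD at a time and wait for the cluster to rebalance before touching the next one:

```python
# Sketch: take an old OSD out of the cluster, wait for rebalance, then remove it.
import subprocess
import time

OSD_ID = "3"  # hypothetical id of the OSD being retired

def ceph(*args):
    return subprocess.run(["ceph", *args], check=True,
                          capture_output=True, text=True).stdout

# 1. Mark the OSD "out" so Ceph starts migrating its data elsewhere.
ceph("osd", "out", OSD_ID)

# 2. Wait until the cluster reports HEALTH_OK again (rebalance finished).
while "HEALTH_OK" not in ceph("health"):
    time.sleep(60)

# 3. Now it is safe to remove the OSD from the CRUSH map and the cluster.
#    (Stop the OSD daemon on its node before these steps.)
ceph("osd", "crush", "remove", f"osd.{OSD_ID}")
ceph("auth", "del", f"osd.{OSD_ID}")
ceph("osd", "rm", OSD_ID)
```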
 
@symmcom (no problem, I will gladly try to dispel your doubts)
@nsheridan (so you can think about your best strategy)

DRBD 8.4.4 (the latest stable version) doesn't have the features that Ceph or Gluster have, for now. As for speed, DRBD writes are almost as fast as a local hard disk, and reads always run at local-disk speed; please see this link:
http://blogs.linbit.com/p/469/843-random-writes-faster/

I brought up DRBD because nsheridan was asking about a home cluster, so DRBD can give him its benefits with just a few PCs. But I also use it in small businesses, never with more than 12 nodes.

I know that Ceph and Gluster were created with many servers acting as HA storage in mind; in that scenario DRBD can be disappointing, since DRBD reads from the local disk and only synchronizes the writes to the other node, i.e. DRBD isn't made for use at large scale. When version 9.x is stable, I will test its new features.

DRBD can also be used to build storage servers exporting iSCSI, NFS and other NAS-type protocols, which in turn can be used by any type of application server such as PVE.

The new 9.x series of DRBD will have many more options for scaling out:
http://www.drbd.org/home/roadmap/

In conclusion:
DRBD - Cons:
I believe that for growth to a large scale, from about 20 nodes or more, Ceph or Gluster makes more sense; that is where their performance starts to be acceptable with little hardware spending per physical server.

DRBD - Pros:
For example, if you have a large database, or applications with heavy disk reads and writes that need a lot of speed, then without pairs of 10 Gb/s NICs in an LACP (802.3ad) bond as the replication network for each RAID-10 (not to mention the 10 Gb/s switches), and without 15K SAS HDDs configured in RAID-10 with the OS, data and logs each on their own RAID-10, your database or heavy application will not reach optimal speed; with cheap hardware and few nodes you will never get the desired performance. This is where DRBD can shine, with little spending and without sacrificing any performance.

This is what I believe, and I hope it has been helpful. Any comment, complaint, a big shoe thrown at my head, or anything else you like, will be welcome.

Best regards
Cesar
 