After spending some time learning that SAN is the traditional storage route and that Ceph + "hyperconverged" is now actively encouraged since we have the technology, I am a little lost on how to proceed with my home lab. I don't know if using Ceph here would be like trying to shoehorn a new technology into an old design method.
Ideally I would have 4-5 identical VM nodes, all with a tonne of storage and just call it a day. Unfortunately that is extremely expensive. So I have opted for VM nodes + storage nodes in the traditional sense of the hardware.
In this config I was planning two VM servers set up for failover, both with a smallish (8 TB usable) local SSD array for fast access to current projects (media production workloads, i.e. video and audio), then two storage servers with HDD capacity for non-performance-critical use cases, also set up as a failover pair. (I would add a NUC running Kubernetes for quorum.)
The local storage would run ZFS with GlusterFS for replication. The storage servers would either run Ceph or just expose block devices using ZFS over iSCSI.
Now, the issue is that I want to use block storage for LXC, and iSCSI is not supported there. So I figured using Ceph on the storage nodes would be ideal, which of course leads one to think that perhaps all of the nodes should just run Ceph, albeit with two separate data pools (something like the sketch below).
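From my reading of the Ceph docs, splitting the SSDs and HDDs of a single cluster into two pools would come down to CRUSH rules keyed on device class. Here is a minimal, untested sketch of what I have in mind; the pool names, PG counts and the Python wrapper around the ceph CLI are just my own placeholders, not a recommendation:

```python
# Sketch: one Ceph cluster, two pools split by device class.
# Assumes the ceph CLI is available and the OSDs already report ssd/hdd classes.
import subprocess

def ceph(*args):
    """Run a ceph CLI command and raise if it fails."""
    subprocess.run(["ceph", *args], check=True)

# CRUSH rules that only place data on a given device class.
ceph("osd", "crush", "rule", "create-replicated", "fast", "default", "host", "ssd")
ceph("osd", "crush", "rule", "create-replicated", "bulk", "default", "host", "hdd")

# One pool per rule: SSD-backed for current projects, HDD-backed for capacity.
# PG counts (64) are placeholders to be sized for the actual cluster.
ceph("osd", "pool", "create", "fast-projects", "64", "64", "replicated", "fast")
ceph("osd", "pool", "create", "bulk-archive", "64", "64", "replicated", "bulk")

# Tag both as RBD pools so they can back VM/LXC disks.
ceph("osd", "pool", "application", "enable", "fast-projects", "rbd")
ceph("osd", "pool", "application", "enable", "bulk-archive", "rbd")
```

If that is roughly right, the "two data pools" idea is really just two CRUSH rules on one cluster rather than two separate storage systems.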
My question is: is this even possible? Am I approaching Ceph in the right way? In my defense there is a lot to digest.
Can I have my local SSD storage be one Ceph pool and the storage nodes another? Would this indeed be shoehorning a new technology into an old design paradigm? Would it still count as having four real Ceph nodes, i.e. near production standard? Is having two separate data pools even necessary?
What I want is reasonably good redundancy, some quick-access local storage, and capacity for non-performance-critical bulk storage... all on a relative budget. One other consideration is that I need PCI passthrough.