After spending some time learning that SAN is the traditional storage route and that Ceph + "hyperconverged" is now actively encouraged since we have the technology, I am a little lost on how to proceed with my home lab. I don't know if using Ceph here would be like trying to shoehorn a new technology into an old design method.
Ideally I would have 4-5 identical VM nodes, all with a tonne of storage and just call it a day. Unfortunately that is extremely expensive. So I have opted for VM nodes + storage nodes in the traditional sense of the hardware.
In this config I was planning two VM servers set up for failover, both with a smallish (8 TB usable) local SSD array for fast access to current projects (media production workloads, i.e. video and audio), then two storage servers with HDD capacity for non-performance-critical use cases, also set up as a failover pair. (I would add a NUC running Kubernetes for quorum.)
The local storage would run ZFS with GlusterFS for replication. The storage servers would either run Ceph or just expose block devices using ZFS over iSCSI.
Now, the issue is that I want to use block storage for LXC, and iSCSI is not supported there. So I figured using Ceph on the storage nodes would be ideal, which of course leads one to think that perhaps all of the nodes should just run Ceph, albeit with two separate data pools (something like the sketch below).
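From my reading of the Ceph docs, splitting the SSDs and HDDs of a single cluster into two pools would come down to CRUSH rules keyed on device class. Here is a minimal, untested sketch of what I have in mind; the pool names, PG counts and the Python wrapper around the ceph CLI are just my own placeholders, not a recommendation:

```python
# Sketch: one Ceph cluster, two pools split by device class.
# Assumes the ceph CLI is available and the OSDs already report ssd/hdd classes.
import subprocess

def ceph(*args):
    """Run a ceph CLI command and raise if it fails."""
    subprocess.run(["ceph", *args], check=True)

# CRUSH rules that only place data on a given device class.
ceph("osd", "crush", "rule", "create-replicated", "fast", "default", "host", "ssd")
ceph("osd", "crush", "rule", "create-replicated", "bulk", "default", "host", "hdd")

# One pool per rule: SSD-backed for current projects, HDD-backed for capacity.
# PG counts (64) are placeholders to be sized for the actual cluster.
ceph("osd", "pool", "create", "fast-projects", "64", "64", "replicated", "fast")
ceph("osd", "pool", "create", "bulk-archive", "64", "64", "replicated", "bulk")

# Tag both as RBD pools so they can back VM/LXC disks.
ceph("osd", "pool", "application", "enable", "fast-projects", "rbd")
ceph("osd", "pool", "application", "enable", "bulk-archive", "rbd")
```

If that is roughly right, the "two data pools" idea is really just two CRUSH rules on one cluster rather than two separate storage systems.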
My question is: is this even possible? Am I approaching Ceph in the right way? In my defense there is a lot to digest.
Can I have my local SSD storage be one Ceph pool and the storage nodes another? Would this indeed be shoehorning a new technology into an old design paradigm? Would it still count as having four real Ceph nodes, i.e. near production standard? Is having two separate data pools even necessary?
What I want is reasonably good redundancy, some quick-access local storage, and capacity for non-performance-critical bulk storage... all on a relative budget. One other consideration is that I need PCI passthrough.