Hi all.
I've been a Proxmox user since v2.x and am about to roll out a new lab environment using PVE 5.0 and Ceph for some testing. Hardware-wise, I've got 3 nodes spec'd out like this:
Supermicro 2U
dual Xeon E5-2670s
192GB RAM
LSI 9207-8i HBA
Intel X520-DA2
x2 240GB SanDisk SSD PLUS drives
x2 960GB SanDisk Ultra II SSDs
x1 Samsung SM961 256GB NVMe M.2 SSD (in PCIe adapter)
For a 3-node PVE cluster, this hardware will do great. For Ceph, it should do great as well (as monitor nodes). My current plan is to use the two 240GB SSDs in a software RAID 1 for the OS (ZFS mirror, maybe?), the two 960GB SSDs as OSDs, and the 256GB NVMe drive for journals. This SSD-only pool would be used ONLY for LXC and KVM instances. I know consumer/prosumer SSDs aren't ideal, but I don't think my I/O load will be heavy enough to kill these drives in under 3 years, especially with a dedicated journal device.
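For reference, here's the back-of-envelope endurance math I'm working from (just a sketch; the TBW rating, daily write volume, and write amplification factor below are my assumptions, not published specs):

# Rough endurance check for a 960GB SSD OSD (assumed numbers, not specs)
assumed_tbw_tb = 200          # assumed total-bytes-written rating, in TB
daily_writes_gb = 50          # assumed client writes landing on this drive per day
write_amplification = 2       # rough factor for replication/journal overhead

effective_daily_tb = daily_writes_gb * write_amplification / 1000
years = assumed_tbw_tb / effective_daily_tb / 365
print(f"Estimated drive life: {years:.1f} years")   # ~5.5 years with these numbers

With anything close to those numbers the drives should outlast the 3-year window, but obviously the real answer depends on the actual write load.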
So first off, does my plan seem sane? Would 3/2 or 2/1 be recommended from a performance perspective? Space-wise, either should be fine; I just didn't know if a replica count of 3 would be advisable on a 3-node cluster (one that could potentially grow to 4 nodes in the future, but never beyond 4). A 3/2 setup would still yield somewhere around 1.8TB usable, which will likely be plenty of space. However, I have the ability to add 2 more 960GB Ultra IIs to each node (4 total 960GB Ultra II SSD OSDs per node, 12 total in the cluster). Would 4 per node give me a noticeable performance increase, or would 2 SSD OSDs per node be enough for a lab environment? The most I/O-intensive workloads in this cluster would be syslog-ng and a clustered Splunk environment. Going with 12 total 960GB SSD OSDs would be over budget, but doable if needed.
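Here's the rough capacity math behind the ~1.8TB figure, plus the 4-OSDs-per-node option (a sketch; the ~85% practical fill level is just my assumption for rebalancing headroom):

# Usable-capacity sketch for the SSD pool
osd_size_tb = 0.96            # 960GB drives
nodes = 3

for osds_per_node in (2, 4):
    raw_tb = osd_size_tb * osds_per_node * nodes
    for replicas in (3, 2):
        usable_tb = raw_tb / replicas * 0.85   # assume ~85% practical fill level
        print(f"{osds_per_node} OSDs/node, size={replicas}: ~{usable_tb:.2f} TB usable")

# 2 OSDs/node: size=3 -> ~1.63 TB, size=2 -> ~2.45 TB
# 4 OSDs/node: size=3 -> ~3.26 TB, size=2 -> ~4.90 TB

So even with size=3 and only 2 OSDs per node, I'm in the ballpark of what I need.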
Secondly, I'm also considering a second Ceph pool using Seagate 5TB 2.5" drives (5400RPM, slow, consumer-grade), since these servers have 24 2.5" bays each. I'd start with 4 drives per node and expand up to ~16 drives per node (48 5TB drives total). If I were to do this, would I be able to use the SAME 256GB NVMe drive for journals? Meaning, both Ceph pools would share the same journal device. I'd say 20 OSDs per node max, so I figure 10GB of journal per drive. Does that sound doable? I'm trying to plan ahead for the maximum build-out here. This second pool would be used for WORM (Write Once, Read Many) data, such as LXC/KVM backups, user drives, etc. I'd also think 3/2 would be appropriate for this pool.
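And the rough journal-space math for the shared NVMe (again just a sketch using the numbers above):

# Journal-sizing sketch for the shared NVMe
nvme_capacity_gb = 256        # SM961 256GB (nominally ~238 GiB usable)
journal_size_gb = 10
max_osds_per_node = 20

needed_gb = journal_size_gb * max_osds_per_node
print(f"Journal space needed: {needed_gb} GB of {nvme_capacity_gb} GB "
      f"({'fits' if needed_gb <= nvme_capacity_gb else 'does not fit'})")
# -> 200 GB of 256 GB: it fits, but with little headroom left on the device

So it fits on paper, but I realize that would put every OSD journal on a node behind a single consumer NVMe device.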
Network-wise, each node would have two 10Gbit links (SFP+ DAC cables): one dedicated to Ceph, and one dedicated to LXC/KVM traffic. Each node would also have two 1GbE links in a LAG for LAN management (UI access, updates, etc.) as well as Corosync communication.
Anything additional I didn't think to cover, or does this sound like a solid environment (for a lab, considering the non-enterprise storage)?
Thanks!!