Hey all,
Long-time lurker, first-time poster here.
Here's the problem:
- Up to 100 users need to access a central service
- The service consists of a user application running on Windows Server 2019 and a Linux-backed database (Debian 10 with FirebirdDB)
- Those 100 users are spread across different offices throughout Germany
- High availability is required during business hours (Mon-Sat, 8am-6pm); downtime of <15 minutes is acceptable
- The project is still in the design phase; no hardware (server or networking equipment) has been purchased yet
- Administrators are off-site, and it takes at least half a day to physically reach the servers
- Only one office has a proper server room, which is why we are considering colocating the servers in a datacenter. Does Ceph even make sense in this scenario?
The plan for the environment is to use a Proxmox HA Ceph cluster with 3 nodes and a Business Subscription.
Specs per node (edit2: different NICs):
1x CSE-116AC2-R706WB2
1x Supermicro H11DSi-NT, 2x 10Gbit/RJ45 NICs onboard, 2x CPU possible
1x EPYC 7302
2x 64GB RAM 3200MHz, up to 16x DDR4 possible
1x 4TB Intel P4510 U.2 SSD (connected via NVMe) for OSD
2x 480GB Samsung PM883 SSD for Proxmox OS
1x AOC-S25G-m2S, 2x 25Gbit/SFP28
1x Intel i210-T1, 1x 1Gbit/RJ45
The plan is to start with 2 VMs (one on each node) and see how the system behaves. If the service mentioned above (1x Windows Server, 1x Linux server) works flawlessly, we might add more VMs to the cluster (for example, a bare-metal Exchange server that we want to virtualize).
The case can house up to 2x NVMe U.2 and 8x SATA6 2.5" SSDs; at first purchase, 1x NVMe will be populated, plus 2x SATA for the Proxmox OS (RAID1). We first want to scale the cluster vertically (adding a 2nd CPU, more RAM, more SSDs) before going the horizontal route (adding more nodes), due to the high network equipment requirements.
Questions regarding hardware:
QH1:
The XL710-QDA2 will be used to connect each node directly to the others with DAC cables (peer-to-peer, no switch); this will be the cluster network. The 2x 10Gbit onboard NICs will be used for the public/Proxmox/user-client network via a 10G switch. Do I need to add another NIC for a dedicated corosync/heartbeat network, or is the current networking sufficient?
Solution: Add a NIC for a dedicated corosync network (the current setup is not sufficient).
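For context on QH1: Ceph replication multiplies client write traffic on the cluster network (with a size-3 pool the primary OSD forwards every write to two other OSDs), while corosync needs almost no bandwidth but is very sensitive to latency spikes. A minimal back-of-the-envelope sketch, with assumed client write rates and the 25Gbit ports from the spec list above:

```python
# Back-of-the-envelope bandwidth budget for a size=3 replicated pool.
# All numbers are illustrative assumptions, not measurements.

LINK_GBIT_CLUSTER = 25   # assumed: one 25Gbit SFP28 port on the cluster network
REPLICA_SIZE = 3         # pool size

def cluster_traffic_gbit(client_write_gbit: float, size: int = REPLICA_SIZE) -> float:
    """Primary OSD forwards each write to (size - 1) replicas over the cluster network."""
    return client_write_gbit * (size - 1)

for client_write in (1, 5, 10):  # assumed sustained client write rates in Gbit/s
    backend = cluster_traffic_gbit(client_write)
    print(f"client writes {client_write:>2} Gbit/s -> ~{backend:>4.0f} Gbit/s replication "
          f"traffic ({backend / LINK_GBIT_CLUSTER:.0%} of one 25G link)")
```

Even short bursts of saturation like this can push corosync latency past its limits, which is why Proxmox recommends a dedicated (or at least physically separate) corosync link.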
QH2:
If we want to expand the fast NVMe storage, we have to purchase 3x 4TB drives, put one into each node, configure each 4TB NVMe as an OSD, and add it to the ceph_nvme pool. This will double the usable space and the usable IO performance (if the network bandwidth allows it), correct?
Solution: Yes, though the gain is not linear.
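For reference, the capacity side of QH2 is just division by the replica count. A quick sketch, assuming a plain replicated pool with size 3 and ignoring BlueStore overhead and the nearfull headroom you should keep free:

```python
# Usable capacity of a replicated Ceph pool: raw capacity / replica size.
# Sketch only: ignores BlueStore overhead, nearfull/full ratios and uneven PG balance.

def usable_tb(osds_per_node: int, osd_tb: float, nodes: int = 3, size: int = 3) -> float:
    raw_tb = nodes * osds_per_node * osd_tb
    return raw_tb / size

print(usable_tb(osds_per_node=1, osd_tb=4.0))  # initial build: 3x 4TB -> ~4.0 TB usable
print(usable_tb(osds_per_node=2, osd_tb=4.0))  # after upgrade: 6x 4TB -> ~8.0 TB usable
```

The IO side scales with the number of OSDs in the best case, but it is capped by network bandwidth and CPU, which is why the gain is not linear in practice.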
QH3:
Regarding this point:
- Administrators are off-site and it takes at least half a day to physically reach the servers
Questions regarding the project:
QP1:
Will scaling vertically work for this project? My biggest concern is the storage. The database and the user application are not going to use a lot of space (less than 1TB). IO performance is far more important, which is why we want both initial VMs on fast NVMe storage.
Later on we want to populate the empty 6x SATA 2.5" bays with cheaper SATA SSDs (and maintain 2 separate pools, ceph_nvme and ceph_ssd) for future VMs that do not require fast NVMe performance.
QP2:
Initially, each node will run VMs and handle data storage. Later on, if the project is successful and the need for more space comes up, we want to separate storage and compute nodes.
The "dream" would be to upgrade the initial 3 nodes to maximum compute power, pull all the then-installed OSDs, purchase 3 additional storage nodes, populate those with the existing OSDs, throw in 3.5" drives, and add a 3rd pool, ceph_hdd.
We would then have 3x Proxmox hosts with a MON and MGR on each node, plus a "real" Ceph storage cluster with 3x dedicated nodes.
Does this work well with Proxmox? What would be the process of transforming a hyper-converged Proxmox HA Ceph setup into Proxmox hosts with a separate Ceph storage backend?
Questions regarding Ceph:
QC1:
The plan is to run a 3/2 rule (size 3, min_size 2), which allows taking one node down for maintenance (software/hardware upgrades etc.). Good choice? What are the pitfalls? Anyone have a better idea?
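To make the maintenance scenario concrete, here is a tiny sketch of how size=3/min_size=2 behaves on a 3-node cluster, assuming the default host-level failure domain so each node holds exactly one replica of every PG:

```python
# What size=3 / min_size=2 means on a 3-node cluster with a host-level CRUSH rule:
# each PG keeps one replica per node, and IO continues while >= min_size replicas are up.

SIZE, MIN_SIZE, NODES = 3, 2, 3

def pg_state(nodes_down: int) -> str:
    replicas_up = NODES - nodes_down  # one replica per node with 3 nodes and size=3
    if replicas_up >= SIZE:
        return "active+clean (full redundancy)"
    if replicas_up >= MIN_SIZE:
        return "degraded but serving IO (no failure margin left during maintenance)"
    return "IO blocked until a replica returns (below min_size)"

for down in range(NODES):
    print(f"{down} node(s) down -> {pg_state(down)}")
```

Note that with only 3 nodes there is nowhere to rebuild the third replica while a node is down, so the pool stays degraded until that node comes back.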
QC2:
I just started learning Ceph. Any good references besides Red Hat's documentation (books etc.)? I'm currently playing with my Proxmox test lab (3x virtualized Proxmox nodes with CephFS installed, 6GB RAM each) and want to test various scenarios (URE on a node, replacing disks, replacing a node, adding additional nodes, etc.).
QC3:
Regarding this point:
- Only one office has a proper server room, which is why we are considering colocating the servers in a datacenter. Does Ceph even make sense in this scenario?
I'm really grateful for any advice I can soak up.
Hoping to find answers soon.
Fabius
Edit1: typos