hi,
We are planning a Ceph storage cluster to use with PVE 4.2, and after a lot of reading about which hardware we should use for our lab, we are considering the following.
The basic layout:
- 3 x OSD nodes with 10 x consumer SSDs and/or Seagate spinning disks
- 3 x nodes acting as MON and PVE hosts
- Mainboard: Asus Z10PR-D16
- CPU: Intel Xeon E5-2620 v3
- Chassis: RSC-2AH
- Storage SSD cache: Seagate 200GB MLC (ST200FM0053)
- Storage OSD: Crucial MX200 and/or Seagate Constellation disks (we will decide later, once we know how much space/IO we need)
- Memory: 16GB or 32GB DDR4
- SAS controller: LSI MegaRAID 9361-8i in JBOD (RAID0) or direct mode (if the cache is used), with battery pack
- Network: Intel X520-SR2
- Switch: HP Aruba 2920-24 in stacked mode (via stacking module), with HP 10Gbit module
The other three nodes are hardware we already have: some PSSC blades with the same CPU and mainboard and up to 64GB RAM. Only the 10Gbit interfaces for connecting to the Ceph cluster are missing, but for testing, 2x1Gbit in LACP mode should be fine.
One question that comes to mind: is it OK to host VMs and a MON daemon on the same physical host?
All OSD nodes will later be connected via an LACP trunk (so we have 2x10Gbit) to the stacked switches, so if one switch goes down, one link stays up through the other switch.
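Roughly, such a bond looks like this on the PVE side (a minimal sketch; interface names and the address are placeholders, and the switch ports have to be configured as an LACP group as well):

```
# /etc/network/interfaces (excerpt) - 2 x 10Gbit LACP bond for the storage network
auto bond0
iface bond0 inet static
        address 10.10.10.11              # example storage address
        netmask 255.255.255.0
        bond-slaves eth2 eth3            # the two physical 10Gbit ports (placeholder names)
        bond-mode 802.3ad                # LACP
        bond-miimon 100                  # link monitoring interval in ms
        bond-xmit-hash-policy layer3+4   # spread flows across both links
```

A single TCP stream still tops out at one link with this hash policy, but Ceph opens many connections in parallel, so both links get used.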
The main goal is to have shared storage and to replace our iSCSI setup in two racks. We don't have many high-I/O VMs, just normal Debian VMs for our (web) services and backups.
We also want to build everything redundant, so that we can put hosts/switches into maintenance mode, or one of them can fail, without interrupting the production environment (maybe slower, but not stopping).
Is there something wrong with our hardware, or are there other suggestions?
Big update:
I am now posting the hardware we already have in testing here, since it could be useful for others.
We now have:
6 x Ceph nodes on Proxmox 4 (the upgrade comes later)
5 x Proxmox 5 nodes
All of them share the same basics.
The hardware list:
* Chassis: 2U, 24 slots, http://www.aicipc.com/en/productdetail/446
* Chassis backplane: 12Gbit/s, without expander
* Motherboard: Supermicro X10DRi with the new HTML5 IPMI firmware
* CPU: Intel Xeon E5-2620 v4, 2.1GHz
* RAM: 64GB DDR4 ECC registered, 2400MHz
* System disks: 2 x Crucial MX300 250GB
The Ceph nodes have:
* Ceph SSD pool: 6 x Samsung 850 Evo 500GB
* Ceph SATA pool: 6 x WD Red 1TB (only for logs/Elasticsearch, etc.)
* HBA: LSI SAS 9305-16i
* Ceph journal: Intel SSD DC P3700 400GB (journal device for the SSD OSDs)
* Network: Mellanox ConnectX-4 dual-port 100Gb -> Ceph only
The Proxmox nodes have:
* Ceph network: Mellanox ConnectX-4 dual-port 25Gb -> Ceph storage connection
* VM network: Intel dual-port X520 SFP+
* Switch: 3 x Ubiquiti EdgeSwitch 16-XG
Ceph network:
* Switch: 2 x Mellanox MSN2100 (100Gb) 12-port switches
* Cable: 4 x splitter cables, 100Gb -> 4 x 25Gb
* Cable: 12 x 100Gb cables for the Ceph nodes
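With the 100Gb ports carrying only Ceph traffic, the dedicated network is declared in ceph.conf roughly like this (a sketch; the subnets are placeholders, and splitting public and cluster traffic onto separate subnets is optional):

```
# /etc/ceph/ceph.conf (excerpt) - keep Ceph traffic on the dedicated links
[global]
    public network  = 10.10.10.0/24    # clients (Proxmox nodes) talk to MONs/OSDs here
    cluster network = 10.10.20.0/24    # OSD replication and heartbeat traffic
```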
The journal device serves only the SSDs and is a single point of failure: if the journal device dies, every SSD OSD on that node dies with it. But we have the option to change that later.
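For reference, the SSD OSDs are created with their journal on the NVMe device, something like the following (device names are examples, and I am quoting the pveceph syntax from memory, so double-check it):

```
# filestore OSD on one of the Samsung SSDs, journal partition on the P3700 (example devices)
pveceph createosd /dev/sdb -journal_dev /dev/nvme0n1
```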
The X10DRi has enough PCIe slots to add more NVMe devices and another HBA controller if we fill the 12 free storage slots.
Important: the CX-4 has to be assigned to CPU 2 to get full speed; otherwise it reaches only about 25% or less.
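You can check which NUMA node the card hangs off and which cores belong to it like this (the interface name is just an example):

```
# NUMA node of the NIC's PCIe slot (-1 means the platform reports no NUMA info)
cat /sys/class/net/enp129s0f0/device/numa_node

# which CPU cores belong to which NUMA node
lscpu | grep -i numa
```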
We created two Ceph pools:
* ssds -> for all VMs
* sata -> for logs and Elasticsearch
Both are created with 3 replicas and 2048 PGs.
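The 2048 matches the usual rule of thumb of roughly 100 PGs per OSD divided by the replica count, rounded up to a power of two: 36 OSDs per pool x 100 / 3 = 1200 -> 2048. Created roughly like this (plain Ceph commands; the CRUSH rules that keep the ssds pool on the SSDs and the sata pool on the WD Reds are left out here):

```
# replicated pools with 2048 placement groups each
ceph osd pool create ssds 2048 2048 replicated
ceph osd pool create sata 2048 2048 replicated

# 3 replicas, keep serving I/O as long as 2 copies are available
ceph osd pool set ssds size 3
ceph osd pool set ssds min_size 2
ceph osd pool set sata size 3
ceph osd pool set sata min_size 2
```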
We will now test all the details and the handling.
cu denny