Hi All,
So I've recently decided to try Proxmox. I'll be honest: it was primarily the Ceph integration that sold me on it, for the following reasons:
In production we use (and pay licence fees for) a Hitachi VSP, and to be honest it's been very stable over the last two to three years. However, with our data processing requirements changing rapidly, I thought Proxmox could be a potential winner.
At the moment, it's early days in terms of infrastructure planning, etc. I would like to test a couple of ideas and see if it could be a viable option.
Initially I was looking at VMware/vSAN for a similar approach but have been drawn towards Proxmox/Ceph.
Side note: Should the testing look promising, we would be purchasing the Proxmox enterprise subscriptions, but at the moment it's early days and doesn't warrant jumping straight into a subscription.
Production note: the production build will differ from the test environment below mainly in having newer hardware, a faster network and double the number of nodes; the concept stays the same.
Test Environment:
Connectivity: 2 x 10 Gbps independent uplinks
Network: 2 x Cisco Nexus N9K switches (32 x 40 Gbps ports)
Servers: 3 x Dell R730xd (all three share the identical configuration below)
RAM: 128 GB DDR4 2133 MHz
CPU: 2 x Xeon E5-2690 v4 @ 2.60 GHz (28 cores / 56 threads total)
DISKS: 20 x Seagate Exos 1.8 TB 12 Gbps SAS HDDs (ST1800MM0129)
FLASH: 4 x 1 TB PCIe NVMe drives
The storage network is 40GbE via QSFP+ and fibre, from Mellanox ConnectX Pro NICs to the N9K switches.
All disks are presented to Proxmox without any RAID configuration; the servers still have the embedded PERC H730P Mini controllers, but running in HBA mode, and they can be swapped for dedicated HBAs if needed.
So each server has roughly 40 TB of raw storage (36 TB across the HDDs plus 4 TB of NVMe).
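Side note on layout: with this mix of spinners and flash, my working assumption is one OSD per HDD, with the NVMe drives carved up as BlueStore DB/WAL devices. Per disk, something like the following (device names are placeholders and the DB size is just a guess, so treat it as a sketch rather than a tested recipe):

# one OSD per spinner, with its BlueStore DB/WAL on NVMe
# /dev/sdb and /dev/nvme0n1 are placeholder device names
pveceph osd create /dev/sdb --db_dev /dev/nvme0n1 --db_dev_size 60

With 20 HDDs and 4 NVMe drives per node, that works out to about 5 DB partitions per NVMe device.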
My thoughts/goal:
Most online posts talk about high availability and replication; however, I would like to steer away from replication as much as possible, because our use case looks like this:
The cluster will essentially be a dumping ground: a handful of small containers will ingest data into the Ceph cluster throughout the day, and at specific intervals that data will be handed off to more stable, long-term storage (for now, the Hitachi VSP).
So the primary purpose of the Proxmox/Ceph cluster will be to ingest data and do some basic compute tasks, with the results then offloaded to long-term storage.
The point I'm making is that replication adds no value for us and, in reality, becomes a hindrance, wasting significant resources on data that will be short-lived.
Worst case, we lose a couple of hours of data that we could recover with a few hours of manual work. A few hours of manual re-ingestion and some downtime if something fails is a far better deal for us than spending significant resources on a resilient, "highly available" setup.
We will benefit far more from the capacity and performance gained by having no RAID or replication in place.
So my question to the seasoned pros here is...
What would one recommend for such a use case? As far as I can tell there is no such thing as "0 replicas"; the replica count is a pool property, and size=1 (a single copy) is the minimum. Can one simply run pools with size=1? What options are available to maximise performance in exchange for giving up replication?
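For context, what I had in mind was something along these lines (pieced together from the docs rather than tested; the pool name and PG count are made up). My understanding is that recent Ceph releases make you opt in to single-copy pools explicitly:

# allow single-copy pools at all (required since Octopus, as I understand it)
ceph config set global mon_allow_pool_size_one true
# create a replicated pool and drop it to a single copy
ceph osd pool create ingest 128 replicated
ceph osd pool set ingest size 1 --yes-i-really-mean-it
ceph osd pool set ingest min_size 1

Obviously one failed OSD then means losing whatever PGs lived on it, which for our use case is an acceptable trade.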
Or is there a minimal approach that gives the best of both worlds (for example, surviving one drive failure per pool) without dedicating too many resources to replication?
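The "one failure" compromise I keep circling back to is erasure coding: a k=2, m=1 profile across the three hosts should tolerate one host/OSD failure at roughly 1.5x raw usage instead of the 3x of default replication. A sketch (again untested, names made up):

# EC profile: 2 data chunks + 1 coding chunk, spread across hosts
ceph osd erasure-code-profile set ec-2-1 k=2 m=1 crush-failure-domain=host
ceph osd pool create ingest-ec 128 erasure ec-2-1
# needed if RBD or CephFS will write to the EC pool
ceph osd pool set ingest-ec allow_ec_overwrites true

My understanding is that RBD images would still need a small replicated pool for their metadata (the EC pool is passed via --data-pool when creating the image), but the bulk of the data would sit at the 1.5x overhead.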