Ceph Cluster Installation - why fast SSD for OS?

cmonty14

Well-Known Member
Mar 4, 2014
Hi,
in the wiki you provide this recommendation:
"We install Proxmox VE on a fast and reliable enterprise class SSD, so we can use all bays for OSD (Object Storage Devices) data."

Unfortunately there is no further information about this architectural design.

Could you please explain why the OS should be installed on a fast SSD?

THX
 
Hi,

having the OS on a fast enterprise SSD has several advantages:
* a lot more IOPS than on a spinning disk - or any other slow medium; nice, among other things, for the database backing our configuration filesystem (pmxcfs)
* everything runs faster - e.g., package updates, which also issue a lot of fsyncs
* under heavy load, IO bottlenecks can get bad; having the OS, and possibly its swap, on a fast medium (bandwidth-, latency-, and IOPS-wise) can help quite a bit here

And yeah, mainboards now often have something like one M.2 slot and 6 or even 12 SATA 6 Gbps ports. For a Ceph hosting server it can make sense to attach the 6 OSDs there and use a smaller (<256 GB) M.2 NVMe as the OS disk - that's simply convenient, IMO. The remaining fast space could also be used as a temporary backup store to speed up backups, etc.

That said, it clearly isn't a must. Most executables are cached in memory soon after booting, and while pmxcfs profits from faster IOPS, it only causes problems on really, really slow media (e.g., USB keys) or on big setups (> ~12-16 nodes) running at the limit of their resources.
PVE definitely works fine from a spinning disk. At least try to use server-class enterprise hardware; it improves reliability and reduces headaches.

We just try to avoid the case where people use slow hardware and then wonder why things are slow, which comes up more often than one would think.
We also recommend fast hardware because admins can then tell their manager that it is a requirement, with "proof", making it easier to get fast enough hardware instead of barely-fast-enough hardware which, once you add maintenance cost and increased wait times for various operations, costs even more in the long run. :)
 
Hi,
the advantages of SSD vs. spinning disk are clear.
But my question is regarding the special demand of the OS being installed on SSD with Ceph Cluster.

A Ceph object storage node is mainly built of OSDs plus their OSD journals.
For better performance the OSD journal should be moved to an SSD, where a single SSD can serve as the journal for multiple OSDs.
This means one would create multiple partitions (8-12) of ~500 MB on an SSD serving 8-12 OSDs (backed by HDDs).
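As a rough sketch of that layout (the SSD size is an assumption for illustration; the 500 MB per journal is the figure from above), the partition math works out like this:

```python
# Hypothetical sketch: space budget for one SSD serving several
# OSD journal partitions. All sizes are assumptions for illustration.
JOURNAL_SIZE_MB = 500      # per-OSD journal partition, as discussed above
NUM_OSDS = 12              # HDD-backed OSDs sharing this one SSD
SSD_SIZE_MB = 120 * 1000   # an assumed small 120 GB SSD

used_mb = JOURNAL_SIZE_MB * NUM_OSDS
print(f"journal space used: {used_mb} MB of {SSD_SIZE_MB} MB")
# → journal space used: 6000 MB of 120000 MB
```

So even a small SSD has plenty of room for a dozen such journal partitions; capacity is not the limiting factor here, performance and failure domain are (see below).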

Ceph Monitors (monitor nodes) are also performance-critical.

However, this would not explain why the OS must reside on an SSD.

Could you please clarify?

THX
 
For better performance the OSD journal should be moved to an SSD, where a single SSD can serve as the journal for multiple OSDs.

Side note: as all OSDs share it, you must divide its performance by the number of OSDs (to get an estimate). Also, all OSDs sharing the same journal go down if the journal fails.
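As a back-of-the-envelope example of that division (both numbers below are assumptions, not measurements):

```python
# Back-of-the-envelope estimate: a shared journal SSD's write
# bandwidth is split across all OSDs journaling to it.
# Both figures are illustrative assumptions.
ssd_write_mb_s = 450   # assumed sequential write speed of the journal SSD
num_osds = 9           # OSDs sharing this journal device

per_osd_mb_s = ssd_write_mb_s / num_osds
print(f"~{per_osd_mb_s:.0f} MB/s of journal bandwidth per OSD")
# → ~50 MB/s of journal bandwidth per OSD
```

With enough OSDs behind one SSD, each OSD can end up with less journal bandwidth than a plain HDD could sustain on its own, which is exactly why the shared journal is not automatically a win.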

Thus you often do well, maybe even better, with the default, where the journal is on the OSD itself.

However, this would not explain why the OS must reside on a SSD.
Maybe the sentence is phrased confusingly. PVE should always be on a (fast) SSD, if possible and within the setup's budget - not only in Ceph setups...

The quoted sentence from your initial post means, AFAICT: "we try to keep all bays and SATA/SAS ports free for OSDs, so we put the OS on a (now common) M.2 slot and thus do not 'waste' a disk bay for the OS".
 
Question:
What is demanding high I/O performance on the OS disk when running PVE?

Well, if it's also about not wasting a disk bay for the OS, I would consider putting the OS on a device that is specially available on my mainboard: a RAID1 SD card.
Don't ask me why the vendor designed this board that way, but it's there, as this server is built mainly as a storage pool: 8 internal disk bays + 48 HDDs in an external storage box.
 
PVE definitely works fine from a spinning disk. At least try to use server-class enterprise hardware; it improves reliability and reduces headaches.

This :) I've also run Proxmox successfully diskless. For a standalone node I have not seen much benefit from an SSD except faster boot times. That said, the busier the system, the more you'd benefit from a faster location for /var/log, etc.
 
Generally I would prefer to install the OS on a dedicated drive without wasting a SAS slot or disk bay.
Conveniently, this server offers me a dual SD slot with RAID1 for the OS.
All other drives, i.e., 40+ HDDs and 5+ SSDs, are reserved for Ceph OSD usage only.

Would this setup run with Proxmox VE + Ceph?
 
Would this setup run with Proxmox VE + Ceph?
Work? Yes. The question is: for how long :) SD cards are not known to be very write-friendly, and PVE _will_ kill normal SD cards in a matter of months (or even weeks in a cluster).*
I don't know whether there are enterprise-grade SD cards, but the SanDisk Industrial cards may be a good start.

* I have a cluster running one node on a USB stick. It used to be two nodes, but one crashed every two to four weeks. The other one is still running, but I'd still recommend using an HDD or SSD for the PVE installation.
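To get a feel for why consumer SD cards die so quickly under PVE, here is a rough endurance estimate (every number below is an assumption for illustration; real write loads, write amplification, and card endurance vary widely):

```python
# Rough lifetime estimate for an SD card used as a PVE OS disk.
# All figures are illustrative assumptions, not measurements.
card_endurance_tb = 20    # assumed total writes the card can endure
writes_gb_per_day = 10    # assumed OS writes (logs, pmxcfs, RRD data, ...)
write_amplification = 5   # assumed: flash controllers write more than asked

effective_gb_per_day = writes_gb_per_day * write_amplification
lifetime_days = card_endurance_tb * 1000 / effective_gb_per_day
print(f"estimated lifetime: ~{lifetime_days:.0f} days")
# → estimated lifetime: ~400 days
```

Under these (optimistic) assumptions the card lasts barely over a year; with a cheaper card or a busier cluster node the same arithmetic easily lands in the "months or weeks" range mentioned above.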
 
There are SD memory cards available with SLC memory cells.
This technology is more robust and allows far more write cycles than standard flash memory cells.
 
