Compute and Ceph Storage Cluster

oendaps

New Member
Mar 1, 2023
Hi Everyone,

I want to ask about best practice for the 4 nodes I have:

node-1: DL380, dual-proc Xeon Gold, 256 GB RAM
node-2: DL380, dual-proc Xeon Silver, 96 GB RAM
node-3: DL380, single-proc Xeon Silver, 32 GB RAM, 3x 1.92 TB SSD
node-4: DL380, single-proc Xeon Silver, 32 GB RAM, 4x 1.92 TB SSD

All networking is already on 10 Gbit.

I plan to add 1 more node with identical specs to node-3 and node-4.

The existing topology is like this:

Everything is in 1 cluster.


[Image: diagram of the existing single-cluster topology]

And I have a reference for building the HCI like this, with 2 clusters in it: 1 cluster for compute and 1 cluster for Ceph storage.

[Image: diagram of the two-cluster reference topology]

My point is: which one is better to implement based on the nodes I have, 1 cluster mixing compute and Ceph storage, or 2 separate clusters (a compute cluster and a Ceph cluster)? Also, it seems difficult to attach a Ceph storage pool from another cluster, as in pic 2.
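
(Side note on pic 2: attaching a pool from a separate, external Ceph cluster to a PVE compute cluster is possible via an RBD storage entry. A minimal sketch, where the storage id, pool name and monitor IPs are hypothetical examples:)

# /etc/pve/storage.cfg on the compute cluster
rbd: external-ceph
        content images,rootdir
        krbd 0
        monhost 192.168.222.13 192.168.222.14 192.168.222.15
        pool vm-pool
        username admin

# the external cluster's keyring must additionally be copied to
# /etc/pve/priv/ceph/external-ceph.keyring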
 
An all-in-one, hyperconverged PVE + Ceph cluster offers more flexibility. Moreover, when the Ceph storage is hosted by the PVE nodes, it is supported by Proxmox (i.e. the Ceph repository is always fully compatible with the current PVE version).
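
(For illustration, the hyperconverged route is mostly a few pveceph commands; a rough sketch, where the subnet and disk name are hypothetical examples:)

pveceph install                           # on every node that takes part in Ceph
pveceph init --network 192.168.222.0/24   # once, sets the Ceph public network
pveceph mon create                        # on 3 nodes, for a quorate monitor set
pveceph osd create /dev/sdb               # per SSD, on the storage-carrying nodes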
 
Hi Richard, thank you for taking the time to reply.
I see, and maybe it's also more flexible on the budget, I think; when 2 clusters are running we would need more 10 Gbit switches.
Talking about Ceph: I'm confused because when I upgraded the existing cluster from 6.4 to 7, with Octopus Ceph, it said the local Ceph version is too low and showed a health warning on 6 OSDs.

Maybe when I purchase 1 more node I want to start from the beginning: destroy everything and build again on the newest versions of Proxmox and Ceph (after backing up all my VMs to a safe place).

Maybe you have some advice and problem-solving for this case. Regards.

[Images: two screenshots of the cluster status and the Ceph health warning]
 
oendaps said:
Talking about Ceph: I'm confused because when I upgraded the existing cluster from 6.4 to 7, with Octopus Ceph, it said the local Ceph version is too low and showed a health warning on 6 OSDs.
The upgrade to Octopus is not a mystery - see https://pve.proxmox.com/wiki/Ceph_Nautilus_to_Octopus . It has to be done before the upgrade to PVE 7.
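
(Condensed from that wiki page, the rough shape of the procedure - a sketch, not a substitute for reading the full guide:)

ceph osd set noout                     # avoid rebalancing during the upgrade
# switch nautilus to octopus in /etc/apt/sources.list.d/ceph.list, then per node:
apt update && apt full-upgrade
systemctl restart ceph-mon.target      # one monitor node at a time
systemctl restart ceph-mgr.target
systemctl restart ceph-osd.target      # one OSD node at a time
ceph osd require-osd-release octopus   # once all OSDs run Octopus
ceph osd unset noout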

oendaps said:
Maybe when I purchase 1 more node I want to start from the beginning: destroy everything and build again on the newest versions of Proxmox and Ceph (after backing up all my VMs to a safe place).

Of course, that is also possible; in the end it depends on your current situation which option is more convenient for you.
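
(If you do rebuild from scratch, the "back up all VMs" step is a single vzdump run; a sketch, assuming a hypothetical backup storage named backup-nfs:)

vzdump --all --storage backup-nfs --mode snapshot --compress zstd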
 
There is a problem with the way you laid out your networking.

You only seem to have a single network, 192.168.10.0, available to service all traffic types for your Ceph cluster; since that is also your interconnect to the rest of your network, that means you'll have public traffic, cluster traffic, and Ceph public traffic all going over the same network.

You really want those separated; at a minimum, for good, predictable performance, the cluster and Ceph traffic should have 10 Gbit+ to themselves.
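
(For illustration, a minimal sketch of that separation on the Ceph side - both subnets here are hypothetical examples; corosync would get its own link in /etc/pve/corosync.conf on top of that:)

# /etc/ceph/ceph.conf - split client traffic from OSD replication traffic
[global]
        public_network  = 192.168.222.0/24   # Ceph public: clients and monitors
        cluster_network = 192.168.223.0/24   # Ceph cluster: OSD replication only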
 
If you look at my first topology (maybe zoom in), the networks are different, not all on the same network:

For remote access I use 192.168.22.xxx at 1 Gbit speed.

For clustering I use 192.168.111.xxx at 10 Gbit speed.

And for Ceph, I forgot to mention, there is 1 more network, 192.168.222.xxx, and that interface is only on HCI 3 and HCI 4.
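
(Mapped to one node's /etc/network/interfaces, that layout would look roughly like this - interface names and host addresses are hypothetical examples:)

auto eno1
iface eno1 inet static
        address 192.168.22.13/24    # 1G - remote access / management

auto eno2
iface eno2 inet static
        address 192.168.111.13/24   # 10G - PVE cluster (corosync)

auto eno3
iface eno3 inet static
        address 192.168.222.13/24   # 10G - Ceph (currently only on HCI 3 and HCI 4)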
 
