What kind of design should I make for Proxmox?

adarguner

New Member
Jan 26, 2024
Dear Friends,

I am planning a project. I want to use Proxmox as the virtualization technology and design a hyper-converged infrastructure (HCI) with Ceph across 3 servers. I will make this design for a medium-sized company. For example:

This is a factory.
40 virtual machines are running.
There are 500 clients.
ERP software (Oracle or SAP) is running.
There is no big budget.

Should I choose an Intel or an AMD server CPU, and which one do you recommend? Which one is more efficient and less problematic with Proxmox?

Are SAS (10K) disks sufficient for HCI, or should I choose NVMe disks?

If one of the 3 servers breaks down and the remaining capacity is sufficient (the 2 surviving servers can take over the load of the 3rd), will Ceph cause problems?

Are 25G ports required between these 3 servers, or are 10G ports sufficient?
(Perhaps around 5 Gbps of traffic will occur.)

Thank you for your help.

Regards
 
Please be aware of one point: sadly, Proxmox is NOT a SAP-supported hypervisor. Here are the supported ones. I hope someone can put enough pressure on SAP to change this. If you meant the two vendors only as ERP examples, this may of course differ.
 
Does SAP impose a limitation on Proxmox? According to the link you provided, Hyper-V, Citrix Xen, and others are not listed either. So does SAP not support them either?
 
This is correct, they do not support them. SLES with KVM is supported in defined versions. SAP itself is pushing its clients towards SAP-as-a-Service, so there is no love from them for a new KVM distro such as Proxmox. The internet turns up one reference to SAP on Proxmox; it might be SAP ECC or the old "Business One", I do not know. You can use ANYTHING for DEV or TEST, but SAP will deny any tickets and support for unsupported hypervisors/distros, whether on PROD, TEST, DEV, or INT.
 
Thank you for the link you provided. However, the recommended Ceph OSD capacities cannot exceed 30% for 3 nodes. Thus, the other two server nodes carry the disk OSDs of the 3rd node.
I have no idea what is meant by this.

You need a MINIMUM of 3 OSD nodes. The USABLE CAPACITY will be 1/3 of the deployed disk capacity. The reason 3 nodes can be insufficient for optimal operation is that there is no target for rebalance/self-heal in case of a node failure, so you really want AT LEAST 4 nodes. There is also the matter of performance: 3 nodes means that all IO hits the same nodes, limiting the subsystem's capacity to respond to requests. More nodes = more performance.
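
To make the 1/3 figure concrete, here is a rough worked example with hypothetical numbers (one 12 TB OSD per node, replication 3; the 0.85 factor is Ceph's default nearfull warning threshold, not a hard limit):

3 nodes x 12 TB = 36 TB raw capacity
36 TB / 3 replicas = 12 TB usable capacity
12 TB x 0.85 = ~10 TB you can realistically fill before Ceph starts warning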

On that note, make sure your network design accounts for the needs of your cluster (corosync), Ceph private traffic, Ceph public traffic, and your VM traffic.
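
As a sketch only (the NIC names and subnets below are made up for illustration), that separation could look something like this in /etc/network/interfaces on a Proxmox node:

# corosync / Proxmox cluster network
auto eno1
iface eno1 inet static
        address 10.10.10.11/24

# Ceph public network
auto eno2
iface eno2 inet static
        address 10.10.20.11/24

# Ceph cluster (private replication) network
auto eno3
iface eno3 inet static
        address 10.10.30.11/24

# VM traffic bridge
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.11/24
        gateway 192.168.1.1
        bridge-ports eno4
        bridge-stp off
        bridge-fd 0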

Please be aware of one point: sadly, Proxmox is NOT a SAP-supported hypervisor
I don't know if this matters meaningfully, since RHEV, SLES, and Huawei's hypervisors are officially supported (all KVM). As long as you're not opening a ticket for a PVE-specific issue, I don't think it will invalidate your maintenance contract; but SAP being SAP, I'd be careful not to even mention it ;)
 
I don't understand this. Why is a 4th node needed? Shouldn't the capacity be no more than 25% when there are 4 nodes? Wouldn't the usable OSD disk capacity be less in this case? What I mean is: if it is 30% per node for 3 nodes, doesn't that mean a 25% disk and OSD capacity limit for 4 nodes? After all, shouldn't disk space be allocated on the other nodes for each node?

Or should I understand it like this: is the 4th node only used for disk space?
 
I don't understand this.
start reading :) https://docs.ceph.com/en/latest/start/beginners-guide/

Why is a 4th node needed?
when you read the above it will start making sense.

Shouldn't the capacity be no more than 25% when there are 4 nodes? Wouldn't the usable OSD disk capacity be less in this case? What I mean is: if it is 30% per node for 3 nodes, doesn't that mean a 25% disk and OSD capacity limit for 4 nodes? After all, shouldn't disk space be allocated on the other nodes for each node?
No. Storage utilization follows your deployed CRUSH rules, which dictate HOW data is written to disks, no matter how many disks or nodes you deploy. What's cool is that this is defined PER POOL, which means you can have pools with different CRUSH rules using the same disks at the same time. The most common rule for RBD use (which is what you will be deploying) is replication 3, which means each write is sent to three separate OSDs (disks).
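
To see this concretely, these are stock Ceph commands for inspecting and changing replication per pool (the pool name vm-pool and the rule name replicated_rule are placeholders / Ceph defaults, not something from your setup):

ceph osd crush rule ls                                   # list the CRUSH rules that exist
ceph osd pool get vm-pool size                           # show the replica count of a pool
ceph osd pool get vm-pool crush_rule                     # show which rule the pool follows
ceph osd pool set vm-pool crush_rule replicated_rule     # point the pool at a different rule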
 

I apologize, but I asked an AI to verify. It gave me a sample table like the one below. Don't you think it's the right choice and calculation for 3 nodes?

Number of servers: 3
Number of disks: 3 (1 x 12 TB in each server)
Total physical capacity: 36 TB
Usable capacity: 12 TB (3x replication)
Replication: 3x

Commands:
ceph-volume lvm create --data /dev/sdb   # create an OSD on /dev/sdb (run per disk, per node)
ceph osd pool create vm-pool 128         # create the pool with 128 placement groups
ceph osd pool set vm-pool size 3         # 3 replicas
ceph osd pool set vm-pool min_size 2     # keep serving IO with only 2 of 3 replicas available

pveceph pool create vm-pool --size 3     # Proxmox-native alternative to the pool commands above

/etc/pve/storage.cfg:
rbd: ceph-vm-storage
        pool vm-pool
        content images,rootdir
        krbd 0


Durability: no data loss even if 1 server / disk is lost
 
Don't you think it's the right choice and calculation for 3 nodes?
It is (the calculation, that is; the choice is a separate matter, since the number of nodes is an arbitrary number you chose).
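
If you want to double-check that on the live cluster rather than on paper, the standard tools show it directly (nothing is assumed here beyond the pool existing):

ceph df            # raw capacity plus per-pool MAX AVAIL, already adjusted for replication
ceph osd df tree   # utilisation per OSD and per host
pveceph status     # the Proxmox view of overall Ceph health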

Durability: no data loss even if 1 server / disk is lost
Correct, but you're not considering the consequences in their totality. Ceph is designed to maintain its full resilience even in the event of a failure: if an OSD fails, it will AUTOMATICALLY redeploy the contents of that OSD to other surviving OSDs in the pool, as long as the CRUSH rules can still be satisfied.

If a node fails but there are not enough survivors to rebuild, the subsystem will not be able to self-heal and will remain degraded (read: not functioning with full high availability). You may be OK with this; it really depends on your goals.
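
If you want to watch that behaviour happen, the usual commands are (standard Ceph CLI, nothing specific to this thread):

ceph status          # degraded / undersized PGs show up here after a node or OSD failure
ceph health detail   # lists exactly which PGs are affected and why
ceph osd set noout   # optional: before planned maintenance, stop automatic out-marking and rebalancing
ceph osd unset noout # re-enable normal recovery behaviour afterwards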

Non sequitur: be VERY CAREFUL about making decisions based on generative AI. There is no guarantee that what it generates is not an outright hallucination. If it were me, planning to deploy a system with substantial costs attached that provides services my business depends on, I'd make sure I fully understand my decisions on my own.
 
Thanks for the information. I will do some more research and pay attention to your advice. I think it is better to calculate it as "total OSD (disk) capacity / number of nodes". Or is your calculation in the other forum more accurate: max capacity = (number of nodes - 1) * (OSD capacity per node * .08) / nodes?
 
I think it is better to calculate it as "total OSD (disk) capacity / number of nodes". Or is your calculation in the other forum more accurate: max capacity = (number of nodes - 1) * (OSD capacity per node * .08) / nodes?
At the risk of questioning your reading comprehension, I'll restate:
storage utilization follows your deployed CRUSH rules, which dictate HOW data is written to disks, no matter how many disks or nodes you deploy.
The number of nodes is irrelevant to the usable-capacity ratio calculation.
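
Put as a rough formula (a sketch; the 0.85 factor corresponds to Ceph's default nearfull warning threshold, it is not a hard rule):

usable capacity ≈ total raw OSD capacity / replica count
practical fill target ≈ usable capacity x 0.85

With the numbers from earlier in the thread: 36 TB raw / 3 replicas = 12 TB usable, of which roughly 10 TB is a sensible fill target.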