Some information about the system:
It's a hyperconverged cluster of 5 Supermicro AS-1114S-WN10RT servers:
4 of the servers have:
CPU: 128 x AMD EPYC 7702P 64-Core Processor (1 Socket)
RAM: 512 GB
1 of the servers has:
CPU: 64 x AMD EPYC 7502P 32-Core Processor (1 Socket)
RAM: 256 GB
We recently purchased three small servers and need advice on the network setup. What would be recommended for distributing the network services among the adapters/ports? Both adapters are NPAR-capable.
3x Dell 6515 with:
1x AMD 7542
128 GB memory
2x 1GbE NICs
2x 25GbE sfp28 (Broadcom...
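With two NPAR-capable 25GbE ports plus the two 1GbE NICs, one common layout is to keep corosync on the low-traffic 1GbE links (corosync cares about latency, not bandwidth) and bond the 25GbE ports for Ceph and VM traffic. A minimal sketch of /etc/network/interfaces, assuming hypothetical interface names and subnets:

```
# /etc/network/interfaces -- sketch; interface names and subnets are hypothetical
auto eno1
iface eno1 inet static
    address 10.10.10.11/24        # corosync ring0 on first 1GbE port

auto eno2
iface eno2 inet static
    address 10.10.11.11/24        # corosync ring1 on second 1GbE port

auto bond0
iface bond0 inet static
    bond-slaves enp65s0f0 enp65s0f1   # the two 25GbE SFP28 ports
    bond-mode 802.3ad
    address 10.10.20.11/24        # Ceph public/cluster traffic
```

The VM bridge can then ride on a VLAN of bond0 so storage and guest traffic share the fast links while cluster communication stays isolated.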
I have a Ceph cluster with 3 HPE nodes, each with 10x 1TB SAS and 2x 1TB NVMe; the config is below.
The replication and Ceph network is 10Gb, but performance is very low...
In a VM I get (sequential): read 230 MB/s, write 65 MB/s.
What can I do/check to tune my storage environment?
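Sequential reads of 230 MB/s and writes of 65 MB/s over 10Gb links suggest the SAS HDDs, not the network, are the bottleneck. Two things worth checking: whether the NVMe drives are actually serving as DB/WAL devices for the HDD OSDs, and whether replication traffic shares the public network. A minimal ceph.conf sketch separating the networks (subnets are hypothetical):

```
# /etc/ceph/ceph.conf -- sketch: move replication onto its own network
# (subnets are hypothetical; both must already exist on every node)
[global]
public_network  = 10.0.0.0/24
cluster_network = 10.0.1.0/24
```

With 3x replication, every client write is amplified across the cluster network, so separating it often helps write latency more than read throughput.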
This week we have been balancing storage across our 5-node cluster. Everything is going relatively smoothly, but I am getting a warning in Ceph:
"pgs not being deep-scrubbed in time"
This only began happening AFTER we made changes to the disks on one of our nodes; Ceph is still healing properly...
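While a cluster is still healing, deep scrubs compete with backfill for disk time and can simply fall behind schedule, which is often all this warning means. If it persists after healing finishes, the scrub deadline can be relaxed; a hedged ceph.conf sketch (values are illustrative, not recommendations):

```
# /etc/ceph/ceph.conf -- sketch: relax deep-scrub scheduling while healing
[osd]
osd_deep_scrub_interval = 1209600   # 14 days instead of the 7-day default
osd_scrub_sleep = 0.1               # throttle scrub I/O against client load
```

The flagged PGs (listed in `ceph health detail`) can also be deep-scrubbed manually with `ceph pg deep-scrub <pgid>` once backfill is done, which clears the warning sooner.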
Hi, I plan to build my first ceph cluster and have some newbie questions. In the beginning I will start with 5 nodes, and plan to reach 50 nodes.
Those nodes are quite old (E3 CPU, 16 GB RAM, 2x 1Gbps network), so I plan to gain performance by adding more nodes rather than upgrading RAM or CPU.
The autoscaler increased the number of PGs on our Ceph storage (hardware like this, but 5 nodes).
As soon as the backfill starts, the VMs become unusable, and we started killing OSD processes that caused high read I/O load. So, as in this picture, we would kill the ceph-osd process working on...
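Killing OSD processes only marks them down and triggers even more recovery when they return; throttling backfill is usually the safer lever. A sketch with illustrative starting values:

```
# /etc/ceph/ceph.conf -- sketch: throttle backfill so client I/O stays usable
# (values are illustrative starting points, not recommendations)
[osd]
osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_sleep_hdd = 0.1   # seconds to sleep between recovery ops on HDDs
```

These can also be changed at runtime, e.g. `ceph tell 'osd.*' injectargs '--osd_max_backfills 1'`, so the backfill can be slowed without restarting anything.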
I have 2 Ceph nodes, each with a mon and mgr installed. Whenever I shut down any one mon instance on either node, Ceph becomes completely unresponsive until I start that mon again. Is this normal, or can I fix this?
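Yes, this is expected: monitors form a quorum that requires a strict majority, floor(n/2) + 1, to be up, so a 2-mon cluster cannot tolerate losing either monitor. The arithmetic:

```shell
# Ceph monitor quorum requires a strict majority: floor(n/2) + 1 mons up.
for mons in 1 2 3 4 5; do
  echo "$mons mons: need $(( mons / 2 + 1 )) up for quorum"
done
# With 2 mons you need 2 up, so losing one freezes the cluster;
# with 3 mons you need only 2, so one can fail safely.
```

This is why 3 monitors is the usual minimum; adding a mon on a third machine (even a small one) would let the cluster survive a single mon failure.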
I don't know how to fix this; I'm just starting out with Ceph. It just keeps showing active+clean+remapped and doesn't fix itself over time. How do I fix this? I just use the default replication rule for my pools.
I am looking for some guidance to finalize the setup of a 3-node Proxmox cluster with Ceph and shared iSCSI storage. While it's working, I am not really happy with the Ceph cluster's resilience.
Each node has 2x 10GbE ports and 2x 480GB SSDs dedicated to Ceph...
I reconfigured a server from scratch.
Then I installed the Ceph package but cancelled the configuration after the install, so it would pick up the existing cluster's configuration.
Then I made it join the cluster.
Now I cannot configure it with the GUI, and I get 'got timeout (500)'...
I realized that when I delete data from the VMs, the space is not released in Ceph. I found in the documentation that I should run fstrim on the RBD, but I can't find its mount point, e.g.: fstrim /mnt/myrbd.
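An RBD image used as a VM disk is never mounted on the host, so there is no host-side path to fstrim; the trim has to run inside the guest, and the discard option must be enabled on the virtual disk first. A sketch of the relevant VM config line (the VM ID and storage name are hypothetical):

```
# /etc/pve/qemu-server/100.conf -- sketch; VM ID and storage name are hypothetical
scsi0: ceph-rbd:vm-100-disk-0,discard=on
```

With discard enabled (and a VirtIO SCSI controller), running `fstrim -av` inside the guest trims every mounted filesystem, and the freed blocks are released back to the Ceph pool.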
At the moment we have:
6 x Proxmox nodes:
* 2 x 10 cores (2 nodes have 2 x 14 cores)
* 512 GB RAM
* 4 x 10 GbE (2 x 10 GbE LACP for network and corosync, 2 x 10 GbE LACP for storage)
3 x Ceph monitors:
* 4 GB RAM
* 2 x 10 GbE LACP
4 x Ceph OSD nodes:
* 2 x 6-core 2.6 GHz
* 96 GB RAM
* 4 x 10 GbE (2 x...
After I installed Proxmox I decided to tinker around with Ceph. Some things didn't work out, so I removed Ceph from the Proxmox node: after stopping all the Ceph services I removed it with 'pveceph purge'. That worked!
Now when I try to reconfigure Ceph I keep getting this error: "Could...
I have had this issue for a while now, and after upgrading to Proxmox 6 and the new Ceph it is still there.
The problem is that the Ceph display page shows that I have 17 OSDs when I only have 16. It shows the extra one as being down and out. (Side note: I do in fact have one OSD that is down...
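If `ceph osd tree` confirms the extra entry is a leftover with no backing disk, it can be removed from the CRUSH map, the auth database, and the OSD map. A dry-run sketch, where the OSD ID is hypothetical and must be checked against the real phantom entry first:

```shell
# Sketch: remove a leftover OSD entry (the ID 16 is hypothetical --
# confirm the phantom ID with `ceph osd tree` before doing anything).
OSD_ID=16
# Commands are only printed here; drop the leading `echo` to run them for real.
echo ceph osd crush remove "osd.$OSD_ID"
echo ceph auth del "osd.$OSD_ID"
echo ceph osd rm "$OSD_ID"
```

Removing the wrong ID would remove a live OSD, so it is worth double-checking which entry has no host or device behind it before executing these.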
How do you define the Ceph OSD Disk Partition Size?
It always creates the OSD with only 10 GB of usable space.
Disk size = 3.9 TB
Partition size = 3.7 TB
Using *ceph-disk prepare* and *ceph-disk activate* (See below)
OSD created but only with 10 GB, not 3.7 TB
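A common cause of undersized OSDs with ceph-disk is stale partition metadata from a previous use of the disk, which makes prepare reuse old partitions instead of the whole device; zapping the disk before prepare often resolves it. A dry-run sketch, where the device name is a placeholder:

```shell
# Sketch: wipe stale partition metadata before preparing the OSD again.
# /dev/sdX is a placeholder -- double-check the device before running for real,
# since zapping destroys everything on it.
DEV=/dev/sdX
# Commands are only printed here; drop the leading `echo` to execute them.
echo sgdisk --zap-all "$DEV"
echo ceph-disk prepare --bluestore "$DEV"
echo ceph-disk activate "${DEV}1"
```

If the disk was previously part of another OSD or a RAID set, leftover GPT or superblock data is the usual suspect, so checking the partition table with `sgdisk -p` first can confirm the diagnosis.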
Currently all nodes are under load and memory consumption is around 90-95% on each of them.
CEPH cluster details:
* 5 nodes in total, all 5 used for OSDs, 3 of them also used as monitors
* All 5 nodes currently have 64 GB RAM
* OSDs: 12 disks in total per node, 6x 6TB HDD and 6x 500GB SSD
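With 12 OSDs per node, the default BlueStore memory target of roughly 4 GB per OSD already accounts for around 48 GB of the 64 GB, before monitors, the OS, and any VMs; lowering osd_memory_target is the usual lever. A sketch with an illustrative value:

```
# /etc/ceph/ceph.conf -- sketch: cap BlueStore cache per OSD
# (2 GB is illustrative; 12 OSDs x 4 GB default ~= 48 GB on a 64 GB node)
[osd]
osd_memory_target = 2147483648
```

Shrinking the target trades cache hit rate for headroom, so performance may dip slightly, but that is usually preferable to the kernel OOM-killing an OSD under memory pressure.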
Has anyone encountered the same issue as me?
I found one OSD in our production Proxmox Ceph cluster environment that had high apply latency (around 500 ms).
It caused our Ceph cluster performance to degrade. After I restarted the OSD, the cluster performance went back to...
I have 2 PVE nodes and 5 servers as Ceph storage, also built on PVE servers.
So I have two clusters:
1 cluster with 2 PVE nodes, named PROXMOX01 and PROXMOX02.
* PROXMOX01 runs proxmox-ve: 5.3-1 (running kernel: 4.15.18-11-pve) pve-manager: 5.3-11 (running version...