FYI, I've encountered the problem on PVE installations with both Ceph and local disks, so I don't think it's a Ceph-specific issue. Maybe Ceph just makes it surface more easily.
You are correct; I was looking at the latest documentation, which didn't mention "ceph balancer on".
I thought that it was enabled by manually creating a plan, but obviously I was wrong.
So I ran "ceph balancer on" and it immediately started moving a few PGs around, and the usable space grew to...
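For anyone else landing here, a minimal sketch of the commands involved (assuming a Ceph release with the upmap balancer, which Octopus has; the mode choice is my suggestion, not from this thread):

# ceph balancer mode upmap
# ceph balancer on
# ceph balancer status

"ceph balancer status" should then report the active mode and any plan currently executing.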
So, we brought pve02 back online, and changed the pg_num to 2048.
After the rebalancing completed, we gained tons of usable storage!
From 28.1TiB it went to 44.24TiB, which is a huge gain of 16.14TiB!
But still, there are almost 10TiB that are not accounted for.
ceph reports the following raw...
This is Ceph Octopus (15.2.15). We haven't upgraded to a newer version yet, but it's on our todo list.
Ok, we will increase the pg_num first. Can we try that now (while pve02 is down), or should we wait until it's fixed first?
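For reference, the pg_num bump itself is a single pool setting (the pool name below is a placeholder; on Nautilus and later, including Octopus, pgp_num follows automatically):

# ceph osd pool set <pool> pg_num 2048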
I also enabled the balancer, but from what I read in the documentation...
Thanks for the hints.
I've set the autoscaler mode to warn for now, and the target ratio to 1.
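In case it helps anyone reproducing this, both settings map to per-pool options (pool name is a placeholder):

# ceph osd pool set <pool> pg_autoscale_mode warn
# ceph osd pool set <pool> target_size_ratio 1.0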
root@pve01 ~ # pveceph pool ls --noborder
Name Size Min Size PG Num min. PG Num Optimal PG Num PG Autoscale Mode PG Autoscale Target Size PG Autoscale Target Ratio Crush Rule Name...
root@pve01 ~ # ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 162.40396 - 131 TiB 53 TiB 53 TiB 1.0 GiB 149 GiB 77 TiB 40.71 1.00 - root default
-3...
Hello,
We run a PVE cluster of 5 nodes with ceph on each node.
Each node has a number of OSDs, each backed by SSDs of various sizes.
A few months ago the OSDs / SSD drives per node were as follows:
PVE1
4x 3.49TiB (3.84TB)
5x 1.75TiB (1.92TB)
3x 745GiB (800GB)
PVE2
4x 3.49TiB (3.84TB)...
Has anyone noticed any difference switching to "OS control mode" from "Static High Performance"?
I am experiencing the same issue as abzsol when deleting snapshots or images on Ceph: the whole cluster slows down.
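Not an official answer, but the mitigation usually suggested for snaptrim-induced slowdowns is throttling snapshot trimming on the OSDs. A hedged sketch (the values are starting points to tune, not recommendations from this thread):

# ceph config set osd osd_snap_trim_sleep 2
# ceph config set osd osd_max_trimming_pgs 1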
I would prefer an official guide from Proxmox.
All the posts in the forum with "success stories" about moving OSDs are kind of anecdotal.
Half the people say it didn't work for them (me included), and the other half say "great, it worked" without so much as a detailed description of how exactly they made it work...
Can someone post a 100% working workflow for this?
I tried today to move an OSD from one host to another, and it simply wouldn't get recognized by Ceph.
"ceph-volume lvm activate --all" was supposedly successful but osd tree would not move the OSD from the old node to the new.
# ceph-volume lvm...
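For what it's worth, the workflow usually described adds a CRUSH step after activation, because ceph-volume starts the OSD but does not necessarily relocate it in the CRUSH tree. A hedged sketch (OSD id and hostname are placeholders):

# ceph-volume lvm activate --all
# ceph osd tree
# ceph osd crush move osd.<id> host=<newhost>

(On some releases you may need "ceph osd crush create-or-move osd.<id> <weight> host=<newhost>" with an explicit weight instead.) One common culprit is osd_crush_update_on_start being set to false (it defaults to true), in which case the OSD won't update its own CRUSH location on start and the manual move above is needed.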
No, I mean the total space increases.
See my quote in my first post.
These copy/pastes are from the dashboard over a period of 24 hours.
At one point for example it was:
Storage
8.02 TiB of 12.10 TiB
And then at another point later on:
Storage
8.05 TiB of 12.13 TiB
I get that the used...
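If it helps, here is my back-of-the-envelope reading of those totals (an assumption about what the dashboard aggregates, not an official statement). With 3 nodes x 4x 3.8TB SSDs and 3 replicas:

raw:    12 x 3.8 TB = 45.6 TB ≈ 41.5 TiB
usable: 41.5 TiB / 3 replicas ≈ 13.8 TiB

A total around 12.1 TiB would then be consistent with Ceph's MAX AVAIL, which comes in below raw/3 because it projects from the fullest OSD and the full ratio; that projection shifts as data moves, which would also explain the total creeping from 12.10 to 12.13 TiB.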
Thank you but I've already done that.
Still, this doesn't answer my question.
Not which storages are included, but how the included storages are calculated.
Hello,
How does Proxmox calculate the storage usage in the Datacenter Summary section?
I am running 3 nodes, with 4x3.8TB SSDs per node used in a Ceph Cluster (3 replicas - standard/default ceph installation).
I've configured the dashboard to only show the storage for ceph for a single node...
Also, while we're at it, when does Proxmox consider the SMART status "Not passed"?
I've got drives with reallocated sectors, offline uncorrectable sectors, and SMART self-tests with LBA errors, and Proxmox still shows them as SMART: PASSED.
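As far as I can tell (my reading of how smartctl behaves, not of the Proxmox source), the PASSED/FAILED verdict is the drive's own overall-health self-assessment: it only flips when a vendor-defined attribute falls below its failure threshold, so reallocated sectors and failed self-tests alone don't change it. You can compare what the GUI shows against:

# smartctl -H /dev/sdX
# smartctl -A /dev/sdX

The first prints the overall PASSED/FAILED line; the second prints the attribute table with current/worst/threshold values, where the reallocated-sector counts show up.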