SSD for OSDs in my Ceph cluster is reaching its "write" life span. Can I just clone it?

nttec

Active Member
Jun 1, 2016
Can I clone the DB SSD that I use for the OSDs in my Ceph cluster? It is reaching its "write" life span.
 

aaron

Proxmox Staff Member
Staff member
Jun 3, 2019
Never tried it, and it is definitely a "your mileage may vary" situation. Ideally, and that is the beauty of Ceph, you just recreate those OSDs with the new DB SSD. Ceph will do some rebalancing but if you have enough OSDs and Nodes in your cluster, you will never have a reduced redundancy situation.

Ideally, you first set the affected OSDs to out. Wait for Ceph to recreate the data that is on those OSDs somewhere else in the cluster. Once Ceph reports HEALTH_OK, you can stop and destroy those OSDs.
Then recreate them with the new DB SSD and Ceph will rebalance again.
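For reference, the per-OSD replacement could look roughly like this on a Proxmox node. This is a sketch only; the OSD ID and device paths are placeholders you would substitute for your own:

```shell
# Mark the OSD out so Ceph migrates its data to other OSDs
ceph osd out 0

# Watch cluster status until it reports HEALTH_OK again
ceph -s

# Stop and destroy the drained OSD (Proxmox wrapper)
pveceph osd destroy 0

# Recreate it with the DB on the new SSD (example device paths)
pveceph osd create /dev/sdb --db_dev /dev/nvme0n1
```

Repeat for each OSD that has its DB on the worn SSD, waiting for the cluster to settle between steps.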
 

nttec
Thank you for your reply. Have a nice day.
 
Aug 20, 2020
If you have the physical space to add the new SSD _before_ removing the old one, you could even add a new OSD and set the affected OSD out (not down!), then wait for Ceph to rebalance. That would save you the second rebalancing while maintaining full redundancy during the operation.
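A sketch of that add-first variant, again with placeholder OSD ID and device paths:

```shell
# Create the replacement OSD on the new SSD first
pveceph osd create /dev/sdc --db_dev /dev/nvme1n1

# Then mark only the old OSD out (it keeps serving reads while
# Ceph moves its PGs onto the new OSD, so redundancy never drops)
ceph osd out 3

# After the cluster reports HEALTH_OK, remove the old OSD
pveceph osd destroy 3
```

The key point is that the old OSD stays up (not down) while out, so its data remains available during the migration.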
 

nttec
Will keep this in mind. Thank you for your recommendation.
 

nttec
@aaron I have 15 OSDs, with 5 OSDs sharing each DB SSD, and I believe the redundancy is set to 3/2. Is it better to remove 1 OSD at a time, or all 5 OSDs that are on the SSD? To be more clear: 3 Ceph hosts, with 5 OSDs on each host.

@michael.schaefers - any thoughts?
 

nttec
We proceeded with installing the additional SSDs one at a time, and now my additional question is:

Can we set the norebalance and nobackfill flags, destroy the OSDs, re-add them, then unset the flags and have Ceph rebuild once, to prevent multiple rebuilds each time?
 

aaron

Proxmox Staff Member
I would also set the "norecover" flag. Keep an eye on the cluster, because if another OSD fails (let's hope not, but you never know), you are likely to have at least some PGs with only one copy left -> IO blocked for the affected pools. In that situation, you should unset the flags to let Ceph recover and hopefully get you back to 2 copies fast so IO works again.
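The flag-based batch replacement could be sketched like this. Note that the CLI spelling of the recovery flag is `norecover`:

```shell
# Pause data movement while the OSDs are recreated
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover

# ... destroy and recreate the OSDs that share the old DB SSD ...

# Unset the flags so Ceph rebuilds everything in a single pass
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset norebalance
```

While the flags are set, the cluster runs with reduced redundancy on the affected PGs, which is why unsetting them promptly on any further failure matters.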
 
