Migrating Ceph from 3 small OSDs to 1 large OSD per host?

Hello,

I have a 5-node Ceph cluster, each node with 3x 400GB hot-swap drives. I have purchased 5x 3.2TB PCIe drives with the intention of replacing the 15x 400GB drives in the near future.

My migration plan is to bring down each host one by one to bring the 3.2TB drives online. Assuming everything goes smoothly, I should have both the old and the new drives running concurrently. I'm not sure how to do a live move of the data without disruption, though.

Comments appreciated!
 
Usually Ceph will move the data for you.

Before you bring down a host, drain and remove its OSDs. After swapping the disks, create a new OSD on the new NVMe and Ceph will happily use it.

After doing this on all nodes, all your data will be on the new disks.
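
For one node, the rough sequence with the standard ceph/pveceph CLI would look something like this (just a sketch; the OSD IDs 0-2 and /dev/nvme0n1 are placeholders for your actual IDs and device):

ceph osd out 0 1 2                 # mark this host's old OSDs out so Ceph drains them
ceph -s                            # wait until all PGs are active+clean again
systemctl stop ceph-osd@0.service  # stop each drained OSD (repeat for 1 and 2)
pveceph osd destroy 0 --cleanup    # remove it (repeat for 1 and 2)
pveceph osd create /dev/nvme0n1    # after the swap, create the new OSD on the NVMe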
 
But what about the unevenly sized OSDs? I'm replacing 3x 400GB with 1x 3.2TB. I don't have to remove the old disks immediately; they're hot-swap, so I was thinking I would just bring the new disks online and do the migration along the way.
 
What would be the process to power down the machines? Do I simply just power down one machine at a time to replace the drives? Or do I need to prep the pool in some way?
 
Do I simply just power down one machine at a time to replace the drives?
Yes.
Or do I need to prep the pool in some way?
No.

If you have hot-swap capability you can even add them live, as fast as you can click with the mouse. Ceph will spread the data over all disks as soon as it gets access to them.
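
If you want to watch the rebalancing while it runs, the standard status commands are enough (nothing here is specific to your setup):

ceph -s            # overall health plus recovery/backfill progress
ceph osd df tree   # per-OSD utilization and CRUSH weights; larger OSDs automatically get a proportionally larger weight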

In most cases it is advisable to first flash the latest firmware onto the drives using a machine that is not part of the cluster. Then run short tests and benchmarks; in most cases you will also want to reformat the drives to a 4k sector size for better throughput.
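
For the 4k reformat, nvme-cli can do it, assuming the drive actually offers a 4096-byte LBA format (check first; the format index varies by model, and formatting wipes the drive):

nvme id-ns /dev/nvme0n1 -H          # list the supported LBA formats and their index numbers
nvme format /dev/nvme0n1 --lbaf=1   # reformat to the 4096-byte format (index 1 is only an example)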
 
The new drives plug into PCIe slots, so I will have to power down the machines.

I'm terrified of flashing drive firmware, tbh
 
PCIe can do hot-swap in theory, but yeah, the safer approach really is to power down. :)

I'm terrified of flashing drive firmware, tbh
There's no need to be afraid. Micron has a very brief description of the procedure for its NVMe/PCIe drives (https://www.micron.com/content/dam/...ds/7450-ssd/micron-7450-firmwaredownloads.txt). The steps are the same for other drives; only the firmware file differs depending on the model and manufacturer.
Edit: I've done this with all kinds of drives and have even tried "wrong" firmware as a test. It was always rejected, and you can't break anything with it.

There is no general obligation to update the firmware, but some models have or had nasty bugs in their initial firmware that led to total failure after a certain period of time. I would classify that as rather frightening, which is why I generally recommend updating the firmware before the drives are put into production.
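
For reference, the nvme-cli side of an update is just two commands, assuming you have the vendor's firmware image file (the file name here is a placeholder, and the slot/action values depend on the drive, so follow the vendor's notes):

nvme fw-download /dev/nvme0 --fw=firmware_image.bin   # transfer the image to the controller
nvme fw-commit /dev/nvme0 --slot=1 --action=1         # commit it to slot 1 and activate it at the next reset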
 