Hi,
I'm currently in the process of planning a migration of a production
PVE cluster to new hardware, which should be accomplished with minimal
(ideally no) downtime.
The old HW is:
* 3 Nodes
** VMs are stored on Ceph
** 2x Intel E5-2640 v2 / 64GB RAM
** 2x 400GB SSD (System + Ceph Journal)
** 4x 300GB SAS 15kRPM OSD w/ SSD Journal (Filestore)
** 2-port X520 10GbE Ethernet (Ceph + Cluster)
** 2-port GbE (Frontend)
Given the experience from the 8 years running this configuration,
the new HW is slightly different:
* 3 Nodes:
** VMs are stored on Ceph (3 copies)
** 1x Intel Xeon Gold 6134 / 128GB RAM
** 2x 400GB SSD (System + DB/WAL)
** 6x 1.8TB 10kRPM OSD (Bluestore)
** 2-port X710 10GbE Ethernet (Ceph + Cluster)
** 2-port GbE (Frontend)
My idea would be (after burn-in tests of the new HW):
* Install the new nodes (3 copies)
* Add them to the cluster
* For each node:
** Add the new disks as OSDs
** Remove the OSDs from one old node
** Wait until redundancy is re-established
* HA-migrate the VMs to the new nodes
* Migrate (with restarts, of course) the LXC containers
* Remove the old nodes from the cluster
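For reference, the per-node OSD swap could roughly look like the following on the CLI. This is only a sketch: command names are from recent PVE/Ceph versions, and the cluster IP, device paths, and OSD IDs are placeholders, not from my actual setup.

```shell
# Join a new node to the existing cluster (run on the new node;
# the IP is that of an existing cluster member -- placeholder)
pvecm add 192.0.2.10

# On the new node: create Bluestore OSDs with DB/WAL on the SSD
# (device names are examples)
pveceph osd create /dev/sdc --db_dev /dev/sda4

# On one old node: mark its OSDs out and let Ceph rebalance
ceph osd out 0
ceph osd out 1

# Wait until the cluster is healthy again before continuing
watch ceph -s    # proceed once HEALTH_OK and all PGs are active+clean

# Then stop and destroy the drained OSDs
systemctl stop ceph-osd@0
pveceph osd destroy 0
```

Waiting for all PGs to be active+clean between each old node is what keeps the 3-copy redundancy intact throughout.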
In theory this should migrate everything with minimal downtime
(none for the VMs) and minimal risk, as I still have redundancy
for all data given that I'm starting with 3 copies.
I still have some questions however:
* Will this be possible without getting additional subscriptions for the new hosts?
(Given that they only run in parallel for the migration, I would like to avoid that.)
* Are there known problems live migrating between different CPU generations (older to newer)?
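Regarding the CPU question, my understanding is that live migration from older to newer CPUs generally works as long as the VMs use a virtual CPU type both generations support (e.g. the default kvm64 rather than host). A quick check could look like this (the VMID 100 is a placeholder):

```shell
# Show the configured CPU type of a VM (no "cpu:" line means the default, kvm64)
qm config 100 | grep '^cpu'

# Set a baseline CPU type supported by both old and new hosts;
# the change takes effect the next time the VM is stopped and started
qm set 100 --cpu kvm64
```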
PS: I'm aware that the hardware configuration has changed quite a bit; over the 3 years the old
cluster was running, the requirements changed as well.