Migrating from a single-node Proxmox to a clustered Proxmox

jstrauss

New Member
Apr 27, 2023
Hello,

I am planning to move my single-node Proxmox server with its VMs to a new 3-node cluster with a Ceph pool.
My questions are:

1. What is the best way to move the VMs to the corresponding Ceph pools on the cluster with minimal downtime?
2. For testing, is there a way to simply copy a VM from the active server to the cluster without downtime?

Thank you for any advice.
 
You are talking about minimal downtime and no downtime ... are you really ready for Ceph?
Set up your new cluster, copy over snapshotted VMs, rename them (and maybe change static IPs), and stress test your cluster hard until you hit errors.
Become familiar with Ceph and solve those errors on your own. Search this forum for Ceph, read lots of the threads, try to reproduce the problems on your cluster, and make sure you are able to solve most of them too. After that (a few months), think about the migration from your single PVE to your PVE-Ceph cluster again ... and after all that testing you will be able to answer your questions yourself.
 
a new 3-node cluster with a Ceph pool.
@waltar sounds a little bit pessimistic. ;-)

Ceph is great, but it needs some resources above the theoretical minimum to work reliably. I would like to add this:

You plan to have three nodes. That is the absolute minimum for a cluster. Ceph will probably work with the default settings "size=3/min_size=2". (*Never* go below that!)
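If you want to double-check those settings on an existing pool, it looks roughly like this on the CLI (the pool name "vm-pool" is just an example, adapt it to yours):

# check the replication settings of a pool
ceph osd pool get vm-pool size
ceph osd pool get vm-pool min_size
# set them explicitly if needed - never below size=3/min_size=2 for VM data
ceph osd pool set vm-pool size 3
ceph osd pool set vm-pool min_size 2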

The first problem: if each node has only a single OSD, then when (not if) one device or one node fails, Ceph is immediately degraded. There is no room for Ceph to heal itself, so "degraded" becomes permanent. For a stable situation you really want nodes that can jump in and return the cluster to a healthy condition - automatically. For Ceph this means having at least *four* nodes. (In this specific aspect; in other regards you really want five or more...)

Another detail often forgotten: let's say you have those three nodes with two OSDs each. When one OSD fails, its direct neighbor has to take over the data from the dead disk. (That data cannot be given to another node - the only two other nodes already hold a copy!) This means you can fill the OSDs in this picture only up to about 45 percent: the original 45% plus the "other" 45% puts the surviving OSD at 90%. To avoid this problem you want several OSDs per node or - better! - more than three nodes.
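A quick way to keep an eye on that is the per-OSD utilization report (the exact output columns differ a bit between Ceph releases):

# per-OSD usage, so you notice early when a single OSD creeps past ~45-50%
ceph osd df tree
# overall cluster and per-pool usage
ceph df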


Note that Ceph is more critical for the cluster than a local SSD is for one of the three nodes: when Ceph goes read-only, *all VMs in the whole cluster* will stop immediately - they cannot write any data (including log messages, which practically *all* of them do) and will stall.

Network: note that data-to-be-written will go over the wire multiple times before it is considered "written". A fast network is a must, so 10 GBit/s should be considered the minimum. Yes, technically it works with slower speeds - at first, with low load. When high usage leads to congestion, latency goes up and you will encounter "strange" errors which may be hard to debug.
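Before putting real load on the cluster it is worth measuring the Ceph network between every pair of nodes; iperf3 plus a plain ping give a rough baseline (the hostname is a placeholder):

# on the receiving node
iperf3 -s
# on the sending node: throughput over the Ceph network
iperf3 -c node2-cephnet -t 30
# latency matters as much as bandwidth for small synchronous writes
ping -c 100 node2-cephnet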

Regarding SSDs/NVMe: you probably already know the recommendation to use enterprise-class devices. There are reasons (plural) for it, please consider them. (If you go for - let's say - seven nodes with five OSDs each, the quality of the OSDs in a homelab may be lower, but with the bare minimum number of disks they really need to be high quality.)


YMMV! And while I am experimenting with Ceph I am not an expert...
 
No, I didn't mean to be pessimistic. I just want him to prepare for Ceph, as it is not just install-and-forget.
And yeah...
Note that Ceph is more critical for the cluster than a local SSD is for one of the three nodes: when Ceph goes read-only, *all VMs in the whole cluster* will stop immediately - they cannot write any data (including log messages, which practically *all* of them do) and will stall.
 
Yes, I understand that Ceph is not plug-and-play and does not do everything automatically without further investigation and effort.

I will use the cluster for production, so I am running it in my homelab for some months first, for testing and to get more familiar with Ceph.

My current setup (per node) is:
2x 980GB SSDs in RAID-1 for Proxmox
4x 10TB HDDs for the Ceph HDD pool, with WAL/DB on an enterprise NVMe (2 OSDs on 1 TB each)
2x 2TB enterprise NVMe in a separate NVMe Ceph pool

and all of this 3x with enterprise-grade hardware, connected with 10Gbit/s DAC in a ring network, each node connected with 1Gbit/s to the "public net", and IPMI on a separate connection.
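For reference, creating the HDD OSDs with their DB/WAL on the NVMe is roughly this on each node (device paths are just examples; check "pveceph osd create --help" for the exact options of your PVE version):

# one OSD per HDD, DB/WAL placed on the shared enterprise NVMe (paths are examples)
pveceph osd create /dev/sda --db_dev /dev/nvme0n1
pveceph osd create /dev/sdb --db_dev /dev/nvme0n1
# the pure NVMe OSDs for the separate fast pool
pveceph osd create /dev/nvme1n1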

I spent some time googling and reading this forum to get an idea of how to build my cluster, and I think this is a pretty solid and worthwhile setup. I also have spare drives, so if something fails I can replace it directly.

Going to 4 or more nodes will come into question in a few years. For now it is still better than one node with only RAIDZ1 - roughly equivalent redundancy, but with the benefit that a whole node can die without any data being lost directly and with everything mostly still working. It is also important for me to be able to maintain or upgrade a server without downtime.

Which brings me to my questions:
1. How can I copy one of my current production VMs to my homelab cluster without stopping the production VM, so that I can test Ceph performance on the cluster with my real workloads?
2. When I'm ready to move the single-node server's workloads to the cluster, what is the best way to do so with minimal downtime? Should I add the single-node server to the cluster, migrate the VMs, and specify the new Ceph pool as the target storage? Or is there a better way to do this?

(Sorry for my English)

Thanks
 
Should I add the single-node server to the cluster and migrate the VMs to the cluster
Never add a node with VM/CTs to a cluster: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pvecm_join_node_to_cluster
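The rough order (as I read the linked docs) is: the joining node must not carry any guests, so back up and remove them first, then join; something like this (the IP is a placeholder):

# on the node that should join: verify it carries no guests anymore
qm list
pct list
# then join it to an existing cluster member (192.0.2.10 is a placeholder IP)
pvecm add 192.0.2.10
# check quorum and membership afterwards
pvecm status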
1. How can I copy one of my current production VMs to my homelab cluster without stopping the production VM, so that I can test Ceph performance on the cluster with my real workloads?
Why not restore an already existing backup of that VM on the cluster? Then you don't have to touch production at all. Just make sure your test cluster does not share the network with production (to prevent interference while trying things out).
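If no recent backup exists, the classic vzdump-and-restore route looks roughly like this (VMID 100, the storage names and the Ceph pool name are placeholders):

# on the production node: snapshot-mode backup, the VM keeps running
vzdump 100 --mode snapshot --compress zstd --storage local
# copy the resulting archive to a cluster node, e.g. with scp (adjust to the actual file name)
scp /var/lib/vz/dump/vzdump-qemu-100-*.vma.zst root@cluster-node1:/var/lib/vz/dump/
# on the cluster node: restore onto the Ceph-backed storage under a new VMID
qmrestore /var/lib/vz/dump/vzdump-qemu-100-*.vma.zst 9100 --storage ceph-nvme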
Going to 4 or more nodes will come into question in a few years. For now it is still better than one node with only RAIDZ1 - roughly equivalent redundancy, but with the benefit that a whole node can die without any data being lost directly and with everything mostly still working. It is also important for me to be able to maintain or upgrade a server without downtime.
Your new cluster is three times more likely to have a failed node than your current production setup with a single node. And Ceph won't be "mostly working fine" with just two nodes. Best to have a spare node (or two), which you might as well put in the cluster right away.
What will you do when one node fails and a new one is "some years" away? You say you want redundancy, but you're going for the bare minimum instead. I fear you'll get into trouble quickly with this approach.
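For the actual production move later, besides backup/restore it might also be worth reading up on cross-cluster live migration, which recent PVE versions ship as the (still experimental) "qm remote-migrate" command; very roughly like this, where the API token, fingerprint, bridge and storage names are all placeholders - check "man qm" for the exact syntax of your version:

# experimental cross-cluster migration; every value below is a placeholder
qm remote-migrate 100 100 \
  'host=cluster-node1,apitoken=PVEAPIToken=root@pam!migrate=<secret>,fingerprint=<fp>' \
  --target-bridge vmbr0 --target-storage ceph-nvme --online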
 
And have documentation of what to do for the different failure states, so that someone other than you can become familiar with it - you will not always be available, for example while on holiday in another country or unintentionally stuck in hospital for a couple of weeks. That's real life: a disk just died in our PVE cluster over these Christmas days, and the replacement will take some days into January ...
 
Why not restore an already existing backup of that VM on the cluster? Then you don't have to touch production at all. Just make sure your test cluster does not share the network with production (to prevent interference while trying things out).
Yes, I will try this. It's a good idea which I hadn't thought of...

Your new cluster is three times more likely to have a failed node than your current production setup with a single node. And Ceph won't be "mostly working fine" with just two nodes. Best to have a spare node (or two), which you might as well put in the cluster right away.
What will you do when one node fails and a new one is "some years" away? You say you want redundancy, but you're going for the bare minimum instead. I fear you'll get into trouble quickly with this approach.
Logically viewed, yes, that's right. But on my single-node server, if the server fails, everything is down... on the cluster, when one node fails, it's not great, but the servers are still online and I can resolve the issue and replace a node or drives in a reasonably short time. And the odds of that actually happening are quite low with enterprise hardware. (It's possible, but still....) I may think about a fourth node. Or even join my friend's cluster and mine together (because he also wants to put his cluster into the same rack in the DC). Then we would have 6 nodes and could share our infrastructure and computing power.

But do you think running production on a 3-node cluster is really that bad? It's way better than a single server, with redundancy on the server level and not only on the drive level... I will run some production VMs on the cluster, play around with the setup and see how well it copes when a node or drives fail... And maybe in 1-2 months I will get another node or put it straight into the DC.
 
