[SOLVED] Upgrade 3.4 HA cluster to 4.0 via reinstallation with minimal downtime

jampy

Member
Jun 26, 2015
I have a 3-node Proxmox 3.4 HA cluster that has been running for a few months and need to upgrade it to Proxmox 4.0.
I'm not very happy with upgrading Wheezy to Jessie in-place (bad experience), so I'd prefer to reinstall each server one by one with a clean Jessie installation (so I can sleep better afterwards..).

I do have an idea how this could be done with minimal downtime for the VMs (<10 minutes), but please have a look and tell me if there are any pitfalls...

Current situation:
- three Proxmox 3.4 HA nodes (paid subscription)
- Dell PowerEdge R730 Hardware hosted at Hetzner
- 64 GB RAM, 4 TB RAID10 HDD each
- redundant 1 Gbit LAN (192.168.1.x/24)
- GlusterFS used for shared storage (in 3x replication mode), accessed by Proxmox via localhost NFS rather than the native GlusterFS API (see the storage.cfg sketch after this list)
- around 13 VMs, all in HA mode
- no OpenVZ containers
- external Backup Storage, also accessed via NFS
- currently the system load is low and can be handled by a single physical server
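For reference, the relevant entries in /etc/pve/storage.cfg look roughly like this. The storage names, the volume name gv0, the export paths, and the backup server IP are placeholders here, not my exact values:

Code:
nfs: gluster-nfs
        path /mnt/pve/gluster-nfs
        server 127.0.0.1
        export /gv0
        options vers=3
        content images

nfs: backup
        path /mnt/pve/backup
        server 10.0.0.50
        export /backups
        content backup

Gluster's built-in NFS server only speaks NFSv3, hence the "vers=3" option.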

Most VMs are used for internal purposes and could tolerate a longer downtime (max. 1 day), but some are critical and their downtime must be kept to a minimum.


My upgrade plan:
A-1) shut down node #1 (the first that is about to be upgraded)
A-2) remove node #1 from the Proxmox cluster (pvecm delnode "metal1")
A-3) remove node #1 from the Gluster volume/cluster (gluster volume remove-brick ... && gluster peer detach "metal1") - see the command sketch after this list
A-4) install Debian Jessie on node #1, overwriting all data on the HDDs - with the same network settings and hostname as before
A-5) install Proxmox 4.0 on node #1
A-6) install Gluster on node #1 and add it back to the Gluster volume (gluster volume add-brick ...) => shared storage will be complete again (spanning 3.4 and 4.0 nodes)
A-7) configure the Gluster volume as shared storage in Proxmox 4 (node #1)
A-8) configure the external Backup storage on node #1 (Proxmox 4)
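For steps A-2, A-3 and A-6, a rough command sketch (assuming a replica-3 volume named gv0 with bricks at /data/brick on each node; all names are placeholders):

Code:
# A-2: on one of the remaining 3.4 nodes, drop metal1 from the Proxmox cluster
pvecm delnode metal1

# A-3: shrink the Gluster volume from replica 3 to replica 2 and drop the peer
# ("force" is needed because metal1 is already shut down at this point)
gluster volume remove-brick gv0 replica 2 metal1:/data/brick force
gluster peer detach metal1 force

# A-6: after reinstalling metal1, re-add it from an existing Gluster node
# and let self-heal copy the data back onto the fresh brick
gluster peer probe metal1
gluster volume add-brick gv0 replica 3 metal1:/data/brick
gluster volume heal gv0 full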

Then, for each VM (starting with some less-critical ones; command sketch after the list):
B-1) stop the VM in the Proxmox 3.4 cluster
B-2) backup the VM
B-3) restore the VM on the Proxmox 4.0 node (node#1)
B-4) start the VM on node#1
B-5) check if it is working correctly
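In commands, steps B-1 to B-4 would look something like this (VMID 100 and the storage names from above are placeholders; <timestamp> stands for whatever file name vzdump actually produces):

Code:
# B-1/B-2: on the old 3.4 node: stop the VM and back it up to the NFS backup storage
qm stop 100
vzdump 100 --storage backup --mode stop --compress lzo

# B-3/B-4: on the new 4.0 node: restore onto the Gluster storage and start it
qmrestore /mnt/pve/backup/dump/vzdump-qemu-100-<timestamp>.vma.lzo 100 --storage gluster-nfs
qm start 100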

IMHO this should move the VMs from one cluster to "another" while still allowing LAN communication between VMs during the operation, no matter where they run (4.0 VMs talking to 3.4 VMs and vice versa).

The remaining 3.4 HA cluster (still having quorum) would then have no VMs left running, so I can simply shut those nodes down, install Debian Jessie + Proxmox 4.0, and rebuild the cluster.
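Rebuilding the cluster on the 4.0 side should then boil down to something like this (cluster name and IP are placeholders):

Code:
# on the first reinstalled node (metal1)
pvecm create newcluster

# on each node reinstalled afterwards, join using metal1's IP
pvecm add 192.168.1.1

# verify that all three nodes are back and quorate
pvecm status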

Finally, activate HA again and cross fingers.
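Note that Proxmox 4.0 replaced rgmanager with the new ha-manager, so each VM has to be registered with HA again, e.g.:

Code:
# register each VM as an HA resource (repeat per VMID)
ha-manager add vm:100

# check HA resource and node states
ha-manager status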


Do you see any problem with this plan? Please note that I'll have Proxmox 3.4 HA and Proxmox 4.0 on the same subnet - could that cause any unwanted side-effects? Any issues with the subscription key being re-used?

Some tests using VirtualBox were promising, but it's hard to test without real hardware.

Thanks for any hint in advance...
 
No answers?

Anyway, in case someone else stumbles upon this: it works. I just did it and was able to upgrade the cluster by reinstalling each node.

The only problem I had was that multicast didn't want to work at first, even though it had worked before with the Proxmox 3.4 cluster.

With my Netgear 24-port switches I had IGMP snooping enabled under Proxmox 3.4. Turning IGMP snooping off didn't help, but once I re-enabled it, multicast worked again. Perhaps re-enabling it triggered some sort of reinitialization, I don't know.
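For anyone hitting the same issue: multicast between the nodes can be verified with omping (available from the Proxmox/Debian repositories); start it on all nodes at roughly the same time:

Code:
apt-get install omping

# run simultaneously on metal1, metal2 and metal3;
# every node should report multicast responses with ~0% loss
omping metal1 metal2 metal3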
 
