[SOLVED] PVE 6 to 7 Upgrade Pre-check request

neuron

Active Member
Mar 15, 2019
Hello all,

We are running PVE 6.4 and looking to upgrade to 7. I did the 5.4 to 6 upgrade back in 2020 with no issues. This time around, however, there seem to be a few more "gotchas" outside my comfort zone, particularly hardcoding the MAC address of the bridge(s) and possible ZFS/GRUB boot issues. I have read through the 6 to 7 upgrade documentation and run the pve6to7 tool. The only change I have made so far was to the Debian Bullseye security repo. We have 5 hosts in our cluster, but we don't use Ceph at all. We only use ZFS storage for redundancy, with the VMs running locally on each host's dedicated SSDs.

My pve6to7 tool comes up mostly clean, with some updates not being applicable due to Buster vs. Bullseye (right?):

Code:
root@pve1:~# pve6to7
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =

Checking for package updates..
WARN: updates for the following packages are available:
  perl-base, libavformat58, librabbitmq4, libavfilter7, libpocketsphinx3, libsphinxbase3, ocl-icd-libopencl1, ocl-icd-libopencl1, libswresample3, libwbclient0, bind9-host, bind9-libs, dnsutils, bind9-dnsutils, lynx, libpostproc55, samba-libs, libldb2, python3-ldb, samba-common, libavcodec58, libavutil56, ocl-icd-libopencl1, ocl-icd-libopencl1, libswscale5, libsmbclient, smbclient, perl, perl-modules-5.32, libperl5.32

Checking proxmox-ve package version..
PASS: proxmox-ve package has version >= 6.4-1

Checking running kernel version..
PASS: expected running kernel '5.4.151-1-pve'.

= CHECKING CLUSTER HEALTH/SETTINGS =

PASS: systemd unit 'pve-cluster.service' is in state 'active'
PASS: systemd unit 'corosync.service' is in state 'active'
PASS: Cluster Filesystem is quorate.

Analzying quorum settings and state..
INFO: configured votes - nodes: 5
INFO: configured votes - qdevice: 0
INFO: current expected votes: 5
INFO: current total votes: 5

Checking nodelist entries..
PASS: nodelist settings OK

Checking totem settings..
PASS: totem settings OK

INFO: run 'pvecm status' to get detailed cluster status..

= CHECKING HYPER-CONVERGED CEPH STATUS =

SKIP: no hyper-converged ceph setup detected!

= CHECKING CONFIGURED STORAGES =

PASS: storage 'PVE0-NFS' enabled and active.
PASS: storage 'local' enabled and active.
SKIP: storage 'local-lvm' disabled.
PASS: storage 'pve1-vm-storage' enabled and active.
SKIP: storage 'pve2-vm-storage' disabled.
SKIP: storage 'pve3-vm-storage' disabled.
SKIP: storage 'pve4-vm-storage' disabled.

= MISCELLANEOUS CHECKS =

INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for running guests..
WARN: 4 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if the local node's hostname 'pve1' is resolvable..
INFO: Checking if resolved IP is configured on local node..
PASS: Resolved node IP '10.0.1.4' configured and active on single interface.
INFO: Checking backup retention settings..
PASS: no problems found.
INFO: checking CIFS credential location..
PASS: no CIFS credentials at outdated location found.
INFO: Checking custom roles for pool permissions..
INFO: Checking node and guest description/note legnth..
PASS: All node config descriptions fit in the new limit of 64 KiB
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking storage content type configuration..
PASS: no problems found
INFO: Checking if the suite for the Debian security repository is correct..
PASS: already using 'bullseye-security'
SKIP: NOTE: Expensive checks, like CT cgroupv2 compat, not performed without '--full' parameter

= SUMMARY =

TOTAL:    29
PASSED:   21
SKIPPED:  6
WARNINGS: 2
FAILURES: 0

ATTENTION: Please check the output for detailed information!
root@pve1:~#
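
The last SKIP line in the output above notes that the expensive checks only run with '--full'. One way to compare runs across the five nodes is to save the output and tally the result lines. This is a minimal sketch: summarize_pve6to7 and the log path are made-up names for illustration, and it assumes failures are printed with a "FAIL:" prefix in the same style as the PASS/SKIP/WARN lines shown above.

```shell
# Tally pve6to7 result lines by severity from a saved log.
# summarize_pve6to7 is a hypothetical helper, not part of Proxmox VE.
summarize_pve6to7() {
    # $1: path to a file holding saved pve6to7 output
    for level in PASS SKIP WARN FAIL; do
        # grep -c prints 0 when nothing matches, which is the count we want
        printf '%s: %s\n' "$level" "$(grep -c "^${level}:" "$1")"
    done
}

# Example (run on the node itself; the log path is an assumption):
#   pve6to7 --full | tee /root/pve6to7-full.log
#   summarize_pve6to7 /root/pve6to7-full.log
```

Any non-zero WARN or FAIL count is a cue to read that node's full log before upgrading it.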

My biggest concern is booting up post-upgrade. All of my hosts use ZFS exclusively, across all disks and partitions, which I confirmed with these checks:

Code:
root@pve1:~# findmnt /
TARGET SOURCE           FSTYPE OPTIONS
/      rpool/ROOT/pve-1 zfs    rw,relatime,xattr,noacl

root@pve1:~# ls /sys/firmware/efi
ls: cannot access '/sys/firmware/efi': No such file or directory

root@pve0:~# lsblk -o +FSTYPE
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT FSTYPE
sda      8:0    0 223.6G  0 disk            zfs_member
├─sda1   8:1    0  1007K  0 part            zfs_member
├─sda2   8:2    0   512M  0 part            zfs_member
└─sda3   8:3    0 223.1G  0 part            zfs_member
sdb      8:16   0   7.3T  0 disk            
├─sdb1   8:17   0   7.3T  0 part            zfs_member
└─sdb9   8:25   0     8M  0 part            
sdc      8:32   0   7.3T  0 disk            
├─sdc1   8:33   0   7.3T  0 part            zfs_member
└─sdc9   8:41   0     8M  0 part            
sdd      8:48   0   7.3T  0 disk            
├─sdd1   8:49   0   7.3T  0 part            zfs_member
└─sdd9   8:57   0     8M  0 part            
sde      8:64   0   7.3T  0 disk            
├─sde1   8:65   0   7.3T  0 part            zfs_member
└─sde9   8:73   0     8M  0 part            
sdf      8:80   0   7.3T  0 disk            
├─sdf1   8:81   0   7.3T  0 part            zfs_member
└─sdf9   8:89   0     8M  0 part            
sdg      8:96   0   7.3T  0 disk            
├─sdg1   8:97   0   7.3T  0 part            zfs_member
└─sdg9   8:105  0     8M  0 part            
sdh      8:112  0   7.3T  0 disk            
├─sdh1   8:113  0   7.3T  0 part            zfs_member
└─sdh9   8:121  0     8M  0 part            
sdi      8:128  0   7.3T  0 disk            
├─sdi1   8:129  0   7.3T  0 part            zfs_member
└─sdi9   8:137  0     8M  0 part            
sdj      8:144  0   7.3T  0 disk            
├─sdj1   8:145  0   7.3T  0 part            zfs_member
└─sdj9   8:153  0     8M  0 part            
sdk      8:160  0   7.3T  0 disk            
├─sdk1   8:161  0   7.3T  0 part            zfs_member
└─sdk9   8:169  0     8M  0 part            
sdl      8:176  0 223.6G  0 disk            zfs_member
├─sdl1   8:177  0  1007K  0 part            zfs_member
├─sdl2   8:178  0   512M  0 part            zfs_member
└─sdl3   8:179  0 223.1G  0 part            zfs_member
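
The checks above can be folded into one small report per node. This is a minimal sketch, assuming findmnt is available as in the output above; boot_mode and boot_report are hypothetical helper names, and the optional path argument exists only so the function can be exercised away from a real host.

```shell
# Report firmware boot mode: UEFI systems expose /sys/firmware/efi,
# legacy BIOS systems (as in the ls output above) do not.
boot_mode() {
    # $1: optional override for the EFI sysfs path (for testing)
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo "UEFI"
    else
        echo "legacy BIOS"
    fi
}

# Print host name, boot mode and root filesystem type for the current host.
boot_report() {
    printf 'host: %s\n' "$(hostname)"
    printf 'boot mode: %s\n' "$(boot_mode)"
    printf 'root fstype: %s\n' "$(findmnt -n -o FSTYPE /)"
}

# Example: call boot_report on each node before upgrading.
```

A host that reports legacy BIOS together with a zfs root is the combination the ZFS/GRUB concern in this post is about.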

I am also concerned about the need to hardcode MAC addresses. This is the /etc/network/interfaces from one of my hosts:


Code:
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface enp130s0f0 inet manual

iface enp130s0f1 inet manual

iface enp4s0f0 inet manual

iface enp4s0f1 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno2 eno3
        bond-miimon 100
        bond-mode active-backup

auto vmbr0
iface vmbr0 inet static
        address  10.0.1.10
        netmask  255.255.255.0
        gateway  10.0.1.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        
auto vmbr2
iface vmbr2 inet static
        address  10.4.4.10
        netmask  255.255.255.0
        bridge-ports enp130s0f1
        bridge-stp off
        bridge-fd 0
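
For the MAC address concern, one option with ifupdown is an explicit hwaddress line in each bridge stanza. This is a minimal sketch that prints the line to add, reading the bridge's current MAC from sysfs. hwaddress_line and the SYSFS_NET override are hypothetical names for illustration, and whether pinning is needed at all depends on the ifupdown or ifupdown2 version in use, so check the upgrade guide first.

```shell
# Print an "hwaddress" line for a bridge, using its current MAC address,
# so the value can be pasted into the matching iface stanza.
# SYSFS_NET can be overridden for testing; it defaults to the real sysfs path.
hwaddress_line() {
    # $1: interface name, e.g. vmbr0
    addr_file="${SYSFS_NET:-/sys/class/net}/$1/address"
    if [ ! -r "$addr_file" ]; then
        echo "no such interface: $1" >&2
        return 1
    fi
    printf '        hwaddress %s\n' "$(cat "$addr_file")"
}

# Example for the two bridges in the config above:
#   hwaddress_line vmbr0
#   hwaddress_line vmbr2
```

Capturing the values before the upgrade means the bridges can keep the MAC addresses the rest of the network already knows.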

I just want this major upgrade to go really smoothly. Thank you all in advance for your time and support.
 
The only change I have made so far was to the Debian Bullseye security repo.

Shouldn't this be part of the actual upgrade? Why do this up front?

We have 5 hosts in our cluster,....

Then you can start with one and test with some (less important) VMs after the upgrade. If that all works OK, you can proceed with the rest.

My pve6to7 tool comes up mostly clean, with some updates not being applicable due to Buster vs. Bullseye (right?):

I would switch back to the Buster repos and run pve6to7 again.

If all is OK, then upgrade by following the upgrade guide precisely.
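
Before re-running pve6to7, it helps to see which suites the APT sources currently point at, since a mix of Buster and Bullseye entries is what prompted the advice above. This is a minimal sketch: apt_suites is a hypothetical helper, it only reads one-line-style .list files, and the directory argument is there so it can be pointed at a copy of /etc/apt.

```shell
# List the distinct suites named in APT one-line source entries.
# apt_suites is a hypothetical helper for illustration.
apt_suites() {
    # $1: directory holding sources.list and sources.list.d (default /etc/apt)
    dir=${1:-/etc/apt}
    cat "$dir"/sources.list "$dir"/sources.list.d/*.list 2>/dev/null |
        awk '$1 == "deb" || $1 == "deb-src" {
            i = 2
            # Skip an optional [option=value ...] group before the URL
            if ($2 ~ /^\[/) { while (i < NF && $i !~ /\]$/) i++; i++ }
            print $(i + 1)   # field after the URL is the suite
        }' | sort -u
}

# Example: apt_suites            (reads /etc/apt)
```

If the list shows both buster and bullseye suites, the package-update warnings from pve6to7 may not reflect a clean 6.4 baseline.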
 
Well, well, well. I decided to go ahead and upgrade one of our servers which doesn't have any VMs and sure enough the upgrade bombed after a short time:

Code:
Errors were encountered while processing:
 /tmp/apt-dpkg-install-srvgMv/418-xdelta3_3.0.11-dfsg-1+b1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

I then ran apt update and apt upgrade -y again (it found 1 package), and it bombed again.

I rebooted it and it boots to the prompt just fine, but it isn't accessible via the web GUI, SSH, or anything else.

I should note that I did install ifupdown2. I also should note that I did change the repos to Bullseye.

What troubleshooting steps can I take to get this working again?
 
Always use apt dist-upgrade or apt full-upgrade, never plain apt upgrade.
OK, understood. I've always done the upgrades via the web GUI, so I know it was doing dist-upgrade all along. Thanks for the clarification.

I did apt --fix-broken install and it looks like it's working so far.

I noticed that my networking is completely down so hopefully after this install fix goes through, it will work again. Will update.
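
The recovery steps from this exchange can be collected into a dry-run script. This is a minimal sketch, not an official procedure: run and recover_upgrade are hypothetical helpers, the dpkg --configure -a step is an added assumption that was not part of the thread, and nothing executes unless APPLY=1 is set.

```shell
# Dry run by default: print each command; set APPLY=1 to actually execute.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '+ %s\n' "$*"
    fi
}

# Sequence for an interrupted major upgrade (hypothetical helper).
recover_upgrade() {
    run dpkg --configure -a        # assumption: finish configuring unpacked packages
    run apt --fix-broken install   # the step that worked in this thread
    run apt update
    run apt dist-upgrade           # not plain "apt upgrade"
}

recover_upgrade
```

Printing the commands first makes it easy to review the order on a console session before committing to it with APPLY=1.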
 
Ok, it looks like my server is back online and operational! Thank you for the tips and guidance here - really appreciate it.

I'm confident enough to upgrade the rest of my cluster when the time comes.

Going to spend the rest of the night re-doing my first server as a Proxmox Backup Server. Thanks again, all.