3 hosts bricked due to apt upgrade

AngryAdm

Member
Sep 5, 2020
When the updater gets to pve-manager 7.0-11 there is a timeout, a brief dump, and then apt hangs. After a reboot, the system won't start VMs, saying all sorts of nonsense, such as:

TASK ERROR: KVM virtualisation configured, but not available. Either disable in VM configuration or enable in BIOS.

Virtualisation does not suddenly get disabled in the BIOS on a system with no keyboard or mouse attached...



[ 16.655215] BPF: type_id=104829 bits_offset=256
[ 16.655224] BPF:
[ 16.655226] BPF:Invalid name
[ 16.655229] BPF:

[ 16.655232] failed to validate module [irqbypass] BTF: -22
[ 16.655445] BPF:[104824] STRUCT
[ 16.655452] BPF:size=96 vlen=1
[ 16.655471] BPF:
[ 16.655474] BPF:Invalid name
[ 16.655476] BPF:

[ 16.655479] failed to validate module [cryptd] BTF: -22
[ 16.655777] BPF:[104825] ENUM ves
[ 16.655784] BPF:size=4 vlen=20
[ 16.655785] BPF:
[ 16.655786] BPF:Invalid name
[ 16.655787] BPF:

[ 16.655789] failed to validate module [intel_rapl_common] BTF: -22
[ 16.763378] BPF: type_id=104829 bits_offset=256
[ 16.763388] BPF:
[ 16.763390] BPF:Invalid name
[ 16.763391] BPF:

[ 16.763395] failed to validate module [irqbypass] BTF: -22
[ 16.763570] BPF:[104824] ENUM SV
[ 16.763595] BPF:size=4 vlen=4
[ 16.763599] BPF:
[ 16.763600] BPF:Invalid name
[ 16.763601] BPF:

[ 16.763605] failed to validate module [edac_mce_amd] BTF: -22
[ 16.839447] BPF:[104825] ENUM ves
[ 16.839456] BPF:size=4 vlen=20
[ 16.839459] BPF:
[ 16.839461] BPF:Invalid name
[ 16.839462] BPF:

[ 16.839466] failed to validate module [intel_rapl_common] BTF: -22
[ 16.839748] BPF:[104824] ENUM SV
[ 16.839756] BPF:size=4 vlen=4
[ 16.839758] BPF:
[ 16.839777] BPF:Invalid name
[ 16.839779] BPF:

[ 16.839782] failed to validate module [edac_mce_amd] BTF: -22
[ 16.903521] BPF:[104825] ENUM ves
[ 16.903528] BPF:size=4 vlen=20
[ 16.903529] BPF:
[ 16.903530] BPF:Invalid name
[ 16.903531] BPF:

[ 16.903533] failed to validate module [intel_rapl_common] BTF: -22
[ 22.684797] BPF:[104825] ENUM oups
[ 22.684804] BPF:size=4 vlen=11
[ 22.684805] BPF:
[ 22.684806] BPF:Invalid name
[ 22.684807] BPF:

[ 22.684809] failed to validate module [nfnetlink] BTF: -22
[ 22.690811] audit: type=1400 audit(1632931359.995:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=3326 comm="apparmor_parser"
[ 22.692522] audit: type=1400 audit(1632931359.999:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=3322 comm="apparmor_parser"
[ 22.692730] audit: type=1400 audit(1632931359.999:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=3327 comm="apparmor_parser"
[ 22.692739] audit: type=1400 audit(1632931359.999:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=3327 comm="apparmor_parser"
[ 22.692744] audit: type=1400 audit(1632931359.999:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=3327 comm="apparmor_parser"
[ 22.693037] audit: type=1400 audit(1632931359.999:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=3323 comm="apparmor_parser"
[ 22.693043] audit: type=1400 audit(1632931359.999:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=3323 comm="apparmor_parser"
[ 22.694638] audit: type=1400 audit(1632931359.999:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/chronyd" pid=3320 comm="apparmor_parser"
[ 22.697145] audit: type=1400 audit(1632931360.003:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="tcpdump" pid=3321 comm="apparmor_parser"
[ 22.698374] audit: type=1400 audit(1632931360.003:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default" pid=3324 comm="apparmor_parser"
[ 22.702134] Process accounting resumed
[ 22.787117] BPF:[104823] FUNC
[ 22.787126] BPF:type_id=1213
[ 22.787127] BPF:
[ 22.787129] BPF:Invalid name
[ 22.787130] BPF:

[ 22.787133] failed to validate module [softdog] BTF: -22
[ 22.837136] BPF:[104831] FUNC
[ 22.837143] BPF:type_id=1213
[ 22.837144] BPF:
[ 22.837146] BPF:Invalid name
[ 22.837147] BPF:

[ 22.837149] failed to validate module [tls] BTF: -22
[ 23.083520] BPF:[104831] FUNC
[ 23.083527] BPF:type_id=1213
[ 23.083528] BPF:
[ 23.083529] BPF:Invalid name
[ 23.083530] BPF:

[ 23.083533] failed to validate module [tls] BTF: -22
[ 23.196844] vmbr0: port 1(enp4s0f0) entered blocking state
[ 23.196854] vmbr0: port 1(enp4s0f0) entered disabled state
[ 23.196935] device enp4s0f0 entered promiscuous mode
[ 23.254572] pps pps1: new PPS source ptp1
[ 23.254627] ixgbe 0000:04:00.0: registered PHC device on enp4s0f0
[ 23.548494] BPF:[104831] FUNC
[ 23.548501] BPF:type_id=1213
[ 23.548502] BPF:
[ 23.548503] BPF:Invalid name
[ 23.548504] BPF:

[ 23.548506] failed to validate module [tls] BTF: -22
[ 23.712412] BPF:[104823] STRUCT
[ 23.712419] BPF:size=24 vlen=5
[ 23.712421] BPF:
[ 23.712422] BPF:Invalid name
[ 23.712423] BPF:

[ 23.712425] failed to validate module [bpfilter] BTF: -22
[ 23.722476] BPF:[104835] STRUCT
[ 23.722483] BPF:size=16 vlen=4
[ 23.722485] BPF:
[ 23.722486] BPF:Invalid name
[ 23.722488] BPF:

[ 23.722491] failed to validate module [scsi_transport_iscsi] BTF: -22
[ 23.811188] BPF:[104898] STRUCT
[ 23.811196] BPF:size=8 vlen=2
[ 23.811198] BPF:
[ 23.811199] BPF:Invalid name
[ 23.811201] BPF:

[ 23.811203] failed to validate module [x_tables] BTF: -22
[ 24.929347] BPF:[104826] STRUCT
[ 24.929354] BPF:size=48 vlen=10
[ 24.929355] BPF:
[ 24.929356] BPF:Invalid name
[ 24.929357] BPF:

[ 24.929359] failed to validate module [udp_tunnel] BTF: -22
[ 25.384047] BPF:[104823] STRUCT
[ 25.384055] BPF:size=24 vlen=5
[ 25.384056] BPF:
[ 25.384057] BPF:Invalid name
[ 25.384058] BPF:
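
A log like the one above is easier to read once it is condensed to one line per module. The following is a minimal sketch, not an official Proxmox tool: `btf_failures` is a made-up helper name, and it only assumes the `failed to validate module [...] BTF` wording seen in the log. The `-22` is errno 22 (EINVAL). Since `kvm` depends on `irqbypass`, a failure to load `irqbypass` would plausibly fit the "KVM virtualisation configured, but not available" error, though that is an assumption and not something the log proves.

```shell
# btf_failures: hypothetical helper, reads kernel log text on stdin and
# prints each module that failed BTF validation once, with a count,
# most frequent first.
btf_failures() {
    sed -n 's/.*failed to validate module \[\([^]]*\)] BTF.*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Example on a live host (not run here):
#   dmesg | btf_failures
```

If nearly every module that tries to load shows up in the list, that points towards a mismatch between the running kernel and the module files on disk rather than towards one broken driver.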
 
So far so good, and (bad)

root@pve01:~# apt-get dist-upgrade
E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem.
root@pve01:~# dpkg --configure -a
Setting up proxmox-backup-file-restore (2.0.10-1) ...
Updating file-restore initramfs...
11516 blocks
Setting up librados2 (16.2.6-pve2) ...
dpkg: error processing package pve-kernel-5.11.22-4-pve (--configure):
package is in a very bad inconsistent state; you should
reinstall it before attempting configuration
Setting up librgw2 (16.2.6-pve2) ...
Setting up python3-ceph-argparse (16.2.6-pve2) ...
Setting up libpve-common-perl (7.0-9) ...
Setting up libcephfs2 (16.2.6-pve2) ...
Setting up tzdata (2021a-1+deb11u1) ...

Current default time zone: 'Europe/Copenhagen'
Local time is now: Wed Sep 29 18:11:11 CEST 2021.
Universal Time is now: Wed Sep 29 16:11:11 UTC 2021.
Run 'dpkg-reconfigure tzdata' if you wish to change it.

Setting up libjaeger (16.2.6-pve2) ...
Setting up proxmox-backup-client (2.0.10-1) ...
Setting up libradosstriper1 (16.2.6-pve2) ...
Setting up python3-ceph-common (16.2.6-pve2) ...
Setting up librbd1 (16.2.6-pve2) ...
Setting up ceph-mgr-modules-core (16.2.6-pve2) ...
Setting up ceph-fuse (16.2.6-pve2) ...
Setting up python3-rados (16.2.6-pve2) ...
Setting up python3-rbd (16.2.6-pve2) ...
Setting up python3-rgw (16.2.6-pve2) ...
Setting up python3-cephfs (16.2.6-pve2) ...
Setting up ceph-common (16.2.6-pve2) ...
Setting system user ceph properties..usermod: no changes
..done
Fixing /var/run/ceph ownership....done
Setting up ceph-base (16.2.6-pve2) ...
Setting up ceph-mds (16.2.6-pve2) ...
Setting up ceph-mgr (16.2.6-pve2) ...
Setting up ceph-osd (16.2.6-pve2) ...
Setting up ceph-mon (16.2.6-pve2) ...
Setting up ceph (16.2.6-pve2) ...
Processing triggers for libc-bin (2.31-13) ...
Processing triggers for pve-manager (7.0-11) ...
Processing triggers for man-db (2.9.4-2) ...
Processing triggers for pve-ha-manager (3.3-1) ...
Errors were encountered while processing:
pve-kernel-5.11.22-4-pve
root@pve01:~#
 
root@pve01:~# apt-get dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
Need to get 0 B/75.6 MB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n]
dpkg: error processing package pve-kernel-5.11.22-4-pve (--configure):
package is in a very bad inconsistent state; you should
reinstall it before attempting configuration
Errors were encountered while processing:
pve-kernel-5.11.22-4-pve
E: Sub-process /usr/bin/dpkg returned an error code (1)
root@pve01:~# apt-get remove pve-kernel-5.11.22-4-pve
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be REMOVED:
proxmox-ve pve-kernel-5.11 pve-kernel-5.11.22-4-pve
0 upgraded, 0 newly installed, 3 to remove and 0 not upgraded.
1 not fully installed or removed.
After this operation, 392 MB disk space will be freed.
Do you want to continue? [Y/n]
W: (pve-apt-hook) !! WARNING !!
W: (pve-apt-hook) You are attempting to remove the meta-package 'proxmox-ve'!
W: (pve-apt-hook)
W: (pve-apt-hook) If you really want to permanently remove 'proxmox-ve' from your system, run the following command
W: (pve-apt-hook) touch '/please-remove-proxmox-ve'
W: (pve-apt-hook) run apt purge proxmox-ve to remove the meta-package
W: (pve-apt-hook) and repeat your apt invocation.
W: (pve-apt-hook)
W: (pve-apt-hook) If you are unsure why 'proxmox-ve' would be removed, please verify
W: (pve-apt-hook) - your APT repository settings
W: (pve-apt-hook) - that you are using 'apt full-upgrade' to upgrade your system
E: Sub-process /usr/share/proxmox-ve/pve-apt-hook returned an error code (1)
E: Failure running script /usr/share/proxmox-ve/pve-apt-hook
root@pve01:~#
 
After a reboot, the host did not come back up.

I guess I know what happens if I reboot the other two hosts...
Time to waste a few days fixing this mess!

At least the network interfaces on the VMs have not gone missing like the last time something exciting happened.
 
Is your ifupdown2 working? The upgrade failed while writing that package for me and others.
 
Yeah, it's working. What I am pondering is the following on an un-upgraded host:

root@pve05:~# uname -r
5.11.22-4-pve
root@pve05:~# apt-get dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
libpve-common-perl proxmox-backup-client proxmox-backup-file-restore pve-container pve-kernel-5.11.22-4-pve qemu-server tzdata
7 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 81.5 MB of archives.
After this operation, 760 kB disk space will be freed.
Do you want to continue? [Y/n]


Why am I presented with the option to upgrade to the kernel I already have? I know that if I do this, the system will be screwed :/
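
One likely explanation: `uname -r` prints the kernel ABI name (5.11.22-4-pve), which is also part of the package name, while the package carries its own version. The download line later in this thread shows `pve-kernel-5.11.22-4-pve` at package version 5.11.22-9, so the same-named kernel can still receive an update. A small sketch to compare the two, assuming the usual `Installed:` / `Candidate:` lines of `apt-cache policy`; `policy_versions` is a made-up helper name.

```shell
# policy_versions: hypothetical helper, reads `apt-cache policy <pkg>` output
# on stdin and prints "<installed> -> <candidate>".
policy_versions() {
    awk '$1 == "Installed:" { inst = $2 }
         $1 == "Candidate:" { cand = $2 }
         END { print inst " -> " cand }'
}

# Example on a live host (not run here):
#   apt-cache policy pve-kernel-5.11.22-4-pve | policy_versions
```

If the two versions differ, apt is offering a new build of the same kernel ABI, not a different kernel.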

PS: I fixed this on my three hosts by editing /boot/grub/grub.cfg.
I replaced all instances of pve-ker.......22-4 with 22-3 on the hosts that had -3 installed.
One host had 22-1 installed, so I used that there instead.
They now boot.
Before the fix, booting looked like the attached screenshot.
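
For anyone repeating this workaround, here is a rough sketch of the same replacement done with sed instead of by hand. `swap_kernel` is a made-up helper name, and the full version strings (`5.11.22-4-pve`, `5.11.22-3-pve`) are assumptions based on the kernels listed in this thread. Keep in mind that grub.cfg is regenerated by `update-grub`, so a manual edit like this only lasts until the next kernel install or removal.

```shell
# swap_kernel FROM TO: hypothetical helper, reads GRUB config text on stdin
# and writes it to stdout with every literal FROM replaced by TO.
# It never touches a file by itself.
swap_kernel() {
    from=$1
    to=$2
    # Escape the dots so the version string is matched literally.
    from_re=$(printf '%s' "$from" | sed 's/\./\\./g')
    sed "s/$from_re/$to/g"
}

# Example on a live host (not run here); keep a backup first:
#   cp /boot/grub/grub.cfg /root/grub.cfg.bak
#   swap_kernel 5.11.22-4-pve 5.11.22-3-pve < /root/grub.cfg.bak > /boot/grub/grub.cfg
```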

Oh, sorry, I can't attach a file, because it is 2021 and 3.8 MB, which is a normal cell phone picture size, does not fit the Proxmox forum. Denied due to file size...

So I guess I have to spend a few extra minutes in the middle of this crisis resizing the image so that it fits this aging forum...
 
Now I am getting this on one host: Ohh the joy!

[ 1924.671429] RIP: 0033:0x7f48918d3b07
[ 1924.671432] RSP: 002b:00007fff033c71f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 1924.671436] RAX: ffffffffffffffda RBX: 0000564c4ef552a0 RCX: 00007f48918d3b07
[ 1924.671438] RDX: 0000564c4d8d1a65 RSI: 00000000000001ff RDI: 0000564c52db3d30
[ 1924.671441] RBP: 0000000000000000 R08: 0000564c52fbed68 R09: 0000000000000000
[ 1924.671443] R10: 0000000000000006 R11: 0000000000000246 R12: 0000564c52db3d30
[ 1924.671445] R13: 0000564c5030eb48 R14: 0000564c531e2658 R15: 00000000000001ff
[ 2044.676554] INFO: task pvesr:63540 blocked for more than 724 seconds.
[ 2044.676564] Tainted: P O 5.11.22-3-pve #1
[ 2044.676567] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2044.676570] task:pvesr state:D stack: 0 pid:63540 ppid: 1 flags:0x00000000
[ 2044.676575] Call Trace:
[ 2044.676579] __schedule+0x2ca/0x880
[ 2044.676585] schedule+0x4f/0xc0
[ 2044.676588] rwsem_down_write_slowpath+0x212/0x590
[ 2044.676593] down_write+0x43/0x50
[ 2044.676596] filename_create+0x7e/0x160
[ 2044.676600] do_mkdirat+0x58/0x140
[ 2044.676603] __x64_sys_mkdir+0x1b/0x20
[ 2044.676607] do_syscall_64+0x38/0x90
[ 2044.676610] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2044.676615] RIP: 0033:0x7f48918d3b07
[ 2044.676619] RSP: 002b:00007fff033c71f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 2044.676623] RAX: ffffffffffffffda RBX: 0000564c4ef552a0 RCX: 00007f48918d3b07
[ 2044.676626] RDX: 0000564c4d8d1a65 RSI: 00000000000001ff RDI: 0000564c52db3d30
[ 2044.676629] RBP: 0000000000000000 R08: 0000564c52fbed68 R09: 0000000000000000
[ 2044.676631] R10: 0000000000000006 R11: 0000000000000246 R12: 0000564c52db3d30
[ 2044.676634] R13: 0000564c5030eb48 R14: 0000564c531e2658 R15: 00000000000001ff
[ 2044.676640] INFO: task qm:119406 blocked for more than 120 seconds.
[ 2044.676643] Tainted: P O 5.11.22-3-pve #1
[ 2044.676646] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2044.676649] task:qm state:D stack: 0 pid:119406 ppid:116558 flags:0x00000000
[ 2044.676653] Call Trace:
[ 2044.676656] __schedule+0x2ca/0x880
[ 2044.676660] schedule+0x4f/0xc0
[ 2044.676664] rwsem_down_write_slowpath+0x212/0x590
[ 2044.676670] down_write+0x43/0x50
[ 2044.676674] filename_create+0x7e/0x160
[ 2044.676679] do_mkdirat+0x58/0x140
[ 2044.676684] __x64_sys_mkdir+0x1b/0x20
[ 2044.676688] do_syscall_64+0x38/0x90
[ 2044.676692] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2044.676696] RIP: 0033:0x7f6a9651db07
[ 2044.676698] RSP: 002b:00007fff25213758 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 2044.676702] RAX: ffffffffffffffda RBX: 0000559051c4a2a0 RCX: 00007f6a9651db07
[ 2044.676705] RDX: 0000000000000020 RSI: 00000000000001ff RDI: 0000559055aa8430
[ 2044.676707] RBP: 0000000000000000 R08: 0000559055adcdc8 R09: 0000000000000000
[ 2044.676710] R10: 0000000000000006 R11: 0000000000000246 R12: 0000559055aa8430
[ 2044.676712] R13: 00005590531845d8 R14: 0000559055a44290 R15: 00000000000001ff
root@pve01:~#
 
This might be the reason...

root@pve01:~# apt-get update
Hit:1 http://security.debian.org bullseye-security InRelease
Hit:2 http://download.proxmox.com/debian/pve bullseye InRelease
Hit:3 http://ftp.dk.debian.org/debian bullseye InRelease
Hit:4 http://download.proxmox.com/debian/ceph-pacific bullseye InRelease
Get:5 http://ftp.dk.debian.org/debian bullseye-updates InRelease [39.4 kB]
Fetched 39.4 kB in 2s (25.0 kB/s)
Reading package lists... Done
root@pve01:~# apt-get dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
Need to get 75.6 MB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://download.proxmox.com/debian/pve bullseye/pve-no-subscription amd64 pve-kernel-5.11.22-4-pve amd64 5.11.22-9 [75.6 MB]
Fetched 75.6 MB in 1min 22s (926 kB/s)
dpkg: error processing package pve-kernel-5.11.22-4-pve (--configure):
package is in a very bad inconsistent state; you should
reinstall it before attempting configuration
Errors were encountered while processing:
pve-kernel-5.11.22-4-pve
E: Sub-process /usr/bin/dpkg returned an error code (1)

I removed the file from the cache to force a fresh download, and the error above still pops up. I guess we have a corrupt file on the Proxmox servers? What is the probability that 3 systems fail to install this?
 
Disk errors? Nope.


Code:
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:42 with 0 errors on Thu Sep 30 23:06:02 2021
config:

        NAME                                         STATE     READ WRITE CKSUM
        rpool                                        ONLINE       0     0     0
          mirror-0                                   ONLINE       0     0     0
            ata-INTENSO_AA000000000000005960-part3   ONLINE       0     0     0
            ata-Intenso_SSD_3813430-532104011-part3  ONLINE       0     0     0

errors: No known data errors

  pool: storage
 state: ONLINE
  scan: scrub repaired 0B in 06:35:16 with 0 errors on Sun Sep 12 06:59:18 2021
config:

        NAME                                          STATE     READ WRITE CKSUM
        storage                                       ONLINE       0     0     0
          raidz2-0                                    ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_V1HY09GG       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_V1J9YJ9G       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_VBGT4W5F       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_V1JAGJAG       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_V1JAP6WG       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_VBGSTVVF       ONLINE       0     0     0
        logs
          nvme-RMS-200_0038450                        ONLINE       0     0     0
        cache
          ata-KINGSTON_SA400S37240G_50026B7380C6FD4F  ONLINE       0     0     0
          ata-KINGSTON_SA400S37240G_50026B7380C7043A  ONLINE       0     0     0

errors: No known data errors
 
Might be a long shot, but I have had this message pop up on me before. It turned out that a RAM/CPU channel was broken, so I had to replace the motherboard.

Saying that because the only thing that the last download touches is RAM.
 
Not on three different nodes. One does not have ECC, the two others do. Ryzen 5 PRO 4650G, Ryzen 7 3800X, and an i3.

I just tried on a laptop with a Celeron in it, same result. I am sure network corruption would show up on my MPTCP router.

It seems the cluster "behaves" somewhat stable now.

Also, I never run upgrade, I always run dist-upgrade.
I might have run upgrade after the initial break while frantically attempting to fix things, but it started with a dist-upgrade.
 
I know this doesn't help you now, but in future, don't upgrade your other boxes until the first one is done without problems. Ideally, have a test Proxmox install somewhere where you can run through the upgrade cycle before doing production boxes.

Good luck in getting this resolved.
 
Screenshot of PVE01 trying to boot the -4 kernel, taken before I edited grub.cfg to boot -3.
 

Attachments

  • Untitled.jpg
Can you try with an added --reinstall?
 

I can try that. Currently the systems are running somewhat crippled, but stable. I will reinstall the cluster during the coming weekend.
Before reinstalling I will try your suggestion and report back with the results.
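
For reference, a sketch of what the `--reinstall` suggestion could look like in practice. The "very bad inconsistent state" message from dpkg normally goes together with the package's reinstall-required flag, which shows up as an `R` in the third character of dpkg's status abbreviation. `needs_reinstall` is a made-up helper name; the package name is the one from the error output in this thread. Whether the reinstall actually succeeds here is untested.

```shell
# needs_reinstall: hypothetical helper, reads "<status abbrev><TAB><package>"
# lines on stdin and prints the packages flagged "R" (reinstall required).
needs_reinstall() {
    awk -F'\t' 'substr($1, 3, 1) == "R" { print $2 }'
}

# Example on a live host (not run here):
#   dpkg-query -W -f='${db:Status-Abbrev}\t${Package}\n' | needs_reinstall
#   apt-get install --reinstall pve-kernel-5.11.22-4-pve
```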
 
I tried a dist-upgrade on pve02 today. This was the result:

Code:
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 5.11.22-5-pve /boot/vmlinuz-5.11.22-5-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.11.22-5-pve
Found initrd image: /boot/initrd.img-5.11.22-5-pve
Found linux image: /boot/vmlinuz-5.11.22-4-pve
Found initrd image: /boot/initrd.img-5.11.22-4-pve
Found linux image: /boot/vmlinuz-5.11.22-3-pve
Found initrd image: /boot/initrd.img-5.11.22-3-pve
Found linux image: /boot/vmlinuz-5.11.22-1-pve
Found initrd image: /boot/initrd.img-5.11.22-1-pve
Found memtest86+ image: /ROOT/pve-1@/boot/memtest86+.bin
Found memtest86+ multiboot image: /ROOT/pve-1@/boot/memtest86+_multiboot.bin
Adding boot menu entry for EFI firmware configuration
done
Setting up pve-kernel-5.11 (7.0-8) ...
Processing triggers for pve-ha-manager (3.3-1) ...
Processing triggers for pve-manager (7.0-11) ...
got timeout
Found initrd image: /boot/initrd.img-5.11.22-1-pve
Found memtest86+ image: /ROOT/pve-1@/boot/memtest86+.bin
Found memtest86+ multiboot image: /ROOT/pve-1@/boot/memtest86+_multiboot.bin
Adding boot menu entry for EFI firmware configuration
done
Setting up pve-kernel-5.11 (7.0-8) ...
Processing triggers for pve-ha-manager (3.3-1) ...
Processing triggers for pve-manager (7.0-11) ...
got timeout

Processing triggers for man-db (2.9.4-2) ...


This is essentially what happened to the other systems when attempting to upgrade. The pve-manager 7.0-11 trigger would time out, and then more or less all hell broke loose.
 
Could you check the journal around the time of the upgrade? The pve-manager trigger does the following:

Code:
    # test if /etc/pve is mounted; else simple exit to avoid
    # error during updates
    test -f /etc/pve/local/pve-ssl.pem || exit 0;
    test -e /proxmox_install_mode && exit 0;

    # the ExecStartPre doesn't triggers on service reload, so just in case
    /usr/bin/pvecm updatecerts --silent || true

    deb-systemd-invoke reload-or-try-restart pvedaemon.service
    deb-systemd-invoke reload-or-try-restart pvestatd.service
    deb-systemd-invoke reload-or-try-restart pveproxy.service
    deb-systemd-invoke reload-or-try-restart spiceproxy.service

    exit 0;;

So if it was one of the service reloads that ran into a timeout, this should be visible in the journal.
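
A minimal sketch of narrowing the journal down to the four services the trigger reloads. `journal_cmd` is a made-up helper that only prints the command, so it can be reviewed before running; the time window in the example is a placeholder and has to be replaced with the actual time of the upgrade (apt records it in /var/log/apt/history.log).

```shell
# journal_cmd START END: hypothetical helper, prints a journalctl command
# limited to the given time window and to the units reloaded by the trigger.
journal_cmd() {
    start=$1
    end=$2
    printf 'journalctl --since "%s" --until "%s"' "$start" "$end"
    for unit in pvedaemon pvestatd pveproxy spiceproxy; do
        printf ' -u %s.service' "$unit"
    done
    printf '\n'
}

# Example (placeholder times, not taken from the logs):
#   journal_cmd '2021-10-02 10:00' '2021-10-02 10:30'
```

The `pvecm updatecerts` call in the trigger is not a unit of its own, so if none of the four services logged a timeout, dropping the `-u` filters and reading the whole window is the next thing to try.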