3 hosts bricked due to apt upgrade

AngryAdm

When the updater gets to pve-manager 7.0-11 there is a timeout, a brief dump, and then apt hangs. After a reboot, the system won't start VMs, spouting all sorts of nonsense, such as:

TASK ERROR: KVM virtualisation configured, but not available. Either disable in VM configuration or enable in BIOS.

Which does not suddenly get disabled in the BIOS on a system with no keyboard or mouse...
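My guess is that the kvm modules were rejected at load time just like the ones in the dump below, which would explain the bogus BIOS message. Something along these lines should confirm it:

Code:
# did the kvm modules actually load?
lsmod | grep kvm
# were they rejected with the same BTF error as the modules below?
dmesg | grep -iE 'kvm|BTF'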



[ 16.655215] BPF: type_id=104829 bits_offset=256
[ 16.655224] BPF:
[ 16.655226] BPF:Invalid name
[ 16.655229] BPF:

[ 16.655232] failed to validate module [irqbypass] BTF: -22
[ 16.655445] BPF:[104824] STRUCT
[ 16.655452] BPF:size=96 vlen=1
[ 16.655471] BPF:
[ 16.655474] BPF:Invalid name
[ 16.655476] BPF:

[ 16.655479] failed to validate module [cryptd] BTF: -22
[ 16.655777] BPF:[104825] ENUM ves
[ 16.655784] BPF:size=4 vlen=20
[ 16.655785] BPF:
[ 16.655786] BPF:Invalid name
[ 16.655787] BPF:

[ 16.655789] failed to validate module [intel_rapl_common] BTF: -22
[ 16.763378] BPF: type_id=104829 bits_offset=256
[ 16.763388] BPF:
[ 16.763390] BPF:Invalid name
[ 16.763391] BPF:

[ 16.763395] failed to validate module [irqbypass] BTF: -22
[ 16.763570] BPF:[104824] ENUM SV
[ 16.763595] BPF:size=4 vlen=4
[ 16.763599] BPF:
[ 16.763600] BPF:Invalid name
[ 16.763601] BPF:

[ 16.763605] failed to validate module [edac_mce_amd] BTF: -22
[ 16.839447] BPF:[104825] ENUM ves
[ 16.839456] BPF:size=4 vlen=20
[ 16.839459] BPF:
[ 16.839461] BPF:Invalid name
[ 16.839462] BPF:

[ 16.839466] failed to validate module [intel_rapl_common] BTF: -22
[ 16.839748] BPF:[104824] ENUM SV
[ 16.839756] BPF:size=4 vlen=4
[ 16.839758] BPF:
[ 16.839777] BPF:Invalid name
[ 16.839779] BPF:

[ 16.839782] failed to validate module [edac_mce_amd] BTF: -22
[ 16.903521] BPF:[104825] ENUM ves
[ 16.903528] BPF:size=4 vlen=20
[ 16.903529] BPF:
[ 16.903530] BPF:Invalid name
[ 16.903531] BPF:

[ 16.903533] failed to validate module [intel_rapl_common] BTF: -22
[ 22.684797] BPF:[104825] ENUM oups
[ 22.684804] BPF:size=4 vlen=11
[ 22.684805] BPF:
[ 22.684806] BPF:Invalid name
[ 22.684807] BPF:

[ 22.684809] failed to validate module [nfnetlink] BTF: -22
[ 22.690811] audit: type=1400 audit(1632931359.995:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=3326 comm="apparmor_parser"
[ 22.692522] audit: type=1400 audit(1632931359.999:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lsb_release" pid=3322 comm="apparmor_parser"
[ 22.692730] audit: type=1400 audit(1632931359.999:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=3327 comm="apparmor_parser"
[ 22.692739] audit: type=1400 audit(1632931359.999:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=3327 comm="apparmor_parser"
[ 22.692744] audit: type=1400 audit(1632931359.999:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=3327 comm="apparmor_parser"
[ 22.693037] audit: type=1400 audit(1632931359.999:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=3323 comm="apparmor_parser"
[ 22.693043] audit: type=1400 audit(1632931359.999:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=3323 comm="apparmor_parser"
[ 22.694638] audit: type=1400 audit(1632931359.999:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/chronyd" pid=3320 comm="apparmor_parser"
[ 22.697145] audit: type=1400 audit(1632931360.003:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="tcpdump" pid=3321 comm="apparmor_parser"
[ 22.698374] audit: type=1400 audit(1632931360.003:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default" pid=3324 comm="apparmor_parser"
[ 22.702134] Process accounting resumed
[ 22.787117] BPF:[104823] FUNC
[ 22.787126] BPF:type_id=1213
[ 22.787127] BPF:
[ 22.787129] BPF:Invalid name
[ 22.787130] BPF:

[ 22.787133] failed to validate module [softdog] BTF: -22
[ 22.837136] BPF:[104831] FUNC
[ 22.837143] BPF:type_id=1213
[ 22.837144] BPF:
[ 22.837146] BPF:Invalid name
[ 22.837147] BPF:

[ 22.837149] failed to validate module [tls] BTF: -22
[ 23.083520] BPF:[104831] FUNC
[ 23.083527] BPF:type_id=1213
[ 23.083528] BPF:
[ 23.083529] BPF:Invalid name
[ 23.083530] BPF:

[ 23.083533] failed to validate module [tls] BTF: -22
[ 23.196844] vmbr0: port 1(enp4s0f0) entered blocking state
[ 23.196854] vmbr0: port 1(enp4s0f0) entered disabled state
[ 23.196935] device enp4s0f0 entered promiscuous mode
[ 23.254572] pps pps1: new PPS source ptp1
[ 23.254627] ixgbe 0000:04:00.0: registered PHC device on enp4s0f0
[ 23.548494] BPF:[104831] FUNC
[ 23.548501] BPF:type_id=1213
[ 23.548502] BPF:
[ 23.548503] BPF:Invalid name
[ 23.548504] BPF:

[ 23.548506] failed to validate module [tls] BTF: -22
[ 23.712412] BPF:[104823] STRUCT
[ 23.712419] BPF:size=24 vlen=5
[ 23.712421] BPF:
[ 23.712422] BPF:Invalid name
[ 23.712423] BPF:

[ 23.712425] failed to validate module [bpfilter] BTF: -22
[ 23.722476] BPF:[104835] STRUCT
[ 23.722483] BPF:size=16 vlen=4
[ 23.722485] BPF:
[ 23.722486] BPF:Invalid name
[ 23.722488] BPF:

[ 23.722491] failed to validate module [scsi_transport_iscsi] BTF: -22
[ 23.811188] BPF:[104898] STRUCT
[ 23.811196] BPF:size=8 vlen=2
[ 23.811198] BPF:
[ 23.811199] BPF:Invalid name
[ 23.811201] BPF:

[ 23.811203] failed to validate module [x_tables] BTF: -22
[ 24.929347] BPF:[104826] STRUCT
[ 24.929354] BPF:size=48 vlen=10
[ 24.929355] BPF:
[ 24.929356] BPF:Invalid name
[ 24.929357] BPF:

[ 24.929359] failed to validate module [udp_tunnel] BTF: -22
[ 25.384047] BPF:[104823] STRUCT
[ 25.384055] BPF:size=24 vlen=5
[ 25.384056] BPF:
[ 25.384057] BPF:Invalid name
[ 25.384058] BPF:
 
So far so good, and (bad)

root@pve01:~# apt-get dist-upgrade
E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem.
root@pve01:~# dpkg --configure -a
Setting up proxmox-backup-file-restore (2.0.10-1) ...
Updating file-restore initramfs...
11516 blocks
Setting up librados2 (16.2.6-pve2) ...
dpkg: error processing package pve-kernel-5.11.22-4-pve (--configure):
package is in a very bad inconsistent state; you should
reinstall it before attempting configuration
Setting up librgw2 (16.2.6-pve2) ...
Setting up python3-ceph-argparse (16.2.6-pve2) ...
Setting up libpve-common-perl (7.0-9) ...
Setting up libcephfs2 (16.2.6-pve2) ...
Setting up tzdata (2021a-1+deb11u1) ...

Current default time zone: 'Europe/Copenhagen'
Local time is now: Wed Sep 29 18:11:11 CEST 2021.
Universal Time is now: Wed Sep 29 16:11:11 UTC 2021.
Run 'dpkg-reconfigure tzdata' if you wish to change it.

Setting up libjaeger (16.2.6-pve2) ...
Setting up proxmox-backup-client (2.0.10-1) ...
Setting up libradosstriper1 (16.2.6-pve2) ...
Setting up python3-ceph-common (16.2.6-pve2) ...
Setting up librbd1 (16.2.6-pve2) ...
Setting up ceph-mgr-modules-core (16.2.6-pve2) ...
Setting up ceph-fuse (16.2.6-pve2) ...
Setting up python3-rados (16.2.6-pve2) ...
Setting up python3-rbd (16.2.6-pve2) ...
Setting up python3-rgw (16.2.6-pve2) ...
Setting up python3-cephfs (16.2.6-pve2) ...
Setting up ceph-common (16.2.6-pve2) ...
Setting system user ceph properties..usermod: no changes
..done
Fixing /var/run/ceph ownership....done
Setting up ceph-base (16.2.6-pve2) ...
Setting up ceph-mds (16.2.6-pve2) ...
Setting up ceph-mgr (16.2.6-pve2) ...
Setting up ceph-osd (16.2.6-pve2) ...
Setting up ceph-mon (16.2.6-pve2) ...
Setting up ceph (16.2.6-pve2) ...
Processing triggers for libc-bin (2.31-13) ...
Processing triggers for pve-manager (7.0-11) ...
Processing triggers for man-db (2.9.4-2) ...
Processing triggers for pve-ha-manager (3.3-1) ...
Errors were encountered while processing:
pve-kernel-5.11.22-4-pve
root@pve01:~#
 
root@pve01:~# apt-get dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
Need to get 0 B/75.6 MB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n]
dpkg: error processing package pve-kernel-5.11.22-4-pve (--configure):
package is in a very bad inconsistent state; you should
reinstall it before attempting configuration
Errors were encountered while processing:
pve-kernel-5.11.22-4-pve
E: Sub-process /usr/bin/dpkg returned an error code (1)
root@pve01:~# apt-get remove pve-kernel-5.11.22-4-pve
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be REMOVED:
proxmox-ve pve-kernel-5.11 pve-kernel-5.11.22-4-pve
0 upgraded, 0 newly installed, 3 to remove and 0 not upgraded.
1 not fully installed or removed.
After this operation, 392 MB disk space will be freed.
Do you want to continue? [Y/n]
W: (pve-apt-hook) !! WARNING !!
W: (pve-apt-hook) You are attempting to remove the meta-package 'proxmox-ve'!
W: (pve-apt-hook)
W: (pve-apt-hook) If you really want to permanently remove 'proxmox-ve' from your system, run the following command
W: (pve-apt-hook) touch '/please-remove-proxmox-ve'
W: (pve-apt-hook) run apt purge proxmox-ve to remove the meta-package
W: (pve-apt-hook) and repeat your apt invocation.
W: (pve-apt-hook)
W: (pve-apt-hook) If you are unsure why 'proxmox-ve' would be removed, please verify
W: (pve-apt-hook) - your APT repository settings
W: (pve-apt-hook) - that you are using 'apt full-upgrade' to upgrade your system
E: Sub-process /usr/share/proxmox-ve/pve-apt-hook returned an error code (1)
E: Failure running script /usr/share/proxmox-ve/pve-apt-hook
root@pve01:~#
 
Rebooting the host resulted in it not coming back up.

I guess I know what happens if I reboot the other two hosts...
Time to waste a few days fixing this mess!

At least the network interfaces on the VMs have not gone missing like the last time something exciting happened.
 
Is your ifupdown2 working? The upgrade failed writing that package for me and others...
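A quick way to check would be something like:

Code:
# is ifupdown2 installed and fully configured?
dpkg -s ifupdown2 | grep -E '^(Status|Version)'
# does a live reload of the network config still work?
ifreload -a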
 
Yeah, it's working. What I am pondering is the following on an un-upgraded host:

root@pve05:~# uname -r
5.11.22-4-pve
root@pve05:~# apt-get dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
libpve-common-perl proxmox-backup-client proxmox-backup-file-restore pve-container pve-kernel-5.11.22-4-pve qemu-server tzdata
7 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 81.5 MB of archives.
After this operation, 760 kB disk space will be freed.
Do you want to continue? [Y/n]


Why am I presented with the option to upgrade to the kernel I already have? I know that if I do this, the system will be screwed :/
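For reference, apt-cache policy should show whether the repository carries a newer package revision under the same kernel ABI name, which would explain the offer:

Code:
apt-cache policy pve-kernel-5.11.22-4-pve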

PS: I fixed this on my three hosts by editing /boot/grub/grub.cfg.
I replaced all instances of pve-ker.......22-4 with 22-3 on the hosts that had -3 installed.
On one host I had 22-1, so I put that in there instead.
They now boot.
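Roughly, on each host (adjusting the target version to whatever older kernel is actually installed there):

Code:
# back up first, then point every -4 entry at the older installed kernel
cp /boot/grub/grub.cfg /boot/grub/grub.cfg.bak
sed -i 's/5.11.22-4-pve/5.11.22-3-pve/g' /boot/grub/grub.cfg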
Before the fix, booting looked like the attached screenshot.

Oh, sorry, I can't attach the file because it is 2021 and 3.8 MB, a normal cell-phone picture size, does not fit the Proxmox forum. Denied due to file size...

So I guess I have to spend a few extra minutes in the middle of this crisis resizing the image so that it fits this aging forum...
 
Now I am getting this on one host. Oh, the joy!

[ 1924.671429] RIP: 0033:0x7f48918d3b07
[ 1924.671432] RSP: 002b:00007fff033c71f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 1924.671436] RAX: ffffffffffffffda RBX: 0000564c4ef552a0 RCX: 00007f48918d3b07
[ 1924.671438] RDX: 0000564c4d8d1a65 RSI: 00000000000001ff RDI: 0000564c52db3d30
[ 1924.671441] RBP: 0000000000000000 R08: 0000564c52fbed68 R09: 0000000000000000
[ 1924.671443] R10: 0000000000000006 R11: 0000000000000246 R12: 0000564c52db3d30
[ 1924.671445] R13: 0000564c5030eb48 R14: 0000564c531e2658 R15: 00000000000001ff
[ 2044.676554] INFO: task pvesr:63540 blocked for more than 724 seconds.
[ 2044.676564] Tainted: P O 5.11.22-3-pve #1
[ 2044.676567] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2044.676570] task:pvesr state:D stack: 0 pid:63540 ppid: 1 flags:0x00000000
[ 2044.676575] Call Trace:
[ 2044.676579] __schedule+0x2ca/0x880
[ 2044.676585] schedule+0x4f/0xc0
[ 2044.676588] rwsem_down_write_slowpath+0x212/0x590
[ 2044.676593] down_write+0x43/0x50
[ 2044.676596] filename_create+0x7e/0x160
[ 2044.676600] do_mkdirat+0x58/0x140
[ 2044.676603] __x64_sys_mkdir+0x1b/0x20
[ 2044.676607] do_syscall_64+0x38/0x90
[ 2044.676610] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2044.676615] RIP: 0033:0x7f48918d3b07
[ 2044.676619] RSP: 002b:00007fff033c71f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 2044.676623] RAX: ffffffffffffffda RBX: 0000564c4ef552a0 RCX: 00007f48918d3b07
[ 2044.676626] RDX: 0000564c4d8d1a65 RSI: 00000000000001ff RDI: 0000564c52db3d30
[ 2044.676629] RBP: 0000000000000000 R08: 0000564c52fbed68 R09: 0000000000000000
[ 2044.676631] R10: 0000000000000006 R11: 0000000000000246 R12: 0000564c52db3d30
[ 2044.676634] R13: 0000564c5030eb48 R14: 0000564c531e2658 R15: 00000000000001ff
[ 2044.676640] INFO: task qm:119406 blocked for more than 120 seconds.
[ 2044.676643] Tainted: P O 5.11.22-3-pve #1
[ 2044.676646] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2044.676649] task:qm state:D stack: 0 pid:119406 ppid:116558 flags:0x00000000
[ 2044.676653] Call Trace:
[ 2044.676656] __schedule+0x2ca/0x880
[ 2044.676660] schedule+0x4f/0xc0
[ 2044.676664] rwsem_down_write_slowpath+0x212/0x590
[ 2044.676670] down_write+0x43/0x50
[ 2044.676674] filename_create+0x7e/0x160
[ 2044.676679] do_mkdirat+0x58/0x140
[ 2044.676684] __x64_sys_mkdir+0x1b/0x20
[ 2044.676688] do_syscall_64+0x38/0x90
[ 2044.676692] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2044.676696] RIP: 0033:0x7f6a9651db07
[ 2044.676698] RSP: 002b:00007fff25213758 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
[ 2044.676702] RAX: ffffffffffffffda RBX: 0000559051c4a2a0 RCX: 00007f6a9651db07
[ 2044.676705] RDX: 0000000000000020 RSI: 00000000000001ff RDI: 0000559055aa8430
[ 2044.676707] RBP: 0000000000000000 R08: 0000559055adcdc8 R09: 0000000000000000
[ 2044.676710] R10: 0000000000000006 R11: 0000000000000246 R12: 0000559055aa8430
[ 2044.676712] R13: 00005590531845d8 R14: 0000559055a44290 R15: 00000000000001ff
root@pve01:~#
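If more tasks keep hanging, dumping all blocked tasks should at least show what everything is stuck on (assuming sysrq is enabled):

Code:
# log every task in uninterruptible sleep (D state) to the kernel ring buffer
echo 1 > /proc/sys/kernel/sysrq
echo w > /proc/sysrq-trigger
dmesg | tail -50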
 
This might be the reason...

root@pve01:~# apt-get update
Hit:1 http://security.debian.org bullseye-security InRelease
Hit:2 http://download.proxmox.com/debian/pve bullseye InRelease
Hit:3 http://ftp.dk.debian.org/debian bullseye InRelease
Hit:4 http://download.proxmox.com/debian/ceph-pacific bullseye InRelease
Get:5 http://ftp.dk.debian.org/debian bullseye-updates InRelease [39.4 kB]
Fetched 39.4 kB in 2s (25.0 kB/s)
Reading package lists... Done
root@pve01:~# apt-get dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
Need to get 75.6 MB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://download.proxmox.com/debian/pve bullseye/pve-no-subscription amd64 pve-kernel-5.11.22-4-pve amd64 5.11.22-9 [75.6 MB]
Fetched 75.6 MB in 1min 22s (926 kB/s)
dpkg: error processing package pve-kernel-5.11.22-4-pve (--configure):
package is in a very bad inconsistent state; you should
reinstall it before attempting configuration
Errors were encountered while processing:
pve-kernel-5.11.22-4-pve
E: Sub-process /usr/bin/dpkg returned an error code (1)

I removed the file from the cache to redownload it, and the above error pops up. I guess we have a corrupt file on the Proxmox servers? What is the probability that three systems fail to install this?
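One way to rule out a corrupt download would be to fetch the .deb separately and compare its checksum against the one published in the repository's Packages index:

Code:
# drop the cached copy and re-download just the kernel package
apt-get clean
apt-get download pve-kernel-5.11.22-4-pve
# compare against the SHA256 listed in the repo's Packages file
sha256sum pve-kernel-5.11.22-4-pve_*.deb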
 
Disk errors? Nope...


Code:
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:42 with 0 errors on Thu Sep 30 23:06:02 2021
config:

        NAME                                         STATE     READ WRITE CKSUM
        rpool                                        ONLINE       0     0     0
          mirror-0                                   ONLINE       0     0     0
            ata-INTENSO_AA000000000000005960-part3   ONLINE       0     0     0
            ata-Intenso_SSD_3813430-532104011-part3  ONLINE       0     0     0

errors: No known data errors

  pool: storage
 state: ONLINE
  scan: scrub repaired 0B in 06:35:16 with 0 errors on Sun Sep 12 06:59:18 2021
config:

        NAME                                          STATE     READ WRITE CKSUM
        storage                                       ONLINE       0     0     0
          raidz2-0                                    ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_V1HY09GG       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_V1J9YJ9G       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_VBGT4W5F       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_V1JAGJAG       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_V1JAP6WG       ONLINE       0     0     0
            ata-WDC_WD4003FRYZ-01F0DB0_VBGSTVVF       ONLINE       0     0     0
        logs
          nvme-RMS-200_0038450                        ONLINE       0     0     0
        cache
          ata-KINGSTON_SA400S37240G_50026B7380C6FD4F  ONLINE       0     0     0
          ata-KINGSTON_SA400S37240G_50026B7380C7043A  ONLINE       0     0     0

errors: No known data errors
 
Might be a long shot, but I have had this message pop up on me before; it turned out to be a broken RAM/CPU channel, so I had to replace the motherboard.

Saying that because the only thing that the last download touches is RAM.
 
Not on three different nodes. One does not have ECC, the other two do: Ryzen 5 PRO 4650G, Ryzen 7 3800X, and an i3.

I just tried on a laptop with a Celeron in it; same result. I assume network corruption would show up on my MPTCP router.

It seems the cluster "behaves" somewhat stably now.

Also, I never run upgrade, I always run dist-upgrade.
I might have done that after the initial break while frantically attempting to fix things, but it started with a dist-upgrade.
 
I know this doesn't help you now, but in the future, don't do your other boxes until the first is done without problems. Ideally, have a test Proxmox install somewhere where you can run through the upgrade cycle before touching production boxes.

Good luck in getting this resolved.
 
Screenshot of PVE01 trying to boot the -4 kernel, before I replaced the entries in grub.cfg to boot -3.
 

Attachments

  • Untitled.jpg (251.6 KB)
can you try with an added --reinstall?
 
can you try with an added --reinstall?

I can try that. Currently the systems are running somewhat crippled, but stable. I will reinstall the cluster during the coming weekend.
Before reinstalling I will attempt your suggestion and report back with the results.
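That would presumably be:

Code:
# reinstall the package dpkg flagged as being in a bad state
apt-get install --reinstall pve-kernel-5.11.22-4-pve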
 
Tried dist-upgrading pve02 today. This was the result:

Code:
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 5.11.22-5-pve /boot/vmlinuz-5.11.22-5-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.11.22-5-pve
Found initrd image: /boot/initrd.img-5.11.22-5-pve
Found linux image: /boot/vmlinuz-5.11.22-4-pve
Found initrd image: /boot/initrd.img-5.11.22-4-pve
Found linux image: /boot/vmlinuz-5.11.22-3-pve
Found initrd image: /boot/initrd.img-5.11.22-3-pve
Found linux image: /boot/vmlinuz-5.11.22-1-pve
Found initrd image: /boot/initrd.img-5.11.22-1-pve
Found memtest86+ image: /ROOT/pve-1@/boot/memtest86+.bin
Found memtest86+ multiboot image: /ROOT/pve-1@/boot/memtest86+_multiboot.bin
Adding boot menu entry for EFI firmware configuration
done
Setting up pve-kernel-5.11 (7.0-8) ...
Processing triggers for pve-ha-manager (3.3-1) ...
Processing triggers for pve-manager (7.0-11) ...
got timeout

Processing triggers for man-db (2.9.4-2) ...


This is essentially what happened on the other systems when attempting to upgrade: pve-manager 7.0-11 would time out, and then more or less all hell broke loose.
 
Could you check the journal around the time of the upgrade? The pve-manager trigger does the following:

Code:
# test if /etc/pve is mounted; else simple exit to avoid
# error during updates
test -f /etc/pve/local/pve-ssl.pem || exit 0;
test -e /proxmox_install_mode && exit 0;

# the ExecStartPre doesn't triggers on service reload, so just in case
/usr/bin/pvecm updatecerts --silent || true

deb-systemd-invoke reload-or-try-restart pvedaemon.service
deb-systemd-invoke reload-or-try-restart pvestatd.service
deb-systemd-invoke reload-or-try-restart pveproxy.service
deb-systemd-invoke reload-or-try-restart spiceproxy.service

exit 0;;

So if it was one of the service reloads that ran into a timeout, it should be visible in the journal.
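For example, something along these lines (adjust --since to the time of the upgrade):

Code:
# show what the PVE services logged around the upgrade
journalctl --since "1 hour ago" -u pvedaemon -u pvestatd -u pveproxy -u spiceproxy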
 
