Debian update with open-vswitch stopped networking

ednt

Hi,
This morning I installed the latest Debian updates on our cluster.
I update 4 servers at a time. 3 of the first 4 lost their network connection.
All 3 of them use OVS; the 4th uses Linux bridges.
The last message on the Proxmox console was always that it was not possible to start openvswitch.
I had to go to the basement, and a 'networking stop' and start was needed to bring them back to life.
Unfortunately, the complete Ceph SSD pool is located on these 3 servers.

Any ideas for future updates?
 

mira

Proxmox Staff Member
Can you provide the syslogs covering the time of the upgrade (/var/log/syslog.*)?
Were there any additional information or error messages?

Please provide the output of pveversion -v
 

ednt

Hi,

the pveversion:
Code:
proxmox-ve: 7.1-1 (running kernel: 5.15.12-1-pve)
pve-manager: 7.1-11 (running version: 7.1-11/8d529482)
pve-kernel-5.15: 7.1-13
pve-kernel-helper: 7.1-13
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.27-1-pve: 5.15.27-1
pve-kernel-5.15.19-2-pve: 5.15.19-3
pve-kernel-5.15.12-1-pve: 5.15.12-3
pve-kernel-5.13.19-6-pve: 5.13.19-14
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-5
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
openvswitch-switch: 2.15.0+ds1-2+deb11u1
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1


Kernel is 5.15.12-1-pve
 

ednt

Code:
Mar 28 08:02:06 pm-ceph-167 chronyd[2556]: chronyd exiting
Mar 28 08:02:06 pm-ceph-167 systemd[1]: Stopping chrony, an NTP client/server...
Mar 28 08:02:06 pm-ceph-167 systemd[1]: chrony.service: Succeeded.
Mar 28 08:02:06 pm-ceph-167 systemd[1]: Stopped chrony, an NTP client/server.
Mar 28 08:02:06 pm-ceph-167 systemd[1]: chrony.service: Consumed 44.325s CPU time.
Mar 28 08:02:06 pm-ceph-167 systemd[1]: Starting chrony, an NTP client/server...
Mar 28 08:02:06 pm-ceph-167 chronyd[800864]: chronyd version 4.0 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +NTS +SECHASH +IPV6 -DEBUG)
Mar 28 08:02:06 pm-ceph-167 chronyd[800864]: Frequency 5.132 +/- 0.021 ppm read from /var/lib/chrony/chrony.drift
Mar 28 08:02:06 pm-ceph-167 chronyd[800864]: Using right/UTC timezone to obtain leap second data
Mar 28 08:02:06 pm-ceph-167 chronyd[800864]: Loaded seccomp filter
Mar 28 08:02:06 pm-ceph-167 systemd[1]: Started chrony, an NTP client/server.
Mar 28 08:02:07 pm-ceph-167 systemd[1]: Reloading.
Mar 28 08:02:07 pm-ceph-167 systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.

Mar 28 08:02:07 pm-ceph-167 systemd[1]: /lib/systemd/system/bareos-filedaemon.service:29: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
Mar 28 08:02:07 pm-ceph-167 systemd[1]: Reloading.
Mar 28 08:02:07 pm-ceph-167 systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
Mar 28 08:02:07 pm-ceph-167 systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
Mar 28 08:02:07 pm-ceph-167 systemd[1]: /lib/systemd/system/bareos-filedaemon.service:29: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopping Open vSwitch...
Mar 28 08:02:08 pm-ceph-167 systemd[1]: openvswitch-switch.service: Succeeded.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopped Open vSwitch.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopping Open vSwitch Forwarding Unit...
Mar 28 08:02:08 pm-ceph-167 ovs-ctl[801018]: Exiting ovs-vswitchd (1908).
Mar 28 08:02:08 pm-ceph-167 systemd[1]: ovs-vswitchd.service: Succeeded.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopped Open vSwitch Forwarding Unit.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: ovs-vswitchd.service: Consumed 14h 5min 29.202s CPU time.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopping Open vSwitch Database Unit...
Mar 28 08:02:08 pm-ceph-167 ovs-ctl[801065]: Exiting ovsdb-server (1694).
Mar 28 08:02:08 pm-ceph-167 systemd[1]: ovsdb-server.service: Succeeded.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopped Open vSwitch Database Unit.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: ovsdb-server.service: Consumed 27min 9.462s CPU time.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Starting Open vSwitch Database Unit...
Mar 28 08:02:08 pm-ceph-167 ovs-ctl[801106]: Starting ovsdb-server.
Mar 28 08:02:08 pm-ceph-167 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait -- init -- set Open_vSwitch . db-version=8.2.0
Mar 28 08:02:08 pm-ceph-167 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . ovs-version=2.15.0 "external-ids:system-id=\"43fda56b-9c04-45d1-af71-2455a8a1d6bd\"" "external-ids:rundir=\"/var/run/openvswitch\"" "system-type=\"debian\"" "system-version=\"11\""
Mar 28 08:02:08 pm-ceph-167 ovs-ctl[801106]: Configuring Open vSwitch system IDs.
Mar 28 08:02:08 pm-ceph-167 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait add Open_vSwitch . external-ids hostname=pm-ceph-167.ednt.de
Mar 28 08:02:08 pm-ceph-167 ovs-ctl[801106]: Enabling remote OVSDB managers.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Started Open vSwitch Database Unit.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Starting Open vSwitch Forwarding Unit...
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Reloading.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.

Mar 28 08:02:08 pm-ceph-167 systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: /lib/systemd/system/bareos-filedaemon.service:29: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
Mar 28 08:02:09 pm-ceph-167 ovs-ctl[801150]: Starting ovs-vswitchd.
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.194099] device ceph_sync left promiscuous mode
Mar 28 08:02:09 pm-ceph-167 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait add Open_vSwitch . external-ids hostname=pm-ceph-167.ednt.de
Mar 28 08:02:09 pm-ceph-167 ovs-ctl[801150]: Enabling remote OVSDB managers.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Started Open vSwitch Forwarding Unit.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Starting Open vSwitch...
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Finished Open vSwitch.
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.260319] device ceph_public left promiscuous mode
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.291880] device ens803 left promiscuous mode
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.291922] device vmbr1 left promiscuous mode
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.327860] device eno1 left promiscuous mode
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.328631] device vmbr0 left promiscuous mode
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Reloading.
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.364507] No such timeout policy "ovs_test_tp"
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.364512] Failed to associated timeout policy `ovs_test_tp'
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.431517] device eno1 entered promiscuous mode
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.439456] device vmbr0 entered promiscuous mode
Mar 28 08:02:09 pm-ceph-167 systemd-udevd[801218]: Using default interface naming scheme 'v247'.
Mar 28 08:02:09 pm-ceph-167 systemd-udevd[801218]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 28 08:02:09 pm-ceph-167 systemd-udevd[801218]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.450836] device ceph_public entered promiscuous mode
Mar 28 08:02:09 pm-ceph-167 systemd-udevd[801218]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.460767] device ceph_sync entered promiscuous mode
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.461147] device ens803 entered promiscuous mode
Mar 28 08:02:09 pm-ceph-167 kernel: [5864685.471236] device vmbr1 entered promiscuous mode
Mar 28 08:02:09 pm-ceph-167 systemd-udevd[801218]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.

Mar 28 08:02:09 pm-ceph-167 systemd[1]: /lib/systemd/system/bareos-filedaemon.service:29: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Stopping Open vSwitch...
Mar 28 08:02:09 pm-ceph-167 systemd[1]: openvswitch-switch.service: Succeeded.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Stopped Open vSwitch.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Stopping Open vSwitch Forwarding Unit...
Mar 28 08:02:09 pm-ceph-167 ovs-ctl[801280]: Exiting ovs-vswitchd (801208).
Mar 28 08:02:09 pm-ceph-167 systemd[1]: ovs-vswitchd.service: Succeeded.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Stopped Open vSwitch Forwarding Unit.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Stopping Open vSwitch Database Unit...
Mar 28 08:02:09 pm-ceph-167 ovs-ctl[801301]: Exiting ovsdb-server (801135).
Mar 28 08:02:10 pm-ceph-167 systemd[1]: ovsdb-server.service: Succeeded.
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Stopped Open vSwitch Database Unit.
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Starting Open vSwitch Database Unit...
Mar 28 08:02:10 pm-ceph-167 ovs-ctl[801323]: Starting ovsdb-server.
Mar 28 08:02:10 pm-ceph-167 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait -- init -- set Open_vSwitch . db-version=8.2.0
Mar 28 08:02:10 pm-ceph-167 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . ovs-version=2.15.0 "external-ids:system-id=\"43fda56b-9c04-45d1-af71-2455a8a1d6bd\"" "external-ids:rundir=\"/var/run/openvswitch\"" "system-type=\"debian\"" "system-version=\"11\""
Mar 28 08:02:10 pm-ceph-167 ovs-ctl[801323]: Configuring Open vSwitch system IDs.
Mar 28 08:02:10 pm-ceph-167 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait add Open_vSwitch . external-ids hostname=pm-ceph-167.ednt.de
Mar 28 08:02:10 pm-ceph-167 ovs-ctl[801323]: Enabling remote OVSDB managers.
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Started Open vSwitch Database Unit.
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Starting Open vSwitch Forwarding Unit...
Mar 28 08:02:10 pm-ceph-167 ovs-ctl[801365]: Starting ovs-vswitchd.
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.357699] device vmbr1 left promiscuous mode
Mar 28 08:02:10 pm-ceph-167 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait add Open_vSwitch . external-ids hostname=pm-ceph-167.ednt.de
Mar 28 08:02:10 pm-ceph-167 ovs-ctl[801365]: Enabling remote OVSDB managers.
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Started Open vSwitch Forwarding Unit.
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.407855] device ens803 left promiscuous mode
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.407895] device ceph_sync left promiscuous mode
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Starting Open vSwitch...
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Finished Open vSwitch.
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.451778] device ceph_public left promiscuous mode
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.507783] device vmbr0 left promiscuous mode
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.543857] device eno1 left promiscuous mode
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.545343] No such timeout policy "ovs_test_tp"
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.545346] Failed to associated timeout policy `ovs_test_tp'
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.616930] device eno1 entered promiscuous mode
Mar 28 08:02:10 pm-ceph-167 systemd-udevd[801218]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.623942] device vmbr0 entered promiscuous mode
Mar 28 08:02:10 pm-ceph-167 systemd-udevd[801217]: Using default interface naming scheme 'v247'.
Mar 28 08:02:10 pm-ceph-167 systemd-udevd[801217]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.633696] device ceph_public entered promiscuous mode
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.643538] device ceph_sync entered promiscuous mode
Mar 28 08:02:10 pm-ceph-167 systemd-udevd[801437]: Using default interface naming scheme 'v247'.
Mar 28 08:02:10 pm-ceph-167 systemd-udevd[801437]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.643973] device ens803 entered promiscuous mode
Mar 28 08:02:10 pm-ceph-167 systemd-udevd[801217]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Mar 28 08:02:10 pm-ceph-167 kernel: [5864686.654037] device vmbr1 entered promiscuous mode
Mar 28 08:02:12 pm-ceph-167 corosync[3366]:   [KNET  ] link: host: 15 link: 0 is down
Mar 28 08:02:12 pm-ceph-167 corosync[3366]:   [KNET  ] link: host: 7 link: 0 is down
Mar 28 08:02:12 pm-ceph-167 corosync[3366]:   [KNET  ] link: host: 6 link: 0 is down
Mar 28 08:02:12 pm-ceph-167 corosync[3366]:   [KNET  ] link: host: 4 link: 0 is down
Mar 28 08:02:12 pm-ceph-167 corosync[3366]:   [KNET  ] link: host: 3 link: 0 is down
Mar 28 08:02:12 pm-ceph-167 corosync[3366]:   [KNET  ] link: host: 2 link: 0 is down
Mar 28 08:02:12 pm-ceph-167 corosync[3366]:   [KNET  ] link: host: 1 link: 0 is down
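
Reading the excerpt above, the openvswitch-switch unit appears to be stopped and started twice within about two seconds (08:02:08 and 08:02:09), and corosync reports its links down shortly afterwards. A minimal Python sketch for pulling those restart cycles out of a syslog file could look like this. The function names and the condensed sample are only illustrative, and the regular expression assumes the systemd message wording shown in this log.

```python
import re

# Condensed from the syslog excerpt above (only the lines that matter here).
SAMPLE_LOG = """\
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopping Open vSwitch...
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopped Open vSwitch.
Mar 28 08:02:08 pm-ceph-167 systemd[1]: Stopping Open vSwitch Forwarding Unit...
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Starting Open vSwitch...
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Finished Open vSwitch.
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Stopping Open vSwitch...
Mar 28 08:02:09 pm-ceph-167 systemd[1]: Stopped Open vSwitch.
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Starting Open vSwitch...
Mar 28 08:02:10 pm-ceph-167 systemd[1]: Finished Open vSwitch.
Mar 28 08:02:12 pm-ceph-167 corosync[3366]:   [KNET  ] link: host: 15 link: 0 is down
"""

# Matches only the top-level "Open vSwitch" unit, not the
# "Forwarding Unit" or "Database Unit" sub-services.
LINE_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ systemd\[1\]: "
    r"(Stopping|Finished) Open vSwitch(?:\.\.\.|\.)$"
)


def ovs_events(log_text):
    """Return (timestamp, 'Stopping' | 'Finished') tuples in log order."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((match.group(1), match.group(2)))
    return events


def restart_cycles(events):
    """Pair each 'Stopping' with the next 'Finished' as one restart cycle."""
    cycles = []
    stop_time = None
    for stamp, kind in events:
        if kind == "Stopping":
            stop_time = stamp
        elif kind == "Finished" and stop_time is not None:
            cycles.append((stop_time, stamp))
            stop_time = None
    return cycles


if __name__ == "__main__":
    for stopped, finished in restart_cycles(ovs_events(SAMPLE_LOG)):
        print(f"OVS restart: stopped {stopped}, back at {finished}")
```

Running the same functions over the full /var/log/syslog of an affected node would show whether every node went through the same double restart.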
 
Jun 8, 2016
343
65
53
46
Johannesburg, South Africa
We were up to date as of Friday morning, 7:11 am (GMT+2). Updates released since then break OvS when running:
Code:
apt-get update; apt-get -y dist-upgrade;

We're running ifupdown2 with /etc/network/interfaces as follows:
Code:
auto lo
iface lo inet loopback

auto vmbr0
allow-ovs vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports bond0 vlan1 vlan100
        ovs_mtu 9216

auto ether0
allow-vmbr0 ether0
iface ether0 inet manual

auto ether1
allow-vmbr0 ether1
iface ether1 inet manual

auto bond0
allow-vmbr0 bond0
iface bond0 inet manual
        ovs_bridge vmbr0
        ovs_type OVSBond
        ovs_bonds ether0 ether1
        ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast tag=1 vlan_mode=native-untagged
        ovs_mtu 9216

auto vlan1
allow-vmbr0 vlan1
iface vlan1 inet dhcp
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=1
        ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
        hwaddress 82:de:ad:be:ef:22
        ovs_mtu 1500

auto vlan100
allow-vmbr0 vlan100
iface vlan100 inet static
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=100
        ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
        hwaddress 7a:de:ad:be:ef:c8
        address 10.255.1.3
        netmask 255.255.255.0
        ovs_mtu 9216

auto ether2
iface ether2 inet manual

auto ether3
iface ether3 inet manual

auto ether4
iface ether4 inet manual

auto ether5
iface ether5 inet manual

This essentially uses OvS to create an LACP bond interface (bond0) from ether0 and ether1 and attaches it to the vmbr0 bridge. VLANs 1 (front end) and 100 (Ceph and Corosync) connect to the same bridge. PS: ether0 and ether1 are 10 Gbps, whereas ether2 to ether5 are 1 Gbps.
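
To double-check which ports actually hang off an OVS bridge in a file like the one above, a small Python sketch along these lines can help. It is only a rough reader for the ovs_bridge / ovs_bonds keywords used in this config, not a full ifupdown2 parser, and the function names and the shortened sample are made up for illustration.

```python
from collections import defaultdict

# Shortened copy of the interfaces file above (MTU, hwaddress etc. left out).
SAMPLE = """\
auto vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports bond0 vlan1 vlan100

auto bond0
iface bond0 inet manual
        ovs_bridge vmbr0
        ovs_type OVSBond
        ovs_bonds ether0 ether1

auto vlan1
iface vlan1 inet dhcp
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=1

auto vlan100
iface vlan100 inet static
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=100
"""


def parse_stanzas(text):
    """Return {iface_name: {option: value}} for every 'iface' stanza."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if key == "iface" and value:
            # First word after 'iface' is the interface name.
            current = stanzas.setdefault(value.split()[0], {})
        elif key in ("auto", "source") or key.startswith("allow-"):
            # These lines end the previous stanza.
            current = None
        elif current is not None:
            current[key] = value
    return stanzas


def bridge_members(stanzas):
    """Group interfaces by the bridge named in their ovs_bridge option."""
    members = defaultdict(list)
    for name, opts in stanzas.items():
        bridge = opts.get("ovs_bridge")
        if bridge:
            members[bridge].append((name, opts.get("ovs_type", "?")))
    return dict(members)


def bond_slaves(stanzas):
    """Map each OVS bond to the physical interfaces listed in ovs_bonds."""
    return {n: o["ovs_bonds"].split() for n, o in stanzas.items() if "ovs_bonds" in o}


if __name__ == "__main__":
    parsed = parse_stanzas(SAMPLE)
    print(bridge_members(parsed))
    print(bond_slaves(parsed))
```

With this layout every host-facing address (vlan1, vlan100) sits on the same OVS bridge, so anything that restarts OVS takes the front end, Ceph and Corosync traffic down together.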


Pre-upgrade state (patched as of Friday morning):
Code:
[admin@kvm1c ~]# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.1-11 (running version: 7.1-11/8d529482)
pve-kernel-helper: 7.1-13
pve-kernel-5.13: 7.1-9
pve-kernel-5.13.19-6-pve: 5.13.19-14
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-5
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
openvswitch-switch: 2.15.0+ds1-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1


Logs of upgrade:
Code:
[admin@kvm1c ~]# apt-get update; apt-get -y dist-upgrade; apt-get autoremove; apt-get autoclean;
Get:1 http://security.debian.org bullseye-security InRelease [44.1 kB]
Hit:2 http://ftp.debian.org/debian bullseye InRelease
Get:3 http://ftp.debian.org/debian bullseye-updates InRelease [39.4 kB]
Get:4 http://security.debian.org bullseye-security/main amd64 Packages [123 kB]
Hit:5 http://download.proxmox.com/debian/ceph-pacific bullseye InRelease
Hit:6 https://enterprise.proxmox.com/debian/pve bullseye InRelease
Fetched 207 kB in 2s (112 kB/s)
<snip>
The following packages will be upgraded:
  base-files dirmngr gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server gpgconf gpgsm gpgv gtk-update-icon-cache intel-microcode
  libarchive13 libc-bin libc-dev-bin libc-devtools libc-l10n libc6 libc6-dev libflac8 libnss-systemd libpam-systemd libssl1.1 libsystemd0 libudev1 libxml2
  linux-libc-dev locales nscd openssl openvswitch-common openvswitch-switch systemd systemd-sysv sysvinit-utils tasksel tasksel-data tzdata udev usb.ids
42 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 38.4 MB of archives.
After this operation, 162 kB disk space will be freed.
Get:1 http://ftp.debian.org/debian bullseye/main amd64 base-files amd64 11.1+deb11u3 [70.1 kB]
Get:2 http://ftp.debian.org/debian bullseye/main amd64 libc-devtools amd64 2.31-13+deb11u3 [245 kB]
Get:3 http://ftp.debian.org/debian bullseye/main amd64 libc6-dev amd64 2.31-13+deb11u3 [2,348 kB]
Get:4 http://ftp.debian.org/debian bullseye/main amd64 libc-dev-bin amd64 2.31-13+deb11u3 [275 kB]
Get:5 http://ftp.debian.org/debian bullseye/main amd64 linux-libc-dev amd64 5.10.106-1 [1,470 kB]
Get:6 http://ftp.debian.org/debian bullseye/main amd64 libc6 amd64 2.31-13+deb11u3 [2,811 kB]
Get:7 http://ftp.debian.org/debian bullseye/main amd64 libc-bin amd64 2.31-13+deb11u3 [821 kB]
Get:8 http://ftp.debian.org/debian bullseye/main amd64 sysvinit-utils amd64 2.96-7+deb11u1 [25.6 kB]
Get:9 http://ftp.debian.org/debian bullseye/main amd64 libnss-systemd amd64 247.3-7 [198 kB]
Get:10 http://ftp.debian.org/debian bullseye/main amd64 libsystemd0 amd64 247.3-7 [376 kB]
Get:11 http://ftp.debian.org/debian bullseye/main amd64 libpam-systemd amd64 247.3-7 [283 kB]
Get:12 http://ftp.debian.org/debian bullseye/main amd64 systemd amd64 247.3-7 [4,500 kB]
Get:13 http://ftp.debian.org/debian bullseye/main amd64 udev amd64 247.3-7 [1,464 kB]
Get:14 http://ftp.debian.org/debian bullseye/main amd64 libudev1 amd64 247.3-7 [168 kB]
Get:15 http://ftp.debian.org/debian bullseye/main amd64 systemd-sysv amd64 247.3-7 [113 kB]
Get:16 http://ftp.debian.org/debian bullseye/main amd64 tasksel-data all 3.68+deb11u1 [18.0 kB]
Get:17 http://ftp.debian.org/debian bullseye/main amd64 tasksel all 3.68+deb11u1 [101 kB]
Get:18 http://ftp.debian.org/debian bullseye/main amd64 gpgv amd64 2.2.27-2+deb11u1 [626 kB]
Get:19 http://ftp.debian.org/debian bullseye/main amd64 libssl1.1 amd64 1.1.1n-0+deb11u1 [1,557 kB]
Get:20 http://ftp.debian.org/debian bullseye-updates/main amd64 tzdata all 2021a-1+deb11u3 [285 kB]
Get:21 http://ftp.debian.org/debian bullseye/main amd64 libc-l10n all 2.31-13+deb11u3 [863 kB]
Get:22 http://ftp.debian.org/debian bullseye/main amd64 locales all 2.31-13+deb11u3 [4,084 kB]
Get:23 http://ftp.debian.org/debian bullseye/main amd64 gpgsm amd64 2.2.27-2+deb11u1 [645 kB]
Get:24 http://ftp.debian.org/debian bullseye/main amd64 gpg-wks-client amd64 2.2.27-2+deb11u1 [524 kB]
Get:25 http://ftp.debian.org/debian bullseye/main amd64 gpg-wks-server amd64 2.2.27-2+deb11u1 [516 kB]
Get:26 http://ftp.debian.org/debian bullseye/main amd64 gpg amd64 2.2.27-2+deb11u1 [928 kB]
Get:27 http://ftp.debian.org/debian bullseye/main amd64 gnupg-utils amd64 2.2.27-2+deb11u1 [905 kB]
Get:28 http://ftp.debian.org/debian bullseye/main amd64 gnupg-l10n all 2.2.27-2+deb11u1 [1,085 kB]
Get:29 http://ftp.debian.org/debian bullseye/main amd64 dirmngr amd64 2.2.27-2+deb11u1 [763 kB]
Get:30 http://ftp.debian.org/debian bullseye/main amd64 gnupg all 2.2.27-2+deb11u1 [825 kB]
Get:31 http://ftp.debian.org/debian bullseye/main amd64 gpg-agent amd64 2.2.27-2+deb11u1 [669 kB]
Get:32 http://ftp.debian.org/debian bullseye/main amd64 gpgconf amd64 2.2.27-2+deb11u1 [548 kB]
Get:33 http://ftp.debian.org/debian bullseye/main amd64 gtk-update-icon-cache amd64 3.24.24-4+deb11u2 [88.2 kB]
Get:34 http://ftp.debian.org/debian bullseye/non-free amd64 intel-microcode amd64 3.20220207.1~deb11u1 [3,845 kB]
Get:35 http://ftp.debian.org/debian bullseye/main amd64 libxml2 amd64 2.9.10+dfsg-6.7+deb11u1 [693 kB]
Get:36 http://ftp.debian.org/debian bullseye/main amd64 libarchive13 amd64 3.4.3-2+deb11u1 [343 kB]
Get:37 http://ftp.debian.org/debian bullseye/main amd64 libflac8 amd64 1.3.3-2+deb11u1 [112 kB]
Get:38 http://ftp.debian.org/debian bullseye/main amd64 nscd amd64 2.31-13+deb11u3 [290 kB]
Get:39 http://ftp.debian.org/debian bullseye/main amd64 openssl amd64 1.1.1n-0+deb11u1 [853 kB]
Get:40 http://ftp.debian.org/debian bullseye/main amd64 openvswitch-common amd64 2.15.0+ds1-2+deb11u1 [1,773 kB]
Get:41 http://ftp.debian.org/debian bullseye/main amd64 openvswitch-switch amd64 2.15.0+ds1-2+deb11u1 [54.6 kB]
Get:42 http://ftp.debian.org/debian bullseye/main amd64 usb.ids all 2022.02.15-0+deb11u1 [205 kB]
Fetched 38.4 MB in 4s (10.9 MB/s)
<snip>
apt-listchanges: Reading changelogs...
Extracting templates from packages: 100%
Preconfiguring packages ...
(Reading database ... 64776 files and directories currently installed.)
Preparing to unpack .../base-files_11.1+deb11u3_amd64.deb ...
<snip>
Setting up dirmngr (2.2.27-2+deb11u1) ...
Setting up gpg-wks-server (2.2.27-2+deb11u1) ...
Setting up openvswitch-common (2.15.0+ds1-2+deb11u1) ...
update-alternatives: updating alternative /usr/lib/openvswitch-common/ovs-vswitchd because link group ovs-vswitchd has changed slave links
Setting up libc6-dev:amd64 (2.31-13+deb11u3) ...
Setting up openvswitch-switch (2.15.0+ds1-2+deb11u1) ...
ovs-vswitchd.service is a disabled or a static unit not running, not starting it.
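
The package list in the upgrade log above is where openvswitch-common and openvswitch-switch show up. As a rough sketch, assuming apt-get output in the same "The following packages will be upgraded:" layout as shown in this log, a few lines of Python can flag a pending run that touches the OVS packages. The function names and the shortened sample output are hypothetical.

```python
# Shortened, illustrative sample in the same layout as the apt-get output above.
SAMPLE_APT_OUTPUT = """\
The following packages will be upgraded:
  base-files dirmngr gnupg libc6 libssl1.1 openssl
  openvswitch-common openvswitch-switch systemd udev
10 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
"""


def upgraded_packages(apt_output):
    """Collect the indented package names under the 'will be upgraded' header."""
    packages = []
    in_block = False
    for line in apt_output.splitlines():
        if line.startswith("The following packages will be upgraded:"):
            in_block = True
            continue
        if in_block:
            if line.startswith(" "):
                packages.extend(line.split())
            else:
                # First non-indented line ends the package list.
                break
    return packages


def ovs_packages(apt_output):
    """Return only the packages whose name starts with 'openvswitch'."""
    return [p for p in upgraded_packages(apt_output) if p.startswith("openvswitch")]


if __name__ == "__main__":
    hits = ovs_packages(SAMPLE_APT_OUTPUT)
    if hits:
        print("Upgrade touches Open vSwitch:", ", ".join(hits))
```

This only warns that the run includes OVS packages; it does not prevent the service restart itself.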
 
Connecting to the host out of band (iDRAC), we can initiate a controlled restart, after which everything works perfectly.

State after reboot:
Code:
[admin@kvm1c ~]# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.1-11 (running version: 7.1-11/8d529482)
pve-kernel-helper: 7.1-13
pve-kernel-5.13: 7.1-9
pve-kernel-5.13.19-6-pve: 5.13.19-14
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-5
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
openvswitch-switch: 2.15.0+ds1-2+deb11u1
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1


Any hints on how we could avoid losing network connectivity during update installation?
 
This does appear to be an upstream Debian issue. What's odd is that the Proxmox nodes don't fence...

Sample Debian 11.2 VM where OvS also dies when upgrading via apt-get:
Code:
[root@mininet ~]# cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback

allow-vmbr0 eth0
iface eth0 inet manual
        ovs_bridge vmbr0
        ovs_type OVSPort
        ovs_options tag=1 vlan_mode=native-untagged
        mtu 9000

auto vmbr0
allow-ovs vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports eth0 vlan1
        mtu 9000

allow-vmbr0 vlan1
iface vlan1 inet dhcp
        ovs_type OVSIntPort
        ovs_bridge vmbr0
        ovs_options tag=1
        ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
        hwaddress 56:de:ad:be:ef:2d
        mtu 1500


Upgrade process:
Code:
[root@mininet ~]# apt-get update; apt-get -y dist-upgrade; apt-get autoremove; apt-get autoclean;
<snip>
The following NEW packages will be installed:
  linux-headers-5.10.0-13-amd64 linux-headers-5.10.0-13-common linux-image-5.10.0-13-amd64
The following packages will be upgraded:
  base-files bind9-host bind9-libs containerd.io dirmngr docker-ce docker-ce-cli docker-ce-rootless-extras gnupg gnupg-l10n gnupg-utils gnupg2 gpg gpg-agent
  gpg-wks-client gpg-wks-server gpgconf gpgsm gpgv gtk-update-icon-cache intel-microcode libc-bin libc-dev-bin libc-devtools libc-l10n libc6 libc6-dev
  libmariadb3 libnss-systemd libpam-systemd libssl1.1 libsystemd0 libtiff5 libudev1 libxml2 linux-compiler-gcc-10-x86 linux-headers-amd64 linux-image-amd64
  linux-kbuild-5.10 linux-libc-dev locales mariadb-common openssl openvswitch-common openvswitch-switch systemd systemd-sysv systemd-timesyncd sysvinit-utils
  task-english tasksel tasksel-data tzdata udev usb.ids
55 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 198 MB of archives.
After this operation, 356 MB of additional disk space will be used.
Get:1 http://security.debian.org/debian-security bullseye-security/main amd64 libtiff5 amd64 4.2.0-1+deb11u1 [289 kB]
Get:2 http://ftp.debian.org/debian bullseye/main amd64 base-files amd64 11.1+deb11u3 [70.1 kB]
Get:3 http://ftp.debian.org/debian bullseye/main amd64 libc-devtools amd64 2.31-13+deb11u3 [245 kB]
Get:4 http://ftp.debian.org/debian bullseye/main amd64 libc6-dev amd64 2.31-13+deb11u3 [2,348 kB]
Get:5 https://download.docker.com/linux/debian bullseye/stable amd64 containerd.io amd64 1.5.11-1 [22.9 MB]
Get:6 http://ftp.debian.org/debian bullseye/main amd64 libc-dev-bin amd64 2.31-13+deb11u3 [275 kB]
Get:7 http://ftp.debian.org/debian bullseye/main amd64 linux-libc-dev amd64 5.10.106-1 [1,470 kB]
Get:8 http://ftp.debian.org/debian bullseye/main amd64 libc6 amd64 2.31-13+deb11u3 [2,811 kB]
Get:9 http://ftp.debian.org/debian bullseye/main amd64 libc-bin amd64 2.31-13+deb11u3 [821 kB]
Get:10 http://ftp.debian.org/debian bullseye/main amd64 sysvinit-utils amd64 2.96-7+deb11u1 [25.6 kB]
Get:11 http://ftp.debian.org/debian bullseye/main amd64 libnss-systemd amd64 247.3-7 [198 kB]
Get:12 http://ftp.debian.org/debian bullseye/main amd64 libsystemd0 amd64 247.3-7 [376 kB]
Get:13 http://ftp.debian.org/debian bullseye/main amd64 systemd-timesyncd amd64 247.3-7 [131 kB]
Get:14 http://ftp.debian.org/debian bullseye/main amd64 libpam-systemd amd64 247.3-7 [283 kB]
Get:15 http://ftp.debian.org/debian bullseye/main amd64 systemd amd64 247.3-7 [4,500 kB]
Get:16 http://ftp.debian.org/debian bullseye/main amd64 udev amd64 247.3-7 [1,464 kB]
Get:17 http://ftp.debian.org/debian bullseye/main amd64 libudev1 amd64 247.3-7 [168 kB]
Get:18 http://ftp.debian.org/debian bullseye/main amd64 systemd-sysv amd64 247.3-7 [113 kB]
Get:19 http://ftp.debian.org/debian bullseye/main amd64 libc-l10n all 2.31-13+deb11u3 [863 kB]
Get:20 http://ftp.debian.org/debian bullseye/main amd64 locales all 2.31-13+deb11u3 [4,084 kB]
Get:21 http://ftp.debian.org/debian bullseye/main amd64 task-english all 3.68+deb11u1 [956 B]
Get:22 http://ftp.debian.org/debian bullseye/main amd64 tasksel-data all 3.68+deb11u1 [18.0 kB]
Get:23 http://ftp.debian.org/debian bullseye/main amd64 tasksel all 3.68+deb11u1 [101 kB]
Get:24 http://ftp.debian.org/debian bullseye/main amd64 gpgv amd64 2.2.27-2+deb11u1 [626 kB]
Get:25 http://ftp.debian.org/debian bullseye/main amd64 libssl1.1 amd64 1.1.1n-0+deb11u1 [1,557 kB]
Get:26 http://ftp.debian.org/debian bullseye-updates/main amd64 tzdata all 2021a-1+deb11u3 [285 kB]
Get:27 http://ftp.debian.org/debian bullseye/main amd64 libxml2 amd64 2.9.10+dfsg-6.7+deb11u1 [693 kB]
Get:28 http://ftp.debian.org/debian bullseye/main amd64 bind9-libs amd64 1:9.16.27-1~deb11u1 [1,413 kB]
Get:29 https://download.docker.com/linux/debian bullseye/stable amd64 docker-ce-cli amd64 5:20.10.14~3-0~debian-bullseye [41.0 MB]
Get:30 http://ftp.debian.org/debian bullseye/main amd64 bind9-host amd64 1:9.16.27-1~deb11u1 [302 kB]
Get:31 http://ftp.debian.org/debian bullseye/main amd64 gpgsm amd64 2.2.27-2+deb11u1 [645 kB]
Get:32 http://ftp.debian.org/debian bullseye/main amd64 gpg-wks-client amd64 2.2.27-2+deb11u1 [524 kB]
Get:33 http://ftp.debian.org/debian bullseye/main amd64 gpg-wks-server amd64 2.2.27-2+deb11u1 [516 kB]
Get:34 http://ftp.debian.org/debian bullseye/main amd64 gpg amd64 2.2.27-2+deb11u1 [928 kB]
Get:35 http://ftp.debian.org/debian bullseye/main amd64 gnupg-utils amd64 2.2.27-2+deb11u1 [905 kB]
Get:36 http://ftp.debian.org/debian bullseye/main amd64 gnupg-l10n all 2.2.27-2+deb11u1 [1,085 kB]
Get:37 http://ftp.debian.org/debian bullseye/main amd64 dirmngr amd64 2.2.27-2+deb11u1 [763 kB]
Get:38 http://ftp.debian.org/debian bullseye/main amd64 gnupg all 2.2.27-2+deb11u1 [825 kB]
Get:39 http://ftp.debian.org/debian bullseye/main amd64 gpg-agent amd64 2.2.27-2+deb11u1 [669 kB]
Get:40 http://ftp.debian.org/debian bullseye/main amd64 gpgconf amd64 2.2.27-2+deb11u1 [548 kB]
Get:41 http://ftp.debian.org/debian bullseye/main amd64 gnupg2 all 2.2.27-2+deb11u1 [434 kB]
Get:42 http://ftp.debian.org/debian bullseye/main amd64 gtk-update-icon-cache amd64 3.24.24-4+deb11u2 [88.2 kB]
Get:43 http://ftp.debian.org/debian bullseye/non-free amd64 intel-microcode amd64 3.20220207.1~deb11u1 [3,845 kB]
Get:44 http://ftp.debian.org/debian bullseye/main amd64 mariadb-common all 1:10.5.15-0+deb11u1 [36.7 kB]
Get:45 http://ftp.debian.org/debian bullseye/main amd64 libmariadb3 amd64 1:10.5.15-0+deb11u1 [176 kB]
Get:46 http://ftp.debian.org/debian bullseye/main amd64 linux-compiler-gcc-10-x86 amd64 5.10.106-1 [439 kB]
Get:47 http://ftp.debian.org/debian bullseye/main amd64 linux-headers-5.10.0-13-common all 5.10.106-1 [8,947 kB]
Get:48 http://ftp.debian.org/debian bullseye/main amd64 linux-kbuild-5.10 amd64 5.10.106-1 [681 kB]
Get:49 http://ftp.debian.org/debian bullseye/main amd64 linux-headers-5.10.0-13-amd64 amd64 5.10.106-1 [961 kB]
Get:50 http://ftp.debian.org/debian bullseye/main amd64 linux-headers-amd64 amd64 5.10.106-1 [1,180 B]
Get:51 http://ftp.debian.org/debian bullseye/main amd64 linux-image-5.10.0-13-amd64 amd64 5.10.106-1 [53.8 MB]
Get:52 https://download.docker.com/linux/debian bullseye/stable amd64 docker-ce amd64 5:20.10.14~3-0~debian-bullseye [20.9 MB]
Get:53 https://download.docker.com/linux/debian bullseye/stable amd64 docker-ce-rootless-extras amd64 5:20.10.14~3-0~debian-bullseye [7,922 kB]
Get:54 http://ftp.debian.org/debian bullseye/main amd64 linux-image-amd64 amd64 5.10.106-1 [1,484 B]
Get:55 http://ftp.debian.org/debian bullseye/main amd64 openssl amd64 1.1.1n-0+deb11u1 [853 kB]
Get:56 http://ftp.debian.org/debian bullseye/main amd64 openvswitch-common amd64 2.15.0+ds1-2+deb11u1 [1,773 kB]
Get:57 http://ftp.debian.org/debian bullseye/main amd64 openvswitch-switch amd64 2.15.0+ds1-2+deb11u1 [54.6 kB]
Get:58 http://ftp.debian.org/debian bullseye/main amd64 usb.ids all 2022.02.15-0+deb11u1 [205 kB]
Fetched 198 MB in 23s (8,509 kB/s)
<snip>
Unpacking openssl (1.1.1n-0+deb11u1) over (1.1.1k-1+deb11u2) ...
Preparing to unpack .../32-openvswitch-common_2.15.0+ds1-2+deb11u1_amd64.deb ...
Unpacking openvswitch-common (2.15.0+ds1-2+deb11u1) over (2.15.0+ds1-2) ...
Preparing to unpack .../33-openvswitch-switch_2.15.0+ds1-2+deb11u1_amd64.deb ...
Unpacking openvswitch-switch (2.15.0+ds1-2+deb11u1) over (2.15.0+ds1-2) ...
Preparing to unpack .../34-usb.ids_2022.02.15-0+deb11u1_all.deb ...
<snip>
Setting up gpg-wks-server (2.2.27-2+deb11u1) ...
Setting up openvswitch-common (2.15.0+ds1-2+deb11u1) ...
update-alternatives: updating alternative /usr/lib/openvswitch-common/ovs-vswitchd because link group ovs-vswitchd has changed slave links
Setting up libc6-dev:amd64 (2.31-13+deb11u3) ...
Setting up bind9-host (1:9.16.27-1~deb11u1) ...
Setting up openvswitch-switch (2.15.0+ds1-2+deb11u1) ...
ovs-vswitchd.service is a disabled or a static unit not running, not starting it.
<lose networking>
 
Jun 8, 2016
343
65
53
46
Johannesburg, South Africa
The following appears to work but is very hacky:

Open two sessions: one to trigger a power-down/power-up cycle in 5 minutes' time, and another where you run the updates in a screen session:
Code:
# Session 1: enable SysRq, set an RTC wake alarm 5:20 minutes from now,
# sleep 5 minutes, then power off immediately.
echo 1 > /proc/sys/kernel/sysrq;
delay=5;
date '+%s' -d "+ $delay minutes + 20 seconds" > /sys/class/rtc/rtc0/wakealarm \
  && sleep $((delay*60)) \
  && echo o > /proc/sysrq-trigger;

# Session 2: run the updates inside screen.
screen
apt-get update; apt-get -y dist-upgrade; apt-get autoremove; apt-get autoclean;

You could also replace 'echo o' with 'echo b' to initiate a reset without power cycling. We have some quirky systems where only 2 of the 4 DIMMs initialise on a reboot (init 6 or reset), whereas all 4 initialise and work perfectly on every cold start.
 

ednt

Active Member
Mar 16, 2017
96
7
28
As I have written:
Code:
/etc/init.d/networking stop
/etc/init.d/networking start
was enough to activate the network again.
No reboot was required.
(In our case)
 

mira

Proxmox Staff Member
Staff member
Aug 1, 2018
2,109
253
103
A simple ifreload -a might also be enough and shouldn't interfere with VM network connectivity.
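
For anyone who wants both recovery options from this thread in one place, here is a minimal sketch, assuming you are on a local or out-of-band console as root. `restore_network` is a made-up helper name. It tries `ifreload -a` first (only present when ifupdown2 is installed) and falls back to the init script stop/start that was reported to work above. Whether `ifreload -a` alone is enough after this particular upgrade has not been confirmed in this thread.

```shell
#!/bin/sh
# Sketch only: restore_network is a hypothetical helper, not a Proxmox tool.
restore_network() {
    # Gentler option first: reload the interface configuration with ifupdown2.
    if command -v ifreload >/dev/null 2>&1 && ifreload -a; then
        echo "reloaded with ifreload -a"
        return 0
    fi
    # Fallback reported to work in this thread: stop and start networking.
    /etc/init.d/networking stop
    /etc/init.d/networking start
    echo "restarted via /etc/init.d/networking"
}
```

Nothing runs until you call `restore_network` yourself, so the function can be pasted into a shell before starting the upgrade.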

Thanks for also testing Debian directly!
 

caramb

Member
Oct 19, 2020
8
5
8
Hi there,

I would just like to suggest reconsidering the severity of this issue, as it may lead to some kind of disaster in some setups.

As already stated by others, for some reason the update does not consider it necessary to restart Open vSwitch after upgrading it:
Code:
Setting up openvswitch-switch (2.15.0+ds1-2+deb11u1) ...
ovs-vswitchd.service is a disabled or a static unit not running, not starting it.

On hyperconverged nodes (corosync + Ceph) using Open vSwitch, this leads to:
- Loss of access to the node unless you have an out-of-band console.
- The node fencing itself due to the complete loss of communication, and rebooting while it is still applying updates.
- Dirty shutdown of all running VMs on this specific node (as storage is on Ceph), with possible guest filesystem consistency problems, loss of data, and service disruption. (HA doesn't help here, as it performs cold boots of the VMs; no network means no way to migrate.)

I got the same behaviour on two different clusters (non-critical ones).
As I never update nodes simultaneously, quorum was maintained (corosync and Ceph).
Of course, after the node power-cycled, the cluster recovered, as did Ceph, but the workloads got hit and that's really bad.

Remaining questions:
1) Is this a pure Debian issue? Or could it be a side effect of a Proxmox-specific way of controlling network configuration on top of Debian?
2) How can I safely upgrade my remaining nodes without migrating the VMs? (If the answer to question 1 is Debian, I guess I have to wait and see.)
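
Regarding question 2, one cautious pre-check (a sketch under assumptions, not a confirmed fix) is to simulate the upgrade and see whether it would touch any Open vSwitch package before running it for real. `ovs_in_upgrade` is a made-up helper name; it relies on `apt-get -s` printing one `Inst <package> ...` line per package it would install or upgrade.

```shell
#!/bin/sh
# Sketch only: ovs_in_upgrade is a hypothetical helper.
# Reads the output of a simulated upgrade on stdin and succeeds when an
# openvswitch package (e.g. openvswitch-switch) would be upgraded.
ovs_in_upgrade() {
    grep -q '^Inst openvswitch'
}

# Usage (simulation only, nothing gets installed):
#   if apt-get -s dist-upgrade | ovs_in_upgrade; then
#       echo "Open vSwitch would be upgraded: keep a console at hand"
#   fi
```

If the check fires, an untested option would be to run `apt-mark hold openvswitch-switch openvswitch-common` first, upgrade everything else, and do the Open vSwitch upgrade later with console access available.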


Regards.
 

caramb

Member
Oct 19, 2020
8
5
8
Hi,

I'm replying to myself to provide additional details.

Today I figured out that the Proxmox node did not properly recover from the "fencing while updating" situation.

Trying to perform a regular (apt) upgrade of the node gives an error:
Code:
Starting system upgrade: apt-get dist-upgrade
E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem.

System not fully up to date (found 3 new packages)

starting shell

I ran the command:
Code:
dpkg --configure -a
Setting up pve-ha-manager (3.3-3) ...
watchdog-mux.service is a disabled or a static unit, not starting it.
Processing triggers for libc-bin (2.31-13+deb11u3) ...
Processing triggers for initramfs-tools (0.140) ...
update-initramfs: Generating /boot/initrd.img-5.13.19-6-pve
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
Processing triggers for pve-manager (7.1-12) ...

The initrd was altered, so I guess I'll have to reboot again, even though it seems I'm already running the updated kernel:
Code:
uname -a
Linux yll-th2-hci-04 5.13.19-6-pve #1 SMP PVE 5.13.19-15 (Tue, 29 Mar 2022 15:59:50 +0200) x86_64 GNU/Linux
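
For other nodes that got fenced mid-upgrade, the same condition can be spotted in a saved upgrade log before starting a shell by hand. This is only a sketch: `dpkg_interrupted` is a made-up helper name and `upgrade.log` is a placeholder for wherever the apt output was captured. The matched text is the error message shown above.

```shell
#!/bin/sh
# Sketch only: dpkg_interrupted is a hypothetical helper.
# Reads captured apt output on stdin and succeeds when it contains the
# "dpkg was interrupted" error.
dpkg_interrupted() {
    grep -qF "dpkg was interrupted, you must manually run 'dpkg --configure -a'"
}

# Usage (as root; upgrade.log is a placeholder file name):
#   if dpkg_interrupted < upgrade.log; then
#       dpkg --configure -a
#   fi
```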
 

uberdome

Member
Mar 19, 2019
21
1
8
As I have written:
Code:
/etc/init.d/networking stop
/etc/init.d/networking start
was enough to activate the network again.
No reboot was required.
(In our case)

After a bunch of checking - on our first node a reboot seemed to resolve everything. I recreated the issue on a second node, and your suggestion worked successfully. Thank you, that was a rough time trying to get back up.
 
