Upgraded my tiny lab cluster today: three headless mini PC nodes, two with a C2930 and one with an N3160. Identical drives and RAM. Dual NICs in an LACP bond managed by Open vSwitch, with multiple VLANs (including management) across the bundle.
All three had green pve7to8 reports.
All dumped networking at the same spot:
```
Selecting previously unselected package pve-kernel-6.2.16-3-pve.
Preparing to unpack .../121-pve-kernel-6.2.16-3-pve_6.2.16-3_amd64.deb ...
Unpacking pve-kernel-6.2.16-3-pve (6.2.16-3) ...
client_loop: send disconnect: Broken pipe
Mac-Pro:~ $
Progress: [ 55%] [#################################################################################################################..
```
Not knowing the status of the upgrade, I power cycled one of the boxes. Gave it 5 minutes: no pings, nothing. Rebooted the second one, same deal.
Hooked up a keyboard to the 1st node: boot failure, cannot load kernel. Same with the second box.
I had NOT power cycled the 3rd box, so I hooked up keyboard/HDMI and found it had zero networking. For some reason all the open-vswitch modules had been disabled/stopped, which killed networking, which in turn stopped the install midway through (I was upgrading over SSH). I was able to resume dpkg, which completed all the install steps, and after a reboot and lots more patching, everything appears to be back up.
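For anyone landing in the same spot, here is a minimal sketch of resuming an interrupted upgrade from the local console. It assumes the standard Debian recovery commands (`dpkg --configure -a`, then `apt-get -f install`, then `apt-get dist-upgrade`); the `run` helper and the `DRY_RUN` switch are made up for this example so the sequence can be previewed without touching the system.

```shell
#!/bin/sh
# Sketch only: resume a dist-upgrade that died when the SSH session dropped.
# DRY_RUN is a hypothetical switch for this example: when set to 1, the
# commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

resume_upgrade() {
    # Finish configuring packages that were unpacked but never configured.
    run dpkg --configure -a &&
    # Repair any dependencies left broken by the interruption.
    run apt-get -f install &&
    # Pull in whatever the interrupted run had not reached yet.
    run apt-get dist-upgrade
}
```

Preview with `DRY_RUN=1; resume_upgrade`, then run it for real as root from the console. Whether networking comes back before a reboot will depend on the state the open-vswitch services were left in.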
This leaves me with two boxes, however, that won't boot any more, both with the same "cannot load kernel" message. I pulled the boot drive and rsync'd the /boot folder over from node #3, but nodes #1 and #2 still won't boot, with the same error message.
So three things:
1. How do I get node #1 or #2 up and running again so I can complete the upgrade? (see below: https://forum.proxmox.com/threads/p...ade-leaving-node-unusable.130055/#post-570201)
2. Why is openvswitch being stopped/restarted during the upgrade process?
3. Given #2, I recommend adding a sanity check to the pve7to8 script that throws an error, so users understand that if they use open-vswitch and are not performing the upgrade from the console, they'll be left with an unusable node.
* https://forum.proxmox.com/threads/download-pve8-packages-and-continue-upgrade-offline.129804/
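To make the suggested sanity check concrete, here is a rough sketch of what it could look like. It is not part of pve7to8: the function names are hypothetical, and it assumes OVS is configured through `ovs_*` options in `/etc/network/interfaces` and that an SSH session can be recognised by the `SSH_CONNECTION` or `SSH_TTY` environment variables.

```shell
#!/bin/sh
# Hypothetical pre-upgrade check, NOT part of pve7to8.

# Succeeds if the interfaces file (default: /etc/network/interfaces)
# contains Open vSwitch options such as ovs_type or ovs_bonds.
uses_ovs() {
    grep -Eq '^[[:space:]]*ovs_(type|bridge|bonds|ports)[[:space:]]' \
        "${1:-/etc/network/interfaces}" 2>/dev/null
}

# Succeeds if this shell appears to have been started over SSH.
over_ssh() {
    [ -n "${SSH_CONNECTION:-}" ] || [ -n "${SSH_TTY:-}" ]
}

# Returns 1 with a warning when OVS networking and an SSH session coincide.
check_upgrade_session() {
    if uses_ovs "${1:-/etc/network/interfaces}" && over_ssh; then
        echo "FAIL: Open vSwitch networking detected and this is an SSH session." >&2
        echo "FAIL: networking may drop mid-upgrade; run the upgrade from the console." >&2
        return 1
    fi
    echo "PASS: no OVS-over-SSH risk detected"
}
```

Calling `check_upgrade_session` before starting `apt dist-upgrade` would have flagged all three of my nodes.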