ERR_CONNECTION_REFUSED to the web UI after kernel update

rmh240 · Feb 24, 2022

Hey everyone,

Since updating to the latest kernel, I have been experiencing a Proxmox outage. None of the VMs seem to be online. If I log onto the web UI, I get the error ERR_CONNECTION_REFUSED.

What I have done so far

I checked for duplicate IP addresses, and the server is the only one with this given address.
I can get into the server using ssh, but I can't run apt-get update && apt-get upgrade; it states that there is no internet connection.

I hope someone can help me and/or point me in the right direction.

Thanks very much already.

oguz · Feb 24, 2022

hi,

rmh240 said:
If I log onto the web UI, I get the error ERR_CONNECTION_REFUSED.

* are you connecting like https://your.ip.address.here:8006 ?

rmh240 said:
Since updating to the latest kernel, I have been experiencing a Proxmox outage.

* have you done a reboot after the kernel upgrade was complete? (if not, please do that first!)
* which version are you on? pveversion -v

rmh240 said:
I checked for duplicate IP addresses, and the server is the only one with this given address.

okay

rmh240 said:
I can get into the server using ssh, but I can't run apt-get update && apt-get upgrade; it states that there is no internet connection.

* are the pve services running? systemctl | grep pve
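
A minimal sketch of the service check asked for above, assuming a POSIX shell on the host. The helper name not_active_units is made up for illustration. It only filters systemctl output read on stdin and reports the pve services whose ACTIVE column is not "active". The web UI is served by pveproxy on port 8006, so a dead pveproxy would be consistent with ERR_CONNECTION_REFUSED.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl` unit listing on stdin and print
# every pve*.service whose ACTIVE column is not "active".
# Expected columns: UNIT LOAD ACTIVE SUB [JOB] DESCRIPTION
not_active_units() {
    awk '$1 ~ /^pve.*\.service$/ && $3 != "active" { print $1, $3 }'
}

# Usage on the host (as root):
#   systemctl --no-pager --no-legend --plain list-units --all 'pve*' | not_active_units
#   ss -tln | grep ':8006'    # pveproxy should be listening on this port
```

If the first command prints pveproxy.service, or the second prints nothing, the browser has nothing to connect to, regardless of network settings.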

rmh240 · Feb 24, 2022

Thanks so much for getting back to me so quickly.

oguz said:
* are you connecting like https://your.ip.address.here:8006 ?

Yes indeed.

oguz said:
* have you done a reboot after the kernel upgrade was complete? (if not, please do that first!)

I always reboot after a kernel update. Interestingly enough, this occurred after the reboot. To be more specific, after initiating the reboot I was never able to log on again.

oguz said:
* which version are you on? pveversion -v

proxmox-ve: 6.4-1 (running kernel: 5.4.166-1-pve)
pve-manager: 6.4-13 (running version: 6.4-13/9f411e79)
pve-kernel-5.4: 6.4-13
pve-kernel-helper: 6.4-13
pve-kernel-5.4.166-1-pve: 5.4.166-1
pve-kernel-5.4.162-1-pve: 5.4.162-2
pve-kernel-5.4.157-1-pve: 5.4.157-1
pve-kernel-5.4.151-1-pve: 5.4.151-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.5-pve2~bpo10+1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve2~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-4
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.7-pve1

oguz said:
* are the pve services running? systemctl | grep pve

Some of them are not running but I am not sure what is meant to run. Here is the printout.

Code:

  etc-pve.mount            loaded active       mounted             /etc/pve
  pve-cluster.service      loaded active       running             The Proxmox VE cluster filesystem
  pve-firewall.service     loaded deactivating final-sigterm start Proxmox VE firewall
  pve-guests.service       loaded inactive     dead          start PVE guests
  pve-ha-crm.service       loaded inactive     dead          start PVE Cluster HA Resource Manager Daemon
  pve-ha-lrm.service       loaded inactive     dead          start PVE Local HA Resource Manager Daemon
  pve-lxc-syscalld.service loaded active       running             Proxmox VE LXC Syscall Daemon
  pvebanner.service        loaded active       exited              Proxmox VE Login Banner
  pvedaemon.service        loaded deactivating final-sigterm start PVE API Daemon
  pvefw-logger.service     loaded active       running             Proxmox VE firewall logger
  pvenetcommit.service     loaded active       exited              Commit Proxmox VE network changes
  pveproxy.service         loaded inactive     dead          start PVE API Proxy Server
  pvesr.service            loaded activating   start         start Proxmox VE replication runner
  pvestatd.service         loaded deactivating final-sigterm start PVE Status Daemon
  dev-pve-swap.swap        loaded active       active              /dev/pve/swap
  pve-storage.target       loaded active       active              PVE Storage Target
  pve-daily-update.timer   loaded active       waiting             Daily PVE download activities
  pvesr.timer              loaded active       running             Proxmox VE replication runner

oguz · Feb 24, 2022

thanks for the outputs.

rmh240 said:
Some of them are not running but I am not sure what is meant to run. Here is the printout.

looks like some important services aren't working (pvestatd, pvedaemon and pveproxy)

could you also post the resulting output file from journalctl -e -u pveproxy -u pvedaemon -u pvestatd > output.txt

rmh240 · Feb 24, 2022

oguz said:
thanks for the outputs.

You are welcome, and thanks again for your time. This is very much appreciated

oguz said:
looks like some important services aren't working (pvestatd, pvedaemon and pveproxy)

That's correct. And for some reason it is shutting down some more services.

oguz said:
could you also post the resulting output file from journalctl -e -u pveproxy -u pvedaemon -u pvestatd > output.txt

I have attached the output.txt file to this reply

oguz · Feb 24, 2022

can't find much of a reason in the output, but thanks for sending.

rmh240 said:
That's correct. And for some reason it is shutting down some more services.

in that case could you also post the output from journalctl -xe -u 'pve*' > output2.txt

rmh240 · Feb 24, 2022

oguz said:
can't find much of a reason in the output, but thanks for sending.

in that case could you also post the output from journalctl -xe -u 'pve*' > output2.txt

Of course

See the file attached

oguz · Feb 24, 2022

rmh240 said:
Of course See the file attached

really weird... could you send us your whole journal? journalctl -b > journal.txt
also post the outputs from:
* cat /etc/hosts
* ping -c2 $(uname -n)
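
Some background on these two checks: the PVE cluster filesystem expects the node's hostname to resolve to a real (non-loopback) address, so a broken /etc/hosts entry can stop the pve services from starting. The following is a small sketch of that test, assuming a POSIX shell. is_loopback is a hypothetical helper name, not a Proxmox tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the given address is a
# loopback address, fail (exit 1) otherwise.
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the host:
#   addr=$(getent hosts "$(uname -n)" | awk '{ print $1; exit }')
#   if [ -z "$addr" ] || is_loopback "$addr"; then
#       echo "hostname does not resolve to a LAN address"
#   fi
```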

rmh240 · Feb 24, 2022

oguz said:
really weird... could you send us your whole journal? journalctl -b > journal.txt

Please see the journal file attached. I hope this helps as it doesn't seem to go far back in time.

oguz said:
also post the outputs from:
* cat /etc/hosts

Code:

127.0.0.1 localhost.localdomain localhost
192.168.0.10 HomeServer.local HomeServer

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

oguz said:
* ping -c2 $(uname -n)

Code:

PING HomeServer.local (192.168.0.10) 56(84) bytes of data.
64 bytes from HomeServer.local (192.168.0.10): icmp_seq=1 ttl=64 time=0.026 ms
64 bytes from HomeServer.local (192.168.0.10): icmp_seq=2 ttl=64 time=0.034 ms

--- HomeServer.local ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 23ms
rtt min/avg/max/mdev = 0.026/0.030/0.034/0.004 ms

oguz · Feb 24, 2022

rmh240 said:
Please see the journal file attached. I hope this helps as it doesn't seem to go far back in time.

thanks, that should be alright (it's the journal from this boot)

and it seems like there's a kernel oops trace in there, so we'll have to take a look at that.

in the meantime could you try downgrading or pinning to the older kernel version to see if things start working again?
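
For readers who want to look for such a trace themselves, here is a rough sketch, assuming a POSIX shell and a systemd journal. The function name find_oops and the choice of patterns are assumptions; they cover the markers the kernel commonly prints around an oops, but they are not an exhaustive list.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print, with line
# numbers, those that commonly mark a kernel oops or its stack trace.
find_oops() {
    grep -nE 'Oops:|BUG:|Call Trace:|general protection fault'
}

# Usage on the host (kernel messages of the current boot only):
#   journalctl -b -k --no-pager | find_oops
```

The lines following a match usually name the module involved, which helps narrow down the driver at fault.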

rmh240 · Feb 24, 2022

oguz said:
thanks, that should be alright (it's the journal from this boot)

and it seems like there's a kernel oops trace in there, so we'll have to take a look at that.

in the meantime could you try downgrading or pinning to the older kernel version to see if things start working again?

That's interesting. How can I do this?

oguz · Feb 24, 2022

rmh240 said:
That's interesting. How can I do this?

depends on your bootloader [0], and since you're on PVE6 it might be slightly different.

but in essence you can either:
* manually select the older kernel when booting (if you have physical access to the host)
* edit your bootloader config (grub/systemd-boot) to point to the older kernel

if you use grub check this post here [1]
and if you're using systemd-boot then you can use our tool proxmox-boot-tool

[0]: https://pve.proxmox.com/wiki/Host_Bootloader
[1]: https://forum.proxmox.com/threads/revert-to-prior-kernel.100310/#post-434580
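
A sketch of the grub route, assuming a POSIX shell and the usual Debian paths (/boot/grub/grub.cfg, /etc/default/grub). grub_titles is a hypothetical helper that only lists the menu titles; the exact titles differ between installations, so the placeholders below must be replaced with what the helper prints.

```shell
#!/bin/sh
# Hypothetical helper: read a grub.cfg on stdin and print the title of
# every submenu and menuentry, in file order.
grub_titles() {
    sed -nE "s/^[[:space:]]*(submenu|menuentry) '([^']*)'.*/\1: \2/p"
}

# Usage on the host (as root):
#   grub_titles < /boot/grub/grub.cfg
# Then, in /etc/default/grub, replace GRUB_DEFAULT=0 with the titles of
# the older kernel (placeholders, copy them exactly from the output):
#   GRUB_DEFAULT="<submenu title>><menuentry title>"
# and regenerate the boot configuration:
#   update-grub
```

Keeping the original GRUB_DEFAULT=0 line as a comment makes it easy to switch back once a fixed kernel is released.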

rmh240 · Feb 24, 2022

oguz said:
depends on your bootloader [0], and since you're on PVE6 it might be slightly different.

but in essence you can either:
* manually select the older kernel when booting (if you have physical access to the host)
* edit your bootloader config (grub/systemd-boot) to point to the older kernel

if you use grub check this post here [1]
and if you're using systemd-boot then you can use our tool proxmox-boot-tool

[0]: https://pve.proxmox.com/wiki/Host_Bootloader
[1]: https://forum.proxmox.com/threads/revert-to-prior-kernel.100310/#post-434580

Thank you very much for your assistance! This has solved this problem. Selecting the older kernel manually didn't work for me.
The grub method has worked now, and all is working again. Once there is a new kernel, I should revert the setting to GRUB_DEFAULT=0?

oguz · Feb 28, 2022

rmh240 said:
The grub method has worked now, and all is working again.

great to hear that

rmh240 said:
Once there is a new kernel, I should revert the setting to GRUB_DEFAULT=0?

yes, you could just uncomment that one and comment out the hardcoded value (just in case)

the kernel oops looks to be related to sound devices. after the oops, your system keeps booting but proceeds to timeout on almost every service since it's in an unstable state.

are you doing any kind of device passthrough to your VMs (especially sound related) ? if yes please post the corresponding VM configs here.

it would also be helpful to know what kind of hardware you have: lspci -nnk | grep -i audio -A 2

rmh240 · Mar 3, 2022

oguz said:
great to hear that

yes, you could just uncomment that one and comment out the hardcoded value (just in case)

the kernel oops looks to be related to sound devices. after the oops, your system keeps booting but proceeds to timeout on almost every service since it's in an unstable state.

are you doing any kind of device passthrough to your VMs (especially sound related) ? if yes please post the corresponding VM configs here.

I am not, although I was thinking of doing so. It is a Lenovo ThinkCentre M710 which I use to run various server products on.

oguz said:
it would also be helpful to know what kind of hardware you have: lspci -nnk | grep -i audio -A 2

I hope this helps:

Code:

00:1f.3 Audio device [0403]: Intel Corporation Cannon Lake PCH cAVS [8086:a348] (rev 10)
        Subsystem: Lenovo Cannon Lake PCH cAVS [17aa:312d]
        Kernel driver in use: sof-audio-pci
        Kernel modules: snd_hda_intel, snd_sof_pci
00:1f.4 SMBus [0c05]: Intel Corporation Cannon Lake PCH SMBus Controller [8086:a323] (rev 10)

oguz · Mar 3, 2022

thanks for the output

by the way, PVE6 will be end of life at some point, so you should think about upgrading your server to PVE 7 [0]

we also have an opt-in kernel version 5.15 [1] that could potentially work fine with your hardware (the default kernel version on PVE7 at the time of writing is 5.13)

[0]: https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0
[1]: https://forum.proxmox.com/threads/opt-in-linux-kernel-5-15-for-proxmox-ve-7-x-available.100936/

rmh240 · Mar 6, 2022

oguz said:
thanks for the output

by the way, PVE6 will be end of life at some point, so you should think about upgrading your server to PVE 7 [0]

I am aware of this, but I hadn't had the courage to upgrade so far. I took the time today and upgraded to PVE 7. Initially, after upgrading, I had an issue with VMs not booting, i.e. they were looking for the boot disk. I then reverted to the most current kernel, i.e. uncommented the original line as you suggested on another forum post, and then it all worked again as before.

oguz said:
we also have an opt-in kernel version 5.15 [1] that could potentially work fine with your hardware (the default kernel version on PVE7 at the time of writing is 5.13)

[0]: https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0
[1]: https://forum.proxmox.com/threads/opt-in-linux-kernel-5-15-for-proxmox-ve-7-x-available.100936/

Once again thank you so much for your assistance. It has helped me greatly.

oguz · Mar 8, 2022

rmh240 said:
Once again thank you so much for your assistance. It has helped me greatly.

gladly

i can't reproduce your oops trace here at the moment, and haven't found other users reporting on this (most people have already moved on to PVE7 and newer kernels).
in the meantime you can just keep booting the older working kernel until a new kernel version is available.

for testing purposes, you could try to blacklist the offending kernel modules unless you need sound on your host (see the lspci output, which lists the kernel modules in use).

you can add them to /etc/modprobe.d/blacklist.conf and reboot your machine to the buggy kernel (edit your bootloader config as before).

if you're still getting errors with that, we can then deduce that the issue is not related to your firmware/hardware but is a different bug.
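
The blacklist step could look roughly like this, assuming a POSIX shell. blacklist_lines is a hypothetical helper that just prints the config lines; the module names in the usage comment are the ones from the lspci output earlier in the thread. Whether the initramfs needs rebuilding depends on the setup, so that line is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print one "blacklist <module>" line for each
# module name passed as an argument.
blacklist_lines() {
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done
}

# Usage on the host (as root):
#   blacklist_lines snd_hda_intel snd_sof_pci >> /etc/modprobe.d/blacklist.conf
#   update-initramfs -u   # assumption: needed if the modules are in the initramfs
#   reboot                # into the buggy kernel, to see whether the oops is gone
```

To undo the test, remove the added lines from /etc/modprobe.d/blacklist.conf and reboot again.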
