Can't connect to host GUI or SSH

pharpe

Member
Jul 15, 2020
I've been running Proxmox for about a year and have had no issues. Today I had to shut down to move some wiring, and since rebooting I can no longer access the web GUI or SSH into the host. The containers and VMs are coming up and working. When I get on the terminal I cannot access any local or WAN resources. It's like something is blocking all network traffic to the host only. Any ideas on how to troubleshoot this?

I've been googling and see a lot of issues where people can connect to the host but not to VMs and containers. My issue is the opposite.

Update: One thing I'm noticing is that I get a couple seconds of connectivity on boot up. The GUI starts to load then it loses its network connection.
 
Any ideas on how to troubleshoot this?

Log in on the host and check for failed services and the journal for errors:

Bash:
# list failed units (not all of those are necessarily a problem)
systemctl list-units --failed

# check ssh daemon specifically
systemctl status sshd.service

# check current network config
ip addr

# check journal of current boot
# navigate with arrow keys or pgUp/Down for faster scrolling
# press SHIFT+G to go down all the way, g key to go up to start
journalctl -b

Search the journalctl output for any suspicious or erroneous messages. Those would be good to know to see what's going on here.
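If the full journal is too long to read through, a filter along these lines might help narrow it down. This is only a sketch: `filter_net_lines` is a made-up helper name, and the keyword list is an assumption you would adjust for your own host.

```shell
# Hypothetical helper: keep only log lines that look network-related.
# Reads log text on stdin; the keyword list is an assumption, not exhaustive.
filter_net_lines() {
    grep -Ei 'dhcp|network|vmbr|link is (up|down)'
}

# Example use on a live host:
#   journalctl -b -p err --no-pager               # only priority "err" and worse
#   journalctl -b --no-pager | filter_net_lines   # network-related lines of this boot
```

The `-p err` form is usually the quicker first look; the keyword filter is there for messages that are logged at a lower priority but still matter for networking.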
 
Output of systemctl list-units --failed:
(screenshot: systemctl list-units --failed1.jpg)

Output of systemctl status sshd.service:
(screenshot: systemctl status sshd.service1.jpg)

Output of ip addr:
(screenshot: ip addr1.jpg)

There were a number of errors in the journal:
(screenshots: journal 1a.jpg, journal 2a.jpg, journal 3a.jpg, journal 5a.jpg, journal 6a.jpg, journal 8b.jpg)
 

The last thing I did before this issue was create a pihole container. I didn't think it was related because I don't even have that container started and I'm not using it for DNS right now. I tried starting the pihole CT via the CLI. When it came up, the host suddenly got network connectivity back and I could connect to the web GUI. I'm not sure why starting a container would affect the host, though. The host isn't even using the pihole for DNS, and I don't think DNS was ever the problem because I could not ping or traceroute directly to IP addresses while the issue was happening.

So everything was working fine after I started the pihole container last night, but this morning it's back to the Proxmox host having no connectivity. Stopping and starting the container has no effect, so I'm back to square one.
 
Back to no connectivity to the host. Starting and stopping the pihole container doesn't have any effect now. If anyone could help with this I'd really appreciate it.
 
My connectivity to Proxmox started working again. After no connectivity for 20+ hours it magically starts working. I just ping it intermittently to check. I've done nothing to it since yesterday.

Anything I can diagnose now that connectivity is back? No telling how long it will stay online for.

I just did an upgrade from proxmox-ve 6.2-10 to 6.3-1 while it was working.

I'm also comparing the working state with the non-working state. Looking at the ip a output, there are four additional interfaces when it's working: entries 10-13 did not exist when there was no connectivity. Is this related?

Code:
root@pharpe:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP group default qlen 1000
    link/ether e0:d5:5e:35:44:af brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.91/24 brd 192.168.1.255 scope global noprefixroute enp3s0
       valid_lft forever preferred_lft forever
3: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether e0:d5:5e:35:44:af brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.91/24 brd 192.168.1.255 scope global noprefixroute vmbr0
       valid_lft forever preferred_lft forever
4: tap100i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UNKNOWN group default qlen 1000
    link/ether ce:61:44:a0:b0:ce brd ff:ff:ff:ff:ff:ff
    inet 169.254.17.25/16 brd 169.254.255.255 scope global noprefixroute tap100i0
       valid_lft forever preferred_lft forever
5: veth101i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether fe:47:4e:35:45:7d brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 169.254.31.45/16 brd 169.254.255.255 scope global noprefixroute veth101i0
       valid_lft forever preferred_lft forever
6: veth102i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr102i0 state UP group default qlen 1000
    link/ether fe:b6:aa:c2:0b:70 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet 169.254.189.61/16 brd 169.254.255.255 scope global noprefixroute veth102i0
       valid_lft forever preferred_lft forever
7: fwbr102i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 02:29:1c:d5:b3:a5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.181/24 brd 192.168.1.255 scope global noprefixroute fwbr102i0
       valid_lft forever preferred_lft forever
8: fwpr102p0@fwln102i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether ce:31:8a:8b:3a:a5 brd ff:ff:ff:ff:ff:ff
    inet 169.254.233.0/16 brd 169.254.255.255 scope global noprefixroute fwpr102p0
       valid_lft forever preferred_lft forever
9: fwln102i0@fwpr102p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr102i0 state UP group default qlen 1000
    link/ether 02:29:1c:d5:b3:a5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.135/24 brd 192.168.1.255 scope global noprefixroute fwln102i0
       valid_lft forever preferred_lft forever
10: veth104i0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr104i0 state UP group default qlen 1000
    link/ether fe:d0:f8:c5:d2:5e brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet 169.254.114.219/16 brd 169.254.255.255 scope global noprefixroute veth104i0
       valid_lft forever preferred_lft forever
11: fwbr104i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether da:c6:75:e5:0a:d6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.189/24 brd 192.168.1.255 scope global noprefixroute fwbr104i0
       valid_lft forever preferred_lft forever
12: fwpr104p0@fwln104i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether 52:39:eb:db:b2:64 brd ff:ff:ff:ff:ff:ff
    inet 169.254.248.110/16 brd 169.254.255.255 scope global noprefixroute fwpr104p0
       valid_lft forever preferred_lft forever
13: fwln104i0@fwpr104p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr104i0 state UP group default qlen 1000
    link/ether da:c6:75:e5:0a:d6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.187/24 brd 192.168.1.255 scope global noprefixroute fwln104i0
       valid_lft forever preferred_lft forever
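One way to make the working vs. non-working comparison less error-prone is to save `ip a` in each state and diff only the interface names. A rough sketch follows; `iface_names` and `new_ifaces` are made-up helper names, and the file names in the usage comment are placeholders.

```shell
# Hypothetical helper: print the interface names found in `ip a` output on stdin.
# Header lines look like "10: veth104i0@if2: <BROADCAST,..."; the "@peer" part is dropped.
iface_names() {
    awk -F': ' '/^[0-9]+: / { sub(/@.*/, "", $2); print $2 }' | sort
}

# Print the names that appear in the second saved output but not in the first.
new_ifaces() {
    before=$(mktemp) && after=$(mktemp) || return 1
    iface_names < "$1" > "$before"
    iface_names < "$2" > "$after"
    comm -13 "$before" "$after"
    rm -f "$before" "$after"
}

# Usage (file names are placeholders):
#   ip a > broken.txt     # while the host is unreachable
#   ip a > working.txt    # while the host is reachable
#   new_ifaces broken.txt working.txt
```

In the listing above, entries 10-13 are veth104i0, fwbr104i0, fwpr104p0 and fwln104i0. By Proxmox naming these are the virtual NIC and firewall bridge devices of guest ID 104, so they would be expected to appear whenever that guest is running.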
 
Anyone have any ideas? Is the lack of response due to me not providing the right information or is it just an issue no one knows how to solve?
 
You still have an IP conflict: the same address is on both enp3s0 and vmbr0. You can expect problems again.
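If you want to confirm that from the shell rather than by reading the listing, something like this should flag any IPv4 address bound to more than one interface. It is a sketch: `find_duplicate_ips` is a made-up name, and it assumes the one-line format of `ip -o -4 addr show` (interface name in field 2, address/prefix in field 4).

```shell
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and print
# every IPv4 address that is configured on more than one interface.
find_duplicate_ips() {
    awk '$3 == "inet" {
        split($4, parts, "/")              # strip the /prefix length
        ip = parts[1]
        if (ip in ifaces) { ifaces[ip] = ifaces[ip] "," $2 }
        else              { ifaces[ip] = $2 }
        count[ip]++
    }
    END {
        for (ip in count)
            if (count[ip] > 1) print ip, ifaces[ip]
    }' | sort
}

# Usage on the host:
#   ip -o -4 addr show | find_duplicate_ips
```

An empty result means no address is shared between interfaces; any line it prints names the address and the interfaces holding it.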
 
Any idea what might cause that? My /etc/network/interfaces file is configured as per the documentation I'm seeing. It has not been changed and worked fine for months. Now I suddenly lose connection and get it back randomly.

Code:
root@pharpe:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback

iface enp3s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.91
        netmask 255.255.255.0
        gateway 192.168.1.1
        bridge_ports enp3s0
        bridge_stp off
        bridge_fd 0

I have connectivity right now and ran pve5to6 to see if upgrading to 6 might be an option. It also reports an IP conflict:

Code:
root@pharpe:~# pve5to6
= CHECKING VERSION INFORMATION FOR PVE PACKAGES =

Checking for package updates..
WARN: updates for the following packages are available:
  zfs-initramfs, zfs-zed, libnvpair3linux, libuutil3linux, libzfs4linux, zfsutils-linux, libnvpair3linux, libuutil3linux, libzfs4linux, libzpool4linux, libzfs2linux, libzpool2linux, libnvpair1linux, libuutil1linux

Checking proxmox-ve package version..
PASS: already upgraded to Proxmox VE 6

Checking running kernel version..
PASS: expected running kernel '5.4.101-1-pve'.

= CHECKING CLUSTER HEALTH/SETTINGS =

SKIP: standalone node.

= CHECKING HYPER-CONVERGED CEPH STATUS =

SKIP: no hyper-converged ceph setup detected!

= CHECKING CONFIGURED STORAGES =

PASS: storage 'backup' enabled and active.
PASS: storage 'storage' enabled and active.
PASS: storage 'local' enabled and active.
PASS: storage 'local-zfs' enabled and active.

= MISCELLANEOUS CHECKS =

INFO: Checking common daemon services..
PASS: systemd unit 'pveproxy.service' is in state 'active'
PASS: systemd unit 'pvedaemon.service' is in state 'active'
PASS: systemd unit 'pvestatd.service' is in state 'active'
INFO: Checking for running guests..
WARN: 4 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if the local node's hostname 'pharpe' is resolvable..
INFO: Checking if resolved IP is configured on local node..
WARN: Resolved node IP '192.168.1.91' active on multiple (2) interfaces!
INFO: Check node certificate's RSA key size
PASS: Certificate 'pve-root-ca.pem' passed Debian Busters security level for TLS connections (4096 >= 2048)
PASS: Certificate 'pve-ssl.pem' passed Debian Busters security level for TLS connections (2048 >= 2048)
INFO: Checking KVM nesting support, which breaks live migration for VMs using it..
PASS: KVM nested parameter set, but currently no VM with a 'vmx' or 'svm' flag is running.

= SUMMARY =

TOTAL:    17
PASSED:   12
SKIPPED:  2
WARNINGS: 3
FAILURES: 0

ATTENTION: Please check the output for detailed information!
 
Simply change the IP address on the vmbr0 interface. I assume you have another unused IP in the 192.168.1.0/255.255.255.0 subnet, for example 192.168.1.93.
 
I changed the IP of vmbr0 to 192.168.1.74:
Code:
auto lo
iface lo inet loopback

iface enp3s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.74/24
        gateway 192.168.1.1
        bridge-ports enp3s0
        bridge-stp off
        bridge-fd 0

Rebooted. Still no connection to host.

If I run ifdown vmbr0 then it restores connectivity to the host but then I lose connectivity to all CTs and VMs.

If I bring it back up ifup vmbr0 then I lose connection to the host again but CTs do not come back. A reboot restores connection to CTs but not host.
 
vmbr0 and enp3s0 must both be up and running, but with different IP addresses.
For example, put 192.168.1.91 on enp3s0 and 192.168.1.74 on vmbr0.
 
I understand, I just don't know how to accomplish that. I changed the IP for vmbr0 in /etc/network/interfaces. Now I see two IPs for vmbr0 in ip a. Is there somewhere else I should change it?
 

Can you post your /etc/network/interfaces?
 
/etc/network/interfaces:
Code:
auto lo
iface lo inet loopback

iface enp3s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.74/24
        gateway 192.168.1.1
        bridge-ports enp3s0
        bridge-stp off
        bridge-fd 0

I tried changing the enp3s0 line to
Code:
iface enp3s0 inet dhcp
and rebooting:
Code:
auto lo
iface lo inet loopback

iface enp3s0 inet dhcp

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.74/24
        gateway 192.168.1.1
        bridge-ports enp3s0
        bridge-stp off
        bridge-fd 0

When it came up, enp3s0 was still 192.168.1.91, but vmbr0 was bound to 192.168.1.74 and 192.168.1.207. Unfortunately it still didn't work.
 

Just to sum up where I am now: with the default /etc/network/interfaces I still have only intermittent access to the Proxmox host, but all VMs and CTs are working fine. It's been pointed out that, with that setup, enp3s0 and vmbr0 are being assigned the same IP address.

To test this I tried taking down vmbr0 with ifdown vmbr0. This immediately restored connectivity to the host, but I lost all connectivity to the CTs and VMs.

If I change the IP of vmbr0, it gets two IPs (the original, which is the same as enp3s0's, and the new one). This had no effect on being able to reach the host. I also tried setting enp3s0 to DHCP and that didn't help either.

Could it be due to enp3s0 and vmbr0 having the same MAC address? I'm also seeing some log errors related to NTP. Could that be related? That's just a time server, right?
 
If you don't have a DHCP server, don't configure the NIC to get its address from DHCP. NTP is for time synchronization; it is unrelated to how IP addresses are configured on an interface.

This is a good config:

Code:
auto lo
iface lo inet loopback

iface enp3s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.74/24
        gateway 192.168.1.1
        bridge-ports enp3s0
        bridge-stp off
        bridge-fd 0

Then reboot the host.
 
That is what I have now. I rebooted but still can't get to the host. vmbr0 is showing both the .91 and .74 IP addresses.
1615243537747.png
 
No matter what I do, enp3s0 is pulling a DHCP IP address, even with iface enp3s0 inet manual. I thought I had found a similar problem where someone's interface was pulling a DHCP IP on boot, and they fixed it by adding pre-up ip addr flush dev enp3s0.

I tried this but it didn't work:

Code:
auto lo
iface lo inet loopback

iface enp3s0 inet manual

auto vmbr0
iface vmbr0 inet static
        pre-up ip addr flush dev enp3s0
        address 192.168.1.74/24
        gateway 192.168.1.1
        bridge-ports enp3s0
        bridge-stp off
        bridge-fd 0
 
To me it is simply impossible for an interface to get an IP address that you did not set.
If enp3s0 is your only NIC, this is your only config for vmbr0, and you don't have any script (I'm thinking of cloud-init or similar), the interface cannot pick up additional addresses.
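One way to hunt for such a stray script or daemon is to look for a DHCP client or network manager running alongside ifupdown. This is only a sketch and a guess at where to look; nothing in the thread confirms the cause. `find_net_managers` is a made-up name, and the list of process names is an assumption, not an exhaustive one. The 169.254.x.x link-local addresses on the tap and veth interfaces in the earlier ip a output would at least be consistent with some client trying DHCP on every interface.

```shell
# Hypothetical helper: read process names on stdin (one per line) and print
# any that are common DHCP clients or network managers.
# The name list is an assumption; `ps` truncates long names, hence "networkd?".
find_net_managers() {
    grep -Ex 'dhclient|dhcpcd[0-9]*|NetworkManager|connmand|systemd-networkd?|wicd' | sort -u
}

# Usage on the host:
#   ps -eo comm= | find_net_managers
```

Anything it prints is a candidate for whatever keeps putting 192.168.1.91 on enp3s0 despite inet manual.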
 
