Yet another no GUI thread

illuminated · Nov 10, 2020

Hi all,

I have just started the Proxmox VE (6.2) adventure on a new SoYouStart host and since the day one I've been having GUI hangs... at least when I try to open it from my own computer. In localhost the page is served:

Bash:

→ curl -s -k https://localhost:8006 | grep title
    <title>dom01 - Proxmox Virtual Environment</title>

→

I would really appreciate any ideas. Hard reset brings the GUI back, but just for a while... if I leave my computer for a few hours with the GUI tab open (doesn't matter if I am logged in or not), I just cannot open it anymore (server timeout error). Currently the GUI doesn't work, but I do have SSH open (it was opened before the hang) and working. I had a situation when even SSH wasn't working, but SoYouStart's IPMI was connecting me to the server. This is what I saw when connecting through IPMI:

https://imgur.com/GuZIHRm

Firewall is off:

Bash:

→iptables --list

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

port 8006 is open:

Bash:

→ netstat -an |grep 8006
tcp        0      0 0.0.0.0:8006            0.0.0.0:*               LISTEN

opening the page in browser (https://my.url:8006) results in connection time out (tried from latest Firefox, Chrome, Safari) and the access log shows no entries after the hang happened:

Bash:

→ tail -f /var/log/pveproxy/access.log
my.home.ip.addr - root@pam [09/11/2020:16:44:55 +0100] "GET /api2/json/cluster/resources HTTP/1.1" 200 995
my.home.ip.addr - root@pam [09/11/2020:16:44:57 +0100] "GET /api2/json/cluster/tasks HTTP/1.1" 200 898
my.home.ip.addr - - [09/11/2020:19:19:56 +0100] "GET /api2/json/nodes/dom01/status HTTP/1.1" 401 -
my.home.ip.addr - - [09/11/2020:19:20:14 +0100] "GET /api2/json/nodes/dom01/status HTTP/1.1" 401 -
my.home.ip.addr - - [09/11/2020:19:20:32 +0100] "GET /api2/json/nodes/dom01/status HTTP/1.1" 401 -
my.home.ip.addr - - [09/11/2020:19:20:50 +0100] "GET /api2/json/nodes/dom01/status HTTP/1.1" 401 -
my.home.ip.addr - - [09/11/2020:19:21:08 +0100] "GET /api2/json/nodes/dom01/status HTTP/1.1" 401 -
my.home.ip.addr - - [09/11/2020:19:21:26 +0100] "GET /api2/json/nodes/dom01/status HTTP/1.1" 401 -
my.home.ip.addr - - [09/11/2020:19:21:31 +0100] "GET /api2/json/access/domains HTTP/1.1" 200 159
127.0.0.1 - - [09/11/2020:23:01:47 +0100] "GET / HTTP/1.1" 200 2161

tcpdump also doesn't seem to see anything:

Bash:

→ tcpdump -i vmbr0 port 8006
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vmbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
2 packets received by filter
0 packets dropped by kernel

I have two VMs, both are running without any issue the entire time:

Bash:

→ qm list
      VMID NAME                      STATUS     MEM(MB)    BOOTDISK(GB) PID
       100 mail                      running    8196             600.00 1523
       101 drive                     running    16384            600.00 1586

The node itself is listed normally with pvesh, but pvecm reports an error regarding corosync:

Bash:

→ pvesh get /nodes
┌───────┬────────┬───────┬───────┬────────┬───────────┬───────────┬─────────────────────────────────────────────────────────────────────────────────────────────────┬─────────────┐
│ node  │ status │   cpu │ level │ maxcpu │    maxmem │       mem │ ssl_fingerprint                                                                                 │      uptime │
╞═══════╪════════╪═══════╪═══════╪════════╪═══════════╪═══════════╪═════════════════════════════════════════════════════════════════════════════════════════════════╪═════════════╡
│ dom01 │ online │ 1.55% │       │     16 │ 62.46 GiB │ 10.53 GiB │ 05:21:69:BA:F6:1E:83:27:E0:50:73:EA:3B:C8:2E:62:FB:3D:C4:5F:02:2B:12:2D:68:F9:24:C0:11:C5:5A:31 │ 19h 10m 51s │
└───────┴────────┴───────┴───────┴────────┴───────────┴───────────┴─────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────┘

Bash:

→ pvecm nodes
Error: Corosync config '/etc/pve/corosync.conf' does not exist - is this node part of a cluster?

By the way, I don't have a cluster, haven't created one, it's just a single node.

Corosync service is also down:

Bash:

→ service corosync status
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
Condition: start condition failed at Mon 2020-11-09 23:04:20 CET; 37min ago
           └─ ConditionPathExists=/etc/corosync/corosync.conf was not met
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview

Nov 09 04:24:18 dom01 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Nov 09 20:47:20 dom01 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Nov 09 20:49:56 dom01 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Nov 09 21:15:37 dom01 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Nov 09 21:22:55 dom01 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Nov 09 21:26:11 dom01 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Nov 09 21:31:16 dom01 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.
Nov 09 23:04:20 dom01 systemd[1]: Condition check resulted in Corosync Cluster Engine being skipped.

continued in the next post due to length limit...

illuminated · Nov 10, 2020

Other services do work OK (I have manually stopped the pve-firewall and have tried restarting all of them but the manager and guest services):

Bash:

→ systemctl status 'pve*'
● pve-lxc-syscalld.service - Proxmox VE LXC Syscall Daemon
   Loaded: loaded (/lib/systemd/system/pve-lxc-syscalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-11-09 20:50:40 CET; 2h 52min ago
 Main PID: 28207 (pve-lxc-syscall)
    Tasks: 9 (limit: 4915)
   Memory: 988.0K
   CGroup: /system.slice/pve-lxc-syscalld.service
           └─28207 /usr/lib/x86_64-linux-gnu/pve-lxc-syscalld/pve-lxc-syscalld --system /run/pve/lxc-syscalld.sock

Nov 09 20:50:40 dom01 systemd[1]: Starting Proxmox VE LXC Syscall Daemon...
Nov 09 20:50:40 dom01 systemd[1]: Started Proxmox VE LXC Syscall Daemon.

● pve-ha-crm.service - PVE Cluster HA Resource Manager Daemon
   Loaded: loaded (/lib/systemd/system/pve-ha-crm.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-11-09 20:51:03 CET; 2h 52min ago
  Process: 28303 ExecStart=/usr/sbin/pve-ha-crm start (code=exited, status=0/SUCCESS)
 Main PID: 28304 (pve-ha-crm)
    Tasks: 1 (limit: 4915)
   Memory: 88.3M
   CGroup: /system.slice/pve-ha-crm.service
           └─28304 pve-ha-crm

Nov 09 20:51:02 dom01 systemd[1]: Starting PVE Cluster HA Resource Manager Daemon...
Nov 09 20:51:03 dom01 pve-ha-crm[28304]: starting server
Nov 09 20:51:03 dom01 pve-ha-crm[28304]: status change startup => wait_for_quorum
Nov 09 20:51:03 dom01 systemd[1]: Started PVE Cluster HA Resource Manager Daemon.

● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-11-09 21:26:11 CET; 2h 17min ago
  Process: 1128 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
 Main PID: 1133 (pmxcfs)
    Tasks: 6 (limit: 4915)
   Memory: 28.3M
   CGroup: /system.slice/pve-cluster.service
           └─1133 /usr/bin/pmxcfs

Nov 09 21:26:10 dom01 systemd[1]: Starting The Proxmox VE cluster filesystem...
Nov 09 21:26:11 dom01 systemd[1]: Started The Proxmox VE cluster filesystem.

● pvebanner.service - Proxmox VE Login Banner
   Loaded: loaded (/lib/systemd/system/pvebanner.service; enabled; vendor preset: enabled)
   Active: active (exited) since Mon 2020-11-09 20:51:32 CET; 2h 51min ago
  Process: 28376 ExecStart=/usr/bin/pvebanner (code=exited, status=0/SUCCESS)
 Main PID: 28376 (code=exited, status=0/SUCCESS)

Nov 09 20:51:32 dom01 systemd[1]: Starting Proxmox VE Login Banner...
Nov 09 20:51:32 dom01 systemd[1]: Started Proxmox VE Login Banner.

● pve-ha-lrm.service - PVE Local HA Resource Manager Daemon
   Loaded: loaded (/lib/systemd/system/pve-ha-lrm.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-11-09 20:52:11 CET; 2h 51min ago
  Process: 28465 ExecStart=/usr/sbin/pve-ha-lrm start (code=exited, status=0/SUCCESS)
 Main PID: 28473 (pve-ha-lrm)
    Tasks: 1 (limit: 4915)
   Memory: 88.8M
   CGroup: /system.slice/pve-ha-lrm.service
           └─28473 pve-ha-lrm

Nov 09 20:52:10 dom01 systemd[1]: Starting PVE Local HA Resource Manager Daemon...
Nov 09 20:52:11 dom01 pve-ha-lrm[28473]: starting server
Nov 09 20:52:11 dom01 pve-ha-lrm[28473]: status change startup => wait_for_agent_lock
Nov 09 20:52:11 dom01 systemd[1]: Started PVE Local HA Resource Manager Daemon.

● pvefw-logger.service - Proxmox VE firewall logger
   Loaded: loaded (/lib/systemd/system/pvefw-logger.service; static; vendor preset: enabled)
   Active: active (running) since Mon 2020-11-09 20:48:56 CET; 2h 54min ago
  Process: 27947 ExecStart=/usr/sbin/pvefw-logger (code=exited, status=0/SUCCESS)
 Main PID: 27948 (pvefw-logger)
    Tasks: 2 (limit: 4915)
   Memory: 500.0K
   CGroup: /system.slice/pvefw-logger.service
           └─27948 /usr/sbin/pvefw-logger

Nov 09 20:48:56 dom01 systemd[1]: Starting Proxmox VE firewall logger...
Nov 09 20:48:56 dom01 pvefw-logger[27948]: starting pvefw logger
Nov 09 20:48:56 dom01 systemd[1]: Started Proxmox VE firewall logger.

● pve-storage.target - PVE Storage Target
   Loaded: loaded (/lib/systemd/system/pve-storage.target; static; vendor preset: enabled)
   Active: active since Mon 2020-11-09 04:24:16 CET; 19h ago

Nov 09 04:24:16 dom01 systemd[1]: Reached target PVE Storage Target.

● pve-daily-update.timer - Daily PVE download activities
   Loaded: loaded (/lib/systemd/system/pve-daily-update.timer; enabled; vendor preset: enabled)
   Active: active (waiting) since Mon 2020-11-09 04:24:03 CET; 19h ago
  Trigger: Tue 2020-11-10 02:08:07 CET; 2h 24min left

Nov 09 04:24:03 dom01 systemd[1]: Started Daily PVE download activities.

● pveproxy.service - PVE API Proxy Server
   Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-11-09 23:04:03 CET; 39min ago
  Process: 13568 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
  Process: 13574 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
 Main PID: 13577 (pveproxy)
    Tasks: 4 (limit: 4915)
   Memory: 139.6M
   CGroup: /system.slice/pveproxy.service
           ├─13577 pveproxy
           ├─13578 pveproxy worker
           ├─13579 pveproxy worker
           └─13580 pveproxy worker

Nov 09 23:04:01 dom01 systemd[1]: Starting PVE API Proxy Server...
Nov 09 23:04:03 dom01 pveproxy[13574]: Using '/etc/pve/local/pveproxy-ssl.pem' as certificate for the web interface.
Nov 09 23:04:03 dom01 pveproxy[13577]: starting server
Nov 09 23:04:03 dom01 pveproxy[13577]: starting 3 worker(s)
Nov 09 23:04:03 dom01 pveproxy[13577]: worker 13578 started
Nov 09 23:04:03 dom01 pveproxy[13577]: worker 13579 started
Nov 09 23:04:03 dom01 pveproxy[13577]: worker 13580 started
Nov 09 23:04:03 dom01 systemd[1]: Started PVE API Proxy Server.

● pvestatd.service - PVE Status Daemon
   Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-11-09 23:04:11 CET; 39min ago
  Process: 13595 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
 Main PID: 13604 (pvestatd)
    Tasks: 1 (limit: 4915)
   Memory: 86.6M
   CGroup: /system.slice/pvestatd.service
           └─13604 pvestatd

Nov 09 23:04:10 dom01 systemd[1]: Starting PVE Status Daemon...
Nov 09 23:04:11 dom01 pvestatd[13604]: starting server
Nov 09 23:04:11 dom01 systemd[1]: Started PVE Status Daemon.

● pve-guests.service - PVE guests
   Loaded: loaded (/lib/systemd/system/pve-guests.service; enabled; vendor preset: enabled)
   Active: active (exited) since Mon 2020-11-09 04:24:32 CET; 19h ago
  Process: 1510 ExecStartPre=/usr/share/pve-manager/helpers/pve-startall-delay (code=exited, status=0/SUCCESS)
  Process: 1511 ExecStart=/usr/bin/pvesh --nooutput create /nodes/localhost/startall (code=exited, status=0/SUCCESS)
 Main PID: 1511 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   Memory: 0B
   CGroup: /system.slice/pve-guests.service

Nov 09 04:24:22 dom01 systemd[1]: Starting PVE guests...
Nov 09 04:24:23 dom01 pve-guests[1511]: <root@pam> starting task UPID:dom01:000005E8:00000CEB:5FA8B667:startall::root@pam:
Nov 09 04:24:23 dom01 pve-guests[1513]: start VM 100: UPID:dom01:000005E9:00000CEE:5FA8B667:qmstart:100:root@pam:
Nov 09 04:24:23 dom01 pve-guests[1512]: <root@pam> starting task UPID:dom01:000005E9:00000CEE:5FA8B667:qmstart:100:root@pam:
Nov 09 04:24:23 dom01 pvesh[1511]: Starting VM 100
Nov 09 04:24:28 dom01 pvesh[1511]: Starting VM 101
Nov 09 04:24:28 dom01 pve-guests[1576]: start VM 101: UPID:dom01:00000628:00000EE6:5FA8B66C:qmstart:101:root@pam:
Nov 09 04:24:28 dom01 pve-guests[1512]: <root@pam> starting task UPID:dom01:00000628:00000EE6:5FA8B66C:qmstart:101:root@pam:
Nov 09 04:24:32 dom01 pve-guests[1511]: <root@pam> end task UPID:dom01:000005E8:00000CEB:5FA8B667:startall::root@pam: OK
Nov 09 04:24:32 dom01 systemd[1]: Started PVE guests.

● pvenetcommit.service - Commit Proxmox VE network changes
   Loaded: loaded (/lib/systemd/system/pvenetcommit.service; enabled; vendor preset: enabled)
   Active: active (exited) since Mon 2020-11-09 20:48:32 CET; 2h 54min ago
  Process: 27878 ExecStartPre=/bin/rm -f /etc/openvswitch/conf.db (code=exited, status=0/SUCCESS)
  Process: 27883 ExecStartPre=/bin/mv /etc/network/interfaces.new /etc/network/interfaces (code=exited, status=1/FAILURE)
  Process: 27886 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
 Main PID: 27886 (code=exited, status=0/SUCCESS)

Nov 09 20:48:32 dom01 systemd[1]: Starting Commit Proxmox VE network changes...
Nov 09 20:48:32 dom01 mv[27883]: /bin/mv: cannot stat '/etc/network/interfaces.new': No such file or directory
Nov 09 20:48:32 dom01 systemd[1]: Started Commit Proxmox VE network changes.

● pvedaemon.service - PVE API Daemon
   Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-11-09 23:04:21 CET; 39min ago
  Process: 13619 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
 Main PID: 13627 (pvedaemon)
    Tasks: 4 (limit: 4915)
   Memory: 126.8M
   CGroup: /system.slice/pvedaemon.service
           ├─13627 pvedaemon
           ├─13628 pvedaemon worker
           ├─13629 pvedaemon worker
           └─13630 pvedaemon worker

Nov 09 23:04:20 dom01 systemd[1]: Starting PVE API Daemon...
Nov 09 23:04:21 dom01 pvedaemon[13627]: starting server
Nov 09 23:04:21 dom01 pvedaemon[13627]: starting 3 worker(s)
Nov 09 23:04:21 dom01 pvedaemon[13627]: worker 13628 started
Nov 09 23:04:21 dom01 pvedaemon[13627]: worker 13629 started
Nov 09 23:04:21 dom01 pvedaemon[13627]: worker 13630 started
Nov 09 23:04:21 dom01 systemd[1]: Started PVE API Daemon.

● pvesr.timer - Proxmox VE replication runner
   Loaded: loaded (/lib/systemd/system/pvesr.timer; enabled; vendor preset: enabled)
   Active: active (waiting) since Mon 2020-11-09 04:24:03 CET; 19h ago
  Trigger: Mon 2020-11-09 23:44:00 CET; 29s left

Nov 09 04:24:03 dom01 systemd[1]: Started Proxmox VE replication runner.

Network interfaces seem normal to me:

Bash:

→ ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 0c:c4:7a:7b:67:92 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 0c:c4:7a:7b:67:93 brd ff:ff:ff:ff:ff:ff
4: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
    link/ether 0c:c4:7a:7b:6b:7a brd ff:ff:ff:ff:ff:ff
5: eno4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 0c:c4:7a:7b:6b:7b brd ff:ff:ff:ff:ff:ff
6: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 0c:c4:7a:7b:6b:7a brd ff:ff:ff:ff:ff:ff
    inet my.node.ip.addr/24 brd my.node.ip_with_different_end.255 scope global dynamic vmbr0
       valid_lft 57816sec preferred_lft 57816sec
    inet6 fe80::ec4:5hff:fe7b:6b7a/64 scope link
       valid_lft forever preferred_lft forever
7: tap100i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr100i0 state UNKNOWN group default qlen 1000
    link/ether 8a:4b:83:f6:88:57 brd ff:ff:ff:ff:ff:ff
8: fwbr100i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 3a:87:e4:7b:8b:05 brd ff:ff:ff:ff:ff:ff
9: fwpr100p0@fwln100i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether b2:ec:14:e1:cc:62 brd ff:ff:ff:ff:ff:ff
10: fwln100i0@fwpr100p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr100i0 state UP group default qlen 1000
    link/ether 3a:87:e4:7b:8b:05 brd ff:ff:ff:ff:ff:ff
11: tap101i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr101i0 state UNKNOWN group default qlen 1000
    link/ether 92:f2:4d:95:eb:29 brd ff:ff:ff:ff:ff:ff
12: fwbr101i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 66:ab:85:05:c3:eb brd ff:ff:ff:ff:ff:ff
13: fwpr101p0@fwln101i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether 9e:b4:b5:19:84:c6 brd ff:ff:ff:ff:ff:ff
14: fwln101i0@fwpr101p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr101i0 state UP group default qlen 1000
    link/ether 66:ab:85:05:c3:eb brd ff:ff:ff:ff:ff:ff

pveversion -v:

Bash:

→ pveversion -v
proxmox-ve: 6.2-2 (running kernel: 5.4.65-1-pve)
pve-manager: 6.2-15 (running version: 6.2-15/48bd51b6)
pve-kernel-5.4: 6.2-7
pve-kernel-helper: 6.2-7
pve-kernel-5.4.65-1-pve: 5.4.65-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-4
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-9
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 0.9.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.3-8
pve-cluster: 6.2-1
pve-container: 3.2-2
pve-docs: 6.2-6
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-6
pve-xtermjs: 4.7.0-2
pve-zsync: 2.0-3
qemu-server: 6.2-19
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve2

illuminated · Nov 10, 2020

And the last part:

Bash:

→dpkg -l | grep pve

ii  corosync                             3.0.4-pve1                      amd64        cluster engine daemon and utilities
ii  dmeventd                             2:1.02.155-pve4                 amd64        Linux Kernel Device Mapper event daemon
ii  dmsetup                              2:1.02.155-pve4                 amd64        Linux Kernel Device Mapper userspace library
ii  grub-common                          2.02+dfsg1-18-pve1              amd64        GRand Unified Bootloader (common files)
ii  grub-efi-amd64-bin                   2.02+dfsg1-18-pve1              amd64        GRand Unified Bootloader, version 2 (EFI-AMD64 modules)
ii  grub-pc                              2.02+dfsg1-18-pve1              amd64        GRand Unified Bootloader, version 2 (PC/BIOS version)
ii  grub-pc-bin                          2.02+dfsg1-18-pve1              amd64        GRand Unified Bootloader, version 2 (PC/BIOS modules)
ii  grub2-common                         2.02+dfsg1-18-pve1              amd64        GRand Unified Bootloader (common files for version 2)
ii  ifupdown                             0.8.35+pve1                     amd64        high level tools to configure network interfaces
ii  libcfg7:amd64                        3.0.4-pve1                      amd64        cluster engine CFG library
ii  libcmap4:amd64                       3.0.4-pve1                      amd64        cluster engine CMAP library
ii  libcorosync-common4:amd64            3.0.4-pve1                      amd64        cluster engine common library
ii  libcpg4:amd64                        3.0.4-pve1                      amd64        cluster engine CPG library
ii  libdevmapper-event1.02.1:amd64       2:1.02.155-pve4                 amd64        Linux Kernel Device Mapper event support library
ii  libdevmapper1.02.1:amd64             2:1.02.155-pve4                 amd64        Linux Kernel Device Mapper userspace library
ii  libknet1:amd64                       1.16-pve1                       amd64        kronosnet core switching implementation
ii  liblvm2cmd2.03:amd64                 2.03.02-pve4                    amd64        LVM2 command library
ii  libnvpair1linux                      0.8.4-pve2                      amd64        Solaris name-value library for Linux
ii  libpve-access-control                6.1-3                           all          Proxmox VE access control library
ii  libpve-apiclient-perl                3.0-3                           all          Proxmox VE API client library
ii  libpve-cluster-api-perl              6.2-1                           all          Proxmox Virtual Environment cluster Perl API modules.
ii  libpve-cluster-perl                  6.2-1                           all          Proxmox Virtual Environment cluster Perl modules.
ii  libpve-common-perl                   6.2-4                           all          Proxmox VE base library
ii  libpve-guest-common-perl             3.1-3                           all          Proxmox VE common guest-related modules
ii  libpve-http-server-perl              3.0-6                           all          Proxmox Asynchrounous HTTP Server Implementation
ii  libpve-storage-perl                  6.2-9                           all          Proxmox VE storage management library
ii  libpve-u2f-server-perl               1.1-1                           amd64        Perl bindings for libu2f-server
ii  libquorum5:amd64                     3.0.4-pve1                      amd64        cluster engine Quorum library
ii  libspice-server1:amd64               0.14.2-4~pve6+1                 amd64        Implements the server side of the SPICE protocol
ii  libuutil1linux                       0.8.4-pve2                      amd64        Solaris userland utility library for Linux
ii  libvotequorum8:amd64                 3.0.4-pve1                      amd64        cluster engine Votequorum library
ii  libzfs2linux                         0.8.4-pve2                      amd64        OpenZFS filesystem library for Linux
ii  libzpool2linux                       0.8.4-pve2                      amd64        OpenZFS pool library for Linux
ii  lvm2                                 2.03.02-pve4                    amd64        Linux Logical Volume Manager
ii  lxc-pve                              4.0.3-1                         amd64        Linux containers userspace tools
ii  lxcfs                                4.0.3-pve3                      amd64        LXC userspace filesystem
ii  novnc-pve                            1.1.0-1                         all          HTML5 VNC client
ii  pve-cluster                          6.2-1                           amd64        "pmxcfs" distributed cluster filesystem for Proxmox Virtual Environment.
ii  pve-container                        3.2-2                           all          Proxmox VE Container management tool
ii  pve-docs                             6.2-6                           all          Proxmox VE Documentation
ii  pve-edk2-firmware                    2.20200531-1                    all          edk2 based firmware modules for virtual machines
ii  pve-firewall                         4.1-3                           amd64        Proxmox VE Firewall
ii  pve-firmware                         3.1-3                           all          Binary firmware code for the pve-kernel
ii  pve-ha-manager                       3.1-1                           amd64        Proxmox VE HA Manager
ii  pve-i18n                             2.2-2                           all          Internationalization support for Proxmox VE
ii  pve-kernel-5.4                       6.2-7                           all          Latest Proxmox VE Kernel Image
ii  pve-kernel-5.4.65-1-pve              5.4.65-1                        amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-helper                    6.2-7                           all          Function for various kernel maintenance tasks.
ii  pve-lxc-syscalld                     0.9.1-1                         amd64        PVE LXC syscall daemon
ii  pve-manager                          6.2-15                          amd64        Proxmox Virtual Environment Management Tools
ii  pve-qemu-kvm                         5.1.0-6                         amd64        Full virtualization on x86 hardware
ii  pve-xtermjs                          4.7.0-2                         amd64        Binaries built from the Rust termproxy crate
ii  pve-zsync                            2.0-3                           all          Proxmox VE ZFS syncing tool
ii  smartmontools                        7.1-pve2                        amd64        control and monitor storage systems using S.M.A.R.T.
ii  zfs-initramfs                        0.8.4-pve2                      all          OpenZFS root filesystem capabilities for Linux - initramfs
ii  zfs-zed                              0.8.4-pve2                      amd64        OpenZFS Event Daemon
ii  zfsutils-linux                       0.8.4-pve2                      amd64        command-line tools to manage OpenZFS filesystems

illuminated · Nov 10, 2020

So, I definitely cannot connect over ssh after the hang happens, but if there is an already active/open ssh connection - it will keep working without an issue.

illuminated · Nov 10, 2020

I think the biggest help at the moment would be if someone could post their network configuration of Proxmox 6 on SoYouStart. The /etc/network/interfaces file.

From everything I know and I've learned the past few days, I think that's the main issue with the accessibility of the node: network configuration in Proxmox at SoYouStart DC...

Thanks in advance!

illuminated · Nov 10, 2020

Solved, at least it seems so. I was able to get the node respond to SSH and to serve the GUI by manually setting the network, without having to hard reboot the server.

If your IP at SoYouStart is a.b.c.d and your gateway is a.b.c.254, then the winning setting is:

Bash:

→cat /etc/network/interfaces

auto lo
iface lo inet loopback
        dns-nameservers 213.186.33.99 8.8.8.8 8.8.4.4

auto eno3
iface eno3 inet manual

iface eno1 inet manual

iface eno2 inet manual

iface eno4 inet manual

auto vmbr0
        iface vmbr0 inet static
        address a.b.c.d
        netmask a.b.c.254
        bridge-ports eno3
        bridge-stp off
        bridge-fd 0
        post-up route add a.b.c.254 dev vmbr0
        post-up route add default gw a.b.c.254

I'll wait for few days to confirm no hangs will happen before marking this as solved.

illuminated · Nov 10, 2020

So, this works perfectly, but for the node itself.

But VM's below, with their failover IP's are not accessible with this.

I've been trying configurations based on Proxmox manual's Network configuration, part about using a bridge, but couldn't make it work. Either my server/node is connected and available and the VM's are not, or vice versa.

Any help appreciated.

illuminated · Nov 11, 2020

So... I have found a network configuration that kinda works for me. The connectivity disruptions still happen, though, but they "revert back" at some point (then happening again and so on).

I still cannot catch the cause of this behavior... None of the logs shows any anomaly. It just stops to respond to any requests from the outside, whether it's ssh or the gui... Internally the business is as usual. GUI is up (pveproxy), all services are up, I can ping any outer host, apt update works... Nothing I do internally fixes the connectivity; I've tried restarting all the PVE services, restarting network. Also, all the VM's work as nothing has happened: I can ssh to their public IP's/hostnames, access their services from the outside... I wouldn't be able to notice anything is wrong if I wasn't connecting to the Proxmox node.
Then, after some time, ssh and GUI just start working again. No indication of any change in the logs. Rebooting the node also makes the services work, but I'm not gonna be able to do that pretty soon, after I finish with the VM setup.

Not sure if it is worth reinstalling everything from scratch?
I did spend several days setting up the VM's, but next few days will be my only window to be able to do this... afterwards, as time passes, it would be harder and harder to do.

When this started happening, everything was default, all the settings. I'm not using ceph or zfs... Just default Proximux 6 install from the SoYouStart image, then adding the certificate for the node hostname and then creating VM's. First few posts in this thread still show the state of the node.

Any advice appreciated.

Search

Search

Yet another no GUI thread

illuminated

Member

illuminated

Member

illuminated

Member

illuminated

Member

illuminated

Member

illuminated

Member

illuminated

Member

illuminated

Member