No VM/LXC can start after Proxmox update and Synology update to 7.2.1

nebz · Oct 5, 2023

Dears,

I just updated my both proxmox (cluster) and restart them, but since, no vm/lxc are starting anymore

Cluster seems alright :

Code:

Cluster information
-------------------
Name:             NEBZCLUSTER
Config Version:   11
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Oct  5 16:50:15 2023
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.1155
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.1.100 (local)
0x00000002          1    A,V,NMW 192.168.1.101
0x00000000          1            Qdevice

here is the updated packages :

Code:

"ifupdown2/stable 3.2.0-1+pmx5 all [upgradable from: 3.2.0-1+pmx4]",
"libc-bin/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libc-dev-bin/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libc-devtools/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libc-l10n/stable-security 2.36-9+deb12u3 all [upgradable from: 2.36-9+deb12u1]",
"libc6-dev/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libc6/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libknet1/stable 1.26-pve1 amd64 [upgradable from: 1.25-pve1]",
"libnozzle1/stable 1.26-pve1 amd64 [upgradable from: 1.25-pve1]",
"libnvpair3linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"libpve-cluster-api-perl/stable 8.0.4 all [upgradable from: 8.0.3]",
"libpve-cluster-perl/stable 8.0.4 all [upgradable from: 8.0.3]",
"libpve-common-perl/stable 8.0.9 all [upgradable from: 8.0.8]",
"libpve-guest-common-perl/stable 5.0.5 all [upgradable from: 5.0.4]",
"libuutil3linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"libzfs4linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"libzpool5linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"locales/stable-security 2.36-9+deb12u3 all [upgradable from: 2.36-9+deb12u1]",
"proxmox-backup-client/stable 3.0.3-1 amd64 [upgradable from: 3.0.2-1]",
"proxmox-backup-file-restore/stable 3.0.3-1 amd64 [upgradable from: 3.0.2-1]",
"proxmox-headers-6.2/stable 6.2.16-15 all [upgradable from: 6.2.16-14]",
"proxmox-kernel-6.2/stable 6.2.16-15 all [upgradable from: 6.2.16-14]",
"proxmox-widget-toolkit/stable 4.0.9 all [upgradable from: 4.0.6]",
"pve-cluster/stable 8.0.4 amd64 [upgradable from: 8.0.3]",
"pve-docs/stable 8.0.5 all [upgradable from: 8.0.4]",
"spl/stable 2.1.13-pve1 all [upgradable from: 2.1.12-pve1]",
"zfs-initramfs/stable 2.1.13-pve1 all [upgradable from: 2.1.12-pve1]",
"zfs-zed/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"zfsutils-linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]"

here is my pvesm status :

Code:

[root@NEBZNUC pve]$ pvesm status
Name                     Type     Status           Total            Used       Available        %
BACKUP_SYNO_NFS           nfs     active     14981718400     10623038464      4358679936   70.91%
BACKUP_SYNO_SMB          cifs     active     14981718344     10623038460      4358679884   70.91%
INTERNAL_DISK             dir     active       479596200       204631100       250529440   42.67%
INTERNAL_DISK_NFS         nfs   disabled               0               0               0      N/A
NFS-SYNO                  nfs     active     14981718400     10623038464      4358679936   70.91%
local                     dir     active        59547564        12647876        43842436   21.24%
local-lvm             lvmthin     active       154533888         6629503       147904384    4.29%

vm/lxc are stored on NFS-SYNO or local (neither of them are starting)

proxmox display (?) almost everywhere :

I see the vm name a little bit after a reboot but then it goes like this

here is a syslog exemple

(would you need anything else, let me know)

Chris · Oct 5, 2023

nebz said:

Dears,

I just updated my both proxmox (cluster) and restart them, but since, no vm/lxc are starting anymore

Cluster seems alright :

Code:

Cluster information
-------------------
Name:             NEBZCLUSTER
Config Version:   11
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Oct  5 16:50:15 2023
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.1155
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.1.100 (local)
0x00000002          1    A,V,NMW 192.168.1.101
0x00000000          1            Qdevice

here is the updated packages :

Code:

"ifupdown2/stable 3.2.0-1+pmx5 all [upgradable from: 3.2.0-1+pmx4]",
"libc-bin/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libc-dev-bin/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libc-devtools/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libc-l10n/stable-security 2.36-9+deb12u3 all [upgradable from: 2.36-9+deb12u1]",
"libc6-dev/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libc6/stable-security 2.36-9+deb12u3 amd64 [upgradable from: 2.36-9+deb12u1]",
"libknet1/stable 1.26-pve1 amd64 [upgradable from: 1.25-pve1]",
"libnozzle1/stable 1.26-pve1 amd64 [upgradable from: 1.25-pve1]",
"libnvpair3linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"libpve-cluster-api-perl/stable 8.0.4 all [upgradable from: 8.0.3]",
"libpve-cluster-perl/stable 8.0.4 all [upgradable from: 8.0.3]",
"libpve-common-perl/stable 8.0.9 all [upgradable from: 8.0.8]",
"libpve-guest-common-perl/stable 5.0.5 all [upgradable from: 5.0.4]",
"libuutil3linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"libzfs4linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"libzpool5linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"locales/stable-security 2.36-9+deb12u3 all [upgradable from: 2.36-9+deb12u1]",
"proxmox-backup-client/stable 3.0.3-1 amd64 [upgradable from: 3.0.2-1]",
"proxmox-backup-file-restore/stable 3.0.3-1 amd64 [upgradable from: 3.0.2-1]",
"proxmox-headers-6.2/stable 6.2.16-15 all [upgradable from: 6.2.16-14]",
"proxmox-kernel-6.2/stable 6.2.16-15 all [upgradable from: 6.2.16-14]",
"proxmox-widget-toolkit/stable 4.0.9 all [upgradable from: 4.0.6]",
"pve-cluster/stable 8.0.4 amd64 [upgradable from: 8.0.3]",
"pve-docs/stable 8.0.5 all [upgradable from: 8.0.4]",
"spl/stable 2.1.13-pve1 all [upgradable from: 2.1.12-pve1]",
"zfs-initramfs/stable 2.1.13-pve1 all [upgradable from: 2.1.12-pve1]",
"zfs-zed/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]",
"zfsutils-linux/stable 2.1.13-pve1 amd64 [upgradable from: 2.1.12-pve1]"

here is my pvesm status :

Code:

[root@NEBZNUC pve]$ pvesm status
Name                     Type     Status           Total            Used       Available        %
BACKUP_SYNO_NFS           nfs     active     14981718400     10623038464      4358679936   70.91%
BACKUP_SYNO_SMB          cifs     active     14981718344     10623038460      4358679884   70.91%
INTERNAL_DISK             dir     active       479596200       204631100       250529440   42.67%
INTERNAL_DISK_NFS         nfs   disabled               0               0               0      N/A
NFS-SYNO                  nfs     active     14981718400     10623038464      4358679936   70.91%
local                     dir     active        59547564        12647876        43842436   21.24%
local-lvm             lvmthin     active       154533888         6629503       147904384    4.29%

vm/lxc are stored on NFS-SYNO or local (neither of them are starting)

proxmox display (?) almost everywhere :
View attachment 56168
I see the vm name a little bit after a reboot but then it goes like this

here is a syslog exemple

(would you need anything else, let me know)

Hi,
in your logs there is a Delaying on-boot 'startall' command for 300 second(s). so I suppose you configured a onboot delay for VM/LXC startup?

The current node status is however not expected and need further investigation, please post the output of systemctl status pveproxy.service pvedaemon.service pvestatd.service pve-guests.service.

nebz · Oct 5, 2023

Chris said:
in your logs there is a Delaying on-boot 'startall' command for 300 second(s). so I suppose you configured a onboot delay for VM/LXC startup?

indeed

Chris said:
The current node status is however not expected and need further investigation, please post the output of systemctl status pveproxy.service pvedaemon.service pvestatd.service pve-guests.service.

first node :

Code:

[root@NEBZNUC system]$ systemctl status pveproxy.service
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-10-05 16:14:42 CEST; 55min ago
    Process: 1128 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
    Process: 1173 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
   Main PID: 1174 (pveproxy)
      Tasks: 4 (limit: 38080)
     Memory: 181.3M
        CPU: 4.733s
     CGroup: /system.slice/pveproxy.service
             ├─1174 pveproxy
             ├─1175 "pveproxy worker"
             ├─1176 "pveproxy worker"
             └─1177 "pveproxy worker"

Oct 05 16:51:48 NEBZNUC pveproxy[1177]: proxy detected vanished client connection
Oct 05 16:52:10 NEBZNUC pveproxy[1176]: proxy detected vanished client connection
Oct 05 16:53:44 NEBZNUC pveproxy[1176]: proxy detected vanished client connection
Oct 05 16:54:15 NEBZNUC pveproxy[1175]: proxy detected vanished client connection
Oct 05 16:56:24 NEBZNUC pveproxy[1177]: proxy detected vanished client connection
Oct 05 16:56:41 NEBZNUC pveproxy[1175]: proxy detected vanished client connection
Oct 05 16:56:45 NEBZNUC pveproxy[1176]: proxy detected vanished client connection
Oct 05 16:56:49 NEBZNUC pveproxy[1177]: proxy detected vanished client connection
Oct 05 17:03:48 NEBZNUC pveproxy[1175]: proxy detected vanished client connection
Oct 05 17:09:59 NEBZNUC pveproxy[1176]: proxy detected vanished client connection
[root@NEBZNUC system]$ systemctl status pvedaemon
● pvedaemon.service - PVE API Daemon
     Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-10-05 16:34:40 CEST; 35min ago
    Process: 4195 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
   Main PID: 4196 (pvedaemon)
      Tasks: 7 (limit: 38080)
     Memory: 213.9M
        CPU: 1.009s
     CGroup: /system.slice/pvedaemon.service
             ├─4196 pvedaemon
             ├─4197 "pvedaemon worker"
             ├─4198 "pvedaemon worker"
             ├─4199 "pvedaemon worker"
             ├─4365 lxc-info -n 104 -p
             ├─4531 lxc-info -n 104 -p
             └─4713 lxc-info -n 104 -p

Oct 05 16:34:39 NEBZNUC systemd[1]: Starting pvedaemon.service - PVE API Daemon...
Oct 05 16:34:40 NEBZNUC pvedaemon[4196]: starting server
Oct 05 16:34:40 NEBZNUC pvedaemon[4196]: starting 3 worker(s)
Oct 05 16:34:40 NEBZNUC pvedaemon[4196]: worker 4197 started
Oct 05 16:34:40 NEBZNUC pvedaemon[4196]: worker 4198 started
Oct 05 16:34:40 NEBZNUC pvedaemon[4196]: worker 4199 started
Oct 05 16:34:40 NEBZNUC systemd[1]: Started pvedaemon.service - PVE API Daemon.
[root@NEBZNUC system]$ systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-10-05 16:24:15 CEST; 46min ago
    Process: 3000 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
   Main PID: 3001 (pvestatd)
      Tasks: 2 (limit: 38080)
     Memory: 82.5M
        CPU: 675ms
     CGroup: /system.slice/pvestatd.service
             ├─3001 pvestatd
             └─3039 lxc-info -n 101 -p

Oct 05 16:24:15 NEBZNUC systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Oct 05 16:24:15 NEBZNUC pvestatd[3001]: starting server
Oct 05 16:24:15 NEBZNUC systemd[1]: Started pvestatd.service - PVE Status Daemon.
[root@NEBZNUC system]$ systemctl status pve-guests
● pve-guests.service - PVE guests
     Loaded: loaded (/lib/systemd/system/pve-guests.service; enabled; preset: enabled)
     Active: activating (start) since Thu 2023-10-05 16:14:43 CEST; 55min ago
    Process: 1184 ExecStartPre=/usr/share/pve-manager/helpers/pve-startall-delay (code=exited, status=0/SUCCESS)
   Main PID: 1755 (pvesh)
      Tasks: 4 (limit: 38080)
     Memory: 131.4M
        CPU: 1.799s
     CGroup: /system.slice/pve-guests.service
             ├─1755 /usr/bin/perl /usr/bin/pvesh --nooutput create /nodes/localhost/startall
             ├─1756 "task UPID:NEBZNUC:000006DC:00006A55:651EC5C4:startall::root@pam:"
             └─2197 "task UPID:NEBZNUC:00000895:000076D9:651EC5E4:vzstart:103:root@pam:"

Oct 05 16:14:43 NEBZNUC systemd[1]: Starting pve-guests.service - PVE guests...
Oct 05 16:14:43 NEBZNUC pve-startall-delay[1184]: Delaying on-boot 'startall' command for 240 second(s).
Oct 05 16:18:44 NEBZNUC pve-guests[1755]: <root@pam> starting task UPID:NEBZNUC:000006DC:00006A55:651EC5C4:startall::root@pam:
Oct 05 16:18:44 NEBZNUC pvesh[1755]: Starting CT 113
Oct 05 16:18:44 NEBZNUC pve-guests[1756]: <root@pam> starting task UPID:NEBZNUC:000006DD:00006A57:651EC5C4:vzstart:113:root@pam:
Oct 05 16:18:44 NEBZNUC pve-guests[1757]: starting CT 113: UPID:NEBZNUC:000006DD:00006A57:651EC5C4:vzstart:113:root@pam:
Oct 05 16:18:46 NEBZNUC pvesh[1755]: Waiting for 30 seconds (startup delay)
Oct 05 16:19:16 NEBZNUC pvesh[1755]: Starting CT 103
Oct 05 16:19:16 NEBZNUC pve-guests[1756]: <root@pam> starting task UPID:NEBZNUC:00000895:000076D9:651EC5E4:vzstart:103:root@pam:
Oct 05 16:19:16 NEBZNUC pve-guests[2197]: starting CT 103: UPID:NEBZNUC:00000895:000076D9:651EC5E4:vzstart:103:root@pam:

second node :

Code:

[root@NEBZNUC2 ~]$ systemctl status pveproxy
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-10-05 17:01:34 CEST; 10min ago
    Process: 1085 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
    Process: 1090 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
   Main PID: 1091 (pveproxy)
      Tasks: 4 (limit: 38079)
     Memory: 141.3M
        CPU: 948ms
     CGroup: /system.slice/pveproxy.service
             ├─1091 pveproxy
             ├─1092 "pveproxy worker"
             ├─1093 "pveproxy worker"
             └─1094 "pveproxy worker"

Oct 05 17:01:27 NEBZNUC2 systemd[1]: Starting pveproxy.service - PVE API Proxy Server...
Oct 05 17:01:34 NEBZNUC2 pveproxy[1091]: starting server
Oct 05 17:01:34 NEBZNUC2 pveproxy[1091]: starting 3 worker(s)
Oct 05 17:01:34 NEBZNUC2 pveproxy[1091]: worker 1092 started
Oct 05 17:01:34 NEBZNUC2 pveproxy[1091]: worker 1093 started
Oct 05 17:01:34 NEBZNUC2 pveproxy[1091]: worker 1094 started
Oct 05 17:01:34 NEBZNUC2 systemd[1]: Started pveproxy.service - PVE API Proxy Server.
[root@NEBZNUC2 ~]$ systemctl status pvedaemon
● pvedaemon.service - PVE API Daemon
     Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-10-05 17:01:27 CEST; 10min ago
    Process: 1050 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
   Main PID: 1080 (pvedaemon)
      Tasks: 4 (limit: 38079)
     Memory: 208.8M
        CPU: 733ms
     CGroup: /system.slice/pvedaemon.service
             ├─1080 pvedaemon
             ├─1081 "pvedaemon worker"
             ├─1082 "pvedaemon worker"
             └─1083 "pvedaemon worker"

Oct 05 17:01:26 NEBZNUC2 systemd[1]: Starting pvedaemon.service - PVE API Daemon...
Oct 05 17:01:27 NEBZNUC2 pvedaemon[1080]: starting server
Oct 05 17:01:27 NEBZNUC2 pvedaemon[1080]: starting 3 worker(s)
Oct 05 17:01:27 NEBZNUC2 pvedaemon[1080]: worker 1081 started
Oct 05 17:01:27 NEBZNUC2 pvedaemon[1080]: worker 1082 started
Oct 05 17:01:27 NEBZNUC2 pvedaemon[1080]: worker 1083 started
Oct 05 17:01:27 NEBZNUC2 systemd[1]: Started pvedaemon.service - PVE API Daemon.
[root@NEBZNUC2 ~]$ systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-10-05 17:01:26 CEST; 10min ago
    Process: 1033 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
   Main PID: 1058 (pvestatd)
      Tasks: 3 (limit: 38079)
     Memory: 134.9M
        CPU: 598ms
     CGroup: /system.slice/pvestatd.service
             ├─1058 pvestatd
             ├─1141 /bin/mount -t nfs 192.168.1.125:/volume1/JeedomBackup /mnt/pve/BACKUP_SYNO_NFS -o vers=4.1
             └─1142 /sbin/mount.nfs 192.168.1.125:/volume1/JeedomBackup /mnt/pve/BACKUP_SYNO_NFS -o rw,vers=4.1

Oct 05 17:01:26 NEBZNUC2 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Oct 05 17:01:26 NEBZNUC2 pvestatd[1058]: starting server
Oct 05 17:01:26 NEBZNUC2 systemd[1]: Started pvestatd.service - PVE Status Daemon.
[root@NEBZNUC2 ~]$ systemctl status pve-guests
● pve-guests.service - PVE guests
     Loaded: loaded (/lib/systemd/system/pve-guests.service; enabled; preset: enabled)
     Active: activating (start) since Thu 2023-10-05 17:01:34 CEST; 10min ago
    Process: 1101 ExecStartPre=/usr/share/pve-manager/helpers/pve-startall-delay (code=exited, status=0/SUCCESS)
   Main PID: 1678 (pvesh)
      Tasks: 6 (limit: 38079)
     Memory: 130.9M
        CPU: 892ms
     CGroup: /system.slice/pve-guests.service
             ├─1678 /usr/bin/perl /usr/bin/pvesh --nooutput create /nodes/localhost/startall
             ├─1679 "task UPID:NEBZNUC2:0000068F:00007B8E:651ED0FB:startall::root@pam:"
             ├─2123 "task UPID:NEBZNUC2:0000084B:00008811:651ED11B:vzstart:109:root@pam:"
             ├─2126 /bin/mount -t nfs 192.168.1.125:/volume1/VMStore-NFS /mnt/pve/NFS-SYNO -o vers=4.1
             └─2127 /sbin/mount.nfs 192.168.1.125:/volume1/VMStore-NFS /mnt/pve/NFS-SYNO -o rw,vers=4.1

Oct 05 17:01:34 NEBZNUC2 systemd[1]: Starting pve-guests.service - PVE guests...
Oct 05 17:01:35 NEBZNUC2 pve-startall-delay[1101]: Delaying on-boot 'startall' command for 300 second(s).
Oct 05 17:06:35 NEBZNUC2 pve-guests[1678]: <root@pam> starting task UPID:NEBZNUC2:0000068F:00007B8E:651ED0FB:startall::root@pam:
Oct 05 17:06:35 NEBZNUC2 pvesh[1678]: Starting CT 114
Oct 05 17:06:35 NEBZNUC2 pve-guests[1680]: starting CT 114: UPID:NEBZNUC2:00000690:00007B90:651ED0FB:vzstart:114:root@pam:
Oct 05 17:06:35 NEBZNUC2 pve-guests[1679]: <root@pam> starting task UPID:NEBZNUC2:00000690:00007B90:651ED0FB:vzstart:114:root@pam:
Oct 05 17:06:37 NEBZNUC2 pvesh[1678]: Waiting for 30 seconds (startup delay)
Oct 05 17:07:07 NEBZNUC2 pvesh[1678]: Starting CT 109
Oct 05 17:07:07 NEBZNUC2 pve-guests[1679]: <root@pam> starting task UPID:NEBZNUC2:0000084B:00008811:651ED11B:vzstart:109:root@pam:
Oct 05 17:07:07 NEBZNUC2 pve-guests[2123]: starting CT 109: UPID:NEBZNUC2:0000084B:00008811:651ED11B:vzstart:109:root@pam:

nebz · Oct 5, 2023

and mount is ok :

Code:

[root@NEBZNUC images]$ pvesm list NFS-SYNO
Volid                            Format  Type             Size VMID
NFS-SYNO:100/vm-100-disk-0.qcow2 qcow2   images    34359738368 100
NFS-SYNO:101/vm-101-disk-0.raw   raw     rootdir    8589934592 101
NFS-SYNO:101/vm-101-disk-1.raw   raw     rootdir    8589934592 101
NFS-SYNO:102/vm-102-disk-0.raw   raw     rootdir    4294967296 102
NFS-SYNO:103/vm-103-disk-0.raw   raw     rootdir    2147483648 103
NFS-SYNO:104/vm-104-disk-0.raw   raw     rootdir    2147483648 104
NFS-SYNO:105/vm-105-disk-0.qcow2 qcow2   images    17179869184 105
NFS-SYNO:106/vm-106-disk-0.raw   raw     images        4194304 106
NFS-SYNO:106/vm-106-disk-1.raw   raw     images    34359738368 106
NFS-SYNO:107/vm-107-disk-0.raw   raw     rootdir    8589934592 107
NFS-SYNO:108/vm-108-disk-0.qcow2 qcow2   images    34359738368 108
NFS-SYNO:109/vm-109-disk-0.raw   raw     rootdir    4294967296 109
NFS-SYNO:110/vm-110-disk-0.raw   raw     rootdir   21474836480 110
NFS-SYNO:111/vm-111-disk-0.raw   raw     rootdir    8589934592 111

Code:

[root@NEBZNUC2 NFS-SYNO]$ pvesm list NFS-SYNO
Volid                            Format  Type             Size VMID
NFS-SYNO:100/vm-100-disk-0.qcow2 qcow2   images    34359738368 100
NFS-SYNO:101/vm-101-disk-0.raw   raw     rootdir    8589934592 101
NFS-SYNO:101/vm-101-disk-1.raw   raw     rootdir    8589934592 101
NFS-SYNO:102/vm-102-disk-0.raw   raw     rootdir    4294967296 102
NFS-SYNO:103/vm-103-disk-0.raw   raw     rootdir    2147483648 103
NFS-SYNO:104/vm-104-disk-0.raw   raw     rootdir    2147483648 104
NFS-SYNO:105/vm-105-disk-0.qcow2 qcow2   images    17179869184 105
NFS-SYNO:106/vm-106-disk-0.raw   raw     images        4194304 106
NFS-SYNO:106/vm-106-disk-1.raw   raw     images    34359738368 106
NFS-SYNO:107/vm-107-disk-0.raw   raw     rootdir    8589934592 107
NFS-SYNO:108/vm-108-disk-0.qcow2 qcow2   images    34359738368 108
NFS-SYNO:109/vm-109-disk-0.raw   raw     rootdir    4294967296 109
NFS-SYNO:110/vm-110-disk-0.raw   raw     rootdir   21474836480 110
NFS-SYNO:111/vm-111-disk-0.raw   raw     rootdir    8589934592 111

Chris · Oct 5, 2023

nebz said:
pveproxy[1177]: proxy detected vanished client connection

This looks off, try restarting the pveproxy and pvedaemon services

nebz · Oct 5, 2023

when i restart almost everything : systemctl restart pvedaemon;systemctl restart pvestatd;systemctl restart pveproxy;systemctl restart corosync

I see a little bit better :

but the vm's seems down when i try to connect, and their CPU is 0%

and few minutes later, everything is grey again...

Chris · Oct 6, 2023

Please share the full systemd journal after boot for each of the nodes, maybe this gives more insights on what is going on. You can generate them by running journalctl -b > $(hostname)-journal.txt.

nebz · Oct 6, 2023

here you are

nebz · Oct 6, 2023

the vm or ct did not started yet, if i relaunch the command after that, :

Code:

Oct 06 08:40:20 NEBZNUC pvedaemon[1116]: <root@pam> successful auth for user 'root@pam'
Oct 06 08:43:09 NEBZNUC pvedaemon[1115]: <root@pam> starting task UPID:NEBZNUC:00000651:00005857:651FAC7D:vzstart:113:root@pam:
Oct 06 08:43:09 NEBZNUC pvedaemon[1617]: starting CT 113: UPID:NEBZNUC:00000651:00005857:651FAC7D:vzstart:113:root@pam:
Oct 06 08:43:09 NEBZNUC systemd[1]: Created slice system-pve\x2dcontainer.slice - PVE LXC Container Slice.
Oct 06 08:43:09 NEBZNUC systemd[1]: Started pve-container@113.service - PVE LXC Container: 113.
Oct 06 08:43:10 NEBZNUC kernel: EXT4-fs (dm-7): mounted filesystem e3eaf52e-6c97-478f-9ef8-ed17840a5f40 with ordered data mode. Quota mode: none.
Oct 06 08:43:10 NEBZNUC audit[1637]: AVC apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-113_</var/lib/lxc>" pid=1637 comm="apparmor_parser"
Oct 06 08:43:10 NEBZNUC kernel: kauditd_printk_skb: 15 callbacks suppressed
Oct 06 08:43:10 NEBZNUC kernel: audit: type=1400 audit(1696574590.245:27): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-113_</var/lib/lxc>" pid=1637 comm="apparmor_parser"
Oct 06 08:43:10 NEBZNUC kernel: vmbr0: port 2(fwpr113p0) entered blocking state
Oct 06 08:43:10 NEBZNUC kernel: vmbr0: port 2(fwpr113p0) entered disabled state
Oct 06 08:43:10 NEBZNUC kernel: device fwpr113p0 entered promiscuous mode
Oct 06 08:43:10 NEBZNUC kernel: vmbr0: port 2(fwpr113p0) entered blocking state
Oct 06 08:43:10 NEBZNUC kernel: vmbr0: port 2(fwpr113p0) entered forwarding state
Oct 06 08:43:10 NEBZNUC kernel: fwbr113i0: port 1(fwln113i0) entered blocking state
Oct 06 08:43:10 NEBZNUC kernel: fwbr113i0: port 1(fwln113i0) entered disabled state
Oct 06 08:43:10 NEBZNUC kernel: device fwln113i0 entered promiscuous mode
Oct 06 08:43:10 NEBZNUC kernel: fwbr113i0: port 1(fwln113i0) entered blocking state
Oct 06 08:43:10 NEBZNUC kernel: fwbr113i0: port 1(fwln113i0) entered forwarding state
Oct 06 08:43:10 NEBZNUC kernel: fwbr113i0: port 2(veth113i0) entered blocking state
Oct 06 08:43:10 NEBZNUC kernel: fwbr113i0: port 2(veth113i0) entered disabled state
Oct 06 08:43:10 NEBZNUC kernel: device veth113i0 entered promiscuous mode
Oct 06 08:43:10 NEBZNUC kernel: eth0: renamed from vethuxg6mf
Oct 06 08:43:10 NEBZNUC pvedaemon[1115]: <root@pam> end task UPID:NEBZNUC:00000651:00005857:651FAC7D:vzstart:113:root@pam: OK
Oct 06 08:43:11 NEBZNUC kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Oct 06 08:43:11 NEBZNUC kernel: fwbr113i0: port 2(veth113i0) entered blocking state
Oct 06 08:43:11 NEBZNUC kernel: fwbr113i0: port 2(veth113i0) entered forwarding state
Oct 06 08:43:19 NEBZNUC pvedaemon[1117]: <root@pam> starting task UPID:NEBZNUC:0000097A:00005C3E:651FAC87:vzstart:103:root@pam:
Oct 06 08:43:19 NEBZNUC pvedaemon[2426]: starting CT 103: UPID:NEBZNUC:0000097A:00005C3E:651FAC87:vzstart:103:root@pam:
Oct 06 08:43:32 NEBZNUC pvedaemon[1115]: <root@pam> starting task UPID:NEBZNUC:000009CA:00006186:651FAC94:vzstart:101:root@pam:
Oct 06 08:43:32 NEBZNUC pvedaemon[2506]: starting CT 101: UPID:NEBZNUC:000009CA:00006186:651FAC94:vzstart:101:root@pam:
Oct 06 08:43:46 NEBZNUC pve-guests[2533]: <root@pam> starting task UPID:NEBZNUC:000009E6:000066D4:651FACA2:startall::root@pam:
Oct 06 08:43:46 NEBZNUC pvesh[2533]: Starting CT 103
Oct 06 08:43:46 NEBZNUC pvesh[2533]: trying to acquire lock...
Oct 06 08:43:56 NEBZNUC pvesh[2533]: can't lock file '/run/lock/lxc/pve-config-103.lock' - got timeout
Oct 06 08:43:56 NEBZNUC pvesh[2533]: Starting CT 110
Oct 06 08:43:56 NEBZNUC pve-guests[2534]: <root@pam> starting task UPID:NEBZNUC:000009FA:00006ABE:651FACAC:vzstart:110:root@pam:
Oct 06 08:43:56 NEBZNUC pve-guests[2554]: starting CT 110: UPID:NEBZNUC:000009FA:00006ABE:651FACAC:vzstart:110:root@pam:
Oct 06 08:43:56 NEBZNUC sshd[2552]: Accepted password for root from 192.168.1.233 port 51465 ssh2
Oct 06 08:43:56 NEBZNUC sshd[2552]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Oct 06 08:43:56 NEBZNUC systemd-logind[662]: New session 3 of user root.
Oct 06 08:43:56 NEBZNUC systemd[1]: Started session-3.scope - Session 3 of User root.
Oct 06 08:43:56 NEBZNUC sshd[2552]: pam_env(sshd:session): deprecated reading of user environment enabled
Oct 06 08:43:56 NEBZNUC sudo[2563]:     root : PWD=/root ; USER=root ; COMMAND=/usr/bin/su -
Oct 06 08:43:56 NEBZNUC sudo[2563]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 06 08:43:56 NEBZNUC su[2564]: (to root) root on none
Oct 06 08:43:56 NEBZNUC su[2564]: pam_unix(su-l:session): session opened for user root(uid=0) by (uid=0)
Oct 06 08:44:11 NEBZNUC chronyd[850]: Selected source 45.87.76.3 (2.debian.pool.ntp.org)
Oct 06 08:44:11 NEBZNUC chronyd[850]: System clock TAI offset set to 37 seconds
Oct 06 08:44:13 NEBZNUC chronyd[850]: Selected source 45.87.77.15 (2.debian.pool.ntp.org)
Oct 06 08:44:36 NEBZNUC pmxcfs[916]: [status] notice: received log
Oct 06 08:44:36 NEBZNUC pmxcfs[916]: [status] notice: received log
Oct 06 08:45:08 NEBZNUC pmxcfs[916]: [status] notice: received log
Oct 06 08:45:17 NEBZNUC chronyd[850]: Selected source 45.87.76.3 (2.debian.pool.ntp.org)

and

Code:

Oct 06 08:40:10 NEBZNUC2 sshd[1193]: pam_env(sshd:session): deprecated reading of user environment enabled
Oct 06 08:40:20 NEBZNUC2 pmxcfs[922]: [status] notice: received log
Oct 06 08:43:08 NEBZNUC2 pmxcfs[922]: [status] notice: received log
Oct 06 08:43:10 NEBZNUC2 pmxcfs[922]: [status] notice: received log
Oct 06 08:43:18 NEBZNUC2 pmxcfs[922]: [status] notice: received log
Oct 06 08:43:32 NEBZNUC2 pmxcfs[922]: [status] notice: received log
Oct 06 08:43:45 NEBZNUC2 pmxcfs[922]: [status] notice: received log
Oct 06 08:43:55 NEBZNUC2 pmxcfs[922]: [status] notice: received log
Oct 06 08:44:02 NEBZNUC2 chronyd[861]: Selected source 45.87.76.3 (2.debian.pool.ntp.org)
Oct 06 08:44:02 NEBZNUC2 chronyd[861]: System clock TAI offset set to 37 seconds
Oct 06 08:44:03 NEBZNUC2 chronyd[861]: Selected source 195.13.1.153 (2.debian.pool.ntp.org)
Oct 06 08:44:21 NEBZNUC2 sshd[1690]: Accepted password for root from 192.168.1.233 port 51520 ssh2
Oct 06 08:44:21 NEBZNUC2 sshd[1690]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Oct 06 08:44:21 NEBZNUC2 systemd-logind[671]: New session 3 of user root.
Oct 06 08:44:21 NEBZNUC2 systemd[1]: Started session-3.scope - Session 3 of User root.
Oct 06 08:44:21 NEBZNUC2 sshd[1690]: pam_env(sshd:session): deprecated reading of user environment enabled
Oct 06 08:44:21 NEBZNUC2 sudo[1696]:     root : PWD=/root ; USER=root ; COMMAND=/usr/bin/su -
Oct 06 08:44:21 NEBZNUC2 sudo[1696]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 06 08:44:21 NEBZNUC2 su[1697]: (to root) root on none
Oct 06 08:44:21 NEBZNUC2 su[1697]: pam_unix(su-l:session): session opened for user root(uid=0) by (uid=0)
Oct 06 08:44:36 NEBZNUC2 pve-guests[1756]: <root@pam> starting task UPID:NEBZNUC2:000006DD:00007C66:651FACD4:startall::root@pam:
Oct 06 08:44:36 NEBZNUC2 pvesh[1756]: removed left over backup lock from '114'!
Oct 06 08:44:36 NEBZNUC2 pve-guests[1757]: removed left over backup lock from '114'!
Oct 06 08:44:36 NEBZNUC2 pvesh[1756]: Starting CT 114
Oct 06 08:44:36 NEBZNUC2 pve-guests[1757]: <root@pam> starting task UPID:NEBZNUC2:000006DE:00007C68:651FACD4:vzstart:114:root@pam:
Oct 06 08:44:36 NEBZNUC2 pve-guests[1758]: starting CT 114: UPID:NEBZNUC2:000006DE:00007C68:651FACD4:vzstart:114:root@pam:
Oct 06 08:44:36 NEBZNUC2 systemd[1]: Created slice system-pve\x2dcontainer.slice - PVE LXC Container Slice.
Oct 06 08:44:37 NEBZNUC2 systemd[1]: Started pve-container@114.service - PVE LXC Container: 114.
Oct 06 08:44:37 NEBZNUC2 kernel: EXT4-fs (dm-6): mounted filesystem e8001c63-e364-4fd0-851e-95de47d1851d with ordered data mode. Quota mode: none.
Oct 06 08:44:37 NEBZNUC2 audit[1778]: AVC apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-114_</var/lib/lxc>" pid=1778 comm="apparmor_parser"
Oct 06 08:44:37 NEBZNUC2 kernel: audit: type=1400 audit(1696574677.558:27): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-114_</var/lib/lxc>" pid=1778 comm="apparmor_parser"
Oct 06 08:44:38 NEBZNUC2 kernel: vmbr0: port 2(veth114i0) entered blocking state
Oct 06 08:44:38 NEBZNUC2 kernel: vmbr0: port 2(veth114i0) entered disabled state
Oct 06 08:44:38 NEBZNUC2 kernel: device veth114i0 entered promiscuous mode
Oct 06 08:44:38 NEBZNUC2 kernel: eth0: renamed from vethPBLtfk
Oct 06 08:44:38 NEBZNUC2 kernel: vmbr0: port 2(veth114i0) entered blocking state
Oct 06 08:44:38 NEBZNUC2 kernel: vmbr0: port 2(veth114i0) entered forwarding state
Oct 06 08:44:38 NEBZNUC2 pvesh[1756]: Waiting for 30 seconds (startup delay)
Oct 06 08:45:07 NEBZNUC2 chronyd[861]: Selected source 94.224.67.24 (2.debian.pool.ntp.org)
Oct 06 08:45:08 NEBZNUC2 pvesh[1756]: Starting CT 109
Oct 06 08:45:08 NEBZNUC2 pve-guests[1757]: <root@pam> starting task UPID:NEBZNUC2:0000088A:000088EA:651FACF4:vzstart:109:root@pam:
Oct 06 08:45:08 NEBZNUC2 pve-guests[2186]: starting CT 109: UPID:NEBZNUC2:0000088A:000088EA:651FACF4:vzstart:109:root@pam:
Oct 06 08:47:17 NEBZNUC2 chronyd[861]: Selected source 45.87.76.3 (2.debian.pool.ntp.org)

it seems the CT are "starting" but never started...

Ralli · Oct 6, 2023

I had exactly the same experience afte the update on both of my two nodes (... and I don't patch/update the two nodes at the same time but one after another with rebooting them).

So I could'nt find the root cause I just rebooted both nodes one more time, they came up in normal state.

Chris · Oct 6, 2023

Please share the full journal again, since you rebooted the hosts just before generating them, there is not much regarding guests and services in them.

Also, are you running the QDevice on NEBZNUC2 in parallel to the regular corosync service? Note that this is not recommended, the qdevice should run on an in-depended device.

nebz · Oct 6, 2023

Ralli said:
I had exactly the same experience afte the update on both of my two nodes (... and I don't patch/update the two nodes at the same time but one after another with rebooting them).

So I could'nt find the root cause I just rebooted both nodes one more time, they came up in normal state.

i rebooted like 20 times

nebz · Oct 6, 2023

Chris said:
Also, are you running the QDevice on NEBZNUC2 in parallel to the regular corosync service? Note that this is not recommended, the qdevice should run on an in-depended device.

normally not, I have a qdevice docker on my synology HA (192.168.1.125)

Code:

[root@NEBZNUC2 pve]$ cat corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: NEBZNUC
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.1.100
  }
  node {
    name: NEBZNUC2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.1.101
  }
}

quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: 192.168.1.125
      tls: on
    }
    votes: 1
  }
  provider: corosync_votequorum
}

totem {
  cluster_name: NEBZCLUSTER
  config_version: 11
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

Chris · Oct 6, 2023

Please also provide the output of ps auxwf and pvesh get /cluster/resources from both nodes. There must be something blocking the status updates from pvestatd.

Edit: Both of them when the problem is present of course, so the WebUI shows the question marks.

nebz · Oct 6, 2023

here you are

nebz · Oct 7, 2023

and again a journalctl

nebz · Oct 7, 2023

i updated the to the latest apt update : same issue

BillboardsAreAllLearing · Oct 9, 2023

Same issue since updating Friday-ish. Can't seem to find anything wrong in the logs. Nodes with LXCs with NFS mounts (Synology as well) all appear with '?' in grey and start jobs turn indefinitely. I have one node with no LXCs with no root disk being on NFS and all appear normal. To make things a bit weirder, containers with network mount points (same NAS and NFS as well), but not the root disk, start and can access the mount points.

nebz · Oct 9, 2023

i'm feeling less alone thx

what's your Synology DSM version ? mine : DSM 7.2.1-69057

my logs seems fine, except timeouts

Ralli · Oct 9, 2023

Also see here: https://forum.proxmox.com/threads/n...-nicht-erreichbarkeit-der-gui-von-pve.134290/

I think there is a correlation between the hanging of pvestatd, the pve-nfs-client and the last update of the Synology. Before 7.2.1 on Synology all went fine.

No VM/LXC can start after Proxmox update and Synology update to 7.2.1

Member

Attachments

Proxmox Staff Member

Member

Member

Proxmox Staff Member

Member

Proxmox Staff Member

Member

Attachments

Member

Member

Proxmox Staff Member

Member

Member

Attachments

Proxmox Staff Member

Member

Attachments

Member

Attachments

Member

Member

Member

Member

We value your privacy