[SOLVED] A red cross case ...

ednt

Hi,

We switched on a member of our cluster that had been powered off for a long time.
After this we were no longer able to log in on all members via the web GUI.
We solved this by stopping corosync on all servers, starting it on one pair of nodes with the expected votes set to 2, and then bringing up corosync on the remaining nodes one after another.
This worked and we were able to login again.

But ...

One server shows a red cross in the web GUI and is only half working.

It has a quorum:

Code:
Quorum information
------------------
Date:             Thu Nov  2 09:42:00 2023
Quorum provider:  corosync_votequorum
Nodes:            18
Node ID:          0x00000008
Ring ID:          2.6a29
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   18
Highest expected: 18
Total votes:      18
Quorum:           10
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 192.168.248.2
0x00000003          1 192.168.248.3
0x00000004          1 192.168.248.4
0x00000005          1 192.168.248.5
0x00000006          1 192.168.248.6
0x00000007          1 192.168.248.7
0x00000008          1 192.168.248.8 (local)
0x00000009          1 192.168.248.41
0x0000000a          1 192.168.248.42
0x0000000b          1 192.168.248.43
0x0000000c          1 192.168.248.44
0x0000000d          1 192.168.248.45
0x0000000e          1 192.168.248.46
0x0000000f          1 192.168.248.81
0x00000011          1 192.168.248.83
0x00000013          1 192.168.249.210
0x00000014          1 192.168.249.209
0x00000016          1 192.168.249.212
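
The "Quorum: 10" line follows from the usual majority rule for votequorum: more than half of the expected votes. A minimal sketch of that arithmetic, assuming one vote per node (as in the membership list) and no special votequorum options; `quorum_threshold` is a hypothetical helper name, not a Proxmox or corosync command:

```shell
# Majority threshold: floor(expected_votes / 2) + 1.
# Assumes every node has exactly one vote and default votequorum settings.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

quorum_threshold 18   # prints 10, matching "Quorum: 10" above
quorum_threshold 2    # prints 2: with expected votes set to 2, both nodes of the pair are needed
```

So with all 18 votes present, this node is well above the threshold as far as corosync is concerned.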

But if I try, for example, to unlock a VM with qm, I get:

Code:
root@pk-pm-cpu-08:~# qm unlock 116
unable to open file '/etc/pve/nodes/pk-pm-cpu-08/qemu-server/116.conf.tmp.3150701' - Permission denied

Or when I do something in the web GUI (for example, try to migrate the running VM to another node), I get:
Code:
cluster not ready - no quorum? (500)
But I think this message is misleading, since I do have a quorum.

Because of the "Permission denied" message, I looked at the file permissions in /etc/pve and see the following:
Code:
drwxr-xr-x   2 root www-data     0 Jan  1  1970 .
drwxr-xr-x 100 root root     12288 Nov  2 09:42 ..
-r--r-----   1 root www-data   451 Nov  1 11:17 authkey.pub
-r--r-----   1 root www-data   451 Nov  1 11:17 authkey.pub.old
-r--r-----   1 root www-data   442 Apr 15  2020 ceph.conf
-r--r-----   1 root www-data 10685 Jan  1  1970 .clusterlog
-r--r-----   1 root www-data  2278 Feb 28  2023 corosync.conf
-r--r-----   1 root www-data    58 Mar  8  2021 datacenter.cfg
-rw-r-----   1 root www-data     2 Jan  1  1970 .debug
dr-xr-xr-x   2 root www-data     0 Jan 29  2020 firewall
dr-xr-xr-x   2 root www-data     0 Mar 13  2021 ha
-r--r-----   1 root www-data   159 Jul 19 09:25 jobs.cfg
lr-xr-xr-x   1 root www-data     0 Jan  1  1970 local -> nodes/pk-pm-cpu-08
lr-xr-xr-x   1 root www-data     0 Jan  1  1970 lxc -> nodes/pk-pm-cpu-08/lxc
-r--r-----   1 root www-data  1498 Jan  1  1970 .members
dr-xr-xr-x   2 root www-data     0 Nov 18  2019 nodes
lr-xr-xr-x   1 root www-data     0 Jan  1  1970 openvz -> nodes/pk-pm-cpu-08/openvz
dr-x------   2 root www-data     0 Nov 18  2019 priv
-r--r-----   1 root www-data  2074 Nov 18  2019 pve-root-ca.pem
-r--r-----   1 root www-data  1679 Nov 18  2019 pve-www.key
lr-xr-xr-x   1 root www-data     0 Jan  1  1970 qemu-server -> nodes/pk-pm-cpu-08/qemu-server
-r--r-----   1 root www-data    71 Aug  3 16:29 replication.cfg
-r--r-----   1 root www-data   727 Jan  1  1970 .rrd
dr-xr-xr-x   2 root www-data     0 Mar 13  2021 sdn
-r--r-----   1 root www-data  1564 Jul  1  2022 storage.cfg
-r--r-----   1 root www-data  2155 Aug  3 16:29 user.cfg
-r--r-----   1 root www-data  5391 Jan  1  1970 .version
dr-xr-xr-x   2 root www-data     0 Mar  8  2021 virtual-guest
-r--r-----   1 root www-data  7117 Jan  1  1970 .vmlist
-r--r-----   1 root www-data   120 Aug  3 16:29 vzdump.cron
Normally root should have write (w) permission on these files, as it does on the other nodes.
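
To compare nodes quickly, the owner write bit can be read straight from the mode string. This is only a sketch: `owner_writable` is a hypothetical helper, and the two sample strings are copied from the listing above.

```shell
# Succeeds if the owner has write permission, judging by an
# "ls -l"-style mode string such as "-rw-r-----".
# Character 1 is the file type, 2 is owner read, 3 is owner write.
owner_writable() {
    case "$1" in
        ??w*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Sample modes taken from the listing (corosync.conf and .debug).
for mode in "-r--r-----" "-rw-r-----"; do
    if owner_writable "$mode"; then
        echo "$mode: owner can write"
    else
        echo "$mode: owner cannot write"
    fi
done

# On a live node (GNU stat assumed), the mode string could be obtained with:
#   owner_writable "$(stat -c %A /etc/pve/corosync.conf)"
```

Note that root can normally write regardless of mode bits on an ordinary filesystem; here the missing w bits are just a visible symptom that /etc/pve is in a read-only state.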

Any idea how I can solve this (without rebooting this node)?

We are running:
Code:
proxmox-ve: 7.1-1 (running kernel: 5.11.22-3-pve)
pve-manager: 7.1-11 (running version: 7.1-11/8d529482)
pve-kernel-5.15: 7.1-13
pve-kernel-helper: 7.1-13
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.27-1-pve: 5.15.27-1
pve-kernel-5.15.19-2-pve: 5.15.19-3
pve-kernel-5.15.19-1-pve: 5.15.19-1
pve-kernel-5.13.19-6-pve: 5.13.19-14
pve-kernel-5.11.22-3-pve: 5.11.22-7
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-5
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
openvswitch-switch: 2.15.0+ds1-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1
Due to the problems with migrations, we are still on kernel 5.11.22-3-pve and have not updated the cluster.

Best regards.
 
After several tests on spare units, I used
Code:
service pve-cluster restart
to solve the problem without stopping the VM.
Everything went fine: the red cross is gone and root has write permission on the /etc/pve directory again.
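
For anyone who wants to confirm the result after such a restart, a small write probe is one option. This is a sketch under assumptions: `dir_accepts_writes` and the probe file name are hypothetical, and on a real node the probe file would briefly exist in the cluster-wide /etc/pve.

```shell
# Succeeds if a file can be created in the given directory.
# The probe file is removed again immediately.
dir_accepts_writes() {
    probe="$1/write-probe.$$"
    # Run the redirection in a subshell so a failure cannot abort the caller.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Possible use on the affected node, run as root:
#   dir_accepts_writes /etc/pve && echo "/etc/pve accepts writes again"
```

In the broken state described above, a probe like this would be expected to fail in the same way `qm unlock` did.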
 
