Node web interface not responding, 'pve-ssl.pem' - No such file or directory. Strange cluster behavior

Dec 26, 2018
We are going into production soon, so I have to figure this one out.
We have 3 nodes. Everything was working before.
The web interface of node 2 is not responding.
In the cluster information section of the other two nodes, the "Join cluster" button is greyed out.
But judging from the column on the left, everything seems fine.

Selection_006.png


root@proxmox1:/etc/pve/nodes/proxmox3# pvesh get /cluster/config/join
unable to read '/etc/pve/nodes/proxmox2/pve-ssl.pem' - No such file or directory


root@proxmox2:/etc/pve/nodes/proxmox1# ls -lah
total 1,5K
drwxr-xr-x 2 root www-data 0 april 8 09:11 .
drwxr-xr-x 2 root www-data 0 april 8 09:11 ..
-rw-r----- 1 root www-data 83 april 24 09:51 lrm_status
drwxr-xr-x 2 root www-data 0 april 8 09:11 lxc
drwxr-xr-x 2 root www-data 0 april 8 09:11 openvz
drwx------ 2 root www-data 0 april 8 09:11 priv
-rw-r----- 1 root www-data 1,7K april 8 09:11 pve-ssl.key
-rw-r----- 1 root www-data 1,7K april 8 09:11 pve-ssl.pem
drwxr-xr-x 2 root www-data 0 april 8 09:11 qemu-server


root@proxmox2:/etc/pve/nodes/proxmox2# ls -lah
total 512
drwxr-xr-x 2 root www-data 0 april 10 14:53 .
drwxr-xr-x 2 root www-data 0 april 8 09:11 ..
-rw-r----- 1 root www-data 83 april 24 10:02 lrm_status
drwxr-xr-x 2 root www-data 0 april 10 14:53 lxc
drwxr-xr-x 2 root www-data 0 april 10 14:53 qemu-server

root@proxmox1:/etc/pve/nodes/proxmox3# ls -lah
total 1,5K
drwxr-xr-x 2 root www-data 0 april 8 09:41 .
drwxr-xr-x 2 root www-data 0 april 8 09:11 ..
-rw-r----- 1 root www-data 83 april 24 09:51 lrm_status
drwxr-xr-x 2 root www-data 0 april 8 09:41 lxc
drwxr-xr-x 2 root www-data 0 april 8 09:41 openvz
drwx------ 2 root www-data 0 april 8 09:41 priv
-rw-r----- 1 root www-data 1,7K april 8 09:41 pve-ssl.key
-rw-r----- 1 root www-data 1,7K april 8 09:41 pve-ssl.pem
drwxr-xr-x 2 root www-data 0 april 8 09:41 qemu-server
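
The three listings above can be compared in one pass. This is a minimal sketch, assuming the node directories live under /etc/pve/nodes as shown. The function name missing_certs and its optional directory argument are made up for illustration and are not part of Proxmox VE.

```shell
#!/bin/sh
# Report every node directory that lacks pve-ssl.pem or pve-ssl.key.
# missing_certs is a hypothetical helper; the optional first argument
# overrides the default path taken from the listings in this thread.
missing_certs() {
    nodes_dir="${1:-/etc/pve/nodes}"
    for dir in "$nodes_dir"/*/; do
        # Skip the unexpanded glob when the directory is empty or absent.
        [ -d "$dir" ] || continue
        node=$(basename "$dir")
        for f in pve-ssl.pem pve-ssl.key; do
            # -s is true only for an existing, non-empty file.
            [ -s "$dir$f" ] || echo "$node: missing $f"
        done
    done
}
```

Calling missing_certs without an argument on any cluster node checks /etc/pve/nodes. With the listings above it should report only proxmox2.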
 
I don't know if this is a coincidence, but in the log folder:
-rw-r----- 1 root adm 1,3K april 10 14:53 user.log.1
-rw-r----- 1 root adm 12K april 10 14:53 mail.log.1
-rw-r----- 1 root adm 12K april 10 14:53 mail.info.1

Three log files have the same timestamp (April 10, 14:53) as the changes in /etc/pve/nodes/proxmox2.

The only unusual entries may be these from user.log.1:
Apr 8 09:19:22 proxmox2 lvm[694]: Monitoring thin pool pve-data.
Apr 8 09:43:56 proxmox2 lvm[693]: Monitoring thin pool pve-data.
Apr 8 10:27:29 proxmox2 lvm[754]: Monitoring thin pool pve-data.
Apr 8 14:53:07 proxmox2 dmeventd[754]: No longer monitoring thin pool pve-data.
Apr 8 14:53:07 proxmox2 lvm[754]: Monitoring thin pool pve-data-tpool.
Apr 8 14:59:37 proxmox2 lvm[754]: WARNING: Thin pool pve-data-tpool data is now 82.01% full.
Apr 8 14:59:57 proxmox2 lvm[754]: WARNING: Thin pool pve-data-tpool data is now 86.26% full.
Apr 8 15:00:17 proxmox2 lvm[754]: WARNING: Thin pool pve-data-tpool data is now 90.53% full.
Apr 8 15:00:47 proxmox2 lvm[754]: WARNING: Thin pool pve-data-tpool data is now 96.80% full.
Apr 8 15:01:07 proxmox2 lvm[754]: WARNING: Thin pool pve-data-tpool data is now 100.00% full.
Apr 10 10:03:09 proxmox2 lvm[824]: Monitoring thin pool pve-data-tpool.
Apr 10 10:09:32 proxmox2 lvm[751]: Monitoring thin pool pve-data-tpool.
Apr 10 10:13:31 proxmox2 lvm[759]: Monitoring thin pool pve-data-tpool.
Apr 10 10:26:18 proxmox2 lvm[796]: Monitoring thin pool pve-data-tpool.
Apr 10 14:05:31 proxmox2 lvm[775]: Monitoring thin pool pve-data-tpool.
Apr 10 14:12:44 proxmox2 lvm[769]: Monitoring thin pool pve-data-tpool.
Apr 10 14:53:09 proxmox2 lvm[785]: Monitoring thin pool pve-data-tpool.
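
To pull only the fill-level warnings out of a log like the one above, something along these lines could be used. It is a sketch that assumes the message format shown in this excerpt. The helper name thin_pool_peaks is hypothetical, and the log path is passed as an argument.

```shell
#!/bin/sh
# Print the highest "data is now N% full" value per thin pool in a log file.
# thin_pool_peaks is a hypothetical helper name, not an LVM tool.
thin_pool_peaks() {
    awk '/WARNING: Thin pool .* data is now .*% full/ {
        for (i = 1; i <= NF; i++) {
            if ($i == "pool") pool = $(i + 1)      # word after "pool" is the pool name
            if ($i == "now")  pct = $(i + 1) + 0   # "82.01%" + 0 yields 82.01
        }
        if (pct > peak[pool]) peak[pool] = pct
    }
    END { for (p in peak) printf "%s %.2f\n", p, peak[p] }' "$1"
}
```

For example, `thin_pool_peaks /var/log/user.log.1` would print one line per pool with its peak usage. On a live node, `lvs` shows the current Data% of thin pools directly.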
 

Hi,
please try to regenerate your SSL certificates by running
Code:
pvecm updatecerts --force
systemctl restart pveproxy.service
 
I overlooked that the status says 'standalone node'. What's the output of `pvecm status` and `cat /etc/pve/corosync.conf`?
 
First, this might be related to the fact that I unsuccessfully tried to remove the proxmox2 node using the node ID "0x00000002" and the name "192.168.99.166". The commands shown below returned nothing.
https://forum.proxmox.com/threads/pvecm-nodes-does-not-show-correct-nodename.53265

pvecm delnode 192.168.99.166
pvecm delnode 192.168.99.166
pvecm delnode 0x00000002
pvecm delnode 0x00000002

I later found out in a virtual environment (on another test system) that I had to use the name shown in the web interface, so the node was not removed.



root@proxmox2:/etc/pve/nodes/proxmox2# pvecm status
Quorum information
------------------
Date: Wed Apr 24 10:37:12 2019
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000002
Ring ID: 1/1164
Quorate: Yes

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.99.118
0x00000002 1 192.168.99.166 (local)
0x00000003 1 192.168.99.167




root@proxmox3:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.99.118
  }
  node {
    name: proxmox2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.99.166
  }
  node {
    name: proxmox3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.99.167
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: cluster0
  config_version: 3
  interface {
    bindnetaddr: 192.168.99.118
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
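
Since `pvecm status` lists the members by address while `pvecm delnode` reportedly wanted the node name, it can help to see name, node ID and address side by side. The sketch below reads them from a corosync.conf laid out like the one above. The helper name corosync_nodes is hypothetical, and the default path is the one used in this thread.

```shell
#!/bin/sh
# Print "name nodeid ring0_addr" for every node block in a corosync.conf.
# corosync_nodes is a hypothetical helper; the optional first argument
# overrides the default config path.
corosync_nodes() {
    awk '
        # A line "node {" opens a node block ("nodelist {" does not match).
        $1 == "node" && $2 == "{"      { in_node = 1; name = id = addr = ""; next }
        in_node && $1 == "name:"       { name = $2 }
        in_node && $1 == "nodeid:"     { id = $2 }
        in_node && $1 == "ring0_addr:" { addr = $2 }
        # The closing brace ends the block and emits one line for the node.
        in_node && $1 == "}"           { print name, id, addr; in_node = 0 }
    ' "${1:-/etc/pve/corosync.conf}"
}
```

In this cluster the first column is presumably the value that matches the web interface, for example proxmox2 rather than 192.168.99.166 or 0x00000002.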
 
