The server certificate /etc/pve/local/pve-ssl.pem is not yet active

lukasz.matys

Hello.
We've set up a new Proxmox cluster.
When starting VMs, we get the following error:

Running as unit 100.scope.
kvm: -vnc unix:/var/run/qemu-server/100.vnc,x509,password: Failed to start VNC server: The server certificate /etc/pve/local/pve-ssl.pem is not yet active
TASK ERROR: start failed: command '/usr/bin/systemd-run --scope --slice qemu --unit 100 -p 'KillMode=none' -p 'CPUShares=1000' /usr/bin/kvm -id 100 -chardev 'socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/100.pid -daemonize -smbios 'type=1,uuid=25176948-8ab7-4d4c-bbb0-9d95607f93b2' -name test -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -vnc unix:/var/run/qemu-server/100.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 1024 -k pl -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:59762bbfaac' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -drive 'file=/dev/vg0/vm-100-disk-1,if=none,id=drive-ide0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'e1000,mac=32:63:36:31:34:36,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300'' failed: exit code 1

The cluster has been up for a few days.
Can you help me?

Regards.
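
For reference, "not yet active" from QEMU means the certificate's notBefore timestamp still lies in the future, which usually points at clock skew on the node that generated the certificate or on the node starting the VM. A quick read-only check (assuming the standard openssl and timedatectl tools are available):

Code:
# show the certificate's validity window
openssl x509 -in /etc/pve/local/pve-ssl.pem -noout -startdate -enddate
# compare with the node's current time and NTP sync state
date -u
timedatectl status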
 
I followed the steps from this thread,

https://forum.proxmox.com/threads/f...ve-local-pve-ssl-pem-is-not-yet-active.26004/

But without success. I've removed the specified files and am trying to generate new certificates,

root@cp-node1:/etc/pve/priv# pvecm keygen pve-root-ca.key
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/urandom.
corosync-keygen: Failed to set key file permissions to 0400: Function not implemented
command 'corosync-keygen -l -k pve-root-ca.key' failed: exit code 3
root@cp-node1:/etc/pve/priv#

What should I do next?
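
The "Function not implemented" part is /etc/pve (the pmxcfs FUSE mount) refusing the chmod to 0400 that corosync-keygen attempts, so the command cannot succeed inside that directory in any case. One way to see that /etc/pve is not an ordinary filesystem:

Code:
# /etc/pve is a FUSE mount backed by pmxcfs, which does not implement
# some file operations (such as chmod), hence the ENOSYS error above
mount | grep /etc/pve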
 
I tried it like this,

root@cp-node1:~# pvecm keygen pve-root-ca.key
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/urandom.
Writing corosync key to pve-root-ca.key.
root@cp-node1:~#

root@cp-node1:~# cp pve-root-ca.key /etc/pve/priv/

root@cp-node1:~# pvecm updatecerts -f
unable to load Private Key
140472850441872:error:0906D06C:PEM routines:PEM_read_bio:no start line:pem_lib.c:703:Expecting: ANY PRIVATE KEY
generating pve root certificate failed:
command 'openssl req -batch -days 3650 -new -x509 -nodes -key /etc/pve/priv/pve-root-ca.key -out /etc/pve/pve-root-ca.pem -subj '/CN=Proxmox Virtual Environment/OU=b6b846256e4073b3171b757e035a51cd/O=PVE Cluster Manager CA/'' failed: exit code 1
root@cp-node1:~#

:/
 
you need to run the following command once:

Code:
rm /etc/pve/priv/pve-root-ca.key /etc/pve/pve-root-ca.pem

this deletes the cluster CA certificate and private key (if they exist).

then you need to run this command on all nodes of your cluster

Code:
pvecm updatecerts -f

this command regenerates the node certificate and key, but before doing so it checks whether the cluster CA certificate and key exist and, if necessary, generates new ones automatically.

finally restart the web interface to reload the new certificate (also on each node of the cluster):

Code:
systemctl restart pveproxy

please don't run random commands that are not mentioned here!
 
Your commands match those run by lukasz.matys and by me. The generated private key still appears to be invalid:

Attempting to recreate the cluster:
Code:
    systemctl stop corosync.service;
    systemctl stop pve-cluster.service;
    pmxcfs -l;
    rm -f /etc/pve/pve-root-ca.pem /etc/pve/priv/pve-root-ca.key /etc/pve/priv/pve-root-ca.srl;
    rm -f /etc/corosync/corosync.conf;
    cd /root;
    pvecm keygen pve-root-ca.key;
The following subsequently fails as the created key isn't valid:
Code:
    cat pve-root-ca.key > /etc/pve/priv/pve-root-ca.key;
    pvecm updatecerts -force;

Output:
Code:
  [admin@kvm5a ~]# pvecm keygen pve-root-ca.key;
    Corosync Cluster Engine Authentication key generator.
    Gathering 1024 bits for key from /dev/urandom.
    Writing corosync key to pve-root-ca.key.
  [admin@kvm5a ~]# dir
    total 4
    -r-------- 1 admin root 128 Jul 30 16:33 pve-root-ca.key
  [admin@kvm5a ~]# openssl rsa -in pve-root-ca.key -check
    unable to load Private Key
    140234217199248:error:0906D06C:PEM routines:PEM_read_bio:no start line:pem_lib.c:696:Expecting: ANY PRIVATE KEY
 
you ran pvecm keygen, which is not for the root CA but for the corosync authentication key
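
for illustration, the mismatch is easy to see: pvecm keygen writes a raw binary corosync authkey with no PEM header, while the root CA key has to be a PEM-encoded private key, which is exactly what the openssl error above complains about. a rough check (assuming the mis-generated key is still lying in /root):

Code:
# the corosync authkey is raw binary, not PEM - openssl cannot parse it
file /root/pve-root-ca.key                  # typically reported as plain "data"
grep -c "BEGIN" /root/pve-root-ca.key       # 0 -> no PEM header present
# after a successful pvecm updatecerts run, the real CA key is normal PEM/RSA:
openssl rsa -in /etc/pve/priv/pve-root-ca.key -noout -check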
 
Had an issue with this in the log:
Aug 03 18:35:02 pve1-weha pveproxy[1728]: proxy detected vanished client connection
Aug 03 18:35:02 pve1-weha pveproxy[1729]: '/etc/pve/nodes/pve2-weha/pve-ssl.pem' does not exist!
Aug 03 18:35:32 pve1-weha pveproxy[1729]: proxy detected vanished client connection
Aug 03 18:35:33 pve1-weha pveproxy[1729]: '/etc/pve/nodes/pve2-weha/pve-ssl.pem' does not exist!
Aug 03 18:35:37 pve1-weha pveproxy[1729]: '/etc/pve/nodes/pve2-weha/pve-ssl.pem' does not exist!

I could not connect to both nodes in my newly created test cluster: the second node was not responding to direct web requests and was timing out when viewed from the Datacenter perspective, even though the cluster was formed:

Date: Mon Aug 3 18:34:37 2020
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000002
Ring ID: 1.2f4
Quorate: Yes

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 2
Flags: Quorate

The commands provided by dcsapak fixed that right away.

Thx
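
For anyone who wants to double-check the result, the regenerated CA and node certificate can be inspected read-only with standard openssl calls (default PVE paths assumed):

Code:
# CA and node certificate should exist and have sane validity dates
openssl x509 -in /etc/pve/pve-root-ca.pem -noout -subject -startdate -enddate
openssl x509 -in /etc/pve/local/pve-ssl.pem -noout -subject -startdate -enddate
# the node certificate should verify against the (new) cluster CA
openssl verify -CAfile /etc/pve/pve-root-ca.pem /etc/pve/local/pve-ssl.pem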
 
I have this same error for a new node I just added to my existing cluster. What's the correct way to generate keys?
 
please provide pvecm status and journalctl -b -u pve-cluster output..
 
Code:
pvecm updatecerts -f
(re)generate node files
generate new node certificate
merge authorized SSH keys and known hosts
root@falcon:~# ls -l /etc/pve/nodes/phoenix/
total 0
drwxr-xr-x 2 root www-data 0 May 24 23:10 lxc
drwxr-xr-x 2 root www-data 0 May 24 23:10 openvz
drwx------ 2 root www-data 0 May 24 23:10 priv
drwxr-xr-x 2 root www-data 0 May 24 23:10 qemu-server
root@falcon:~# pvecm status
Cluster information
-------------------
Name:             rebel
Config Version:   32
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Tue May 25 04:59:35 2021
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000004
Ring ID:          1.9251c
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      3
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.1.1.16
0x00000003          1 10.1.1.12
0x00000004          1 10.1.1.13 (local)
root@falcon:~# journalctl -b -u pve-cluster
-- Logs begin at Mon 2021-05-24 23:37:05 MDT, end at Tue 2021-05-25 04:59:36 MDT. --
May 24 23:37:32 falcon systemd[1]: Starting The Proxmox VE cluster filesystem...
May 24 23:37:32 falcon pmxcfs[7192]: [quorum] crit: quorum_initialize failed: 2
May 24 23:37:32 falcon pmxcfs[7192]: [quorum] crit: can't initialize service
May 24 23:37:32 falcon pmxcfs[7192]: [confdb] crit: cmap_initialize failed: 2
May 24 23:37:32 falcon pmxcfs[7192]: [confdb] crit: can't initialize service
May 24 23:37:32 falcon pmxcfs[7192]: [dcdb] crit: cpg_initialize failed: 2
May 24 23:37:32 falcon pmxcfs[7192]: [dcdb] crit: can't initialize service
May 24 23:37:32 falcon pmxcfs[7192]: [status] crit: cpg_initialize failed: 2
May 24 23:37:32 falcon pmxcfs[7192]: [status] crit: can't initialize service
May 24 23:37:33 falcon systemd[1]: Started The Proxmox VE cluster filesystem.
May 24 23:37:38 falcon pmxcfs[7192]: [status] notice: update cluster info (cluster name  rebel, version = 32)
May 24 23:37:38 falcon pmxcfs[7192]: [dcdb] notice: members: 4/7192
May 24 23:37:38 falcon pmxcfs[7192]: [dcdb] notice: all data is up to date
May 24 23:37:38 falcon pmxcfs[7192]: [status] notice: members: 4/7192
May 24 23:37:38 falcon pmxcfs[7192]: [status] notice: all data is up to date
May 24 23:37:44 falcon pmxcfs[7192]: [status] notice: cpg_send_message retry 10
May 24 23:37:45 falcon pmxcfs[7192]: [status] notice: cpg_send_message retry 20
May 24 23:37:46 falcon pmxcfs[7192]: [status] notice: cpg_send_message retry 30
May 24 23:37:47 falcon pmxcfs[7192]: [dcdb] notice: members: 1/4338, 3/5262, 4/7192
May 24 23:37:47 falcon pmxcfs[7192]: [dcdb] notice: starting data syncronisation
May 24 23:37:47 falcon pmxcfs[7192]: [status] notice: members: 1/4338, 3/5262, 4/7192
May 24 23:37:47 falcon pmxcfs[7192]: [status] notice: starting data syncronisation
May 24 23:37:47 falcon pmxcfs[7192]: [status] notice: node has quorum
May 24 23:37:47 falcon pmxcfs[7192]: [status] notice: cpg_send_message retried 36 times
May 24 23:37:47 falcon pmxcfs[7192]: [dcdb] notice: received sync request (epoch 1/4338/00000044)
May 24 23:37:47 falcon pmxcfs[7192]: [status] notice: received sync request (epoch 1/4338/00000037)
May 24 23:37:47 falcon pmxcfs[7192]: [dcdb] notice: received all states
May 24 23:37:47 falcon pmxcfs[7192]: [dcdb] notice: leader is 1/4338
May 24 23:37:47 falcon pmxcfs[7192]: [dcdb] notice: synced members: 1/4338, 3/5262, 4/7192
May 24 23:37:47 falcon pmxcfs[7192]: [dcdb] notice: all data is up to date
May 24 23:37:47 falcon pmxcfs[7192]: [dcdb] notice: dfsm_deliver_queue: queue length 2
May 24 23:37:47 falcon pmxcfs[7192]: [status] notice: received all states
May 24 23:37:47 falcon pmxcfs[7192]: [status] notice: all data is up to date
May 24 23:37:47 falcon pmxcfs[7192]: [status] notice: dfsm_deliver_queue: queue length 12
May 24 23:41:08 falcon pmxcfs[7192]: [dcdb] notice: data verification successful
May 24 23:49:26 falcon pmxcfs[7192]: [status] notice: received log
May 25 00:05:26 falcon pmxcfs[7192]: [status] notice: received log
May 25 00:21:26 falcon pmxcfs[7192]: [status] notice: received log
May 25 00:36:26 falcon pmxcfs[7192]: [status] notice: received log
May 25 00:41:04 falcon pmxcfs[7192]: [dcdb] notice: data verification successful
May 25 00:51:26 falcon pmxcfs[7192]: [status] notice: received log
May 25 01:06:23 falcon pmxcfs[7192]: [status] notice: received log
May 25 01:21:24 falcon pmxcfs[7192]: [status] notice: received log
May 25 01:37:23 falcon pmxcfs[7192]: [status] notice: received log
May 25 01:41:05 falcon pmxcfs[7192]: [dcdb] notice: data verification successful
May 25 01:42:43 falcon pmxcfs[7192]: [status] notice: cpg_send_message retried 1 times
May 25 01:52:24 falcon pmxcfs[7192]: [status] notice: received log
May 25 02:08:24 falcon pmxcfs[7192]: [status] notice: received log
May 25 02:23:24 falcon pmxcfs[7192]: [status] notice: received log
May 25 02:39:23 falcon pmxcfs[7192]: [status] notice: received log
May 25 02:41:06 falcon pmxcfs[7192]: [dcdb] notice: data verification successful
May 25 02:55:23 falcon pmxcfs[7192]: [status] notice: received log
May 25 03:11:24 falcon pmxcfs[7192]: [status] notice: received log
May 25 03:17:33 falcon pmxcfs[7192]: [status] notice: cpg_send_message retried 1 times
May 25 03:27:24 falcon pmxcfs[7192]: [status] notice: received log
May 25 03:33:50 falcon pmxcfs[7192]: [status] notice: received log
May 25 03:33:55 falcon pmxcfs[7192]: [status] notice: received log
May 25 03:41:07 falcon pmxcfs[7192]: [dcdb] notice: data verification successful
May 25 03:43:28 falcon pmxcfs[7192]: [status] notice: received log
May 25 03:59:28 falcon pmxcfs[7192]: [status] notice: received log
May 25 04:15:27 falcon pmxcfs[7192]: [status] notice: received log
May 25 04:29:43 falcon pmxcfs[7192]: [status] notice: cpg_send_message retried 1 times
May 25 04:30:28 falcon pmxcfs[7192]: [status] notice: received log
May 25 04:41:08 falcon pmxcfs[7192]: [dcdb] notice: data verification successful
May 25 04:46:28 falcon pmxcfs[7192]: [status] notice: received log
 
you're mixing nodes - please provide the output from the failing node...
 
ah, sorry. here is the failing node.
Code:
root@phoenix:~# pvecm status
Cluster information
-------------------
Name:             rebel
Config Version:   32
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Tue May 25 05:19:35 2021
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000002
Ring ID:          2.93160
Quorate:          No

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      1
Quorum:           3 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 10.1.1.8 (local)
root@phoenix:~# journalctl -b -u pve-cluster
-- Logs begin at Mon 2021-05-24 23:22:34 MDT, end at Tue 2021-05-25 05:20:06 MDT. --
May 24 23:22:36 phoenix systemd[1]: Starting The Proxmox VE cluster filesystem...
May 24 23:22:36 phoenix pmxcfs[2170]: [quorum] crit: quorum_initialize failed: 2
May 24 23:22:36 phoenix pmxcfs[2170]: [quorum] crit: can't initialize service
May 24 23:22:36 phoenix pmxcfs[2170]: [confdb] crit: cmap_initialize failed: 2
May 24 23:22:36 phoenix pmxcfs[2170]: [confdb] crit: can't initialize service
May 24 23:22:36 phoenix pmxcfs[2170]: [dcdb] crit: cpg_initialize failed: 2
May 24 23:22:36 phoenix pmxcfs[2170]: [dcdb] crit: can't initialize service
May 24 23:22:36 phoenix pmxcfs[2170]: [status] crit: cpg_initialize failed: 2
May 24 23:22:36 phoenix pmxcfs[2170]: [status] crit: can't initialize service
May 24 23:22:37 phoenix systemd[1]: Started The Proxmox VE cluster filesystem.
May 24 23:22:42 phoenix pmxcfs[2170]: [status] notice: update cluster info (cluster name  rebel, version = 32)
May 24 23:22:42 phoenix pmxcfs[2170]: [dcdb] notice: members: 2/2170
May 24 23:22:42 phoenix pmxcfs[2170]: [dcdb] notice: all data is up to date
May 24 23:22:42 phoenix pmxcfs[2170]: [status] notice: members: 2/2170
May 24 23:22:42 phoenix pmxcfs[2170]: [status] notice: all data is up to date
May 24 23:22:59 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 10
May 24 23:23:00 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 20
May 24 23:23:00 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retried 20 times
May 24 23:23:09 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 10
May 24 23:23:10 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 20
May 24 23:23:11 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 30
May 24 23:23:12 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 40
May 24 23:23:13 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 50
May 24 23:23:14 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 60
May 24 23:23:15 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retried 68 times
May 24 23:23:29 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 10
May 24 23:23:30 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retried 17 times
May 24 23:23:39 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 10
May 24 23:23:40 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 20
May 24 23:23:41 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 30
May 24 23:23:42 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 40
May 24 23:23:43 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 50
May 24 23:23:44 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 60
May 24 23:23:45 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 70
May 24 23:23:46 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 80
May 24 23:23:47 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 90
May 24 23:23:48 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retry 100
May 24 23:23:48 phoenix pmxcfs[2170]: [status] notice: cpg_send_message retried 100 times
May 24 23:23:48 phoenix pmxcfs[2170]: [status] crit: cpg_send_message failed: 6
 
that node is not part of the quorum, so it won't be able to write to /etc/pve to regenerate any files. find out why it's not part of the quorate partition (hint: the corosync logs might contain an error pointing at the cause), fix that issue, then run the command again on the phoenix node.
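
a few read-only commands that usually narrow this down (assuming the corosync 3.x tooling that ships with current PVE):

Code:
# link/ring status as corosync sees it on this node
corosync-cfgtool -s
# quorum view from corosync itself
corosync-quorumtool -s
# recent corosync errors on this node (token timeouts, link down, auth failures)
journalctl -b -u corosync | grep -iE 'error|fail|token|link' | tail -n 30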
 
there are definitely a bunch of errors in the syslog of the machine I'm having trouble with, but I'm not sure which one is causing the problem. I've tried searching for a few of them, but I haven't found anything that's helped yet. the issue with /etc/pve/local/pve-ssl.key seems circular to me: the node can't generate the keys because it doesn't have quorum, but it can't get quorum because it doesn't have the key. how do I solve this?

Code:
May 25 19:52:50 phoenix systemd-timesyncd[1853]: Synchronized to time server for the first time 45.63.54.13:123 (2.debian.pool.ntp.org).
May 25 19:52:51 phoenix systemd[1]: Created slice User Slice of UID 0.
May 25 19:52:51 phoenix systemd[1]: Starting User Runtime Directory /run/user/0...
May 25 19:52:51 phoenix systemd[1]: Started User Runtime Directory /run/user/0.
May 25 19:52:51 phoenix systemd[1]: Starting User Manager for UID 0...
May 25 19:52:51 phoenix systemd[2849]: Listening on GnuPG cryptographic agent and passphrase cache (access for web browsers).
May 25 19:52:51 phoenix systemd[2849]: Reached target Paths.
May 25 19:52:51 phoenix systemd[2849]: Reached target Timers.
May 25 19:52:51 phoenix systemd[2849]: Listening on GnuPG cryptographic agent and passphrase cache (restricted).
May 25 19:52:51 phoenix systemd[2849]: Listening on GnuPG cryptographic agent and passphrase cache.
May 25 19:52:51 phoenix systemd[2849]: Listening on GnuPG network certificate management daemon.
May 25 19:52:51 phoenix systemd[2849]: Listening on GnuPG cryptographic agent (ssh-agent emulation).
May 25 19:52:51 phoenix systemd[2849]: Reached target Sockets.
May 25 19:52:51 phoenix systemd[2849]: Reached target Basic System.
May 25 19:52:51 phoenix systemd[2849]: Reached target Default.
May 25 19:52:51 phoenix systemd[2849]: Startup finished in 48ms.
May 25 19:52:51 phoenix systemd[1]: Started User Manager for UID 0.
May 25 19:52:51 phoenix systemd[1]: Started Session 1 of user root.
May 25 19:52:51 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retry 90
May 25 19:52:51 phoenix corosync[2278]:   [TOTEM ] A new membership (2.b5739) was formed. Members
May 25 19:52:52 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retry 100
May 25 19:52:52 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retried 100 times
May 25 19:52:52 phoenix pmxcfs[2171]: [status] crit: cpg_send_message failed: 6
May 25 19:52:53 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retry 10
May 25 19:52:53 phoenix corosync[2278]:   [TOTEM ] Token has not been received in 1726 ms
May 25 19:52:54 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retry 20
May 25 19:52:54 phoenix pveproxy[2844]: worker exit
May 25 19:52:54 phoenix pveproxy[2486]: worker 2844 finished
May 25 19:52:54 phoenix pveproxy[2486]: starting 2 worker(s)
May 25 19:52:54 phoenix pveproxy[2486]: worker 2892 started
May 25 19:52:54 phoenix pveproxy[2486]: worker 2909 started
May 25 19:52:54 phoenix pveproxy[2843]: worker exit
May 25 19:52:54 phoenix pveproxy[2892]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1737.
May 25 19:52:54 phoenix pveproxy[2909]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1737.
May 25 19:52:54 phoenix pveproxy[2486]: worker 2843 finished
May 25 19:52:54 phoenix pveproxy[2486]: starting 1 worker(s)
May 25 19:52:54 phoenix pveproxy[2486]: worker 2946 started
May 25 19:52:54 phoenix pveproxy[2946]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1737.
May 25 19:52:55 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retry 30
May 25 19:52:55 phoenix corosync[2278]:   [TOTEM ] Token has not been received in 4051 ms
May 25 19:52:56 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retry 40
May 25 19:52:57 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retry 50
May 25 19:52:58 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retry 60
May 25 19:52:59 phoenix corosync[2278]:   [TOTEM ] A new membership (2.b574d) was formed. Members
May 25 19:52:59 phoenix corosync[2278]:   [QUORUM] Members[1]: 2
May 25 19:52:59 phoenix corosync[2278]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 25 19:52:59 phoenix pmxcfs[2171]: [status] notice: cpg_send_message retried 66 times
May 25 19:52:59 phoenix pvestatd[2296]: status update time (16.845 seconds)
May 25 19:52:59 phoenix pmxcfs[2171]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-storage/phoenix/local: -1
May 25 19:52:59 phoenix pveproxy[2909]: worker exit
May 25 19:52:59 phoenix pveproxy[2892]: worker exit
May 25 19:52:59 phoenix pveproxy[2486]: worker 2909 finished
May 25 19:52:59 phoenix pveproxy[2486]: starting 1 worker(s)
May 25 19:52:59 phoenix pveproxy[2486]: worker 2953 started
May 25 19:52:59 phoenix pveproxy[2486]: worker 2892 finished
May 25 19:52:59 phoenix pveproxy[2486]: starting 1 worker(s)
May 25 19:52:59 phoenix pveproxy[2486]: worker 2954 started
May 25 19:52:59 phoenix pveproxy[2946]: worker exit
May 25 19:52:59 phoenix pveproxy[2953]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1737.
May 25 19:52:59 phoenix pveproxy[2954]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1737.
May 25 19:52:59 phoenix pveproxy[2486]: worker 2946 finished
May 25 19:52:59 phoenix pveproxy[2486]: starting 1 worker(s)
May 25 19:52:59 phoenix pveproxy[2486]: worker 2955 started
May 25 19:52:59 phoenix pveproxy[2955]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1737.
May 25 19:53:00 phoenix systemd[1]: Starting Proxmox VE replication runner...
May 25 19:53:00 phoenix pvesr[3015]: error during cfs-locked 'file-replication_cfg' operation: no quorum!
May 25 19:53:00 phoenix systemd[1]: pvesr.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
May 25 19:53:00 phoenix systemd[1]: pvesr.service: Failed with result 'exit-code'.
May 25 19:53:00 phoenix systemd[1]: Failed to start Proxmox VE replication runner.
May 25 19:53:00 phoenix corosync[2278]:   [TOTEM ] Token has not been received in 1725 ms
May 25 19:53:01 phoenix cron[2277]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
 
how is the network situation between the nodes? anything out of the ordinary? can you try restarting corosync and pve-cluster services?
 
yep - I was just about to follow up on that. networking was the issue. all of the nodes are on a 10G switch, so on that switch I disabled IGMP snooping, and on the nodes I ran
Code:
echo 0 >/sys/class/net/vmbr0/bridge/multicast_snooping
and then on the existing node ran:
Code:
service pve-cluster restart
service corosync restart
on the new node, I reinstalled Proxmox, updated it, set multicast_snooping again, and re-joined it to the cluster, which worked immediately.
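
one note on the echo above: it only lasts until the bridge is brought down or the node reboots. a common way to make it persistent (sketch only - the stanza below is hypothetical, adjust the address and ports to match your own /etc/network/interfaces) is a post-up line on vmbr0:

Code:
# /etc/network/interfaces - hypothetical vmbr0 stanza
auto vmbr0
iface vmbr0 inet static
        address 10.1.1.8/24
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        # disable multicast snooping every time the bridge comes up
        post-up echo 0 > /sys/class/net/vmbr0/bridge/multicast_snooping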
 
(quoting the earlier answer: delete /etc/pve/priv/pve-root-ca.key and /etc/pve/pve-root-ca.pem once, run pvecm updatecerts -f on every node, then systemctl restart pveproxy on each node)
This happened to me because I renamed a node to an old node name after replacing a server.

The fix was to remove all the old SSH keys and then update the certs as in the quoted steps above.
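
A rough sketch of the SSH part (assumptions: the stale entries live in the cluster-wide known_hosts under /etc/pve/priv, and the placeholders below stand in for the old node's name and address):

Code:
# drop stale host-key entries for the renamed/replaced node
ssh-keygen -f /etc/pve/priv/known_hosts -R <old-node-name>
ssh-keygen -f /etc/pve/priv/known_hosts -R <old-node-ip>
# then regenerate certificates, re-merge keys, and reload the web UI
pvecm updatecerts -f
systemctl restart pveproxy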