Proxmox Cluster doesn't work anymore

kriskenbe

New Member
May 11, 2022
Hello,

First of all: I am new to Proxmox, so it's more than possible that I've made a mistake.
I have 3 servers and set each of them up with Proxmox. Everything went well, so I decided to join them together as a cluster (no HA). Once that was working, I started to create some VMs.
About 10 days later there was a power problem in the datacenter, and all servers went down. When I started them again, only one server showed as online in the GUI, although I can reach all servers using SSH.

Any idea what's happening?

SERVER 1
Code:
root@proxmox1:~# systemctl status pve-cluster pveproxy pvedaemon
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2022-05-14 13:58:24 CEST; 40min ago
    Process: 2610460 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
   Main PID: 2610461 (pmxcfs)
      Tasks: 6 (limit: 28818)
     Memory: 38.3M
        CPU: 3.343s
     CGroup: /system.slice/pve-cluster.service
             └─2610461 /usr/bin/pmxcfs

May 14 13:58:23 proxmox1 pmxcfs[2610461]: [dcdb] crit: can't initialize service
May 14 13:58:23 proxmox1 pmxcfs[2610461]: [status] crit: cpg_initialize failed: 2
May 14 13:58:23 proxmox1 pmxcfs[2610461]: [status] crit: can't initialize service
May 14 13:58:24 proxmox1 systemd[1]: Started The Proxmox VE cluster filesystem.
May 14 13:58:29 proxmox1 pmxcfs[2610461]: [status] notice: update cluster info (cluster name  DCO-Proxmox, version = 3)
May 14 13:58:29 proxmox1 pmxcfs[2610461]: [dcdb] notice: members: 1/2610461
May 14 13:58:29 proxmox1 pmxcfs[2610461]: [dcdb] notice: all data is up to date
May 14 13:58:29 proxmox1 pmxcfs[2610461]: [status] notice: members: 1/2610461
May 14 13:58:29 proxmox1 pmxcfs[2610461]: [status] notice: all data is up to date
May 14 13:58:29 proxmox1 pmxcfs[2610461]: [status] notice: node has quorum

● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2022-05-14 13:58:33 CEST; 40min ago
    Process: 2610556 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
    Process: 2610567 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
   Main PID: 2610632 (pveproxy)
      Tasks: 4 (limit: 28818)
     Memory: 151.7M
        CPU: 6.414s
     CGroup: /system.slice/pveproxy.service
             ├─2610632 pveproxy
             ├─2610633 pveproxy worker
             ├─2610634 pveproxy worker
             └─2610635 pveproxy worker

May 14 13:58:31 proxmox1 systemd[1]: Starting PVE API Proxy Server...
May 14 13:58:33 proxmox1 pveproxy[2610632]: starting server
May 14 13:58:33 proxmox1 pveproxy[2610632]: starting 3 worker(s)
May 14 13:58:33 proxmox1 pveproxy[2610632]: worker 2610633 started
May 14 13:58:33 proxmox1 pveproxy[2610632]: worker 2610634 started
May 14 13:58:33 proxmox1 pveproxy[2610632]: worker 2610635 started
May 14 13:58:33 proxmox1 systemd[1]: Started PVE API Proxy Server.

● pvedaemon.service - PVE API Daemon
     Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2022-05-14 13:58:29 CEST; 40min ago
    Process: 2610487 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
   Main PID: 2610548 (pvedaemon)
      Tasks: 4 (limit: 28818)
     Memory: 138.3M
        CPU: 1.945s
     CGroup: /system.slice/pvedaemon.service
             ├─2610548 pvedaemon
             ├─2610549 pvedaemon worker
             ├─2610550 pvedaemon worker
             └─2610551 pvedaemon worker

May 14 13:58:27 proxmox1 systemd[1]: Starting PVE API Daemon...
May 14 13:58:29 proxmox1 pvedaemon[2610548]: starting server
May 14 13:58:29 proxmox1 pvedaemon[2610548]: starting 3 worker(s)
May 14 13:58:29 proxmox1 pvedaemon[2610548]: worker 2610549 started
May 14 13:58:29 proxmox1 pvedaemon[2610548]: worker 2610550 started
May 14 13:58:29 proxmox1 pvedaemon[2610548]: worker 2610551 started
May 14 13:58:29 proxmox1 systemd[1]: Started PVE API Daemon.
May 14 14:06:03 proxmox1 pvedaemon[2610549]: <root@pam> successful auth for user 'root@pam'
May 14 14:21:03 proxmox1 pvedaemon[2610551]: <root@pam> successful auth for user 'root@pam'
May 14 14:36:06 proxmox1 pvedaemon[2610549]: <root@pam> successful auth for user 'root@pam'

SERVER 2
Code:
root@proxmox-okp02:~# systemctl status pve-cluster pveproxy pvedaemon
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sat 2022-05-14 13:58:26 CEST; 40min ago
    Process: 52409 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 10ms

May 14 13:58:26 proxmox-okp02 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
May 14 13:58:26 proxmox-okp02 systemd[1]: Stopped The Proxmox VE cluster filesystem.
May 14 13:58:26 proxmox-okp02 systemd[1]: pve-cluster.service: Start request repeated too quickly.
May 14 13:58:26 proxmox-okp02 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
May 14 13:58:26 proxmox-okp02 systemd[1]: Failed to start The Proxmox VE cluster filesystem.

● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-05-13 22:03:56 CEST; 16h ago
    Process: 2424 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=111)
    Process: 2425 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
    Process: 8219 ExecReload=/usr/bin/pveproxy restart (code=exited, status=0/SUCCESS)
   Main PID: 2426 (pveproxy)
      Tasks: 4 (limit: 28762)
     Memory: 133.7M
        CPU: 29min 13.154s
     CGroup: /system.slice/pveproxy.service
             ├─ 2426 pveproxy
             ├─54304 pveproxy worker
             ├─54305 pveproxy worker
             └─54306 pveproxy worker

May 14 14:38:48 proxmox-okp02 pveproxy[2426]: worker 54300 finished
May 14 14:38:48 proxmox-okp02 pveproxy[2426]: starting 1 worker(s)
May 14 14:38:48 proxmox-okp02 pveproxy[2426]: worker 54305 started
May 14 14:38:48 proxmox-okp02 pveproxy[54301]: worker exit
May 14 14:38:48 proxmox-okp02 pveproxy[54304]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 14 14:38:48 proxmox-okp02 pveproxy[54305]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 14 14:38:48 proxmox-okp02 pveproxy[2426]: worker 54301 finished
May 14 14:38:48 proxmox-okp02 pveproxy[2426]: starting 1 worker(s)
May 14 14:38:48 proxmox-okp02 pveproxy[2426]: worker 54306 started
May 14 14:38:48 proxmox-okp02 pveproxy[54306]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.

● pvedaemon.service - PVE API Daemon
     Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-05-13 22:03:54 CEST; 16h ago
    Process: 2412 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
   Main PID: 2418 (pvedaemon)
      Tasks: 4 (limit: 28762)
     Memory: 135.9M
        CPU: 7.249s
     CGroup: /system.slice/pvedaemon.service
             ├─2418 pvedaemon
             ├─2419 pvedaemon worker
             ├─2420 pvedaemon worker
             └─2421 pvedaemon worker

May 13 22:03:53 proxmox-okp02 systemd[1]: Starting PVE API Daemon...
May 13 22:03:54 proxmox-okp02 pvedaemon[2418]: starting server
May 13 22:03:54 proxmox-okp02 pvedaemon[2418]: starting 3 worker(s)
May 13 22:03:54 proxmox-okp02 pvedaemon[2418]: worker 2419 started
May 13 22:03:54 proxmox-okp02 pvedaemon[2418]: worker 2420 started
May 13 22:03:54 proxmox-okp02 pvedaemon[2418]: worker 2421 started
May 13 22:03:54 proxmox-okp02 systemd[1]: Started PVE API Daemon.


SERVER 3
Code:
root@proxmox-okp03:~# systemctl status pve-cluster pveproxy pvedaemon
● pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sat 2022-05-14 13:58:29 CEST; 40min ago
    Process: 60549 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 10ms

May 14 13:58:29 proxmox-okp03 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
May 14 13:58:29 proxmox-okp03 systemd[1]: Stopped The Proxmox VE cluster filesystem.
May 14 13:58:29 proxmox-okp03 systemd[1]: pve-cluster.service: Start request repeated too quickly.
May 14 13:58:29 proxmox-okp03 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
May 14 13:58:29 proxmox-okp03 systemd[1]: Failed to start The Proxmox VE cluster filesystem.

● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-05-13 19:12:44 CEST; 19h ago
    Process: 1933 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=111)
    Process: 1934 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
    Process: 15999 ExecReload=/usr/bin/pveproxy restart (code=exited, status=0/SUCCESS)
   Main PID: 1935 (pveproxy)
      Tasks: 4 (limit: 28762)
     Memory: 133.6M
        CPU: 34min 44.228s
     CGroup: /system.slice/pveproxy.service
             ├─ 1935 pveproxy
             ├─62682 pveproxy worker
             ├─62683 pveproxy worker
             └─62684 pveproxy worker

May 14 14:38:48 proxmox-okp03 pveproxy[62682]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 14 14:38:48 proxmox-okp03 pveproxy[62676]: worker exit
May 14 14:38:48 proxmox-okp03 pveproxy[1935]: worker 62675 finished
May 14 14:38:48 proxmox-okp03 pveproxy[1935]: starting 1 worker(s)
May 14 14:38:48 proxmox-okp03 pveproxy[1935]: worker 62683 started
May 14 14:38:48 proxmox-okp03 pveproxy[1935]: worker 62676 finished
May 14 14:38:48 proxmox-okp03 pveproxy[1935]: starting 1 worker(s)
May 14 14:38:48 proxmox-okp03 pveproxy[1935]: worker 62684 started
May 14 14:38:48 proxmox-okp03 pveproxy[62683]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 14 14:38:48 proxmox-okp03 pveproxy[62684]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 14 14:38:53 proxmox-okp03 pveproxy[62682]: worker exit
May 14 14:38:53 proxmox-okp03 pveproxy[1935]: worker 62682 finished
May 14 14:38:53 proxmox-okp03 pveproxy[1935]: starting 1 worker(s)
May 14 14:38:53 proxmox-okp03 pveproxy[1935]: worker 62687 started

● pvedaemon.service - PVE API Daemon
     Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-05-13 19:12:42 CEST; 19h ago
    Process: 1922 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
   Main PID: 1928 (pvedaemon)
      Tasks: 4 (limit: 28762)
     Memory: 135.8M
        CPU: 8.546s
     CGroup: /system.slice/pvedaemon.service
             ├─1928 pvedaemon
             ├─1929 pvedaemon worker
             ├─1930 pvedaemon worker
             └─1931 pvedaemon worker

May 13 19:12:41 proxmox-okp03 systemd[1]: Starting PVE API Daemon...
May 13 19:12:42 proxmox-okp03 pvedaemon[1928]: starting server
May 13 19:12:42 proxmox-okp03 pvedaemon[1928]: starting 3 worker(s)
May 13 19:12:42 proxmox-okp03 pvedaemon[1928]: worker 1929 started
May 13 19:12:42 proxmox-okp03 pvedaemon[1928]: worker 1930 started

 
Hello,
Thanks for your answer. I just tried it on the 3 servers.


Server 1
Code:
root@proxmox1:~# pvecm status
Cluster information
-------------------
Name:             DCO-Proxmox
Config Version:   3
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon May 16 11:52:45 2022
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000001
Ring ID:          1.276
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 185.18.149.201 (local)
0x00000002          1 185.18.149.202
0x00000003          1 185.18.149.203

Server 2
Code:
root@proxmox-okp02:~# pvecm status
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused

Server 3
Code:
root@proxmox-okp03:~# pvecm status
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
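
For reference, the "Quorum: 2" line in Server 1's output is plain majority arithmetic over the 3 expected votes. Since Server 1 still counts 3 total votes, corosync membership does not look like the problem here; the "Connection refused" errors on servers 2 and 3 point at the local pve-cluster service (pmxcfs) instead. A minimal sketch of the arithmetic, assuming default votequorum behaviour with no special options such as two-node mode; the function names are hypothetical and not part of any Proxmox or corosync API:

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority of expected_votes."""
    if expected_votes < 1:
        raise ValueError("expected_votes must be at least 1")
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True when the votes currently present reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


# Values taken from the pvecm status output above: 3 expected votes.
print(quorum_threshold(3))  # 2
print(is_quorate(3, 3))     # True: all three nodes are voting
print(is_quorate(1, 3))     # False: a single node is not a majority
```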
 
Can you post the /etc/hosts of either server 2 or 3?
 
Sure!
Thanks for your help!

Server 1
Code:
127.0.0.1 localhost.localdomain localhost
185.18.149.201 proxmox1.dco-be.it2go.eu proxmox1
#127.0.0.1 localhost
#185.18.149.201 proxmox-okp01 proxmox-okp01.dco-be.it2go.eu


# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Server 2
Code:
127.0.0.1 localhost.localdomain localhost
185.18.149.202 proxmox2.dco-be.it2go.eu proxmox2

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Server 3
Code:
127.0.0.1 localhost.localdomain localhost
185.18.149.203 proxmox3.dco-be.it2go.eu proxmox3

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
 
The hosts file looks good to me. Can you check whether journalctl -xe shows any errors, and post the output of journalctl -u corosync -u pve-cluster?
 
journalctl -xe
I see a lot of errors on server 2 and 3 concerning private keys...
Code:
May 16 15:42:08 proxmox-okp02 pveproxy[201565]: worker exit
May 16 15:42:08 proxmox-okp02 pveproxy[201566]: worker exit
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: worker 201565 finished
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: starting 1 worker(s)
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: worker 201570 started
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: worker 201566 finished
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: starting 1 worker(s)
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: worker 201571 started
May 16 15:42:08 proxmox-okp02 pveproxy[201567]: worker exit
May 16 15:42:08 proxmox-okp02 pveproxy[201570]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 16 15:42:08 proxmox-okp02 pveproxy[201571]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: worker 201567 finished
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: starting 1 worker(s)
May 16 15:42:08 proxmox-okp02 pveproxy[2426]: worker 201573 started

Code:
May 16 15:40:28 proxmox-okp03 pveproxy[210856]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 16 15:40:28 proxmox-okp03 sshd[210854]: Received disconnect from 20.197.177.187 port 58160:11: Bye Bye [preauth]
May 16 15:40:28 proxmox-okp03 sshd[210854]: Disconnected from invalid user parkpoom 20.197.177.187 port 58160 [preauth]
May 16 15:40:28 proxmox-okp03 pveproxy[210852]: worker exit
May 16 15:40:28 proxmox-okp03 pveproxy[210853]: worker exit
May 16 15:40:28 proxmox-okp03 pveproxy[1935]: worker 210852 finished
May 16 15:40:28 proxmox-okp03 pveproxy[1935]: worker 210853 finished
May 16 15:40:28 proxmox-okp03 pveproxy[1935]: starting 2 worker(s)
May 16 15:40:28 proxmox-okp03 pveproxy[1935]: worker 210857 started
May 16 15:40:28 proxmox-okp03 pveproxy[1935]: worker 210858 started
May 16 15:40:28 proxmox-okp03 pveproxy[210857]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.
May 16 15:40:28 proxmox-okp03 pveproxy[210858]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1904.

journalctl -u corosync -u pve-cluster

server 2
Code:
May 12 10:25:08 proxmox-okp02 corosync[6548]:   [CFG   ] Node 1 was shut down by sysadmin
May 12 10:25:08 proxmox-okp02 corosync[6548]:   [QUORUM] Sync members[2]: 2 3
May 12 10:25:08 proxmox-okp02 corosync[6548]:   [QUORUM] Sync left[1]: 1
May 12 10:25:08 proxmox-okp02 corosync[6548]:   [TOTEM ] A new membership (2.20b) was formed. Members left: 1
May 12 10:25:08 proxmox-okp02 corosync[6548]:   [QUORUM] Members[2]: 2 3
May 12 10:25:08 proxmox-okp02 corosync[6548]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:25:09 proxmox-okp02 corosync[6548]:   [KNET  ] link: host: 1 link: 0 is down
May 12 10:25:09 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:25:09 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 has no active links
May 12 10:26:24 proxmox-okp02 corosync[6548]:   [KNET  ] rx: host: 1 link: 0 is up
May 12 10:26:24 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:26:26 proxmox-okp02 corosync[6548]:   [QUORUM] Sync members[3]: 1 2 3
May 12 10:26:26 proxmox-okp02 corosync[6548]:   [QUORUM] Sync joined[1]: 1
May 12 10:26:26 proxmox-okp02 corosync[6548]:   [TOTEM ] A new membership (1.210) was formed. Members joined: 1
May 12 10:26:26 proxmox-okp02 corosync[6548]:   [QUORUM] Members[3]: 1 2 3
May 12 10:26:26 proxmox-okp02 corosync[6548]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:29:37 proxmox-okp02 corosync[6548]:   [CFG   ] Node 1 was shut down by sysadmin
May 12 10:29:37 proxmox-okp02 corosync[6548]:   [QUORUM] Sync members[2]: 2 3
May 12 10:29:37 proxmox-okp02 corosync[6548]:   [QUORUM] Sync left[1]: 1
May 12 10:29:37 proxmox-okp02 corosync[6548]:   [TOTEM ] A new membership (2.214) was formed. Members left: 1
May 12 10:29:37 proxmox-okp02 corosync[6548]:   [QUORUM] Members[2]: 2 3
May 12 10:29:37 proxmox-okp02 corosync[6548]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:29:38 proxmox-okp02 corosync[6548]:   [KNET  ] link: host: 1 link: 0 is down
May 12 10:29:38 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:29:38 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 has no active links
May 12 10:29:41 proxmox-okp02 corosync[6548]:   [KNET  ] rx: host: 1 link: 0 is up
May 12 10:29:41 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:29:41 proxmox-okp02 corosync[6548]:   [QUORUM] Sync members[3]: 1 2 3
May 12 10:29:41 proxmox-okp02 corosync[6548]:   [QUORUM] Sync joined[1]: 1
May 12 10:29:41 proxmox-okp02 corosync[6548]:   [TOTEM ] A new membership (1.219) was formed. Members joined: 1
May 12 10:29:41 proxmox-okp02 corosync[6548]:   [QUORUM] Members[3]: 1 2 3
May 12 10:29:41 proxmox-okp02 corosync[6548]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:34:24 proxmox-okp02 corosync[6548]:   [CFG   ] Node 1 was shut down by sysadmin
May 12 10:34:25 proxmox-okp02 corosync[6548]:   [QUORUM] Sync members[2]: 2 3
May 12 10:34:25 proxmox-okp02 corosync[6548]:   [QUORUM] Sync left[1]: 1
May 12 10:34:25 proxmox-okp02 corosync[6548]:   [TOTEM ] A new membership (2.21d) was formed. Members left: 1
May 12 10:34:25 proxmox-okp02 corosync[6548]:   [QUORUM] Members[2]: 2 3
May 12 10:34:25 proxmox-okp02 corosync[6548]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:34:25 proxmox-okp02 corosync[6548]:   [KNET  ] link: host: 1 link: 0 is down
May 12 10:34:25 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:34:25 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 has no active links
May 12 10:35:42 proxmox-okp02 corosync[6548]:   [KNET  ] rx: host: 1 link: 0 is up
May 12 10:35:42 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:35:42 proxmox-okp02 corosync[6548]:   [QUORUM] Sync members[3]: 1 2 3
May 12 10:35:42 proxmox-okp02 corosync[6548]:   [QUORUM] Sync joined[1]: 1
May 12 10:35:42 proxmox-okp02 corosync[6548]:   [TOTEM ] A new membership (1.222) was formed. Members joined: 1
May 12 10:35:42 proxmox-okp02 corosync[6548]:   [QUORUM] Members[3]: 1 2 3
May 12 10:35:42 proxmox-okp02 corosync[6548]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:54:25 proxmox-okp02 corosync[6548]:   [CFG   ] Node 1 was shut down by sysadmin
May 12 10:54:25 proxmox-okp02 corosync[6548]:   [QUORUM] Sync members[2]: 2 3
May 12 10:54:25 proxmox-okp02 corosync[6548]:   [QUORUM] Sync left[1]: 1
May 12 10:54:25 proxmox-okp02 corosync[6548]:   [TOTEM ] A new membership (2.226) was formed. Members left: 1
May 12 10:54:25 proxmox-okp02 corosync[6548]:   [QUORUM] Members[2]: 2 3
May 12 10:54:25 proxmox-okp02 corosync[6548]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:54:26 proxmox-okp02 corosync[6548]:   [KNET  ] link: host: 1 link: 0 is down
May 12 10:54:26 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:54:26 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 has no active links
May 12 10:55:43 proxmox-okp02 corosync[6548]:   [KNET  ] rx: host: 1 link: 0 is up
May 12 10:55:43 proxmox-okp02 corosync[6548]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:55:43 proxmox-okp02 corosync[6548]:   [QUORUM] Sync members[3]: 1 2 3
May 12 10:55:43 proxmox-okp02 corosync[6548]:   [QUORUM] Sync joined[1]: 1

server 3
Code:
May 12 10:25:08 proxmox-okp03 corosync[1896]:   [CFG   ] Node 1 was shut down by sysadmin
May 12 10:25:08 proxmox-okp03 corosync[1896]:   [QUORUM] Sync members[2]: 2 3
May 12 10:25:08 proxmox-okp03 corosync[1896]:   [QUORUM] Sync left[1]: 1
May 12 10:25:08 proxmox-okp03 corosync[1896]:   [TOTEM ] A new membership (2.20b) was formed. Members left: 1
May 12 10:25:08 proxmox-okp03 corosync[1896]:   [QUORUM] Members[2]: 2 3
May 12 10:25:08 proxmox-okp03 corosync[1896]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:25:10 proxmox-okp03 corosync[1896]:   [KNET  ] link: host: 1 link: 0 is down
May 12 10:25:10 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:25:10 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 has no active links
May 12 10:26:24 proxmox-okp03 corosync[1896]:   [KNET  ] rx: host: 1 link: 0 is up
May 12 10:26:24 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:26:26 proxmox-okp03 corosync[1896]:   [QUORUM] Sync members[3]: 1 2 3
May 12 10:26:26 proxmox-okp03 corosync[1896]:   [QUORUM] Sync joined[1]: 1
May 12 10:26:26 proxmox-okp03 corosync[1896]:   [TOTEM ] A new membership (1.210) was formed. Members joined: 1
May 12 10:26:26 proxmox-okp03 corosync[1896]:   [QUORUM] Members[3]: 1 2 3
May 12 10:26:26 proxmox-okp03 corosync[1896]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:29:37 proxmox-okp03 corosync[1896]:   [CFG   ] Node 1 was shut down by sysadmin
May 12 10:29:37 proxmox-okp03 corosync[1896]:   [QUORUM] Sync members[2]: 2 3
May 12 10:29:37 proxmox-okp03 corosync[1896]:   [QUORUM] Sync left[1]: 1
May 12 10:29:37 proxmox-okp03 corosync[1896]:   [TOTEM ] A new membership (2.214) was formed. Members left: 1
May 12 10:29:37 proxmox-okp03 corosync[1896]:   [QUORUM] Members[2]: 2 3
May 12 10:29:37 proxmox-okp03 corosync[1896]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:29:41 proxmox-okp03 corosync[1896]:   [QUORUM] Sync members[3]: 1 2 3
May 12 10:29:41 proxmox-okp03 corosync[1896]:   [QUORUM] Sync joined[1]: 1
May 12 10:29:41 proxmox-okp03 corosync[1896]:   [TOTEM ] A new membership (1.219) was formed. Members joined: 1
May 12 10:29:41 proxmox-okp03 corosync[1896]:   [QUORUM] Members[3]: 1 2 3
May 12 10:29:41 proxmox-okp03 corosync[1896]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:34:24 proxmox-okp03 corosync[1896]:   [CFG   ] Node 1 was shut down by sysadmin
May 12 10:34:25 proxmox-okp03 corosync[1896]:   [QUORUM] Sync members[2]: 2 3
May 12 10:34:25 proxmox-okp03 corosync[1896]:   [QUORUM] Sync left[1]: 1
May 12 10:34:25 proxmox-okp03 corosync[1896]:   [TOTEM ] A new membership (2.21d) was formed. Members left: 1
May 12 10:34:25 proxmox-okp03 corosync[1896]:   [QUORUM] Members[2]: 2 3
May 12 10:34:25 proxmox-okp03 corosync[1896]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:34:26 proxmox-okp03 corosync[1896]:   [KNET  ] link: host: 1 link: 0 is down
May 12 10:34:26 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:34:26 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 has no active links
May 12 10:35:42 proxmox-okp03 corosync[1896]:   [KNET  ] rx: host: 1 link: 0 is up
May 12 10:35:42 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:35:42 proxmox-okp03 corosync[1896]:   [QUORUM] Sync members[3]: 1 2 3
May 12 10:35:42 proxmox-okp03 corosync[1896]:   [QUORUM] Sync joined[1]: 1
May 12 10:35:42 proxmox-okp03 corosync[1896]:   [TOTEM ] A new membership (1.222) was formed. Members joined: 1
May 12 10:35:42 proxmox-okp03 corosync[1896]:   [QUORUM] Members[3]: 1 2 3
May 12 10:35:42 proxmox-okp03 corosync[1896]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:54:25 proxmox-okp03 corosync[1896]:   [CFG   ] Node 1 was shut down by sysadmin
May 12 10:54:25 proxmox-okp03 corosync[1896]:   [QUORUM] Sync members[2]: 2 3
May 12 10:54:25 proxmox-okp03 corosync[1896]:   [QUORUM] Sync left[1]: 1
May 12 10:54:25 proxmox-okp03 corosync[1896]:   [TOTEM ] A new membership (2.226) was formed. Members left: 1
May 12 10:54:25 proxmox-okp03 corosync[1896]:   [QUORUM] Members[2]: 2 3
May 12 10:54:25 proxmox-okp03 corosync[1896]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 12 10:54:27 proxmox-okp03 corosync[1896]:   [KNET  ] link: host: 1 link: 0 is down
May 12 10:54:27 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:54:27 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 has no active links
May 12 10:55:42 proxmox-okp03 corosync[1896]:   [KNET  ] rx: host: 1 link: 0 is up
May 12 10:55:42 proxmox-okp03 corosync[1896]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 10:55:43 proxmox-okp03 corosync[1896]:   [QUORUM] Sync members[3]: 1 2 3
May 12 10:55:43 proxmox-okp03 corosync[1896]:   [QUORUM] Sync joined[1]: 1
May 12 10:55:43 proxmox-okp03 corosync[1896]:   [TOTEM ] A new membership (1.22b) was formed. Members joined: 1
May 12 10:55:43 proxmox-okp03 corosync[1896]:   [QUORUM] Members[3]: 1 2 3
May 12 10:55:43 proxmox-okp03 corosync[1896]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 13 04:11:04 proxmox-okp03 systemd[1]: Starting The Proxmox VE cluster filesystem...
May 13 04:11:04 proxmox-okp03 pmxcfs[131270]: [main] crit: Unable to get local IP address
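
The relevant line in the server 3 log is the very last one, a pmxcfs "crit" message that is easy to miss after pages of routine corosync membership changes. A minimal sketch of how such lines could be picked out of saved journal text, assuming the log format shown above; the regular expression and the function name are illustrative only:

```python
import re

# Matches lines such as:
# "May 13 04:11:04 proxmox-okp03 pmxcfs[131270]: [main] crit: Unable to get local IP address"
CRIT_LINE = re.compile(
    r"^(?P<timestamp>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) "
    r"pmxcfs\[\d+\]: \[(?P<component>\w+)\] crit: (?P<message>.*)$"
)


def pmxcfs_crit_messages(journal_text: str) -> list[tuple[str, str, str]]:
    """Return (host, component, message) for every pmxcfs 'crit' line."""
    found = []
    for line in journal_text.splitlines():
        match = CRIT_LINE.match(line.strip())
        if match:
            found.append((match["host"], match["component"], match["message"]))
    return found


# Two lines copied from the end of the server 3 log above.
sample = """\
May 13 04:11:04 proxmox-okp03 systemd[1]: Starting The Proxmox VE cluster filesystem...
May 13 04:11:04 proxmox-okp03 pmxcfs[131270]: [main] crit: Unable to get local IP address
"""
for host, component, message in pmxcfs_crit_messages(sample):
    print(f"{host} [{component}] {message}")
```

The same function could be fed the text saved from journalctl -u pve-cluster on each node.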
 
And does your hostname -f match the entry in /etc/hosts? Does hostname --ip-address give you the same IP as in the hosts file?
 
Hmm, this is weird... Only server 1 gives me a hostname and an IP address. Servers 2 and 3 don't seem to know the command?

Server 1
root@proxmox1:~# hostname -f
proxmox1.dco-be.it2go.eu
root@proxmox1:~# hostname --ip-address
185.18.149.201

Server 2
root@proxmox-okp02:~# hostname -f
hostname: Name or service not known
root@proxmox-okp02:~# hostname --ip-address
hostname: Name or service not known

Server 3
root@proxmox-okp03:~# hostname -f
hostname: Name or service not known
root@proxmox-okp03:~# hostname --ip-address
hostname: Name or service not kno
 
It knows the command, but it can't resolve the hostname. Check /etc/hostname; this might be the problem :).
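
"Name or service not known" is a name-resolution failure rather than a missing command: the machine's own hostname cannot be mapped to an address. The pmxcfs message "Unable to get local IP address" in the earlier log is consistent with the same failed lookup, although that link is an assumption here. A minimal sketch of a comparable check using Python's standard socket module; the function name and the injectable resolver parameter are hypothetical conveniences for illustration:

```python
import socket


def resolve_own_hostname(hostname=None, resolver=socket.gethostbyname):
    """Try to resolve this machine's hostname to an IPv4 address.

    Returns (hostname, ip) on success and (hostname, None) if the lookup fails.
    """
    name = hostname if hostname is not None else socket.gethostname()
    try:
        return name, resolver(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return name, None
```

Called with no arguments on a node in the state shown above, this would be expected to return None for the address, because the name in /etc/hostname has no matching entry to resolve against.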
 
When I installed the 3 servers, I just gave them the names proxmox1, proxmox2, proxmox3. But after a few days I thought it would be better to give them other names that include the abbreviation of the place where they are located. So I changed them (well... I tried to change them) to proxmox-okp1, proxmox-okp2, proxmox-okp3. OKP is the abbreviation of Oostkamp, a small city in Belgium (Europe) where the datacenter is located.

Server 1
Code:
proxmox1
#proxmox-okp01

Server 2
Code:
proxmox-okp02

Server 3
Code:
proxmox-okp03
 
Ah, there is the issue: your /etc/hosts has different hostname entries from your /etc/hostname. If you change the hostname file on server 2 to proxmox2, it should work (it might need a restart), and the same goes for server 3 :).
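
The mismatch described here can be checked mechanically: the name in /etc/hostname has to appear as a hostname or alias in /etc/hosts. A minimal sketch, assuming the simple file layouts posted earlier in this thread; the function names are hypothetical and this is not a Proxmox tool:

```python
def names_in_hosts(hosts_text: str) -> set[str]:
    """Collect every hostname and alias from /etc/hosts-style text."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) >= 2:
            names.update(fields[1:])  # fields[0] is the IP address
    return names


def hostname_is_listed(hostname_text: str, hosts_text: str) -> bool:
    """True if the first non-comment line of /etc/hostname appears in /etc/hosts."""
    for line in hostname_text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            return name in names_in_hosts(hosts_text)
    return False


# File contents as posted for Server 2 earlier in the thread.
server2_hosts = """\
127.0.0.1 localhost.localdomain localhost
185.18.149.202 proxmox2.dco-be.it2go.eu proxmox2
"""
print(hostname_is_listed("proxmox-okp02\n", server2_hosts))  # False: the broken state
print(hostname_is_listed("proxmox2\n", server2_hosts))       # True: after the fix
```

On a live node the two arguments would be the contents of /etc/hostname and /etc/hosts.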
 
That was, indeed, the solution! The whole cluster is up and running again!
What would be the correct way to change the server name from proxmox2 to proxmox-okp2?
 
I'll give it a try in a few days and let you know the results.
Thanks for your time and effort to help me out with this issue!

Much appreciated!
 