Can't access Proxmox

WebHostingNeeds

I have a Proxmox 5 installation running a few VMs. I created a cluster on it with the command:

Code:
pvecm create nerdcluster


I also changed the hostname on this server with hostnamectl.

Once I rebooted, Proxmox was no longer accessible.

systemctl status pveproxy shows it as running, but it logs a key-related error:

Code:
root@can01-hyp001:~# systemctl status pveproxy
● pveproxy.service - PVE API Proxy Server
   Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-06-24 08:42:32 UTC; 23min ago
  Process: 2764 ExecStop=/usr/bin/pveproxy stop (code=exited, status=0/SUCCESS)
  Process: 2771 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
 Main PID: 2789 (pveproxy)
    Tasks: 4 (limit: 4915)
   Memory: 115.0M
      CPU: 18.677s
   CGroup: /system.slice/pveproxy.service
           ├─2789 pveproxy
           ├─8448 pveproxy worker
           ├─8449 pveproxy worker
           └─8450 pveproxy worker

Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[2789]: starting 2 worker(s)
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[2789]: worker 8448 started
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[2789]: worker 8449 started
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[8436]: worker exit
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[8448]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1643.
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[8449]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1643.
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[2789]: worker 8436 finished
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[2789]: starting 1 worker(s)
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[2789]: worker 8450 started
Jun 24 09:06:13 can01-hyp001.nerdtel.com pveproxy[8450]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1643.
root@can01-hyp001:~#

The key file (/etc/pve/local/pve-ssl.key) is not present on the server.

None of the VMs are working either.

Code:
root@can01-hyp001:~# qm list
root@can01-hyp001:~#


When I try to recreate the key, I get the following error.

Code:
root@can01-hyp001:~# pvecm updatecerts -f
no quorum - unable to update files
root@can01-hyp001:~#


Here are some pvecm results. I'm not sure whether creating the cluster messed up the server.

Code:
root@can01-hyp001:~# pvecm status
Quorum information
------------------
Date:             Sun Jun 24 09:05:20 2018
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1/8
Quorate:          No

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           2 Activity blocked
Flags:           

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.95.29.176 (local)
root@can01-hyp001:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 192.95.29.176 (local)
root@can01-hyp001:~#

Any idea how to fix this?
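
For anyone reading the pvecm status output above: votequorum needs a strict majority of the expected votes. Here is a minimal sketch of that arithmetic, assuming the usual majority rule of floor(expected / 2) + 1. The function name quorum_needed is made up for illustration and is not a Proxmox command.

```shell
# Hypothetical helper: votes needed for quorum under a simple majority rule.
quorum_needed() {
  echo $(( $1 / 2 + 1 ))
}

expected=2   # "Expected votes" from the status output
total=1      # "Total votes" from the status output
needed=$(quorum_needed "$expected")

if [ "$total" -ge "$needed" ]; then
  echo "quorate"
else
  echo "not quorate: have $total vote(s), need $needed"
fi
```

With two expected votes and only one present, the node cannot reach quorum, which matches the "Quorum: 2 Activity blocked" line. On a node that really is alone, "pvecm expected 1" is the command usually suggested to lower the expected vote count; check it against the Proxmox documentation before relying on it, since nobody in this thread confirms it.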
 
Hi,

You can't rename a node with "hostnamectl" alone.

You also have to change the name in:

/etc/hosts
/etc/hostname (this is the only file changed by hostnamectl)
/etc/pve/corosync.conf (You can't write to this file at the moment.)
/etc/corosync/corosync.conf (You must edit it on both cluster member nodes)
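
A quick way to see which of the files listed above still carry the old name is a plain grep -l. The sketch below runs on throwaway sample files so it works anywhere. The helper name files_mentioning and the sample hosts line are assumptions for illustration; ns560483 is the old short name that shows up later in the thread.

```shell
# Hypothetical helper: list which of the given files still mention a name.
files_mentioning() {
  name=$1
  shift
  grep -l -- "$name" "$@" 2>/dev/null
}

# Demo on throwaway copies so the sketch runs anywhere (sample content is made up).
tmp=$(mktemp -d)
printf '192.95.29.176 ns560483.ip-192-95-29.net ns560483\n' > "$tmp/hosts"
printf 'can01-hyp001.nerdtel.com\n' > "$tmp/hostname"

# Prints only the path of the sample hosts file, which still has the old name.
files_mentioning ns560483 "$tmp/hosts" "$tmp/hostname"
```

On the real server the same call would take /etc/hosts, /etc/hostname and /etc/corosync/corosync.conf as arguments.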
 
I made the changes and rebooted the server. Still no VM is running.

Here is the current config. ns560483.ip-192-95-29.net is the old hostname; the new one is can01-hyp001.nerdtel.com.

Code:
root@can01-hyp001:~# cat /etc/hostname
can01-hyp001.nerdtel.com
root@can01-hyp001:~# cat /etc/pve/corosync.conf 
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: help
    nodeid: 2
    quorum_votes: 1
    ring0_addr: help
  }
  node {
    name: ns560483
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.95.29.176
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: nerdcluster
  config_version: 2
  interface {
    bindnetaddr: 192.95.29.176
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}

root@can01-hyp001:~# cat /etc/corosync/corosync.conf 
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: help
    nodeid: 2
    quorum_votes: 1
    ring0_addr: help
  }
  node {
    name: ns560483
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.95.29.176
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: nerdcluster
  config_version: 2
  interface {
    bindnetaddr: 192.95.29.176
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}

root@can01-hyp001:~#

I was not able to add any nodes. I just created a cluster, then tried to get help. That caused a node with the name "help" to get added.

Code:
pvecm addnode help

This node named "help" was shown in red in the web interface. I tried to delete it, but that failed with an error related to quorum.

Code:
pvecm delnode help
 
/etc/corosync/corosync.conf (You must edit it on both cluster member nodes)

I only have 1 node

Code:
root@can01-hyp001:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 192.95.29.176 (local)
root@can01-hyp001:~#

That is the machine I had Proxmox on, with a few VMs running. All I did was enable the cluster; I did not add any real nodes.
 
You have to fix your corosync.conf in /etc/corosync and then restart corosync.
Once this is done, you need to copy corosync.conf from /etc/corosync/ to /etc/pve/.
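
One detail that matters when editing corosync.conf by hand: as far as I know, the Proxmox documentation recommends incrementing config_version in the totem section on every change. The sketch below does that with awk on a sample fragment. The helper name bump_config_version is hypothetical, and the sample only reuses values shown in this thread.

```shell
# Hypothetical helper: print the config from stdin with config_version + 1.
bump_config_version() {
  awk '$1 == "config_version:" { sub(/[0-9]+/, $2 + 1) } { print }'
}

# Sample totem fragment using the values shown in this thread.
printf 'totem {\n  cluster_name: nerdcluster\n  config_version: 2\n}\n' | bump_config_version
```

It only prints the modified text; writing the result back to /etc/corosync/corosync.conf is left as a manual step so the file can be reviewed first.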
 
You have to fix your corosync.conf in /etc/corosync and then restart corosync.

I re-edited /etc/corosync/corosync.conf and restarted corosync.

Code:
root@can01-hyp001:/etc/corosync# vi corosync.conf
root@can01-hyp001:/etc/corosync# cat corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: help
    nodeid: 2
    quorum_votes: 1
    ring0_addr: help
  }
  node {
    name: can01-hyp001
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.95.29.176
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: nerdcluster
  config_version: 2
  interface {
    bindnetaddr: 192.95.29.176
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}

root@can01-hyp001:/etc/corosync# service corosync restart
root@can01-hyp001:/etc/corosync#


Once this is done, you need to copy corosync.conf from /etc/corosync/ to /etc/pve/.

When I copy /etc/corosync/corosync.conf to /etc/pve/, I get an error:

Code:
root@can01-hyp001:/etc/corosync# cp /etc/corosync/corosync.conf  /etc/pve/corosync.conf
cp: cannot create regular file '/etc/pve/corosync.conf': Permission denied
root@can01-hyp001:/etc/corosync#

How do I do it?

EDIT: I did some searching and found out that the file comes from the database /var/lib/pve-cluster/config.db. Should I edit this file, or is there any other way to fix this problem?

EDIT2: I edited /var/lib/pve-cluster/config.db, and now my /etc/pve/corosync.conf shows the proper hostname.

Code:
root@can01-hyp001:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: help
    nodeid: 2
    quorum_votes: 1
    ring0_addr: help
  }
  node {
    name: can01-hyp001
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.95.29.176
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: nerdcluster
  config_version: 2
  interface {
    bindnetaddr: 192.95.29.176
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}

root@can01-hyp001:~#

Proxmox is still not coming up.

https://ns560483.ip-192-95-29.net:8006/

What should I check next? I don't really need the cluster; would removing it help?
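
On the question of removing the cluster: the Proxmox documentation describes a procedure for separating a node without reinstalling. The sketch below lists those steps as I recall them; treat the exact sequence as an assumption to verify against the official docs, and back up /etc/pve and /var/lib/pve-cluster first. The run and separate_node names are hypothetical wrappers, and DRY_RUN defaults to 1 so the function only prints what it would do.

```shell
# Hypothetical wrapper: with DRY_RUN=1 (the default) only print the command.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Assumed steps for turning a clustered node back into a standalone one.
separate_node() {
  run systemctl stop pve-cluster
  run systemctl stop corosync
  run pmxcfs -l                        # start the cluster filesystem in local mode
  run rm /etc/pve/corosync.conf        # drop the cluster-wide copy of the config
  run sh -c 'rm -f /etc/corosync/*'    # drop the local corosync config and key
  run killall pmxcfs
  run systemctl start pve-cluster
}

separate_node
```

Nothing is changed unless the function is called with DRY_RUN=0, and that should only happen after each printed line has been checked against the documentation for the installed Proxmox version.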
 
  node {
    name: help
    nodeid: 2
    quorum_votes: 1
    ring0_addr: help
  }
As you said, you have only one node in your cluster, and you can't mix IPs and hostnames as ringX_addr.
Remove the second node.
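
Removing the stale entry means deleting its whole node { ... } block from the nodelist. Here is a minimal sketch of that edit done with awk on a copy of the nodelist from this thread. The helper name drop_node is hypothetical, and it assumes each block opens with a "node {" line and closes with a "}" on its own line, as in the configs posted here.

```shell
# Hypothetical helper: print the config from stdin without the node block
# whose "name:" equals the first argument.
drop_node() {
  awk -v target="$1" '
    $1 == "node" && $2 == "{" { buf = $0; innode = 1; drop = 0; next }
    innode {
      buf = buf "\n" $0
      if ($1 == "name:" && $2 == target) drop = 1
      if ($1 == "}") { if (!drop) print buf; innode = 0 }
      next
    }
    { print }
  '
}

# Sample nodelist copied from the thread; only the can01-hyp001 block remains.
cat <<'EOF' | drop_node help
nodelist {
  node {
    name: help
    nodeid: 2
    quorum_votes: 1
    ring0_addr: help
  }
  node {
    name: can01-hyp001
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.95.29.176
  }
}
EOF
```

The helper only prints the result, so the output can be compared with the original before anything in /etc/corosync is overwritten.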
 
As you said, you have only one node in your cluster, and you can't mix IPs and hostnames as ringX_addr.
Remove the second node.

When I try to delete it, I get an error:

Code:
root@can01-hyp001:~# pvecm delnode help
cluster not ready - no quorum?
root@can01-hyp001:~#

Can I edit the SQLite database and remove the following entry from /etc/pve/corosync.conf?

Code:
  node {
   name: help
   nodeid: 2
   quorum_votes: 1
   ring0_addr: help
  }
 
You should not edit the SQLite database.
Just edit /etc/corosync/corosync.conf.
You can write to /etc/pve/* once corosync works.
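
To tell whether corosync "works" in this sense, the Quorate line of pvecm status is the thing to watch. The sketch below pulls that field out of the status text. The helper name quorate_value is hypothetical, and the sample lines are copied from the status output earlier in the thread.

```shell
# Hypothetical helper: read `pvecm status` text on stdin, print the Quorate value.
quorate_value() {
  awk '$1 == "Quorate:" { print $2; exit }'
}

# Sample lines copied from the status output earlier in the thread.
status='Nodes:            1
Quorate:          No'

if [ "$(printf '%s\n' "$status" | quorate_value)" = "Yes" ]; then
  echo "quorate: /etc/pve should accept writes"
else
  echo "not quorate: /etc/pve stays read-only"
fi
```

On the server itself the input would come from "pvecm status | quorate_value" instead of the sample variable.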
 
Thanks for the reply. I edited the config.db file and got it to look like the following:

https://gitlab.com/snippets/1728225/raw

But I am still not able to log in to Proxmox (https://192.95.29.176:8006/).

Should I revert the changes and edit /etc/corosync/corosync.conf?

You can write to /etc/pve/* once corosync works.

I can't write to this file; the documentation says it is a read-only file system. Are you saying I can edit it directly, or will editing /etc/corosync/corosync.conf update it too?


EDIT: I can edit /etc/pve/corosync.conf now. But the Proxmox web interface still does not load, and "qm list" does not show any VMs.
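
One thing that may be worth checking at this point: as far as I know, Proxmox keeps per-node files (the VM configs under qemu-server/ and pve-ssl.key) in /etc/pve/nodes/<short hostname>/, and /etc/pve/local points at the directory for the current hostname. After a rename, the old directory would still hold the VM configs, which would explain both the empty "qm list" and the missing key. The sketch below shows how the two directory names differ; the helper name node_dir is made up, and the short-name rule is an assumption.

```shell
# Hypothetical helper: per-node directory for a given host name,
# assuming Proxmox uses the short name (everything before the first dot).
node_dir() {
  echo "/etc/pve/nodes/${1%%.*}"
}

node_dir ns560483.ip-192-95-29.net   # old hostname from this thread
node_dir can01-hyp001.nerdtel.com    # new hostname from this thread
```

If the old directory's qemu-server/ subdirectory still lists the VM .conf files, moving them into the new node's qemu-server directory is the commonly suggested fix. Nobody in this thread confirms that, so verify it before trying.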
 