Deleted Cluster and now I can't access web UI

Deleted member 60080 · Jan 11, 2019

Hello,

I'm kind of a noob in the Proxmox shell. I just got an R710 to replace the 2950 that I have had for a very long time. I installed Proxmox on the new server and created a cluster on the 2950. I was having issues connecting the two, so I decided to scrap the idea and thought it would be best to just create vzdump backup files on an external hard drive. I followed this tutorial (1st reply) to delete the cluster on the 2950 (I never connected the R710 to the 2950), but when I restarted the 2950, I could no longer access the web UI. I went to the shell and typed in pvecm status, but I get the error:

Cannot initialize CMAP service

When I try and start any vm using qm start, I get this error:

cluster not ready - no quorum?

When I try pvecm e 1 and then pvecm status again, I get the same cannot initialize CMAP service error.

What do you think I should do? I don't need to restore the server, I just need all of my VMs so I can put them on the new server.

t.lamprecht · Jan 11, 2019

Next time, please follow our official documentation. The mentioned thread is a bit older (possibly outdated, e.g. it mentions cman and sysvinit, which are not valid in current Proxmox VE 5.x), and some of its suggestions are a bit odd. https://pve.proxmox.com/pve-docs/chapter-pvecm.html#pvecm_separate_node_without_reinstall

tuckerritti said:
When I try and start any vm using qm start, I get this error:

cluster not ready - no quorum?

When I try pvecm e 1 and then pvecm status again, I get the same cannot initialize CMAP service error.

Those two commands are expected to behave like this if there is no cluster configured, as they cannot connect to the cluster communication stack (corosync).

First ensure all required services are up and running:

Code:

systemctl restart pve-cluster pveproxy pvedaemon

If that does not work, or throws an error, I need a bit more info from you. E.g., as root run:

Code:

systemctl list-units --failed --plain -l
ls -l /etc/corosync/ /etc/pve/corosync.conf
systemctl status pveproxy pve-cluster

and post the output here, preferably in [code] output [/code] tags.

Deleted member 60080 · Jan 11, 2019

When I run systemctl list-units --failed --plain -l, I see that corosync, influxdb, and pvesr have all failed

Code:

  UNIT             LOAD   ACTIVE SUB    DESCRIPTION
  corosync.service loaded failed failed Corosync Cluster Engine
  influxdb.service loaded failed failed InfluxDB is an open-source, distributed,
  pvesr.service    loaded failed failed Proxmox VE replication runner

t.lamprecht · Jan 11, 2019

OK, that should not matter much for your specific case. Corosync no longer has a config, so it failed, but it's only needed in a cluster, which you got rid of.
Did you try the service restarts too? And could you please post the rest of the requested outputs?

Deleted member 60080 · Jan 11, 2019

t.lamprecht said:
Did you try the service restarts too? And could you please post the rest of the requested outputs?

There is no output from the restart command.

Code:

ls -l /etc/corosync/ /etc/pve/corosync.conf
ls: cannot access '/etc/corosync/': No such file or directory
-r--r----- 1 root www-data 362 Jan  9 16:47 /etc/pve/corosync.conf

Code:

systemctl status pveproxy pve-cluster
● pveproxy.service - PVE API Proxy Server
   Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset:
   Active: active (running) since Fri 2019-01-11 08:20:34 PST; 47min ago
  Process: 23323 ExecStop=/usr/bin/pveproxy stop (code=exited, status=0/SUCCESS)
  Process: 23165 ExecReload=/usr/bin/pveproxy restart (code=exited, status=0/SUC
  Process: 23417 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCES
 Main PID: 23433 (pveproxy)
    Tasks: 4 (limit: 4915)
   Memory: 113.6M
      CPU: 1min 41.919s
   CGroup: /system.slice/pveproxy.service
           ├─23433 pveproxy
           ├─28455 pveproxy worker
           ├─28456 pveproxy worker
           └─28457 pveproxy worker

Jan 11 09:07:57 proxmox pveproxy[23433]: worker 28448 finished
Jan 11 09:07:57 proxmox pveproxy[23433]: starting 1 worker(s)
Jan 11 09:07:57 proxmox pveproxy[23433]: worker 28456 started
Jan 11 09:07:57 proxmox pveproxy[28455]: /etc/pve/local/pve-ssl.key: failed to l
Jan 11 09:07:57 proxmox pveproxy[28456]: /etc/pve/local/pve-ssl.key: failed to l
Jan 11 09:07:57 proxmox pveproxy[28449]: worker exit
Jan 11 09:07:57 proxmox pveproxy[23433]: worker 28449 finished

t.lamprecht · Jan 11, 2019

tuckerritti said:
Jan 11 09:07:57 proxmox pveproxy[28455]: /etc/pve/local/pve-ssl.key: failed to l

And there's your issue: the proxy cannot load its SSL key.

Can you try:

Code:

pvecm updatecerts
systemctl restart pveproxy

Deleted member 60080 · Jan 11, 2019

Now I get

Code:

no quorum - unable to update files

t.lamprecht · Jan 11, 2019

Oh well, pmxcfs is in limbo from the cluster separation attempt. /etc/pve/corosync.conf is still present, so the configuration file system thinks it's still clustered.
To fix that, do:

Code:

systemctl stop pve-cluster
# start in local mode
pmxcfs -l
rm /etc/pve/corosync.conf
killall pmxcfs
# start again in normal (non-local) mode
systemctl start pve-cluster

then repeat the commands from my previous post.
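
For reference, the sequence above can be wrapped in a small shell sketch with a dry-run switch. This is only an illustration under assumptions: the `run` and `separate_node` helpers and the `DRY_RUN` variable are hypothetical names, not part of Proxmox VE. The last step starts pve-cluster again so that pmxcfs comes back up in normal mode, as in the official documentation linked earlier in this thread.

```shell
#!/bin/sh
# Sketch only: wraps the manual steps from this post.
# run, separate_node and DRY_RUN are hypothetical names, not PVE features.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove the leftover cluster config from a node that should be standalone.
separate_node() {
    run systemctl stop pve-cluster || return 1
    run pmxcfs -l || return 1                    # config fs in local mode
    run rm -f /etc/pve/corosync.conf || return 1 # -f: no error if already gone
    run killall pmxcfs || return 1
    run systemctl start pve-cluster              # back to normal mode
}
```

Calling `DRY_RUN=1 separate_node` only prints the five commands; calling `separate_node` as root on the node executes them and stops at the first failure.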

Deleted member 60080 · Jan 11, 2019

Now only corosync and influxdb are failing

Code:

  UNIT             LOAD   ACTIVE SUB    DESCRIPTION
  corosync.service loaded failed failed Corosync Cluster Engine
  influxdb.service loaded failed failed InfluxDB is an open-source, distributed, time series database

Deleted member 60080 · Jan 14, 2019

Corosync fails because /etc/corosync/corosync.conf does not exist

Code:

● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset:
   Active: failed (Result: exit-code) since Wed 2019-01-09 16:51:35 PST; 4 days
Condition: start condition failed at Fri 2019-01-11 12:13:34 PST; 2 days ago
           └─ ConditionPathExists=/etc/corosync/corosync.conf was not met

Influxdb gives no information

Code:

● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2019-01-09 16:51:33 PST; 4 days ago
     Docs: man:influxd(1)
  Process: 1866 ExecStart=/usr/bin/influxd -config /etc/influxdb/influxdb.conf $INFLUXD_OPTS (code=exited, status=1/FAILURE)
 Main PID: 1866 (code=exited, status=1/FAILURE)
      CPU: 14ms

Also, I don't need to restore this server. I am just wondering if there is any way to get the VMs off of it and on to the new server.

t.lamprecht · Jan 14, 2019

tuckerritti said:
Now only corosync and influxdb are failing

As I already said above, corosync is expected to fail, and influxdb isn't important for us right now (it is not a direct part of PVE).
As you do not say what exactly you did and do not post all of the requested output, this is a bit hard to address, I'm afraid.

I assume you fixed pve-cluster as I said above and it is writable again, so did you rerun the updatecerts and pveproxy commands?

t.lamprecht said:
can you try:

Code:

pvecm updatecerts
systemctl restart pveproxy

Deleted member 60080 · Jan 14, 2019

Code:

root@proxmox:~# pvecm updatecerts
(re)generate node files
merge authorized SSH keys and known hosts

The systemctl restart does not give any output.

I also noticed that local-lvm does not exist. When I try to start a VM now, I get:

Code:

root@proxmox:~# qm start 100
storage 'local-lvm' does not exists

karyuu · Dec 22, 2020

t.lamprecht said:
Oh well, pmxcfs is in limbo from the cluster separation attempt. /etc/pve/corosync.conf is still present, so the configuration file system thinks it's still clustered.
To fix that, do:

Code:

systemctl stop pve-cluster
# start in local mode
pmxcfs -l
rm /etc/pve/corosync.conf
killall pmxcfs
# start again in normal (non-local) mode
systemctl start pve-cluster

then repeat the commands from my previous post.

Same problem here. I have two nodes. I fixed the problem on node 2, but when I run the same commands on node 1, it shows:

Code:

pmxcfs -l
fuse: mountpoint is not empty
fuse: if you are sure this is safe, use the 'nonempty' mount option
[main] crit: fuse_mount error: File exists
[main] notice: exit proxmox configuration filesystem (-1)

I tried

Code:

 systemctl list-units --failed --plain -l
  UNIT                LOAD   ACTIVE SUB    DESCRIPTION
  corosync.service    loaded failed failed Corosync Cluster Engine
  pve-cluster.service loaded failed failed The Proxmox VE cluster filesystem
  pvesr.service       loaded failed failed Proxmox VE replication runner

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

3 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

ls -l /etc/corosync/ /etc/pve/corosync.conf
ls: cannot access '/etc/pve/corosync.conf': No such file or directory
/etc/corosync/:
total 12
-r-------- 1 root root  256 Dec 22 09:48 authkey
-rw-r--r-- 1 root root  531 Dec 22 10:37 corosync.conf
drwxr-xr-x 2 root root 4096 Jun 25 11:02 uidgid.d


systemctl status pveproxy pve-cluster
● pveproxy.service - PVE API Proxy Server
   Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2020-12-22 13:19:58 EST; 3h 0min ago
  Process: 12507 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=111)
  Process: 12508 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
 Main PID: 12509 (pveproxy)
    Tasks: 4 (limit: 7372)
   Memory: 142.0M
   CGroup: /system.slice/pveproxy.service
           ├─12509 pveproxy
           ├─25780 pveproxy worker
           ├─25781 pveproxy worker
           └─25784 pveproxy worker

Dec 22 16:20:45 no1 pveproxy[25780]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm
Dec 22 16:20:46 no1 pveproxy[25778]: worker exit
Dec 22 16:20:46 no1 pveproxy[25779]: worker exit
Dec 22 16:20:46 no1 pveproxy[12509]: worker 25779 finished
Dec 22 16:20:46 no1 pveproxy[12509]: worker 25778 finished
Dec 22 16:20:46 no1 pveproxy[12509]: starting 2 worker(s)
Dec 22 16:20:46 no1 pveproxy[12509]: worker 25781 started
Dec 22 16:20:46 no1 pveproxy[12509]: worker 25784 started
Dec 22 16:20:46 no1 pveproxy[25781]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm
Dec 22 16:20:46 no1 pveproxy[25784]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm

● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2020-12-22 13:19:55 EST; 3h 0min ago
  Process: 12494 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)

Dec 22 13:19:55 no1 systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Dec 22 13:19:55 no1 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Dec 22 13:19:55 no1 systemd[1]: Stopped The Proxmox VE cluster filesystem.
Dec 22 13:19:55 no1 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Dec 22 13:19:55 no1 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Dec 22 13:19:55 no1 systemd[1]: Failed to start The Proxmox VE cluster filesystem.

How can I fix this problem?

Thanks

aaron · Dec 23, 2020

karyuu said:
pmxcfs -l fuse: mountpoint is not empty

Looks like the /etc/pve directory is not empty.

Make sure that the pmxcfs process is not running, e.g. with ps aux | grep pmxcfs. The output should contain only the line for the grep command itself, and nothing with /usr/bin/pmxcfs.

If it is not running, check what's inside the /etc/pve directory and (re)move it if you know you don't need it.
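
A minimal sketch of that check, assuming a POSIX shell with `pgrep` (procps) available. The helper names `pmxcfs_running`, `dir_is_empty` and `move_leftovers`, as well as the backup path in the usage note, are hypothetical and not part of Proxmox VE.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a mountpoint such as /etc/pve.

# True if a process named exactly "pmxcfs" is running.
pmxcfs_running() {
    pgrep -x pmxcfs >/dev/null 2>&1
}

# True if the directory exists and has no entries (dotfiles included).
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Move every entry (dotfiles included) from directory $1 into directory $2.
move_leftovers() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for entry in "$src"/* "$src"/.[!.]* "$src"/..?*; do
        # Skip glob patterns that matched nothing.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        mv -- "$entry" "$dest"/ || return 1
    done
}
```

On the node, as root, something like `pmxcfs_running || dir_is_empty /etc/pve || move_leftovers /etc/pve /root/pve-leftovers` would move stale files aside only when pmxcfs is not running and the directory is not empty. Moving the files instead of deleting them keeps a copy in case something in there is still needed.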

karyuu · Dec 23, 2020

aaron said:
Looks like the /etc/pve directory is not empty.

Make sure that the pmxcfs process is not running, e.g. with ps aux | grep pmxcfs. The output should contain only the line for the grep command itself, and nothing with /usr/bin/pmxcfs.

If it is not running, check what's inside the /etc/pve directory and (re)move it if you know you don't need it.

Thanks, it works.

But then I tried:

Code:

pvecm updatecerts
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused

The connection is refused.

When I access the web UI, I can still control all the VMs (they are still running), but each VM icon shows a question mark.

Thanks

aaron · Dec 24, 2020

Check if the necessary services are running. You can do so via the GUI: <Node> -> System. On a single node it is okay if the corosync service is not running. All others should be running.

I assume that the pvestatd service is either not running or having some other problem which should be visible in the logs.
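
The same check can be done from the shell. A minimal sketch, assuming the services that matter on a standalone node are pve-cluster, pvedaemon, pveproxy and pvestatd; that list, the `inactive_services` helper and the `SYSTEMCTL` override are assumptions for illustration, not PVE features.

```shell
#!/bin/sh
# Sketch: list PVE services that are not active on a standalone node.
# SYSTEMCTL can be overridden, e.g. for testing without systemd.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

inactive_services() {
    for svc in pve-cluster pvedaemon pveproxy pvestatd; do
        # is-active --quiet returns non-zero if the unit is not active.
        "$SYSTEMCTL" is-active --quiet "$svc" || echo "$svc"
    done
}
```

For each name it prints, `journalctl -u <service> -b --no-pager | tail -n 20` shows the most recent log lines, and `systemctl restart <service>` tries to bring the service back.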

karyuu · Dec 24, 2020

aaron said:
Check if the necessary services are running. You can do so via the GUI: <Node> -> System. On a single node it is okay if the corosync service is not running. All others should be running.

I assume that the pvestatd service is either not running or having some other problem which should be visible in the logs.

I restarted the pvestatd service, and now all the VMs are back.

Thanks
