Removing/Deleting a created cluster

an0n

New Member
Jul 4, 2014
Hi everyone.

I've just installed two Proxmox 3.2 machines and wanted to set up a new cluster with both of them.

Needless to say, this is the first time I'm "playing" with Proxmox, so after messing around a little with both machines I ended up with two different clusters (one created on each machine).

Now I would like to completely ERASE/DELETE/REMOVE at least one of these clusters so I can just add the freed node to the remaining cluster. The pvecm man page shows no delete option, and after a couple of days of intensive searching, it seems the only viable option is to manually remove a set of packages from the system, manually remove specific folders and, on top of that, maybe change the hostname :S

I think there must be a way to do this properly.

It might be possible to add node2 to cluster1 even though cluster2 still exists on node2, but I would really like to remove that second, unused cluster to avoid possible future errors caused by the messy config.


Hope someone can help me!
Thank you all!


PS: If anyone needs me to clarify anything about the actual setup, just let me know :)
 

acidrop

Member
Jul 17, 2012
Use this example and adapt it to your cluster accordingly:

Node1=proxmox1

Node2=proxmox2 (failed node)

Node1 ip=172.21.3.8

Node2 ip=172.21.3.9

On proxmox1:

pvecm delnode proxmox2

On proxmox2:

cp -a /etc/pve /root/pve_backup   # create a backup first

Stop the cluster service:

/etc/init.d/pve-cluster stop

umount /etc/pve

/etc/init.d/cman stop

rm /etc/cluster/cluster.conf

rm -rf /var/lib/pve-cluster/*

/etc/init.d/pve-cluster start

pvecm add proxmox1   # re-add node2 to the cluster again if needed
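A small addition to the backup step above (my own sketch, not from the original post): after copying, verify the backup actually matches before you start deleting things. SRC and DST are variables I introduced so the snippet can be pointed at any paths; on a real node they would be /etc/pve and /root/pve_backup as in the post.

```shell
SRC=${SRC:-/etc/pve}            # source dir, as in the post above
DST=${DST:-/root/pve_backup}    # backup location, as in the post above

cp -a "$SRC" "$DST"                            # same backup step as above
diff -r "$SRC" "$DST" && echo "backup verified"
```

If diff prints any differences (or fails), stop and investigate before removing anything.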
 

hpcraith

Member
Mar 8, 2013
I think I read this documentation. I created a cluster:
root@vwsrv4:~# pvecm create VW-Cluster
root@vwsrv4:/etc/pve# pvecm status
Quorum information
------------------
Date: Fri Jul 8 17:37:23 2016
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 160
Quorate: Yes

Votequorum information
----------------------
Expected votes: 1
Highest expected: 1
Total votes: 1
Quorum: 1
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 141.58.4.252 (local)
root@vwsrv4:/etc/pve#

Question: Why doesn't the status show the cluster name? You can find it in /etc/corosync/corosync.conf or /etc/pve/corosync.conf.
The opposite of "create cluster" would be "delete cluster", but the documentation you mentioned only covers removing a cluster node.
My question was about deleting the complete cluster information and setting it up again.


 
Sorry, I misunderstood you. I can see there doesn't seem to be a way to revert to the configuration from before the cluster was created (even if it's only one node). This is what you'd get on a system without any cluster config:

Code:
# pvecm status
Corosync config '/etc/pve/corosync.conf' does not exist - is this node part of a cluster?
Cannot initialize CMAP service
I also see that pvecm status only reports the IP address, which can't be used to delete a node. pvecm nodes lists the names, but deleting by name doesn't seem to have any effect. Perhaps the Proxmox devs can elaborate on this.
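On the cluster-name question: as far as I can tell the name is only stored in the totem section of corosync.conf, so you can read it from there. A sketch against a sample file (the sample path and the awk invocation are mine; on a real node you would point awk at /etc/pve/corosync.conf instead):

```shell
# Sample totem section like the ones quoted in this thread.
cat > /tmp/corosync.conf.sample <<'EOF'
totem {
  version: 2
  cluster_name: VW-Cluster
  config_version: 1
}
EOF

# Print the value after "cluster_name: ".
awk -F': ' '/cluster_name/ { print $2 }' /tmp/corosync.conf.sample
# -> VW-Cluster
```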
 

hpcraith

Member
Mar 8, 2013
maybe this wiki article helps:
http://pve.proxmox.com/wiki/Proxmox_Cluster_file_system_(pmxcfs)

see the section about 'Remove Cluster configuration'
I did the following:

service pve-cluster stop
pmxcfs -l
ls -al
root@vwsrv1:/etc/pve# ls -al
total 8
drwxr-xr-x 2 root www-data 0 Jan 1 1970 .
drwxr-xr-x 95 root root 4096 Jul 11 13:33 ..
-rw-r----- 1 root www-data 451 Jul 1 16:24 authkey.pub
-r--r----- 1 root www-data 153 Jan 1 1970 .clusterlog
-rw-r----- 1 root www-data 448 Jul 11 13:52 corosync.conf
-rw-r----- 1 root www-data 13 Jul 1 16:12 datacenter.cfg
-rw-r----- 1 root www-data 2 Jan 1 1970 .debug
lrwxr-xr-x 1 root www-data 0 Jan 1 1970 local -> nodes/vwsrv1
lrwxr-xr-x 1 root www-data 0 Jan 1 1970 lxc -> nodes/vwsrv1/lxc
-r--r----- 1 root www-data 39 Jan 1 1970 .members
drwxr-xr-x 2 root www-data 0 Jul 1 16:24 nodes
lrwxr-xr-x 1 root www-data 0 Jan 1 1970 openvz -> nodes/vwsrv1/openvz
drwx------ 2 root www-data 0 Jul 1 16:24 priv
-rw-r----- 1 root www-data 2041 Jul 1 16:24 pve-root-ca.pem
-rw-r----- 1 root www-data 1671 Jul 1 16:24 pve-www.key
lrwxr-xr-x 1 root www-data 0 Jan 1 1970 qemu-server -> nodes/vwsrv1/qemu-server
-r--r----- 1 root www-data 280 Jan 1 1970 .rrd
-rw-r----- 1 root www-data 127 Jul 1 16:12 storage.cfg
-rw-r----- 1 root www-data 36 Jul 1 16:12 user.cfg
-r--r----- 1 root www-data 379 Jan 1 1970 .version
-r--r----- 1 root www-data 18 Jan 1 1970 .vmlist
-rw-r----- 1 root www-data 119 Jul 1 16:24 vzdump.cron



remove the cluster config
# rm /etc/pve/cluster.conf                  -> no such file
# rm /etc/cluster/cluster.conf              -> no such file
# rm /var/lib/pve-cluster/corosync.authkey  -> no such file; only the files below exist

ls /var/lib/pve-cluster
config.db config.db-shm config.db-wal

So this proposal is not helpful for me at all. As I already explained, we have four Proxmox servers, all on the same IP subnet: one on 4.1, two on 3.4, and the one mentioned here on 4.2. All have a community subscription. The 3.4 ones will be upgraded (fresh install) step by step. Unfortunately a lot has changed in the cluster software, but there should be a way to upgrade.

Rgds
 

hpcraith

Member
Mar 8, 2013
Further to the above, I found that what was cluster.conf in 3.x is now corosync.conf.
I also noticed

<clusternode name="vwsrv4" votes="1" nodeid="4"/>

in /etc/pve/cluster.conf on the 3.4 servers. It is said that 3.x and 4.x are no longer compatible, so how can entries from the 4.x node show up on the 3.x servers? It is a deadly mix if the 3.x cluster software exchanges information with the 4.x one; at the very least there should be a safeguard so that these versions do not talk to each other.
Rgds
 

xhimera

New Member
Dec 5, 2016
Here is my solution for version 4.3:
$ pveversion
pve-manager/4.3-1/e7cdc165 (running kernel: 4.4.19-1-pve)

# stop service
$ systemctl stop pvestatd.service
$ systemctl stop pvedaemon.service
$ systemctl stop pve-cluster.service

# edit through sqlite, check, delete, verify
$ sqlite3 /var/lib/pve-cluster/config.db
sqlite> select * from tree where name = 'corosync.conf';
254327|0|254329|0|1480944811|8|corosync.conf|totem {
version: 2
[...]
sqlite> delete from tree where name = 'corosync.conf';
sqlite> select * from tree where name = 'corosync.conf';
sqlite> .quit

# start service
$ systemctl start pve-cluster.service
$ systemctl start pvestatd.service
$ systemctl start pvedaemon.service
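If you'd rather not type into the interactive sqlite3 shell, the same check/delete/verify can be done non-interactively. This is my own variant of the steps above: the DB variable and the backup copy are additions of mine; the table and file names are the ones from the post.

```shell
DB=${DB:-/var/lib/pve-cluster/config.db}

cp "$DB" "$DB.bak"    # my addition: keep a copy before deleting rows
sqlite3 "$DB" "DELETE FROM tree WHERE name = 'corosync.conf';"
sqlite3 "$DB" "SELECT count(*) FROM tree WHERE name = 'corosync.conf';"   # should print 0
```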
 

hpcraith

Member
Mar 8, 2013
Thank you for your solution proposal. I did not believe there was an answer to my problem, which is still present: I have four Proxmox servers which are unable to talk to each other in one web interface, so I always have to open four web sessions.
As soon as I have time available I will test your solution and report back.
Rgds
Dieter
 
Gabriel-openprogrammer

Dec 11, 2016
xhimera, Thanks for your solution.

I was also playing with 3 Proxmox servers and made a mess of the cluster.
I followed xhimera's steps, with some additional ones as well.

I did this on the 3 servers:
pve-manager/4.3-14/3a8c61c7 (running kernel: 4.4.35-1-pve)

# stop service
systemctl stop pvestatd.service
systemctl stop pvedaemon.service
systemctl stop pve-cluster.service
systemctl stop corosync

# edit through sqlite, check, delete, verify
$ sqlite3 /var/lib/pve-cluster/config.db
sqlite> select * from tree where name = 'corosync.conf';
254327|0|254329|0|1480944811|8|corosync.conf|totem {
version: 2
[...]
sqlite> delete from tree where name = 'corosync.conf';
sqlite> select * from tree where name = 'corosync.conf';
sqlite> .quit
#

#Remove directories
pmxcfs -l
rm /etc/pve/corosync.conf
rm /etc/corosync/*
rm /var/lib/corosync/*
rm -rf /etc/pve/nodes/*

Don't forget to remove the keys of any nodes you don't want from /etc/pve/priv/authorized_keys.
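One way to do that authorized_keys cleanup without hand-editing (a sketch of mine, not from the post: KEYS and NODE are placeholder variables, and the filter assumes the removed node's name appears in the key's comment, so check the result by eye afterwards):

```shell
KEYS=${KEYS:-/etc/pve/priv/authorized_keys}
NODE=${NODE:-oldnode}                  # placeholder: name of the removed node

cp "$KEYS" "$KEYS.bak"                 # keep a copy before filtering
grep -v "$NODE" "$KEYS.bak" > "$KEYS"  # drop lines mentioning that node
```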

I rebooted the 3 servers and started again.
Followed: https://pve.proxmox.com/wiki/Proxmox_VE_4.x_Cluster

Everything seemed to work fine, and the cluster is back online.
NB: The cluster was empty, without any VMs.

Hope this will help someone.
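A quick sanity check to run after a reset like the above (my own sketch: CONF is a variable I added so the snippet can be pointed anywhere; on a real node it would be /etc/pve/corosync.conf). A standalone node should have no corosync.conf left:

```shell
CONF=${CONF:-/etc/pve/corosync.conf}

if [ -e "$CONF" ]; then
  echo "cluster config still present: $CONF"
else
  echo "node looks standalone"
fi
```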
 

csc

New Member
Oct 26, 2017
Error:
root@pve:/# pvecm status
Cannot initialize CMAP service
--------------------------------------------
file corosync.conf
nano /etc/pve/corosync.conf

totem {
version: 2
secauth: on
cluster_name: clustercsc
config_version: 1
ip_version: ipv4
interface {
ringnumber: 0
bindnetaddr: 192.168.1.16
}
}

nodelist {
node {
ring0_addr: pve
name: pve
nodeid: 1
quorum_votes: 1
}
}

quorum {
provider: corosync_votequorum
}

logging {
to_syslog: yes
debug: off
}

-------------------------------------------

root@pve:/# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
Active: failed (Result: signal) since Thu 2017-10-26 06:25:14 -03; 4h 31min ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 24919 (code=killed, signal=ABRT)

Oct 26 06:25:14 pve corosync[24919]: [TOTEM ] A new membership (192.168.1.16:60) was formed. Members joined: 1
Oct 26 06:25:14 pve corosync[24919]: [TOTEM ] JOIN or LEAVE message was thrown away during flush operation.
Oct 26 06:25:14 pve corosync[24919]: [QUORUM] Members[1]: 1
Oct 26 06:25:14 pve corosync[24919]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 26 06:25:14 pve corosync[24919]: notice [TOTEM ] A new membership (192.168.1.10:64) was formed. Members joined: 1
Oct 26 06:25:14 pve corosync[24919]: [TOTEM ] A new membership (192.168.1.10:64) was formed. Members joined: 1
Oct 26 06:25:14 pve corosync[24919]: corosync: cpg.c:867: downlist_master_choose: Assertion `best != NULL' failed.
Oct 26 06:25:14 pve systemd[1]: corosync.service: Main process exited, code=killed, status=6/ABRT
Oct 26 06:25:14 pve systemd[1]: corosync.service: Unit entered failed state.
Oct 26 06:25:14 pve systemd[1]: corosync.service: Failed with result 'signal'.
 


David Anderson

New Member
Jun 14, 2018
Gabriel-openprogrammer helped me! Great info: it saved my bacon when I started playing with a cluster without having any idea what I was doing. These steps worked great to remove the cluster config. They also removed all my VMs, but those were easily recoverable from backups.

Awesome - thanks for the post.
 

suresh788

New Member
May 1, 2019
Hi,

I have a 2-node cluster, and I had created user accounts on both nodes. Due to some issue, one node became degraded in the cluster, so I removed and re-joined it. After this, the users I created are no longer listed in the Users tab, but they are still listed in the CLI.
 
