All VMs lost after adding node to cluster

iruindegi
Hi,

We had a Proxmox server (proxmox2, 172.28.64.19) with some VMs on it, and we created a cluster on it with another machine, which is now disconnected and removed from the cluster.

After some days we bought a new server (proxmox1, 172.28.64.16). We created a new cluster on it and tried to add the other server with:

proxmox1# pvecm add 172.28.64.16

which failed, so we tried again with the -force flag, and that appeared to succeed.

But now, when we open the Proxmox GUI, none of our machines are there, and the cluster is not correctly configured... (I'm worried about my machines...)

Some outputs:

Code:
root@proxmox1:~# pvecm nodes

Membership information
----------------------
Nodeid Votes Name
1 1 proxmox1 (local)
root@proxmox1:~# service pve-cluster start
root@proxmox1:~# pvecm status
Quorum information
------------------
Date: Tue Jan 17 14:08:34 2017
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1/2736
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.28.64.16 (local)

-------------------------------------------------------------------------------------------------------------------

Code:
root@proxmox2:~# pvecm status
Quorum information
------------------
Date: Tue Jan 17 13:40:28 2017
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2/7108
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 172.28.64.19 (local)

-------------------------------------------------------------------------------------------------------------------

Code:
root@proxmox2:~# pvecm nodes

Membership information
----------------------
Nodeid Votes Name
2 1 proxmox2 (local)
root@proxmox2:~#
 
From your reply it seems you have 2 clusters on 2 different machines. After you created the new cluster on proxmox1, you ran the following command:

proxmox1# pvecm add 172.28.64.16

But you should run this command from proxmox2, as follows:
proxmox2# pvecm add 172.28.64.16

Proxmox2 needs to be added to the cluster on proxmox1. When you log in to the proxmox2 GUI, do you see your existing VMs?
 
Yes, I did launch it from proxmox2; I made a mistake when I wrote the first post.

The first time it failed, but we got a successful result with the -force flag.

There are no VMs in the proxmox2 GUI, nor in proxmox1's.
 
When you added the node, it copied all configuration from proxmox1. In effect, this removed all existing VMs from proxmox2.

Note: That is why this is not allowed by default (but you overrode that using the -force flag).
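
For future reference: before forcing a join, it is worth checking whether the node still holds local guest configs and copying them somewhere safe first. A rough sketch (standard pmxcfs paths; the backup destination is just an example):

Code:
# Any guest configs listed here are lost when the cluster DB is replaced
ls /etc/pve/qemu-server/   # QEMU VM configs, e.g. 100.conf
ls /etc/pve/lxc/           # container configs
# Copy them outside /etc/pve before running pvecm add with -force
cp -a /etc/pve/qemu-server /root/qemu-server.backup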
 
Is there no way to restore the VMs?

I guess config files are lost, but maybe there is a copy of the old database in /var/lib/pve-cluster/config.db.backup.
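
One way to check, assuming the automatic dumps are in the usual place under /var/lib/pve-cluster/backup/, is a read-only peek:

Code:
# List the automatic config dumps and check (heuristically) whether
# one still contains VM config entries
ls -la /var/lib/pve-cluster/backup/
zcat /var/lib/pve-cluster/backup/config-*.sql.gz | grep qemu-server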

VM image files should still be on your storage, so the easiest way is to create the corresponding config files manually.
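
For example, a QEMU guest config is a small text file at /etc/pve/qemu-server/<vmid>.conf. A minimal sketch with a hypothetical VMID, storage and disk name (adjust them to match the images you find on your storage):

Code:
# /etc/pve/qemu-server/100.conf -- hypothetical example values
bootdisk: scsi0
cores: 2
memory: 2048
name: restored-vm
net0: virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr0
ostype: l26
scsi0: local-lvm:vm-100-disk-1,size=32G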
 
Code:
root@proxmox2:~# ls -la /var/lib/pve-cluster/backup/
total 40
drwxr-xr-x 2 root root  4096 Jan 17 12:04 .
drwxr-xr-x 3 root root  4096 Jan 18 07:34 ..
-rw-r--r-- 1 root root 12645 Nov 11 09:42 config-1478853733.sql.gz
-rw-r--r-- 1 root root 15188 Jan 17 12:04 config-1484651064.sql.gz

We have these two backups. How can I restore my VMs from them?

Thanks for your help!
Really appreciated
 
Finally I got my machines back!

I stopped the cluster services:
Code:
systemctl stop pve-cluster
systemctl stop corosync
pmxcfs -l    # start the cluster filesystem in local mode

This way I'm working in local mode. Then I restored the backup from /var/lib/pve-cluster/backup like this:
First decompress the config backup to get an SQL file

Code:
cd /var/lib/pve-cluster
mv config.db config.db.bak
gzip -d backup/config-1484651064.sql.gz
sqlite3 config.db < backup/config-1484651064.sql
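
Before rebooting, a quick read-only sanity check can confirm the restore worked. This assumes the standard pmxcfs schema, where the config tree is stored in a table named tree:

Code:
# Your VM .conf files should show up in the restored database
sqlite3 /var/lib/pve-cluster/config.db \
  "SELECT name FROM tree WHERE name LIKE '%.conf';"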

After a reboot I can access my VMs again.

Thanks for your help!
 
Thanks iruindegi,

I got my VMs back.

Here's an update from my side:

I stopped the cluster services:
Code:
systemctl stop pve-cluster
systemctl stop corosync
pmxcfs -l

This way I'm working in local mode. Then I restored the backup from /var/lib/pve-cluster/backup like this:
First decompress the config backup to get an SQL file (e.g. config-1546119386.sql.gz)

Code:
cd /var/lib/pve-cluster
mv config.db config.db.bak
cp /var/lib/pve-cluster/backup/config-1546119386.sql.gz /var/lib/pve-cluster/
gzip -d config-1546119386.sql.gz
sqlite3 config.db < config-1546119386.sql

After a reboot I can access my VMs.

My VMs did not autostart, so:

Code:
systemctl stop pve-cluster
systemctl stop corosync
pmxcfs -l                    # cluster filesystem in local mode
rm /etc/pve/corosync.conf    # drop the broken cluster config
rm /etc/corosync/*
killall pmxcfs
systemctl start pve-cluster  # restart as a standalone node

corosync still fails, but the VMs are up again. I will try to repair corosync.
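
In case it helps anyone else: once /etc/pve/corosync.conf is removed the node runs standalone, so the corosync unit has no config left to start from. My assumption is that simply disabling the unit stops the failures:

Code:
# Standalone node: corosync has nothing to do, so disable it
systemctl stop corosync
systemctl disable corosync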

Big thanks for now !!! :);):)