All VMs lost after being added to cluster

iruindegi

Hi,

We had a Proxmox server (proxmox2, 172.28.64.19) with some VMs on it, and we had created a cluster on it with another machine, which is now disconnected and has been removed from the cluster.

After some days we bought a new server (proxmox1, 172.28.64.16). We created a new cluster on it and tried to add the other server with

proxmox1# pvecm add 172.28.64.16

which failed, so we tried again with the -force flag, which reported success.

But now, when we open the Proxmox GUI, we don't see any of our machines. And the cluster is not correctly configured... (I'm worried about my machines...)

Some outputs:

root@proxmox1:~# pvecm nodes

Membership information
----------------------
Nodeid Votes Name
1 1 proxmox1 (local)
root@proxmox1:~# service pve-cluster start
root@proxmox1:~# pvecm status
Quorum information
------------------
Date: Tue Jan 17 14:08:34 2017
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 1/2736
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.28.64.16 (local)

-------------------------------------------------------------------------------------------------------------------
root@proxmox1:~# pvecm nodes

Membership information
----------------------
Nodeid Votes Name
1 1 proxmox1 (local)


-------------------------------------------------------------------------------------------------------------------

root@proxmox2:~# pvecm status
Quorum information
------------------
Date: Tue Jan 17 13:40:28 2017
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2/7108
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 172.28.64.19 (local)

-------------------------------------------------------------------------------------------------------------------

root@proxmox2:~# pvecm nodes

Membership information
----------------------
Nodeid Votes Name
2 1 proxmox2 (local)
root@proxmox2:~#
 
From your reply it seems like you have 2 clusters on 2 different machines. After you created the new cluster on proxmox1, you ran the following command:

proxmox1# pvecm add 172.28.64.16

But you should run this command from proxmox2, as follows:
proxmox2# pvecm add 172.28.64.16

Proxmox2 needs to be added to the cluster on proxmox1. When you log in to the proxmox2 GUI, do you see your existing VMs?
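
You can also check quickly from the shell on proxmox2 (standard Proxmox commands, assuming a default setup):

Code:
# VMs this node knows about (read from /etc/pve/qemu-server/)
qm list
ls -la /etc/pve/qemu-server/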
 
Yes, I did run it from proxmox2; I made a mistake when I wrote the first post.

It failed the first time, but we got a successful result with the -force flag.

There are no VMs in the proxmox2 GUI, and none in proxmox1 either.
 
When you added the node, it copied all configuration from proxmox1. In effect, this removed all existing VM configurations from proxmox2.

Note: that is why this is not allowed by default (but you overrode that using the -force flag).
 
Is there no way to restore the VMs?

I guess config files are lost, but maybe there is a copy of the old database in /var/lib/pve-cluster/config.db.backup.

The VM image files should still be on your storage, so the easiest way is to create the corresponding config files manually.
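
For example, a minimal config file such as /etc/pve/qemu-server/100.conf could look roughly like this; the VM ID, name, MAC address and the storage/volume name are only placeholders, use whatever you actually find on your storage:

Code:
bootdisk: scsi0
cores: 2
memory: 2048
name: restored-vm
net0: virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr0
ostype: l26
scsi0: local-lvm:vm-100-disk-1,size=32G
scsihw: virtio-scsi-pci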
 
Code:
root@proxmox2:~# ls -la /var/lib/pve-cluster/backup/
total 40
drwxr-xr-x 2 root root  4096 Jan 17 12:04 .
drwxr-xr-x 3 root root  4096 Jan 18 07:34 ..
-rw-r--r-- 1 root root 12645 Nov 11 09:42 config-1478853733.sql.gz
-rw-r--r-- 1 root root 15188 Jan 17 12:04 config-1484651064.sql.gz

We have this. With these backups, how can I restore my VMs?

Thanks for your help!
Really appreciated
 
Finally I got my machines back!

I stopped the cluster services:
Code:
systemctl stop pve-cluster
systemctl stop corosync
pmxcfs -l

This way I'm working in local mode. Then I restored the backup from /var/lib/pve-cluster/backup like this:
First decompress the config backup to get an .sql file

Code:
cd /var/lib/pve-cluster
mv config.db config.db.bak
cp backup/config-1484651064.sql.gz .
gzip -d config-1484651064.sql.gz
sqlite3 config.db < config-1484651064.sql

After a reboot I could access my VMs again.
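
For anyone following this, a quick way to double-check that the restore worked is to list the VM configs and VMs again:

Code:
ls /etc/pve/qemu-server/
qm list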

Thanks for your help!
 
Thanks iruindegi,

I got my VMs back.

Here is an update from my side:

I stopped the cluster service
Code:
systemctl stop pve-cluster
systemctl stop corosync
pmxcfs -l

This way I'm working in local mode. Then I restored the backup from /var/lib/pve-cluster/backup like this:
First decompress the config backup to get an .sql file (e.g. config-1546119386.sql.gz)

Code:
cd /var/lib/pve-cluster
mv config.db config.db.bak
cp /var/lib/pve-cluster/backup/config-1546119386.sql.gz /var/lib/pve-cluster/
gzip -d config-1546119386.sql.gz
sqlite3 config.db < config-1546119386.sql

After a reboot I could access my VMs again.

My VMs did not autostart, therefore:

Code:
systemctl stop pve-cluster
systemctl stop corosync
# run the cluster filesystem in local mode to get write access to /etc/pve
pmxcfs -l
# drop the (broken) cluster/corosync configuration so the node runs standalone
rm /etc/pve/corosync.conf
rm /etc/corosync/*
# stop the local-mode pmxcfs and start the normal service again
killall pmxcfs
systemctl start pve-cluster

corosync still fails --> but the VMs are up again.
I will try to repair corosync.
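
In case it is useful for others hitting the same thing: a first look at why corosync fails is usually its service status and journal (plain systemd commands):

Code:
systemctl status corosync
journalctl -u corosync -b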

Big thanks for now !!! :);):)
 
