Cluster Proxmox Debian encrypted

ocerda

Hello,

I installed Debian 9 with an encrypted disk, and then installed Proxmox as described here:

https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch

Now I am trying to create a cluster in Proxmox, as this page says:

https://pve.proxmox.com/wiki/Cluster_Manager#_adding_nodes_to_the_cluster

It should be something simple, but it's making my life hell. I use pvecm create "namecluster", then ssh root@<ip of the node>, and then pvecm add <ip of the main node>.
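Concretely, the sequence looks something like this (the cluster name is just an example; 192.168.0.2 is the main node):
Code:
# on the main node
pvecm create namecluster

# on the node to be added (after logging in with ssh root@<node ip>)
pvecm add 192.168.0.2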

And it's always the same error; I have already tried it a thousand ways and I always get the same error.

Please enter superuser (root) password for '192.168.0.2':
Password for root@192.168.0.2: *********

Establishing API connection with host '192.168.0.2'
The authenticity of host '192.168.0.2' can not be established.
X509 SHA256 key fingerprint is 32:AC:0E:C7:01:0C:E4:9C:F9:2B:F0:18:3E:ED:51:4D:EF:6E:BF:B4:5B:EF:D8:D1:08:F9:BB:87:EC:32:4A:D2.
Are you sure you want to continue connecting (yes/no)? Login succeeded.
Request addition of this node
Join request OK, finishing setup locally
stopping pve-cluster service
backup old database to '/var/lib/pve-cluster/backup/config-1535641447.sql.gz'
Job for corosync.service failed because the control process exited with error code.

TASK ERROR: starting pve-cluster failed: See "systemctl status corosync.service" and "journalctl -xe" for details.

Please help me.
 
Hi,

Which error do you get? Please post the output of:
Code:
journalctl -u  corosync.service

Please also send the corosync.conf from the first node.

Code:
cat /etc/corosync/corosync.conf
 
Sorry for the delay in answering.

Here are the two things you asked for.

root@node3:~# journalctl -u corosync.service
-- Logs begin at Thu 2018-08-30 17:48:33 CEST, end at Mon 2018-09-03 15:39:30 CEST. --
sep 03 15:38:30 node3 systemd[1]: Starting Corosync Cluster Engine...
sep 03 15:38:30 node3 corosync[24028]: [MAIN ] Corosync Cluster Engine ('2.4.2-dirty'): started and ready to provide service.
sep 03 15:38:30 node3 corosync[24028]: notice [MAIN ] Corosync Cluster Engine ('2.4.2-dirty'): started and ready to provide service.
sep 03 15:38:30 node3 corosync[24028]: info [MAIN ] Corosync built-in features: dbus rdma monitoring watchdog augeas systemd upstart xmlconf qdevices qnet
sep 03 15:38:30 node3 corosync[24028]: [MAIN ] Corosync built-in features: dbus rdma monitoring watchdog augeas systemd upstart xmlconf qdevices qnetd snmp
sep 03 15:38:30 node3 corosync[24028]: notice [TOTEM ] Initializing transport (UDP/IP Multicast).
sep 03 15:38:30 node3 corosync[24028]: notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
sep 03 15:38:30 node3 corosync[24028]: [TOTEM ] Initializing transport (UDP/IP Multicast).
sep 03 15:38:30 node3 corosync[24028]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
sep 03 15:38:30 node3 corosync[24028]: notice [TOTEM ] The network interface [192.168.0.164] is now up.
sep 03 15:38:30 node3 corosync[24028]: [TOTEM ] The network interface [192.168.0.164] is now up.
sep 03 15:38:30 node3 corosync[24028]: [SERV ] Service engine loaded: corosync configuration map access [0]
sep 03 15:38:30 node3 corosync[24028]: notice [SERV ] Service engine loaded: corosync configuration map access [0]
sep 03 15:38:30 node3 corosync[24028]: info [QB ] server name: cmap
sep 03 15:38:30 node3 corosync[24028]: notice [SERV ] Service engine loaded: corosync configuration service [1]
sep 03 15:38:30 node3 corosync[24028]: info [QB ] server name: cfg
sep 03 15:38:30 node3 corosync[24028]: notice [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
sep 03 15:38:30 node3 corosync[24028]: info [QB ] server name: cpg
sep 03 15:38:30 node3 corosync[24028]: notice [SERV ] Service engine loaded: corosync profile loading service [4]
sep 03 15:38:30 node3 corosync[24028]: [QB ] server name: cmap
sep 03 15:38:30 node3 corosync[24028]: notice [SERV ] Service engine loaded: corosync resource monitoring service [6]
sep 03 15:38:30 node3 corosync[24028]: warning [WD ] Watchdog /dev/watchdog exists but couldn't be opened.
sep 03 15:38:30 node3 corosync[24028]: warning [WD ] resource load_15min missing a recovery key.
sep 03 15:38:30 node3 corosync[24028]: warning [WD ] resource memory_used missing a recovery key.
sep 03 15:38:30 node3 corosync[24028]: info [WD ] no resources configured.
sep 03 15:38:30 node3 corosync[24028]: notice [SERV ] Service engine loaded: corosync watchdog service [7]
sep 03 15:38:30 node3 corosync[24028]: notice [QUORUM] Using quorum provider corosync_votequorum
sep 03 15:38:30 node3 corosync[24028]: crit [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
sep 03 15:38:30 node3 corosync[24028]: error [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
sep 03 15:38:30 node3 corosync[24028]: error [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.
sep 03 15:38:30 node3 systemd[1]: corosync.service: Main process exited, code=exited, status=20/n/a
sep 03 15:38:30 node3 systemd[1]: Failed to start Corosync Cluster Engine.
sep 03 15:38:30 node3 systemd[1]: corosync.service: Unit entered failed state.
sep 03 15:38:30 node3 systemd[1]: corosync.service: Failed with result 'exit-code'.




root@node3:~# cat /etc/corosync/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.0.2
  }
  node {
    name: node3
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.0.4
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: timing
  config_version: 2
  interface {
    bindnetaddr: 192.168.0.2
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
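By the way, should /etc/corosync/corosync.conf match the cluster-wide copy under /etc/pve? If it helps, I can compare them on the first node with:
Code:
diff /etc/corosync/corosync.conf /etc/pve/corosync.conf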

It is always the same error and I cannot find a solution.
 
I think not; it has to do with the way I installed Proxmox.

I have a test bench of 3 computers where I test configurations before deploying them at companies.

Previously, on these three computers, I installed from the Proxmox 5 ISO downloaded from the official website.

I created a 3-node Proxmox cluster, with high availability and Ceph storage.

Then I decided to make it more complex.

I installed Debian 9 on the three machines; when partitioning the disk I selected the encrypted LVM option, which asks for a password and encrypts the disk.

Once I had Debian, I installed Proxmox as this post says.

https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch

And once I had Proxmox properly configured (network, etc.), I followed the same steps as for the first cluster, but it fails right away when trying to add the other nodes.

For that reason I doubt it is a multicast problem, since I already got it working before.

Any ideas?
 
Anyway, you could test whether multicast is working using omping ...

It might be that the PVE installer sets kernel or sysctl variables which are missing in a stock Debian setup and might affect this.
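If you still have a machine installed from the ISO, you could dump the sysctl settings on both and diff them, for example (file names here are just examples):
Code:
# run on each machine
sysctl -a | sort > /tmp/sysctl.$(hostname).txt
# then copy one file over and compare
diff /tmp/sysctl.nodeA.txt /tmp/sysctl.nodeB.txt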
 
How do I test with omping? What should I do? I have read the pages you sent me, but the omping command does not work for me.
 
The error message
sep 03 15:38:30 node3 corosync[24028]: error [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
indicates that the config is not correct.
It looks correct to me, but maybe there is a wrong invisible character that triggers this.
What language settings do you use?
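You could check the file for invisible or non-ASCII characters like this:
Code:
grep -nP '[^\x00-\x7F]' /etc/corosync/corosync.conf
cat -A /etc/corosync/corosync.conf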

How do I test with omping?
Execute this command on both nodes and please send the output.
Code:
omping -c 10000 -i 0.001 -F -q  192.168.0.2 192.168.0.4
 
The error message

indicates that the config is not correct.
It looks correct to me, but maybe there is a wrong invisible character that triggers this.
What language settings do you use?

I use the Spanish language


Execute this command on both nodes and please send the output.
Code:
omping -c 10000 -i 0.001 -F -q  192.168.0.2 192.168.0.4

I was going crazy with so much testing. In the end I downloaded the Proxmox ISO, installed Proxmox on the 3 computers, and created the cluster from the web interface, and it worked on the first try. But that is not what I want: in a company we have already deployed an encrypted Proxmox and it works perfectly, which is why I would like to get 3 encrypted Proxmox nodes in a cluster.

This afternoon I'll install encrypted Debian and then Proxmox, and I'll run the test again when it fails (it always fails). As for omping, I tried it and the command was not recognized.

Do I have to install some package? Or do I need to have the cluster already created, even if adding the second Proxmox fails?
 
omping is installed on all Proxmox VE systems that are installed from the ISO.
If you come from a Debian installation, you have to install it yourself.
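On Debian the package should simply be called omping:
Code:
apt update
apt install omping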
 
omping is installed on all Proxmox VE systems that are installed from the ISO.
If you come from a Debian installation, you have to install it yourself.

Main server, the one I created the cluster with. The main one is 192.168.0.4 -> node3.

omping -c 10000 -i 0.001 -F -q 192.168.0.2 192.168.0.4
192.168.0.2: waiting for response msg
192.168.0.2: waiting for response msg
192.168.0.2: waiting for response msg
192.168.0.2: waiting for response msg
192.168.0.2: waiting for response msg
192.168.0.2: joined (S, G) = (*, 232.43.211.234), pinging
192.168.0.2: given amount of query messages was sent

192.168.0.2: unicast, xmt/rcv/%loss = 10000/9999/0%, min/avg/max/std-dev = 0.081/0.379/0.639/0.126
192.168.0.2: multicast, xmt/rcv/%loss = 10000/9999/0%, min/avg/max/std-dev = 0.087/0.385/0.644/0.127

The other node.

omping -c 10000 -i 0.001 -F -q 192.168.0.2 192.168.0.4
192.168.0.4: waiting for response msg
192.168.0.4: joined (S, G) = (*, 232.43.211.234), pinging
192.168.0.4: waiting for response msg
192.168.0.4: server told us to stop

192.168.0.4: unicast, xmt/rcv/%loss = 9347/9347/0%, min/avg/max/std-dev = 0.099/0.288/0.489/0.058
192.168.0.4: multicast, xmt/rcv/%loss = 9347/9347/0%, min/avg/max/std-dev = 0.100/0.301/0.510/0.063

This is the result; I do not understand it, can you tell me what it means?

This is the new output of the cat command:


root@node3:~# cat /etc/corosync/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  do not give {
    name: node1
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.0.2
  }
  do not give {
    name: node3
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.0.4
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: timing
  config_version: 2
  interface {
    bindnetaddr: 192.168.0.4
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}

I'll add that the web interface of the node I'm adding also stops working, in case that gives you a clue.
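I suppose the web interface depends on the cluster service; if it helps, I can also post what these show on that node:
Code:
systemctl status pve-cluster pveproxy
journalctl -u pve-cluster -n 20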
 
This is the result; I do not understand it, can you tell me what it means?
That means everything is OK and your network is reliable.
The important thing is that you have no packet loss and the avg ping time is under 2 sec.

Your config looks wrong.
nodelist {
do not give {
name: node1
nodeid: 2
quorum_votes: 1
ring0_addr: 192.168.0.2
}
do not give {
The line "do not give" is wrong and "node" would the correct key.

Which language settings do you use?
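With your values, the nodelist section should then read:
Code:
nodelist {
  node {
    name: node1
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.0.2
  }
  node {
    name: node3
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.0.4
  }
}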
 
That means everything is OK and your network is reliable.
The important thing is that you have no packet loss and the avg ping time is under 2 sec.

Your config looks wrong.

Okay, thanks.

The line "do not give" is wrong and "node" would the correct key.

Which language settings do you use?


So I should change "do not give" to "node"?

By language settings, do you mean what language I use? If so, I use Spanish.
 
Okay, when I go to the office this afternoon I'll change that in the corosync file.

In the afternoon I will let you know whether the cluster works or not.
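If I understood correctly, what I have to do on the failing node is roughly:
Code:
nano /etc/corosync/corosync.conf   # replace "do not give" with "node"
systemctl restart corosync
journalctl -u corosync.service -n 20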
 
Anyway, I still think that Proxmox is not installed correctly, that I'm missing some package or something.

I installed Debian, then added the Proxmox repositories and ran apt-get update && apt dist-upgrade.

And then installed these packages:
apt install proxmox-ve postfix open-iscsi

After that, I configured the network, etc.

Is it possible that I'm missing some package and that's why it fails?
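If it helps, I can also post the installed versions so we can see whether some package is missing:
Code:
pveversion -v
apt-cache policy proxmox-ve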
 
I've already changed that in the corosync config and it's exactly the same.

It does not work... Any more ideas about what may be happening?
 
I'm wondering: could it have something to do with the disk where Proxmox is installed being encrypted, or is that unrelated?

I keep looking for information about this failure and I cannot find the cause... I do not know what to think anymore.

How is it possible that it works with the Proxmox ISO, but not when installing Debian and the packages you told me about?
 
