Problem creating a cluster

konabi

Hello,
I wanted to create a PVE cluster.
I tested multicast with omping and also made sure that all nodes use the same time.

On the first node I created the cluster with "pvecm create PVECluster".
Then I ran "pvecm add <IP of the first node>" on the second node.

Apparently there was a problem with the certificate exchange afterwards, so the cluster file system (pmxcfs) is not running properly.

On node PVE01 everything seems to be OK; the /etc/pve directory looks good:

Code:
root@pve01:~# ls -l /etc/pve
total 4
-r--r----- 1 root www-data  451 May 18 12:42 authkey.pub
-r--r----- 1 root www-data  447 Jun  1 15:23 corosync.conf
-r--r----- 1 root www-data   13 May 18 12:33 datacenter.cfg
lr-xr-xr-x 1 root www-data    0 Jan  1  1970 local -> nodes/pve01
lr-xr-xr-x 1 root www-data    0 Jan  1  1970 lxc -> nodes/pve01/lxc
dr-xr-xr-x 2 root www-data    0 May 18 12:42 nodes
lr-xr-xr-x 1 root www-data    0 Jan  1  1970 openvz -> nodes/pve01/openvz
dr-x------ 2 root www-data    0 May 18 12:42 priv
-r--r----- 1 root www-data 2053 May 18 12:42 pve-root-ca.pem
-r--r----- 1 root www-data 1679 May 18 12:42 pve-www.key
lr-xr-xr-x 1 root www-data    0 Jan  1  1970 qemu-server -> nodes/pve01/qemu-server
-r--r----- 1 root www-data  322 May 22 15:42 storage.cfg
-r--r----- 1 root www-data   51 May 18 12:33 user.cfg
-r--r----- 1 root www-data  312 May 30 12:31 vzdump.cron

On PVE02, however, the /etc/pve/nodes directory and several files are missing:


Code:
ls -l /etc/pve/
total 1
-r--r----- 1 root www-data 447 Jun  1 15:23 corosync.conf
lr-xr-xr-x 1 root www-data   0 Jan  1  1970 local -> nodes/pve02
lr-xr-xr-x 1 root www-data   0 Jan  1  1970 lxc -> nodes/pve02/lxc
lr-xr-xr-x 1 root www-data   0 Jan  1  1970 openvz -> nodes/pve02/openvz
lr-xr-xr-x 1 root www-data   0 Jan  1  1970 qemu-server -> nodes/pve02/qemu-server

What options do I have to repair this now?
I am grateful for any hint!

Sven
 
You can run 'pvecm add' again (with the '--force' flag). Please post the error message here afterwards.
 
Hello Dietmar,

Code:
root@pve02:~# pvecm add 192.168.222.101 --force
can't create shared ssh key database '/etc/pve/priv/authorized_keys'
node pve02 already defined
copy corosync auth key
stopping pve-cluster service
backup old database
waiting for quorum...

and it stays stuck at "waiting for quorum..."

In the syslog:

Code:
Jun  1 20:41:28 pve02 pmxcfs[3736]: [main] notice: teardown filesystem
Jun  1 20:41:28 pve02 systemd[1]: Stopping The Proxmox VE cluster filesystem...
Jun  1 20:41:30 pve02 pmxcfs[3736]: [main] notice: exit proxmox configuration filesystem (0)
Jun  1 20:41:30 pve02 systemd[1]: Stopped The Proxmox VE cluster filesystem.
Jun  1 20:41:30 pve02 systemd[1]: Starting The Proxmox VE cluster filesystem...
Jun  1 20:41:30 pve02 pmxcfs[72476]: [status] notice: update cluster info (cluster name  PVECluster, version = 2)
Jun  1 20:41:30 pve02 pmxcfs[72476]: [dcdb] notice: members: 2/72476
Jun  1 20:41:30 pve02 pmxcfs[72476]: [dcdb] notice: all data is up to date
Jun  1 20:41:30 pve02 pmxcfs[72476]: [status] notice: members: 2/72476
Jun  1 20:41:30 pve02 pmxcfs[72476]: [status] notice: all data is up to date
Jun  1 20:41:31 pve02 systemd[1]: Started The Proxmox VE cluster filesystem.
Jun  1 20:41:31 pve02 systemd[1]: Started Corosync Cluster Engine.
Jun  1 20:41:31 pve02 pve-ha-crm[2159]: ipcc_send_rec failed: Transport endpoint is not connected
Jun  1 20:41:31 pve02 pve-ha-lrm[2174]: ipcc_send_rec failed: Transport endpoint is not connected
Jun  1 20:41:32 pve02 pveproxy[2164]: starting 1 worker(s)
Jun  1 20:41:32 pve02 pveproxy[2164]: worker 72525 started
Jun  1 20:41:32 pve02 pveproxy[72379]: worker exit
Jun  1 20:41:32 pve02 pveproxy[72380]: worker exit
Jun  1 20:41:32 pve02 pveproxy[72525]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1618.
Jun  1 20:41:32 pve02 pveproxy[2164]: worker 72379 finished
Jun  1 20:41:32 pve02 pveproxy[2164]: starting 1 worker(s)
Jun  1 20:41:32 pve02 pveproxy[2164]: worker 72526 started
Jun  1 20:41:32 pve02 pveproxy[2164]: worker 72380 finished
Jun  1 20:41:32 pve02 pveproxy[2164]: starting 1 worker(s)
Jun  1 20:41:32 pve02 pveproxy[2164]: worker 72527 started
Jun  1 20:41:32 pve02 pveproxy[72526]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1618.
Jun  1 20:41:32 pve02 pveproxy[72527]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1618.
Jun  1 20:41:37 pve02 pvestatd[2139]: ipcc_send_rec failed: Transport endpoint is not connected
Jun  1 20:41:37 pve02 pveproxy[72525]: worker exit
Jun  1 20:41:37 pve02 pveproxy[2164]: worker 72525 finished
Jun  1 20:41:37 pve02 pveproxy[2164]: starting 1 worker(s)
Jun  1 20:41:37 pve02 pveproxy[2164]: worker 72534 started
Jun  1 20:41:37 pve02 pveproxy[72526]: worker exit

Code:
root@pve02:~# pvecm status
Quorum information
------------------
Date:             Thu Jun  1 20:49:36 2017
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000002
Ring ID:          2/8
Quorate:          No

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 192.168.222.102 (local)
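
The "Quorum: 2 Activity blocked" line above can be reproduced with a little arithmetic. As far as I know, votequorum requires a strict majority of the expected votes, that is, integer division by two plus one. The sketch below assumes exactly that rule; the function names quorum_needed and is_quorate are made up for illustration and are not part of pvecm or corosync.

```shell
#!/bin/sh
# Hypothetical helper: votes needed for quorum, assuming a simple
# majority of the expected votes (expected / 2 + 1, integer division).
quorum_needed() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# Hypothetical helper: prints "yes" if the votes present reach the threshold.
is_quorate() {
    total=$1
    expected=$2
    if [ "$total" -ge "$(quorum_needed "$expected")" ]; then
        echo yes
    else
        echo no
    fi
}

quorum_needed 2    # expected votes as in the status output above
is_quorate 1 2     # one vote present, two expected
is_quorate 1 1     # one vote present, expected votes lowered to 1
```

With two expected votes and only one node visible, the threshold of two cannot be reached, which is why the node blocks activity until the other node is seen or the expected votes are lowered.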

Afterwards I noticed that I still had an error in the /etc/hosts file, so the nodes could not resolve each other's names. I corrected that and then ran:

Code:
root@pve02:~# pvecm expected 1
root@pve02:~# pvecm add 192.168.222.101 --force
node pve02 already defined
copy corosync auth key
stopping pve-cluster service
backup old database
generating node certificates
merge known_hosts file
restart services
successfully added node 'pve02' to cluster.
root@pve02:~# pvecm status
Quorum information
------------------
Date:             Thu Jun  1 21:06:00 2017
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000002
Ring ID:          2/12
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   1
Highest expected: 1
Total votes:      1
Quorum:           1
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 192.168.222.102 (local)

The output on PVE01:
Code:
root@pve01:~# pvecm status
Quorum information
------------------
Date:             Thu Jun  1 21:25:47 2017
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1/8
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   1
Highest expected: 1
Total votes:      1
Quorum:           1
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.222.101 (local)
root@pve01:~# pvecm status
Quorum information
------------------
Date:             Thu Jun  1 21:27:06 2017
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1/8
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   1
Highest expected: 1
Total votes:      1
Quorum:           1
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.222.101 (local)


and the syslog:
Code:
Jun  1 21:19:41 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:20:43 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:21:47 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:22:51 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:23:53 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:24:39 pve02 pveproxy[3347]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:24:54 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:25:54 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:26:56 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:27:57 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:28:06 pve02 pveproxy[3346]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
Jun  1 21:29:00 pve02 pveproxy[3348]: Could not verify remote node certificate '80:68:BD:40:C2:B0:23:0E:F0:7F:D9:A7:BC:32:86:F4:2D:A0:EA:2F:E2:52:D0:47:A8:62:72:EB:B2:C6:23:E4' with list of pinned certificates, refreshing cache
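
The /etc/hosts mistake mentioned in this post is easy to check for before joining a node. The following is only a sketch: hosts_lookup is a hypothetical helper, and the sample file simply reuses the node names and addresses that appear in this thread. It reads a hosts-format file and prints the address assigned to a name, or nothing if there is no entry.

```shell
#!/bin/sh
# Hypothetical helper: print the address a hosts-format file assigns to a name.
# Arguments: path to the hosts file, host name. Prints nothing if no entry exists.
hosts_lookup() {
    file=$1
    name=$2
    awk -v n="$name" '
        { sub(/#.*/, "") }    # strip comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$file"
}

# Example with a throwaway file; the addresses are the ones from this thread.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
127.0.0.1       localhost
192.168.222.101 pve01
192.168.222.102 pve02
EOF

for node in pve01 pve02; do
    addr=$(hosts_lookup "$tmp" "$node")
    if [ -n "$addr" ]; then
        echo "$node -> $addr"
    else
        echo "$node: no entry"
    fi
done
rm -f "$tmp"
```

On a real node the first argument would be /etc/hosts, and "getent hosts pve01" shows what the system resolver actually returns for a name.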
 
Oh right, of course one should also interpret the omping output correctly...
Only unicast replies come back.


Code:
root@pve01:~# omping pve01 pve02
pve02 : waiting for response msg
pve02 : waiting for response msg
pve02 : waiting for response msg
pve02 : waiting for response msg
pve02 : joined (S,G) = (*, 232.43.211.234), pinging
pve02 :   unicast, seq=1, size=69 bytes, dist=0, time=0.165ms
pve02 :   unicast, seq=2, size=69 bytes, dist=0, time=0.160ms
pve02 :   unicast, seq=3, size=69 bytes, dist=0, time=0.223ms
pve02 :   unicast, seq=4, size=69 bytes, dist=0, time=0.082ms
pve02 :   unicast, seq=5, size=69 bytes, dist=0, time=0.161ms
pve02 :   unicast, seq=6, size=69 bytes, dist=0, time=0.146ms
pve02 :   unicast, seq=7, size=69 bytes, dist=0, time=0.153ms
^C
pve02 :   unicast, xmt/rcv/%loss = 7/7/0%, min/avg/max/std-dev = 0.082/0.156/0.223/0.041
pve02 : multicast, xmt/rcv/%loss = 7/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000


Code:
root@pve02:~# omping pve01 pve02
pve01 : waiting for response msg
pve01 : joined (S,G) = (*, 232.43.211.234), pinging
pve01 :   unicast, seq=1, size=69 bytes, dist=0, time=0.129ms
pve01 :   unicast, seq=2, size=69 bytes, dist=0, time=0.213ms
pve01 :   unicast, seq=3, size=69 bytes, dist=0, time=0.193ms
pve01 :   unicast, seq=4, size=69 bytes, dist=0, time=0.217ms
pve01 :   unicast, seq=5, size=69 bytes, dist=0, time=0.214ms
pve01 :   unicast, seq=6, size=69 bytes, dist=0, time=0.201ms
pve01 :   unicast, seq=7, size=69 bytes, dist=0, time=0.220ms
pve01 :   unicast, seq=8, size=69 bytes, dist=0, time=0.216ms
^C
pve01 :   unicast, xmt/rcv/%loss = 8/8/0%, min/avg/max/std-dev = 0.129/0.200/0.220/0.030
pve01 : multicast, xmt/rcv/%loss = 8/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
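
Since the multicast result is easy to overlook between the unicast lines, the summary can be filtered. This is a minimal sketch that assumes the summary format shown above; multicast_loss is a hypothetical helper, not an omping option. It reads omping output on standard input and prints the host and the multicast loss percentage.

```shell
#!/bin/sh
# Hypothetical helper: read omping output on stdin and print
# "<host> <loss%>" for every multicast summary line.
multicast_loss() {
    awk '/multicast, xmt\/rcv\/%loss/ {
        split($0, parts, "= ")             # parts[2] starts with "xmt/rcv/loss%"
        split(parts[2], counts, "[/%]")    # counts[1..3] = xmt, rcv, loss
        print $1, counts[3]
    }'
}

# Example input: the two summary lines from the pve02 run above.
multicast_loss <<'EOF'
pve01 :   unicast, xmt/rcv/%loss = 8/8/0%, min/avg/max/std-dev = 0.129/0.200/0.220/0.030
pve01 : multicast, xmt/rcv/%loss = 8/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
EOF
```

For the run above this reports a loss of 100 for pve01, meaning no multicast packets arrived even though unicast works.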
 
