We upgraded from the latest 3.x to 4.x without any problems.
But when recreating the cluster we run into a quorum problem (there were no such problems on 3.x before):
Code:
pvecm add xx.xx.xx.xx -force
cluster not ready - no quorum?
pvecm status on first node:
Code:
Quorum information
------------------
Date: Fri Oct 30 10:16:50 2015
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 71312
Quorate: No
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.10.10 (local)
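To make the numbers in this output explicit: assuming the default majority rule (options such as two_node would change it), votequorum needs more than half of the expected votes. The helper names below are made up for illustration; this is a minimal sketch, not corosync's actual code.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest vote count that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True if the votes currently present reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


# Values taken from the pvecm status output above.
expected_votes = 2
total_votes = 1

print("quorum needed:", quorum_threshold(expected_votes))
print("quorate:", is_quorate(total_votes, expected_votes))
```

With 2 expected votes the threshold is 2, so a single node with 1 vote stays non-quorate, which matches "Quorum: 2 Activity blocked".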
Output of /var/log/syslog from first node:
Code:
Oct 30 10:12:29 proxmox pmxcfs[8281]: [status] crit: cpg_dispatch failed: 2
Oct 30 10:12:29 proxmox pmxcfs[8281]: [status] crit: cpg_leave failed: 2
Oct 30 10:12:29 proxmox pmxcfs[8281]: [quorum] crit: quorum_dispatch failed: 2
Oct 30 10:12:29 proxmox pmxcfs[8281]: [quorum] crit: quorum_initialize failed: 2
Oct 30 10:12:29 proxmox pmxcfs[8281]: [quorum] crit: can't initialize service
Oct 30 10:12:29 proxmox pmxcfs[8281]: [confdb] crit: cmap_initialize failed: 2
Oct 30 10:12:29 proxmox pmxcfs[8281]: [confdb] crit: can't initialize service
Oct 30 10:12:29 proxmox pmxcfs[8281]: [dcdb] notice: start cluster connection
Oct 30 10:12:29 proxmox pmxcfs[8281]: [dcdb] crit: cpg_initialize failed: 2
Oct 30 10:12:29 proxmox pmxcfs[8281]: [dcdb] crit: can't initialize service
Oct 30 10:12:29 proxmox pmxcfs[8281]: [status] notice: start cluster connection
Oct 30 10:12:29 proxmox pmxcfs[8281]: [status] crit: cpg_initialize failed: 2
Oct 30 10:12:29 proxmox pmxcfs[8281]: [status] crit: can't initialize service
Oct 30 10:12:30 proxmox corosync[8314]: Waiting for corosync services to unload:.[ OK ]
Oct 30 10:12:30 proxmox corosync[8331]: [MAIN ] Corosync Cluster Engine ('2.3.5'): started and ready to provide service.
Oct 30 10:12:30 proxmox corosync[8331]: [MAIN ] Corosync built-in features: augeas systemd pie relro bindnow
Oct 30 10:12:30 proxmox corosync[8332]: [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 30 10:12:30 proxmox corosync[8332]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Oct 30 10:12:30 proxmox corosync[8332]: [TOTEM ] The network interface [10.0.10.10] is now up.
Oct 30 10:12:30 proxmox corosync[8332]: [SERV ] Service engine loaded: corosync configuration map access [0]
Oct 30 10:12:30 proxmox corosync[8332]: [QB ] server name: cmap
Oct 30 10:12:30 proxmox corosync[8332]: [SERV ] Service engine loaded: corosync configuration service [1]
Oct 30 10:12:30 proxmox corosync[8332]: [QB ] server name: cfg
Oct 30 10:12:30 proxmox corosync[8332]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Oct 30 10:12:30 proxmox corosync[8332]: [QB ] server name: cpg
Oct 30 10:12:30 proxmox corosync[8332]: [SERV ] Service engine loaded: corosync profile loading service [4]
Oct 30 10:12:30 proxmox corosync[8332]: [QUORUM] Using quorum provider corosync_votequorum
Oct 30 10:12:30 proxmox corosync[8332]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Oct 30 10:12:30 proxmox corosync[8332]: [QB ] server name: votequorum
Oct 30 10:12:30 proxmox corosync[8332]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Oct 30 10:12:30 proxmox corosync[8332]: [QB ] server name: quorum
Oct 30 10:12:30 proxmox corosync[8332]: [TOTEM ] A new membership (10.0.10.10:71312) was formed. Members joined: 1
Oct 30 10:12:30 proxmox corosync[8332]: [QUORUM] Members[1]: 1
Oct 30 10:12:30 proxmox corosync[8332]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 30 10:12:30 proxmox corosync[8325]: Starting Corosync Cluster Engine (corosync): [ OK ]
Oct 30 10:12:30 proxmox pmxcfs[8281]: [status] crit: cpg_send_message failed: 9
Oct 30 10:12:30 proxmox pmxcfs[8281]: [status] crit: cpg_send_message failed: 9
Oct 30 10:12:30 proxmox pmxcfs[8281]: [status] crit: cpg_send_message failed: 9
Oct 30 10:12:30 proxmox pmxcfs[8281]: [status] crit: cpg_send_message failed: 9
Oct 30 10:12:35 proxmox pmxcfs[8281]: [status] notice: update cluster info (cluster name xyz, version = 2)
Oct 30 10:12:35 proxmox pmxcfs[8281]: [dcdb] notice: members: 1/8281
Oct 30 10:12:35 proxmox pmxcfs[8281]: [dcdb] notice: all data is up to date
Oct 30 10:12:35 proxmox pmxcfs[8281]: [status] notice: members: 1/8281
Oct 30 10:12:35 proxmox pmxcfs[8281]: [status] notice: all data is up to date
Oct 30 10:13:17 proxmox sshd[8464]: Connection closed by 10.0.10.12 [preauth]
Oct 30 10:13:17 proxmox sshd[8466]: Accepted publickey for root from 10.0.10.12 port 59194 ssh2: .....
Oct 30 10:13:17 proxmox sshd[8466]: pam_unix(sshd:session): session opened for user root by (uid=0)
Oct 30 10:13:17 proxmox systemd-logind[769]: New session 7 of user root.
Oct 30 10:13:17 proxmox sshd[8466]: Received disconnect from 10.0.10.12: 11: disconnected by user
Oct 30 10:13:17 proxmox sshd[8466]: pam_unix(sshd:session): session closed for user root
Oct 30 10:13:17 proxmox systemd-logind[769]: Removed session 7.
Oct 30 10:13:17 proxmox sshd[8470]: Accepted publickey for root from 10.0.10.12 port 59196 ssh2: .....
Oct 30 10:13:17 proxmox sshd[8470]: pam_unix(sshd:session): session opened for user root by (uid=0)
Oct 30 10:13:17 proxmox systemd-logind[769]: New session 8 of user root.
Oct 30 10:13:18 proxmox sshd[8470]: Received disconnect from 10.0.10.12: 11: disconnected by user
Oct 30 10:13:18 proxmox sshd[8470]: pam_unix(sshd:session): session closed for user root
Oct 30 10:13:18 proxmox systemd-logind[769]: Removed session 8.
Oct 30 10:13:22 proxmox pveproxy[7723]: ipcc_send_rec failed: Transport endpoint is not connected
Oct 30 10:13:22 proxmox pvedaemon[1524]: ipcc_send_rec failed: Transport endpoint is not connected
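The numeric codes in the pmxcfs lines look like corosync cs_error_t values. Here is a small sketch to decode them. The table is written from memory of corosync's corotypes.h and should be checked against the installed headers, and the function name is only illustrative.

```python
import re

# Partial cs_error_t table (assumption: values as recalled from corotypes.h).
CS_ERRORS = {
    1: "CS_OK",
    2: "CS_ERR_LIBRARY",
    6: "CS_ERR_TRY_AGAIN",
    9: "CS_ERR_BAD_HANDLE",
}

# Matches lines such as "... pmxcfs[8281]: [status] crit: cpg_dispatch failed: 2"
LOG_RE = re.compile(r"pmxcfs\[\d+\]: \[(\w+)\] crit: (\w+) failed: (\d+)")


def decode_pmxcfs_error(line):
    """Return (subsystem, call, error name), or None if the line does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    subsystem, call, code = match.group(1), match.group(2), int(match.group(3))
    return subsystem, call, CS_ERRORS.get(code, "unknown ({})".format(code))


sample = [
    "Oct 30 10:12:29 proxmox pmxcfs[8281]: [status] crit: cpg_dispatch failed: 2",
    "Oct 30 10:12:29 proxmox pmxcfs[8281]: [quorum] crit: can't initialize service",
    "Oct 30 10:12:30 proxmox pmxcfs[8281]: [status] crit: cpg_send_message failed: 9",
]
decoded = [decode_pmxcfs_error(line) for line in sample]
for item in decoded:
    print(item)
```

The code 2 errors are logged at 10:12:29, just before corosync reports that it has started at 10:12:30, so they may simply reflect the service restart rather than a separate fault.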
omping works:
Code:
omping 10.0.10.10 10.0.10.12
10.0.10.10 : waiting for response msg
10.0.10.10 : joined (S,G) = (*, 232.43.211.234), pinging
10.0.10.10 : unicast, seq=1, size=69 bytes, dist=0, time=0.451ms
10.0.10.10 : multicast, seq=1, size=69 bytes, dist=0, time=0.472ms
10.0.10.10 : unicast, seq=2, size=69 bytes, dist=0, time=0.991ms
10.0.10.10 : multicast, seq=2, size=69 bytes, dist=0, time=1.014ms
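For anyone who wants to check such output mechanically, here is a rough sketch that counts unicast and multicast replies per host. The regular expression is an assumption based only on the lines pasted above, and the function name is made up.

```python
import re
from collections import Counter

# The omping reply lines from the post.
SAMPLE = """\
10.0.10.10 : waiting for response msg
10.0.10.10 : joined (S,G) = (*, 232.43.211.234), pinging
10.0.10.10 : unicast, seq=1, size=69 bytes, dist=0, time=0.451ms
10.0.10.10 : multicast, seq=1, size=69 bytes, dist=0, time=0.472ms
10.0.10.10 : unicast, seq=2, size=69 bytes, dist=0, time=0.991ms
10.0.10.10 : multicast, seq=2, size=69 bytes, dist=0, time=1.014ms
"""

REPLY_RE = re.compile(r"^(\S+) : (unicast|multicast), seq=(\d+), .*time=([\d.]+)ms$")


def count_replies(text):
    """Count replies per (host, kind), where kind is 'unicast' or 'multicast'."""
    counts = Counter()
    for line in text.splitlines():
        match = REPLY_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


counts = count_replies(SAMPLE)
for (host, kind), n in sorted(counts.items()):
    print(host, kind, n)
```

This only covers the few seconds shown here; it says nothing about whether multicast keeps working over a longer run.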
More info: while restarting corosync.service we captured IGMP traffic with 'tcpdump -i eth1 igmp -n'. The output is identical on the first and second node:
Code:
11:30:28.119245 IP 10.0.10.10 > 224.0.0.22: igmp v3 report, 1 group record(s)
11:30:28.763246 IP 10.0.10.10 > 224.0.0.22: igmp v3 report, 1 group record(s)
11:30:29.235214 IP 10.0.10.10 > 224.0.0.22: igmp v3 report, 1 group record(s)
11:30:29.475224 IP 10.0.10.10 > 224.0.0.22: igmp v3 report, 1 group record(s)
11:30:31.449922 IP 0.0.0.0 > 224.0.0.1: igmp query v2
11:32:36.449939 IP 0.0.0.0 > 224.0.0.1: igmp query v2
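A quick way to read the capture is to measure the gap between the two IGMP queries. This is a sketch that assumes the tcpdump timestamp format shown above; the function name is illustrative.

```python
from datetime import datetime

# The IGMP capture lines from the post.
SAMPLE = """\
11:30:28.119245 IP 10.0.10.10 > 224.0.0.22: igmp v3 report, 1 group record(s)
11:30:28.763246 IP 10.0.10.10 > 224.0.0.22: igmp v3 report, 1 group record(s)
11:30:29.235214 IP 10.0.10.10 > 224.0.0.22: igmp v3 report, 1 group record(s)
11:30:29.475224 IP 10.0.10.10 > 224.0.0.22: igmp v3 report, 1 group record(s)
11:30:31.449922 IP 0.0.0.0 > 224.0.0.1: igmp query v2
11:32:36.449939 IP 0.0.0.0 > 224.0.0.1: igmp query v2
"""


def query_times(text):
    """Return the timestamps of all lines that contain an IGMP query."""
    times = []
    for line in text.splitlines():
        if "igmp query" in line:
            stamp = line.split()[0]
            times.append(datetime.strptime(stamp, "%H:%M:%S.%f"))
    return times


times = query_times(SAMPLE)
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
print("seconds between queries:", gaps)
```

The two queries are about 125 seconds apart, which is the default IGMP query interval, so some querier appears to be active on this network segment. All the reports in this capture come from 10.0.10.10 only.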
Can someone give us a hint on how to recreate the cluster?
Is it possible to reset the cluster? We have a single server without any VMs on it, used only for login.