I tried to join a new node to a cluster, and it crashed the host server:
First, I created a cluster on the first server.
Then, on the second server, I ran:
Code:
pvecm add 10.16.2.254
Please enter superuser (root) password for '10.16.2.254':
Password for root@10.16.2.254:
Etablishing API connection with host '10.16.2.254'
TASK ERROR: 500 500 Can't connect to 10.16.2.254:8006
The first server (10.16.2.254) crashed, and I had to delete corosync.conf and reboot from the CD (rescue boot) to fix it.
/var/log/daemon.log on the first server:
Code:
May 15 14:02:12 proxmox systemd[1]: Stopping The Proxmox VE cluster filesystem...
May 15 14:02:12 proxmox pmxcfs[56235]: [main] notice: teardown filesystem
May 15 14:02:13 proxmox pve-ha-lrm[56449]: lost lock 'ha_agent_proxmox_lock - can't create '/etc/pve/priv/lock' (pmxcfs not mounted?)
May 15 14:02:13 proxmox pveproxy[64100]: ipcc_send_rec[1] failed: Connection refused
May 15 14:02:13 proxmox pveproxy[64100]: ipcc_send_rec[2] failed: Connection refused
May 15 14:02:13 proxmox pveproxy[64100]: ipcc_send_rec[3] failed: Connection refused
May 15 14:02:13 proxmox pveproxy[64100]: ipcc_send_rec[1] failed: Connection refused
May 15 14:02:13 proxmox pveproxy[64100]: ipcc_send_rec[2] failed: Connection refused
May 15 14:02:13 proxmox pveproxy[64100]: ipcc_send_rec[3] failed: Connection refused
May 15 14:02:13 proxmox pmxcfs[56235]: [main] notice: exit proxmox configuration filesystem (0)
May 15 14:02:14 proxmox systemd[1]: Stopped The Proxmox VE cluster filesystem.
May 15 14:02:14 proxmox systemd[1]: Starting The Proxmox VE cluster filesystem...
May 15 14:02:14 proxmox pmxcfs[28843]: [dcdb] notice: wrote new corosync config '/etc/corosync/corosync.conf' (version = 1)
May 15 14:02:14 proxmox pmxcfs[28843]: [dcdb] notice: wrote new corosync config '/etc/corosync/corosync.conf' (version = 1)
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[1] failed: Connection refused
May 15 14:02:14 proxmox pmxcfs[28859]: [quorum] crit: quorum_initialize failed: 2
May 15 14:02:14 proxmox pmxcfs[28859]: [quorum] crit: can't initialize service
May 15 14:02:14 proxmox pmxcfs[28859]: [confdb] crit: cmap_initialize failed: 2
May 15 14:02:14 proxmox pmxcfs[28859]: [confdb] crit: can't initialize service
May 15 14:02:14 proxmox pmxcfs[28859]: [dcdb] crit: cpg_initialize failed: 2
May 15 14:02:14 proxmox pmxcfs[28859]: [dcdb] crit: can't initialize service
May 15 14:02:14 proxmox pmxcfs[28859]: [status] crit: cpg_initialize failed: 2
May 15 14:02:14 proxmox pmxcfs[28859]: [status] crit: can't initialize service
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[2] failed: Connection refused
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[3] failed: Connection refused
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[1] failed: Connection refused
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[2] failed: Connection refused
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[3] failed: Connection refused
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[1] failed: Connection refused
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[2] failed: Connection refused
May 15 14:02:14 proxmox pveproxy[64100]: ipcc_send_rec[3] failed: Connection refused
May 15 14:02:16 proxmox systemd[1]: Started The Proxmox VE cluster filesystem.
May 15 14:02:16 proxmox systemd[1]: Starting Corosync Cluster Engine...
May 15 14:02:16 proxmox corosync[28883]: [MAIN ] Corosync Cluster Engine ('2.4.2-dirty'): started and ready to provide service.
May 15 14:02:16 proxmox corosync[28883]: notice [MAIN ] Corosync Cluster Engine ('2.4.2-dirty'): started and ready to provide service.
May 15 14:02:16 proxmox corosync[28883]: info [MAIN ] Corosync built-in features: dbus rdma monitoring watchdog augeas systemd upstart xmlconf qdevices qnetd snmp pie relro bindnow
May 15 14:02:16 proxmox corosync[28883]: [MAIN ] Corosync built-in features: dbus rdma monitoring watchdog augeas systemd upstart xmlconf qdevices qnetd snmp pie relro bindnow
May 15 14:02:16 proxmox corosync[28883]: notice [TOTEM ] Initializing transport (UDP/IP Multicast).
May 15 14:02:16 proxmox corosync[28883]: notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
May 15 14:02:16 proxmox corosync[28883]: [TOTEM ] Initializing transport (UDP/IP Multicast).
May 15 14:02:16 proxmox corosync[28883]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
May 15 14:02:16 proxmox corosync[28883]: notice [TOTEM ] The network interface is down.
May 15 14:02:16 proxmox corosync[28883]: [TOTEM ] The network interface is down.
May 15 14:02:16 proxmox corosync[28883]: notice [SERV ] Service engine loaded: corosync configuration map access [0]
May 15 14:02:16 proxmox corosync[28883]: info [QB ] server name: cmap
May 15 14:02:16 proxmox corosync[28883]: notice [SERV ] Service engine loaded: corosync configuration service [1]
May 15 14:02:16 proxmox corosync[28883]: info [QB ] server name: cfg
May 15 14:02:16 proxmox corosync[28883]: notice [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
May 15 14:02:16 proxmox corosync[28883]: info [QB ] server name: cpg
May 15 14:02:16 proxmox corosync[28883]: notice [SERV ] Service engine loaded: corosync profile loading service [4]
May 15 14:02:16 proxmox corosync[28883]: [SERV ] Service engine loaded: corosync configuration map access [0]
May 15 14:02:16 proxmox corosync[28883]: notice [SERV ] Service engine loaded: corosync resource monitoring service [6]
May 15 14:02:16 proxmox corosync[28883]: warning [WD ] Watchdog /dev/watchdog exists but couldn't be opened.
May 15 14:02:16 proxmox corosync[28883]: warning [WD ] resource load_15min missing a recovery key.
May 15 14:02:16 proxmox corosync[28883]: warning [WD ] resource memory_used missing a recovery key.
May 15 14:02:16 proxmox corosync[28883]: info [WD ] no resources configured.
May 15 14:02:16 proxmox corosync[28883]: notice [SERV ] Service engine loaded: corosync watchdog service [7]
May 15 14:02:16 proxmox corosync[28883]: notice [QUORUM] Using quorum provider corosync_votequorum
May 15 14:02:16 proxmox corosync[28883]: crit [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
May 15 14:02:16 proxmox corosync[28883]: error [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
May 15 14:02:16 proxmox corosync[28883]: error [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.
May 15 14:02:16 proxmox corosync[28883]: [QB ] server name: cmap
May 15 14:02:16 proxmox corosync[28883]: [SERV ] Service engine loaded: corosync configuration service [1]
May 15 14:02:16 proxmox corosync[28883]: [QB ] server name: cfg
May 15 14:02:16 proxmox corosync[28883]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
May 15 14:02:16 proxmox corosync[28883]: [QB ] server name: cpg
May 15 14:02:16 proxmox corosync[28883]: [SERV ] Service engine loaded: corosync profile loading service [4]
May 15 14:02:16 proxmox corosync[28883]: [SERV ] Service engine loaded: corosync resource monitoring service [6]
May 15 14:02:16 proxmox corosync[28883]: [WD ] Watchdog /dev/watchdog exists but couldn't be opened.
May 15 14:02:16 proxmox corosync[28883]: [WD ] resource load_15min missing a recovery key.
May 15 14:02:16 proxmox corosync[28883]: [WD ] resource memory_used missing a recovery key.
May 15 14:02:16 proxmox corosync[28883]: [WD ] no resources configured.
May 15 14:02:16 proxmox corosync[28883]: [SERV ] Service engine loaded: corosync watchdog service [7]
May 15 14:02:16 proxmox corosync[28883]: [QUORUM] Using quorum provider corosync_votequorum
May 15 14:02:16 proxmox corosync[28883]: [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
May 15 14:02:16 proxmox corosync[28883]: [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
May 15 14:02:16 proxmox corosync[28883]: [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.
May 15 14:02:16 proxmox systemd[1]: corosync.service: Main process exited, code=exited, status=20/n/a
May 15 14:02:16 proxmox systemd[1]: Failed to start Corosync Cluster Engine.
May 15 14:02:16 proxmox systemd[1]: corosync.service: Unit entered failed state.
May 15 14:02:16 proxmox systemd[1]: corosync.service: Failed with result 'exit-code'.
May 15 14:02:17 proxmox pvestatd[1505]: unable to activate storage 'local' - directory is expected to be a mount point but is not mounted: '/var/lib/vz'
May 15 14:02:18 proxmox pve-ha-lrm[56449]: status change active => lost_agent_lock
May 15 14:02:18 proxmox pve-ha-crm[56576]: lost lock 'ha_manager_lock - cfs lock update failed - Permission denied
May 15 14:02:20 proxmox pmxcfs[28859]: [quorum] crit: quorum_initialize failed: 2
May 15 14:02:20 proxmox pmxcfs[28859]: [confdb] crit: cmap_initialize failed: 2
May 15 14:02:20 proxmox pmxcfs[28859]: [dcdb] crit: cpg_initialize failed: 2
May 15 14:02:20 proxmox pmxcfs[28859]: [status] crit: cpg_initialize failed: 2
May 15 14:02:23 proxmox pve-ha-crm[56576]: status change master => lost_manager_lock
May 15 14:02:23 proxmox pve-ha-crm[56576]: watchdog closed (disabled)
May 15 14:02:23 proxmox pve-ha-crm[56576]: status change lost_manager_lock => wait_for_quorum
May 15 14:02:26 proxmox pmxcfs[28859]: [quorum] crit: quorum_initialize failed: 2
May 15 14:02:26 proxmox pmxcfs[28859]: [confdb] crit: cmap_initialize failed: 2
May 15 14:02:26 proxmox pmxcfs[28859]: [dcdb] crit: cpg_initialize failed: 2
May 15 14:02:26 proxmox pmxcfs[28859]: [status] crit: cpg_initialize failed: 2
May 15 14:02:27 proxmox pvestatd[1505]: unable to activate storage 'local' - directory is expected to be a mount point but is not mounted: '/var/lib/vz'
May 15 14:02:32 proxmox pmxcfs[28859]: [quorum] crit: quorum_initialize failed: 2
May 15 14:02:32 proxmox pmxcfs[28859]: [confdb] crit: cmap_initialize failed: 2
May 15 14:02:32 proxmox pmxcfs[28859]: [dcdb] crit: cpg_initialize failed: 2
May 15 14:02:32 proxmox pmxcfs[28859]: [status] crit: cpg_initialize failed: 2
May 15 14:02:37 proxmox pvestatd[1505]: unable to activate storage 'local' - directory is expected to be a mount point but is not mounted: '/var/lib/vz'
May 15 14:02:38 proxmox pmxcfs[28859]: [quorum] crit: quorum_initialize failed: 2
May 15 14:02:38 proxmox pmxcfs[28859]: [confdb] crit: cmap_initialize failed: 2
May 15 14:02:38 proxmox pmxcfs[28859]: [dcdb] crit: cpg_initialize failed: 2
May 15 14:02:38 proxmox pmxcfs[28859]: [status] crit: cpg_initialize failed: 2
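Since the log is long and repetitive, here is a minimal Python sketch for reducing it to the unique crit/error entries. The function name `critical_lines` and the regular expression are my own assumptions about the line layout seen above (timestamp, hostname, process tag, then an optional section tag and a severity word). It is not a Proxmox or corosync tool.

```python
import re

# Assumed layout: "<process>[pid]: [section] crit: ..." (pmxcfs)
# or "<process>[pid]: crit|error [SECTION] ..." (corosync).
SEVERITY = re.compile(r"\]: (?:\[\w+\] )?(crit|error)\b")

def critical_lines(log_text):
    """Return unique crit/error messages in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        if not SEVERITY.search(line):
            continue
        # Drop the "May 15 14:02:16 proxmox " prefix (date, time, hostname).
        message = line.split(" ", 4)[-1]
        if message not in seen:
            seen.append(message)
    return seen
```

Applied to the excerpt above, the entry that stands out is corosync failing to load 'corosync_quorum' with 'configuration error: nodelist or quorum.expected_votes must be configured!', after which pmxcfs keeps repeating its quorum_initialize, cmap_initialize and cpg_initialize failures.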
Any idea what could be causing this?