Guys,
here I am again.
After some network troubleshooting I needed to reload the network and the server itself.
After that I lost my whole cluster: I have 3 servers and currently they are unable to see each other...
I tried to reload everything...
service pve-cluster restart <--- Nothing changed.
systemctl stop pve-cluster
systemctl stop corosync
systemctl start pve-cluster <----- An error occurred.
root@Server1:/etc# systemctl start pve-cluster
Job for pve-cluster.service failed because the control process exited with error code.
See "systemctl status pve-cluster.service" and "journalctl -xe" for details.
root@Server1:/etc# systemctl status pve-cluster.service
● pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2017-08-10 10:51:23 CEST; 29s ago
Process: 30477 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
Process: 30735 ExecStart=/usr/bin/pmxcfs $DAEMON_OPTS (code=exited, status=255)
Main PID: 30475 (code=exited, status=0/SUCCESS)
Aug 10 10:51:13 Server1 systemd[1]: Starting The Proxmox VE cluster filesystem...
Aug 10 10:51:13 Server1 pmxcfs[30735]: [main] notice: unable to aquire pmxcfs lock - trying again
Aug 10 10:51:13 Server1 pmxcfs[30735]: [main] notice: unable to aquire pmxcfs lock - trying again
Aug 10 10:51:23 Server1 pmxcfs[30735]: [main] crit: unable to aquire pmxcfs lock: Resource temporarily unavailable
Aug 10 10:51:23 Server1 pmxcfs[30735]: [main] crit: unable to aquire pmxcfs lock: Resource temporarily unavailable
Aug 10 10:51:23 Server1 pmxcfs[30735]: [main] notice: exit proxmox configuration filesystem (-1)
Aug 10 10:51:23 Server1 systemd[1]: pve-cluster.service: Control process exited, code=exited status=255
Aug 10 10:51:23 Server1 systemd[1]: Failed to start The Proxmox VE cluster filesystem.
Aug 10 10:51:23 Server1 systemd[1]: pve-cluster.service: Unit entered failed state.
Aug 10 10:51:23 Server1 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
root@Server1:/etc# systemctl status corosync
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2017-08-10 10:51:23 CEST; 6min ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 30752 (corosync)
Tasks: 2 (limit: 4915)
CGroup: /system.slice/corosync.service
└─30752 /usr/sbin/corosync -f
Aug 10 10:51:23 Server1 corosync[30752]: notice [MAIN ] Completed service synchronization, ready to provide service.
Aug 10 10:51:23 Server1 corosync[30752]: [QUORUM] This node is within the primary component and will provide service.
Aug 10 10:51:23 Server1 corosync[30752]: [QUORUM] Members[2]: 1 3
Aug 10 10:51:23 Server1 corosync[30752]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 10 10:51:25 Server1 corosync[30752]: notice [TOTEM ] A new membership (192.168.100.11:108) was formed. Members joi
Aug 10 10:51:25 Server1 corosync[30752]: [TOTEM ] A new membership (192.168.100.11:108) was formed. Members joined: 2
Aug 10 10:51:25 Server1 corosync[30752]: notice [QUORUM] Members[3]: 1 2 3
Aug 10 10:51:25 Server1 corosync[30752]: notice [MAIN ] Completed service synchronization, ready to provide service.
Aug 10 10:51:25 Server1 corosync[30752]: [QUORUM] Members[3]: 1 2 3
Aug 10 10:51:25 Server1 corosync[30752]: [MAIN ] Completed service synchronization, ready to provide service.