Hello,
I have two servers in a cluster, one HP and one Dell, both running Proxmox 4.1-1.
Yesterday, as a test, I configured HA from the web GUI, creating groups and adding the virtual machines to them.
Tonight the HP server (the one shown as "master" in the HA tab) had a weird crash on a machine, which made the entire system unstable; I've attached the kernel log.
Because of the instability, I had to force a reboot through ACPI, and this caused the Dell server to reboot itself as well, without any apparent error.
Once the HP and Dell servers came back up, they started all the VMs and the system was stable again.
To inspect the HP machine and try to understand the issue, I moved the VMs from the HP to the Dell and removed all HA groups. But again, when I did a simple reboot of the HP server, the Dell server rebooted too.
I checked the log on the Dell server and found this:
Feb 24 10:35:58 vmsrv02 corosync[1299]: [TOTEM ] A new membership (172.16.254.2:24) was formed. Members left: 2
Feb 24 10:35:58 vmsrv02 corosync[1299]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Feb 24 10:35:58 vmsrv02 corosync[1299]: [QUORUM] Members[1]: 1
Feb 24 10:35:58 vmsrv02 corosync[1299]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 24 10:35:58 vmsrv02 pmxcfs[1173]: [status] notice: node lost quorum
Feb 24 10:36:00 vmsrv02 pve-ha-crm[1326]: status change slave => wait_for_quorum
Feb 24 10:36:11 vmsrv02 pve-ha-lrm[1334]: status change active => lost_agent_lock
Feb 24 10:36:32 vmsrv02 pvedaemon[1321]: <root@pam> successful auth for user 'root@pam'
Feb 24 10:36:35 vmsrv02 pveproxy[5537]: proxy detected vanished client connection
Feb 24 10:36:57 vmsrv02 watchdog-mux[1047]: client watchdog expired - disable watchdog updates
Feb 24 10:39:05 vmsrv02 rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="1076" x-info="http://www.rsyslog.com"] start
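To make the sequence in the log easier to read, here is a rough sketch that turns a few of the lines above into offsets from the moment quorum was lost. The helper names `parse` and `timeline` are made up for illustration, and since syslog lines carry no year, the year used is only a placeholder assumption. It shows elapsed time between events and nothing more; it does not explain the cause.

```python
from datetime import datetime

# Four lines copied from the Dell server log above.
LOG = """\
Feb 24 10:35:58 vmsrv02 pmxcfs[1173]: [status] notice: node lost quorum
Feb 24 10:36:00 vmsrv02 pve-ha-crm[1326]: status change slave => wait_for_quorum
Feb 24 10:36:11 vmsrv02 pve-ha-lrm[1334]: status change active => lost_agent_lock
Feb 24 10:36:57 vmsrv02 watchdog-mux[1047]: client watchdog expired - disable watchdog updates
"""

PLACEHOLDER_YEAR = "2016"  # assumption: the log itself does not state the year


def parse(line):
    """Split one syslog line into (timestamp, message). Hypothetical helper."""
    stamp = " ".join(line.split()[:3])  # e.g. "Feb 24 10:35:58"
    ts = datetime.strptime(PLACEHOLDER_YEAR + " " + stamp, "%Y %b %d %H:%M:%S")
    message = line.split(": ", 1)[1]    # text after "process[pid]: "
    return ts, message


def timeline(log):
    """Return (seconds since the first event, message) for each line."""
    events = [parse(l) for l in log.splitlines() if l.strip()]
    start = events[0][0]
    return [(int((ts - start).total_seconds()), msg) for ts, msg in events]


if __name__ == "__main__":
    for offset, msg in timeline(LOG):
        print(f"+{offset:3d}s  {msg}")
```

With these four lines, the watchdog expiry is logged 59 seconds after the quorum loss, and the next entry in the log is rsyslogd starting again at 10:39:05.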
Why did that happen? Also, I haven't found a way to disable HA and keep only the cluster active while I prepare the third machine.
Thank you so much for your support.
Regards