Good afternoon.
Our company runs a Proxmox-based cluster with purchased licenses for 6 nodes, but we have always had problems with HA: from time to time virtual machines stop migrating between nodes. When we try to migrate from the console, we get:
root@cluster-1-1:/var/log# qm migrate 185 cluster-1-3 --online
Executing HA migrate for VM 185 to node cluster-1-3
Trying to migrate pvevm:185 to cluster-1-3...Target node dead / nonexistent
command 'clusvcadm -M pvevm:185 -m cluster-1-3' failed: exit code 244
root@cluster-1-1:/var/log#
root@cluster-1-1:/var/log# clustat |more
Cluster Status for freecluster01 @ Thu Mar 13 11:34:30 2014
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
cluster-1-1 1 Online, Local, rgmanager
cluster-1-2 2 Online, rgmanager
cluster-1-3 3 Online
cluster-1-4 4 Online, rgmanager
cluster-1-5 5 Online, rgmanager
cluster-1-6 6 Online, rgmanager
root@cluster-1-2:~# clustat |more
Cluster Status for freecluster01 @ Thu Mar 13 11:34:45 2014
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
cluster-1-1 1 Online, rgmanager
cluster-1-2 2 Online, Local, rgmanager
cluster-1-3 3 Online, rgmanager
cluster-1-4 4 Online, rgmanager
cluster-1-5 5 Online, rgmanager
cluster-1-6 6 Online, rgmanager
Service Name Owner (Last) State
root@cluster-1-3:~# clustat |more
Cluster Status for freecluster01 @ Thu Mar 13 11:34:03 2014
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
cluster-1-1 1 Online, rgmanager
cluster-1-2 2 Online, rgmanager
cluster-1-3 3 Online, Local, rgmanager
cluster-1-4 4 Online, rgmanager
cluster-1-5 5 Online, rgmanager
cluster-1-6 6 Online, rgmanager
Service Name Owner (Last) State
Unfortunately, we could not find any error reports in the log files. As can be seen above, for some reason node 1-1 does not see rgmanager on node 1-3, although the other nodes see everything. Could you advise how this can be diagnosed? We would very much like our HA problems to end and the cluster to start working properly, since we plan to purchase 6 more licenses.
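To make the mismatch easier to spot on each node, something like the following sketch might help. It only filters clustat output in the format shown above (name, numeric ID, status flags) and prints online members that lack the rgmanager flag. The function name missing_rgmanager is made up for illustration, and the field layout is an assumption based on our own output.

```shell
# Hypothetical helper: reads clustat output on stdin and prints the names
# of members that are listed as Online but without the rgmanager flag.
# Assumes member lines look like: "<name> <numeric id> Online[, flags...]".
missing_rgmanager() {
    awk '$2 ~ /^[0-9]+$/ && /Online/ && !/rgmanager/ { print $1 }'
}

# Intended use on a node (not executed here):
#   clustat | missing_rgmanager
```

Run on cluster-1-1 with the output above, this should print only cluster-1-3, while on the other nodes it should print nothing.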
root@cluster-1-1:/var/log# group_tool ls
fence domain
member count 6
victim count 0
victim now 0
master nodeid 1
wait state none
members 1 2 3 4 5 6
dlm lockspaces
name rgmanager
id 0x5231f3eb
flags 0x00000000
change member 6 joined 1 remove 0 failed 0 seq 13,13
members 1 2 3 4 5 6
name storage02
id 0x72bc4bef
flags 0x00000008 fs_reg
change member 6 joined 1 remove 0 failed 0 seq 5,5
members 1 2 3 4 5 6
name storage01
id 0x5991182c
flags 0x00000008 fs_reg
change member 6 joined 1 remove 0 failed 0 seq 5,5
members 1 2 3 4 5 6
name clvmd
id 0x4104eefa
flags 0x00000000
change member 6 joined 1 remove 0 failed 0 seq 9,9
members 1 2 3 4 5 6
gfs mountgroups
name storage02
id 0xa14c9488
flags 0x00000008 mounted
change member 6 joined 1 remove 0 failed 0 seq 5,5
members 1 2 3 4 5 6
name storage01
id 0x8a61c74b
flags 0x00000008 mounted
change member 6 joined 1 remove 0 failed 0 seq 5,5
members 1 2 3 4 5 6
root@cluster-1-1:/var/log#