Trouble with VM migration in a two-node cluster environment

italian01

Hello to all readers.

I have a problem with a VM migration. I have a two-node cluster configuration, and VM 100 on the node named PROXMOX needs to be migrated to the node named PROXMOX2. The migration produced the following log:

Code:
May 14 16:35:20 starting migration of VM 100 to node 'proxmox2' (192.168.5.251)
May 14 16:35:20 copying disk images
vm-100-disk-1.raw

rsync status: 32768   0%    0.00kB/s    0:00:00  
rsync status: 92831744   1%   28.83MB/s    0:03:35  
rsync status: 139231232   2%   16.61MB/s    0:06:10  
[...]
[...]
rsync status: 6394380288  99%   49.10MB/s    0:00:00  
rsync status: 6442450944 100%   25.50MB/s    0:04:00 (xfer#1, to-check=0/1)

sent 6443237455 bytes  received 31 bytes  26680072.41 bytes/sec
total size is 6442450944  speedup is 1.00
May 14 16:39:30 ERROR: Failed to move config to node 'proxmox2' - rename failed: Device or resource busy
May 14 16:39:30 aborting phase 1 - cleanup resources
May 14 16:39:36 ERROR: unable to open file '/etc/pve/nodes/proxmox/qemu-server/100.conf.tmp.2339' - Device or resource busy
May 14 16:39:36 ERROR: found stale volume copy 'local:100/vm-100-disk-1.raw' on node 'proxmox2'
May  14 16:39:36 ERROR: migration aborted (duration 00:04:16): Failed to  move config to node 'proxmox2' - rename failed: Device or resource busy
TASK ERROR: migration aborted

What do I need to do to resolve this issue?

Regards.
 
Hello Dietmar,

first of all, thank you for your help.

Let me explain my environment.

I have two different PCs with Proxmox installed on them. On the first one I created a Windows Server 2003 VM, identified by VMID 100. Everything was working fine until I decided to move it to a more powerful machine, and I wanted to do that with the migration feature of the cluster configuration.

So I set up the second node and followed the instructions written here to build my cluster. After that, I got an unstable cluster where the nodes never saw each other as active... I searched a lot, learned about the quorum concept, and found the trick for bypassing it in a two-node setup; that is, adding the following line

<cman expected_votes="1" two_node="1" />

to the /etc/cluster/cluster.conf file. But the good result lasted only a brief moment: four or five minutes after restarting the CMAN service on both nodes, one of the two nodes lost the other and logged the following message:

May 15 09:16:55 proxmox2 pmxcfs[1345]: [status] crit: cpg_send_message failed: 9
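
For reference, this is roughly how I applied and checked the override (just a sketch of what I ran; the grep filter is only mine, to shorten the output):

Code:
# edit /etc/cluster/cluster.conf on both nodes to add the <cman .../> line above,
# then restart the cluster manager and check the quorum state
/etc/init.d/cman restart
pvecm status | grep -E 'Expected votes|Total votes|Quorum|Flags'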

Anyway, given that, I took advantage of that brief initial window to issue the migrate command... getting the result I reported above.
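
For completeness, the equivalent offline migration from the command line would be roughly the following (VMID and target node as above):

Code:
# offline migration of VM 100 to the node 'proxmox2' (sketch)
qm migrate 100 proxmox2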

There is another issue. While I was waiting for an answer to my question, I tried to migrate the VM back to the old node (PROXMOX2 -> PROXMOX). So I restarted the CMAN service on both nodes and did it. The logs are:


Code:
May 14 18:13:27 starting migration of VM 100 to node 'proxmox' (192.168.5.250)
May 14 18:13:27 copying disk images
vm-100-disk-1.raw

rsync status: 32768   0%    0.00kB/s    0:00:00  
rsync status: 118915072   1%   56.72MB/s    0:01:48  
rsync status: 147357696   2%   46.85MB/s    0:02:11  
[...]
[...]
rsync status: 6379175936  99%   57.05MB/s    0:00:01  
rsync status: 6442450944 100%   49.02MB/s    0:02:05 (xfer#1, to-check=0/1)

sent 6443237455 bytes  received 31 bytes  50934683.68 bytes/sec
total size is 6442450944  speedup is 1.00
May 14 18:18:49 migration finished successfuly (duration 00:05:23)
TASK OK


It seemed that everything had gone fine; in fact, the VM started up correctly. But I am not able to connect to it via the console, and the log I got is:


Code:
no connection : Connection timed out
TASK ERROR: command '/bin/nc -l -p 5900 -w 10 -c '/usr/bin/ssh -T -o BatchMode=yes -c blowfish-cbc 192.168.5.251 /usr/sbin/qm vncproxy 100 2>/dev/null'' failed: exit code 1

From bad to worse! Help me, please...

Anyway, coming back to your question, here is my current state on both nodes:


Code:
root@proxmox:~# pvecm status

Version: 6.2.0
Config Version: 3
Cluster Name: ProxMCls
Cluster Id: 24223
Cluster Member: Yes
Cluster Generation: 592
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 1
Active subsystems: 5
Flags: 2node
Ports Bound: 0
Node name: proxmox
Node ID: 1
Multicast addresses: 239.192.94.253
Node addresses: 192.168.5.250

root@proxmox2:~#  pvecm status
cman_tool: Cannot open connection to cman, is it running ?


And if I restart the CMAN service on PROXMOX2, the state changes to...


Code:
Starting cluster: 
   Checking if cluster has been disabled at boot... [  OK  ] 
   Checking Network Manager... [  OK  ] 
   Global setup... [  OK  ] 
   Loading kernel modules... [  OK  ] 
   Mounting configfs... [  OK  ] 
   Starting cman... Relax-NG validity error : Extra element cman in interleave
tempfile:4: element cman: Relax-NG validity error : Element cluster failed to validate content
Configuration fails to validate
[  OK  ] 
   Waiting for quorum... [  OK  ] 
   Starting fenced... [  OK  ] 
   Starting dlm_controld... [  OK  ] 
   Unfencing self... [  OK  ] 
TASK OK

root@proxmox:~# pvecm status
Version: 6.2.0
Config Version: 3
Cluster Name: ProxMCls
Cluster Id: 24223
Cluster Member: Yes
Cluster Generation: 592
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 1
Active subsystems: 5
Flags: 2node
Ports Bound: 0
Node name: proxmox
Node ID: 1
Multicast addresses: 239.192.94.253
Node addresses: 192.168.5.250


root@proxmox2:~#  pvecm status
Version: 6.2.0
Config Version: 3
Cluster Name: ProxMCls
Cluster Id: 24223
Cluster Member: Yes
Cluster Generation: 624
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 1
Active subsystems: 5
Flags: 2node
Ports Bound: 0
Node name: proxmox2
Node ID: 2
Multicast addresses: 239.192.94.253
Node addresses: 192.168.5.251

Regards
 
You managed to create an invalid cluster configuration file:

Code:
Starting cman... Relax-NG validity error : Extra element cman ...

Please correct that.
 
Very good, Tom.

The cluster is working! ;o) Although I should mention that I had already tested multicast when I first set up the cluster, and it was working then. It seems very strange that it is not working now...

Anyway, I now have to overcome the issue with the VM. Do you have any idea?

I can now migrate the VM from one node to the other without errors, and I can start it up fine. Yet I still cannot connect to it via the Console. Is there any way to debug what is happening inside the VM?


Code:
no connection : Connection timed out
TASK ERROR: command '/bin/nc -l -p 5900 -w 10 -c '/usr/sbin/qm vncproxy 100 2>/dev/null'' failed: exit code 1


Best Regards.
 
You managed to create an invalid cluster configuration file:

Code:
Starting cman... Relax-NG validity error : Extra element cman ...

Please correct that.

Question: which cluster.conf do I have to modify?

that one in /etc/pve/

or

that one in /etc/cluster/

?

Regards.
 
Question: which cluster.conf do I have to modify?

that one in /etc/pve/

normally you edit /etc/pve/cluster.conf.new (copy /etc/pve/cluster.conf).

You can then use the GUI to commit the changes (does syntax checks).

or

that one in /etc/cluster/

No, that one is normally written by Proxmox. But you changed that file manually, so I guess you need to revert those changes.
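
As a rough sketch of the whole sequence (adapt the editor to your taste; the commands assume you have quorum so that /etc/pve is writable):

Code:
# make a working copy of the active cluster config
cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
# edit the copy: fix the <cman> element and increase config_version
nano /etc/pve/cluster.conf.new
# then commit the new file from the web GUI, which runs the syntax checks before activating it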
 
normally you edit /etc/pve/cluster.conf.new (copy /etc/pve/cluster.conf).

You can then use the GUI to commit the changes (does syntax checks).

But I cannot write in the /etc/pve directory:

Code:
root@proxmox2:/etc/pve# cp cluster.conf cluster.conf.new
cp: cannot create regular file `cluster.conf.new': Device or resource busy

What can I do?

Regards.
 
You need to revert the changes in /etc/cluster/cluster.conf and then restart cman.
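
Something along these lines (a sketch; it assumes you still have a copy of the previous working file, and the backup path here is only an example):

Code:
# restore the last known-good config (example path for the backup copy)
cp /root/cluster.conf.backup /etc/cluster/cluster.conf
# restart the cluster manager so the reverted config is loaded
/etc/init.d/cman restart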

Here I am again.

Dietmar, Tom: I did what you suggested, but I still cannot get out of this. I am getting a bit frustrated. Anyway, here is my cluster state at the moment.

I reverted my /etc/cluster/cluster.conf, but it kept being replaced by /etc/pve/cluster.conf because both files had the same config_version="2". So I reverted /etc/cluster/cluster.conf again, this time setting its config_version parameter to 3. After that, the file is no longer overwritten, but I still cannot modify anything in the /etc/pve directory. In fact, if I try to create an /etc/pve/cluster.conf.new file, I get:


Code:
root@proxmox:~# cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
cp: cannot create regular file `/etc/pve/cluster.conf.new': Permission denied

root@proxmox:~# ls -lart /etc/pve
total 9
-r--r-----  1 root www-data   89 Jan  1  1970 .vmlist
-r--r-----  1 root www-data  255 Jan  1  1970 .version
-r--r-----  1 root www-data  267 Jan  1  1970 .rrd
lr-xr-x---  1 root www-data    0 Jan  1  1970 qemu-server -> nodes/proxmox/qemu-server
lr-xr-x---  1 root www-data    0 Jan  1  1970 openvz -> nodes/proxmox/openvz
-r--r-----  1 root www-data  233 Jan  1  1970 .members
lr-xr-x---  1 root www-data    0 Jan  1  1970 local -> nodes/proxmox
-rw-r-----  1 root www-data    2 Jan  1  1970 .debug
-r--r-----  1 root www-data 8115 Jan  1  1970 .clusterlog
drwxr-x---  2 root www-data    0 Jan  1  1970 .
dr-x------  2 root www-data    0 Mar  9 16:21 priv
dr-xr-x---  2 root www-data    0 Mar  9 16:21 nodes
-r--r-----  1 root www-data  451 Mar  9 16:21 authkey.pub
-r--r-----  1 root www-data  119 Mar  9 16:21 vzdump.cron
-r--r-----  1 root www-data 1675 Mar  9 16:21 pve-www.key
-r--r-----  1 root www-data 1533 Mar  9 16:21 pve-root-ca.pem
-r--r-----  1 root www-data  119 Mar 14 09:49 storage.cfg
-r--r-----  1 root www-data  205 Mar 28 17:02 domains.cfg
-r--r-----  1 root www-data   77 Mar 29 15:20 datacenter.cfg
-r--r-----  1 root www-data  708 Apr  2 09:41 user.cfg
-r--r-----  1 root www-data  236 May  3 09:45 cluster.conf.old
-r--r-----  1 root www-data  339 May 15 11:10 cluster.conf
drwxr-xr-x 80 root root     4096 May 15 18:38 ..


What do I need to do now?

Regards.
 
You do not have quorum, so please try:

# pvecm expected 1

After that you should be able to write the files.
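
A quick way to check (just a sketch):

Code:
pvecm expected 1
# if quorum is reported again, creating and removing a test file in /etc/pve should now succeed
touch /etc/pve/writetest && rm /etc/pve/writetest && echo "/etc/pve is writable"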
 
You do not have quorum, so please try:

# pvecm expected 1

After that you should be able to write the files.


Good, I did it. And this is my /etc/pve/cluster.conf.new now:

Code:
<?xml version="1.0"?>
<cluster name="ProxMCls" config_version="3">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu" />
  <cman expected_votes="1" two_node="1" />

  <clusternodes>
  <clusternode name="proxmox" votes="1" nodeid="1"/>
  <clusternode name="proxmox2" votes="1" nodeid="2"/></clusternodes>

</cluster>

At this point, the GUI tells me:

config validation failed: unknown error (500)

Why does this happen, please?

Regards.

 
attach your original file (as zip).
 
use this line:

Code:
<cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu" expected_votes="1" two_node="1"/>
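
The whole file would then look roughly like this (a sketch based on the fragment you posted; only a single <cman> element is allowed, and config_version has to be increased so the change gets committed):

Code:
<?xml version="1.0"?>
<cluster name="ProxMCls" config_version="4">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu" expected_votes="1" two_node="1"/>
  <clusternodes>
    <clusternode name="proxmox" votes="1" nodeid="1"/>
    <clusternode name="proxmox2" votes="1" nodeid="2"/>
  </clusternodes>
</cluster>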
 
use this line:

Code:
<cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu" expected_votes="1" two_node="1"/>

Good, this step is now done as well.

However, the cluster nodes still see each other as offline. Here is the syslog from both nodes:

PROXMOX

Code:
May 16 16:30:37 proxmox kernel: hub 7-0:1.0: USB hub found
May 16 16:30:39 proxmox ntpd[1215]: Listen normally on 7 vmbr0 fe80::ea40:f2ff:fe0d:fc0f UDP 123
May 16 16:30:39 proxmox ntpd[1215]: Listen normally on 8 eth0 fe80::ea40:f2ff:fe0d:fc0f UDP 123
May 16 16:30:39 proxmox postfix/master[1311]: daemon started -- version 2.7.1, configuration /etc/postfix
May 16 16:30:39 proxmox pmxcfs[1322]: [quorum] crit: quorum_initialize failed: 6
May 16 16:30:39 proxmox pmxcfs[1322]: [quorum] crit: can't initialize service
May 16 16:30:39 proxmox pmxcfs[1322]: [confdb] crit: confdb_initialize failed: 6
May 16 16:30:39 proxmox pmxcfs[1322]: [quorum] crit: can't initialize service
May 16 16:30:39 proxmox pmxcfs[1322]: [dcdb] crit: cpg_initialize failed: 6
May 16 16:30:39 proxmox pmxcfs[1322]: [quorum] crit: can't initialize service
May 16 16:30:39 proxmox pmxcfs[1322]: [dcdb] crit: cpg_initialize failed: 6
May 16 16:30:39 proxmox pmxcfs[1322]: [quorum] crit: can't initialize service
May 16 16:30:40 proxmox kernel: DLM (built Apr 11 2012 07:08:13) installed
May 16 16:30:40 proxmox /usr/sbin/cron[1369]: (CRON) INFO (pidfile fd = 3)
May 16 16:30:40 proxmox /usr/sbin/cron[1376]: (CRON) STARTUP (fork ok)
May 16 16:30:40 proxmox /usr/sbin/cron[1376]: (CRON) INFO (Running @reboot jobs)
May 16 16:30:45 proxmox pmxcfs[1322]: [quorum] crit: quorum_initialize failed: 6
May 16 16:30:45 proxmox pmxcfs[1322]: [confdb] crit: confdb_initialize failed: 6
May 16 16:30:45 proxmox pmxcfs[1322]: [dcdb] crit: cpg_initialize failed: 6
May 16 16:30:45 proxmox pmxcfs[1322]: [dcdb] crit: cpg_initialize failed: 6
May 16 16:30:47 proxmox kernel: eth0: no IPv6 routers present
May 16 16:30:47 proxmox kernel: vmbr0: no IPv6 routers present
May 16 16:30:51 proxmox pmxcfs[1322]: [quorum] crit: quorum_initialize failed: 6
May 16 16:30:51 proxmox pmxcfs[1322]: [confdb] crit: confdb_initialize failed: 6
May 16 16:30:51 proxmox pmxcfs[1322]: [dcdb] crit: cpg_initialize failed: 6
May 16 16:30:51 proxmox pmxcfs[1322]: [dcdb] crit: cpg_initialize failed: 6
May 16 16:30:53 proxmox corosync[1442]: [MAIN ] Corosync Cluster Engine ('1.4.3'): started and ready to provide service.
May 16 16:30:53 proxmox corosync[1442]: [MAIN ] Corosync built-in features: nss
May 16 16:30:53 proxmox corosync[1442]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
May 16 16:30:53 proxmox corosync[1442]: [MAIN ] Successfully parsed cman config
May 16 16:30:53 proxmox corosync[1442]: [MAIN ] Successfully configured openais services to load
May 16 16:30:53 proxmox corosync[1442]: [TOTEM ] Initializing transport (UDP/IP Unicast).
May 16 16:30:53 proxmox corosync[1442]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 16 16:30:53 proxmox corosync[1442]: [TOTEM ] The network interface [192.168.5.250] is now up.
May 16 16:30:54 proxmox corosync[1442]: [QUORUM] Using quorum provider quorum_cman
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
May 16 16:30:54 proxmox corosync[1442]: [CMAN ] CMAN 1324544458 (built Dec 22 2011 10:01:01) started
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: openais cluster membership service B.01.01
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: openais event service B.01.01
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: openais checkpoint service B.01.01
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: openais message service B.03.01
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: openais distributed locking service B.03.01
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: openais timer service A.01.01
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: corosync configuration service
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: corosync profile loading service
May 16 16:30:54 proxmox corosync[1442]: [QUORUM] Using quorum provider quorum_cman
May 16 16:30:54 proxmox corosync[1442]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
May 16 16:30:54 proxmox corosync[1442]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
May 16 16:30:54 proxmox corosync[1442]: [TOTEM ] Could not create socket for new member: Address family not supported by protocol (97)
May 16 16:30:54 proxmox corosync[1442]: [CLM ] CLM CONFIGURATION CHANGE
May 16 16:30:54 proxmox corosync[1442]: [CLM ] New Configuration:
May 16 16:30:54 proxmox corosync[1442]: [CLM ] Members Left:
May 16 16:30:54 proxmox corosync[1442]: [CLM ] Members Joined:
May 16 16:30:54 proxmox corosync[1442]: [CLM ] CLM CONFIGURATION CHANGE
May 16 16:30:54 proxmox corosync[1442]: [CLM ] New Configuration:
May 16 16:30:54 proxmox corosync[1442]: [CLM ] #011r(0) ip(192.168.5.250)
May 16 16:30:54 proxmox corosync[1442]: [CLM ] Members Left:
May 16 16:30:54 proxmox corosync[1442]: [CLM ] Members Joined:
May 16 16:30:54 proxmox corosync[1442]: [CLM ] #011r(0) ip(192.168.5.250)
May 16 16:30:54 proxmox corosync[1442]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 16 16:30:54 proxmox corosync[1442]: [CMAN ] quorum regained, resuming activity
May 16 16:30:54 proxmox corosync[1442]: [QUORUM] This node is within the primary component and will provide service.
May 16 16:30:54 proxmox corosync[1442]: [QUORUM] Members[1]: 1
May 16 16:30:54 proxmox corosync[1442]: [QUORUM] Members[1]: 1
May 16 16:30:54 proxmox corosync[1442]: [CPG ] chosen downlist: sender r(0) ip(192.168.5.250) ; members(old:0 left:0)
May 16 16:30:54 proxmox corosync[1442]: [MAIN ] Completed service synchronization, ready to provide service.
May 16 16:30:57 proxmox fenced[1666]: fenced 1324544458 started
May 16 16:30:57 proxmox dlm_controld[1688]: dlm_controld 1324544458 started
May 16 16:30:57 proxmox pmxcfs[1322]: [status] notice: update cluster info (cluster name ProxMCls, version = 4)
May 16 16:30:57 proxmox pmxcfs[1322]: [status] notice: node has quorum
May 16 16:30:57 proxmox pmxcfs[1322]: [dcdb] notice: members: 1/1322
May 16 16:30:57 proxmox pmxcfs[1322]: [dcdb] notice: all data is up to date
May 16 16:30:57 proxmox pmxcfs[1322]: [dcdb] notice: members: 1/1322
May 16 16:30:57 proxmox pmxcfs[1322]: [dcdb] notice: all data is up to date
May 16 16:30:58 proxmox kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
May 16 16:30:58 proxmox kernel: kvm: Nested Virtualization enabled
May 16 16:30:58 proxmox kernel: kvm: Nested Paging enabled
May 16 16:30:58 proxmox kernel: tun: Universal TUN/TAP device driver, 1.6
May 16 16:30:58 proxmox kernel: tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
May 16 16:30:58 proxmox kernel: ip6_tables: (C) 2000-2006 Netfilter Core Team
May 16 16:30:58 proxmox kernel: RPC: Registered udp transport module.
May 16 16:30:58 proxmox kernel: RPC: Registered tcp transport module.
May 16 16:30:58 proxmox kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
May 16 16:30:58 proxmox kernel: Slow work thread pool: Starting up
May 16 16:30:58 proxmox kernel: Slow work thread pool: Ready
May 16 16:30:58 proxmox kernel: FS-Cache: Loaded
May 16 16:30:58 proxmox kernel: FS-Cache: Netfs 'nfs' registered for caching
May 16 16:30:59 proxmox kernel: nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
May 16 16:30:59 proxmox pvedaemon[1875]: starting server
May 16 16:30:59 proxmox pvedaemon[1875]: starting 3 worker(s)
May 16 16:30:59 proxmox pvedaemon[1875]: worker 1883 started
May 16 16:30:59 proxmox pvedaemon[1875]: worker 1884 started
May 16 16:30:59 proxmox pvedaemon[1875]: worker 1885 started
May 16 16:31:01 proxmox pvestatd[1904]: starting server
May 16 16:31:01 proxmox pvesh: <root@pam> starting task UPID:proxmox:0000077B:00000DA5:4FB3BA25:startall::root@pam:
May 16 16:31:01 proxmox pvesh: <root@pam> end task UPID:proxmox:0000077B:00000DA5:4FB3BA25:startall::root@pam: OK
May 16 16:31:08 proxmox kernel: venet0: no IPv6 routers present
May 16 16:35:39 proxmox ntpd[1215]: Listen normally on 9 venet0 fe80::1 UDP 123


PROXMOX2

Code:
May 16 16:33:53 proxmox2 ntpd[1260]: Listen normally on 6 eth0 fe80::219:99ff:feaf:654c UDP 123
May 16 16:33:53 proxmox2 rrdcached[1307]: starting up
May 16 16:33:53 proxmox2 rrdcached[1307]: checking for journal files
May 16 16:33:53 proxmox2 rrdcached[1307]: started new journal /var/lib/rrdcached/journal//rrd.journal.1337178833.744701
May 16 16:33:53 proxmox2 rrdcached[1307]: journal processing complete
May 16 16:33:53 proxmox2 rrdcached[1307]: listening for connections
May 16 16:33:54 proxmox2 postfix/master[1345]: daemon started -- version 2.7.1, configuration /etc/postfix
May 16 16:33:54 proxmox2 ntpd[1260]: Listen normally on 7 vmbr0 fe80::219:99ff:feaf:654c UDP 123
May 16 16:33:54 proxmox2 pmxcfs[1359]: [quorum] crit: quorum_initialize failed: 6
May 16 16:33:54 proxmox2 pmxcfs[1359]: [quorum] crit: can't initialize service
May 16 16:33:54 proxmox2 pmxcfs[1359]: [confdb] crit: confdb_initialize failed: 6
May 16 16:33:54 proxmox2 pmxcfs[1359]: [quorum] crit: can't initialize service
May 16 16:33:54 proxmox2 pmxcfs[1359]: [dcdb] crit: cpg_initialize failed: 6
May 16 16:33:54 proxmox2 pmxcfs[1359]: [quorum] crit: can't initialize service
May 16 16:33:54 proxmox2 pmxcfs[1359]: [dcdb] crit: cpg_initialize failed: 6
May 16 16:33:54 proxmox2 pmxcfs[1359]: [quorum] crit: can't initialize service
May 16 16:33:55 proxmox2 kernel: DLM (built Apr 11 2012 07:08:13) installed
May 16 16:33:56 proxmox2 corosync[1447]: [MAIN ] Corosync Cluster Engine ('1.4.3'): started and ready to provide service.
May 16 16:33:56 proxmox2 corosync[1447]: [MAIN ] Corosync built-in features: nss
May 16 16:33:56 proxmox2 corosync[1447]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
May 16 16:33:56 proxmox2 corosync[1447]: [MAIN ] Successfully parsed cman config
May 16 16:33:56 proxmox2 corosync[1447]: [MAIN ] Successfully configured openais services to load
May 16 16:33:56 proxmox2 corosync[1447]: [TOTEM ] Initializing transport (UDP/IP Unicast).
May 16 16:33:56 proxmox2 corosync[1447]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 16 16:33:56 proxmox2 corosync[1447]: [TOTEM ] The network interface [192.168.5.251] is now up.
May 16 16:33:56 proxmox2 corosync[1447]: [QUORUM] Using quorum provider quorum_cman
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
May 16 16:33:56 proxmox2 corosync[1447]: [CMAN ] CMAN 1324544458 (built Dec 22 2011 10:01:01) started
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: openais cluster membership service B.01.01
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: openais event service B.01.01
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: openais checkpoint service B.01.01
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: openais message service B.03.01
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: openais distributed locking service B.03.01
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: openais timer service A.01.01
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: corosync configuration service
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: corosync profile loading service
May 16 16:33:56 proxmox2 corosync[1447]: [QUORUM] Using quorum provider quorum_cman
May 16 16:33:56 proxmox2 corosync[1447]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
May 16 16:33:56 proxmox2 corosync[1447]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
May 16 16:33:56 proxmox2 corosync[1447]: [TOTEM ] Could not create socket for new member: Address family not supported by protocol (97)
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] CLM CONFIGURATION CHANGE
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] New Configuration:
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] Members Left:
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] Members Joined:
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] CLM CONFIGURATION CHANGE
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] New Configuration:
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] #011r(0) ip(192.168.5.251)
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] Members Left:
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] Members Joined:
May 16 16:33:56 proxmox2 corosync[1447]: [CLM ] #011r(0) ip(192.168.5.251)
May 16 16:33:56 proxmox2 corosync[1447]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 16 16:33:56 proxmox2 corosync[1447]: [CMAN ] quorum regained, resuming activity
May 16 16:33:56 proxmox2 corosync[1447]: [QUORUM] This node is within the primary component and will provide service.
May 16 16:33:56 proxmox2 corosync[1447]: [QUORUM] Members[1]: 2
May 16 16:33:56 proxmox2 corosync[1447]: [QUORUM] Members[1]: 2
May 16 16:33:56 proxmox2 corosync[1447]: [CPG ] chosen downlist: sender r(0) ip(192.168.5.251) ; members(old:0 left:0)
May 16 16:33:56 proxmox2 corosync[1447]: [MAIN ] Completed service synchronization, ready to provide service.
May 16 16:33:56 proxmox2 /usr/sbin/cron[1491]: (CRON) INFO (pidfile fd = 3)
May 16 16:33:56 proxmox2 /usr/sbin/cron[1492]: (CRON) STARTUP (fork ok)
May 16 16:33:56 proxmox2 /usr/sbin/cron[1492]: (CRON) INFO (Running @reboot jobs)
May 16 16:34:00 proxmox2 fenced[1543]: fenced 1324544458 started
May 16 16:34:00 proxmox2 dlm_controld[1555]: dlm_controld 1324544458 started
May 16 16:34:00 proxmox2 pmxcfs[1359]: [status] notice: update cluster info (cluster name ProxMCls, version = 4)
May 16 16:34:00 proxmox2 pmxcfs[1359]: [status] notice: node has quorum
May 16 16:34:00 proxmox2 pmxcfs[1359]: [dcdb] notice: members: 2/1359
May 16 16:34:00 proxmox2 pmxcfs[1359]: [dcdb] notice: all data is up to date
May 16 16:34:00 proxmox2 pmxcfs[1359]: [dcdb] notice: members: 2/1359
May 16 16:34:00 proxmox2 pmxcfs[1359]: [dcdb] notice: all data is up to date

What is the next challenge?

Thank you for your help.

Regards.
 
post /etc/hosts:

Code:
cat /etc/hosts
 
post /etc/hosts:

Code:
cat /etc/hosts

Good, very good! I understood the point you were getting at, and I have already updated my hosts files with the mutual node entries. So the cluster is up, and everything related to it is working fine.
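
For anyone else who hits this, the entries I added look roughly like the following (the IP addresses are the ones from the logs above; the domain part is just an example for my setup):

Code:
# /etc/hosts on both nodes
192.168.5.250   proxmox.localdomain    proxmox
192.168.5.251   proxmox2.localdomain   proxmox2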

Anyway, I now have to overcome the issue with the VM. Do you have any idea?

I migrated the VM from one node to the other, and now it is back on its original node, PROXMOX. The migration completed without errors, and I can start the VM up fine. Yet I still cannot connect to it via the Console. Is there any way to debug what is happening inside the VM?


Code:
no connection : Connection timed out
TASK ERROR: command '/bin/nc -l -p 5900 -w 10 -c '/usr/sbin/qm vncproxy 100 2>/dev/null'' failed: exit code 1

How can I debug and work around this issue?

Thank you.

Regards.
 
Please search the forum; this has been discussed several times. If you cannot figure it out, please open a new thread, as this is not a cluster issue (see the subject of this thread).
 
