
Thread: Cluster : cmantool : Problem or not ?

  1. #1
    Join Date
    Feb 2012
    Posts
    19

    Default Cluster : cmantool : Problem or not ?

    Hello,

    First, I configured the network.
    Second, I updated the system (apt-get update, apt-get upgrade, apt-get dist-upgrade).
    Third, I created the cluster (pvecm create asgard):

    Generating public/private rsa key pair.
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    89:da:e0:84:99:0a:2c:33:72:df:d1:c0:aa:6d:73:34 root@srv-proxmox
    The key's randomart image is:
    +--[ RSA 2048]----+
    | |
    | . |
    | o |
    |. + . + . |
    |*.= + E S |
    |+= * * o |
    |. . B + |
    | . o |
    | |
    +-----------------+
    Restarting pve cluster filesystem: pve-cluster[dcdb] crit: unable to read cluster config file '/etc/cluster/cluster.conf' - Failed to open file '/etc/cluster/cluster.conf': No such file or directory
    [dcdb] notice: wrote new cluster config '/etc/cluster/cluster.conf'
    [dcdb] crit: cman_tool version failed with exit code 1#010
    .
    Starting cluster:
    Checking if cluster has been disabled at boot... [ OK ]
    Checking Network Manager... [ OK ]
    Global setup... [ OK ]
    Loading kernel modules... [ OK ]
    Mounting configfs... [ OK ]
    Starting cman... [ OK ]
    Waiting for quorum... [ OK ]
    Starting fenced... [ OK ]
    Starting dlm_controld... [ OK ]
    Unfencing self... [ OK ]

    Is it normal to see:
    [dcdb] notice: wrote new cluster config '/etc/cluster/cluster.conf'
    [dcdb] crit: cman_tool version failed with exit code 1#010



    pvecm status
    Version: 6.2.0
    Config Version: 1
    Cluster Name: asgard
    Cluster Id: 6484
    Cluster Member: Yes
    Cluster Generation: 4
    Membership state: Cluster-Member
    Nodes: 1
    Expected votes: 1
    Total votes: 1
    Node votes: 1
    Quorum: 1
    Active subsystems: 5
    Flags:
    Ports Bound: 0
    Node name: srv-proxmox
    Node ID: 1
    Multicast addresses: 239.192.25.109
    Node addresses: 10.1.25.212
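    For reference, the Expected votes / Quorum fields above follow cman's majority rule; a quick shell sketch (illustrative only, not Proxmox code):

    ```shell
    # cman considers the cluster quorate once total votes reach a strict
    # majority of the expected votes: quorum = floor(expected / 2) + 1.
    quorum_needed() { echo $(( $1 / 2 + 1 )); }

    quorum_needed 1   # prints 1: a single node is always quorate
    quorum_needed 2   # prints 2: BOTH nodes of a 2-node cluster must be up
    quorum_needed 3   # prints 2: one of three nodes may fail
    ```

    With one node and one expected vote, "Quorum: 1" is reached immediately, which is why the status above reports a healthy cluster.
    
    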
    Dell PowerEdge T310
    CD Proxmox v2
    srv-proxmox 2.6.32-7-pve
    network vlan
    auto vmbr0
    iface vmbr0 inet static
    address 10.1.25.212
    netmask 255.255.192.0
    gateway 10.1.1.19
    bridge_ports eth2.301
    bridge_stp off
    bridge_fd 0

    auto vmbr1
    iface vmbr1 inet manual
    bridge_ports eth2.201
    bridge_stp off
    bridge_fd 0


    Thanks

  2. #2
    Join Date
    Apr 2005
    Location
    Austria
    Posts
    11,806

    Default Re: Cluster : cmantool : Problem or not ?

    Quote Originally Posted by rsf View Post
    [dcdb] notice: wrote new cluster config '/etc/cluster/cluster.conf'
    [dcdb] crit: cman_tool version failed with exit code 1#010
    Not normal, but it does not look critical. Can you please reboot the node and check whether all services start without errors?

  3. #3
    Join Date
    Feb 2012
    Posts
    19

    Default Re: Cluster : cmantool : Problem or not ?

    When the machine boots everything looks OK, but in daemon.log ntpd has a problem initializing and quorum initialization fails.
    Could the problem be due to the network coming up slowly?

    Mar 15 14:38:05 srv-proxmox ntpd[1270]: ntpd 4.2.6p2@1.2194-o Sun Oct 17 13:35:13 UTC 2010 (1)
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: proto: precision = 0.140 usec
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: Listen and drop on 1 v6wildcard :: UDP 123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: Listen normally on 2 lo 127.0.0.1 UDP 123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: Listen normally on 3 vmbr0 10.1.25.212 UDP 123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: Listen normally on 4 lo ::1 UDP 123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: bind(21) AF_INET6 fe80::21b:21ff:fe98:597b%4#123 flags 0x11 failed: Cannot assign requested address
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: unable to create socket on eth2 (5) for fe80::21b:21ff:fe98:597b#123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: failed to init interface for address fe80::21b:21ff:fe98:597b
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: bind(21) AF_INET6 fe80::21b:21ff:fe98:597b%8#123 flags 0x11 failed: Cannot assign requested address
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: unable to create socket on eth2.201 (6) for fe80::21b:21ff:fe98:597b#123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: failed to init interface for address fe80::21b:21ff:fe98:597b
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: bind(21) AF_INET6 fe80::21b:21ff:fe98:597b%7#123 flags 0x11 failed: Cannot assign requested address
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: unable to create socket on vmbr1 (7) for fe80::21b:21ff:fe98:597b#123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: failed to init interface for address fe80::21b:21ff:fe98:597b
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: bind(21) AF_INET6 fe80::21b:21ff:fe98:597b%6#123 flags 0x11 failed: Cannot assign requested address
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: unable to create socket on eth2.301 (8) for fe80::21b:21ff:fe98:597b#123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: failed to init interface for address fe80::21b:21ff:fe98:597b
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: bind(21) AF_INET6 fe80::21b:21ff:fe98:597b%5#123 flags 0x11 failed: Cannot assign requested address
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: unable to create socket on vmbr0 (9) for fe80::21b:21ff:fe98:597b#123
    Mar 15 14:38:05 srv-proxmox ntpd[1303]: failed to init interface for address fe80::21b:21ff:fe98:597b
    Mar 15 14:38:05 srv-proxmox iscsid: transport class version 2.0-870. iscsid version 2.0-871
    Mar 15 14:38:05 srv-proxmox iscsid: iSCSI daemon with pid=1148 started!
    Mar 15 14:38:06 srv-proxmox rrdcached[1359]: starting up
    Mar 15 14:38:06 srv-proxmox rrdcached[1359]: checking for journal files
    Mar 15 14:38:06 srv-proxmox rrdcached[1359]: started new journal /var/lib/rrdcached/journal//rrd.journal.1331818686.453783
    Mar 15 14:38:06 srv-proxmox rrdcached[1359]: journal processing complete
    Mar 15 14:38:06 srv-proxmox rrdcached[1359]: listening for connections
    Mar 15 14:38:06 srv-proxmox pmxcfs[1376]: [quorum] crit: quorum_initialize failed: 6
    Mar 15 14:38:06 srv-proxmox pmxcfs[1376]: [quorum] crit: can't initialize service
    Mar 15 14:38:06 srv-proxmox pmxcfs[1376]: [confdb] crit: confdb_initialize failed: 6
    Mar 15 14:38:06 srv-proxmox pmxcfs[1376]: [quorum] crit: can't initialize service
    Mar 15 14:38:06 srv-proxmox pmxcfs[1376]: [dcdb] crit: cpg_initialize failed: 6
    Mar 15 14:38:06 srv-proxmox pmxcfs[1376]: [quorum] crit: can't initialize service
    Mar 15 14:38:06 srv-proxmox pmxcfs[1376]: [dcdb] crit: cpg_initialize failed: 6
    Mar 15 14:38:06 srv-proxmox pmxcfs[1376]: [quorum] crit: can't initialize service
    Mar 15 14:38:09 srv-proxmox ntpd[1303]: Listen normally on 10 venet0 fe80::1 UDP 123
    Mar 15 14:38:09 srv-proxmox ntpd[1303]: Listen normally on 11 eth2 fe80::21b:21ff:fe98:597b UDP 123
    Mar 15 14:38:09 srv-proxmox ntpd[1303]: Listen normally on 12 eth2.201 fe80::21b:21ff:fe98:597b UDP 123
    Mar 15 14:38:09 srv-proxmox ntpd[1303]: Listen normally on 13 vmbr1 fe80::21b:21ff:fe98:597b UDP 123
    Mar 15 14:38:09 srv-proxmox ntpd[1303]: Listen normally on 14 eth2.301 fe80::21b:21ff:fe98:597b UDP 123
    Mar 15 14:38:09 srv-proxmox ntpd[1303]: Listen normally on 15 vmbr0 fe80::21b:21ff:fe98:597b UDP 123
    Mar 15 14:38:12 srv-proxmox pmxcfs[1376]: [status] notice: update cluster info (cluster name asgard, version = 1)
    Mar 15 14:38:12 srv-proxmox pmxcfs[1376]: [status] notice: node has quorum
    Mar 15 14:38:12 srv-proxmox pmxcfs[1376]: [dcdb] notice: members: 1/1376
    Mar 15 14:38:12 srv-proxmox pmxcfs[1376]: [dcdb] notice: all data is up to date
    Mar 15 14:38:12 srv-proxmox pmxcfs[1376]: [dcdb] notice: members: 1/1376
    Mar 15 14:38:12 srv-proxmox pmxcfs[1376]: [dcdb] notice: all data is up to date
    Mar 15 14:38:16 srv-proxmox pvedaemon[1771]: starting server
    Mar 15 14:38:16 srv-proxmox pvedaemon[1771]: starting 3 worker(s)
    Mar 15 14:38:16 srv-proxmox pvedaemon[1771]: worker 1776 started
    Mar 15 14:38:16 srv-proxmox pvedaemon[1771]: worker 1778 started
    Mar 15 14:38:16 srv-proxmox pvedaemon[1771]: worker 1780 started
    Mar 15 14:38:17 srv-proxmox pvestatd[1800]: starting server
    Mar 15 14:40:23 srv-proxmox pvedaemon[1778]: <root@pam> successful auth for user 'root@pam'
    Part of /var/log/messages:
    ...
    Mar 15 14:38:05 srv-proxmox kernel: ACPI Error: SMBus or IPMI write requires Buffer of length 42, found length 20 (20090903/exfield-286)
    Mar 15 14:38:05 srv-proxmox kernel: ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PMI0._GHL] (Node ffff88023f0b4f60), AE_AML_BUFFER_LIMIT
    Mar 15 14:38:05 srv-proxmox kernel: ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PMI0._PMC] (Node ffff88023f0b4420), AE_AML_BUFFER_LIMIT
    Mar 15 14:38:05 srv-proxmox kernel: ACPI Exception: AE_AML_BUFFER_LIMIT, Evaluating _PMC (20090903/power_meter-759)
    ...
    Mar 15 14:38:05 srv-proxmox kernel: Bridge firewalling registered
    Mar 15 14:38:05 srv-proxmox kernel: 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
    Mar 15 14:38:05 srv-proxmox kernel: All bugs added by David S. Miller <davem@redhat.com>
    Mar 15 14:38:05 srv-proxmox kernel: 8021q: adding VLAN 0 to HW filter on device eth2
    Mar 15 14:38:05 srv-proxmox kernel: device eth2.301 entered promiscuous mode
    Mar 15 14:38:05 srv-proxmox kernel: New device eth2.301 does not support netpoll
    Mar 15 14:38:05 srv-proxmox kernel: Disabling netpoll for vmbr0
    Mar 15 14:38:05 srv-proxmox kernel: device eth2 entered promiscuous mode
    Mar 15 14:38:05 srv-proxmox kernel: vmbr0: port 1(eth2.301) entering forwarding state
    Mar 15 14:38:05 srv-proxmox kernel: device eth2.201 entered promiscuous mode
    Mar 15 14:38:05 srv-proxmox kernel: New device eth2.201 does not support netpoll
    Mar 15 14:38:05 srv-proxmox kernel: Disabling netpoll for vmbr1
    Mar 15 14:38:05 srv-proxmox kernel: vmbr1: port 1(eth2.201) entering forwarding state
    Mar 15 14:38:05 srv-proxmox kernel: fuse init (API version 7.13)
    ...
    Mar 15 14:38:05 srv-proxmox kernel: NET: Registered protocol family 10
    Mar 15 14:38:05 srv-proxmox kernel: ADDRCONF(NETDEV_UP): eth2: link is not ready
    ...
    Mar 15 14:38:05 srv-proxmox kernel: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
    Mar 15 14:38:07 srv-proxmox kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
    ...
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [MAIN ] Successfully parsed cman config
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [MAIN ] Successfully configured openais services to load
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [TOTEM ] Initializing transport (UDP/IP Multicast).
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [TOTEM ] The network interface [10.1.25.212] is now up.
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [QUORUM] Using quorum provider quorum_cman
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [CMAN ] CMAN 1324544458 (built Dec 22 2011 10:01:01) started
    Mar 15 14:38:09 srv-proxmox corosync[1610]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90
    ...
    ACPI error -> important or not?
    "New device eth2.301 does not support netpoll" -> important or not?

  4. #4
    Join Date
    Apr 2005
    Location
    Austria
    Posts
    11,806

    Default Re: Cluster : cmantool : Problem or not ?

    I can't really see a serious problem.

  5. #5
    Join Date
    Feb 2012
    Posts
    19

    Default Re: Cluster : cmantool : Problem or not ?

    Ok but ...

    cluster :
    srv-proxmox 10.1.25.212
    network vlan (tagged)

    node :
    srv-proxmox2 10.1.25.118
    network Untagged

    I joined a node to the cluster as shown in the wiki (http://pve.proxmox.com/wiki/Proxmox_VE_2.0_Cluster) and got an error:

    root@srv-proxmox2:/# pvecm add 10.1.25.212

    Generating public/private rsa key pair.
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    93:05:62:4b:82:86:61:5e:bb:2e:1e:2f:79:87:78:8d root@srv-proxmox2
    The key's randomart image is:
    +--[ RSA 2048]----+
    |.+ o. + . |
    |+ + .+ o . |
    | o . . . |
    | . o |
    | . S |
    | . . |
    | oo.+ |
    |.++E o |
    | .+.. |
    +-----------------+
    The authenticity of host '10.1.25.212 (10.1.25.212)' can't be established.
    RSA key fingerprint is 2e:cd:c6:55:cf:73:bb:61:09:67:cc:c7:69:f4:ca:62.
    Are you sure you want to continue connecting (yes/no)? yes
    root@10.1.25.212's password:
    copy corosync auth key
    stopping pve-cluster service
    Stopping pve cluster filesystem: pve-cluster.
    backup old database
    Starting pve cluster filesystem : pve-cluster.
    Starting cluster:
    Checking if cluster has been disabled at boot... [ OK ]
    Checking Network Manager... [ OK ]
    Global setup... [ OK ]
    Loading kernel modules... [ OK ]
    Mounting configfs... [ OK ]
    Starting cman... [ OK ]
    Waiting for quorum... [ OK ]
    Starting fenced... [ OK ]
    Starting dlm_controld... [ OK ]
    Unfencing self... [ OK ]
    cluster not ready - no quorum?
    Then I rebooted srv-proxmox2. Boot output:

    ...
    Starting CMAN OK
    waiting for quorum... timed-out waiting for cluster
    [FAILED]
    cluster not ready - no quorum ?

    starting PVE daemon : pvedaemon.
    starting WebServer : Apache2Syntax error on line 13 of /etc/apache2/sites-available/pve-redirect.conf
    SSLCertificateFile: file '/etc/pve/local/pve-ssl.pem does not exist or is empty
    Apache FAILED
    The file /etc/pve/local/pve-ssl.pem does not exist; the file is here instead -> /etc/pve/nodes/srv-proxmox/pve-ssl.pem



    /etc/apache2/sites-available/pve-redirect.conf
    SSLCertificateFile /etc/pve/local/pve-ssl.pem
    SSLCertificateKeyFile /etc/pve/local/pve-ssl.key
    I changed the path and rebooted:
    SSLCertificateFile /etc/pve/nodes/srv-proxmox/pve-ssl.pem
    SSLCertificateKeyFile /etc/pve/nodes/srv-proxmox/pve-ssl.key
    Boot output:
    Starting CMAN OK
    waiting for quorum... timed-out waiting for cluster
    [FAILED]
    cluster not ready - no quorum ?
    starting PVE daemon : pvedaemon.
    starting WebServer : Apache2Syntax error on line 36 of /etc/apache2/sites-available/pve.conf
    SSLCertificateFile: file '/etc/pve/local/pve-ssl.pem does not exist or is empty
    I changed the path in /etc/apache2/sites-available/pve.conf and rebooted:
    SSLCertificateFile /etc/pve/nodes/srv-proxmox/pve-ssl.pem
    SSLCertificateKeyFile /etc/pve/nodes/srv-proxmox/pve-ssl.key
    Apache is OK now, but I still get the error:
    waiting for quorum... timed-out waiting for cluster
    [FAILED]
    cluster not ready - no quorum ?
    =>
    startpar: service(s) returned failure : cman qemu-server failed
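    For context (an assumption about pmxcfs behavior, not stated in the thread): /etc/pve/local is normally a symlink that the pve-cluster filesystem provides to this node's own directory, /etc/pve/nodes/&lt;hostname&gt;, which is what the stock Apache config relies on. A minimal sketch of the expected layout, simulated under /tmp because the real path only exists on a running Proxmox node:

    ```shell
    # Simulate the layout pmxcfs normally provides (paths under /tmp are
    # illustrative; on a real node this lives in /etc/pve).
    mkdir -p /tmp/pve-demo/nodes/srv-proxmox2
    ln -sfn nodes/srv-proxmox2 /tmp/pve-demo/local

    # Apache's SSLCertificateFile /etc/pve/local/pve-ssl.pem resolves
    # through this link, so editing the Apache config should not be
    # necessary once the link exists.
    readlink /tmp/pve-demo/local
    ```
    
    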

  6. #6
    Join Date
    Apr 2005
    Location
    Austria
    Posts
    11,806

    Default Re: Cluster : cmantool : Problem or not ?

    Quote Originally Posted by rsf View Post
    Ok but ...

    cluster :
    srv-proxmox 10.1.25.212
    network vlan (tagged)

    node :
    srv-proxmox2 10.1.25.118
    network Untagged
    Either tag both or none.
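    For the "none" case, a minimal untagged /etc/network/interfaces sketch for srv-proxmox (addresses copied from earlier in the thread; using plain eth2 as the bridge port is the assumption here):

    ```text
    auto vmbr0
    iface vmbr0 inet static
            address 10.1.25.212
            netmask 255.255.192.0
            gateway 10.1.1.19
            bridge_ports eth2
            bridge_stp off
            bridge_fd 0
    ```

    The switch port would then need to carry that network untagged.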

  7. #7
    Join Date
    Feb 2012
    Posts
    19

    Default Re: Cluster : cmantool : Problem or not ?

    Can you explain why this is not possible?

    All the PCs are untagged, the server is tagged, and it works!
    It is the switch that forwards the packets.

    The cluster (srv-proxmox, 10.1.25.212) and the node (srv-proxmox2, 10.1.25.118) are on the same network, on the same VLAN.

    Sorry for the Google translation.

  8. #8
    Join Date
    Apr 2005
    Location
    Austria
    Posts
    11,806

    Default Re: Cluster : cmantool : Problem or not ?

    Quote Originally Posted by rsf View Post
    Can you explain why this is not possible ?
    I simply never tested that, so if you have problems I would first test without such settings.

  9. #9
    Join Date
    Feb 2012
    Posts
    19

    Default Re: Cluster : cmantool : Problem or not ?

    OK !

    All tests were done on the same hardware, without VLANs.

    Test I
    1. Install from the Proxmox V2 beta CD
    2. Configure the network
    3. Update
    4. Create the cluster
    => same error.
    Test II
    1. Install from the Proxmox V2 beta CD
    2. Configure the network
    3. No update
    4. Create the cluster
    => OK! Yes!!
    Test III
    1. Install from the Proxmox V2 RC1 CD
    2. Configure the network
    3. No update
    4. Create the cluster
    => same error.

    Error =>
    [dcdb] notice: wrote new cluster config '/etc/cluster/cluster.conf'
    [dcdb] crit: cman_tool version failed with exit code 1#010
    I repeated Test II and joined srv-proxmox2: grrr, same quorum problem.

    To be continued on Monday...

  10. #10
    Join Date
    Feb 2012
    Posts
    19

    Default Re: Cluster : cmantool : Problem or not ?

    You pushed some updates since last Friday, right?
    I no longer get the Apache configuration error (symlink OK: /etc/pve/local -> /etc/pve/clusters/srv-proxmox)

    The warning and error still appear:
    [dcdb] notice: wrote new cluster config '/etc/cluster/cluster.conf'
    [dcdb] crit: cman_tool version failed with exit code 1#010
    but apparently they have no effect; it is just an annoyance (not sure of the term).

    OK, let's go: more tests...

    As a reminder:
    srv-proxmox (cluster and node)
    Dell PowerEdge T310
    3 network interfaces
    active network interface -> eth2
    VLAN untagged
    srv-proxmox2 (node)
    a basic PC
    2 network interfaces
    active network interface -> eth1
    VLAN untagged
    The servers are on the same network.
    Multicast tested and OK.
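    A concrete way to (re)verify multicast between the nodes, if useful: the omping tool (a separate Debian package; the invocation below is a sketch, not output from this thread) must be run on both nodes at the same time:

    ```text
    # run simultaneously on srv-proxmox and srv-proxmox2
    omping -c 60 -i 1 10.1.25.212 10.1.25.118
    # a healthy link shows ~0% loss on both the unicast and multicast lines
    ```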

    Test xx
    1. Install from the Proxmox V2 RC1 CD on srv-proxmox (cluster & node 1)
    2. Configure the network
    3. Update
    4. Create the cluster
    =>
    [dcdb] notice: wrote new cluster config '/etc/cluster/cluster.conf'
    [dcdb] crit: cman_tool version failed with exit code 1#010
    Checking if cluster has been disabled at boot... [ OK ]
    Checking Network Manager... [ OK ]
    Global setup... [ OK ]
    Loading kernel modules... [ OK ]
    Mounting configfs... [ OK ]
    Starting cman... [ OK ]
    Waiting for quorum... [ OK ]
    Starting fenced... [ OK ]
    Starting dlm_controld... [ OK ]
    Unfencing self... [ OK ]
    5. Install from the Proxmox V2 RC1 CD on srv-proxmox2 (node 2)
    6. Configure the network
    7. Update
    8. Join the cluster
    =>
    Checking if cluster has been disabled at boot... [ OK ]
    Checking Network Manager... [ OK ]
    Global setup... [ OK ]
    Loading kernel modules... [ OK ]
    Mounting configfs... [ OK ]
    Starting cman... [ OK ]
    Waiting for quorum... [ OK ]
    Starting fenced... [ OK ]
    Starting dlm_controld... [ OK ]
    Unfencing self... [ OK ]
    cluster not ready - no quorum?
    Why does it display this message -> cluster not ready - no quorum?

    If I restart one of the machines, I again get a quorum problem.
    waiting for quorum... timed-out waiting for cluster
    [FAILED]
    cluster not ready - no quorum ?
    pvecm status
    Version: 6.2.0
    Config Version: 2
    Cluster Name: asgard
    Cluster Id: 6484
    Cluster Member: Yes
    Cluster Generation: 28
    Membership state: Cluster-Member
    Nodes: 1
    Expected votes: 2
    Total votes: 1
    Node votes: 1
    Quorum: 2 Activity blocked
    Active subsystems: 5
    Flags:
    Ports Bound: 0
    Node name: srv-proxmox2
    Node ID: 2
    Multicast addresses: 239.192.25.109
    Node addresses: 10.1.25.118
    If I stop and start the cman service (/etc/init.d/cman), it's OK:
    /etc/init.d/cman start
    Starting cluster:
    Checking if cluster has been disabled at boot... [ OK ]
    Checking Network Manager... [ OK ]
    Global setup... [ OK ]
    Loading kernel modules... [ OK ]
    Mounting configfs... [ OK ]
    Starting cman... [ OK ]
    Waiting for quorum... [ OK ]
    Starting fenced... [ OK ]
    Starting dlm_controld... [ OK ]
    Unfencing self... [ OK ]
    pvecm status
    Version: 6.2.0
    Config Version: 2
    Cluster Name: asgard
    Cluster Id: 6484
    Cluster Member: Yes
    Cluster Generation: 36
    Membership state: Cluster-Member
    Nodes: 2
    Expected votes: 2
    Total votes: 2
    Node votes: 1
    Quorum: 2
    Active subsystems: 5
    Flags:
    Ports Bound: 0
    Node name: srv-proxmox
    Node ID: 1
    Multicast addresses: 239.192.25.109
    Node addresses: 10.1.25.212
    Why can't cman establish quorum at boot?
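    One possibly relevant detail (an assumption, not something confirmed in this thread): with two nodes and Expected votes: 2, cman's majority rule means a single node is never quorate on its own at boot. cman has a dedicated two-node mode for exactly this; in cluster.conf it looks like the fragment below, though pvecm manages that file, so treat this purely as an illustration:

    ```xml
    <cman two_node="1" expected_votes="1"/>
    ```

    A one-shot alternative on a running node is cman_tool expected -e 1, which lowers the expected votes until the next restart.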

  11. #11
    Join Date
    Apr 2005
    Location
    Austria
    Posts
    11,806

    Default Re: Cluster : cmantool : Problem or not ?

    I guess I found the bug - will try to upload a fix soon.

  12. #12
    Join Date
    Feb 2012
    Posts
    19

    Default Re: Cluster : cmantool : Problem or not ?

    srv-proxmox (cluster and node)
    Dell PowerEdge T310
    3 network interfaces
    active network interface -> eth2
    VLAN untagged
    srv-proxmox2 (node)
    a basic PC
    2 network interfaces
    active network interface -> eth1
    VLAN untagged
    The servers are on the same network.
    Multicast tested and OK.


    Test 09
    1. Install from the Proxmox V2 RC1 CD on srv-proxmox (cluster & node 1)
    2. Configure the network
    3. Update
    4. Create the cluster
    Checking if cluster has been disabled at boot... [ OK ]
    Checking Network Manager... [ OK ]
    Global setup... [ OK ]
    Loading kernel modules... [ OK ]
    Mounting configfs... [ OK ]
    Starting cman... [ OK ]
    Waiting for quorum... [ OK ]
    Starting fenced... [ OK ]
    Starting dlm_controld... [ OK ]
    Unfencing self... [ OK ]
    => OK, no problem

    5. Install from the Proxmox V2 RC1 CD on srv-proxmox2 (node 2)
    6. Configure the network
    7. Update
    8. Join the cluster
    waiting for quorum... timed-out waiting for cluster
    [FAILED]
    waiting for quorum...
    => I waited several minutes, then restarted the cman service (/etc/init.d/cman restart) on node1, and node2 found quorum.

    9. Restart node2
    10. Wait
    waiting for quorum... timed-out waiting for cluster
    [FAILED]
    => no good

    11. Restart node2
    waiting for quorum...
    12. Restart cman on node1
    => node2 finds quorum

    Any ideas?

    Edit:
    node1: /etc/pve/nodes/srv-proxmox1 and /etc/pve/nodes/srv-proxmox2
    node2: /etc/pve/nodes/srv-proxmox2 only
    13. I reversed the roles: node2 ready, restart node1
    => node1 finds quorum
    14. Restart node1
    => node1 doesn't find quorum
    node1: /etc/pve/nodes/srv-proxmox1 and /etc/pve/nodes/srv-proxmox2
    node2: /etc/pve/nodes/srv-proxmox1 and /etc/pve/nodes/srv-proxmox2

    15. node1 ready, restart node2
    => node2 doesn't find quorum, and there is an Apache problem (/etc/pve/nodes/...)
    16. node1 ready, restart node2
    17. Restart cman on node1 while node2 is waiting for quorum
    => node2 finds quorum, but there is still an Apache problem (/etc/pve/nodes/...)

    I'm going crazy !
    Last edited by rsf; 03-21-2012 at 10:04 AM. Reason: additional information
