Hi,
I have three nodes in a cluster running Proxmox VE 2.1. Everything seems OK, except that I notice messages like this in my corosync log:
Code:
corosync [TOTEM ] Retransmit List: 4ea806 4ea7f6 4ea7f7 4ea7f8 4ea807 4ea808 4ea809 4ea80a 4ea7e7 4ea7e8 4ea7e9 4ea7ea 4ea7eb 4ea7fb 4ea7fc 4ea7fd 4ea7ec 4ea7ed 4ea7ee 4ea7ef 4ea7f0 4ea7f1 4ea7f2 4ea7f3
And about once a month, corosync fails on two of the nodes:
Code:
corosync [TOTEM ] FAILED TO RECEIVE
My cluster is now down. Only one node is green.
I suspected a problem with corosync and multicast, but I tested multicast with asmping and saw no problem.
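For reference, this is roughly how I ran the test (asmping and ssmpingd come from the ssmping package; group 224.0.2.1 is just the test group I picked, not the one corosync uses):
Code:
# on node loo: start the multicast ping responder
root@loo:~# ssmpingd

# on node xorp: send ASM pings to loo via multicast group 224.0.2.1
root@xorp:~# asmping 224.0.2.1 loo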
I think there is too much traffic on the interface corosync uses, since it is the same one used by several VMs. Is it possible to change the interface corosync uses for its multicast traffic while the cluster is already configured, and if so, how?
Here is my cluster configuration:
Network: 4 NICs, 2 bonds (one for the VMs; one for backups, the qdisk on a private IP, and NFS), plus two VLANs on bond0 (prod and supervision). /etc/network/interfaces:
Code:
auto lo
iface lo inet loopback

iface eth0 inet manual
iface eth1 inet manual
iface eth2 inet manual
iface eth3 inet manual

auto bond0
iface bond0 inet manual
    slaves eth0 eth1
    bond_miimon 100
    bond_mode 802.3ad

auto bond0.502
iface bond0.502 inet manual
    vlan-raw-device bond0

auto bond0.64
iface bond0.64 inet manual
    vlan-raw-device bond0

auto bond1
iface bond1 inet manual
    slaves eth2 eth3
    bond_miimon 100
    bond_mode 802.3ad

auto vmbr0
iface vmbr0 inet static
    address *.*.*.150
    netmask 255.255.255.0
    gateway *.*.*.*
    bridge_ports bond0.502
    bridge_stp off
    bridge_fd 0

auto vmbr1
iface vmbr1 inet static
    address 192.168.*.*
    netmask 255.255.255.0
    bridge_ports bond1
    bridge_stp off
    bridge_fd 0

auto vmbr2
iface vmbr2 inet static
    address 172.29.11.251
    netmask 255.255.255.0
    bridge_ports bond0.64
    bridge_stp off
    bridge_fd 0
/etc/pve/cluster.conf
Code:
<?xml version="1.0"?>
<cluster name="CLUSTER-DI-01" config_version="5">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>
  <clusternodes>
    <clusternode name="xorp" votes="1" nodeid="1"/>
    <clusternode name="loo" votes="1" nodeid="2"/>
    <clusternode name="prox" votes="1" nodeid="3"/>
  </clusternodes>
</cluster>
Output after the cluster lost synchronization:
Node XORP:
Code:
root@xorp:~# pvecm status
Version: 6.2.0
Config Version: 5
Cluster Name: CLUSTER-DI-01
Cluster Id: 8813
Cluster Member: Yes
Cluster Generation: 3752
Membership state: Cluster-Member
Nodes: 1
Expected votes: 2
Total votes: 1
Node votes: 1
Quorum: 2 Activity blocked
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: xorp
Node ID: 1
Multicast addresses: 239.192.34.143
Node addresses: *.*.*.150
root@xorp:~# pvecm nodes
Node  Sts   Inc   Joined               Name
   1   M   3728   2012-06-06 15:51:35  xorp
   2   X   3740                        loo
   3   X   3748                        prox
root@xorp:~# clustat
Cluster Status for CLUSTER-DI-01 @ Mon Jul  2 14:18:35 2012
Member Status: Inquorate

 Member Name                 ID   Status
 ------ ----                 ---- ------
 xorp                           1 Online, Local
 loo                            2 Offline
 prox                           3 Offline
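As a stopgap while the other two nodes are down, I believe the blocked quorum on xorp can be lifted by lowering the expected votes (if I understand the pvecm man page correctly; please correct me if this is a bad idea):
Code:
root@xorp:~# pvecm expected 1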
On the other nodes, corosync had indeed failed; there was no corosync process left:
Code:
# pvecm status
cman_tool: Cannot open connection to cman, is it running ?
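(This is how I checked that the process was gone; with corosync dead, ps -C prints only its header line:)
Code:
root@loo:~# ps -C corosync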
And finally, here is /etc/pve/.members on each node while loo and prox are red (only xorp is green):
Code:
root@loo:~# cat /etc/pve/.members
{
  "nodename": "loo",
  "version": 7,
  "cluster": { "name": "CLUSTER-DI-01", "version": 5, "nodes": 3, "quorate": 1 },
  "nodelist": {
    "xorp": { "id": 1, "online": 1, "ip": "*.*.*.150"},
    "loo": { "id": 2, "online": 1, "ip": "*.*.*.157"},
    "prox": { "id": 3, "online": 1, "ip": "*.*.*.155"}
  }
}
root@xorp:~# cat /etc/pve/.members
{
  "nodename": "xorp",
  "version": 9,
  "cluster": { "name": "CLUSTER-DI-01", "version": 5, "nodes": 3, "quorate": 0 },
  "nodelist": {
    "xorp": { "id": 1, "online": 1, "ip": "*.*.*.150"},
    "loo": { "id": 2, "online": 0, "ip": "*.*.*.157"},
    "prox": { "id": 3, "online": 0, "ip": "*.*.*.155"}
  }
}
root@prox:~# cat /etc/pve/.members
{
  "nodename": "prox",
  "version": 18,
  "cluster": { "name": "CLUSTER-DI-01", "version": 5, "nodes": 3, "quorate": 1 },
  "nodelist": {
    "xorp": { "id": 1, "online": 1, "ip": "*.*.*.150"},
    "loo": { "id": 2, "online": 0, "ip": "*.*.*.157"},
    "prox": { "id": 3, "online": 1, "ip": "*.*.*.155"}
  }
}
I can run /etc/init.d/cman restart and /etc/init.d/pve-cluster restart, but I always end up with "corosync [TOTEM ] Retransmit List" messages again until corosync fails (about 20 days after the restart).
How can I modify my cluster configuration so that corosync traffic uses the vmbr2 interface (the supervision VLAN, where traffic is lower) without reinstalling? Is this possible, and is it the right approach?
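From what I understand, cman-based corosync binds to whatever IP address the cluster node name resolves to, so my current idea (untested; the 172.29.11.x addresses for loo and prox are guesses, and I am assuming the interfaces file above is xorp's) would be to point the node names at the supervision VLAN in /etc/hosts on every node and then restart the cluster stack node by node:
Code:
# /etc/hosts on every node: make the cluster node names resolve
# to their addresses on the supervision VLAN (vmbr2 / bond0.64)
172.29.11.251   xorp
172.29.11.252   loo    # assumed address, adjust to the real one
172.29.11.253   prox   # assumed address, adjust to the real one

# then, on each node in turn, restart the cluster stack so corosync rebinds
/etc/init.d/cman restart
/etc/init.d/pve-cluster restart
If there is an official or supported way to do this on Proxmox VE 2.x, I would prefer that.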
Thanks