New Kernel and bug fixes

The forum does not lose posts. Posts from new members are moderated; if you post, you will see a short note.

As soon as you are a valued member of the forum, your posts will be visible immediately without moderation.

Okay, but it just took REALLY long, and then sent me back to the http://forum.proxmox.com/forums/16-Proxmox-VE-2-x-Installation-and-configuration page without any message. But I'll remember to wait much longer next time. Other vBulletin forums send you to the thread you posted in, so I didn't expect that.
 
Code:
Aug 29 19:19:32 corosync [TOTEM ] FAILED TO RECEIVE

This error causes cman/corosync to exit. Do you use iptables (see http://forum.proxmox.com/threads/8665-cman-keeps-crashing)?
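If a firewall is in place, corosync's traffic has to be allowed explicitly. A minimal sketch (assuming the default corosync UDP ports 5404/5405 and multicast on the cluster interface):

Code:
# allow corosync cluster traffic (UDP 5404/5405) and multicast destinations
iptables -A INPUT -p udp -m udp --dport 5404:5405 -j ACCEPT
iptables -A INPUT -m addrtype --dst-type MULTICAST -j ACCEPT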

Nope. And I've seen that thread.

Code:
root@bcvm3:~# iptables --list -v --line-numbers
Chain INPUT (policy ACCEPT 1227K packets, 678M bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 959K packets, 178M bytes)
num   pkts bytes target     prot opt in     out     source               destination

Since this is no longer about the new kernel release, I created a new thread http://forum.proxmox.com/threads/10986-problem-with-quorum-and-cman-stopping
 
The Intel driver in this kernel (2.6.32-14) doesn't work with an Intel 82579V network adapter (the same also occurs with the 2.6.32-13 kernel).
Will this be the Intel driver in future kernel releases?
Anyone else with this problem?
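For anyone comparing, this is roughly how to check which e1000e driver the adapter is using (the interface name eth0 is just an example):

Code:
lspci | grep -i ethernet          # confirm the 82579V is detected
ethtool -i eth0                   # driver name/version actually bound to the interface
modinfo e1000e | grep -i version  # version of the e1000e module shipped with this kernel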
 
Hi everyone,

I did a full upgrade on a 4-node cluster (from 2.6.32-11 to 2.6.32-14) and just ran into a problem very similar to http://forum.proxmox.com/threads/8624-How-to-remove-zombie-OpenVZ-container.

My problem was with CTs using NFS mounts inside: they were no longer able to shut down, and the only way to restart them was to reboot the node.
The 3 processes I couldn't kill (even with -9) were always:

$ vzps -E 100
VEID PID TTY TIME CMD
100 16108 ? 00:00:00 init
100 16109 ? 00:00:00 kthreadd/100
100 16127 ? 00:00:00 nfsiod/100

The only "solution" I found was to downgrade to pve-kernel-2.6.32-13-pve.

Has anyone run into the same issue?
Is it safe to downgrade only the pve-kernel package?

Thanks for your help.

$ pveversion -v
pve-manager: 2.1-14 (pve-manager/2.1/f32f3f46)
running kernel: 2.6.32-14-pve
proxmox-ve-2.6.32: 2.1-74
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-14-pve: 2.6.32-74
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.92-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.8-1
pve-cluster: 1.0-27
qemu-server: 2.0-49
pve-firmware: 1.0-18
libpve-common-perl: 1.0-30
libpve-access-control: 1.0-24
libpve-storage-perl: 2.0-31
vncterm: 1.0-3
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.1-8
ksm-control-daemon: 1.1-1
 
You are very welcome to provide a kernel package; the machine is still in its burn-in phase, so I do not have to schedule downtime. I have changed GRUB_CMDLINE_LINUX_DEFAULT="quiet", run update-grub2, and rebooted.
The new kernel may or may not be needed. I have not experienced the network problem since then, and I have since added more virtual machines.
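For completeness, the change amounts to something like this (only the quoted line in /etc/default/grub was touched):

Code:
# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet"

# then regenerate the GRUB config and reboot
update-grub2
reboot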
 
I just installed a Win7 and a Fedora 17 VM using virtio-scsi as the boot disk. Both run without any problems. It seems you are doing something wrong?
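For comparison, the relevant part of my VM config looks roughly like this (VMID and storage name are just examples; yours will differ):

Code:
# /etc/pve/qemu-server/<vmid>.conf (excerpt)
scsihw: virtio-scsi-pci
scsi0: local:<vmid>/vm-<vmid>-disk-1.raw
bootdisk: scsi0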

Hi Tom and all

Can anybody please help me?

A question: with the latest version of PVE 2.2 (updated on 11/07/12), is virtio-scsi ready for production environments, for use with any Windows system and the latest driver versions?

Best regards
Cesar
 
Please help me build this new cluster.

Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Starting qdiskd... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]

clustat
Cluster Status for master3 @ Wed Dec 4 06:57:32 2013
Member Status: Inquorate


Member Name                ID   Status
------ ----                ---- ------
node3                       1   Offline
node2                       2   Offline
node1                       3   Online, Local
/dev/block/8:17             0   Offline, Quorum Disk

The quorum disk is created on the SAN.
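For reference, the quorum disk was created on the SAN roughly like this (the device path is only an example; the label matches the one used in cluster.conf):

Code:
mkqdisk -c /dev/sdb1 -l qdisk   # create the quorum disk with label "qdisk"
mkqdisk -L                      # list and verify the quorum disks that are found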
Please kindly reply.

Regards,
Nasim
 

Try starting with this (it tells the system on this server that only one quorum vote is required):
shell> pvecm e 1
shell> service pve-cluster restart
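Afterwards you can check whether the node regains quorum, for example with:

Code:
shell> pvecm status
shell> clustat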
 
I have done this, but there is still no change.
The cluster was created on node3, i.e. #pvecm create master
Node 3 is running fine and the GUI is OK.
But the other two nodes are showing red in the GUI.
And the error is:
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Starting qdiskd... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]

root@node1:~# clustat
Cluster Status for master3 @ Wed Dec 4 09:03:07 2013
Member Status: Inquorate


Member Name                ID   Status
------ ----                ---- ------
node3                       1   Offline
node2                       2   Offline
node1                       3   Online, Local
/dev/block/8:17             0   Offline, Quorum Disk

root@node2:~# clustat
Cluster Status for master3 @ Wed Dec 4 09:07:00 2013
Member Status: Inquorate


Member Name                ID   Status
------ ----                ---- ------
node3                       1   Offline
node2                       2   Online, Local
/dev/block/8:17             0   Offline, Quorum Disk

please reply.
Regards,
Nasim
 
root@node3:~# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2


root@node1:~# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2


root@node2:~# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2


Should I change expected_votes in cluster.conf?
It shows my expected votes as 4 there...

Regards,
Nasim
 
@nasim

UPDATED

In general, regarding your expected votes: it is better that the number is odd (and obviously the number of voting nodes as well). This helps in most cases where nodes drop out, because one side will always have a majority of the quorum votes (for example, if 2 nodes are disconnected from the other 2 nodes, which group should win the quorum if the votes are tied?).
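As a rough illustration of the arithmetic (as I understand it, cman derives the threshold from the expected votes):

Code:
quorum = floor(expected_votes / 2) + 1
4 expected votes -> quorum = 3   (a 2-2 split leaves both halves without quorum)
3 expected votes -> quorum = 2   (one failed node still leaves a quorate majority)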

But regarding your problem: if the NICs the nodes use for PVE cluster communication are working properly (Realtek is garbage), try this:

On all PVE nodes, in the file /etc/network/interfaces, add the post-up line shown below for each vmbr[x] that you have.

This is an example applied to vmbr0 (note that vmbr0 appears in more than one place); you will have to replace vmbr0 with the appropriate number:

auto vmbr0
iface vmbr0 inet static
        address xxx.xxx.xxx.xxx
        netmask xxx.xxx.xxx.xxx
        gateway xxx.xxx.xxx.xxx
        bridge_ports eth4
        bridge_stp off
        bridge_fd 0
        post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping

The post-up line is a workaround that avoids the known problem between PVE cluster communication and bridge multicast snooping.

Afterwards you should restart the PVE nodes to apply the changes.
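If you prefer, I believe the setting can also be applied and checked on a running node without a reboot (the reboot only makes sure the interfaces file is really picked up):

Code:
echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
cat /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping   # should print 0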

If the problem persists, tell us again.
 
Sir, the switch does not support multicast, so I have used transport="udpu" in cluster.conf.
I have added the line
post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
to /etc/network/interfaces.
I only have vmbr0 there; no bonding device is created.

My nodes are still inactive in the cluster.

Regards,
Nasim
 
There is a difference in the output of the "pvecm s" command on the 3 nodes.

root@node1:~# pvecm s
Version: 6.2.0
Config Version: 13
Cluster Name: master3
Cluster Id: 13531
Cluster Member: Yes
Cluster Generation: 304
Membership state: Cluster-Member
Nodes: 1
Expected votes: 4
Total votes: 1
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 3
Flags:
Ports Bound: 0 178
Node name: node1
Node ID: 3
Multicast addresses: 255.255.255.255
Node addresses: XX.XX.XX.XX
root@node2:~# pvecm s
Version: 6.2.0
Config Version: 1333
Cluster Name: master3
Cluster Id: 13531
Cluster Member: Yes
Cluster Generation: 172
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Quorum device votes: 2
Total votes: 3
Node votes: 1
Quorum: 2
Active subsystems: 7
Flags:
Ports Bound: 0 178
Node name: node2
Node ID: 2
Multicast addresses: 255.255.255.255
Node addresses: XX.XX.XX.XX

root@node3:~# pvecm s
Version: 6.2.0
Config Version: 13
Cluster Name: master3
Cluster Id: 13531
Cluster Member: Yes
Cluster Generation: 412
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 1
Active subsystems: 7
Flags:
Ports Bound: 0 178
Node name: node3
Node ID: 1
Multicast addresses: 255.255.255.255
Node addresses: XX.XX.XX.XX

Can anyone help with this?

Regards,
Nasim
 
@nasim

"Of the 3 PVE Nodes" please show the output of:
cat /etc/pve/cluster.conf

Re-Edit: Ah, very important: before you add a node in unicast mode, make sure you add all other nodes "correctly" in /etc/hosts. Did you already do this step?
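As an illustration, on each node the hosts file should contain entries for all cluster members, resolving each name to the address used for cluster traffic, something like (addresses are examples):

Code:
10.90.103.207 node1.kagilum.com node1
10.90.103.206 node2.kagilum.com node2
10.90.103.205 node3.kagilum.com node3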
 
Sir,
I have added these nodes using the command "pvecm add node[X]".
On one node I used the --force flag because it was unable to copy the SSH ID.
There are a total of 3 nodes. I'm also showing the output of /etc/hosts here.

root@node1:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="13" name="master3">
<cman expected_votes="4" keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"/>
<quorumd allow_kill="0" interval="1" label="qdisk" tko="10" votes="2"/>
<totem token="54000"/>
<clusternodes>
<clusternode name="node3" nodeid="1" votes="1"/>
<clusternode name="node2" nodeid="2" votes="1"/>
<clusternode name="node1" votes="1" nodeid="3"/></clusternodes>
<rm>
<pvevm autostart="1" vmid="100"/>
</rm>
</cluster>


root@node1:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
XX.XX.XX.XX node1.kagilum.com node1 pvelocalhost


10.90.103.206 node2.kagilum.com node2
10.90.103.205 node3.kagilum.com node3


# The following lines are desirable for IPv6 capable hosts


::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

-------------------------------------------

root@node2:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster name="master3" config_version="13">


<cman expected_votes="4" keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu">
</cman>
<quorumd votes="2" allow_kill="0" interval="1" label="qdisk" tko="10" votes="2"/>
<totem token="54000"/>


<clusternodes>
<clusternode name="node3" votes="1" nodeid="1"/>
<clusternode name="node2" votes="1" nodeid="2"/></clusternodes>


</cluster>

root@node2:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
XX.XX.XX.XX node2.kagilum.com node2 pvelocalhost


10.90.103.207 node1.kagilum.com node1
10.90.103.205 node3.kagilum.com node3


# The following lines are desirable for IPv6 capable hosts


::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

-------------------------------------------

root@node3:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="13" name="master3">
<cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"/>
<quorumd allow_kill="0" interval="1" label="qdisk" tko="10" votes="2"/>
<totem token="54000"/>
<clusternodes>
<clusternode name="node3" nodeid="1" votes="1"/>
<clusternode name="node2" nodeid="2" votes="1"/>
<clusternode name="node1" votes="1" nodeid="3"/></clusternodes>
<rm>
<pvevm autostart="1" vmid="100"/>
</rm>
</cluster>


root@node3:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
XX.XX.XX.XX node3.kagilum.com node3 pvelocalhost


10.90.103.207 node1.kagilum.com node1
10.90.103.206 node2.kagilum.com node2


# The following lines are desirable for IPv6 capable hosts


::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
root@node3:~#

XX.XX.XX.XX is the public IP bound to eth0, and the other 10.90.XX.XX private IPs are bound to eth1.
pvecm create master3 was run on node3.
On the other two nodes I ran pvecm add node3 and proceeded.

Regards,
Nasim
 
@nasim

As I see it, the error is in the content of your cluster.conf file; this file should be the same on all PVE nodes!
I think you are not respecting the syntax of this configuration file and the rules of cluster configuration.

My suggestions are:

A) General rules that you need to know:
1- Don't use a quorum disk unless it is necessary to complete an odd number of votes (or add one more PVE node to the cluster)
2- The /etc/pve/cluster.conf file should be the same on all nodes of the same PVE cluster
3- Whenever you want to modify the /etc/pve/cluster.conf file, you should do it like this (see the short sketch after this list):
3.1- Run: cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
3.2- Modify the /etc/pve/cluster.conf.new file according to your needs
3.3- In this file, increment the config_version value by one, replacing the previous number
3.4- In the PVE GUI of the same node, click the "Activate" option on the "HA" tab
Note: if there are no errors, the new config will be activated on all PVE nodes of the same cluster; otherwise it will not take effect. This option only checks the syntax, so you should still know what you are doing.
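A minimal sketch of steps 3.1-3.3 on the shell (the editor is just an example; activation is then done from the GUI as in step 3.4):

Code:
cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
nano /etc/pve/cluster.conf.new   # make the changes and bump config_version, e.g. 13 -> 14
# then activate from the "HA" tab in the web GUI (step 3.4)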

B) For your case:
1- You must correct the settings of your cluster.conf file, and you should do it on the PVE node that has quorum (light green in the PVE GUI); the other nodes can then take the modified configuration from this node (to make this simpler, it may be necessary to restart the other nodes).

I don't have practice with unicast, but to help you, here is a cluster.conf file that should work in multicast mode (for unicast you must adapt it to your needs, and the <rm/> key may be unnecessary):

<?xml version="1.0"?>
<cluster name="master3" config_version="14">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>

  <clusternodes>
    <clusternode name="node1" votes="1" nodeid="3"/>
    <clusternode name="node2" votes="1" nodeid="2"/>
    <clusternode name="node3" votes="1" nodeid="1"/>
  </clusternodes>

  <rm/>

</cluster>

Hoping this helps you, I say goodbye.
 
