New Kernel and bug fixes

peetaur · Aug 31, 2012

tom said:
The forum does not lose posts. Posts from new members are moderated; if you post, you will see a short note.

As soon as you are a valued member of the forum, your posts will be visible immediately without moderation.

Okay, but it just took REALLY long, and then sent me back to the http://forum.proxmox.com/forums/16-Proxmox-VE-2-x-Installation-and-configuration page without any message. But I'll remember to wait much longer next time. Other vBulletin forums send you to the thread you posted in, so I didn't expect that.

peetaur · Aug 31, 2012

dietmar said:
Code:

Aug 29 19:19:32 corosync [TOTEM ] FAILED TO RECEIVE

This error causes cman/corosync to exit. Do you use iptables (see http://forum.proxmox.com/threads/8665-cman-keeps-crashing)?

Nope. And I've seen that thread.

Code:

root@bcvm3:~# iptables --list -v --line-numbers
Chain INPUT (policy ACCEPT 1227K packets, 678M bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 959K packets, 178M bytes)
num   pkts bytes target     prot opt in     out     source               destination

Since this is no longer about the new kernel release, I created a new thread http://forum.proxmox.com/threads/10986-problem-with-quorum-and-cman-stopping

xagaba · Sep 3, 2012

The Intel driver in kernel 2.6.32-14 doesn't work with an Intel 82579V network adapter (the same happens with the 2.6.32-13 kernel).
Will this be the Intel driver in future kernel releases?
Does anyone else have this problem?

x3w · Sep 13, 2012

Hi everyone,

I did a full upgrade on a 4-node cluster (from 2.6.32-11 to 2.6.32-14) and ran into a problem very similar to http://forum.proxmox.com/threads/8624-How-to-remove-zombie-OpenVZ-container.

My problem was with CTs using NFS mounts inside: they were no longer able to shut down, and the only way to restart them was to reboot the node.
The 3 processes I couldn't kill (even with -9) were always:

$ vzps -E 100
VEID PID TTY TIME CMD
100 16108 ? 00:00:00 init
100 16109 ? 00:00:00 kthreadd/100
100 16127 ? 00:00:00 nfsiod/100

The only "solution" I found was to downgrade to pve-kernel-2.6.32-13-pve.

Has anyone else run into the same problem?
Is it safe to downgrade only the pve-kernel package?

Thanks for your help.

$ pveversion -v
pve-manager: 2.1-14 (pve-manager/2.1/f32f3f46)
running kernel: 2.6.32-14-pve
proxmox-ve-2.6.32: 2.1-74
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-14-pve: 2.6.32-74
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.92-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.8-1
pve-cluster: 1.0-27
qemu-server: 2.0-49
pve-firmware: 1.0-18
libpve-common-perl: 1.0-30
libpve-access-control: 1.0-24
libpve-storage-perl: 2.0-31
vncterm: 1.0-3
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.1-8
ksm-control-daemon: 1.1-1

JonB · Sep 17, 2012

JonB said:
You are very welcome to provide a kernel package; the machine is still in the burn-in phase, so I do not have to schedule downtime. I have changed GRUB_CMDLINE_LINUX_DEFAULT="quiet", run update-grub2 and rebooted.

The new kernel may or may not be needed. I have not experienced the network problem since then, and I have expanded with more virtual machines.

cesarpk · Nov 7, 2012

tom said:
I just installed Win7 and Fedora 17 using virtio-scsi as the boot disk. Both run without any problems. It seems you are doing something wrong.

Hi Tom and all,

Can anybody please help me?

A question: with the latest version of PVE 2.2 (updated on 11/07/12), is virtio-scsi ready for production environments with any Windows system and the latest driver version?

Best regards
Cesar

glena · Nov 21, 2012

dietmar said:
Does a reload or logout/login help?

I also am having this issue. How do I fix it?

Thanks,

-Glen

nasim · Dec 4, 2013

Please help me build this new cluster.

Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Starting qdiskd... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]

clustat
Cluster Status for master3 @ Wed Dec 4 06:57:32 2013
Member Status: Inquorate

Member Name ID Status
------ ---- ---- ------
node3 1 Offline
node2 2 Offline
node1 3 Online, Local
/dev/block/8:17 0 Offline, Quorum Disk

The quorum disk is created on a SAN.
Please reply.

Regards,
Nasim

cesarpk · Dec 4, 2013

Try running the following first (it tells this server that only one quorum vote is required):
shell> pvecm e 1
shell> service pve-cluster restart

nasim · Dec 4, 2013

I have done this, but there is still no change.
The cluster was created on node3, i.e. #pvecm create master
Node3 is running fine and its GUI is OK.
But the other two nodes show red in the GUI.
And the error is:
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Starting qdiskd... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]

root@node1:~# clustat
Cluster Status for master3 @ Wed Dec 4 09:03:07 2013
Member Status: Inquorate

Member Name ID Status
------ ---- ---- ------
node3 1 Offline
node2 2 Offline
node1 3 Online, Local
/dev/block/8:17 0 Offline, Quorum Disk

root@node2:~# clustat
Cluster Status for master3 @ Wed Dec 4 09:07:00 2013
Member Status: Inquorate

Member Name ID Status
------ ---- ---- ------
node3 1 Offline
node2 2 Online, Local
/dev/block/8:17 0 Offline, Quorum Disk

Please reply.
Regards,
Nasim

cesarpk · Dec 4, 2013

@nasim

what is the output of "pveversion -v"?

nasim · Dec 4, 2013

root@node3:~# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2

root@node1:~# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2

root@node2:~# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2

Should I change expected_votes in cluster.conf?
It shows my expected votes as 4.

Regards,
Nasim

cesarpk · Dec 4, 2013

@nasim

UPDATED

As a general rule for expected votes, it is better for the number to be odd (and therefore the number of voting nodes as well). This helps in most cases where nodes drop out, because one side will always hold a majority of the quorum votes. For example, if 2 nodes are disconnected from the other 2 nodes, which group should win quorum when the votes are tied?
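
The tie described above can be sketched in a few lines of Python. This is only an illustration that assumes a plain majority rule (more than half of the expected votes); the function names are made up for the example and are not part of any PVE tool.

```python
def quorum_needed(expected_votes: int) -> int:
    """Smallest vote count that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def partition_has_quorum(partition_votes: int, expected_votes: int) -> bool:
    """True if a group of nodes holding partition_votes may keep running."""
    return partition_votes >= quorum_needed(expected_votes)


# Four votes split 2/2: neither half reaches the 3 votes a majority needs.
print(partition_has_quorum(2, 4), partition_has_quorum(2, 4))

# Three votes split 2/1: the larger group keeps quorum, the smaller loses it.
print(partition_has_quorum(2, 3), partition_has_quorum(1, 3))
```

With an even number of votes an equal split leaves both halves blocked, which is the reason for preferring an odd total.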

As for your problem: if the NICs that the nodes use for PVE cluster communication are working properly (Realtek is garbage), try this:

On all PVE nodes, in the file /etc/network/interfaces, add the post-up line shown at the end of the example to each vmbr[x] that you have.

This is an example applied to vmbr0 (note that vmbr0 also appears inside the post-up path),
so you will have to replace vmbr0 with the appropriate bridge name:
auto vmbr0
iface vmbr0 inet manual
address xxx.xxx.xxx.xxx
netmask xxx.xxx.xxx.xxx
gateway xxx.xxx.xxx.xxx
bridge_ports eth4
bridge_stp off
bridge_fd 0
post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping

The post-up line is a workaround that avoids the problem between PVE cluster communication and multicast snooping.

Afterwards, restart the PVE nodes to apply the changes.

If the problem persists, let us know.
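
After the reboot it may be worth confirming that the setting took effect. The following is a minimal sketch that reads the same sysfs file the post-up line writes to; the function name and the `sysfs_root` parameter are assumptions made for this example so the check can also be pointed at a test directory.

```python
from pathlib import Path


def multicast_snooping_enabled(bridge: str,
                               sysfs_root: str = "/sys/devices/virtual/net") -> bool:
    """Return True unless the bridge's multicast_snooping flag reads "0"."""
    flag = Path(sysfs_root) / bridge / "bridge" / "multicast_snooping"
    return flag.read_text().strip() != "0"


# On a PVE node one would call, for example:
#     multicast_snooping_enabled("vmbr0")
# and expect False once the post-up line has been applied.
```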

nasim · Dec 4, 2013

Sir, the switch does not support multicast, so I have used transport="udpu" in cluster.conf.
I have added the line
post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping
in /etc/network/interfaces.
I only have vmbr0.
No bonding device has been created.

My nodes are still inactive in the cluster.

Regards,
Nasim

nasim · Dec 4, 2013

The output of the "pvecm s" command differs across the 3 nodes.

root@node1:~# pvecm s
Version: 6.2.0
Config Version: 13
Cluster Name: master3
Cluster Id: 13531
Cluster Member: Yes
Cluster Generation: 304
Membership state: Cluster-Member
Nodes: 1
Expected votes: 4
Total votes: 1
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 3
Flags:
Ports Bound: 0 178
Node name: node1
Node ID: 3
Multicast addresses: 255.255.255.255
Node addresses: XX.XX.XX.XX
root@node2:~# pvecm s
Version: 6.2.0
Config Version: 1333
Cluster Name: master3
Cluster Id: 13531
Cluster Member: Yes
Cluster Generation: 172
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Quorum device votes: 2
Total votes: 3
Node votes: 1
Quorum: 2
Active subsystems: 7
Flags:
Ports Bound: 0 178
Node name: node2
Node ID: 2
Multicast addresses: 255.255.255.255
Node addresses: XX.XX.XX.XX

root@node3:~# pvecm s
Version: 6.2.0
Config Version: 13
Cluster Name: master3
Cluster Id: 13531
Cluster Member: Yes
Cluster Generation: 412
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 1
Active subsystems: 7
Flags:
Ports Bound: 0 178
Node name: node3
Node ID: 1
Multicast addresses: 255.255.255.255
Node addresses: XX.XX.XX.XX

Can anyone help with this?

Regards,
Nasim
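
The three outputs above can be compared mechanically. The sketch below assumes that a node is quorate when its "Total votes" value is at least its "Quorum" value, which matches the "Activity blocked" note on node1; the helper names are invented for the example and the sample strings are excerpts of the output above.

```python
def parse_status(text: str) -> dict:
    """Turn "Key: value" lines from a status dump into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_quorate(text: str) -> bool:
    """Compare total votes with the quorum threshold reported by the node."""
    fields = parse_status(text)
    total = int(fields["Total votes"])
    # The Quorum line may carry a trailing note such as "Activity blocked".
    quorum = int(fields["Quorum"].split()[0])
    return total >= quorum


NODE1 = """Expected votes: 4
Total votes: 1
Node votes: 1
Quorum: 3 Activity blocked"""

NODE2 = """Expected votes: 1
Quorum device votes: 2
Total votes: 3
Node votes: 1
Quorum: 2"""

print("node1 quorate:", is_quorate(NODE1))
print("node2 quorate:", is_quorate(NODE2))
```

Each node also reports "Nodes: 1", so none of them sees the others, and node2 shows a different Config Version (1333) from the other two (13).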

cesarpk · Dec 4, 2013

@nasim

From all 3 PVE nodes, please show the output of:
cat /etc/pve/cluster.conf

Re-edit: Ah, very important: before you add a node in unicast mode, make sure all other nodes are listed "correctly" in /etc/hosts. Did you do this step first?
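
A quick way to check the /etc/hosts step is to look for every cluster node name in the file. The sketch below only tests that a name is present; it does not verify that the address is the one used for cluster traffic. The function names and the sample content are assumptions for illustration.

```python
def hosts_names(hosts_text: str) -> dict:
    """Map every host name or alias in hosts-file text to its address."""
    names = {}
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *aliases = line.split()
        for alias in aliases:
            names.setdefault(alias, address)
    return names


def missing_nodes(hosts_text: str, node_names: list) -> list:
    """Return the node names that have no entry in the hosts text."""
    known = hosts_names(hosts_text)
    return [name for name in node_names if name not in known]


SAMPLE_HOSTS = """127.0.0.1 localhost.localdomain localhost
10.90.103.206 node2.kagilum.com node2
# node3 has not been added yet
"""

print(missing_nodes(SAMPLE_HOSTS, ["node2", "node3"]))
```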

nasim · Dec 4, 2013

Sir,
I added these nodes using the command "pvecm add node[X]".
On one node I used the --force flag because it was unable to copy the SSH ID.
There are 3 nodes in total. I'm also showing the output of /etc/hosts here.

root@node1:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="13" name="master3">
<cman expected_votes="4" keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"/>
<quorumd allow_kill="0" interval="1" label="qdisk" tko="10" votes="2"/>
<totem token="54000"/>
<clusternodes>
<clusternode name="node3" nodeid="1" votes="1"/>
<clusternode name="node2" nodeid="2" votes="1"/>
<clusternode name="node1" votes="1" nodeid="3"/></clusternodes>
<rm>
<pvevm autostart="1" vmid="100"/>
</rm>
</cluster>

root@node1:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
XX.XX.XX.XX node1.kagilum.com node1 pvelocalhost

10.90.103.206 node2.kagilum.com node2
10.90.103.205 node3.kagilum.com node3

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

-------------------------------------------

root@node2:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster name="master3" config_version="13">

<cman expected_votes="4" keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu">
</cman>
<quorumd votes="2" allow_kill="0" interval="1" label="qdisk" tko="10" votes="2"/>
<totem token="54000"/>

<clusternodes>
<clusternode name="node3" votes="1" nodeid="1"/>
<clusternode name="node2" votes="1" nodeid="2"/></clusternodes>

</cluster>

root@node2:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
XX.XX.XX.XX node2.kagilum.com node2 pvelocalhost

10.90.103.207 node1.kagilum.com node1
10.90.103.205 node3.kagilum.com node3

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

-------------------------------------------

root@node3:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="13" name="master3">
<cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu"/>
<quorumd allow_kill="0" interval="1" label="qdisk" tko="10" votes="2"/>
<totem token="54000"/>
<clusternodes>
<clusternode name="node3" nodeid="1" votes="1"/>
<clusternode name="node2" nodeid="2" votes="1"/>
<clusternode name="node1" votes="1" nodeid="3"/></clusternodes>
<rm>
<pvevm autostart="1" vmid="100"/>
</rm>
</cluster>

root@node3:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
XX.XX.XX.XX node3.kagilum.com node3 pvelocalhost

10.90.103.207 node1.kagilum.com node1
10.90.103.206 node2.kagilum.com node2

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
root@node3:~#

XX.XX.XX.XX is the public IP bound to eth0, and the 10.90.XX.XX addresses are private IPs bound to eth1.
pvecm create [master3] was run on node3.
On the other two nodes I ran pvecm add node3 and proceeded.

Regards,
Nasim
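
The three cluster.conf files above differ in ways that are easy to miss by eye. The following sketch pulls out the expected_votes value and the node list from each file with regular expressions (so it also tolerates a file that is not well-formed XML) and compares them. The strings are trimmed excerpts of the files shown above, and the helper name is made up for the example.

```python
import re


def summarize(conf_text: str) -> tuple:
    """Return (expected_votes, sorted node names) found in cluster.conf text."""
    expected = re.search(r'<cman\b[^>]*\bexpected_votes="(\d+)"', conf_text)
    nodes = re.findall(r'<clusternode\b[^>]*\bname="([^"]+)"', conf_text)
    votes = int(expected.group(1)) if expected else None
    return (votes, tuple(sorted(nodes)))


CONFS = {
    "node1": '''<cman expected_votes="4" transport="udpu"/>
<clusternode name="node3" nodeid="1" votes="1"/>
<clusternode name="node2" nodeid="2" votes="1"/>
<clusternode name="node1" votes="1" nodeid="3"/>''',
    "node2": '''<cman expected_votes="4" transport="udpu">
<clusternode name="node3" votes="1" nodeid="1"/>
<clusternode name="node2" votes="1" nodeid="2"/>''',
    "node3": '''<cman expected_votes="1" transport="udpu"/>
<clusternode name="node3" nodeid="1" votes="1"/>
<clusternode name="node2" nodeid="2" votes="1"/>
<clusternode name="node1" votes="1" nodeid="3"/>''',
}

summaries = {name: summarize(text) for name, text in CONFS.items()}
for name, summary in summaries.items():
    print(name, summary)
print("all nodes agree:", len(set(summaries.values())) == 1)
```

No two nodes agree here: node2 lacks the node1 entry and node3 has a different expected_votes value.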

cesarpk · Dec 4, 2013

@nasim

As I see it, the error is in the content of your cluster.conf file; this file should be the same on all PVE nodes!
I think you have not respected the syntax of this configuration file or the rules of cluster configuration.

My suggestions are:

A) General rules that you need to know:
1- Don't use your quorum disk unless it is needed to bring the cluster up to an odd number of votes (or add one more PVE node to the cluster instead)
2- The /etc/pve/cluster.conf file should be the same on all nodes of the same PVE cluster.
3- Whenever you want to modify the /etc/pve/cluster.conf file, always do the following:
3.1- Run: cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
3.2- Modify the /etc/pve/cluster.conf.new file according to your needs
3.3- In this file, increase the config_version value (for example by one) and put the new number in place of the previous one
3.4- In the PVE GUI of the same node, click "Activate" on the "HA" tab
Note: if there are no errors, the new config is activated on all PVE nodes of the same cluster; otherwise it does not take effect. This step only checks the syntax, so you still need to know what you are doing.
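
Step 3.3 is the one most easily forgotten. As a rough sketch, the version bump can be done with Python's standard XML parser; the function works on the file content as a string, and reading /etc/pve/cluster.conf and writing cluster.conf.new is left to the reader. The function name is an assumption for this example, not a PVE tool.

```python
import xml.etree.ElementTree as ET


def bump_config_version(conf_text: str) -> str:
    """Return cluster.conf text with config_version increased by one.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed,
    and TypeError if the root element has no config_version attribute.
    """
    root = ET.fromstring(conf_text)
    current = int(root.get("config_version"))
    root.set("config_version", str(current + 1))
    # ET.tostring drops the XML declaration, so it is added back here.
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")


EXAMPLE = '''<?xml version="1.0"?>
<cluster name="master3" config_version="13">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
</cluster>'''

print(bump_config_version(EXAMPLE))
```

Because the text goes through a strict parser, a file that is not well-formed is rejected before any new version number is written.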

B) For your case:
1- You must correct the settings in your cluster.conf file, and do it on the PVE node that has quorum (light green in the PVE GUI); the other nodes can then take the modified configuration from this node (to keep it simple, it may be necessary to restart the other nodes).

I have no practice with unicast, but to help you, here is a cluster.conf file that should work in multicast mode (for unicast you must adapt it to your needs, and the <rm/> element may be unnecessary):

<?xml version="1.0"?>
<cluster name="master3" config_version="14">

<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>

<clusternodes>
<clusternode name="node1" votes="1" nodeid="3"/>
<clusternode name="node2" votes="1" nodeid="2"/>
<clusternode name="node3" votes="1" nodeid="1"/>
</clusternodes>

<rm/>

</cluster>

I hope this helps. Goodbye.
