New Proxmox VE 1.6 kernels (2.6.32 and 2.6.35)

We just released two new kernels to the stable repository.

Release notes:

  • pve-kernel-2.6.32 (2.6.32-25)
    - upgrade to debian squeeze kernel 2.6.32-26
    - update config (original debian config now includes ISDN4LINUX)
    - enable CONFIG_NF_CONNTRACK_IPV6
    - install latest igb driver (igb-2.3.4.tar.gz)

  • pve-kernel-2.6.35 (2.6.35-7)
    - update to Ubuntu-2.6.35-23.36
How to get the latest version:
Just run 'aptitude update' and 'aptitude safe-upgrade'
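
For example, the full sequence on a standard installation looks roughly like this (a reboot is needed before the new kernel is actually used):
Code:
aptitude update
aptitude safe-upgrade
# the new kernel only takes effect after the next reboot
reboot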
__________________
Best regards,
Martin Maurer
 
Hi,
I switched from .32 to .35 and wanted to upgrade to 2.6.35-7.

After the upgrade, my pveversion -v shows:
Code:
pve-manager: 1.6-5 (pve-manager/1.6/5261)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.6-7
pve-kernel-2.6.35-1-pve: 2.6.35-7
pve-kernel-2.6.18-2-pve: 2.6.18-5
qemu-server: 1.1-22
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-14
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-8
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.12.5-2
ksm-control-daemon: 1.0-4

Is my active kernel 2.6.35-7?

I'm also wondering why I see two kernels:
pve-kernel-2.6.35-1-pve: 2.6.35-7
pve-kernel-2.6.18-2-pve: 2.6.18-5
Is this because I switched from .32?

Can you please briefly describe what each of the following lines means:
pve-manager: 1.6-5 (pve-manager/1.6/5261)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.6-7
pve-kernel-2.6.35-1-pve: 2.6.35-7
pve-kernel-2.6.18-2-pve: 2.6.18-5

Thanks in advance.
 
Our kernels are patched.
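
For reference, here is a quick way to see which kernel is actually booted versus which kernel packages are merely installed (plain Debian tooling, nothing Proxmox-specific assumed):
Code:
# kernel currently running -- the same value as the "running kernel" line of pveversion -v
uname -r
# installed kernel packages; older ones (e.g. 2.6.18) stay listed until you remove them
dpkg -l 'pve-kernel-*'
Note that uname -r prints the kernel ABI name (2.6.35-1-pve), while 2.6.35-7 is the version of the package that ships that kernel.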
 
I've been running 2.6.32-4-pve and consistently get the following warning at boot:


Code:
------------[ cut here ]------------
WARNING: at kernel/irq/manage.c:274 enable_irq+0x48/0x7c()
Hardware name: X7DCL
Unbalanced enable for IRQ 17
Modules linked in: ehci_hcd uhci_hcd it8213(+) ahci ide_core libata e1000e usbcore nls_base thermal fan thermal_sys
Pid: 462, comm: modprobe Not tainted 2.6.32-4-pve #1
Call Trace:
[<ffffffff81097115>] ? enable_irq+0x48/0x7c
[<ffffffff81097115>] ? enable_irq+0x48/0x7c
[<ffffffff8104dfcc>] ? warn_slowpath_common+0x77/0xa3
[<ffffffff8104e054>] ? warn_slowpath_fmt+0x51/0x59
[<ffffffff81016d63>] ? native_read_tsc+0x2/0x11
[<ffffffff8105bb0b>] ? msleep+0x14/0x1e
[<ffffffffa00b0463>] ? do_probe+0xb5/0x1e6 [ide_core]
[<ffffffff81180eb9>] ? delay_tsc+0x30/0x73
[<ffffffff81097115>] ? enable_irq+0x48/0x7c
[<ffffffffa00b0b11>] ? ide_probe_port+0x57d/0x5ab [ide_core]
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: ST3500418AS, CC46, max UDMA/133
ata1.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA ST3500418AS CC46 PQ: 0 ANSI: 5
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: [<ffffffffa00b0e3a>] ? ide_host_register+0x270/0x61e [ide_core]
[<ffffffffa00b4e7d>] ? ide_pci_init_two+0x4e5/0x5a8 [ide_core]
[<ffffffff8117a4a8>] ? ida_get_new_above+0xf5/0x1b3
[<ffffffff8117a2be>] ? idr_get_empty_slot+0x16d/0x262
[<ffffffff8117a4a8>] ? ida_get_new_above+0xf5/0x1b3
[<ffffffff81143c2d>] ? sysfs_new_dirent+0x4a/0xf7
[<ffffffff81102224>] ? iput+0x27/0x60
[<ffffffff81143ded>] ? sysfs_addrm_finish+0x66/0x230
[<ffffffff8118dcf2>] ? local_pci_probe+0x12/0x16
[<ffffffff8118e942>] ? pci_device_probe+0xc0/0xe9
[<ffffffff812115d4>] ? driver_probe_device+0xa3/0x14b
[<ffffffff812116cb>] ? __driver_attach+0x4f/0x6f
[<ffffffff8121167c>] ? __driver_attach+0x0/0x6f
[<ffffffff81210ea3>] ? bus_for_each_dev+0x43/0x74
[<ffffffff81210863>] ? bus_add_driver+0xaf/0x1f8
[<ffffffff81211983>] ? driver_register+0xa7/0x111
[<ffffffffa000c000>] ? it8213_ide_init+0x0/0x1a [it8213]
[<ffffffff8118eb88>] ? __pci_register_driver+0x50/0xb8
[<ffffffffa000c000>] ? it8213_ide_init+0x0/0x1a [it8213]
[<ffffffff8100a065>] ? do_one_initcall+0x64/0x174
[<ffffffff81082e6b>] ? sys_init_module+0xc5/0x21a
[<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
---[ end trace df01b595cb07d1f2 ]---

Is the new update to 2.6.32-25 able to fix it?

jinjer
 
Don't know, but just try the latest pve-kernel-2.6.32-4-pve: 2.6.32-26 (from pvetest).
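
If you want to try it, the steps would look something like this; the exact pvetest repository line is an assumption on my part, so check the Proxmox wiki for the correct entry before adding it:
Code:
# assumed pvetest repository entry for the lenny-based 1.x series -- verify before use
echo "deb http://download.proxmox.com/debian lenny pvetest" >> /etc/apt/sources.list
aptitude update
aptitude install pve-kernel-2.6.32-4-pve
reboot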
 
I'm not sure about -26-, but -25- shows the same behaviour. The problem appears on all my Supermicro X7DCL-I servers. They're all dual Xeon E5405 (8 cores total) with 12 GB RAM.

I've been fighting stability issues for some time with the disk subsystem (huge problems with ocfs2+drbd when load is applied to the system).

I have another configuration (a single Xeon E5335 on an X7DAL-E mobo with 8 GB RAM) that does not have the above problem.

The X7DAL-E uses an Intel 5000X chipset while the X7DCL-i has a 5100 bridge. The south bridges (and disk controllers) are also quite different.

I don't think it's a Proxmox issue, but I have to verify with a clean Debian kernel (it will take time... I need some sleep).

jinjer
 
I would like to add that on the "good" system I have lots of READ DMA EXT errors:
Code:
ata2.00: failed command: READ DMA EXT
ata2.00: cmd 25/00:80:c2:ef:6d/00:00:37:00:00/e0 tag 0 dma 65536 in
         res 51/04:0f:c2:ef:6d/00:00:00:00:00/e0 Emask 0x1 (device error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ABRT }
ata2.00: configured for UDMA/33
ata2: EH complete
This is on tested drives (no issues there) and smartctl reports are fine (selftest passes).
I had no such issues with older kernels, and 2.6.32 seems to be the culprit.
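
For reference, these are roughly the checks I mean by "selftest passes" (just a sketch; /dev/sdb is a placeholder for the affected drive):
Code:
# overall health and the SMART attribute table
smartctl -H -A /dev/sdb
# start a long self-test, then read the results once it has finished
smartctl -t long /dev/sdb
smartctl -l selftest /dev/sdb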

I'm wondering whether it's just me having so many issues, or whether nobody else has updated to 2.6.32 (using soft-raid) yet.

I think I will need to downgrade to the 2.6.24 kernel series and report back :(

jinjer
 
The "configured for UDMA/33" makes me think it's related to your CD/DVD drive, or you have a bad cable somewhere.

jeff
 
I traced the issue to an obscure setting in the X7DCL BIOS about PCI-E compatibility enforcement (?!?), whose purpose I can only guess at. Setting it to "enforce compatibility" fixed the problem and increased disk throughput immensely. I also think the disk slowness uncovered an issue in the ocfs2 code (but I still need to prove it).

These mainboards were previously used with the 2.6.18 kernel series, so I don't know exactly when this issue was introduced.

I'm thinking of using Proxmox in a small production cluster, so I'm testing intensively now. I'll report if I find more issues.
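
For anyone who wants to compare before/after, here is a rough way to measure raw disk throughput (device and path are placeholders; the dd test writes a 1 GB file, so point it at a scratch filesystem):
Code:
# cached and buffered read speed of the raw device
hdparm -tT /dev/sda
# simple 1 GB sequential write, flushed to disk before dd exits
dd if=/dev/zero of=/mnt/scratch/ddtest bs=1M count=1024 conv=fdatasync
rm /mnt/scratch/ddtest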

thank you,
jinjer
 
I also have stability issues with ocfs2+drbd. I use kernel 2.6.24-12-pve on two HP DL380 servers. Sometimes this combination crashes and takes the production system down with it. So I am seriously considering dropping drbd in production, because its implementation still feels quite raw.
 

I was having these same READ DMA EXT errors on 2.6.18 kernels; for me it turned out to be a power supply issue. Not enough oomph for all the hard drives I was running, or maybe varying voltages - not sure, but replacing the power supply seemed to fix the problem.
 
Thanks for sharing your experience. I too think the problem is with drbd+ocfs2.

I am under the impression that ocfs2 requires a certain level of performance from the underlying disk subsystem. My disk performance issues kept the disk "pressure" very sustained. I can rule out power supply or other hardware-related problems.

I think that either drbd or ocfs2 hits some internal timeout and causes a rescan of the bus. ocfs2 would then issue "invalid opcode" errors inside its kernel module and bring the machine to a complete halt (a hard reset was required, although the kernel SysRq sequence was still able to reset the server).

Currently I'm trying an iSCSI backend with ocfs2 on top, and will soon have another cluster with a shared SAS SAN to play with. I hope to have enough time to investigate this issue further.
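
If the timeout theory is right, the knobs to look at would be the o2cb cluster timeouts and the drbd net timeouts. A sketch of where to find the current values (assuming the standard ocfs2-tools and drbd userland packages; the resource name r0 is a placeholder):
Code:
# o2cb cluster stack timeouts (heartbeat threshold, idle timeout, ...)
grep ^O2CB_ /etc/default/o2cb
# dump the parsed drbd configuration for a resource (the net section holds timeout/ko-count if set)
drbdadm dump r0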

jinjer
 
