New Proxmox VE 1.6 kernels (2.6.32 and 2.6.35)

We just released two new kernels to the stable repository.

Release notes:

  • pve-kernel-2.6.32 (2.6.32-25)
    - upgrade to debian squeeze kernel 2.6.32-26
    - update config (original debian config now includes ISDN4LINUX)
    - enable CONFIG_NF_CONNTRACK_IPV6
    - install latest igb driver (igb-2.3.4.tar.gz)

  • pve-kernel-2.6.35 (2.6.35-7)
    - update to Ubuntu-2.6.35-23.36
How to get the latest version:
Just run 'aptitude update' and 'aptitude safe-upgrade'
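
For example, the full sequence on a standard installation looks roughly like this (a reboot is needed before the new kernel is actually used):
Code:
aptitude update
aptitude safe-upgrade
# the new kernel only takes effect after the next reboot
reboot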
__________________
Best regards,
Martin Maurer
 
Hi,
I switched from .32 to .35 and wanted to upgrade to 2.6.35-7.

After the upgrade, my pveversion -v shows:
Code:
pve-manager: 1.6-5 (pve-manager/1.6/5261)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.6-7
pve-kernel-2.6.35-1-pve: 2.6.35-7
pve-kernel-2.6.18-2-pve: 2.6.18-5
qemu-server: 1.1-22
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-14
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-8
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.12.5-2
ksm-control-daemon: 1.0-4

Is my active kernel 2.6.35-7?

I'm also wondering why I see two kernels:
pve-kernel-2.6.35-1-pve: 2.6.35-7
pve-kernel-2.6.18-2-pve: 2.6.18-5
Is this because I switched from .32?

Can you please briefly describe what each of the following lines means:
pve-manager: 1.6-5 (pve-manager/1.6/5261)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.6-7
pve-kernel-2.6.35-1-pve: 2.6.35-7
pve-kernel-2.6.18-2-pve: 2.6.18-5

Thanks in advance.
 
Our kernels are patched.
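
For reference, here is a quick way to see which kernel is actually booted versus which kernel packages are merely installed (plain Debian tooling, nothing Proxmox-specific assumed):
Code:
# kernel currently running -- the same value as the "running kernel" line of pveversion -v
uname -r
# installed kernel packages; older ones (e.g. 2.6.18) stay listed until you remove them
dpkg -l 'pve-kernel-*'
Note that uname -r prints the kernel ABI name (2.6.35-1-pve), while 2.6.35-7 is the version of the package that ships that kernel.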
 
I've been running 2.6.32-4-pve and consistently get the following warning at boot:


Code:
------------[ cut here ]------------
WARNING: at kernel/irq/manage.c:274 enable_irq+0x48/0x7c()
Hardware name: X7DCL
Unbalanced enable for IRQ 17
Modules linked in: ehci_hcd uhci_hcd it8213(+) ahci ide_core libata e1000e usbcore nls_base thermal fan thermal_sys
Pid: 462, comm: modprobe Not tainted 2.6.32-4-pve #1
Call Trace:
[<ffffffff81097115>] ? enable_irq+0x48/0x7c
[<ffffffff81097115>] ? enable_irq+0x48/0x7c
[<ffffffff8104dfcc>] ? warn_slowpath_common+0x77/0xa3
[<ffffffff8104e054>] ? warn_slowpath_fmt+0x51/0x59
[<ffffffff81016d63>] ? native_read_tsc+0x2/0x11
[<ffffffff8105bb0b>] ? msleep+0x14/0x1e
[<ffffffffa00b0463>] ? do_probe+0xb5/0x1e6 [ide_core]
[<ffffffff81180eb9>] ? delay_tsc+0x30/0x73
[<ffffffff81097115>] ? enable_irq+0x48/0x7c
[<ffffffffa00b0b11>] ? ide_probe_port+0x57d/0x5ab [ide_core]
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: ST3500418AS, CC46, max UDMA/133
ata1.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA ST3500418AS CC46 PQ: 0 ANSI: 5
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: [<ffffffffa00b0e3a>] ? ide_host_register+0x270/0x61e [ide_core]
[<ffffffffa00b4e7d>] ? ide_pci_init_two+0x4e5/0x5a8 [ide_core]
[<ffffffff8117a4a8>] ? ida_get_new_above+0xf5/0x1b3
[<ffffffff8117a2be>] ? idr_get_empty_slot+0x16d/0x262
[<ffffffff8117a4a8>] ? ida_get_new_above+0xf5/0x1b3
[<ffffffff81143c2d>] ? sysfs_new_dirent+0x4a/0xf7
[<ffffffff81102224>] ? iput+0x27/0x60
[<ffffffff81143ded>] ? sysfs_addrm_finish+0x66/0x230
[<ffffffff8118dcf2>] ? local_pci_probe+0x12/0x16
[<ffffffff8118e942>] ? pci_device_probe+0xc0/0xe9
[<ffffffff812115d4>] ? driver_probe_device+0xa3/0x14b
[<ffffffff812116cb>] ? __driver_attach+0x4f/0x6f
[<ffffffff8121167c>] ? __driver_attach+0x0/0x6f
[<ffffffff81210ea3>] ? bus_for_each_dev+0x43/0x74
[<ffffffff81210863>] ? bus_add_driver+0xaf/0x1f8
[<ffffffff81211983>] ? driver_register+0xa7/0x111
[<ffffffffa000c000>] ? it8213_ide_init+0x0/0x1a [it8213]
[<ffffffff8118eb88>] ? __pci_register_driver+0x50/0xb8
[<ffffffffa000c000>] ? it8213_ide_init+0x0/0x1a [it8213]
[<ffffffff8100a065>] ? do_one_initcall+0x64/0x174
[<ffffffff81082e6b>] ? sys_init_module+0xc5/0x21a
[<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
---[ end trace df01b595cb07d1f2 ]---

Is the new update to 2.6.32-25 able to fix it?

jinjer
 
Don't know, but just try the latest pve-kernel-2.6.32-4-pve: 2.6.32-26 (from pvetest).
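
If you want to try it, the steps would look something like this; the exact pvetest repository line is an assumption on my part, so check the Proxmox wiki for the correct entry before adding it:
Code:
# assumed pvetest repository entry for the lenny-based 1.x series -- verify before use
echo "deb http://download.proxmox.com/debian lenny pvetest" >> /etc/apt/sources.list
aptitude update
aptitude install pve-kernel-2.6.32-4-pve
reboot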
 
I'm not sure about -26-, but -25- shows the same behaviour. The problem appears on all my Supermicro X7DCL-I servers. They're all dual Xeon E5405 (8 cores total) with 12 GB RAM.

I've been fighting stability issues for some time with the disk subsystem (huge problems with ocfs2+drbd when load is applied to the system).

I have another configuration (a single Xeon E5335 on an X7DAL-E mobo with 8 GB RAM) that does not have the above problem.

The X7DAL-E uses an Intel 5000X chipset while the X7DCL-i has a 5100 bridge. The south bridges (and disk controllers) are also quite different.

I don't think it's a Proxmox issue, but I have to verify with a clean Debian kernel (it will take time... I need some sleep).

jinjer
 
I would like to add that on the "good" system I have lots of READ DMA EXT errors:
Code:
ata2.00: failed command: READ DMA EXT
ata2.00: cmd 25/00:80:c2:ef:6d/00:00:37:00:00/e0 tag 0 dma 65536 in
         res 51/04:0f:c2:ef:6d/00:00:00:00:00/e0 Emask 0x1 (device error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ABRT }
ata2.00: configured for UDMA/33
ata2: EH complete
This is on tested drives (no issues there) and smartctl reports are fine (selftest passes).
I had no such issues with older kernels, and 2.6.32 seems to be the culprit.
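
For reference, these are roughly the checks I mean by "selftest passes" (just a sketch; /dev/sdb is a placeholder for the affected drive):
Code:
# overall health and the SMART attribute table
smartctl -H -A /dev/sdb
# start a long self-test, then read the results once it has finished
smartctl -t long /dev/sdb
smartctl -l selftest /dev/sdb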

I'm wondering whether it's just me having so many issues, or whether nobody else has updated to 2.6.32 (using soft-raid) yet.

I think I will need to downgrade to the 2.6.24 kernel series and report back :(

jinjer
 
The "configured for UDMA/33" makes me think it's related to your CD/DVD drive, or you have a bad cable somewhere.

jeff
 
I traced the issue to an obscure setting in the X7DCL BIOS about PCI-E compatibility enforcement (?!?), whose purpose I can only guess at. Setting it to "enforce compatibility" fixed the problem and increased disk throughput immensely. I also think the disk slowness uncovered an issue in the ocfs2 code (but I still need to prove it).

These mainboards were previously used with the 2.6.18 kernel series, so I don't know exactly when this issue was introduced.

I'm thinking of using Proxmox in a small production cluster, so I'm testing intensively now. I'll report if I find more issues.
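
For anyone who wants to compare before/after, here is a rough way to measure raw disk throughput (device and path are placeholders; the dd test writes a 1 GB file, so point it at a scratch filesystem):
Code:
# cached and buffered read speed of the raw device
hdparm -tT /dev/sda
# simple 1 GB sequential write, flushed to disk before dd exits
dd if=/dev/zero of=/mnt/scratch/ddtest bs=1M count=1024 conv=fdatasync
rm /mnt/scratch/ddtest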

thank you,
jinjer
 
I also have stability issues with ocfs2+drbd. I use kernel 2.6.24-12-pve on two HP DL380 servers. Sometimes this combination crashes and takes the production system down with it. So I am seriously considering dropping drbd in production, because its implementation still feels quite raw.
 

I was having these same READ DMA EXT errors on 2.6.18 kernels; for me it turned out to be a power supply issue. Not enough oomph for all the hard drives I was running, or maybe varying voltages - not sure, but replacing the power supply seemed to fix the problem.
 
Thanks for sharing your experience. I too think the problem is with drbd+ocfs2.

I am under the impression that ocfs2 requires a certain level of performance from the underlying disk subsystem. My disk performance issues kept the disk "pressure" very sustained. I can rule out power supply or other hardware-related problems.

I think that either drbd or ocfs2 hits some internal timeout and causes a rescan of the bus. ocfs2 would then issue "invalid opcode" errors inside its kernel module and bring the machine to a complete halt (a hard reset was required, although the kernel SysRq sequence was still able to reset the server).

Currently I'm trying an iSCSI backend with ocfs2 on top, and will soon have another cluster with a shared SAS SAN to play with. I hope to have enough time to investigate this issue further.
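
If the timeout theory is right, the knobs to look at would be the o2cb cluster timeouts and the drbd net timeouts. A sketch of where to find the current values (assuming the standard ocfs2-tools and drbd userland packages; the resource name r0 is a placeholder):
Code:
# o2cb cluster stack timeouts (heartbeat threshold, idle timeout, ...)
grep ^O2CB_ /etc/default/o2cb
# dump the parsed drbd configuration for a resource (the net section holds timeout/ko-count if set)
drbdadm dump r0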

jinjer
 
