Big error with 2.6.32-19-pve on HP server

amace

Hello, we updated our DL380 G7 to Proxmox version 2.3 over the weekend and have had to roll back to the pve-16 kernel because of high load averages and random kernel panics. The main culprit seems to be intel-iommu, though other oopses are logged as well.

Mar 26 18:48:59 proxmox kernel: ------------[ cut here ]------------
Mar 26 18:48:59 proxmox kernel: WARNING: at drivers/pci/intel-iommu.c:2775 intel_unmap_page+0x15f/0x180() (Not tainted)
Mar 26 18:48:59 proxmox kernel: Hardware name: ProLiant DL380 G7
Mar 26 18:48:59 proxmox kernel: Driver unmaps unmatched page at PFN 0
Mar 26 18:48:59 proxmox kernel: Modules linked in: radeon ttm drm_kms_helper drm shpchp snd_pcsp i2c_algo_bit serio_raw i2c_core snd_pcm snd_timer i7core_edac edac_core hpwdt hpilo tpm_tis snd soundcore tpm tpm_bios power_meter snd_page_alloc ext3 jbd mbcache sg ata_generic pata_acpi ata_piix bnx2 e1000e hpsa [last unloaded: scsi_wait_scan]
Mar 26 18:48:59 proxmox kernel: Pid: 0, comm: swapper veid: 0 Not tainted 2.6.32-19-pve #1
Mar 26 18:48:59 proxmox kernel: Call Trace:
Mar 26 18:48:59 proxmox kernel: <IRQ> [<ffffffff8106d6c8>] ? warn_slowpath_common+0x88/0xc0
Mar 26 18:48:59 proxmox kernel: [<ffffffff8106d7b6>] ? warn_slowpath_fmt+0x46/0x50
Mar 26 18:48:59 proxmox kernel: [<ffffffff812a7c1b>] ? find_iova+0x5b/0x90
Mar 26 18:48:59 proxmox kernel: [<ffffffff812abe5f>] ? intel_unmap_page+0x15f/0x180
Mar 26 18:48:59 proxmox kernel: [<ffffffffa0076a65>] ? bnx2_poll_work+0x155/0x11d0 [bnx2]
Mar 26 18:48:59 proxmox kernel: [<ffffffff810eb300>] ? handle_IRQ_event+0x60/0x170
Mar 26 18:48:59 proxmox kernel: [<ffffffff810ed9e8>] ? handle_edge_irq+0x98/0x180
Mar 26 18:48:59 proxmox kernel: [<ffffffff8111dd86>] ? group_sched_in+0x26/0x170
Mar 26 18:48:59 proxmox kernel: [<ffffffffa0077b1d>] ? bnx2_poll_msix+0x3d/0xd0 [bnx2]
Mar 26 18:48:59 proxmox kernel: [<ffffffff81458f83>] ? net_rx_action+0x103/0x2f0
Mar 26 18:48:59 proxmox kernel: [<ffffffff81076573>] ? __do_softirq+0x103/0x260
Mar 26 18:48:59 proxmox kernel: [<ffffffff8100c2ac>] ? call_softirq+0x1c/0x30
Mar 26 18:48:59 proxmox kernel: [<ffffffff8100def5>] ? do_softirq+0x65/0xa0
Mar 26 18:48:59 proxmox kernel: [<ffffffff8107639d>] ? irq_exit+0xcd/0xd0
Mar 26 18:48:59 proxmox kernel: [<ffffffff81526545>] ? do_IRQ+0x75/0xf0
Mar 26 18:48:59 proxmox kernel: [<ffffffff8100ba93>] ? ret_from_intr+0x0/0x11
Mar 26 18:48:59 proxmox kernel: <EOI> [<ffffffff812d1dbe>] ? intel_idle+0xde/0x170
Mar 26 18:48:59 proxmox kernel: [<ffffffff812d1da1>] ? intel_idle+0xc1/0x170
Mar 26 18:48:59 proxmox kernel: [<ffffffff8109e52d>] ? sched_clock_cpu+0xcd/0x110
Mar 26 18:48:59 proxmox kernel: [<ffffffff81421827>] ? cpuidle_idle_call+0xa7/0x140
Mar 26 18:48:59 proxmox kernel: [<ffffffff8100a023>] ? cpu_idle+0xb3/0x110
Mar 26 18:48:59 proxmox kernel: [<ffffffff81505555>] ? rest_init+0x85/0x90
Mar 26 18:48:59 proxmox kernel: [<ffffffff81c2ef6e>] ? start_kernel+0x412/0x41e
Mar 26 18:48:59 proxmox kernel: [<ffffffff81c2e33a>] ? x86_64_start_reservations+0x125/0x129
Mar 26 18:48:59 proxmox kernel: [<ffffffff81c2e438>] ? x86_64_start_kernel+0xfa/0x109
Mar 26 18:48:59 proxmox kernel: ---[ end trace 83e11cbc4ff8ba9c ]---

Steps we have taken:

- Updated firmware (BIOS, RAID)
- Compiled in the newest hpsa driver (3.2.0-3)
- Disabled edac_core and i7core_edac

None of these changes helped, so we're running 2.6.32-16-pve for now.

Has anyone else seen this error? Are there any fixes available?
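
For anyone comparing notes, here is a rough way to count how often this warning shows up in a log. It is only a sketch: the function name is made up for illustration, and the default log path /var/log/kern.log is an assumption (pass your own syslog file if yours differs).

```shell
#!/bin/sh
# Count intel-iommu "unmatched page" warnings in a syslog-style file.
# count_iommu_warnings is a hypothetical helper name, and the default
# path is an assumption; pass another file as the first argument.
count_iommu_warnings() {
    logfile="${1:-/var/log/kern.log}"
    # grep -c prints the number of matching lines (0 when there are none).
    # "|| true" keeps the exit status clean when nothing matches.
    grep -c 'Driver unmaps unmatched page' "$logfile" || true
}
```

Calling `count_iommu_warnings /var/log/syslog` before and after a kernel change gives a quick indication of whether the warning is still being logged.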

- - - Updated - - -

root@proxmox:~# pveversion -v
pve-manager: 2.3-13 (pve-manager/2.3/7946f1f1)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.3-93
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
pve-kernel-2.6.32-19-pve: 2.6.32-93
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-4
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-36
qemu-server: 2.3-18
pve-firmware: 1.0-21
libpve-common-perl: 1.0-49
libpve-access-control: 1.0-26
libpve-storage-perl: 2.3-6
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.4-8
ksm-control-daemon: 1.1-1
 
We wound up having some problems on pve-16 as well. Removing intel_iommu=on from /etc/default/grub cleared them up, and we are now running smoothly on pve-16. Cheers!
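
If you would rather not edit the file by hand, the removal can be sketched roughly like this. The function name is hypothetical, and it assumes the kernel options sit inside a quoted value (as in the stock GRUB_CMDLINE_LINUX_DEFAULT="..." line) and that GNU sed is available.

```shell
#!/bin/sh
# Remove a plain "intel_iommu=on" token from a grub defaults file.
# strip_intel_iommu is a hypothetical helper name. It assumes the option
# appears inside a quoted value and is followed by a space or the
# closing quote; other forms are left untouched.
strip_intel_iommu() {
    file="${1:-/etc/default/grub}"
    # Keep a backup next to the original before changing anything.
    cp -p "$file" "$file.bak"
    # Drop the token plus any whitespace in front of it, keeping the
    # character that follows (a space or the closing quote).
    sed -i 's/[[:space:]]*intel_iommu=on\([[:space:]"]\)/\1/g' "$file"
}
```

Run it as root, then run update-grub and reboot. After the reboot, `cat /proc/cmdline` should no longer show intel_iommu=on.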
 
In case you are wondering how to boot an older kernel, here is how I did it.
I changed the default value in GRUB to 2, which is the third entry in the GRUB menu (the 2.6.32-17-pve kernel on our servers). Note that the numbering starts at 0, not 1.

vi /etc/default/grub

GRUB_DEFAULT=2

Then update grub with this command.
update-grub
The pve-17 kernel will be used after the next reboot.
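
The same edit can be done without opening an editor. This is just a sketch: the function name is made up, and it assumes the file already contains a line starting with GRUB_DEFAULT= and that GNU sed is available.

```shell
#!/bin/sh
# Set GRUB_DEFAULT to a zero-based menu index in a grub defaults file.
# set_grub_default is a hypothetical helper name; it only rewrites an
# existing GRUB_DEFAULT= line and does not add one if it is missing.
set_grub_default() {
    index="$1"
    file="${2:-/etc/default/grub}"
    # Keep a backup so the previous default is easy to restore.
    cp -p "$file" "$file.bak"
    sed -i "s/^GRUB_DEFAULT=.*/GRUB_DEFAULT=$index/" "$file"
}
```

For the case above that would be `set_grub_default 2` as root, followed by update-grub as before.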




You can check the order of your kernel listing with this command.

less /boot/grub/grub.cfg

On our servers it looked like this.
(Ignore the code at the top of the grub.cfg file.)
### BEGIN /etc/grub.d/10_linux ###
menuentry 'Proxmox Virtual Environment GNU/Linux, with Linux 2.6.32-19-pve' --class proxmox --class gnu-linux --class gnu --class os {
insmod part_msdos
insmod ext2
set root='(hd0,msdos1)'
search --no-floppy --fs-uuid --set 60faf10c-6b1c-4247-bb4a-c21efedbb59a
echo 'Loading Linux 2.6.32-19-pve ...'
linux /vmlinuz-2.6.32-19-pve root=/dev/mapper/pve-root ro quiet
echo 'Loading initial ramdisk ...'
initrd /initrd.img-2.6.32-19-pve
}
menuentry 'Proxmox Virtual Environment GNU/Linux, with Linux 2.6.32-18-pve' --class proxmox --class gnu-linux --class gnu --class os {
insmod part_msdos
insmod ext2
set root='(hd0,msdos1)'
search --no-floppy --fs-uuid --set 60faf10c-6b1c-4247-bb4a-c21efedbb59a
echo 'Loading Linux 2.6.32-18-pve ...'
linux /vmlinuz-2.6.32-18-pve root=/dev/mapper/pve-root ro quiet
echo 'Loading initial ramdisk ...'
initrd /initrd.img-2.6.32-18-pve
}
menuentry 'Proxmox Virtual Environment GNU/Linux, with Linux 2.6.32-17-pve' --class proxmox --class gnu-linux --class gnu --class os {
insmod part_msdos
insmod ext2
set root='(hd0,msdos1)'
search --no-floppy --fs-uuid --set 60faf10c-6b1c-4247-bb4a-c21efedbb59a
echo 'Loading Linux 2.6.32-17-pve ...'
linux /vmlinuz-2.6.32-17-pve root=/dev/mapper/pve-root ro quiet
echo 'Loading initial ramdisk ...'
initrd /initrd.img-2.6.32-17-pve
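
Instead of counting entries by eye, something like the following prints each menu entry with its zero-based index. Treat it as a sketch: the function name is hypothetical, and it assumes a flat menu like the one above, with each menuentry at the start of its own line and no submenus.

```shell
#!/bin/sh
# Print "index<TAB>title" for every menuentry line in a grub.cfg.
# list_menuentries is a hypothetical helper name. It assumes a flat
# menu (no submenu blocks) and single-quoted entry titles.
list_menuentries() {
    cfg="${1:-/boot/grub/grub.cfg}"
    # Split on single quotes so that field 2 is the entry title;
    # n starts at 0, matching the numbering GRUB_DEFAULT uses.
    awk -F"'" '/^[ \t]*menuentry / { printf "%d\t%s\n", n++, $2 }' "$cfg"
}
```

With the listing above, the 2.6.32-17-pve entry comes out with index 2, which is the value used for GRUB_DEFAULT.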
 