Bug on Proxmox 1.8

reloaded

Member
Aug 25, 2009
Hello everyone,

I would like to report a bug that is causing me many problems with Proxmox 1.8.

I am using Proxmox 1.8 with an Adaptec 2405. I bought this controller because it is among the hardware supported by Proxmox.

The controller works well for a few hours or days, and then I get the following error message:
kernel: Disabling IRQ #16

Then everything starts to run slowly: the IO delay goes from 0.0 to at least 38, the load goes up dramatically, and all the VPSs malfunction.

The question is: is there any fix for this bug?

Searching on Google, I found more people with this problem, but all the solutions I find are for Red Hat or Ubuntu-based systems.

If anyone has a solution, I would appreciate it.

Thank you very much for your help
 
pveperf
CPU BOGOMIPS: 54262.70
REGEX/SECOND: 1404610
HD SIZE: 15.75 GB (/dev/mapper/pve-root)
BUFFERED READS: 1.49 MB/sec
AVERAGE SEEK TIME: 124.00 ms
FSYNCS/SECOND: 1.29
DNS EXT: 134.33 ms
DNS INT: 206.44 ms (unelink.net)
 
also post the result of 'pveversion -v'

any logs in the ASM?
 
pveversion -v
pve-manager: 1.8-18 (pve-manager/1.8/6070)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.8-33
pve-kernel-2.6.32-4-pve: 2.6.32-33
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-17
vncterm: 0.9-2
vzctl: 3.0.28-1pve1
vzdump: 1.2-14
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6


I only use OpenVZ (vz) machines, no KVM.
 
How can I get the ASM logs?
Thank you very much for your concern.
 
Thank you very much, tom, for all the trouble you are taking!

I installed ASM and I can see the controller from the Windows application.

But I can find no way to see the logs. I cannot find them in /var/log.

Where can the logs be?

Thanks again
 
I am talking about the ASM logs: RAID status, disk status, ...
See Adaptec documentation about ASM.
 
  • which disks do you use? (detailed model number)
  • raid level?
  • did you enable write cache on the disks?
  • did you enable write cache on the controller?
 
OK thanks
  • which disks do you use? Seagate Barracuda 7200.12 1TB
  • raid level? RAID 10
  • did you enable write cache on the disks? The disks have 32 MB of internal cache.
  • did you enable write cache on the controller? Yes, 128 MB of cache on the controller.
I also checked that the hard drives are on Adaptec's compatibility list for the 2405 controller.

I am now using Seagate drives, but the problem also occurs with WD Caviar Blue drives with 32 MB of cache.
As I said, I tried the controller cache both on and off, and in both cases I got the error.

Any idea?

Thanks tom!
 
More information:

When all goes well on the server, I run:


# lspci -v

and the RAID controller appears as follows:

02:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)
Subsystem: Adaptec ASR-2405
Flags: bus master, fast devsel, latency 0, IRQ 16
Memory at fb800000 (64-bit, non-prefetchable) [size=2M]
[virtual] Expansion ROM at dc000000 [disabled] [size=512K]
Capabilities: [98] Power Management version 2
Capabilities: [a0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable-
Capabilities: [d0] Express Endpoint, MSI 00
Capabilities: [90] Vital Product Data <?>
Capabilities: [100] Advanced Error Reporting <?>
Kernel driver in use: aacraid
Kernel modules: aacraid

When things go wrong, the entry for the RAID controller disappears and this message appears:

kernel: Disabling IRQ #16


Why does Proxmox disable IRQ 16?

I've been searching the forum and there are people with a similar problem.

Any help?
 
looks like an IRQ issue with your devices. which device is also using IRQ 16?

Please report your detailed hardware for reference. Also check the BIOS settings (are upgrades available?), try another physical slot for the card (if available), and try disabling other cards that might conflict.
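One way to see which devices share IRQ 16 is to read the kernel's own table in /proc/interrupts. The following is a minimal Python sketch, assuming the usual layout of that file (a header row of CPU columns, then one row per IRQ with the handler names at the end, separated by commas). The function name `shared_irqs` is made up for this example, and it is not a Proxmox tool.

```python
import os


def shared_irqs(text):
    """Return {irq_number: [handler names]} for IRQs with more than one handler.

    `text` is the content of /proc/interrupts. The layout is assumed to be:
    a header line naming the CPUs, then "<irq>: <count per CPU> <chip> <handlers>".
    """
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # number of per-CPU counter columns
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():  # skip NMI, LOC and other non-numeric rows
            continue
        fields = rest.split(None, ncpu)
        if len(fields) <= ncpu:  # no description after the counters
            continue
        pieces = fields[ncpu].split(",")
        first = pieces[0].split()
        if not first:
            continue
        # The first piece still carries the interrupt chip name; the handler
        # name is its last word. Further handlers follow after each comma.
        handlers = [first[-1]] + [p.strip() for p in pieces[1:] if p.strip()]
        if len(handlers) > 1:
            shared[int(label)] = handlers
    return shared


if __name__ == "__main__":
    path = "/proc/interrupts"
    if os.path.exists(path):
        with open(path) as f:
            for irq, handlers in sorted(shared_irqs(f.read()).items()):
                print("IRQ %d is shared by: %s" % (irq, ", ".join(handlers)))
```

Run on the affected host, it prints one line per shared IRQ; any row listing aacraid together with another driver is worth reporting here.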
 
Hello, I tested several motherboards:

Gigabyte H67MA-UB2H-B3, socket 1155, Intel i7 processor

Gigabyte H55M-UD3H, socket 1156, Intel i7 processor

ASUS P8H67-M PRO, socket 1155, Intel i7 processor


RAM:

16 GB Kingston (4x4 GB)

RAID Card:

Adaptec 2405



In all cases I got the error. It never appears at the same point in time, but we do reproduce it on every board.

The error usually occurs when several OpenVZ VPSs are running on the server and are in active use.

While the server is running without the error, performance is optimal with all VPSs running: a load of 0.08 and an IO delay of 0.0.

But once the message appears, performance collapses: the load rises to 18-40 and the IO delay to between 30 and 60.
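Since the failure cannot be triggered at will, it can help to log the IO wait figure so the moment of failure can be matched against the kernel log. The following is a rough Python sketch, assuming that the IO delay shown by Proxmox corresponds approximately to the kernel's iowait counter in /proc/stat (that mapping is an assumption, not something confirmed in this thread). The function names are made up for the example.

```python
import os
import time


def cpu_counters(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of integers.

    Only the first eight counters are kept (user, nice, system, idle, iowait,
    irq, softirq, steal), because guest time is already included in user time.
    """
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    return [int(x) for x in parts[1:9]]


def iowait_percent(before, after):
    """Share of CPU time spent waiting for IO between two 'cpu' lines, in percent."""
    b, a = cpu_counters(before), cpu_counters(after)
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas)
    if total <= 0 or len(deltas) < 5:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    if os.path.exists("/proc/stat"):
        first = read_cpu_line()
        time.sleep(1)  # sampling interval chosen arbitrarily for the example
        second = read_cpu_line()
        print("iowait over the last second: %.1f%%" % iowait_percent(first, second))
```

Called from cron or a loop and written to a file with a timestamp, this gives a record to compare with the time of the "Disabling IRQ #16" line.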


Note that no other cards are installed in the machine, and the problem began when we added the RAID card. Previously we used these boards to run Proxmox without a RAID card, with the same configuration, and they always worked perfectly.

Thank you very much for your time and help.
 
At the beginning nothing was updated, but now everything is up to date.

Since the failure happens on several different boards, I doubt that it comes from the firmware.

In any case, I tried updating and it did not solve the problem.


Another problem is that I cannot reproduce the error at will, so I cannot apply a fix and then test whether the error comes back.


I've always used Proxmox and I am very surprised by its behavior in this case. A random error is the worst kind I can have. For now we have withdrawn the server from production, and it has run for 2 days without failing :(.

Thank you all very much for your help.
 
More information:

DMESG:

usb 1-1.5: USB disconnect, address 3
irq 16: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.32-4-pve #1
Call Trace:
<IRQ> [<ffffffff81097bfd>] ? __report_bad_irq+0x30/0x7d
[<ffffffff81097d4f>] ? note_interrupt+0x105/0x16e
[<ffffffff810165b1>] ? read_tsc+0xa/0x20
[<ffffffff810983b4>] ? handle_fasteoi_irq+0x93/0xb5
[<ffffffff8101333f>] ? handle_irq+0x17/0x1d
[<ffffffff81012999>] ? do_IRQ+0x57/0xb6
[<ffffffff81011593>] ? ret_from_intr+0x0/0x11
<EOI> [<ffffffffa01194f9>] ? acpi_idle_enter_bm+0x27d/0x2af [processor]
[<ffffffffa01194f2>] ? acpi_idle_enter_bm+0x276/0x2af [processor]
[<ffffffff812508ae>] ? cpuidle_idle_call+0x94/0xee
[<ffffffff8100ff09>] ? cpu_idle+0xa2/0xda
[<ffffffff81528140>] ? early_idt_handler+0x0/0x71
[<ffffffff81528cea>] ? start_kernel+0x3f2/0x3fe
[<ffffffff815283b7>] ? x86_64_start_kernel+0xf9/0x106
handlers:
[<ffffffffa0074bee>] (ata_sff_interrupt+0x0/0xbf [libata])
[<ffffffffa00a2f97>] (aac_rx_intr_message+0x0/0x9f [aacraid])
Disabling IRQ #16
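The "handlers:" section of this report names every driver registered on the IRQ at the moment the kernel gave up on it. The following is a small Python sketch for pulling that list out of a saved dmesg, assuming the report format shown above; the function name `parse_nobody_cared` is made up for this example.

```python
import re

# "irq 16: nobody cared ..." opens a report.
REPORT_RE = re.compile(r"irq (\d+): nobody cared")
# "(aac_rx_intr_message+0x0/0x9f [aacraid])" names one handler and its module.
HANDLER_RE = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?\)")


def parse_nobody_cared(log):
    """Return a list of {"irq": n, "handlers": [(function, module), ...]}.

    One entry is produced per "nobody cared" report found in the log text.
    The module is None for handlers built into the kernel.
    """
    reports = []
    current = None
    in_handlers = False
    for line in log.splitlines():
        m = REPORT_RE.search(line)
        if m:
            current = {"irq": int(m.group(1)), "handlers": []}
            reports.append(current)
            in_handlers = False
            continue
        if current is None:
            continue
        if line.strip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            h = HANDLER_RE.search(line)
            if h:
                current["handlers"].append((h.group(1), h.group(2)))
            else:
                # First line that is not a handler ends the report.
                in_handlers = False
                current = None
    return reports


if __name__ == "__main__":
    excerpt = "\n".join([
        'irq 16: nobody cared (try booting with the "irqpoll" option)',
        "handlers:",
        "[<ffffffffa0074bee>] (ata_sff_interrupt+0x0/0xbf [libata])",
        "[<ffffffffa00a2f97>] (aac_rx_intr_message+0x0/0x9f [aacraid])",
        "Disabling IRQ #16",
    ])
    for report in parse_nobody_cared(excerpt):
        modules = ", ".join(mod or "built-in" for _, mod in report["handlers"])
        print("IRQ %d handlers: %s" % (report["irq"], modules))
```

For the trace above this lists libata and aacraid, which means the RAID controller and an ATA controller were both registered on IRQ 16 when it was disabled.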




lspci -v:

03:00.0 IDE interface: VIA Technologies, Inc. VT6415 PATA IDE Host Controller (prog-if 85 [Master SecO PriO])
Subsystem: ASUSTeK Computer Inc. Device 838f
Flags: bus master, fast devsel, latency 0, IRQ 16
I/O ports at e040
I/O ports at e030
I/O ports at e020
I/O ports at e010
I/O ports at e000
Expansion ROM at fbc00000 [disabled] [size=64K]
Capabilities: [50] Power Management version 3
Capabilities: [70] Message Signalled Interrupts: Mask+ 64bit+ Queue=0/0 Enable-
Capabilities: [90] Express Legacy Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting <?>
Capabilities: [130] Device Serial Number 00-40-63-ff-ff-63-40-00
Kernel driver in use: pata_via
Kernel modules: pata_via, ata_generic
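Both lspci entries in this thread show IRQ 16 and "Enable-" on the Message Signalled Interrupts line, which means MSI is not enabled and the device uses its legacy interrupt line. The following is a Python sketch that lists such devices for a given IRQ from saved `lspci -v` output, assuming the text layout shown above. The function names are made up for this example, and the sketch only reports the sharing; it does not change any setting.

```python
import re

DEVICE_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) (.+)$")
IRQ_RE = re.compile(r"\bIRQ (\d+)")
MSI_RE = re.compile(r"(?:MSI:|Message Signalled Interrupts:).*Enable([+-])")
DRIVER_RE = re.compile(r"Kernel driver in use: (\S+)")


def parse_lspci(text):
    """Split `lspci -v` output into one dict per device.

    Keys: slot, name, irq (int or None), msi_enabled (bool or None), driver.
    """
    devices = []
    dev = None
    for raw in text.splitlines():
        line = raw.strip()
        m = DEVICE_RE.match(line)
        if m:  # a new device block starts with its bus address
            dev = {"slot": m.group(1), "name": m.group(2),
                   "irq": None, "msi_enabled": None, "driver": None}
            devices.append(dev)
            continue
        if dev is None:
            continue
        if line.startswith("Flags:"):
            irq = IRQ_RE.search(line)
            if irq:
                dev["irq"] = int(irq.group(1))
        msi = MSI_RE.search(line)
        if msi:
            dev["msi_enabled"] = (msi.group(1) == "+")
        drv = DRIVER_RE.search(line)
        if drv:
            dev["driver"] = drv.group(1)
    return devices


def legacy_users_of_irq(devices, irq):
    """Drivers of devices on `irq` that do not have MSI enabled."""
    return [d["driver"] for d in devices
            if d["irq"] == irq and d["msi_enabled"] is not True]


if __name__ == "__main__":
    excerpt = "\n".join([
        "02:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)",
        "Flags: bus master, fast devsel, latency 0, IRQ 16",
        "Capabilities: [a0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable-",
        "Kernel driver in use: aacraid",
        "03:00.0 IDE interface: VIA Technologies, Inc. VT6415 PATA IDE Host Controller",
        "Flags: bus master, fast devsel, latency 0, IRQ 16",
        "Capabilities: [70] Message Signalled Interrupts: Mask+ 64bit+ Queue=0/0 Enable-",
        "Kernel driver in use: pata_via",
    ])
    print(legacy_users_of_irq(parse_lspci(excerpt), 16))
```

Fed the complete `lspci -v` output of the host, it shows every driver that sits on the same legacy line as aacraid.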


Any idea?

Thanks
 
Can you try completely disabling USB in your mainboard BIOS? I googled around and this helped some people with similar issues.
 
thanks!

I've managed to reproduce the bug by repeatedly plugging and unplugging the USB keyboard; that way the system fails within minutes.

I'm testing:

Disabling USB3: the problem continues.

Disabling all USB: testing now. Edit: the problem continues.

irq 16: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.32-4-pve #1
Call Trace:
<IRQ> [<ffffffff81097bf5>] ? __report_bad_irq+0x30/0x7d
[<ffffffff81097d47>] ? note_interrupt+0x105/0x16e
[<ffffffff811cf713>] ? acpi_hw_read_port+0x2e/0x93
[<ffffffff810983ac>] ? handle_fasteoi_irq+0x93/0xb5
[<ffffffff8101333f>] ? handle_irq+0x17/0x1d
[<ffffffff81012999>] ? do_IRQ+0x57/0xb6
[<ffffffff811ce85f>] ? acpi_hw_read+0x4d/0x54
[<ffffffff81011593>] ? ret_from_intr+0x0/0x11
[<ffffffff81028234>] ? native_apic_mem_write+0x0/0xc
[<ffffffff81054928>] ? __do_softirq+0x97/0x22f
[<ffffffff81096a8c>] ? handle_IRQ_event+0x58/0x126
[<ffffffff81011d6c>] ? call_softirq+0x1c/0x30
[<ffffffff810132eb>] ? do_softirq+0x3f/0x7c
[<ffffffff8105473f>] ? irq_exit+0x78/0xb8
[<ffffffff810129e2>] ? do_IRQ+0xa0/0xb6
[<ffffffff81011593>] ? ret_from_intr+0x0/0x11
<EOI> [<ffffffff81047c87>] ? finish_task_switch+0x44/0xaf
[<ffffffff81047c7d>] ? finish_task_switch+0x3a/0xaf
[<ffffffff813143e5>] ? thread_return+0x4e/0x143
[<ffffffff8100ff3f>] ? cpu_idle+0xd8/0xda
[<ffffffff81528140>] ? early_idt_handler+0x0/0x71
[<ffffffff81528ce7>] ? start_kernel+0x3f2/0x3fe
[<ffffffff815283b4>] ? x86_64_start_kernel+0xf6/0x103
handlers:
[<ffffffffa0020d6d>] (aac_rx_intr_message+0x0/0x78 [aacraid])
[<ffffffffa011c848>] (usb_hcd_irq+0x0/0x7e [usbcore])
Disabling IRQ #16



I also noticed that when Proxmox starts, it does not stop displaying this message (and the network is blocked):

device eth1 entered promiscuous mode
r8169: eth1: link up
r8169: eth1: link up
r8169: eth1: link up
r8169: eth1: link up
r8169: eth1: link up
r8169: eth1: link up
r8169: eth1: link up
r8169: eth1: link up
r8169: eth1: link up
r8169: eth1: link up
and many more ...

Is the problem the USB? Is there any possibility that Proxmox can solve it?
USB booting is very important on the motherboard.

All these problems appear on every motherboard I have tried.

Thank you very much for everything.
 