2.6.32-1-pve: aacraid: Host adapter reset request. SCSI hang ?

lbm

Member
Feb 11, 2010
Hi,
I would like to report system freezes during heavy writes to disk. Here is my test environment:

HW:
AMD Opteron 1212 2.0GHz 2Core
4GB RAM
MB Supermicro H8SSL-i2
4 x WD 2500YS-01SHB0 RE16
Adaptec AAR-2420SA

SW:
Lenny with proxmox-ve-2.6.32
Code:
Linux version 2.6.32-1-pve (root@oahu) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #1 SMP Fri Jan 15 11:37:39 CET 2010 ()
...
kernel: aacraid 0000:02:01.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
kernel: IRQ 20/aacraid: IRQF_DISABLED is not guaranteed on shared IRQs
kernel: AAC0: kernel 5.2-0[15611] Jan 18 2008
kernel: AAC0: monitor 5.2-0[15611]
kernel: AAC0: bios 5.2-0[15611]
kernel: AAC0: serial 0502EB
kernel: scsi6 : aacraid
kernel: scsi 6:0:0:0: Direct-Access     Adaptec  arr1             V1.0 PQ: 0 ANSI: 2
kernel: scsi 6:1:0:0: Direct-Access     WDC WD25 00YS-01SHB       20.0 PQ: 0 ANSI: 2
kernel: scsi 6:1:1:0: Direct-Access     WDC WD25 00YS-01SHB       20.0 PQ: 0 ANSI: 2
kernel: scsi 6:1:2:0: Direct-Access     WDC WD25 00YS-01SHB       20.0 PQ: 0 ANSI: 2
kernel: scsi 6:1:3:0: Direct-Access     WDC WD25 00YS-01SHB       20.0 PQ: 0 ANSI: 2
kernel: sd 6:0:0:0: Attached scsi generic sg0 type 0
kernel: sd 6:0:0:0: [sda] 979763200 512-byte logical blocks: (501 GB/467 GiB)
kernel: sd 6:0:0:0: [sda] Write Protect is off
kernel: sd 6:0:0:0: [sda] Mode Sense: 06 00 10 00
kernel: sd 6:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
When the system freezes, the following appears in kern.log:
Code:
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter abort request (6,0,0,0)
kernel: aacraid: Host adapter reset request. SCSI hang ?
kernel: AAC: Host adapter BLINK LED 0xef
kernel: AAC0: adapter kernel panic'd ef.
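For anyone who wants to check their own kern.log for the same pattern, here is a minimal Python sketch. It only assumes the message wording shown in the excerpt above; the function name `summarize_aacraid` and the returned dictionary keys are made up for illustration and are not part of any aacraid tooling.

```python
import re

# Patterns copied from the message wording in the kern.log excerpt above.
ABORT = re.compile(r"aacraid: Host adapter abort request \((\d+),(\d+),(\d+),(\d+)\)")
RESET = re.compile(r"aacraid: Host adapter reset request")
BLINK = re.compile(r"AAC: Host adapter BLINK LED (0x[0-9a-fA-F]+)")


def summarize_aacraid(lines):
    """Count abort requests seen before each reset and collect BLINK LED codes.

    `lines` is any iterable of log lines (for example an open kern.log file).
    """
    aborts = 0
    aborts_before_reset = []  # one entry per reset request
    blink_codes = []
    for line in lines:
        if ABORT.search(line):
            aborts += 1
        elif RESET.search(line):
            aborts_before_reset.append(aborts)
            aborts = 0
        else:
            match = BLINK.search(line)
            if match:
                blink_codes.append(match.group(1))
    return {
        "aborts_before_reset": aborts_before_reset,
        "blink_codes": blink_codes,
        "trailing_aborts": aborts,  # aborts not yet followed by a reset
    }


# The sequence from the excerpt: ten aborts, one reset, one BLINK LED line.
sample = ["kernel: aacraid: Host adapter abort request (6,0,0,0)"] * 10 + [
    "kernel: aacraid: Host adapter reset request. SCSI hang ?",
    "kernel: AAC: Host adapter BLINK LED 0xef",
    "kernel: AAC0: adapter kernel panic'd ef.",
]
summary = summarize_aacraid(sample)
```

To run it against a real log, pass an open file instead of `sample`, e.g. `summarize_aacraid(open("/var/log/kern.log"))`.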
My test procedures:
1. FTP upload of a big file (4GB disk image) over a 1Gbit LAN
(only approx. the first 100MB gets uploaded)
2. postmark
postmark settings:
Code:
set number 60000
set transactions 100000
set size 500 153600
set read 4096
set write 4096
set subdirectories 150
When I install proxmox-ve-2.6.18 both procedures run OK.

Regards, Lubomir
 
Can you test the 2.6.32 kernel from the pvetest repository?
 
Unfortunately, bad things happened...
after a lot of testing:
Code:
# less /var/log/kern.log
-bash: /usr/bin/less: Input/output error
# dmesg 
-bash: /bin/dmesg: Input/output error
on the console:
Code:
aacraid: Host adapter abort request (0,0,6,0)
aacraid: Host adapter reset request. SCSI hang ?
...
sd 4:0:0:0: rejecting I/O to offline device
sd 4:0:0:0: rejecting I/O to offline device
...
I must say that this error occurred after loading the w83793 module for lm-sensors... :(
Without it, everything seems to be all right.
All previous tests (on kernels 2.6.18.2 and 2.6.32.1) were done without the w83793 module.
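
Since the hang seems tied to whether w83793 is loaded, it may help to record the module state before each test run. A minimal Python sketch, assuming the usual /proc/modules layout where the module name is the first field on each line; the helper name `module_loaded` is hypothetical:

```python
import os


def module_loaded(name, proc_modules_text):
    """Return True if `name` is the first field of any line in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


# Illustrative input only; sizes and addresses are invented, not from this machine.
example = (
    "w83793 30000 0 - Live 0x0000000000000000\n"
    "hwmon_vid 3000 1 w83793, Live 0x0000000000000000\n"
    "aacraid 80000 3 - Live 0x0000000000000000\n"
)
example_result = module_loaded("w83793", example)

# On a real Linux host, check the live module list if it is available.
if os.path.exists("/proc/modules"):
    with open("/proc/modules") as f:
        print("w83793 loaded:", module_loaded("w83793", f.read()))
```

Matching on the first field only avoids a false positive when w83793 merely appears in another module's "used by" column.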

Excerpt from the boot sequence of 2.6.32.2:
Code:
May  3 14:11:33 helios kernel: Linux version 2.6.32-2-pve (root@oahu) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #1 SMP Thu Apr 15 10:22:46 CEST 2010 ()
...
May  3 14:11:33 helios kernel: Adaptec aacraid driver 1.1-5[2461]-ms
May  3 14:11:33 helios kernel:  alloc irq_desc for 20 on node 0
May  3 14:11:33 helios kernel:  alloc kstat_irqs on node 0
May  3 14:11:33 helios kernel: aacraid 0000:02:01.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
May  3 14:11:33 helios kernel: IRQ 20/aacraid: IRQF_DISABLED is not guaranteed on shared IRQs
May  3 14:11:33 helios kernel: AAC0: kernel 5.2-0[15611] Jan 18 2008
May  3 14:11:33 helios kernel: AAC0: monitor 5.2-0[15611]
May  3 14:11:33 helios kernel: AAC0: bios 5.2-0[15611]
May  3 14:11:33 helios kernel: AAC0: serial 0502EB
May  3 14:11:33 helios kernel: scsi6 : aacraid
May  3 14:11:33 helios kernel: scsi 6:0:0:0: Direct-Access     Adaptec  arr1             V1.0 PQ: 0 ANSI: 2
Excerpt from the boot sequence of 2.6.18.2:
Code:
May  3 10:00:49 helios kernel: Linux version 2.6.18-2-pve (root@oahu) (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 SMP Mon Feb 1 10:45:26 CET 2010
...
May  3 10:00:49 helios kernel: Adaptec aacraid driver 1.1-5[2461]
May  3 10:00:49 helios kernel: ACPI: PCI Interrupt 0000:02:01.0[A] -> GSI 20 (level, low) -> IRQ 185
May  3 10:00:49 helios kernel: AAC0: kernel 5.2-0[15611] Jan 18 2008
May  3 10:00:49 helios kernel: AAC0: monitor 5.2-0[15611]
May  3 10:00:49 helios kernel: AAC0: bios 5.2-0[15611]
May  3 10:00:49 helios kernel: AAC0: serial 0502EB
May  3 10:00:49 helios kernel: scsi4 : aacraid
May  3 10:00:49 helios kernel:  Vendor: Adaptec   Model: arr1              Rev: V1.0