'aacraid: Host adapter abort request' errors

fab26

Hi,

I'm running Proxmox 6.3 on a Supermicro server with an Adaptec card and a ZFS RAIDZ1 array of disks.
I get lots of errors, and these messages keep popping up on the console:

aacraid: Outstanding command on (0,1,0,0):
aacraid: Host adapter abort request.

I did some googling, and this seems to be related to certain versions of the Linux kernel, to the firmware of the Adaptec card, or to a SCSI timeout setting.
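For anyone who wants to check the SCSI timeout part: a minimal sketch, assuming the per-disk timeout is exposed at /sys/block/<dev>/device/timeout (in seconds, commonly 30 by default). The helper names and the `sysfs_root` parameter are made up for illustration, and writing the value needs root. Raising the timeout only gives slow commands more time; it is not a confirmed fix for the aacraid aborts.

```python
from pathlib import Path


def _timeout_path(dev: str, sysfs_root: str = "/sys/block") -> Path:
    # Assumed sysfs location of the SCSI command timeout for a disk like "sdg".
    return Path(sysfs_root) / dev / "device" / "timeout"


def read_scsi_timeout(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Return the SCSI command timeout in seconds for the given disk."""
    return int(_timeout_path(dev, sysfs_root).read_text().strip())


def write_scsi_timeout(dev: str, seconds: int, sysfs_root: str = "/sys/block") -> None:
    """Set the SCSI command timeout in seconds (requires root on a real system)."""
    _timeout_path(dev, sysfs_root).write_text(f"{seconds}\n")


# Example usage on a real host (device name is only an example):
#   print(read_scsi_timeout("sdg"))
#   write_scsi_timeout("sdg", 90)
```

The setting does not survive a reboot, so it would have to be reapplied (for example from a udev rule or a startup script) if it turns out to help.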

Has anyone experienced this kind of issue?

Thanks !


--
Proxmox 6.3-1, Linux 5.4.101-1-pve,
32 x Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (2 Sockets), 96GB RAM
Adaptec Series 7 6G SAS/PCIe 3 (rev01) in HBA Mode, RAIDZ1 4x8TB Seagate IronWolf
 
The RAID adapter is set up in HBA mode (pass-through), so I think that should not be an issue with ZFS.

BTW, the errors are back; the firmware update did not fix anything.

Any thoughts?
 
Hi Squirell,

Sorry for the long delay, I was no longer able to physically access this server
(we are under a pretty strict 'stay at home' order here, again).

arcconf logs and config: see the attached text files below.
There are a bunch of "AFM700_BU_FATAL_ERROR_ALERT" and "FSA_EM_EXPANDED_EVENT" entries,
and also some more errors I saw today...

Thank you very much for your help!

Fab

[234521.856660] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,11,0):
[234534.144875] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,11,0):
[234534.146587] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,11,0):
[234534.148168] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,11,0):
[234534.168878] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,11,0):
[234534.170539] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,11,0):
[234535.021438] sd 0:1:11:0: [sdg] tag#707 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[234535.021450] sd 0:1:11:0: [sdg] tag#707 CDB: Read(16) 88 00 00 00 00 03 6f 3c cc 80 00 00 00 08 00 00
[234535.021454] blk_update_request: I/O error, dev sdg, sector 14751157376 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
[234535.021505] aacraid: Host bus reset request. SCSI hang ?
[234535.022323] zio pool=tank0 vdev=/dev/sdg1 error=5 type=1 offset=7552591527936 size=4096 flags=180880
[234535.023081] aacraid 0000:82:00.0: outstanding cmd: midlevel-2
[234535.023082] aacraid 0000:82:00.0: outstanding cmd: lowlevel-0
[234535.023083] aacraid 0000:82:00.0: outstanding cmd: error handler-5
[234535.023083] aacraid 0000:82:00.0: outstanding cmd: firmware-0
[234535.023084] aacraid 0000:82:00.0: outstanding cmd: kernel-0
[234535.023161] aacraid 0000:82:00.0: Controller reset type is 3
[234535.023900] aacraid 0000:82:00.0: Issuing IOP reset
[234568.332404] aacraid 0000:82:00.0: IOP reset succeeded
[234568.390079] aacraid: Comm Interface type2 enabled
[234577.428169] aacraid 0000:82:00.0: Scheduling bus rescan
[234587.827958] sd 0:1:11:0: [sdg] 15628053168 512-byte logical blocks: (8.00 TB/7.28 TiB)
[234587.827960] sd 0:1:11:0: [sdg] 4096-byte physical blocks
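To see which target the aborts cluster on, a log like the one above can be tallied with a short script. This is only a sketch: the regular expressions are written against the message formats pasted in this thread, and the `summarize` function and `SAMPLE` text are illustrative, not part of any aacraid tooling.

```python
import re
from collections import Counter

# Patterns matched against the aacraid messages quoted in this thread.
ABORT_RE = re.compile(r"aacraid: Host adapter abort request")
RESET_RE = re.compile(r"aacraid: Host bus reset request")
TARGET_RE = re.compile(
    r"aacraid: Outstanding commands? on \((\d+),(\d+),(\d+),(\d+)\)"
)


def summarize(log_text: str) -> dict:
    """Count abort requests, bus resets, and affected (host, channel, id, lun) tuples."""
    aborts = 0
    resets = 0
    targets = Counter()
    for line in log_text.splitlines():
        if ABORT_RE.search(line):
            aborts += 1
        elif RESET_RE.search(line):
            resets += 1
        else:
            match = TARGET_RE.search(line)
            if match:
                targets[tuple(int(g) for g in match.groups())] += 1
    return {"aborts": aborts, "resets": resets, "targets": targets}


# A few lines copied from the log above.
SAMPLE = """\
[234534.144875] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,11,0):
[234534.146587] aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,11,0):
[234535.021505] aacraid: Host bus reset request. SCSI hang ?
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

If every abort points at the same target, as with (0,1,11,0) / sdg here, that one disk, its cable, or its backplane slot is worth checking too, not just the driver.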
 


Apply this patch to your kernel; it should fix the regression you are hitting in the aacraid driver. I hit this error myself and have confirmed that this patch fixes it:
https://patchwork.kernel.org/project/linux-scsi/patch/20190819163546.915-2-khorenko@virtuozzo.com/

Additional details:
https://lore.kernel.org/linux-scsi/20190819163546.915-1-khorenko@virtuozzo.com/

Edit:
I'm actually still seeing this error now, but it looks to be less frequent and doesn't have the resets afterwards:
Code:
aacraid: Host adapter abort request.
aacraid: Outstanding commands on (0,1,47,0):
 
Very interesting! That's exactly my issue.
During my googling I remember seeing this forum thread, but I did not dig deep enough to find the patch.
I will try it!

Thanks!
 
Sorry for bumping an old post, but since the issue still seems relevant, another workaround might be setting

Code:
elevator=noop
as a kernel option.

I haven't done a long-running test with it, but I've had no issues after a couple of hours of high I/O so far.
 
Hi MadMakz,

I may have to work on this server again (it is offline at the moment).
What exactly do you mean by 'elevator=noop' as a kernel option?
Are you tweaking the I/O scheduler on the host (Proxmox) or on the guest (OMV)?

Thanks,
Fab
 
Hi, did you ever find a fix or workaround? I am having a similar problem with my Adaptec Series 7 card. Proxmox is 7.4-3 and running on ZFS, but not through the Adaptec card, which is in HBA mode and only passing through some ext4 drives.

From what I can gather, noop is deprecated in current versions of Linux, and none should be set instead.
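To check what a disk is using right now, here is a rough sketch that parses the contents of /sys/block/<dev>/queue/scheduler, where the active scheduler is shown in square brackets (for example "[mq-deadline] none"). The function names are made up for illustration. As far as I know, the elevator= boot parameter is ignored on recent kernels, so the scheduler would be set per device through this sysfs file instead.

```python
def parse_scheduler(text: str):
    """Return (active, available) from sysfs contents such as '[mq-deadline] none'."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # strip the brackets marking the active scheduler
            active = name
        available.append(name)
    return active, available


def pick_passthrough(available):
    """Prefer 'none' (multi-queue kernels), fall back to 'noop' (older kernels)."""
    for candidate in ("none", "noop"):
        if candidate in available:
            return candidate
    return None


# Example usage on a real host (device name is only an example; writing needs root):
#   from pathlib import Path
#   path = Path("/sys/block/sdg/queue/scheduler")
#   active, available = parse_scheduler(path.read_text())
#   choice = pick_passthrough(available)
#   if choice and choice != active:
#       path.write_text(choice + "\n")
```

Like the boot option, this is only a possible workaround; nobody in this thread has confirmed it over a long run.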
 