Hi,
I just installed a new server.
I'm running the latest Proxmox version with the latest firmware on the RAID controller, and I've run into an issue.
I don't know if it's specific to Proxmox, but I'm reporting it here.
Here is the output of "pveversion -v":
When I do a "MegaCli64 -AdpAllInfo -aALL", the server become unresponsive during 2-3 minutes. Web interface is unreacheable.
I can see the following in the log :
The specification indicate the card compatible with debian : https://www.broadcom.com/products/storage/raid-controllers/megaraid-9560-8i
An idea to avoid this issue ?
Edit: The issue doesn't seems present with storcli
I just installed a new server.
I use the latest proxmox version with the latest firmware raid controller and I encounter an issue.
I don't know if it specific to proxmox but I report it here.
proxmox-ve: 7.2-1 (running kernel: 5.15.39-3-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-8
pve-kernel-helper: 7.2-8
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.39-3-pve: 5.15.39-3
pve-kernel-5.15.39-1-pve: 5.15.39-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-7
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.5-1
proxmox-backup-file-restore: 2.2.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 6.2.0-11
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
Basics :
======
Controller = 0
Model = MegaRAID 9560-8i 4GB
Serial Number = SPC1629132
Current Controller Date/Time = 08/05/2022, 14:21:40
Current System Date/time = 08/05/2022, 16:21:41
SAS Address = 500062b20b7b5800
PCI Address = 00:65:00:00
Mfg Date = 04/29/22
Rework Date = 00/00/00
Revision No = 01004
Version :
=======
Firmware Package Build = 52.21.0-4428
Firmware Version = 5.210.02-3663
PSOC FW Version = 0x000C
NVDATA Version = 5.2100.00-0528
CBB Version = 22.25.04.00
Bios Version = 7.21.01.0_0x07150200
HII Version = 07.21.05.00
HIIA Version = 07.21.05.00
Driver Name = megaraid_sas
Driver Version = 07.717.02.00-rc1
When I do a "MegaCli64 -AdpAllInfo -aALL", the server become unresponsive during 2-3 minutes. Web interface is unreacheable.
I can see the following in the log :
Code:
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: FW supports atomic descriptor : Yes
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: FW provided supportMaxExtLDs: 1 max_lds: 240
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: controller type : MR(4096MB)
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: Online Controller Reset(OCR) : Enabled
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: Secure JBOD support : Yes
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: NVMe passthru support : Yes
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: FW provided TM TaskAbort/Reset timeout : 6 secs/60 secs
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: JBOD sequence map support : Yes
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: PCI Lane Margining support : Yes
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: return -EBUSY from megasas_refire_mgmt_cmd 4330 cmd 0x5 opcode 0x10b0100
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: return -EBUSY from megasas_mgmt_fw_ioctl 8402 cmd 0x5 opcode 0x10b0100 cmd->cmd_status_drv 0x3
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: waiting for controller reset to finish
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: Adapter is OPERATIONAL for scsi:6
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: Snap dump wait time : 15
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: Reset successful for scsi6.
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: 3514 (713024564s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: 3517 (713024571s/0x0020/CRIT) - Controller encountered an error and was reset
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: scanning for scsi6...
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: 3543 (713024603s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: 3546 (713024610s/0x0020/CRIT) - Controller encountered an error and was reset
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: scanning for scsi6...
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: 3572 (713024644s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: 3575 (713024651s/0x0020/CRIT) - Controller encountered an error and was reset
Aug 05 16:24:33 proxmox kernel: megaraid_sas 0000:65:00.0: scanning for scsi6...
Aug 05 16:24:33 proxmox pvedaemon[1293]: <root@pam> successful auth for user 'root@pam'
Aug 05 16:24:33 proxmox pve-firewall[1262]: firewall update time (130.580 seconds)
Aug 05 16:24:34 proxmox pvestatd[1264]: status update time (130.566 seconds)
Aug 05 16:24:38 proxmox pve-ha-lrm[1307]: loop take too long (142 seconds)
Aug 05 16:24:38 proxmox pve-ha-crm[1298]: loop take too long (139 seconds)
Aug 05 16:27:45 proxmox pvedaemon[1291]: <root@pam> successful auth for user 'root@pam'
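In case it helps anyone reproduce this, here is how I trigger it and watch the reset from a second SSH session (a minimal sketch; it assumes the default systemd journal on PVE and that this is the only megaraid_sas controller in the box):
Code:
# Terminal 1: this query is what hangs the host for 2-3 minutes
MegaCli64 -AdpAllInfo -aALL

# Terminal 2: follow the kernel log and filter for the controller
journalctl -kf | grep -i megaraid_sas
The "Fatal firmware error" and "Controller encountered an error and was reset" entries quoted above appear in that second terminal while the MegaCli64 query is stuck.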
The specifications indicate the card is compatible with Debian: https://www.broadcom.com/products/storage/raid-controllers/megaraid-9560-8i
Any idea how to avoid this issue?
Edit: The issue doesn't seem to be present with storcli.
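For anyone landing here with the same controller, these are the storcli queries I use instead of MegaCli64 (a sketch only; it assumes storcli64 is in the PATH and the 9560-8i is controller /c0):
Code:
# Controller summary and full details (roughly what -AdpAllInfo -aALL returned)
storcli64 /c0 show
storcli64 /c0 show all

# Virtual and physical drive status
storcli64 /c0/vall show
storcli64 /c0/eall/sall show
So far none of these have triggered the firmware reset for me.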