Need Help - Fujitsu 9300-8I SAS 12G SATA 6G PCI-e Controller HBA IT mode - crashing VM - EXT4-FS errors

chrischambers

Member
Nov 1, 2022
UPDATE: at first I thought it was a Proxmox issue, but I now think it is an LSI issue. Please refer to the later posts for more information.



I have been having a bit of a problem with my proxmox-backup-client.
For the past 2 months it has been crashing with the errors below.
I have done some basic checks, removing hardware and not starting the VMs, but it still crashes within 24 hours.
I have even rebuilt my server, thinking it was an operating system issue, but the problem is still here.
As you will see from the errors I have been getting, my thinking is that it is an operating system issue.


My setup is

CPU: 16 x AMD Ryzen 7 2700 Eight-Core Processor (1 socket)
RAM: 32GB
MB: B450 TOMAHAWK
BIOS: latest stable version 7C02v1I


My environment is
VM Unraid with a Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3, with approx. 6 spinning hard drives
VM Plex
VM Home Assistant
VM HandBrake with a SCSI-connected hard drive: SCSI9 /dev/disk/by-id/ATA-WD10**************************,backup=0,size=976762584K

My main Proxmox SSD is a Samsung PM863A 960G 2.5" SATA SSD.


A few of the errors I am getting are

EXT4-fs error (device dm-1) in ext4_reserve_inode_write:5792: Journal has aborted
EXT4-fs error (device dm-1): ext4_journal_check_start:84: comm systemd-journal: Detected aborted journal
EXT4-fs error (device dm-1) in ext4_reserve_inode_write:5792: IO failure
Remounting filesystem read-only

When I get this error, I boot from a live USB stick and run the following command to check the filesystem: fsck.ext4 -f /dev/mapper/pve-root
but all I get is a screen full of dots, which I have left for over 3 hours and it was still running.

My current thinking is that my SSD is on its way out, but not reporting any issues.
Can someone please advise me on resolving this?
I have attached a document with all the information I think you might need, but if there is anything else then please
ask, as I would like to get this resolved.

update: my LSI Details

lspci | grep -i lsi
25:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)

lspci -k -s 25:00.0
25:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
Subsystem: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3
Kernel driver in use: mpt3sas
Kernel modules: mpt3sas

regards
 

OK, I have been looking into this, and I saw that I had the same issue last year, when one of the members helped me to get the LSI working. I am down to step 10, and I am going to leave the system running without any hard drives attached to see if I get any more issues.


When I rediscovered this, I learned that I had missed two steps:

nano /etc/modules

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

and adding this to the GRUB file (/etc/default/grub):
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction"
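Whether the card can be passed through cleanly depends on how the board splits devices into IOMMU groups, so it can help to list them once the new kernel command line is active. The following is a minimal sketch, assuming the standard sysfs layout under /sys/kernel/iommu_groups; the function name list_iommu_groups and its optional directory argument are made up for illustration. On a host that boots via GRUB, the edited line only takes effect after running update-grub and rebooting.

```shell
#!/bin/sh
# List every PCI device together with its IOMMU group number.
# The default path is the standard kernel sysfs location; the optional
# argument (hypothetical, for illustration) points the function elsewhere.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when the IOMMU is off or the path is missing.
        [ -e "$dev" ] || continue
        group="${dev#"$root"/}"   # strip the leading directory
        group="${group%%/*}"      # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

list_iommu_groups
```

If the function prints nothing, the IOMMU is probably not enabled. In this thread the HBA sits at 25:00.0, so the line to look for ends in 0000:25:00.0.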

@justinclift, if you are still around, I might be knocking on your door for help in upgrading the firmware on my LSI card. Again, if this works I will update you all.
 
Not to my great surprise, the system crashed again with the above errors, so there is something wrong. I have run the following command on each hard drive which I have passed through:

sudo smartctl -a /dev/sdh | less

and then checked Raw_Read_Error_Rate, Reallocated_Sector_Ct and Current_Pending_Sector.

For each hard drive they all show 0.
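Reading the full smartctl output through less for every drive gets tedious, so the three attributes can be filtered out instead. This is a rough sketch, assuming the usual ten-column ATA attribute table printed by smartctl -A, where the raw value is the last column; the function name smart_key_attrs is made up for illustration.

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print only the three attributes
# checked above, each followed by its raw value (column 10 of the table).
smart_key_attrs() {
    awk '$2 ~ /^(Raw_Read_Error_Rate|Reallocated_Sector_Ct|Current_Pending_Sector)$/ {
        print $2, $10
    }'
}

# Usage on the host (device name taken from the post above):
#   sudo smartctl -A /dev/sdh | smart_key_attrs
```

Some drives report a large non-zero raw value for Raw_Read_Error_Rate during normal operation, so that number is best read together with the drive vendor's notes.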

I have also checked the log file within Proxmox for the time it crashed.

One thing I am confused about is SMTP, which I haven't set up, yet it is trying to connect to <ntiip-ist4@fleetfost.od.uk>.
Is there a way to stop this?
 


OK, I have located the problem: it is with the IOMMU. When I plug the LSI card in, the system crashes within 24 hours, but without the card it runs with no issues.

It looks like my motherboard has chosen to no longer play along and is giving me issues with the IOMMU.

I have one more step, and that is to blacklist the mpt3sas driver in Proxmox to make sure that it is not grabbing the card before it is passed over to the VM.

I will keep you posted, as it looks like I will be purchasing a new motherboard.
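For anyone following along, the blacklist step could look roughly like the sketch below. It assumes the usual modprobe.d syntax; the function names vfio_conf_for and bound_driver and the file name vfio-lsi.conf are made up for illustration, and the vendor:device ID has to be read from lspci -nn on the actual host. Note that blacklisting mpt3sas stops the host from using any other card that needs that driver.

```shell
#!/bin/sh
# Print modprobe.d lines that keep the host from auto-loading mpt3sas and
# tell vfio-pci to claim the device with the given vendor:device ID.
vfio_conf_for() {
    printf 'blacklist mpt3sas\n'
    printf 'options vfio-pci ids=%s\n' "$1"
}

# Print the driver currently bound to a PCI address, or "none".
# The second argument (hypothetical, for illustration) overrides the sysfs root.
bound_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Usage on the host, as root (ID placeholder left for the reader to fill in):
#   lspci -nn -s 25:00.0                      # shows the [vendor:device] ID
#   vfio_conf_for VVVV:DDDD > /etc/modprobe.d/vfio-lsi.conf
#   update-initramfs -u -k all                # then reboot
#   bound_driver 0000:25:00.0                 # should no longer say mpt3sas
```

The lspci -k output earlier in the thread shows mpt3sas as the driver in use, which is what this step is meant to change.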