Q: Install on dell r815 / PERC H200i raid / broadcom gig nic "fun"- comments ?

fortechitsolutions

Hi, I wanted to ask for comments and thoughts. I was trying to set up a box for a client today; it was running VMware until today, and we wanted to flip it over to be a test box for a Proxmox proof-of-concept at their site. The drama appears to be mostly related to the fact that it has a PERC H200i RAID card, which looks like it believes itself to be real hardware RAID, but when I proceed with the Proxmox install, the installer cannot detect it. Vanilla Debian 11 has the same outcome. I am guessing Linux support for this thing is just not great. I foolishly assumed it was a slightly nicer, newer thing than a PERC 6, but clearly that is not the case.

Just wondering if anyone has managed to get a sane setup on one of these raid cards or not?

I tried it with a RAID1 mirror, and also with just "2 disks, no raid JBOD", and nothing was visible to Linux. The kernel complained briefly in the dmesg logs and then the controller failed.

Some Google digging suggests that some people flash third-party firmware onto the card, replacing it with LSI-based "IT mode" firmware to turn it into a vanilla JBOD disk controller. I am guessing that might be possible, but it seems slightly insane. I would then presumably do a vanilla Debian install with Linux software RAID, and Proxmox on top of that (a config I am comfortable doing, so long as the disks are reliably visible and performance is not absolute garbage).

Otherwise the secondary bit of fun is that the on-board broadcom quad-gig-NIC interfaces appear to be a chipset not supported on vanilla install / and I need to fuss about with some extra modules at install time to get the NIC(s) working. Which is annoying. But not a show stopper. Certainly I am more fussed about the raid card than the NIC chipset.

any comments are greatly appreciated.

Thank you!


Tim
 
Small ping on this, I found some discussion here,
https://www.dell.com/support/kbdoc/...dell-poweredge-r630-with-perc-h330-controller

which suggests a workaround, kind of, I think. It seems weird to me that this RAID card is not supported, since it is a pretty classic LSI > Dell PERC part.

I also found a thread, https://www.scaleway.com/en/docs/dedibox/hardware/how-to/configure-dell-perc-h200/ which seems a bit promising although it seems less focused on OS install and more on getting management tools working. Which is nice, but not my focus.

Sigh

Tim
 
Follow up on this thread. It seems there is a known bug in the megasas config / driver
as per discussion here
with workaround in theory,
Code:
If you are using BIOS, add it in grub:

nano /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet mpt3sas.max_queue_depth=10000"

Alas, in my case I used stock Debian 12 media: boot it up, go via Help > advanced mode, and pass the extra stanza to the kernel at boot,
i.e. ask it to do "install mpt3sas.max_queue_depth=10000".
It boots up,
but there is no change: we still have an error logged in the dmesg output (visible on the console behind the installer), and the MPT drives are invisible/bad.
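
For an already-installed system (as opposed to the installer boot above), the grub workaround can be made persistent with something like the sketch below. The helper name add_grub_param is made up for illustration, it assumes GNU sed and a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, and the parameter must not contain "|" or "&". It only edits the file; update-grub still has to be run afterwards.

```shell
#!/bin/sh
# Hypothetical helper: append one kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in a GRUB defaults file, unless that parameter is already on the line.
# Usage: add_grub_param FILE PARAM
add_grub_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already present on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote (GNU sed -i).
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
}

# Example (as root, on a BIOS/GRUB system; keep a backup of the file):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_grub_param /etc/default/grub mpt3sas.max_queue_depth=10000
#   update-grub    # regenerate the GRUB config, then reboot
```

Running it twice leaves only one copy of the parameter, so it should be safe to re-run while experimenting with different values.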

so, it seems that this raid controller is just a hassle to use with Debian-current or proxmox-current right now
and I can't see any way to get it working.

not sure if anyone else has any other suggestions. would love to hear if so.
thanks

Tim
 
A small follow-up on this. I banged my head on this problem again today. I thought I was being clever: I got my hands on a nicer H700 RAID card, which is a better version of the H200. I foolishly assumed that the better hardware RAID features of the H700 might make it work better.

Alas I have the same core problem.

- booting proxmox latest, or Debian 10 or newer, and the MPT Sas is busted/sad. No disks detected, error state on the controller, dead in water.
- I booted my dell box with an ancient 'system rescue live-USB-key' I have had (literally) for ~10 years. So an ancient kernel and drivers. The damn thing booted perfectly and sees the raid mirror on my Perc700 perfectly, no drama, it "just works" perfectly.

So kind of grumpy that - this is something that used to 'just work' and now it is 'total pain'

I have read a bunch more threads today suggesting there are known bugs in MPT_SAS and these are maybe fixed and filtering their way down from upstream. I am just curious if any suggestion on how to make it sing and dance without waiting forever.

-- For a lark I tried older Proxmox based on Debian8. This also did not work / same thing / failed to detect the perc700. So that is weird. Because I thought I had found discussion suggesting Deb.9 or older worked fine with this raid card.

-- if there is an easy-ish suggestion I would love to hear it. ie?

Deb-latest, netboot installer, pass it custom kernel parameters. X,Y,Z. Disable this, enable that, tweak depth thing, etc. Something else? custom module driver? other?

Any help is really appreciated

Plan C would be to just buy a totally different PCIe RAID card which (a) works on current Proxmox/Debian, (b) involves no drama, and (c) supports the same SAS fan-out cables already in the chassis, so I can do this without drama around disks <> backplane <> SAS cable connectivity.

Thank you,

Tim
 
booting proxmox latest, or Debian 10 or newer, and the MPT Sas is busted/sad
MPT SAS RAID is a first-class citizen on Debian and well supported. What's the output of MegaCli -PDList -aALL?

The damn thing booted perfectly and sees the raid mirror on my Perc700 perfectly, no drama, it "just works" perfectly.

Compare the output of lsmod for both; that should show you what you're missing.
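
A rough sketch of that comparison, assuming the lsmod output from each boot has been saved to a file first (for example with lsmod > /tmp/lsmod-working.txt on the rescue media and the same on the failing boot). The function name and file names are only placeholders.

```shell
#!/bin/sh
# Print modules that are loaded in the WORKING capture but missing from the
# FAILING one. Both arguments are files holding saved `lsmod` output.
# Usage: missing_modules WORKING_FILE FAILING_FILE
missing_modules() {
    working_sorted=$(mktemp)
    failing_sorted=$(mktemp)
    # Drop the "Module Size Used by" header, keep column 1, sort for comm.
    awk 'NR > 1 { print $1 }' "$1" | sort -u > "$working_sorted"
    awk 'NR > 1 { print $1 }' "$2" | sort -u > "$failing_sorted"
    # comm -23 prints lines that appear only in the first file.
    comm -23 "$working_sorted" "$failing_sorted"
    rm -f "$working_sorted" "$failing_sorted"
}

# Example:
#   missing_modules /tmp/lsmod-working.txt /tmp/lsmod-failing.txt
```

Module names can differ between very old and current kernels, so a name that shows up as "missing" is a hint about where to look rather than proof of the cause.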
 
Thank you. That is a good suggestion. I agree; my expectation was that MPT SAS would be "a no-brainer: easy, good, and reliable, because it is super standard and extremely well supported". Hence all my frustration with this.

In my test work yesterday, on current Debian and Proxmox, there is what looks like a problem with the init status of the RAID card. I have photos (not text blobs to paste, sorry), since I was working on the physical console of the system and did not have an easy way to capture all the text messages.

I will post these in a second, just uploading them from my phone now.

A few threads I have found which I thought? were relevant but the fixes there don't seem to touch my situation:

https://forum.proxmox.com/threads/proxmox-ve-7-2-megaraid-issues.110587/
seems to suggest "add the kernel parameters intel_iommu=on iommu=pt " to your boot kernel stanza to help things out
(in my case adding these to boot stanza for proxmox or debian installer does not help my situation)

then also linked from above, this thread
https://forum.proxmox.com/threads/proxmox-7-2-upgrade-broke-my-raid.109368/#post-471877

For reference this site is giving nice overview of what I have done (ie, remove h200 and swap in h700) although funny enough I did not find this web page until after I had done the work myself already:

https://practicalsbs.wordpress.com/2014/08/06/upgrading-dell-perc-h200-raid-card-to-h700/

otherwise, this thread also feels very relevant, the error and behaviour are consistent:
https://forums.debian.net/viewtopic.php?t=144854

anyhoo. I will put a few screenshot photos up in a moment to show more precisely what I have observed yesterday / so far.

thank you for the feedback. It is really appreciated.

Tim
 
Hi, here are some photo snips of dmesg output and errors from various attempts, with brief text to try to give context and clarify.
These are all with the H700 card and a pair of 1 TB SSDs attached, with a single RAID mirror volume defined.

First error ref from Proxmox-latest installer
key text error here I think is "UBSAN shift-out-of-bounds"
001 error prox boot raid.jpg


Second error, we have this when it is trying to init the controller, it spends about 2 minutes thrashing ~hundreds of attempts like this. Notable error message keyword is "Deadbeef"
002 fail deadbeef error.jpg

endgame after a few minutes of those messages, we get this endpoint:
003 error end game after deadbeef.jpg

here is a sample of what was seen in messages flying by with my "good happy ancient sysrescue linux bootup" on the same hardware
004 better version.jpg

and then once this old "sysrescueCD" booted I could use cfdisk and just see the raid volume on the perc controller without issue.

for new-current proxmox and debian, there is no disk detected on perc / perc is in error state I believe so not initialized/active.

small drama with this work is that my physical access is a hassle (ie, I need to drive 30m to get to the site)
I do have VPN > DRAC in theory but the DRAC is a hassle to do since it is ancient java based and ~all modern browsers want to block it.
I think I can get DRAC working with a bit more effort so will do so now because I have no need to physically go there to do more parts swap work in theory. ie, If all goes well this problem can be resolved over remote console DRAC work and booting suitable ISO via Drac with proper config stanzas etc. Maybe.

Anyhow, I hope this helps give you a bit more info about what I am seeing. It is a slog to debug this in person because of the Dell R815 server (dual 8-core CPU, 128 GB RAM; this is why we're trying to use this thing for a "Proxmox test install proof-of-concept deploy" instead of tossing it in the bin, and it was running VMware fine before this). The Dell takes 5-8 minutes for each boot cycle, so if you end up doing 10-12 reboots in the course of a debug session, you spend a lot of time waiting around for the thing to boot. Always a joy.

Tim
 
Ping? does anyone have any thoughts comments? I am still baffled why MPT_SAS is so busted on this hardware with more recent linux/proxmox; but it works fine with an ancient Sysrescue/10year old boot media. Thank you. (!!)
 
The UBSAN messages ought to be just warnings, and in most cases upgrading the controller's firmware will go a long way toward fixing issues with newer kernels.

Please try upgrading all firmware in your systems
 
Thank you for the reply.

Added details:
> I realize the UBSAN is a warn now. ie, bit of a red herring I should ignore.
> I have already updated the system as follows

(a) new / latest BIOS for mainboard in the dell r815
(b) raid card (perc 700) was on latest firmware from Dell, I had problem still
(c) then I downgraded perc700 to 1 revision back on firmware in case that might help, still I have the problem.
(d) then I test-boot with my ancient sysrescueCDLiveMedia and things all are awesome, my raid card works and mirror volume is visible.
(e) so that led me to think not a core hardware problem, but a driver issue.
(f) it seems ?many others? report that passing a flag to the kernel at boot, mpt3sas.max_queue_depth=8000, helps work around the problem, but in my case that appears not to be so (i.e. with this flag confirmed in my dmesg after booting, I still have problems with mpt3sas).
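
Besides dmesg, one way to double-check that a flag like the one in (f) actually reached the driver is to read the value back from sysfs. This is only a sketch: it assumes the module is loaded and that it exports the parameter under /sys/module/<module>/parameters/, which I believe is the case for mpt3sas max_queue_depth but have not verified on this hardware. The function name is made up, and the optional third argument exists only so it can be tried against a test directory.

```shell
#!/bin/sh
# Print the value the kernel holds for a module parameter, or fail if the
# module is not loaded or the parameter is not exported through sysfs.
# Usage: module_param MODULE PARAM [SYSFS_ROOT]
module_param() {
    root=${3:-/sys}
    path="$root/module/$1/parameters/$2"
    if [ -r "$path" ]; then
        cat "$path"
    else
        echo "not available: $path" >&2
        return 1
    fi
}

# Example:
#   cat /proc/cmdline                       # what the boot loader passed
#   module_param mpt3sas max_queue_depth    # what the driver ended up with
```

If /proc/cmdline shows the flag but the value read back here is different, the parameter was probably not accepted in that form.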

I found another thread discussing MPT3SAS drama that suggested that if I can boot with an 'edge kernel' I may be able to get around the problem.
I am not sure how to get a boot media installer ISO for Proxmox with an edge kernel. Does such a thing exist?

Thank you!

Tim
 
a few options you could try:
* check out the following article: someone got their system with a similar controller running by changing a few BIOS settings (e.g. make sure you leave the PCIe speed set to auto)
https://e-mc2.net/blog/megaraid-sas-9341-8i-not-working-with-linux/
* you can try installing the pve-kernel-6.2 package and see if this improves things (PVE 8.0 currently is using it and you should upgrade sooner or later anyways)
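
For the second option, a minimal sketch of installing the opt-in kernel on an existing PVE 7.x host and confirming after the reboot that it is the one running. The package name is the one mentioned above; kernel_at_least is a made-up helper and relies on GNU sort -V.

```shell
#!/bin/sh
# Succeed if kernel release $1 is at least version $2 (compared with sort -V).
# Usage: kernel_at_least "$(uname -r)" 6.2
kernel_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example on the PVE host (as root):
#   apt update && apt install pve-kernel-6.2
#   reboot
#   kernel_at_least "$(uname -r)" 6.2 && echo "running 6.2 or newer"
```

This only helps on a system that is already installed; it does not change which kernel the installer ISO boots.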

I hope this helps!
 
Thank you! I will try iommu=soft and a review of the BIOS settings. The problem described at that link sounds like a familiar situation: the controller cannot transition to the active state.

I will post followup after testing this.

Tim
 
I have a Dell R715 with a PERC H200. It works fine on PVE 7.4. It has 2 SAS drives set up as a mirror in the controller, and 4 SSDs set as single disks in the controller. The 4 SSDs are ZFS RAID 10 and work well. Maybe they are not the fastest, but it is OK.

If you would like the firmware version for the controller, I will try to figure out how to see it. Pretty sure it came from the Dell web site. Anything else I might show?

Are you installing 7 or 8? I am afraid to upgrade to 8 due to this thread.
https://forum.proxmox.com/threads/no-sas2008-after-upgrade.129499/post-574205

Code:
05:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
    DeviceName: Integrated SAS                        
    Subsystem: Dell PERC H200 Integrated
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 34
    NUMA node: 0
    IOMMU group: 18
    Region 0: I/O ports at fc00 [size=256]
    Region 1: Memory at ecff0000 (64-bit, non-prefetchable) [size=64K]
    Region 3: Memory at ecf80000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at ece00000 [disabled] [size=1M]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl:    CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta:    CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <64ns
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
        LnkCtl:    ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed 5GT/s (ok), Width x4 (downgraded)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range BC, TimeoutDis+ NROPrPrP- LTR-
             10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
             EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
             FRS- TPHComp- ExtTPHComp-
             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis- LTR- OBFF Disabled,
             AtomicOpsCtl: ReqEn-
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
             EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
             Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [d0] Vital Product Data
pcilib: sysfs_read_vpd: read failed: No such device
        Not readable
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000  Data: 0000
    Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
        Vector table: BAR=1 offset=0000e000
        PBA: BAR=1 offset=0000f800
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt:    DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
        CEMsk:    RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ AdvNonFatalErr+
        AERCap:    First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [138 v1] Power Budgeting <?>
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas
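
For anyone comparing against their own box: a listing like the one above comes from lspci in verbose mode, and the line that matters most here is the driver actually bound to the card. A small sketch that pulls just that out (the function name is made up; 05:00.0 is the slot from the listing above and will differ on other machines):

```shell
#!/bin/sh
# Read `lspci -k` or `lspci -vvv` text on stdin and print each
# "Kernel driver in use" value, one per line.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Example:
#   lspci -k -s 05:00.0 | driver_in_use
```

If nothing is printed for the controller, no driver has bound to it, which would match the "no disks detected" symptom described earlier in the thread.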
 
Hi!

About the "Perc H200i/H700": use the latest RAID firmware, and in the Utility Config (after the BIOS POST) "set HBA-mode/JBOD to all disk". This way the "megaraid_sas" kernel module will load and you will see all the disks. The card is then used like a simple HBA card: RAID config at the firmware level (Utility Config) won't work, but you can use "linux software raid or ZFS" at the OS level.

IT-MODE firmware won't work, because it uses the "mpt3sas" kernel module, which is broken for the old SAS2 HBA cards in the new kernels (5.15; 6.2).

The "Perc H200i/H700" is an old card (EOL) with no support from the chip maker (LSI/AVAGO/BROADCOM), so they won't fix the kernel driver.
 
I am not sure how it is working for me but it is. My pve started on version 4 and has been upgraded to current 7.4. It is running on Linux 6.2.16-4-bpo11-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.16-4~bpo11+1 (2023-07-07T15:05Z) opt-in kernel.
Code:
root@pve:~# lsmod
Module                  Size  Used by
udp_diag               16384  0
tcp_diag               16384  0
inet_diag              28672  2 tcp_diag,udp_diag
cfg80211             1204224  0
8021q                  45056  0
garp                   20480  1 8021q
mrp                    20480  1 8021q
veth                   40960  0
ipmi_si                90112  1
ebtable_filter         16384  0
ebtables               45056  1 ebtable_filter
dell_rbu               20480  0
ip_set                 57344  0
ip6table_raw           16384  0
iptable_raw            16384  0
ip6table_filter        16384  0
ip6_tables             36864  2 ip6table_filter,ip6table_raw
iptable_filter         16384  0
bpfilter               16384  0
mptctl                 40960  1
mptbase               114688  1 mptctl
nf_tables             331776  0
nfnetlink_cttimeout    20480  0
bonding               229376  0
tls                   143360  1 bonding
openvswitch           192512  32
nsh                    16384  1 openvswitch
nf_conncount           24576  1 openvswitch
nf_nat                 61440  1 openvswitch
nf_conntrack          192512  4 nf_nat,nfnetlink_cttimeout,openvswitch,nf_conncount
nf_defrag_ipv6         24576  2 nf_conntrack,openvswitch
nf_defrag_ipv4         16384  1 nf_conntrack
softdog                16384  2
nfnetlink_log          24576  1
nfnetlink              24576  6 nfnetlink_cttimeout,nf_tables,ip_set,nfnetlink_log
xfs                  2105344  1
amd64_edac             45056  0
edac_mce_amd           40960  1 amd64_edac
kvm_amd               192512  30
ccp                   126976  1 kvm_amd
kvm                  1298432  1 kvm_amd
nouveau              2744320  3
snd_hda_codec_hdmi     94208  1
crct10dif_pclmul       16384  1
polyval_clmulni        16384  0
polyval_generic        16384  1 polyval_clmulni
ghash_clmulni_intel    16384  0
sha512_ssse3           53248  0
aesni_intel           397312  0
snd_hda_intel          57344  0
snd_intel_dspcfg       36864  1 snd_hda_intel
snd_intel_sdw_acpi     20480  1 snd_intel_dspcfg
mxm_wmi                16384  1 nouveau
video                  73728  1 nouveau
crypto_simd            20480  1 aesni_intel
snd_hda_codec         200704  2 snd_hda_codec_hdmi,snd_hda_intel
cryptd                 28672  2 crypto_simd,ghash_clmulni_intel
wmi                    40960  3 video,mxm_wmi,nouveau
drm_ttm_helper         16384  1 nouveau
ttm                   102400  2 drm_ttm_helper,nouveau
snd_hda_core          135168  3 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
drm_display_helper    204800  1 nouveau
ipmi_ssif              45056  0
snd_hwdep              20480  1 snd_hda_codec
snd_pcm               188416  4 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_core
cec                    94208  1 drm_display_helper
mgag200                73728  0
snd_timer              45056  1 snd_pcm
rc_core                77824  1 cec
drm_shmem_helper       24576  1 mgag200
snd                   135168  6 snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_timer,snd_pcm
dcdbas                 24576  0
drm_kms_helper        237568  6 drm_display_helper,mgag200,nouveau
cdc_acm                49152  0
joydev                 32768  0
input_leds             16384  0
soundcore              16384  1 snd
serio_raw              20480  0
i2c_algo_bit           16384  2 mgag200,nouveau
pcspkr                 16384  0
syscopyarea            16384  1 drm_kms_helper
sysfillrect            20480  1 drm_kms_helper
sysimgblt              16384  1 drm_kms_helper
ipmi_devintf           20480  2
k10temp                16384  0
ipmi_msghandler        86016  3 ipmi_devintf,ipmi_si,ipmi_ssif
fam15h_power           20480  0
mac_hid                16384  0
acpi_power_meter       20480  0
zfs                  4288512  43
zunicode              352256  1 zfs
zzstd                 684032  1 zfs
zlua                  204800  1 zfs
zavl                   24576  1 zfs
icp                   348160  1 zfs
zcommon               118784  2 zfs,icp
znvpair               131072  2 zfs,zcommon
spl                   126976  6 zfs,icp,zzstd,znvpair,zcommon,zavl
vhost_net              32768  84
vhost                  57344  1 vhost_net
vhost_iotlb            16384  1 vhost
tap                    32768  1 vhost_net
ib_iser                49152  0
rdma_cm               139264  1 ib_iser
iw_cm                  57344  1 rdma_cm
ib_cm                 139264  1 rdma_cm
ib_core               471040  4 rdma_cm,iw_cm,ib_iser,ib_cm
iscsi_tcp              24576  0
libiscsi_tcp           32768  1 iscsi_tcp
libiscsi               77824  3 libiscsi_tcp,iscsi_tcp,ib_iser
scsi_transport_iscsi   167936  5 libiscsi_tcp,iscsi_tcp,ib_iser,libiscsi
vfio_pci               16384  0
vfio_pci_core          90112  1 vfio_pci
irqbypass              16384  256 vfio_pci_core,kvm
vfio_iommu_type1       49152  0
vfio                   57344  3 vfio_pci_core,vfio_iommu_type1,vfio_pci
iommufd                69632  1 vfio
pci_stub               16384  0
drm                   671744  11 drm_kms_helper,drm_shmem_helper,drm_display_helper,mgag200,drm_ttm_helper,ttm,nouveau
sunrpc                712704  1
ip_tables              36864  2 iptable_filter,iptable_raw
x_tables               65536  7 ebtables,ip6table_filter,ip6table_raw,iptable_filter,ip6_tables,iptable_raw,ip_tables
autofs4                53248  2
btrfs                1851392  0
blake2b_generic        20480  0
xor                    24576  1 btrfs
raid6_pq              126976  1 btrfs
simplefb               16384  0
dm_thin_pool           86016  2
dm_persistent_data    114688  1 dm_thin_pool
dm_bio_prison          28672  1 dm_thin_pool
dm_bufio               49152  1 dm_persistent_data
libcrc32c              16384  7 nf_conntrack,nf_nat,dm_persistent_data,openvswitch,btrfs,nf_tables,xfs
ses                    20480  0
enclosure              24576  1 ses
uas                    28672  0
usb_storage            81920  2 uas
usbmouse               16384  0
usbkbd                 16384  0
hid_generic            16384  0
usbhid                 69632  0
hid                   172032  2 usbhid,hid_generic
mpt3sas               364544  12
crc32_pclmul           16384  0
raid_class             16384  1 mpt3sas
psmouse               204800  0
scsi_transport_sas     53248  2 ses,mpt3sas
i2c_piix4              28672  0
ohci_pci               20480  0
bnx2                  118784  0
ehci_pci               20480  0
ahci                   49152  1
ohci_hcd               61440  1 ohci_pci
ehci_hcd              102400  1 ehci_pci
libahci                57344  1 ahci
 
Hi donhwyo, thank you for this reply. Very much appreciated. Good to hear the h200 works for someone as intended!

I know firmware version is displayed at boot-time but that is a pain if you don't intend to reboot anytime soon. It is part of the messaging that goes by, ie, Dell Logo > CPU Memory Info > Raid controller info pops up > hints about configured raid volumes and also a hint about FW Revision will be shown. > then IPMI info > then gives grub boot loader > onto a normal proxmox boot cycle.

Alternatively, I think if you have the MPT SAS / LSI MegaRAID tools present, you can get them to tell you such things as:
current LSI RAID status
RAID health
controller status
and probably (I think) also the RAID hardware version?
(i.e. megaraid-status or megacli). I just peeked at a reference system and those did not immediately tell me this info, but I am guessing I need more suitable verbose flags, and then it is buried somewhere in the pages of text they can generate about your exciting RAID card config.
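
One more possibility that avoids both a reboot and the vendor tools, as far as I know: on cards driven by mpt2sas/mpt3sas the driver normally logs a line containing a "FWVersion(...)" token when it initialises the controller. The sketch below just extracts that token from kernel log text; the function name is made up, and the log format is an assumption that may vary between kernel versions.

```shell
#!/bin/sh
# Read kernel log text on stdin and print any firmware version reported in a
# "FWVersion(...)" token, one per matching line.
fw_version_from_log() {
    sed -n 's/.*FWVersion(\([^)]*\)).*/\1/p'
}

# Example (as root):
#   dmesg | fw_version_from_log
# Assumed alternative, if the driver exports it on your kernel:
#   cat /sys/class/scsi_host/host*/version_fw
```

If the kernel ring buffer has already rotated past the boot messages, journalctl -k -b may still have the line.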

but. anyhoo. maybe more important?

do you mind checking to see if there are any interesting flags visible on your grub config which might be passed to the system at boot time? I think this is one way that these controllers can be made to cooperate more, if I understand correctly some of the other problem reports (And workarounds) that I have read.

In theory on your proxmox host, if you look at output of /proc/cmdline it will tell you boot flags you have in place presently, ie, as per

Code:
root@proxmoxer:/var/lib/vz/images# cat /proc/cmdline

BOOT_IMAGE=/boot/vmlinuz-5.15.104-1-pve root=/dev/md1 ro rootdelay=10 rootdelay=10 vga=normal nomodeset noquiet nosplash systemd.unified_cgroup_hierarchy=0

root@dadaprox2:/var/lib/vz/images#

(is an illustrative example I pulled from a different working proxmox host just now)

thank you!

Tim
 
Will reboot in a few.
Code:
root@pve:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.2.16-4-bpo11-pve root=/dev/mapper/pve-root ro mitigations=off tsc=nowatchdog crashkernel=384M-:256M
mitigations=off tsc=nowatchdog fixed my crashes, but the disks were fine before the crashes. I don't think the disk system had anything to do with them.

Are you on pve 7 or 8?
 
Trying to get screenshots attached. 6 total. Hope this is readable and useful. My graphical skills leave some room for improvement.

I remember not being able to get uefi to boot and just going with bios. My fault I am sure, it was my first attempt.

I also have this in my /etc/apt/sources.list .
Code:
#HWRaid
deb http://hwraid.le-vert.net/debian bullseye main
deb-src http://hwraid.le-vert.net/debian bullseye main
Not sure if I installed anything from there. I don't think so.
Code:
root@pve:~# dpkg -l |grep mega
root@pve:~#
 

Attachments: smallone.jpg, six.jpg, smalltwo.jpg, smallthree.jpg, smallfour.jpg, smallfive.jpg
Ok something else of questionable use.
Code:
root@pve:~# locate megaraid
/lib/modules/5.15.104-1-pve/kernel/drivers/scsi/megaraid
/lib/modules/5.15.104-1-pve/kernel/drivers/scsi/megaraid.ko
/lib/modules/5.15.104-1-pve/kernel/drivers/scsi/megaraid/megaraid_mbox.ko
/lib/modules/5.15.104-1-pve/kernel/drivers/scsi/megaraid/megaraid_mm.ko
/lib/modules/5.15.104-1-pve/kernel/drivers/scsi/megaraid/megaraid_sas.ko
/lib/modules/5.15.107-2-pve/kernel/drivers/scsi/megaraid
/lib/modules/5.15.107-2-pve/kernel/drivers/scsi/megaraid.ko
/lib/modules/5.15.107-2-pve/kernel/drivers/scsi/megaraid/megaraid_mbox.ko
/lib/modules/5.15.107-2-pve/kernel/drivers/scsi/megaraid/megaraid_mm.ko
/lib/modules/5.15.107-2-pve/kernel/drivers/scsi/megaraid/megaraid_sas.ko
/lib/modules/5.15.108-1-pve/kernel/drivers/scsi/megaraid
/lib/modules/5.15.108-1-pve/kernel/drivers/scsi/megaraid.ko
/lib/modules/5.15.108-1-pve/kernel/drivers/scsi/megaraid/megaraid_mbox.ko
/lib/modules/5.15.108-1-pve/kernel/drivers/scsi/megaraid/megaraid_mm.ko
/lib/modules/5.15.108-1-pve/kernel/drivers/scsi/megaraid/megaraid_sas.ko
/lib/modules/6.2.11-2-pve/kernel/drivers/scsi/megaraid
/lib/modules/6.2.11-2-pve/kernel/drivers/scsi/megaraid.ko
/lib/modules/6.2.11-2-pve/kernel/drivers/scsi/megaraid/megaraid_mbox.ko
/lib/modules/6.2.11-2-pve/kernel/drivers/scsi/megaraid/megaraid_mm.ko
/lib/modules/6.2.11-2-pve/kernel/drivers/scsi/megaraid/megaraid_sas.ko
/lib/modules/6.2.16-4-bpo11-pve/kernel/drivers/scsi/megaraid
/lib/modules/6.2.16-4-bpo11-pve/kernel/drivers/scsi/megaraid.ko
/lib/modules/6.2.16-4-bpo11-pve/kernel/drivers/scsi/megaraid/megaraid_mbox.ko
/lib/modules/6.2.16-4-bpo11-pve/kernel/drivers/scsi/megaraid/megaraid_mm.ko
/lib/modules/6.2.16-4-bpo11-pve/kernel/drivers/scsi/megaraid/megaraid_sas.ko
/mnt/pve/TB4/dump/vzdump-lxc-139-2017_12_11-09_46_42.tmp/opt/observium/includes/discovery/sensors/lsi-megaraid-sas-mib.inc.php
/usr/src/linux-headers-6.2.11-1-pve/drivers/scsi/megaraid
/usr/src/linux-headers-6.2.11-1-pve/drivers/scsi/megaraid/Kconfig.megaraid
/usr/src/linux-headers-6.2.11-1-pve/drivers/scsi/megaraid/Makefile
/usr/src/linux-headers-6.2.11-2-pve/drivers/scsi/megaraid
/usr/src/linux-headers-6.2.11-2-pve/drivers/scsi/megaraid/Kconfig.megaraid
/usr/src/linux-headers-6.2.11-2-pve/drivers/scsi/megaraid/Makefile
/usr/src/linux-headers-6.2.16-4-bpo11-pve/drivers/scsi/megaraid
/usr/src/linux-headers-6.2.16-4-bpo11-pve/drivers/scsi/megaraid/Kconfig.megaraid
/usr/src/linux-headers-6.2.16-4-bpo11-pve/drivers/scsi/megaraid/Makefile
/usr/src/linux-headers-6.2.9-1-pve/drivers/scsi/megaraid
/usr/src/linux-headers-6.2.9-1-pve/drivers/scsi/megaraid/Kconfig.megaraid
/usr/src/linux-headers-6.2.9-1-pve/drivers/scsi/megaraid/Makefile
root@pve:~#
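
The locate output only shows that the module files exist on disk, not that one of them is actually driving the controller. Here is a rough sketch for checking that, assuming pciutils is installed. controller_drivers is a made-up helper name, and the RAID/SAS/SCSI match is a guess at how lspci labels the card.

```shell
# Print storage controllers and the kernel driver bound to each,
# reading `lspci -k` style text from stdin.
controller_drivers() {
    awk '
        # Device header lines start with the PCI address (hex digits).
        /^[0-9a-f]/ { show = ($0 ~ /RAID|SAS|SCSI/); if (show) print }
        # Indented detail line naming the bound driver, if any.
        show && /Kernel driver in use:/ { print }
    '
}

# Typical use on the host:
#   lspci -k | controller_drivers
```

If a controller line is printed with no "Kernel driver in use" line under it, no driver has claimed the card.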
 
Hi, just a quick update on this thread. I did a bit more work and managed to get remote DRAC console access to this server working, which makes it a lot easier to move this along.

I tried one hint based on the discussion post here, https://e-mc2.net/blog/megaraid-sas-9341-8i-not-working-with-linux/
Specifically, I changed to ACPI boot and passed the kernel the option iommu=soft,
but this did not help in my case.
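
For anyone wanting to try the same option, this is a minimal sketch of making it persistent on a grub-booted Debian/Proxmox host. add_boot_flag is a hypothetical helper, the default path /etc/default/grub is an assumption about a stock install, and the sed expression only handles simple flags (no | or & characters).

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there.
# $1 = parameter (e.g. iommu=soft), $2 = grub defaults file (assumed default below).
add_boot_flag() {
    flag=$1
    file=${2:-/etc/default/grub}
    # Nothing to do if the flag is already on the DEFAULT line.
    if grep -E '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qwF -- "$flag"; then
        return 0
    fi
    # Keep a .bak copy, then insert the flag before the closing quote.
    sed -i.bak -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${flag}\"|" "$file"
}

# Typical use as root, followed by regenerating the grub config:
#   add_boot_flag iommu=soft && update-grub
```

After a reboot, cat /proc/cmdline should show the new flag. Hosts that boot via systemd-boot instead of grub keep the command line in /etc/kernel/cmdline and need proxmox-boot-tool refresh instead, as far as I know.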

Then I found this good discussion
https://lists.debian.org/debian-kernel/2013/04/msg00422.html

and realized it is spot on for my situation. For fun I downloaded the Debian Squeeze installer, booted with that, and the Perc raid works perfectly.

I updated my Perc raid to the latest firmware version (12.10) and then tried newer Debian again, but no joy.

The mainboard on the r815 is already at the latest version, so it appears that either there is a known problem which surfaced in Wheezy and has persisted, leaving this Perc card stuck in the 'fault' state and unable to transition to the 'ready' state, so that the disks/raid cannot be detected,

OR there is some combination of
Dell firmware
Perc firmware
Megaraid_sas driver on Debian later than Squeeze (Wheezy and beyond)
which is broken in my case. So far I can't see how to make it work other than rolling out this box with an ancient proxmox based on Squeeze.
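
To see what the driver reports while the card is stuck, something along these lines can filter the kernel log. The pattern is an assumption: it only matches the driver names megaraid/megasas/mpt2sas/mpt3sas, and the exact wording of the fault messages differs between kernel versions. controller_log is a made-up name.

```shell
# Keep only kernel log lines that mention the LSI/Perc storage drivers.
# Reads log text from stdin so it works on live dmesg output or a saved log.
controller_log() {
    grep -i -E 'megaraid|megasas|mpt[23]sas'
}

# Typical use on the host:
#   dmesg | controller_log
```

Comparing this output between the Squeeze installer and a newer Debian should show where the newer driver gives up.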

Plan-B: if anyone can recommend a ~decent entry-level PCIe raid card (current or a few years old is fine) for basic Mirror/Raid1 that works well with the latest Debian/Proxmox and supports a normal HBA / SAS fan-out cable connection, I would happily just buy such a thing, install it in the server, and move this thing along. So much time wasted debugging. Ugh.

Tim
 
