[SOLVED] ASM1166 + Intel Cxxx + Proxmox 9: GUI reboot enumeration failure fixed with pre-ha-lrm systemd hook

upcycle

New Member
Mar 30, 2026
Report: ASM1166 (PH516 VER:1.5, firmware 241224-0000-00) + Intel C236 root port (Dell Precision Tower 3620) + Proxmox 9 (kernel 6.17.x) warm-reboot enumeration failure

Hardware

  • Adapter: ASM1166 M.2-to-6xSATA (PH516 VER:1.5, all 6 SATA ports populated with drives, legacy boot enabled in BIOS)
  • Root port: 00:1b.0 (Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #17, rev f1)
  • Firmware tested: original 220419-0000-00 and upgraded 241224-0000-00 (identical behavior)
Observed behavior (verified across cold vs warm boot cycles)
  • Cold boot / hard power cycle: ASM1166 enumerates correctly under downstream bus 02 at 8.0 GT/s x2, Power state D0, full BARs and memory window populated, ahci driver loads cleanly, all 6 drives visible.
  • Any soft reboot (reboot, shutdown -r now, or Proxmox web-UI Reboot button): 00:1b.0 shows Link speed downgraded to 2.5 GT/s, Power state D3hot, downstream bus 02 empty. lspci -vvv shows zeroed memory window at offset 0x20, downgraded LNKSTA/LNKCTL bits, PMCSR in D3hot, AER error flags set. dmesg reports “broken device, retraining non-functional downstream link at 2.5GT/s” + “retraining failed”.
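The good and bad states above can be compared at a glance by pulling just the link speed, width and power state out of the lspci output. This is a minimal sketch, not part of the original troubleshooting: the function name link_summary is made up for illustration, and it assumes the usual pciutils field labels ("LnkSta:" and "Status: D..."). Newer pciutils versions append "(ok)" or "(downgraded)" after the values, which the parser tolerates.

```shell
#!/bin/sh
# Hypothetical helper: summarise link speed, link width and power state
# from `lspci -vv` text supplied on stdin.
# Assumption: field labels match what current pciutils prints.
link_summary() {
    awk '
        /LnkSta:/ {
            # The value follows the "Speed" / "Width" keyword on the same line.
            for (i = 1; i < NF; i++) {
                if ($i == "Speed") speed = $(i + 1)
                if ($i == "Width") width = $(i + 1)
            }
        }
        /Status: D[0-3]/ {
            # Power Management capability line, e.g. "Status: D0 NoSoftRst+ ..."
            for (i = 1; i < NF; i++)
                if ($i == "Status:") power = $(i + 1)
        }
        END {
            gsub(/,/, "", speed)
            gsub(/,/, "", width)
            printf "speed=%s width=%s power=%s\n", speed, width, power
        }
    '
}

# Usage on the affected machine (as root, so the capabilities are readable):
#   lspci -s 00:1b.0 -vv | link_summary
```

Running it once after a cold boot and once after a warm reboot gives two one-line summaries that are easy to diff or paste into a report.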
Root cause discovery
The failure is suspected to be a timing race between the ASM1166 silicon and the Intel C236 root port that occurs only on warm reset.
  • Direct CLI reboot runs the kernel’s standard PCIe shutdown notifiers in the expected order, leaving the root port in a clean state.
  • Proxmox GUI reboot first stops pve-ha-lrm.service with its “conditional” shutdown policy (stops/freezes HA services), which changes the exact moment the PCIe notifiers run and leaves the root port locked in D3hot + broken retrain flags.
Also tested: BIOS settings (C-States disabled, Deep Sleep disabled) and the kernel parameters pcie_aspm=off, ahci.mobile_lpm_policy=0 and libata.force=nolpm. Only pcie_aspm=off made CLI reboots reliable, and it never fixed the GUI path.
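When testing kernel parameters like these, it helps to confirm that a given parameter actually reached the running kernel before drawing conclusions from a reboot. A small sketch follows; has_kparam is a hypothetical helper name, and it only checks for an exact token on the command line.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the exact parameter given as $1 appears
# in a kernel command line read from stdin.
has_kparam() {
    want=$1
    # One token per line, then an exact fixed-string whole-line match.
    tr ' ' '\n' | grep -qxF -- "$want"
}

# Usage on a running system:
#   has_kparam pcie_aspm=off < /proc/cmdline && echo "pcie_aspm=off is active"
```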
What was tried (chronological summary)
  • Exhaustive lspci -vvv -xxx and dmesg comparisons of good (cold) vs bad (warm) states.
  • Manual sysfs power/control D0/D3 cycles, setpci writes to Bridge Control (secondary bus reset), PMCSR (50.w), rescan sequences.
  • Journal analysis of pve-ha-lrm.service vs systemd-shutdown paths (GUI vs CLI).
  • Creation and iterative refinement of a systemd oneshot service that runs Before=pve-ha-lrm.service on shutdown/reboot targets.
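The manual secondary bus reset and rescan from the list above can be sketched as a small helper. This is an illustration under stated assumptions rather than the exact commands used during troubleshooting: sbr_pulse and _run are made-up names, and the writes use setpci's value:mask form so that only bit 6 (Secondary Bus Reset) of the Bridge Control register at offset 0x3e changes, which is a variation on plain byte writes. It defaults to a dry run that only prints the commands, because resetting the bus on a live system drops every device behind the bridge, including mounted drives.

```shell
#!/bin/sh
# Hypothetical helper: pulse Secondary Bus Reset on a PCI bridge, then rescan.
# DRY_RUN defaults to 1 (print commands only); set DRY_RUN=0 to execute.
_run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

sbr_pulse() {
    bridge=$1                               # e.g. 00:1b.0 (adapt to your system)
    _run setpci -s "$bridge" 3e.b=40:40     # assert reset: set bit 6 only
    _run sleep 1
    _run setpci -s "$bridge" 3e.b=00:40     # release reset: clear bit 6 only
    _run sleep 2
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "echo 1 > /sys/bus/pci/rescan"
    else
        echo 1 > /sys/bus/pci/rescan        # ask the kernel to re-enumerate
    fi
}

# Usage:
#   sbr_pulse 00:1b.0             # dry run, prints the five steps
#   DRY_RUN=0 sbr_pulse 00:1b.0   # really resets the bus (as root)
```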
Workaround (installed and verified working after a GUI reboot)
A minimal systemd oneshot service that pulses a secondary bus reset on the root port and rescans the PCI bus, intended to leave the root port in a clean state before the HA conditional freeze occurs.

A service file is created at: /etc/systemd/system/pve-pre-ha-pcie-clean.service

Installation steps (run as root; adapt the root-port address 00:1b.0 to your own adapter, no pun intended)

1. Create the service file (CLI command):
Code:
cat > /etc/systemd/system/pve-pre-ha-pcie-clean.service << 'EOF'
[Unit]
Description=Minimal ASM1166 pre-HA hook: force clean C236 root port state before pve-ha-lrm conditional shutdown
DefaultDependencies=no
Before=pve-ha-lrm.service shutdown.target reboot.target
Wants=shutdown.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/sh -c 'setpci -s 00:1b.0 3e.b=40; sleep 1; setpci -s 00:1b.0 3e.b=00; sleep 2; echo 1 > /sys/bus/pci/rescan; echo "Pre-HA PCIe SBR+rescan done (ASM1166 fixed)" > /dev/kmsg'
ExecStop=/bin/true

[Install]
WantedBy=reboot.target shutdown.target
EOF

2. Reload systemd and enable the service (this also creates two symlinks):

Code:
systemctl daemon-reload && systemctl enable --now pve-pre-ha-pcie-clean.service

The symlinks:
  • /etc/systemd/system/reboot.target.wants/pve-pre-ha-pcie-clean.service
  • /etc/systemd/system/shutdown.target.wants/pve-pre-ha-pcie-clean.service
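Apart from the service file itself, these symlinks are the only permanent filesystem changes. The unit is pulled in only by the shutdown and reboot targets and declares Before=pve-ha-lrm.service. Note that the --now flag in step 2 also starts the unit once immediately, so the bus reset and rescan run on the live system at install time; omit --now if that is not wanted.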

Result after first GUI reboot with the service active
  • 00:1b.0: LnkSta Speed 8GT/s, Width x2, Power state D0
  • 02:00.0: ASM1166 fully enumerated
  • ahci driver loaded without controller-reset failures
  • No retrain errors in dmesg
The GUI reboot now behaves identically to a cold boot.
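The dmesg check can be turned into a quick count of the two retrain messages quoted earlier. A minimal sketch, assuming the message wording matches what this kernel printed; retrain_errors is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: count kernel log lines on stdin that contain either
# of the retrain failure messages seen in the bad (warm reboot) state.
retrain_errors() {
    # grep -c prints the count but exits non-zero on zero matches,
    # so the exit status is neutralised to keep the helper script-friendly.
    grep -cE 'retraining failed|broken device, retraining' || true
}

# Usage after a reboot (as root):
#   dmesg | retrain_errors      # 0 means no retrain errors were logged
```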
Undo (if removal is required)

Code:
systemctl disable --now pve-pre-ha-pcie-clean.service
rm -f /etc/systemd/system/pve-pre-ha-pcie-clean.service
rm -f /etc/systemd/system/reboot.target.wants/pve-pre-ha-pcie-clean.service
rm -f /etc/systemd/system/shutdown.target.wants/pve-pre-ha-pcie-clean.service
systemctl daemon-reload

Although this workaround was developed on my Intel C236 + ASM1166 combination, it may apply to other root port and downstream device combinations that fail warm-reset timing under Proxmox HA-managed shutdowns. No kernel patches, BIOS changes, or firmware updates were required.

Disclaimer: I am new to Proxmox and this is my first post. I registered specifically to provide this feedback and share my experience, in the hope that it helps others as this forum has helped me.
 