Proxmox won't complete the reboot

Michael S Ortega

Aug 19, 2018
Hello guys.
I'm seeing weird behavior on a Dell PowerEdge T440 with a PERC H330 RAID controller, 32 GB RAM, the latest BIOS, and iDRAC drivers installed. I installed Proxmox v9 on it, and it works fine, except that the server won't reboot properly: whenever I reboot the Proxmox server, it doesn't complete the cycle and hangs at the state shown in the screenshot below. I have to force a reboot by pressing and holding the power button. This only happens with Proxmox; Windows works fine. Any help would be highly appreciated. Thanks.
 

Attachments

  • Selection_249.png
Wondering if the OP found a solution? I am having the same issue, and I have 3 of them.
 
Same for me: T440 with a PERC H730P. I tried older controller firmware, but nothing worked. The server shuts down until the disk cache has been synced back, and then the display shows "No signal". A warm power cycle helped as a one-shot workaround, same as for the OP.
The error code in the Lifecycle Log is PCI1318, with "A fatal error was detected on a component at bus 4 device 0 function 1 or 0" (that's the controller).
The controller was in RAID mode when the problem first appeared under Proxmox 8, nearly 8 months ago. Since then I have done a lot of trial and error without success. The controller is currently in HBA mode and the firmware is the newest. PVE 9 is freshly installed and updated. It still won't reboot without hands-on intervention.

I would be very happy for any hint to get it working.

Diagnostics with perccli returned this information:

Code:
CLI Version = 007.2313.0000.0000 Mar 07, 2023
Operating system = Linux 6.17.2-1-pve
Status Code = 0
Status = Success
Description = None

Number of Controllers = 1
Host Name = pve-XX
Operating System  = Linux 6.17.2-1-pve

System Overview :
===============

----------------------------------------------------------------------------
Ctl Model            Ports PDs DGs DNOpt VDs VNOpt BBU sPR DS EHS ASOs Hlth
----------------------------------------------------------------------------
  0 PERCH730PAdapter     8   6   0     0   0     0 Opt On  3  N      0 Opt
----------------------------------------------------------------------------

Ctl=Controller Index|DGs=Drive groups|VDs=Virtual drives|Fld=Failed
PDs=Physical drives|DNOpt=Array NotOptimal|VNOpt=VD NotOptimal|Opt=Optimal
Msng=Missing|Dgd=Degraded|NdAtn=Need Attention|Unkwn=Unknown
sPR=Scheduled Patrol Read|DS=DimmerSwitch|EHS=Emergency Spare Drive
Y=Yes|N=No|ASOs=Advanced Software Options|BBU=Battery backup unit/CV
Hlth=Health|Safe=Safe-mode boot|CertProv-Certificate Provision mode
Chrg=Charging | MsngCbl=Cable Failure

ASO :
===

----
Ctl
----
  0
----

Ctl=Controller Index|Cl=Cluster|MD=Max Disks|WC=Wide Cache|SS=Safe Store|FP=Fast Path
Re=Recovery|CR=CacheCade(Read)|RF=Reduced Feature Set|CO=Cache Offload
CW=CacheCade(Read/Write)|X=Not Available/Not Installed|U=Unlimited|T=Trial
|HA=High Availability |SSHA=Single server High Availability
 
Today I did a clean install on a SATA SSD without the controller, AND the problem appeared again. On r/Proxmox I found some people with the same problem:
https://www.reddit.com/r/Proxmox/comments/1o6lrh2/server_not_rebooting_properly/

It seems to be a kernel bug in combination with the Broadcom NICs. Installing a kernel older than 6.6 should be a workaround.
I will try another NIC, from Intel. Maybe that would help, too...
Has anybody checked?
 
I found the solution for me! It wasn't the controller. The problem is kernels >6.5 in combination with the Broadcom network interface cards: they are not compatible. If you disable them in the system BIOS, the machine boots fine without any error AND with the PERC H730P working. Check out the Reddit thread: https://www.reddit.com/r/Proxmox/comments/1o6lrh2/server_not_rebooting_properly/

Now I have put in another network card and everything is fine!

Hope this helps every Dell T440 user give their server a new job :-)
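
For anyone who wants to check whether their box matches this combination before swapping hardware, here is a rough Python sketch. It only reports two things: whether the running kernel is newer than 6.5, and which network interfaces are bound to a Broadcom driver. The driver names in `BROADCOM_DRIVERS` are my assumption (common Linux drivers for Broadcom NICs) and do not come from this thread or the Reddit thread, and the function names are made up for illustration. It does not prove the bug is present, it just narrows things down.

```python
import os
import platform
import re

# Assumption: common Linux driver names for Broadcom NICs; not taken from the thread.
BROADCOM_DRIVERS = {"tg3", "bnx2", "bnx2x", "bnxt_en"}


def kernel_newer_than(release, threshold=(6, 5)):
    """Return True if a release string like '6.17.2-1-pve' is above the major.minor threshold."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) > threshold


def broadcom_interfaces(sys_net="/sys/class/net"):
    """List (interface, driver) pairs whose bound driver is in BROADCOM_DRIVERS."""
    found = []
    if not os.path.isdir(sys_net):
        return found
    for iface in sorted(os.listdir(sys_net)):
        driver_link = os.path.join(sys_net, iface, "device", "driver")
        if not os.path.exists(driver_link):
            continue  # virtual interfaces (lo, bridges) have no device/driver entry
        driver = os.path.basename(os.path.realpath(driver_link))
        if driver in BROADCOM_DRIVERS:
            found.append((iface, driver))
    return found


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r` on Linux
    print(f"kernel {release}: newer than 6.5 = {kernel_newer_than(release)}")
    for iface, driver in broadcom_interfaces():
        print(f"{iface} uses Broadcom driver {driver}")
```

If the script reports a kernel newer than 6.5 and lists at least one interface, the machine at least matches the setup described above, and disabling the onboard NICs in the BIOS or fitting a different card is worth a try.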
 