[SOLVED] Proxmox suddenly stopped booting

Apr 3, 2023
System info:
MB: Supermicro H11SSL-i
CPU: EPYC 7551
RAM: 256GB of Samsung @2666
Boot drives: ZFS mirrored Samsung 860 EVO 2TB
Other drives:
-4TB WD Red Pro x8
-1TB Samsung 970 EVO x6
-480GB Seagate FireCuda x1
PSU: Corsair 1200W
PCIe cards:
-4 NVMe expander (no PCI bridge, slot bifurcation)
-2 NVMe expander (same as above)
-Dell Broadcom dual-port 10Gb NIC (actively cooled)

8 WD drives connected via chassis backplane using MB breakout slots
2 Samsung 2TB drives connected via traditional SATA cables

Every drive except the 860 mirror is passed through to a TrueNAS SCALE VM

What I've done so far:

The server was powered down because I was doing electrical work in the house. The shutdown is where the issues started: I could power down the VMs, but Proxmox itself wasn't responding to the power-down command. Eventually, once the VMs were shut down, I used IPMI to hard power off the server. Following this, I installed a UPS in the rack.

On power-up it seemed to be starting normally but then got stuck. From my tests, it's getting past systemd-boot but fails while the kernel is loading everything... maybe? IDK enough to properly troubleshoot further. The ZFS pool is still accessible, and I've successfully mounted it from Ubuntu live media. Unfortunately, I've only recently guessed how far it gets in the boot process, because the quiet option is passed on the kernel command line by the bootloader, and I'm not entirely sure how to change that outside of proxmox-boot-tool.
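For anyone following along, here is a minimal sketch of what dropping the quiet option amounts to. It assumes the usual Proxmox systemd-boot layout, where the kernel command line lives in /etc/kernel/cmdline on the root pool and is applied with `proxmox-boot-tool refresh`; the example string and the helper name `strip_quiet` are made up for illustration, so check them against the actual file before changing anything.

```python
def strip_quiet(cmdline: str) -> str:
    """Return the kernel command line with the standalone `quiet` option removed.

    Only the exact token `quiet` is dropped; every other option is kept
    in its original order.
    """
    return " ".join(tok for tok in cmdline.split() if tok != "quiet")


# Hypothetical example line; the real /etc/kernel/cmdline may differ.
example = "root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet"
print(strip_quiet(example))
```

With the pool mounted from live media, the same edit can be made by hand in a text editor. The new command line only takes effect once it has been synced to the boot partitions, which normally means running `proxmox-boot-tool refresh` from the installed system or a chroot. For a one-off boot, pressing `e` in the systemd-boot menu should also allow editing the options for that boot only.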
I'll post the syslog file from the live USB here momentarily, but I have some other questions:
-Do I need another Backup Server VM to read backups from the ZFS pool they're stored on, or can I just mount that pool and use the backups from there?
-Every now and again I'd see a message along the lines of "kernel is dazed and confused and trying to continue". Does anybody know what this might be? It references power settings, but the processor is on auto for cTDP, affinity, and boost. Aside from turning SVM on, the processor settings are untouched.
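The wording in the second question resembles the kernel's unknown-NMI report ("NMI received for unknown reason ... Dazed and confused, but trying to continue"), which also asks about a power saving mode. Assuming that is the message in question, a small sketch like this could pull the relevant lines out of the syslog copied from the live USB. The function name and the marker strings are my own choices, not something from the Proxmox tooling.

```python
from typing import Iterable, List

# Substrings assumed to identify the kernel's unknown-NMI report.
NMI_MARKERS = ("NMI received for unknown reason", "Dazed and confused")


def find_nmi_lines(lines: Iterable[str]) -> List[str]:
    """Return every log line containing one of the NMI markers, newline stripped."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in NMI_MARKERS)
    ]


# Usage (the path is hypothetical; point it at wherever the pool is mounted):
# with open("/mnt/var/log/syslog", errors="replace") as fh:
#     for hit in find_nmi_lines(fh):
#         print(hit)
```

The timestamps on the matching lines should show whether the messages cluster around the failed shutdown or were already appearing well before it.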

Gonna start pulling PCIe cards now to see where it gets me. Any help would be appreciated!
 
Pulling PCIe cards did nothing. Kernel 5.13.19-6-pve booted to the point of letting me log in on the terminal, but it's completely stuck spinning its wheels trying to run ls.
 
