Hello all, and thanks in advance for any suggestions. Back in October I built myself a Proxmox server using an i9-11900K on an ASUS Z590-A Gaming WiFi, 128GB of OLOy 3200 CL16 RAM, and a 1TB Intel 960 SSD, in a Supermicro SuperChassis with built-in redundant 1400W power supplies. I also have an Adaptec 2274400-R RAID card with ten 6TB Seagate SAS HDDs in RAID 10. Yes, I know that the individual drives will not be visible to the OS, and that is OK with me. Proxmox installed flawlessly, other than an issue with the network interface sometimes being lost at boot, which I was able to fix right away. The server was working flawlessly for what I needed from it and was not giving me any issues. I backed up the installation SSD to another SSD and cold stored it in case I ever ran into issues. I validated that the backup was working before storing it.
At the beginning of March, I noticed that the system was becoming unresponsive after being on 24/7 since the new year. According to the logs, it had installed some updates automatically a few days earlier. I restarted the system hoping this would fix my issues. Unfortunately, it would not boot back up, giving the error “Volume group “PVE” not found. Cannot process volume group PVE”. Upon seeing this, I tried switching over to my cold stored SSD, hoping that the issue was either with the SSD or the installation. Unfortunately, this gave me the same result, and I also started to get a second error on either SSD: “no suitable video mode found. Booting in blind mode”. I looked up the “no suitable video mode” error and found that sometimes you need to set the default video device in the BIOS to the CPU integrated graphics. This changed nothing; I still got the same two errors.
I then decided to try reinstalling the OS, but this again gave me some issues. I started off getting the “no suitable video mode found. Booting in blind mode” error, and after a few cold boots I could get it to actually start an installer. Upon booting into the installer I would get a few different errors: “VFS: unable to mount root fs on unknown-block(1,0)”, “kernel panic not syncing: attempted to kill init”, and “Kernel panic - not syncing: fatal exception in interrupt Kernel offset: 0x2d000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end kernel panic - not syncing: fatal exception in interrupt ]---”. After seeing all these errors, I thought it might be my installation medium, so I updated my ISO from proxmox-ve_7.1-1 to proxmox-ve_7.1-2 and tried 4 different ISO-to-USB burning tools on 3 different flash drives, all producing the same errors. I then tried a clean Debian install in case there was an issue with the Proxmox image, and again it would not boot and the same errors came up. Next I wondered whether the kernel panic could be a RAM issue, so I ran memtest86. All memory reported good after running 4 passes on each stick individually and on all of them together. During the memtest process I also completely disassembled and reassembled the system to rule out a hardware issue, but I got the same errors afterwards. At this point I tried a BIOS update in case it was a BIOS issue, again with no real success. Occasionally, on a cold boot afterwards, I could get into the Proxmox installer without any errors, but at the network interface selection window it would not see any NIC and would lock up.
At this point I am stumped, since nothing I do seems to get the system working again. Any additional diagnostic steps I should try to get it installed would be greatly appreciated. I can take photos of the screen during boot on request if anyone wants to see more of what happens.
Thanks!