Hello,
Unfortunately, I still have problems with my HPE BL460c Gen8. This time it is a different problem, so I am opening a new topic.
After some problems with LVM on RAID5 on Proxmox 5.x (updated), I decided to:
1. Replace the system disks and install Proxmox 6.1.
2. Replace the unit in Slot 3 with an additional HPE SB40c module, i.e. another box.
The new storage module has a P420 controller with 4x 900 GB SAS in RAID5 for LVM and 2x 2GB SATA for local backups.
The whole installation went fine and I migrated some VMs there. Unfortunately, the system crashed this weekend with the following sequence of events:
"77"," Critical","Drive Array","12/21/2019 23:59","12/21/2019 23:59","1","Drive Array Controller Failure (Slot 3)",
"76"," Critical","PCI Bus","12/19/2019 02:51","12/19/2019 02:51","1","Uncorrectable PCI Express Error (Slot 3, Bus 8, Device 0, Function 0, Error status 0x00000010)",
"75"," Critical","System Error","12/19/2019 02:51","12/19/2019 02:51","1","Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible",
I logged in, rebooted and checked all components. Not a single warning, so I started the machine again.
Today I got:
"81"," Critical","PCI Bus","12/23/2019 00:19","12/23/2019 00:19","1","Uncorrectable PCI Express Error (Slot 3, Bus 0, Device 28, Function 0, Error status 0x00010000)",
"80"," Critical","System Error","12/23/2019 00:19","12/23/2019 00:19","1","Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible",
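To compare the two incidents, the entries above can be put in chronological order and the "Error status" values split into individual bits. The following is only a minimal sketch under two assumptions that are not confirmed anywhere in this thread: the column names in `FIELDS` are my guess at the layout of the IML export, and the bit names assume that the IML "Error status" value follows the PCIe AER Uncorrectable Error Status register layout. HPE firmware may encode this value differently, so treat the decoded names as a hint, not a diagnosis.

```python
import csv
import io
import re
from datetime import datetime

# The five IML entries quoted in this post, copied verbatim.
IML_EXPORT = '''
"77"," Critical","Drive Array","12/21/2019 23:59","12/21/2019 23:59","1","Drive Array Controller Failure (Slot 3)",
"76"," Critical","PCI Bus","12/19/2019 02:51","12/19/2019 02:51","1","Uncorrectable PCI Express Error (Slot 3, Bus 8, Device 0, Function 0, Error status 0x00000010)",
"75"," Critical","System Error","12/19/2019 02:51","12/19/2019 02:51","1","Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible",
"81"," Critical","PCI Bus","12/23/2019 00:19","12/23/2019 00:19","1","Uncorrectable PCI Express Error (Slot 3, Bus 0, Device 28, Function 0, Error status 0x00010000)",
"80"," Critical","System Error","12/23/2019 00:19","12/23/2019 00:19","1","Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible",
'''

# ASSUMPTION: column meaning of the export; the post does not name the columns.
FIELDS = ["id", "severity", "class", "last_update",
          "initial_update", "count", "description"]

# Bit names of the PCIe AER Uncorrectable Error Status register.
# ASSUMPTION: the IML "Error status" value uses this register layout.
AER_UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
}


def parse_iml(text):
    """Parse the quoted CSV lines and return entries sorted by time, then id."""
    entries = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < len(FIELDS):
            continue  # skip blank or incomplete lines
        entry = dict(zip(FIELDS, (cell.strip() for cell in row)))
        entry["id"] = int(entry["id"])
        entry["time"] = datetime.strptime(entry["initial_update"],
                                          "%m/%d/%Y %H:%M")
        entries.append(entry)
    return sorted(entries, key=lambda e: (e["time"], e["id"]))


def decode_status(description):
    """Return the names of all bits set in an 'Error status 0x...' value."""
    match = re.search(r"Error status (0x[0-9A-Fa-f]+)", description)
    if match is None:
        return []
    value = int(match.group(1), 16)
    return [AER_UNCORRECTABLE_BITS.get(bit, f"bit {bit} (not named here)")
            for bit in range(32) if (value >> bit) & 1]


if __name__ == "__main__":
    for entry in parse_iml(IML_EXPORT):
        bits = decode_status(entry["description"])
        suffix = f" -> {', '.join(bits)}" if bits else ""
        print(entry["time"], entry["id"], entry["class"], suffix)
```

If that register layout does apply, 0x00000010 (entry 76) has only bit 4 set and 0x00010000 (entry 81) has only bit 16 set, so the two events would be different error types reported on different bus addresses.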
The server still works, but the error message varies. I checked the HPE forum and found that at least the second case (error 81) is connected with the onboard Matrox graphics card, which produces errors when drivers change. There are threads about such problems with new drivers on both Windows and Linux.
So my question is as follows:
Does Proxmox 6.1 have a new driver for the Matrox G200eh card? Or anything new for the HPE P410/P420 RAID controllers?
Since I changed both Proxmox and some hardware, and moreover the errors differ, I have another puzzle to solve with this machine. What did not change is the server itself, which ran for quite a long time on the previous Proxmox without any failure. So I rather suspect a problem with the new storage bay: maybe the P420 this time, or something in the SB40c module.
But maybe someone has run into a problem with the HPE BL460c Gen8 and Proxmox 6.1?
[edit] I have also found a German thread, https://forum.proxmox.com/threads/watchdog-hp-proliant-d380-g8.45747/, about pretty much the same issue (if I can trust Google Translate). There they were talking about a problem with kernel drivers for HPE. Maybe the problem has come up again with the new Proxmox 6.1 and its kernel: Linux amber 5.3.13-1-pve #1 SMP PVE 5.3.13-1 (Thu, 05 Dec 2019 07:18:14 +0100) x86_64 GNU/Linux?