Proxmox 7.2 Upgrade Broke My RAID

In case someone has the time and opportunity:
Since we had some similar reports - just to make sure that UBSAN is really not causing the issues - we prepared a test-kernel where UBSAN is disabled:
http://download.proxmox.com/temp/kernel-5.15.35-noubsan/

it would be great if you could:
* fetch the packages
* compare the sha512sums
* install them via `apt install /path/to/packages/pve-kernel-5.15.35-3testubsan-pve_5.15.35-1~testubsan_amd64.deb` (or with `dpkg -i`)

and boot once to see if the issues go away.

Thanks!
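
For anyone unsure about the checksum step, here is a minimal shell sketch of it. `verify_sha512` is a hypothetical helper (not something shipped by Proxmox), and the expected digest has to be copied by hand from wherever the checksums are published next to the packages.

```shell
# Hypothetical helper: succeed only if the file's SHA-512 digest
# matches the expected hex string.
# $1 = path to the downloaded file, $2 = expected digest
verify_sha512() {
    vs_actual=$(sha512sum "$1" | awk '{print $1}')
    [ "$vs_actual" = "$2" ]
}

# Example (path is the placeholder from the post above, digest is yours to fill in):
# verify_sha512 /path/to/packages/pve-kernel-5.15.35-3testubsan-pve_5.15.35-1~testubsan_amd64.deb "<published digest>" \
#     && apt install /path/to/packages/pve-kernel-5.15.35-3testubsan-pve_5.15.35-1~testubsan_amd64.deb
```

Chaining the install with `&&` means nothing is installed if the digest does not match.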
 
Where I work, we were experiencing the same issue, and the suggested kernel resolved it.
When will the kernel changes be pushed to the Proxmox repositories?
 
Could you please provide the journals from both boots (once with the kernel that causes the issue, and once with the one from the link (noubsan))?

In the other threads where people had similar issues, disabling UBSAN did not fix the problem; it only made the error messages point more directly to the actual issue.
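
One possible way to collect those journals is sketched below. It assumes the journal is persistent (otherwise earlier boots are not kept) and that the boot with the regular kernel was the one directly before the current boot; `journalctl --list-boots` shows which offsets are actually available. `save_boot_journal` and the output file names are made up for illustration.

```shell
# Save the journal of one boot to a file.
# $1 = boot offset as understood by `journalctl -b` (0 = current, -1 = previous)
# $2 = output file
# JOURNALCTL may be overridden (e.g. for a dry run); it defaults to the real tool.
save_boot_journal() {
    "${JOURNALCTL:-journalctl}" -b "$1" --no-pager > "$2"
}

# Example, run while booted into the test kernel:
# save_boot_journal -1 journal-regular-kernel.txt
# save_boot_journal 0 journal-noubsan-kernel.txt
```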
 
Here are the dmesg logs from both kernels.
The following message is no longer seen with the provided test kernel.

Code:
UBSAN: array-index-out-of-bounds
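
To compare two saved dmesg logs for this message, a small sketch like the following could be used. `count_ubsan` is a hypothetical helper, and the file names in the example are the attachment names from this post.

```shell
# Print how many lines of a saved dmesg log contain the UBSAN warning.
# Note: grep -c prints 0 but exits non-zero when nothing matches.
count_ubsan() {
    grep -c 'UBSAN: array-index-out-of-bounds' "$1"
}

# Example:
# count_ubsan dmesg-5.15.39-1-pve.txt
# count_ubsan dmesg-5.15.35-1~testubsan.txt
```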
 

Attachments

  • dmesg-5.15.35-1~testubsan.txt
  • dmesg-5.15.39-1-pve.txt
"The following message is no longer seen with the provided test kernel."
That's expected, and the call traces are also gone. However, does anything not work with the regular kernel?

The issue in this thread and the others was that something was actually broken, i.e. the RAID array was not available in the OS, and this is unrelated to UBSAN (at least that's my assumption for now, and it is also what the documentation for this option says).
The fix was usually a firmware upgrade.
 
We are actually experiencing an issue with the NVMe drives not being detected unless they are configured as non-RAID or RAID 0; in both cases the Dell iDRAC creates a virtual disk, and the disk then shows as attached. We are investigating this as a separate issue, since we are using a new RAID controller that we had not tested before. In this case the test kernel does not solve the issue.

The RAID controller affected by this issue is a PERC H755N Front. We have not yet tested kernel 5.15.39-1-pve on other servers.

I ran another test in which I removed kernel 5.13.19-6-pve and removed both the non-RAID and RAID 0 configurations. Both NVMe drives still failed to be detected unless they were configured as non-RAID or RAID 0, in which case a VD is created. While using this kernel, no "UBSAN: array-index-out-of-bounds" warnings are seen.

Code:
[root@pve-00]# dmesg -T | tail
[Tue Jul 12 10:34:21 2022] Started bpfilter
[Tue Jul 12 10:36:42 2022] scsi 0:2:0:0: Direct-Access     NVMe     Dell Ent NVMe P5 .1.5 PQ: 0 ANSI: 7
[Tue Jul 12 10:36:42 2022] sd 0:2:0:0: Attached scsi generic sg2 type 0
[Tue Jul 12 10:36:42 2022] sd 0:2:0:0: [sdb] 3125627568 512-byte logical blocks: (1.60 TB/1.46 TiB)
[Tue Jul 12 10:36:42 2022] sd 0:2:0:0: [sdb] Write Protect is off
[Tue Jul 12 10:36:42 2022] sd 0:2:0:0: [sdb] Mode Sense: 6b 00 10 08
[Tue Jul 12 10:36:42 2022] sd 0:2:0:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
[Tue Jul 12 10:36:42 2022] sd 0:2:0:0: [sdb] Attached SCSI disk
[Tue Jul 12 10:38:39 2022] scsi 0:2:1:0: Direct-Access     NVMe     Dell Ent NVMe P5 .1.5 PQ: 0 ANSI: 7
[Tue Jul 12 10:38:39 2022] sd 0:2:1:0: Attached scsi generic sg3 type 0
[Tue Jul 12 10:38:39 2022] sd 0:2:1:0: [sdc] 3125627568 512-byte logical blocks: (1.60 TB/1.46 TiB)
[Tue Jul 12 10:38:39 2022] sd 0:2:1:0: [sdc] Write Protect is off
[Tue Jul 12 10:38:39 2022] sd 0:2:1:0: [sdc] Mode Sense: 6b 00 10 08
[Tue Jul 12 10:38:39 2022] sd 0:2:1:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA
[Tue Jul 12 10:38:39 2022] sd 0:2:1:0: [sdc] Attached SCSI disk

Have you heard of a similar case to ours?
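
To check quickly which disks the kernel actually attached after changing the controller configuration, the dmesg output above can be filtered. This is only a sketch: `list_attached_disks` is a hypothetical helper, and it assumes the "Attached SCSI disk" line format shown in the log above.

```shell
# Read dmesg text on stdin and print each attached SCSI disk name once.
list_attached_disks() {
    sed -n 's/.*\[\([a-z0-9]*\)\] Attached SCSI disk.*/\1/p' | sort -u
}

# Example:
# dmesg -T | list_attached_disks
```

For the log above this would list sdb and sdc, i.e. the two virtual disks created by the controller.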
 

Attachments

  • dmesg-5.13.19-15.txt
I'm going to chime in and +1 this issue. Different hardware, but it interacted very badly with the RAID controller. Lenovo sent a replacement; after the first boot, that one died too. In the case of Lenovo's 530-8i RAID controller, it locks into a failsafe mode that isn't customer resolvable. I'm waiting for a reply from Lenovo to see if they'll replace it a second time, but it was definitely a shock to have Proxmox cause a RAID controller to go into a permanent failsafe mode. I'm frustrated with the Lenovo side because of how their controller interacts with the OS and doesn't have a recovery path once the issue is triggered. That being said, I wasn't expecting an update through the Proxmox subscription repo to cause issues like this. Since Proxmox isn't on the compatibility list and triggered the controller issues, I might be stuck fronting the cost of a new controller.
This is true. I tried to install Proxmox 7.2 and it failed with the error "No Hard Disk found". After rebooting the server, the RAID controller disabled its write permissions. Thank god Lenovo responded to the problem and will supply a new controller. I am now afraid to install Proxmox 7.2 again. Another problem is that when installing Proxmox 7.1 with RAID 5 configured on the hardware RAID controller, it gets stuck on creating LVs. I hope you have a workaround for these issues.

Server: ThinkSystem SR250
Raid Controller: Lenovo ThinkSystem RAID 530-8i PCIe 12Gb Adapter
 
I'm currently helping a friend of mine who also upgraded from Proxmox 6.x to 7.2.
Previously it worked fine and was stable, until the update...
He now gets the same error as @superjay, stating "No Hard Disk Found", at random points during the day (multiple times).
The system then tries to recover, but fails again, and after a while it logs:
Code:
pve kernel: ata10.00: exception Emask 0x0 SAct 0x7f803103 SErr 0x0 action 0x6 frozen

which later leads to:
Code:
I/O error, dev sdu, sector 21458583134 op 0x0:(READ) flags 0x200000 phy

This happens on different disks. We also tried swapping some cables, but that didn't help.

Could it be an issue with his HBA card? Maybe it's incompatible with Proxmox 7.x.
Does anyone have an idea?
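
To see whether the errors cluster on one disk or are spread over many (the latter would point more towards the HBA, cabling or backplane than towards a single drive), the I/O errors can be tallied per device. A rough sketch, assuming the "I/O error, dev ..." line format quoted above; `io_errors_per_device` is a made-up helper name.

```shell
# Read kernel log text on stdin and print "<count> <device>" per device,
# highest count first.
io_errors_per_device() {
    awk 'match($0, /I\/O error, dev [a-z0-9]+/) {
             # skip the 15-character prefix "I/O error, dev "
             dev = substr($0, RSTART + 15, RLENGTH - 15)
             n[dev]++
         }
         END { for (d in n) print n[d], d }' | sort -rn
}

# Example:
# journalctl -k --no-pager | io_errors_per_device
```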
 

Attachments

  • proxmox_error.log
Thank you for your interest.

I have the latest firmware version for the Smart Array P410 controller:

https://support.hpe.com/connect/s/softwaredetails?language=it&tab=Istruzioni di installazione&softwareId=MTX_5e52f965d84f41c2bb65d33b58

and I have the latest System ROMPaq Firmware for the ML330 G6 (W07) Servers
https://support.hpe.com/connect/s/s...037790d01fb4f7b885cd45e7c&tab=revisionHistory

I should add that I tried to install Proxmox VE 7.2 from a USB key again. The installer finds the RAID LOGICAL_VOLUME, but after starting the installation it freezes when creating partitions.

Old thread, but for future reference:

Download latest HPE tools/Firmware from here:
Code:
https://downloads.linux.hpe.com/SDR/index.html
https://downloads.linux.hpe.com/SDR/repo/mcp/debian/pool/non-free/
Code:
- ssacli
- ssa (optional)
- ssaducli (optional)

Find the latest firmware (RHEL rpm file; use the HPE tools to upgrade):
Code:
https://downloads.linux.hpe.com/SDR/repo/
https://downloads.linux.hpe.com/SDR/repo/spp/

These are more recent versions than the ones on the support page. The "hpsa" module that ships with the kernel is used as the driver; you can use the "ssacli" tool for management.
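
A small sketch for checking which of these tools are installed. It assumes the binaries are named like the packages listed above, which may not hold for every one of them; `check_hpe_tools` is a hypothetical helper.

```shell
# Report for each given tool name whether it is on the PATH.
check_hpe_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: installed"
        else
            echo "$tool: missing"
        fi
    done
}

# Example:
# check_hpe_tools ssacli ssa ssaducli
# With ssacli present, controller status and configuration can be shown with:
# ssacli ctrl all show status
# ssacli ctrl all show config
```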
 