Hardware RAID LVM Configuration Issues - Seeking Help

Moonstar

New Member
Jan 13, 2025
On a hardware-RAID-configured system, we are having a hard time creating LVM2 groups on the RAID-controlled drives.

Whenever the hardware undergoes a physical power cycle, any LVM2 groups created on these drives are lost and cannot be recovered. Commands like `pvscan`, `lvscan`, or `vgscan` only return the PVE default storage pool and do not detect anything we have done to the other drives. When attempting to restore the storage pool configurations, the system errors out stating that the drive with the specified UUID cannot be located for the restoration.
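
For context, the restore attempts follow the standard LVM metadata recovery flow, which looks roughly like the sketch below; the VG name, UUID, and archive filename are placeholders rather than a verbatim transcript, and the "cannot find device with uuid" style of error is what LVM reports when the PV referenced by the archived metadata is no longer visible.

Bash:
# List archived metadata backups for the volume group (name is an example)
vgcfgrestore --list LVG

# Recreate the missing PV with its original UUID from the archived metadata
# (UUID and archive filename below are placeholders)
pvcreate --uuid "<PV-UUID-from-archive>" \
         --restorefile /etc/lvm/archive/LVG_00000-0000000000.vg /dev/sdc

# Restore the VG metadata and reactivate the volume group
vgcfgrestore LVG
vgchange -ay LVG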

I am unsure whether the problem is in how I am creating the LVM2 groups or something to do with the RAID controller, so I figured I would come here for support. IT believes the system is operating as expected and suggests this is an OS or user error.

Steps used to create the LVM2 Groups:

1. Determining the partition intent:

From the 16 physical drives supplied, we suspect that 4 RAID groups have been created. This suspicion arises from the Serial Numbers reported by the system. So, we are treating the system as if only 4 physical devices are actually in play, not the full 16.
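
A quick way to sanity-check that suspicion is to list the serial numbers and WWNs the kernel reports for those devices; repeated values would mean the same underlying device is being exposed more than once (a sketch; which columns are populated depends on how the controller presents the drives):

Bash:
# List size, model, serial, and WWN for the suspect devices
lsblk -o NAME,SIZE,MODEL,SERIAL,WWN /dev/sd{c..r}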

We know that the end result is 2 storage pools:

- Archival: 75% of the available storage
- Non-Archival: 25% of the available storage

This would mean that one of the RAID pools would just become our Non-Archival storage pool, and the 3 remaining drive pools would become our Archival storage pool.

I wish to achieve this by creating one volume group, referred to as LVG, and two logical volumes, referred to as Archival and NonArchival.
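
For clarity, the end state I am after would look roughly like the sketch below. This is only an outline: the device names are the ones currently visible on this system, and percentage-based allocation uses the lowercase `-l` (extents) flag:

Bash:
# Physical volumes on the four underlying devices
pvcreate /dev/sd{c..f}

# One volume group spanning all of them
vgcreate LVG /dev/sd{c..f}

# Two logical volumes: 25% of the VG for NonArchival, the rest for Archival
lvcreate -n NonArchival -l 25%VG LVG
lvcreate -n Archival -l 100%FREE LVG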

2. Creating physical volumes:

Using the command `pvcreate /dev/sd{c..r}`, I created the Physical Volumes for each of the drives. This command completes without error or warning.

3. Creating the LVG:

I attempted to run the command `vgcreate LVG /dev/sd{c..r}`, but get the following error in return:

Bash:
root@fe:~# vgcreate LVG /dev/sd{c..r}
  Failed to read lvm info for /dev/sdg PVID vgRDKw5AdWtBH7dAoT8cJe8Y4BZnzcS1.
  Failed to read lvm info for /dev/sdh PVID wCV18TWd4jN8wR2MbMv0Pofsyumk64q4.
  Failed to read lvm info for /dev/sdi PVID 0TD2Kd60AcNLFRX4gfTSYkNJVpytStdW.
  Failed to read lvm info for /dev/sdj PVID gJv8XtVeQ9lwXXaOwNJqvlJJ0magNwv0.
  Failed to read lvm info for /dev/sdk PVID vgRDKw5AdWtBH7dAoT8cJe8Y4BZnzcS1.
  Failed to read lvm info for /dev/sdl PVID wCV18TWd4jN8wR2MbMv0Pofsyumk64q4.
  Failed to read lvm info for /dev/sdm PVID 0TD2Kd60AcNLFRX4gfTSYkNJVpytStdW.
  Failed to read lvm info for /dev/sdn PVID gJv8XtVeQ9lwXXaOwNJqvlJJ0magNwv0.
  Failed to read lvm info for /dev/sdo PVID vgRDKw5AdWtBH7dAoT8cJe8Y4BZnzcS1.
  Failed to read lvm info for /dev/sdp PVID wCV18TWd4jN8wR2MbMv0Pofsyumk64q4.
  Failed to read lvm info for /dev/sdq PVID 0TD2Kd60AcNLFRX4gfTSYkNJVpytStdW.
  Failed to read lvm info for /dev/sdr PVID gJv8XtVeQ9lwXXaOwNJqvlJJ0magNwv0.

This is what leads me to believe that we only have access to 4 drives instead of 16. Notice the 4 missing errors: sdc, sdd, sde, and sdf. These are the first drives reported for each serial number, so this makes sense to me. So, what if we attempt to use only those 4?

Bash:
root@fe:~# vgcreate LVG /dev/sd{c..f}
  Volume group "LVG" successfully created
root@fe:~# vgdisplay LVG
  Volume group "LVG" not found
  Cannot process volume group LVG

You can see that it successfully created the group, but then it cannot display any information about said group. I cannot move forward because of this.
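
One thing that stands out in the earlier `vgcreate` output is that the PVIDs repeat (sdg, sdk, and sdo share one, sdh, sdl, and sdp another, and so on), which is what you would expect if the same underlying devices are visible down multiple paths. Beyond that, a few read-only checks that might explain why `vgdisplay` cannot see a VG that `vgcreate` just reported as created (a sketch, assuming a stock lvm2 install):

Bash:
# Is LVM filtering or ignoring these devices?
grep -E 'global_filter|use_devicesfile|filter' /etc/lvm/lvm.conf
lvmdevices                          # lists the LVM devices file, if it is enabled
pvs -a -o pv_name,pv_uuid,vg_name   # does LVM itself still see the new PVs?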

However, when I first attempted this, I did manage to create the group and even the subsequent LVM members using:
- `lvcreate -n NonArchival -l 25%VG LVG`
- `lvcreate -n Archival -l 75%VG LVG`

These volumes did not persist across a power cycle, and we ended up losing all data saved to them.

Bash:
root@fe:~# lspci -knn | grep 'RAID bus controller'
18:00.0 RAID bus controller [0104]: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] [1000:005d] (rev 02)


Another odd thing I have noticed is that the mapping between serial numbers and device names is not stable. What I mean is that when I first created my LVM2 group on this system, `/dev/sdc` belonged to the serial number ending in `10`, not the one ending in `f`.
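
Given that, if I retry this it may be safer to reference the drives by persistent identifiers instead of `/dev/sdX` names, along the lines of the sketch below (the `wwn-*` entry is a placeholder; the actual links depend on the controller and drives):

Bash:
# Persistent names that survive reboots and enumeration-order changes
ls -l /dev/disk/by-id/

# Example only: create the PV against a stable by-id link instead of /dev/sdX
pvcreate /dev/disk/by-id/wwn-0xXXXXXXXXXXXXXXXX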

All of this is a lot of information, but it is everything I can think of that might be relevant to this problem. I am seeking help to create these LVM members and have them survive a power cycle.
 
First, download the "storcli" RAID controller tool from the Broadcom homepage and install it.
"storcli /call/eall/sall show" ?
 
Bash:
root@fe:~# /opt/MegaRAID/storcli/storcli64 /call/eall/sall show
CLI Version = 007.3205.0000.0000 Oct 09, 2024
Operating system = Linux 6.8.12-4-pve
Controller = 0
Status = Success
Description = Show Drive Information Succeeded.


Drive Information :
=================

---------------------------------------------------------------------------------------
EID:Slt DID State DG       Size Intf Med SED PI SeSz Model                     Sp Type
---------------------------------------------------------------------------------------
32:0      0 Onln   0 893.750 GB SATA SSD N   N  512B MZ7LM960HMJP0D3           U  -   
32:1      1 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:2      2 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:3      3 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:4      4 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:5      5 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:6      6 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:7      7 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:8      8 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:9      9 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:10    10 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:11    11 Onln   0 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:12    12 UGood  - 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
32:13    13 UGood  - 893.750 GB SATA SSD N   N  512B Samsung SSD 883 DCT 960GB U  -   
---------------------------------------------------------------------------------------

EID=Enclosure Device ID|Slt=Slot No|DID=Device ID|DG=DriveGroup
DHS=Dedicated Hot Spare|UGood=Unconfigured Good|GHS=Global Hotspare
UBad=Unconfigured Bad|Sntze=Sanitize|Onln=Online|Offln=Offline|Intf=Interface
Med=Media Type|SED=Self Encryptive Drive|PI=PI Eligible
SeSz=Sector Size|Sp=Spun|U=Up|D=Down|T=Transition|F=Foreign
UGUnsp=UGood Unsupported|UGShld=UGood shielded|HSPShld=Hotspare shielded
CFShld=Configured shielded|Cpybck=CopyBack|CBShld=Copyback Shielded
UBUnsp=UBad Unsupported|Rbld=Rebuild
 
You have 2 unassigned SSDs which will NOT be used automatically for rebuilds when an SSD fails, as they are not defined as spares!
"storcli64 /c0/vall show" ?
"storcli64 /c0/bbu show all" ?
"storcli64 /c0 set cacheflushint=1"
 

Below is the output of those requested commands. What concerns me more is that none of the listed devices include the HDDs that are actually giving me the issue; they are all the SSDs that the OS is installed on.

Bash:
root@fe:~# /opt/MegaRAID/storcli/storcli64 /c0/vall show
CLI Version = 007.3205.0000.0000 Oct 09, 2024
Operating system = Linux 6.8.12-4-pve
Controller = 0
Status = Success
Description = None


Virtual Drives :
==============

-------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC     Size Name
-------------------------------------------------------------
0/0   RAID6 Optl  RW     Yes     RWBD  -   OFF 8.728 TB VD00
-------------------------------------------------------------

VD=Virtual Drive| DG=Drive Group|Rec=Recovery
Cac=CacheCade|Rec=Recovery|OfLn=OffLine|Pdgd=Partially Degraded|Dgrd=Degraded
Optl=Optimal|dflt=Default|RO=Read Only|RW=Read Write|HD=Hidden|TRANS=TransportReady
B=Blocked|Consist=Consistent|R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack
FWB=Force WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled
Check Consistency
Bash:
root@fe:~# /opt/MegaRAID/storcli/storcli64 /c0/bbu show all
CLI Version = 007.3205.0000.0000 Oct 09, 2024
Operating system = Linux 6.8.12-4-pve
Controller = 0
Status = Success
Description = None


BBU_Info :
========

----------------------
Property      Value  
----------------------
Type          BBU    
Voltage       3923 mV
Current       0 mA  
Temperature   27 C  
Battery State Optimal
----------------------


BBU_Firmware_Status :
===================

-------------------------------------------------
Property                                   Value
-------------------------------------------------
Charging Status                            None
Voltage                                    OK  
Temperature                                OK  
Learn Cycle Requested                      No  
Learn Cycle Active                         No  
Learn Cycle Status                         OK  
Learn Cycle Timeout                        No  
I2C Errors Detected                        No  
Battery Pack Missing                       No  
Replacement required                       No  
Remaining Capacity Low                     No  
Periodic Learn Required                    No  
Transparent Learn                          No  
No space to cache offload                  No  
Pack is about to fail & should be replaced No  
Cache Offload premium feature required     No  
Module microcode update required           No  
-------------------------------------------------


GasGaugeStatus :
==============

-------------------------------------
Property                   Value    
-------------------------------------
GasGauge StatusCode        0x228    
Fully Discharged           N/A      
Fully Charged              N/A      
Discharging                N/A      
Initialized                N/A      
Remaining Time Alarm       N/A      
Terminate Discharge Alarm  N/A      
Over Temperature           N/A      
Charging Terminated        N/A      
Over Charged               N/A      
Relative State of Charge   100%    
Charger System State       Complete
Remaining Capacity         388      
Full Charge Capacity       390      
Is SOH Good                Yes      
Battery backup charge time 0 hour(s)
-------------------------------------


BBU_Capacity_Info :
=================

--------------------------------------
Property                 Value      
--------------------------------------
Relative State of Charge 100%        
Absolute State of charge 0%          
Remaining Capacity       388 mAh    
Full Charge Capacity     390 mAh    
Run time to empty        Unavailable
Average time to empty    31 min      
Average Time to full     Unavailable
Cycle Count              33          
Max Error                0%          
Remaining Capacity Alarm 0 mAh      
Remaining Time Alarm     0 minutes(s)
--------------------------------------


BBU_Design_Info :
===============

--------------------------------
Property                Value  
--------------------------------
Date of Manufacture     00/00/0
Design Capacity         0 mAh  
Design Voltage          0 mV  
Specification Info      0      
Serial Number           0      
Pack Stat Configuration 0      
Manufacture Name        0x129  
Device Name                    
Device Chemistry              
Battery FRU             N/A    
Transparent Learn       1      
App Data                1      
Module Version          0.6    
--------------------------------


BBU_Properties :
==============

-------------------------------------------
Property             Value                
-------------------------------------------
Auto Learn Period    90d (7776000 seconds)
Learn Delay Interval 0 hour(s)            
Auto-Learn Mode      Transparent          
-------------------------------------------
Bash:
root@fe:~# /opt/MegaRAID/storcli/storcli64 /c0 set cacheflushint=1
CLI Version = 007.3205.0000.0000 Oct 09, 2024
Operating system = Linux 6.8.12-4-pve
Controller = 0
Status = Success
Description = None


Controller Properties :
=====================

---------------------------
Ctrl_Prop            Value
---------------------------
Cache Flush Interval 1    
---------------------------
 
You can see there are 14 SSDs connected to the RAID controller, while 12 of them form 1 RAID6 volume of nearly 9 TB (!!). The BBU looks good!
I am wondering why you have 16x 2.2 TB (~35 TB) of LVM volumes defined and not just 1 ...
There are *NO* HDDs attached to this RAID controller.
Maybe your HDDs are connected directly to the mainboard via SATA (/SAS?), without an LSI chip.
So you may have another, perhaps software, RAID of HDDs which gives you the missing ~26 TB, (badly) mixed into a (hybrid) pool of 35 TB ??
"cat /proc/mdstat" ?
pvscan ?
pvs ?
lvs ?
 
Regarding "I am wondering why you have 16x 2.2 TB (~35 TB) of LVM volumes defined and not just 1": the moment we run the `pvcreate` command, all affected disks immediately identify as LVM members. The intent is to have only one volume group span the physical devices.
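
A quick, read-only way to confirm which signatures those disks actually carry (a sketch):

Bash:
# List existing signatures without modifying anything
blkid /dev/sd{c..r}
wipefs /dev/sdc   # with no -a/-o option, wipefs only prints the signatures it finds
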
Regarding "Maybe your HDDs are connected directly to the mainboard via SATA (/SAS?), without an LSI chip": I can ask our IT person for more specifics on this.
As for the commands requested:
Bash:
root@fe:~# cat /proc/mdstat
Personalities : 
unused devices: <none>
Bash:
root@fe:~# pvscan
  PV /dev/sda3   VG pve             lvm2 [<8.73 TiB / <16.25 GiB free]
  Total: 1 [<8.73 TiB] / in use: 1 [<8.73 TiB] / in no VG: 0 [0   ]
Bash:
root@fe:~# pvs
  PV         VG  Fmt  Attr PSize  PFree  
  /dev/sda3  pve lvm2 a--  <8.73t <16.25g
Bash:
root@fe:~# lvs
  LV            VG  Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <8.58t             1.92   0.68                            
  root          pve -wi-ao---- 96.00g                                                    
  swap          pve -wi-ao----  8.00g                                                    
  vm-100-disk-0 pve Vwi-aotz-- 32.00g data        8.90                                   
  vm-100-disk-1 pve Vwi-aotz--  1.95t data        1.62                                   
  vm-100-disk-2 pve Vwi-aotz-- <5.86t data        0.83                                   
  vm-101-disk-0 pve Vwi-aotz-- 32.00g data        37.47                                  
  vm-102-disk-0 pve Vwi-aotz-- 96.00g data        31.80                                  
  vm-103-disk-0 pve Vwi-aotz-- 96.00g data        27.23                                  
  vm-104-disk-0 pve Vwi-aotz-- 96.00g data        16.12

I also ran an `lsblk` command to try and give you a bigger picture:
Bash:
root@fe:~# lsblk
NAME                         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda                            8:0    0  8.7T  0 disk 
├─sda1                         8:1    0 1007K  0 part 
├─sda2                         8:2    0    1G  0 part /boot/efi
└─sda3                         8:3    0  8.7T  0 part 
  ├─pve-swap                 252:0    0    8G  0 lvm  [SWAP]
  ├─pve-root                 252:1    0   96G  0 lvm  /
  ├─pve-data_tmeta           252:2    0 15.9G  0 lvm  
  │ └─pve-data-tpool         252:4    0  8.6T  0 lvm  
  │   ├─pve-data             252:5    0  8.6T  1 lvm  
  │   ├─pve-vm--100--disk--0 252:6    0   32G  0 lvm  
  │   ├─pve-vm--100--disk--1 252:7    0    2T  0 lvm  
  │   ├─pve-vm--100--disk--2 252:8    0  5.9T  0 lvm  
  │   ├─pve-vm--101--disk--0 252:9    0   32G  0 lvm  
  │   ├─pve-vm--102--disk--0 252:10   0   96G  0 lvm  
  │   ├─pve-vm--103--disk--0 252:11   0   96G  0 lvm  
  │   └─pve-vm--104--disk--0 252:12   0   96G  0 lvm  
  └─pve-data_tdata           252:3    0  8.6T  0 lvm  
    └─pve-data-tpool         252:4    0  8.6T  0 lvm  
      ├─pve-data             252:5    0  8.6T  1 lvm  
      ├─pve-vm--100--disk--0 252:6    0   32G  0 lvm  
      ├─pve-vm--100--disk--1 252:7    0    2T  0 lvm  
      ├─pve-vm--100--disk--2 252:8    0  5.9T  0 lvm  
      ├─pve-vm--101--disk--0 252:9    0   32G  0 lvm  
      ├─pve-vm--102--disk--0 252:10   0   96G  0 lvm  
      ├─pve-vm--103--disk--0 252:11   0   96G  0 lvm  
      └─pve-vm--104--disk--0 252:12   0   96G  0 lvm  
sdb                            8:16   0 14.9G  0 disk 
├─sdb1                         8:17   0    4M  0 part 
├─sdb5                         8:21   0  250M  0 part 
├─sdb6                         8:22   0  250M  0 part 
├─sdb7                         8:23   0  110M  0 part 
├─sdb8                         8:24   0  286M  0 part 
└─sdb9                         8:25   0  6.4G  0 part 
sdc                            8:32   0    2T  0 disk 
sdd                            8:48   0    2T  0 disk 
sde                            8:64   0    2T  0 disk 
sdf                            8:80   0    2T  0 disk 
sdg                            8:96   0    2T  0 disk 
sdh                            8:112  0    2T  0 disk 
sdi                            8:128  0    2T  0 disk 
sdj                            8:144  0    2T  0 disk 
sdk                            8:160  0    2T  0 disk 
sdl                            8:176  0    2T  0 disk 
sdm                            8:192  0    2T  0 disk 
sdn                            8:208  0    2T  0 disk 
sdo                            8:224  0    2T  0 disk 
sdp                            8:240  0    2T  0 disk 
sr0                           11:0    1 1024M  0 rom  
sdq                           65:0    0    2T  0 disk 
sdr                           65:16   0    2T  0 disk
 
Update from internal IT team:

The original specs of the server were mis-reported. It appears that there are no HDDs in the server; they are all SSDs. Additionally, according to the team, they are all attached to the hardware RAID controller.
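
That is hard to reconcile with the storcli output above, which only shows the 14 SSDs on controller 0, so my next step is to confirm which controller or HBA the 2 TB devices are actually sitting behind. A rough sketch of the checks I intend to run (nothing here is system-specific):

Bash:
# Which transport does each of the 2 TB devices use?
lsblk -o NAME,SIZE,TRAN,VENDOR,MODEL /dev/sd{c..r}

# Walk the sysfs path to see which PCI device each block device hangs off
udevadm info --query=path /dev/sdc

# List every storage controller the OS can see, not just the MegaRAID
lspci -nn | grep -iE 'raid|sas|sata|scsi'

# Ask storcli about every controller it manages
/opt/MegaRAID/storcli/storcli64 show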
 
