Enlarging existing Proxmox VE partition in-place

going.concern

New Member
Dec 30, 2023
I have an existing Proxmox 7.4 installation on a device with a hardware RAID controller. I recently changed RAID levels on that controller and grew its single logical volume to use the added capacity, which created additional free space on the single disk visible to Proxmox. Here is what fdisk -l outputs on the Proxmox node:

Code:
The backup GPT table is not on the end of the device.
Disk /dev/sda: 4.91 TiB, 5400907505664 bytes, 10548647472 sectors
Disk model: LOGICAL VOLUME 
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 262144 bytes / 1572864 bytes
Disklabel type: gpt
Disk identifier: BF01F356-72CA-4362-8462-A6F965663B68

Device       Start        End    Sectors  Size Type
/dev/sda1       34       2047       2014 1007K BIOS boot
/dev/sda2     2048    2099199    2097152    1G EFI System
/dev/sda3  2099200 7032432142 7030332943  3.3T Linux LVM

Partition 1 does not start on physical sector boundary.

Note /dev/sda is 4.91 TiB, whereas /dev/sda3 is only 3.3T large.

I would like to extend /dev/sda3 to fill the remaining empty space, and then update Proxmox to use that space. Resizing the logical volumes within Proxmox seems like the easy part, but I first have to extend /dev/sda3, and my limited research suggests this can go badly wrong (though I haven't found much documentation on the subject either way).

I am looking for a solution that preserves the existing contents of my local/local-lvm storage and my VMs but allows me to grow them. I can boot to installation media if necessary, but obviously it would be easier if I could do this all from within Proxmox. Can I safely resize /dev/sda3 to the end of the disk without concern, or are there additional steps needed to safely expand Proxmox? Or should I create another partition in the empty space? Or something else?
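
A generic pre-check before any resize is to capture the current LVM and partition state; this is a sketch only, using standard LVM tools (and sgdisk, assuming the gdisk package is installed):

Code:
# Record the current LVM state (PSize vs DevSize is the interesting comparison later)
pvs -o+dev_size
vgs
lvs -a

# Back up the LVM metadata and the GPT before touching the partition table
vgcfgbackup pve
sgdisk --backup=/root/sda-gpt.bak /dev/sda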
 
The key search phrase: "how to extend partition and lvm"
The first result looks on target: https://networklessons.com/uncategorized/extend-lvm-partition

Make sure you read and understand the entire procedure. You can also test this in a VM.
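
In rough outline there are two common routes; this is a sketch only, assuming the default pve volume group with the thin pool pve/data, so read the linked article carefully and test in a VM first:

Code:
# Route A: grow the existing partition and PV in place
parted /dev/sda resizepart 3 100%   # parted may offer to move the backup GPT table to the end of the disk; allow it
pvresize /dev/sda3
lvextend -l +100%FREE pve/data

# Route B: add the new space as a separate partition and PV
# (create /dev/sda4 in the free space with fdisk/parted, partition type Linux LVM)
pvcreate /dev/sda4
vgextend pve /dev/sda4
lvextend -l +100%FREE pve/data

Either way the thin pool simply grows; only a filesystem-backed LV such as pve/root would additionally need its filesystem resized (lvextend -r, or resize2fs afterwards).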


Blockbridge: Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
This was greatly helpful, thank you. However, I'm now running into a new issue and I'm not sure how to proceed.

I completed all the steps, creating a new physical volume and adding it to the volume group, without error, up to the following line:

Code:
lvextend -l +100%FREE /dev/vg_vmware/lv_root

In my case, I substituted my thin pool pve/data for /dev/vg_vmware/lv_root (I also tried the /dev/mapper/pve-data path, with the same result). I also added a --poolmetadatasize +8G flag as recommended by the wiki. However, I get the following error:

Code:
root@pve:~# lvextend -l +100%FREE --poolmetadatasize +8G /dev/mapper/pve-data
  Size of logical volume pve/data_tmeta changed from 15.81 GiB (4048 extents) to 23.81 GiB (6096 extents).
  device-mapper: reload ioctl on  (253:2) failed: Invalid argument
  Failed to suspend logical volume pve/data

(I'll note here that 253:2 is the pve-data_tmeta LV, not the data pool itself.)

I've attempted a few different things, most notably running lvchange -an pve/data to see whether deactivating the thin pool allows me to extend it. Curiously, attempting to deactivate pve/data results in some strange behavior: for a few minutes the entire system locks up if I try to run even basic commands such as lvdisplay or lvs. Then, once I can run commands again, pve/data has returned to available and I am greeted with the same error when attempting to extend it. I have tried everything I can think of (shutting down all LXCs running on the node, deactivating all the LXC disks with lvchange -an /dev/mapper/vm-100-disk-0, which runs without issue and properly deactivates the LV, and rebooting the node itself) to no avail.
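
For reference, the activation state of the pool and its metadata LV can be checked with something like this (a diagnostic sketch; the device-mapper names assume the standard pve-data_tmeta / pve-data-tpool naming):

Code:
# The fifth character of lv_attr shows the state: a = active, s = suspended
lvs -a -o lv_name,lv_attr,lv_size,data_percent,metadata_percent pve

# Low-level view of the same state from device-mapper
dmsetup info -c pve-data-tpool
dmsetup info -c pve-data_tmeta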

Any advice is appreciated. Thanks.
 
I completed all the steps, creating a new physical volume and adding it to the volume group, without error, up to the following line:
I would have expected you to just expand sda3 to take up the new free space. I take it you created sda4 and added it to the VG?

Plugging the error you receive into Google turns up a few helpful articles. I did not read them in detail, but on the surface it seems your best approach is: back up, reduce/remove the new partition, extend sda3, and repeat the extension (roughly as sketched below).

https://support.hpe.com/hpesc/public/docDisplay?docId=c03744232&docLocale=en_US
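
If you did add an sda4 PV, that recovery would look roughly like the following; an untested sketch that assumes nothing has been allocated on sda4 yet and that you have a verified backup:

Code:
pvmove /dev/sda4          # only needed if extents were already allocated on the new PV
vgreduce pve /dev/sda4    # drop the new PV from the volume group
pvremove /dev/sda4
# delete partition 4 and grow partition 3 to the end of the disk with fdisk/parted, then:
pvresize /dev/sda3
lvextend -l +100%FREE pve/data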



Blockbridge: Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
https://access.redhat.com/solutions/21240

Resolution

  • To recover the volume, it is recommended that the pvresize change is backed-out and the PV size is restored to the original size.
    • You can use vgcfgrestore -f /etc/lvm/backup/<vgname> <vgname> to restore the LVM2 metadata to a state before the change (be sure to use a metadata backup prior to the change from /etc/lvm/[archive|backup]).
    • Determine the old size and change the metadata back:
      • Check the size of the physical volume with pvs (this output shows DevSize is 60G):
        Code:
        # pvs
          PV         VG         Fmt  Attr PSize  PFree  DevSize PV UUID
          /dev/sdb1  VolGroup00 lvm2 a--  90.00g 37.00g 60.00g  lt5jBc-JE70-hlbn-yFtg-CZ5l-mD2c-twu9xj
      • Resize the PV metadata back to the device size:
        Code:
        # pvresize --setphysicalvolumesize 60g /dev/sdb1
  • To correctly extend the physical volume, see How to extend an LVM disk on a RHEL guest machine running on VMware host?.
  • General steps to extend LVM root file system in RHEL 4 can be found in How do I resize the root partition (/) after installation on Red Hat Enterprise Linux 4?

Root Cause

  • device-mapper is unaware of the underlying device resize, because the partition on the device has not been resized. Also, LVM commands fail because LVM believes the partition is larger than it really is.
  • Also, pvresize generally shouldn't be forced by setting --setphysicalvolumesize which overrides the autodetection of the device size. If pvresize can't detect the new storage size, it is likely because the partition on the device has not been resized.
  • From the man page for pvresize:
    Code:
    --setphysicalvolumesize size
    Overrides the automatically-detected size of the PV. Use with care, or prior to reducing the physical size of
    the device.

Diagnostic Steps

  • Check for any physical volumes with pv_size > dev_size.
    • In the below example, /dev/mapper/mpath3 has a pv_size > dev_size. The pv_size is written onto the disk when pvcreate is issued. If the size of the device gets smaller after this point, this can lead to errors such as lvcreate or lvextend failing with "reload ioctl failed: Invalid argument".
      Code:
      # pvs -opv_name,pv_size,dev_size
        PV                 PSize  DevSize
        /dev/mapper/mpath1  8.00m 10.00m
        /dev/mapper/mpath2  8.00m 10.00m
        /dev/mapper/mpath3 10.00m  5.00m
    • Here is a simple script you can run which should flag any devices with the above condition.
      Code:
      # pvs --noheadings --units S --nosuffix -ovg_name,pv_name,pv_size,dev_size | \
        awk '{ if ($3 > $4) { print "Found volume group "$1" with device "$2" that has a pv_size > dev_size.\nIt is recommended that you run pvresize "$2".\n" } }'
    • Sample output for the above script is:
      Code:
      Found volume group VolGroup01 with device /dev/mapper/mpath3 that has a pv_size > dev_size.
      It is recommended that you run pvresize /dev/mapper/mpath3.
  • The following errors can be seen when trying to manipulate LVM Logical Volumes (LV's):
    • When you try to lvremove or lvcreate a new LVM logical volume, it fails with the following messages:
      Code:
      # lvremove VolGroup00/LogVol11
      Do you really want to remove active logical volume LogVol11? [y/n]: y
        Attempted to decrement suspended device counter below zero.
        Logical volume "LogVol11" successfully removed
      # lvcreate --size 10G --name LogVol11 VolGroup00
        device-mapper: resume ioctl on  failed: Invalid argument
        Unable to resume VolGroup00-LogVol11 (253:11)
        Failed to activate new LV.
        Attempted to decrement suspended device counter below zero.
    • The following messages are seen in /var/log/messages at the same time:
      Code:
      # tail /var/log/messages
      kernel: device-mapper: table: 253:11: sdb1 too small for target: start=125831168, len=62906368, dev_size=125837312
      kernel: device-mapper: table: 253:11: sdb1 too small for target: start=125831168, len=62906368, dev_size=125837312


Blockbridge: Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Yes, I created a new partition (sda4) in the free space and then added it to the volume group, since I wasn't sure whether extending sda3 directly might cause data loss given how the LVM partition is laid out, and the article you linked recommended as much. It seemed like the safest path forward.

I did see the Red Hat article but couldn't read it; since it is reproduced on the HPE site, I can see it now. However, although this is the closest match to my issue, I'm not sure it's the same one. Both of my physical volumes have a pv_size that matches their dev_size:

Code:
root@pve:~# pvs -a -v
  PV         VG  Fmt  Attr PSize  PFree   DevSize PV UUID                               
  /dev/sda2           ---      0       0    1.00g                                       
  /dev/sda3  pve lvm2 a--   3.27t <16.38g   3.27t ===
  /dev/sda4  pve lvm2 a--  <1.64t  <1.64t  <1.64t ===

And dmesg outputs a different error from the one in the Red Hat article:

Code:
[===] device-mapper: table: 253:2: zero-length target
[===] device-mapper: ioctl: error adding target to table

Here's a full lsblk:

Code:
root@pve:~# lsblk
NAME                         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                            8:0    0  4.9T  0 disk 
├─sda1                         8:1    0 1007K  0 part 
├─sda2                         8:2    0    1G  0 part 
├─sda3                         8:3    0  3.3T  0 part 
│ ├─pve-swap                 253:0    0    8G  0 lvm  [SWAP]
│ ├─pve-root                 253:1    0   96G  0 lvm  /
│ ├─pve-data_tmeta           253:2    0 15.8G  0 lvm  
│ │ └─pve-data-tpool         253:4    0  3.1T  0 lvm  
│ │   ├─pve-data             253:5    0  3.1T  1 lvm  
│ │   ├─pve-vm--100--disk--0 253:6    0   64G  0 lvm  
│ │   ├─[several more vm volumes]
│ └─pve-data_tdata           253:3    0  3.1T  0 lvm  
│   └─pve-data-tpool         253:4    0  3.1T  0 lvm  
│     ├─pve-data             253:5    0  3.1T  1 lvm  
│     ├─pve-vm--100--disk--0 253:6    0   64G  0 lvm  
│     ├─[several more vm volumes]
└─sda4                         8:4    0  1.6T  0 part
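
Given that the failure is on 253:2 (pve-data_tmeta) with a zero-length target, one way to see what table lvextend is actually trying to load is something like the following (a diagnostic sketch only, not a fix):

Code:
# Current metadata mapping; a healthy linear target should show a non-zero length
dmsetup table pve-data_tmeta
dmsetup status pve-data-tpool

# Dry-run the extension with verbose output to inspect the table LVM builds
lvextend --test -vvv -l +100%FREE --poolmetadatasize +8G pve/data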
 
The underlying problem addressed by both the RH and HPE solutions is that something went wrong in the preceding expansion steps. I don't know how to fix your specific problem. Finding a solution requires going back and retracing all the steps and their outputs, analyzing the various current outputs, and trying different online solutions depending on what you find.
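
As a starting point for that retracing, the outputs worth collecting would be something like this (a generic checklist, not specific to your setup):

Code:
pvs -a -o+dev_size
vgs -v pve
lvs -a -o lv_name,lv_attr,lv_size,seg_pe_ranges pve
dmsetup table
ls -l /etc/lvm/archive/        # metadata snapshots LVM takes before each change
journalctl -k | grep -i device-mapper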

You'll have to decide whether you want to invest your time in trying to solve the issue or just do a backup, reformat, and restore.

Good luck.


Blockbridge: Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
