thin LVM possible over LVM-raid

luison

Due to some issues on the system, we are considering moving from our current LVM over Linux RAID (mdadm) setup to an LVM-RAID configuration.

As I understand it, any LVM LV can be converted to thin, so I am assuming I can do the same with a mirrored (raid1) LV.
Does anyone have experience with this setup, and are there any implications I should consider?
Thanks.
 
We've now moved from mdadm + LVM to single partitions that we have included in the pve volume group, but we are now rather confused about whether or not an LVM raid1-type LV can be converted to a thin pool. That is, can a thin pool be raid1 as well?

We have done something like:

Code:
lvcreate -L 240G -n thin pve /dev/nvme0n1p2
      Logical volume "thin" created.

Then we add a mirror to it on the other disk/partition:

Code:
lvconvert --type raid1 --mirrors 1 pve/thin /dev/nvme1n1p2
      Logical volume pve/thin successfully converted.

As we use thin-pool storage for our LXC containers, we assumed we could then just convert it to a thin pool:
Code:
lvconvert --type thin-pool pve/thin
       Converted pve/thin to thin pool.

Everything seemed to work, but the problem is that we are uncertain whether that last conversion affects the previous one. The reason is that when we list with lvs we get:

Code:
thin           pve  twi-a-tz--  240,00g                 0,00   10,42

Attribute characters 1 and 7 show this is a thin pool, but there is no mention of raid1 and no value in the Cpy%Sync column.
Meanwhile, "lvs -a -o +devices" does show it being mirrored across the two partitions:


Code:
[thin_tdata]              pve  rwi-aor---  240,00g                                        24,17            thin_tdata_rimage_0(0),thin_tdata_rimage_1(0)
      [thin_tdata_rimage_0]     pve  iwi-aor---  240,00g                                                         /dev/nvme0n1p2(67074)                
      [thin_tdata_rimage_1]     pve  Iwi-aor---  240,00g                                                         /dev/nvme1n1p2(67075)                
      [thin_tdata_rmeta_0]      pve  ewi-aor---    4,00m                                                         /dev/nvme0n1p2(128514)                
      [thin_tdata_rmeta_1]      pve  ewi-aor---    4,00m                                                         /dev/nvme1n1p2(67074)                
      [thin_tmeta]              pve  ewi-ao----  120,00m                                                         /dev/sdd2(0)


So the doubt now is whether, "behind" the thin pool, the raid1 is still working, or whether the mirror has simply been allocated but is no longer used (wasted space we should remove). Creating the thin pool first and then converting it to raid1 returns an error, so that is not an option.
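
One way to check (just a sketch of what we are thinking, not from any documentation) would be to ask lvs for the segment type of the hidden sub-LVs; if the conversion kept the mirror, thin_tdata should still report raid1:

Code:
lvs -a -o lv_name,segtype,copy_percent pve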

We have not found any documentation about this scenario, and even if it is working we are completely lost on how to monitor the LVM-RAID status, as we were planning to monitor drive status from the lvs output of "r"-type volumes.
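
The only monitoring idea we have so far (untested sketch, using standard lvs report fields) is to query the hidden *_tdata raid LV directly from cron rather than the pool LV itself:

Code:
lvs -a -o lv_name,lv_attr,copy_percent,lv_health_status pve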
 
Additionally, we've now noticed that one of our thin pools has allocated its lvol0_pmspare on a disk we want to remove. Is there any way of moving it to another partition within the volume group?
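
The only thing we can think of (completely untested, so treat it as a guess) is pvmove with -n to restrict the move to that one LV; we don't know yet whether it accepts the hidden pmspare volume:

Code:
# device names are placeholders for the old and new partitions
pvmove -n lvol0_pmspare /dev/OLD_PV /dev/NEW_PV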
 
How did this end up for you? I am implementing something similar but wound up in a weirder place: RAID 10 on four SAS disks (connected to an HBA, no hardware RAID).

I found this wiki, which indicated that thin RAID 10 was not possible in LVM without some trickery: https://wiki.gentoo.org/wiki/LVM#Thin_RAID10
So I decided to try that out and see how it went...

Code:
#1 - create VG named SAS_VG with my 4 disks:
vgcreate SAS_VG /dev/sda /dev/sdb /dev/sdc /dev/sdd

#2 - create standard pool SAS_LV as static 5TB in raid 10:
lvcreate -i 2 -m 1 --type raid10 -L 5T -n SAS_LV SAS_VG

#3 - create metadata pool SAS_META_LV as static 450GB, also in RAID 10: (wasn't sure how big this needed to be)
lvcreate -i 2 -m 1 --type raid10 -L 450G -n SAS_META_LV SAS_VG

#4 - convert SAS_LV (data pool) to thin pool using SAS_META_LV as metadatapool:
lvconvert --thinpool SAS_VG/SAS_LV --poolmetadata SAS_VG/SAS_META_LV


A few problems with this method though:
  1. It shows up as LVM-Thin, which I want... but its Active state is set to No. [screenshot]
  2. The volume group shows on the LVM page of my node, but it shows all disks as full. [screenshot]
  3. It does not show the thin LV I created in the LVM-Thin section. [screenshot]



I sort of expected some amount of functionality to be missing, but I think the kicker is that Active state. I'm fairly certain that's what's causing it to not show usage correctly and not appear in the LVM-Thin section.
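
One thing I might try next (just a guess at the quickest test) is forcing the pool active by hand and re-checking the attr bits:

Code:
lvchange -ay SAS_VG/SAS_LV
lvs -a SAS_VG   # 5th attr character should be 'a' when the pool is active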

Give that wiki a read and try it out. Maybe you'll find something I haven't!
 
I spent the better part of last night trying to figure out what was wrong, and gave up around midnight. I realized I needed to RTFM: https://pve.proxmox.com/pve-docs/chapter-pvesm.html

The problem stemmed from the lvconvert command above not combining my data and metadata LVs like it was supposed to.

There was a lot of troubleshooting in between what I'm showing here, but it was getting tedious. I blew it all away and started over with pvesm remove and vgremove.

Now I start fresh, making it smaller as I thought that was part of the issue:

Code:
root@psivmh-snh-001:~# lvcreate -i 2 -m 1 --type raid10 -L 1T -n SAS_LV SAS_VG
  Using default stripesize 64.00 KiB.
  Logical volume "SAS_LV" created.
root@psivmh-snh-001:~# lvcreate -i 2 -m 1 --type raid10 -L 15G -n SAS_META_LV SAS_VG
  Using default stripesize 64.00 KiB.
  Logical volume "SAS_META_LV" created.
root@psivmh-snh-001:~# lvconvert --thinpool SAS_VG/SAS_LV --poolmetadata SAS_VG/SAS_META_LV
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  WARNING: Converting SAS_VG/SAS_LV and SAS_VG/SAS_META_LV to thin pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert SAS_VG/SAS_LV and SAS_VG/SAS_META_LV? [y/n]: y
  Converted SAS_VG/SAS_LV and SAS_VG/SAS_META_LV to thin pool.

LVM seems to have combined the LVs:
Code:
root@psivmh-snh-001:~# lvs
  LV     VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  SAS_LV SAS_VG twi-a-tz--  1.00t             0.00   0.14                          
  data   pve    twi-a-tz-- 59.66g             0.00   1.59                          
  root   pve    -wi-ao---- 27.75g                                                  
  swap   pve    -wi-ao----  8.00g

And now Proxmox shows my joined LV in the LVM-Thin section! However, it's not showing up as usable storage in the web GUI, and pvesm status doesn't show it either:
Code:
root@psivmh-snh-001:~# pvesm status
Name             Type     Status           Total            Used       Available        %
local             dir     active        28510260         2551800        24487180    8.95%
local-lvm     lvmthin     active        62562304               0        62562304    0.00%

The fix is to register it as storage with pvesm add:

Code:
root@psivmh-snh-001:~# pvesm add lvmthin SAS_THIN --thinpool SAS_LV --vgname SAS_VG
root@psivmh-snh-001:~# pvesm status
Name             Type     Status           Total            Used       Available        %
SAS_THIN      lvmthin     active      1073741824               0      1073741824    0.00%
local             dir     active        28510260         2551892        24487088    8.95%
local-lvm     lvmthin     active        62562304               0        62562304    0.00%

pvesm now reports it as active with the expected space! The GUI shows it in the left-hand server section as well. [screenshot]

Now I want to extend the LV so I can use all of my storage. Take caution here; there's math.
Since we're doing RAID 10, I can't actually use the full 10.92TB that vgs thinks is available:

Code:
root@psivmh-snh-001:~# vgs
  VG     #PV #LV #SN Attr   VSize    VFree
  SAS_VG   4   1   0 wz--n-  <10.92t  8.87t

I only have about 5.45TB usable, since that's the total space of half of my drives. I must use this amount instead of the 10.92TB total, because allocating the full 10.92TB would overfill the RAID 10 and cause many problems. I think LVM would stop any real damage from being done, but just to be on the safe side I will shoot for a 5TB volume. Now I extend my data LV to 5TB:

Code:
lvextend -L 5T SAS_VG/SAS_LV
  Extending 2 mirror images.
  Size of logical volume SAS_VG/SAS_LV_tdata changed from 1.00 TiB (262144 extents) to 5.00 TiB (1310720 extents).
  Logical volume SAS_VG/SAS_LV_tdata successfully resized.
root@psivmh-snh-001:~# lvs
  LV     VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  SAS_LV SAS_VG twi-a-tz--  5.00t             0.00   0.24                          
  data   pve    twi-a-tz-- 59.66g             0.00   1.59                          
  root   pve    -wi-ao---- 27.75g                                                  
  swap   pve    -wi-ao----  8.00g
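
In hindsight, a dry run first would have been safer; LVM commands accept -t/--test, which as far as I know simulates the change without updating metadata (an afterthought, not something I actually ran):

Code:
lvextend --test -L 5T SAS_VG/SAS_LV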

Now we need to extend the metadata size. If we don't do this, LVM won't know how to reference our RAID volume once the metadata section fills up. Please see this section of the Red Hat guide to LVM:
"For each RAID image, every 500MB of data requires 4MB of additional storage space for storing the integrity metadata."

4/500 is 0.008 (0.8%), which is the fraction of your volume that needs to be reserved for metadata. My total is 5TB, so we multiply 5,000,000MB by 0.008 to get 40 GB needed for metadata storage.
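
Quick sanity check of that arithmetic in shell (same round numbers as above, nothing LVM-specific):

Code:
# 4 MB of metadata per 500 MB of data, for 5,000,000 MB (5 TB) of data
echo $(( 5000000 * 4 / 500 ))    # 40000 MB = 40 GB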

But how do we mess with the now baked-in metadata size?
We can see how the data is split across the disks (including metadata size) through lvs -a:

Code:
root@psivmh-snh-001:~# lvs -a
  LV                      VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  SAS_LV                  SAS_VG twi-a-tz--  5.00t             0.00   0.24                         
  [SAS_LV_tdata]          SAS_VG rwi-aor---  5.00t                                    20.23         
  [SAS_LV_tdata_rimage_0] SAS_VG Iwi-aor---  2.50t                                                 
  [SAS_LV_tdata_rimage_1] SAS_VG Iwi-aor---  2.50t                                                 
  [SAS_LV_tdata_rimage_2] SAS_VG Iwi-aor---  2.50t                                                 
  [SAS_LV_tdata_rimage_3] SAS_VG Iwi-aor---  2.50t                                                 
  [SAS_LV_tdata_rmeta_0]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tdata_rmeta_1]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tdata_rmeta_2]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tdata_rmeta_3]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tmeta]          SAS_VG ewi-aor--- 15.00g                                    100.00       
  [SAS_LV_tmeta_rimage_0] SAS_VG iwi-aor---  7.50g                                                 
  [SAS_LV_tmeta_rimage_1] SAS_VG iwi-aor---  7.50g                                                 
  [SAS_LV_tmeta_rimage_2] SAS_VG iwi-aor---  7.50g                                                 
  [SAS_LV_tmeta_rimage_3] SAS_VG iwi-aor---  7.50g                                                 
  [SAS_LV_tmeta_rmeta_0]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tmeta_rmeta_1]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tmeta_rmeta_2]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tmeta_rmeta_3]  SAS_VG ewi-aor---  4.00m
  [lvol0_pmspare]         SAS_VG ewi------- 15.00g

The SAS_LV_tmeta and lvol0_pmspare LVs are the metadata and the spare copy of the metadata, respectively. We can extend the metadata size from 15G to 40G with this command (lvextend reference):

Code:
lvextend --poolmetadatasize +25G SAS_VG/SAS_LV
  Extending 2 mirror images.
  Size of logical volume SAS_VG/SAS_LV_tmeta changed from 15.00 GiB (3840 extents) to 40.00 GiB (10240 extents).
  Logical volume SAS_VG/SAS_LV_tmeta successfully resized.

HOWEVER, this does not resize the spare copy of the metadata, leaving it at 15GB. We need to resize that too. LVM doesn't seem to do it on its own, so we must take the pool offline and then tell it to repair the thin pool, which will fix it:

Code:
root@psivmh-snh-001:~# lvconvert --repair SAS_VG/SAS_LV
  Active pools cannot be repaired.  Use lvchange -an first.
root@psivmh-snh-001:~# lvchange -an SAS_VG/SAS_LV
root@psivmh-snh-001:~# lvconvert --repair SAS_VG/SAS_LV
  WARNING: LV SAS_VG/SAS_LV_meta0 holds a backup of the unrepaired metadata. Use lvremove when no longer required.
  WARNING: New metadata LV SAS_VG/SAS_LV_tmeta might use different PVs.  Move it with pvmove if required.
root@psivmh-snh-001:~# lvs -a
  LV                      VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  SAS_LV                  SAS_VG twi---tz--  5.00t                                                 
  SAS_LV_meta0            SAS_VG rwi---r--- 40.00g                                                 
  [SAS_LV_meta0_rimage_0] SAS_VG Iwi---r--- 20.00g                                                 
  [SAS_LV_meta0_rimage_1] SAS_VG Iwi---r--- 20.00g                                                 
  [SAS_LV_meta0_rimage_2] SAS_VG Iwi---r--- 20.00g                                                 
  [SAS_LV_meta0_rimage_3] SAS_VG Iwi---r--- 20.00g                                                 
  [SAS_LV_meta0_rmeta_0]  SAS_VG ewi---r---  4.00m                                                 
  [SAS_LV_meta0_rmeta_1]  SAS_VG ewi---r---  4.00m                                                 
  [SAS_LV_meta0_rmeta_2]  SAS_VG ewi---r---  4.00m                                                 
  [SAS_LV_meta0_rmeta_3]  SAS_VG ewi---r---  4.00m                                                 
  [SAS_LV_tdata]          SAS_VG rwi---r---  5.00t                                                 
  [SAS_LV_tdata_rimage_0] SAS_VG Iwi---r---  2.50t                                                 
  [SAS_LV_tdata_rimage_1] SAS_VG Iwi---r---  2.50t                                                 
  [SAS_LV_tdata_rimage_2] SAS_VG Iwi---r---  2.50t                                                 
  [SAS_LV_tdata_rimage_3] SAS_VG Iwi---r---  2.50t                                                 
  [SAS_LV_tdata_rmeta_0]  SAS_VG ewi---r---  4.00m                                                 
  [SAS_LV_tdata_rmeta_1]  SAS_VG ewi---r---  4.00m                                                 
  [SAS_LV_tdata_rmeta_2]  SAS_VG ewi---r---  4.00m                                                 
  [SAS_LV_tdata_rmeta_3]  SAS_VG ewi---r---  4.00m                                                 
  [SAS_LV_tmeta]          SAS_VG ewi------- 40.00g                                                 
  [lvol1_pmspare]         SAS_VG ewi------- 40.00g

Another problem: we got two warning messages. One is about potentially using different PVs (which doesn't apply for me, since everything related to this is inside SAS_VG, so I can safely ignore it), and the other tells us to remove the backup of the unrepaired metadata that LVM made.
Simple enough since it shows up as a top-level LV now:

Code:
root@psivmh-snh-001:~# lvs
  LV           VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  SAS_LV       SAS_VG twi---tz--  5.00t                                                 
  SAS_LV_meta0 SAS_VG rwi---r--- 40.00g                                                 
  data         pve    twi-a-tz-- 59.66g             0.00   1.59                         
  root         pve    -wi-ao---- 27.75g                                                 
  swap         pve    -wi-ao----  8.00g                                                 
root@psivmh-snh-001:~# lvremove SAS_VG/SAS_LV_meta0
Do you really want to remove and DISCARD logical volume SAS_VG/SAS_LV_meta0? [y/n]: y
  Logical volume "SAS_LV_meta0" successfully removed

Last, we reactivate our pool since we needed to deactivate it to fix it...

Code:
root@psivmh-snh-001:~# lvchange -ay SAS_VG/SAS_LV
root@psivmh-snh-001:~# lvs
  LV     VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  SAS_LV SAS_VG twi-a-tz--  5.00t             0.00   0.23                         
  data   pve    twi-a-tz-- 59.66g             0.00   1.59                         
  root   pve    -wi-ao---- 27.75g                                                 
  swap   pve    -wi-ao----  8.00g                                                 
root@psivmh-snh-001:~# lvs -a
  LV                      VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  SAS_LV                  SAS_VG twi-a-tz--  5.00t             0.00   0.23                         
  [SAS_LV_tdata]          SAS_VG rwi-aor---  5.00t                                    25.88         
  [SAS_LV_tdata_rimage_0] SAS_VG Iwi-aor---  2.50t                                                 
  [SAS_LV_tdata_rimage_1] SAS_VG Iwi-aor---  2.50t                                                 
  [SAS_LV_tdata_rimage_2] SAS_VG Iwi-aor---  2.50t                                                 
  [SAS_LV_tdata_rimage_3] SAS_VG Iwi-aor---  2.50t                                                 
  [SAS_LV_tdata_rmeta_0]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tdata_rmeta_1]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tdata_rmeta_2]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tdata_rmeta_3]  SAS_VG ewi-aor---  4.00m                                                 
  [SAS_LV_tmeta]          SAS_VG ewi-ao---- 40.00g                                                 
  [lvol1_pmspare]         SAS_VG ewi------- 40.00g                                                 
  data                    pve    twi-a-tz-- 59.66g             0.00   1.59                         
  [data_tdata]            pve    Twi-ao---- 59.66g                                                 
  [data_tmeta]            pve    ewi-ao----  1.00g                                                 
  [lvol0_pmspare]         pve    ewi-------  1.00g                                                 
  root                    pve    -wi-ao---- 27.75g                                                 
  swap                    pve    -wi-ao----  8.00g


Now we go look at the Proxmox web GUI:

[screenshots]
 
When I need to set up a thin pool over LVM RAID1, these four lines have always worked for me.

Code:
pvcreate /dev/sda5 /dev/sdb5

vgcreate vg-b /dev/sda5 /dev/sdb5

lvcreate --type raid1 -m 1 -l 97%FREE -n proxthin vg-b

lvconvert --type thin-pool --poolmetadatasize 1024M --chunksize 128 vg-b/proxthin

Done.
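
If Proxmox should use the pool as VM/CT storage, it still needs to be registered, the same way pvesm add was used earlier in the thread (the storage ID "b-thin" is just an example name):

Code:
pvesm add lvmthin b-thin --vgname vg-b --thinpool proxthin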

Making sure the thinpool data and metadata are getting synced:

Code:
root@pmx8:~# lvs -a -o name,copy_percent,devices vg-b
  LV                     Cpy%Sync Devices
  [lvol0_pmspare]                    /dev/sda5(23104)
  proxthin                           proxthin_tdata(0)
  [proxthin_tdata]          100.00   proxthin_tdata_rimage_0(0),proxthin_tdata_rimage_1(0)
  [proxthin_tdata_rimage_0]          /dev/sda5(1)
  [proxthin_tdata_rimage_1]          /dev/sdb5(1)
  [proxthin_tdata_rmeta_0]           /dev/sda5(0)
  [proxthin_tdata_rmeta_1]           /dev/sdb5(0)
  [proxthin_tmeta]                   /dev/sda5(23040)

By pre-allocating the disk space, I stop worrying about space constraints.

Code:
root@pmx8:~# lvdisplay vg-b
  --- Logical volume ---
  LV Name                proxthin
  VG Name                vg-b
  LV UUID                TDT9UD-pVbD-Szc4-el1x-5w8f-gDj0-xbtXCw
  LV Write Access        read/write
  LV Creation host, time pmx8.xxxxxxxx.com, 2021-05-23 13:20:39 +0100
  LV Pool metadata       proxthin_tmeta
  LV Pool data           proxthin_tdata
  LV Status              available
  # open                 2
  LV Size                <97.00 GiB
  Allocated pool data    1.14%   <-- Current data usage
  Allocated metadata     1.43%   <-- Current metadata usage
  Current LE             23039
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:26

I get informed twice daily of the current space usage in the thinpool with an email from cron.

Code:
/sbin/lvdisplay vg-b | grep Allocated
  Allocated pool data    1.14%
  Allocated metadata     1.43%
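
The cron side is just a plain entry; a minimal sketch of what that could look like (the schedule, MAILTO address, and file name are examples, not my exact setup):

Code:
# /etc/cron.d/thinpool-usage
MAILTO=admin@example.com
0 8,20 * * * root /sbin/lvdisplay vg-b | grep Allocated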