[SOLVED] PVE and client have different ideas about disk size

gctwnl

Member
Aug 24, 2022
As can be read in https://forum.proxmox.com/threads/a...a-disk-for-a-client-how-do-i-decrease.178622/, I had an accident with my setup when I tried to resize some of my LVs. I am left with two specific problems regarding size, and a warning I do not understand.

This was the situation before I started to adapt it:
Code:
PVE Host INTERN SSD:
    pve/vm-100-disk-2 (500GB) becomes scsi3 /dev/sdd1 on the client and is mounted on /mnt/ServerData;  24GB in use from 492GB
PVE Host EXTERN RAID1 (LUKS):
    rna-mepdm-1/rna-pbs-mepdm-1 (500GB) is mounted on the PVE host as /mnt/pbs-backup-1; 40GB in use from 492GB
    rna-mepdm-1/vm-100-disk-0 (500GB) becomes scsi1 /dev/sdb1 on the client mounted on /mnt/ServerBackup; 431GB in use from 492GB
I wanted to decrease (VG/LV) rna-mepdm-1/rna-pbs-mepdm-1 to 200GB, increase rna-mepdm-1/vm-100-disk-0 to 800GB and decrease pve/vm-100-disk-2 to 300GB to make room for something else.
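
Of the three, the pbs-backup-1 LV is the simple case: it is mounted directly on the host, with no VM or client partition table in between. A minimal sketch of how that shrink can go (assuming ext4; lvreduce -r runs the filesystem check and resize for you, which keeps the filesystem-before-LV order right automatically):
Code:
umount /mnt/pbs-backup-1
lvreduce -r -L 200G rna-mepdm-1/rna-pbs-mepdm-1 # -r/--resizefs shrinks the ext4 fs first, then the LV
mount /mnt/pbs-backup-1 # assuming an fstab entry; otherwise give the device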

My error was that I tried to do everything at PVE level, assuming that the VM would simply mount whatever is there. So I tried to lvresize rna-mepdm-1/vm-100-disk-0, and I made a completely idiotic error by forgetting the difference between the raw device (which was created on PVE and handed to the client) and the partition (which was created on the client). And at some point I did an increase on the wrong virtual 'disk'. In the thread linked above I got help and I am now back to mostly OK (/mnt/pbs-backup-1 has an OK FS on the PVE host and /mnt/ServerData has an OK FS on the client), except for two things.
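
For the record, what a sane grow of a VM disk looks like, as I pieced together afterwards (a sketch: qm resize grows the volume and updates the VM config, so the guest sees the new size at once; growpart comes from the cloud-guest-utils package and may need installing; the device names are illustrative):
Code:
# On the PVE host: grow the virtual disk the supported way
qm resize 100 scsi1 +300G
# Inside the client: grow the partition, then the filesystem
growpart /dev/sdX 1   # check /dev/disk/by-id for the right sdX first!
resize2fs /dev/sdX1
Shrinking a VM disk has no qm equivalent and has to be done top-down (filesystem first, then partition, then the LV on the host), which is exactly the order I did not respect.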

Now, the issues are
  1. On the PVE host, the LV rna-mepdm-1/vm-100-disk-0 (inside the LUKS container that is on the physical external RAID device) is 800GB, but that becomes scsi1 /dev/sdb on the client with 300GB. I am assuming the data on this device has been lost. This is my primary backup, but luckily I have a secondary one in the cloud with the same data (just with a bit less granularity in time). Though the data may actually still be there, I have no idea how to reach it.
  2. On the PVE host, the LV pve/vm-100-disk-2 is 300GB, but that becomes scsi3 /dev/sdd on the client with 800GB (parted shows bigger numbers for all). This is my primary data volume for that VM, and luckily the data isn't lost there. And there is a warning I do not understand that seems clearly related.
On the PVE host:
Code:
root@pve:~# lvs
  WARNING: Thin volume pve/vm-100-disk-2 maps <465.25 GiB while the size is only 300.00 GiB.
  LV              VG          Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data            pve         twi-aotz-- <794.79g             64.62  2.14                         
  root            pve         -wi-ao----   96.00g                                                 
  swap            pve         -wi-ao----    8.00g                                                 
  vm-100-disk-0   pve         Vwi-aotz--   32.00g data        52.66                               
  vm-100-disk-1   pve         Vwi-aotz--   32.00g data        98.32                               
  vm-100-disk-2   pve         Vwi-aotz--  300.00g data        100.00                               
  rna-pbs-mepdm-1 rna-mepdm-1 -wi-ao----  200.00g                                                 
  vm-100-disk-0   rna-mepdm-1 -wi-ao----  800.00g                                                 
root@pve:~# lsblk
NAME                                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                                             8:0    0   1.7T  0 disk
`-sda1                                          8:1    0   1.7T  0 part
  `-luks-fa1483bd-f599-4dcf-9732-c09069472150 252:9    0   1.7T  0 crypt
    |-rna--mepdm--1-vm--100--disk--0          252:10   0   800G  0 lvm 
    `-rna--mepdm--1-rna--pbs--mepdm--1        252:11   0   200G  0 lvm   /mnt/pbs-backup-1

On the client:
Code:
# parted -l /dev/sdb

...

Error: Invalid argument during seek for read on /dev/sdb
Retry/Ignore/Cancel? i                                                 
Error: The backup GPT table is corrupt, but the primary appears OK, so that will be used.
OK/Cancel? OK                                                           
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sdb: 322GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:

...

Warning: Not all of the space available to /dev/sdd appears to be used, you can fix the GPT to use all of
the space (an extra 629145600 blocks) or continue with the current setting?
Fix/Ignore? I                                                           
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sdd: 859GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End    Size   File system  Name              Flags
 1      1049kB  322GB  322GB  ext4         Linux filesystem


I am completely baffled (which is easy to do as I am not a Linux specialist). I appreciate all tips and suggestions.
 
Last edited:
In addition, on the client gdisk reports that it sees the file system so my data might not be lost:

Code:
# gdisk -l /dev/sdb
GPT fdisk (gdisk) version 1.0.8

Warning! Disk size is smaller than the main header indicates! Loading
secondary header from the last sector of the disk! You should use 'v' to
verify disk integrity, and perhaps options on the experts' menu to repair
the disk.
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

Warning! One or more CRCs don't match. You should repair the disk!
Main header: OK
Backup header: ERROR
Main partition table: OK
Backup partition table: ERROR

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: damaged

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
Disk /dev/sdb: 629145600 sectors, 300.0 GiB
Model: QEMU HARDDISK
Sector size (logical/physical): 512/512 bytes
Disk identifier (GUID): 63F416B4-9F47-486C-ACD2-71906BA86F88
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 1048575966
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048      1048575966   500.0 GiB   8300  Linux filesystem
So my suspicion is that the client-created partition table is the problem in both cases. But how to fix this I don't yet know. What still baffles me is how the client can think /dev/sdb is 300GB in size while the LV offered to it by PVE is 800GB in size.
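
With hindsight, the sanity check that exposes such a mismatch is to compare what each layer actually reports instead of trusting the device letters; a sketch, assuming VM 100 and the volume names from above:
Code:
# On the PVE host: real size (in bytes) of each backing volume,
# and which scsiN slot each one occupies in the VM config:
blockdev --getsize64 /dev/rna-mepdm-1/vm-100-disk-0
blockdev --getsize64 /dev/pve/vm-100-disk-2
grep ^scsi /etc/pve/qemu-server/100.conf

# On the client: map each scsiN to its real sdX via the QEMU serial:
ls -l /dev/disk/by-id/ | grep drive-scsi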
 
Last edited:
Additional info. I tried telling QEMU what happened, and I tried rescanning the SCSI bus on the client.
Code:
PVE Host:
  qm stop 100
  qm status 100
  qm set 100 --delete scsi1
  qm set 100 --scsi1 rna-mepdm-1:vm-100-disk-0,size=800G
  qm config 100
  qm start 100

Client:
  echo 1 | sudo tee /sys/class/block/sdb/device/rescan
  lsblk /dev/sdb
  for h in /sys/class/scsi_host/host*; do echo "- - -" | sudo tee $h/scan; done
  lsblk /dev/sdb

No change: the LV is still 800GB on PVE and 300GB on the client. As long as the Ubuntu client kernel thinks the disk is 300GB, I cannot do anything.
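
For completeness, there is also a host-side command for when the size= recorded in the VM config has drifted from the actual volume size; a sketch (it would not have helped here, since, as it turned out later, the client's sdb was a different disk altogether):
Code:
# On the PVE host: re-read volume sizes from the storage layer and
# update the size= entries in the VM config:
qm rescan --vmid 100
qm config 100 | grep scsi
# QEMU only notices a size change made behind its back (e.g. via
# lvresize) after a full stop/start, not after an in-guest reboot:
qm stop 100 && qm start 100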
 
Last edited:
I am afraid I may have a lot of egg on my face. I'll explain later (and then remove this message)
 
So, here is what happened. When I set up my initial environment three years ago, I decided upon the following as initial setup for a backup (most importantly mail and local DNS) server:
  • Proxmox on a decent NUC as HOST
    • Internal 1TB SSD, hardware encrypted at rest
    • External 2TB hardware SSD RAID1, LUKS encrypted at rest
    • A VG on each storage device
      • VMs stored on the internal SSD, backed up (with PBS) to the external RAID1, an LV mounted as /mnt/pbs-backup-1 on the HOST
      • Client server data on the external RAID1, mounted on /mnt/ServerData, backed up with restic to the internal SSD of the host (client-mounted as /mnt/ServerBackup), as well as to the cloud
      • Other machines on the LAN (Macs) back up to that restic server running in the client, as well as to the cloud
      • That way, VMs were stored on internal and backed up on external, while data was stored on external and backed up to internal (as well as to the cloud). Everything has a form of storage robustness.
Now, on PVE, I added 4 hard disks to the VM: one for CLIENT:/ (scsi0), one for CLIENT:/mnt/ServerData (scsi1), one for separate docker storage (so it would not overrun the rest) mounted on CLIENT:/var/lib/docker, and one for CLIENT:/mnt/ServerBackup. As I had no idea yet which elements would grow fastest, I decided to give HOST:/mnt/pbs-backup-1, CLIENT:/mnt/ServerData, and CLIENT:/mnt/ServerBackup 500GB each.
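
For reference, such disks are just entries in the VM's config file, roughly like this (an illustrative excerpt; 'local-lvm' is an assumed storage name, while the rna-mepdm-1 volume ID appears in the posts above). The scsiN key is what PVE hands to QEMU as the drive serial (drive-scsiN); which /dev/sdX the guest assigns to it is entirely up to the guest:
Code:
# /etc/pve/qemu-server/100.conf (illustrative)
scsi0: local-lvm:vm-100-disk-0,size=32G
scsi1: rna-mepdm-1:vm-100-disk-0,size=500G
...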

After three years it became clear that especially the ServerBackup volume was filling up and needed to be extended. My VM hardly changes, so the PBS backup was only 40GB and could be decreased.

Now, this is where I went wrong. On the client, I have 4 devices: sda (32GB), sdb (500GB), sdc (32GB), sdd (500GB). And in my VM config, scsi0 was 32GB for the VM, scsi1 was for the volume containing ServerBackup, scsi2 was for the volume containing docker stuff, and scsi3 was the volume containing ServerData. But what I didn't know was that the device letters aren't fixed. So, I assumed (having done most of this stuff on unixes in the dim past: Ultrix, SunOS, Solaris, AIX, SCO Unix, Minix, HP-UX, various kinds of (somewhat) BSD-like systems such as NeXTSTEP, macOS, etc.) that scsi0 would be sda, scsi1 would be sdb, scsi2 would be sdc, and scsi3 would be sdd. But that is not what happens: Linux connects these in parallel, on a first-come-first-served basis. My fstab mounted by UUID, so it would not notice any difference, and I only seldom looked at lsblk and such. Besides, both /dev/sd[bd] were 500GB, so I could not easily see that they were 'switched'.

So, three years later, I want to resize the 500GB volumes to better values. And I assume the LV on the RAID has ended up as sdb because it is scsi1, and the LV on the internal SSD has ended up as sdd because it was scsi3. This confused me because, in the back of my mind, I wondered: did I decide after all not to put ServerData on the RAID??? Apparently.

But reality is this (three ways to see how it ends up on standard PVE without extra tools installed; run these on the client):
Code:
# sg_map -sd -x # the fourth number column is the LUN
/dev/sg0  1 0 0 0  5
/dev/sg1  2 0 0 0  0  /dev/sda
/dev/sg2  2 0 0 3  0  /dev/sdb
/dev/sg3  2 0 0 2  0  /dev/sdc
/dev/sg4  2 0 0 1  0  /dev/sdd

# ls -l /dev/disk/by-id/ | grep -E "scsi[0123]"
lrwxrwxrwx 1 root root  9 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi0 -> ../../sda
lrwxrwxrwx 1 root root 10 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part3 -> ../../sda3
lrwxrwxrwx 1 root root  9 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi1 -> ../../sdd
lrwxrwxrwx 1 root root 10 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-part1 -> ../../sdd1
lrwxrwxrwx 1 root root  9 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi2 -> ../../sdc
lrwxrwxrwx 1 root root 10 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi2-part1 -> ../../sdc1
lrwxrwxrwx 1 root root  9 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi3 -> ../../sdb
lrwxrwxrwx 1 root root 10 Jan  4 13:38 scsi-0QEMU_QEMU_HARDDISK_drive-scsi3-part1 -> ../../sdb1

# lsblk -o NAME,SIZE,SERIAL,UUID # as it is now
NAME                       SIZE SERIAL               UUID
loop0                     63.8M                     
loop1                     63.8M                     
loop2                     91.4M                     
loop3                     91.4M                     
loop4                     50.8M                     
loop5                     50.9M                     
sda                         32G drive-scsi0         
├─sda1                       1M                     
├─sda2                       2G                      f1fa5b9e-b57f-40d0-880c-7fed8ab4cab4
└─sda3                      30G                      CmkWOb-erKV-R9cx-nbtg-vrNV-4jsu-dYvObl
  └─ubuntu--vg-ubuntu--lv   15G                      22f1ccea-1fa2-4118-8959-73123e2a875b
sdb                        300G drive-scsi3         
└─sdb1                     300G                      109bd659-811d-442e-9539-ebf3673d9ad3
sdc                         32G drive-scsi2         
└─sdc1                      32G                      a74f54a6-7a85-4c3b-839f-c034ef280d0b
sdd                        700G drive-scsi1         
└─sdd1                     700G                      e7639f38-e488-46fb-bd95-64c930c30603
And as you can see
- scsi0 becomes /dev/sda (as expected)
- scsi1 becomes /dev/sdd
- scsi2 becomes /dev/sdc (as expected)
- scsi3 becomes /dev/sdb

Before I knew the above, I started to mess around with lvresize and made a few other mistakes, but mainly:
- I resized a 500GB LV with 460GB in use, which I thought was on the internal SSD, back to 300GB, thinking it held a 30GB file system or so
- I resized a 500GB LV with 40GB in use, which I thought was on the external RAID1, to 800GB.

I ended up booting the VM with SystemRescue, running gdisk to repair partition tables, etc. I was convinced there was an error somewhere and that my client VM was somehow getting wrong information, because I 'knew' what sdb and sdd were: I had configured them in PVE as scsi1 and scsi3. 'Knew not', as it turned out.

Now, when I found out my mistake, I did the following on the HOST (with the VM shut down): I created loop devices for the LVs, ran fsck on the partitions (all fine), and mounted them so I could inspect them. Here are some commands:
Code:
  losetup -fP /dev/pve/vm-100-disk-2 # -f picks the first free loop device, -P scans the partition table; becomes /dev/loop0 if no loops are in use yet, otherwise check the actual number with losetup -a
  losetup -fP /dev/rna-mepdm-1/vm-100-disk-0 # becomes /dev/loop1
  losetup -fP /dev/pve/vm-100-disk-1  # becomes /dev/loop2
  mkdir /mnt/ServerBackup
  mkdir /mnt/ServerData
  mkdir /mnt/VarLibDocker
  mount /dev/loop0p1 /mnt/ServerBackup
  mount /dev/loop1p1 /mnt/ServerData
  mount /dev/loop2p1 /mnt/VarLibDocker

# Save ServerData temporarily in a tar archive, just to be sure; remove afterwards

  tar --directory /mnt/ServerData --create --file /mnt/pbs-backup-1/serverdata.tar.bz2 --bzip2 --acls --xattrs --preserve-permissions --verbose .

# After inspection & backup:
  umount /mnt/ServerData
  umount /mnt/ServerBackup
  umount /mnt/VarLibDocker
I resized the LVs. I ran gdisk on the HOST and repaired the tables (somehow this did not work on the client). I ran GParted on the client (booted with the SystemRescue image) and resized the partitions and file systems to fill up each LV entirely.
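
For reference, the table repair on the host can also be scripted with sgdisk (gdisk's non-interactive sibling) against the loop devices set up above; a sketch:
Code:
sgdisk -e /dev/loop1   # move the backup GPT structures to the (new) end of the disk
sgdisk -v /dev/loop1   # verify headers, CRCs and partition table
partprobe /dev/loop1   # have the kernel re-read the partition table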

Everything works and, amazingly, no data was lost. The only thing that has changed, afaics, is that because I mounted the LVs on the host, it now knows about the partitions inside the crypt container. Before:
Code:
# lsblk -f /dev/sda1
NAME                                        FSTYPE      FSVER    LABEL UUID                                   FSAVAIL FSUSE% MOUNTPOINTS
sda1                                        crypto_LUKS 2              fa1483bd-f599-4dcf-9732-c09069472150                
└─luks-fa1483bd-f599-4dcf-9732-c09069472150 LVM2_member LVM2 001       LhSQs2-gMmx-o81Q-IKjg-dIxj-j8E9-ocrPVU              
  ├─rna--mepdm--1-vm--100--disk--0                                                                                          
  └─rna--mepdm--1-rna--pbs--mepdm--1        ext4        1.0            fb75e648-561d-47a1-948c-83d9d72df80f    426.5G     8% /mnt/pbs-backup-1
After:
Code:
# lsblk -f /dev/sda1
NAME                                        FSTYPE      FSVER    LABEL UUID                                   FSAVAIL FSUSE% MOUNTPOINTS
sda1                                        crypto_LUKS 2              fa1483bd-f599-4dcf-9732-c09069472150                
`-luks-fa1483bd-f599-4dcf-9732-c09069472150 LVM2_member LVM2 001       LhSQs2-gMmx-o81Q-IKjg-dIxj-j8E9-ocrPVU              
  |-rna--mepdm--1-vm--100--disk--0                                                                                          
  | `-rna--mepdm--1-vm--100--disk--0p1      ext4        1.0            109bd659-811d-442e-9539-ebf3673d9ad3                
  `-rna--mepdm--1-rna--pbs--mepdm--1        ext4        1.0            fb75e648-561d-47a1-948c-83d9d72df80f      148G    19% /mnt/pbs-backup-1

I still need to remove the loop devices and mount points, but I am thinking of leaving them there for easy future access.
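
If I do clean them up later, it should be no more than this (the filesystems are already unmounted; the loop numbers assume the losetup order above):
Code:
losetup -d /dev/loop0 /dev/loop1 /dev/loop2   # detach; verify with losetup -a
rmdir /mnt/ServerBackup /mnt/ServerData /mnt/VarLibDocker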

PS. I did try to get some assistance from various LLMs (free ChatGPT and free Gemini). I got a lot of wrong conclusions, command suggestions that combined flags one cannot combine, some really faulty reasoning, and some suggestions that would have destroyed my data. But also some pointers that finally made me find out that scsi[0123] isn't sd[abcd] on Linux. And suddenly it turned out both Ubuntus had not been lying to me; I had been lying to the Ubuntus. Sorry. I have been unable to find the actual command that showed me that PVE was offering the label scsi1 for sdd and scsi3 for sdb (and even then, for a while I thought I had found a technical problem...). One learns, and I write this out in case people run into the same thing. A warning in the PVE manual where it describes adding hard disks to a VM might be a good idea. The role the LLMs played doesn't surprise me (https://ea.rna.nl/the-chatgpt-and-friends-collection/).
 
Last edited:
Next: the combined PVE/PBS upgrade, but I'm going to wait until I have recovered from this adventure.
 
@gctwnl What an odyssey! ;-)
Congratulations on achieving this! :) And thanks for the detailed and informative description!
 
I ended up booting the VM with SystemRescue, running gdisk to repair partition tables, etc. I was convinced there was an error somewhere and that my client VM was somehow getting wrong information, because I 'knew' what sdb and sdd were: I had configured them in PVE as scsi1 and scsi3. 'Knew not', as it turned out.
If you ever need to find out again how the drives map (mostly they are numbered the same, but don't count on that), you can just run lsblk -S.
The serial number contains the numbering from Proxmox (drive-scsi0/1/2, etc.):

Bash:
$ lsblk -S
NAME HCTL       TYPE VENDOR   MODEL          REV SERIAL      TRAN
sda  2:0:0:0    disk QEMU     QEMU HARDDISK 2.5+ drive-scsi0
sdb  3:0:0:1    disk QEMU     QEMU HARDDISK 2.5+ drive-scsi1
sdc  4:0:0:2    disk QEMU     QEMU HARDDISK 2.5+ drive-scsi2
 
Last edited: