Decrease size of VM hard disk

tuathan

I have a VM with a 42 GB hard disk (format raw, zfs file system) which I'd like to reduce to 32 GB.

I thought I could use parted to shrink the partition inside the VM first and then decrease the disk size on the PVE side, but I'm struggling to find a method for this.

Is there any method / documentation available?
 
There currently is no supported way of doing this, since it might lead to catastrophic data loss.

If you are sure you know what you're doing, you're already on the right track:
  1. Make sure there is absolutely no important data in the sectors of the disk you're going to remove (starting from the end of the disk)
  2. (Make a backup before proceeding to be really sure)
  3. Shut down the VM
  4. Tell ZFS to shrink the disk (e.g. zfs set volsize=XXXG rpool/data/vm-<vmid>-disk-0, use zfs list to find the correct one; see the example below this list)
  5. Run qm rescan <vmid>
  6. Start the VM to verify
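
A concrete sketch of those steps (the VM ID 100, the target size 32G and the dataset path rpool/data/vm-100-disk-0 are only examples, adjust them to your setup):

Code:
# 1) find the zvol that backs the disk you want to shrink
zfs list -t volume
# 2) with the VM shut down, shrink the zvol - everything beyond the new size is lost!
zfs set volsize=32G rpool/data/vm-100-disk-0
# 3) let Proxmox pick up the new size in the VM configuration
qm rescan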
 
Hi, and sorry for hijacking this thread!

I used these commands to shrink a large disk on a Windows VM.
The previous IT technician set the disk to 8 TB but only 4 TB are required. Now the datastore is over 90% full.

The shrink of the disk looks good; in the storage overview of the PVE GUI the disk shows only ~4 TB.
But the datastore is still more than 90% full. The disk shows a "used" value of 8 TB instead of 4 TB.

PVE Version = 6.4.4
VM = Windows Server 2016
Discard is active
We also tried "Windows Optimize Disk -> Defrag" and sdelete
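
For reference, the discard flag can be double-checked on the host with qm config (the grep is just a quick sketch; the disk key in the output may be scsi0, virtio0, etc.):

Code:
# show the VM config on the host and check that the disk line contains discard=on
qm config 100 | grep -i discard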



Does anyone have an idea why the disk does not free up the space?

Code:
root@slpvep01:~# zfs get all zSpace/vm/vm-100-disk-0
NAME                     PROPERTY              VALUE                  SOURCE
zSpace/vm/vm-100-disk-0  type                  volume                 -
zSpace/vm/vm-100-disk-0  creation              Fri Dec 28  9:59 2018  -
zSpace/vm/vm-100-disk-0  used                  8.25T                  -
zSpace/vm/vm-100-disk-0  available             3.09T                  -
zSpace/vm/vm-100-disk-0  referenced            5.34T                  -
zSpace/vm/vm-100-disk-0  compressratio         1.03x                  -
zSpace/vm/vm-100-disk-0  reservation           none                   default
zSpace/vm/vm-100-disk-0  volsize               4.49T                  local
zSpace/vm/vm-100-disk-0  volblocksize          8K                     default
zSpace/vm/vm-100-disk-0  checksum              on                     default
zSpace/vm/vm-100-disk-0  compression           lz4                    local
zSpace/vm/vm-100-disk-0  readonly              off                    default
zSpace/vm/vm-100-disk-0  createtxg             3236796                -
zSpace/vm/vm-100-disk-0  copies                1                      default
zSpace/vm/vm-100-disk-0  refreservation        8.25T                  local
zSpace/vm/vm-100-disk-0  guid                  2085206188499828143    -
zSpace/vm/vm-100-disk-0  primarycache          all                    default
zSpace/vm/vm-100-disk-0  secondarycache        all                    default
zSpace/vm/vm-100-disk-0  usedbysnapshots       0B                     -
zSpace/vm/vm-100-disk-0  usedbydataset         5.34T                  -
zSpace/vm/vm-100-disk-0  usedbychildren        0B                     -
zSpace/vm/vm-100-disk-0  usedbyrefreservation  2.91T                  -
zSpace/vm/vm-100-disk-0  logbias               latency                default
zSpace/vm/vm-100-disk-0  objsetid              291                    -
zSpace/vm/vm-100-disk-0  dedup                 off                    default
zSpace/vm/vm-100-disk-0  mlslabel              none                   default
zSpace/vm/vm-100-disk-0  sync                  standard               default
zSpace/vm/vm-100-disk-0  refcompressratio      1.03x                  -
zSpace/vm/vm-100-disk-0  written               5.34T                  -
zSpace/vm/vm-100-disk-0  logicalused           3.78T                  -
zSpace/vm/vm-100-disk-0  logicalreferenced     3.78T                  -
zSpace/vm/vm-100-disk-0  volmode               default                default
zSpace/vm/vm-100-disk-0  snapshot_limit        none                   default
zSpace/vm/vm-100-disk-0  snapshot_count        none                   default
zSpace/vm/vm-100-disk-0  snapdev               hidden                 default
zSpace/vm/vm-100-disk-0  context               none                   default
zSpace/vm/vm-100-disk-0  fscontext             none                   default
zSpace/vm/vm-100-disk-0  defcontext            none                   default
zSpace/vm/vm-100-disk-0  rootcontext           none                   default
zSpace/vm/vm-100-disk-0  redundant_metadata    all                    default
zSpace/vm/vm-100-disk-0  encryption            off                    default
zSpace/vm/vm-100-disk-0  keylocation           none                   default
zSpace/vm/vm-100-disk-0  keyformat             none                   default
zSpace/vm/vm-100-disk-0  pbkdf2iters           0                      default



kr Roland
 
Did you try a zpool trim zSpace to trim your pool?

Also, did you shrink the guest's filesystem and partitions before shrinking the zvol?
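
For the trim part, something like this (pool name taken from your output above):

Code:
# show per-vdev trim support / status
zpool status -t zSpace
# start a manual trim of the whole pool
zpool trim zSpace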
 
Hi

I tried it on a test system where the RAID controller has all disks configured as RAID 0 and ZFS is set up as a mirror, with the following result.
Until now I was afraid to do this on the productive system.

Result
Code:
cannot trim: no devices in pool support trim operations

Do you think disks connected via an HBA would give a better result?

Yes, the filesystem in the Windows VM was shrunk from 8 TB to 4 TB beforehand.
With "qm rescan" the config was updated and the disk inside the VM shows 4 TB max size.


kr
Roland
 
I think a trim is not possible on the prod system either. Maybe a RAID controller is used there instead of an HBA as well.

zpool status -t zSpace shows "trim unsupported"

Code:
  pool: zSpace
 state: ONLINE
  scan: scrub repaired 0B in 18:19:58 with 0 errors on Sun Oct 10 18:44:02 2021
config:

        NAME                                          STATE     READ WRITE CKSUM
        zSpace                                        ONLINE       0     0     0
          raidz1-0                                    ONLINE       0     0     0
            ata-WDC_WD40EFRX-68N32N0_WD-WCC7K0ZKN1FL  ONLINE       0     0     0  (trim unsupported)
            ata-WDC_WD40EFRX-68N32N0_WD-WCC7K3UHY7RC  ONLINE       0     0     0  (trim unsupported)
            ata-WDC_WD40EFRX-68N32N0_WD-WCC7K3UHY932  ONLINE       0     0     0  (trim unsupported)
            ata-WDC_WD40EFRX-68N32N0_WD-WCC7K5TR16CJ  ONLINE       0     0     0  (trim unsupported)


Is there any other solution to decrease the used space?
 
You really shouldn't use a RAID controller with ZFS. At least not in production.
See here:

Hardware RAID controllers

Hardware RAID controllers should not be used with ZFS. While ZFS will likely be more reliable than other filesystems on Hardware RAID, it will not be as reliable as it would be on its own.

  • Hardware RAID will limit opportunities for ZFS to perform self healing on checksum failures. When ZFS does RAID-Z or mirroring, a checksum failure on one disk can be corrected by treating the disk containing the sector as bad for the purpose of reconstructing the original information. This cannot be done when a RAID controller handles the redundancy, unless a duplicate copy is stored by ZFS, which is the case when the corruption involves metadata, the copies flag is set, or the RAID array is part of a mirror/raid-z vdev within ZFS.
  • Sector size information is not necessarily passed correctly by hardware RAID on RAID 1 and cannot be passed correctly on RAID 5/6. Hardware RAID 1 is more likely to experience read-modify-write overhead from partial sector writes, and hardware RAID 5/6 will almost certainly suffer from partial stripe writes (i.e. the RAID write hole). Using ZFS with the disks directly allows it to obtain the sector size information reported by the disks and avoid read-modify-write on sectors, while ZFS avoids partial stripe writes on RAID-Z by design, thanks to copy-on-write.
    • There can be sector alignment problems on ZFS when a drive misreports its sector size. Such drives are typically NAND-flash based solid state drives and older SATA drives from the advanced format (4K sector size) transition before Windows XP EoL occurred. This can be manually corrected at vdev creation.
    • It is possible for the RAID header to cause misalignment of sector writes on RAID 1 by starting the array within a sector on an actual drive, such that manual correction of sector alignment at vdev creation does not solve the problem.
  • Controller failures can require that the controller be replaced with the same model, or in less extreme cases, a model from the same manufacturer. Using ZFS by itself allows any controller to be used.
  • If a hardware RAID controller’s write cache is used, an additional failure point is introduced that can only be partially mitigated by additional complexity from adding flash to save data in power loss events. The data can still be lost if the battery fails when it is required to survive a power loss event or there is no flash and power is not restored in a timely manner. The loss of the data in the write cache can severely damage anything stored on a RAID array when many outstanding writes are cached. In addition, all writes are stored in the cache rather than just synchronous writes that require a write cache, which is inefficient, and the write cache is relatively small. ZFS allows synchronous writes to be written directly to flash, which should provide similar acceleration to hardware RAID and the ability to accelerate many more in-flight operations.
  • Behavior during RAID reconstruction when silent corruption damages data is undefined. There are reports of RAID 5 and 6 arrays being lost during reconstruction when the controller encounters silent corruption. ZFS' checksums allow it to avoid this situation by determining whether enough information exists to reconstruct data; if not, the file is listed as damaged in zpool status and the system administrator has the opportunity to restore it from a backup.
  • IO response times will be reduced whenever the OS blocks on IO operations because the system CPU blocks on a much weaker embedded CPU used in the RAID controller. This lowers IOPS relative to what ZFS could have achieved.
  • The controller’s firmware is an additional layer of complexity that cannot be inspected by arbitrary third parties. The ZFS source code is open source and can be inspected by anyone.
  • If multiple RAID arrays are formed by the same controller and one fails, the identifiers provided by the arrays exposed to the OS might become inconsistent. Giving the drives directly to the OS allows this to be avoided via naming that maps to a unique port or unique drive identifier.
    • e.g. If you have arrays A, B, C and D; array B dies, the interaction between the hardware RAID controller and the OS might rename arrays C and D to look like arrays B and C respectively. This can fault pools verbatim imported from the cachefile.
    • Not all RAID controllers behave this way. However, this issue has been observed on both Linux and FreeBSD when system administrators used single drive RAID 0 arrays. It has also been observed with controllers from different vendors.
One might be inclined to try using single-drive RAID 0 arrays to try to use a RAID controller like an HBA, but this is not recommended for many of the reasons listed for other hardware RAID types. It is best to use an HBA instead of a RAID controller, for both performance and reliability.
 
Hi

I know that a RAID controller isn't an acceptable solution for a productive environment. We inherited the system "as is" from another IT supplier.
The customer will replace the system next year (I hope!!)

Many thanks for your hint!
 
There currently is no supported way of doing this, since it might lead to catastrophic data loss.

If you are sure you know what you're doing, you're already on the right track:
  1. Make sure there is absolutely no important data in the sectors of the disk you're going to remove (starting from the end of the disk)
  2. (Make a backup before proceeding to be really sure)
  3. Shut down the VM
  4. Tell ZFS to shrink the disk (e.g. zfs set volsize=XXXG rpool/data/vm-<vmid>-disk-0, use zfs list to find the correct one)
  5. Run qm rescan <vmid>
  6. Start the VM to verify
After doing that, gparted reported that my GPT table is corrupt, but my Windows guest kept running happily.
The backup GPT table is corrupt, but the primary appears OK, so that will be used.

This answer solved my problem and I could repair my backup gpt table using gdisk: https://askubuntu.com/questions/386752/fixing-corrupt-backup-gpt-table/386802#386802
Just be brave: when you see your tables with the p command, you can safely write them with w. The warning that all existing partitions will be overwritten is very generic.
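
Roughly, the gdisk session described in that answer looks like this (a sketch; replace /dev/sdX with the disk that actually holds the affected GPT):

Code:
# gdisk warns about the corrupt backup table and falls back to the intact primary one
gdisk /dev/sdX
# inside gdisk:
#   p   print the partition table and check that it matches your partitions
#   w   write the table back to disk, which also rewrites the corrupt backup GPT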
 