Proxmox + NVMe/TCP : Is LVM-thin supported and suitable for fast VM provisioning?

lxiosjao

Active Member
Mar 21, 2021
66
1
28
Hi everyone,

We are currently using NVMe over TCP to store the virtual disks of our Proxmox VMs.
The backend storage is a Dell PowerStore (full NVMe) array.


Current architecture

  • 4 hosts connect to the PowerStore via NVMe/TCP
  • HA needed
  • We created an LVM volume group on top of those devices
  • Proxmox uses this LVM storage to host the VM disks

From the Proxmox perspective, the storage is simply LVM on a block device, even though underneath it is NVMe over TCP.
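For reference, a setup like the one described usually looks something like the following. This is only a sketch: the IP address, NQN, device name, and storage ID are placeholders, not values from this thread.

```shell
# Discover and connect the PowerStore NVMe/TCP subsystem (address/NQN are placeholders)
nvme discover -t tcp -a 192.0.2.10 -s 4420
nvme connect -t tcp -a 192.0.2.10 -s 4420 -n nqn.2014-08.com.example:powerstore:target1

# Create the volume group on the resulting namespace device
pvcreate /dev/nvme1n1
vgcreate vg_powerstore /dev/nvme1n1

# Register it in PVE as shared LVM so all 4 hosts can use it (HA-capable)
pvesm add lvm powerstore-lvm --vgname vg_powerstore --shared 1 --content images
```

With `--shared 1`, Proxmox treats the VG as visible from every node, which is what allows HA and live migration, but it is also exactly why plain (thick) LVM is used: the thin-pool metadata of LVM-thin cannot be safely activated on multiple hosts at once.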

Question about LVM-thin


➡️ Is LVM-thin supported or recommended when the underlying storage is NVMe over TCP?


Current issue (deployment time)


Right now we are using LVM (not LVM-thin). Because of this, when deploying VMs from a template we are forced to create full clones.

This significantly increases our VM deployment times.

For example:

  • Our template is a Rocky Linux VM with a 740 GB disk
  • In our previous VMware vSphere CI environment, deploying a VM from this template took around 2 minutes (with thin provisioning).
  • With our current setup in Proxmox, we are seeing ~1 hour 20 minutes because the clone is full and the disk is converted to qcow2

This obviously has a big impact on our CI environment and automation workflows.

My assumption is that thin provisioning (LVM-thin) would allow us to create fast linked clones, which should drastically reduce deployment times.
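To make the difference concrete, here is roughly what the two clone paths look like with `qm clone` (VM IDs are made-up examples; on snapshot-capable storage a template clone defaults to a linked clone):

```shell
# On snapshot-capable storage (LVM-thin, ZFS, qcow2 on a file store):
# a linked clone only references the template's base image -> seconds, any disk size
qm clone 9000 101 --name ci-runner-01

# On plain (thick) LVM, only a full clone is possible:
# every block of the 740 GB disk is copied -> minutes to hours
qm clone 9000 102 --name ci-runner-02 --full 1
```

So the ~1 h 20 min you see is not a tuning problem; it is the full-copy behavior that thick LVM forces.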

Thanks for your help.

 
Hi @lxiosjao ,

Is LVM-thin supported or recommended when the underlying storage is NVMe over TCP?
No.

I am not sure whether the new tech-preview PVE 9 LVM snapshot-as-volume-chain feature would optimize anything for you. If I recall correctly, it still requires 100% space overhead reservation for snapshots/clones.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Thank you for clarifying. I’m a bit unsure about the best approach here. Could you advise on what steps I should take next to optimize storage or manage snapshots/clones effectively?
As it stands, it works, but the deployment time for our templates is far too long.
 
You may have to live with the current deployment time until a technical solution is implemented to address this limitation. There may be something on the horizon, but I do not have visibility into the development priorities of Proxmox Server Solutions GmbH.

One option is to purchase a support subscription for Proxmox Virtual Environment and pursue an official answer through the support channel.

Another option is to engage a Proxmox implementation partner. A deeper review of your workflow and storage configuration may reveal possible optimizations.

Finally, you could consider using a storage backend that is more tightly integrated with PVE and supports the full set of platform features, rather than only the basic create/delete operations.


 
We use XFS-backed NFS (4.2) storage for all PVE (VM + LXC) images, which your PowerStore could serve too.
First, keep a couple of templates: one pseudo template with a 1 MB disk (e.g. ID "9000") and the real ones, which can be different (here Linux) OSes and disk sizes (e.g. ID 999, a Rocky template with an 800 GB disk). Raw is better, but qcow2 is possible with a slightly different process.
The full process, as done in a small bash script (no cloud-init etc. needed):
1.) qm clone 9000 777 --name my-independent-vm --full 1
2.) root@pve1:/mnt/pve/srv_data/images/777# time cp ../999/base-999-disk-0.raw vm-777-disk-0.raw
real 0m1.079s
user 0m0.000s
sys 0m0.005s
root@pve1:/mnt/pve/srv_data/images/777# ls -lh
-rw-r--r-- 1 root root 800G Mar 14 11:37 vm-777-disk-0.raw
3.) losetup -fP vm-777-disk-0.raw
4.) mkdir tmp
5.) mount /dev/$(lsblk -ln /dev/loop?p? | grep p3 | awk '{print $1}') tmp
ls tmp
bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr usr2 var
6.) change hostname, machine-id, host SSH keys + password (via chroot), and static IP if not using DHCP, inside the mounted VM image, using variables passed to the bash script
7.) umount tmp
8.) losetup -d /dev/loop$(lsblk -ln /dev/loop?p? | grep p3 | awk '{print $1}' | cut -d'p' -f2)
9.) qm start 777
So in the end it takes less than a minute to generate an 800 GB VM from a template using NFS, which is also ideal for migrating VMs between PVE hosts in seconds.

Backups of these NFS-based .raw images also work fine, e.g. with:
time (qm suspend 176; cp vm-176-disk-0.raw vm-176-disk-0.raw.snap; qm resume 176) # took ~10 s; afterwards the .snap file can be copied to other media while the VM is no longer affected.
PS: Nearly the same create and backup process applies to LXCs.
 
Have a look: