Can we use lvm cache in Proxmox 4.x?

e100
Has anyone tried setting up lvm cache? It's a fairly new cache tier based on dm-cache.
Theoretically we should be able to add an SSD cache to any logical volume that Proxmox has created for VM disks.

It supports writethrough and writeback cache modes.
With writethrough no data is lost if the cache device fails.
You can read all about it in the "lvmcache" man page.
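For example (a minimal sketch with a made-up VG/LV name, assuming a reasonably recent LVM version), the mode of an existing cached LV can be checked and switched at runtime:

Code:
# show the current cache mode of a hypothetical cached LV "vg/cachedlv"
lvs -o lv_name,cache_mode vg/cachedlv
# switch between writeback and writethrough
lvchange --cachemode writeback vg/cachedlv
lvchange --cachemode writethrough vg/cachedlv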

This appears to be a good how-to on setting it up on Debian Jessie, so I can only assume it would work in Proxmox 4.x:
http://www.bradfordembedded.com/2015/03/lvmcache/

Article with some simple benchmarks:
http://www.nor-tech.com/the-benefits-of-ssd-caching/
 
Technically it should be possible. I used flashcache with Proxmox 3.4 for a while, but I do not know whether this is a supported configuration or not.
 
I personally prefer ZFS for such setups, because it is easier to manage and provides more features.
 
I switched to ZFS on the machine where I used flashcache before. It has more features (that's why I'm using it right now), but ZFS is much slower - an order of magnitude slower than LVM-backed, encrypted flashcache! At least it feels that way. It is a Dell notebook with a 32 GB SSD (all L2ARC, because I do not have many sync writes) and a 500 GB SATA disk.
 
@dietmar: I fell in love with ZFS, but I never encountered "fast". I tried 4 different systems and have a big machine equipped with 6x enterprise SSDs running ZFS, and it is not "really fast"; ext4 is still an order of magnitude faster. Maybe my hardware is too cheap - it is in the low five figures (without SSDs) per 1U - but still.

I only wanted to say that I have had better experiences with SSD caching devices on LVM than with ZFS. The laptop booted and felt faster.
 
Has anyone tried setting up lvm cache? It's a fairly new cache tier based on dm-cache.
Theoretically we should be able to add an SSD cache to any logical volume that Proxmox has created for VM disks.
.....
Yes, that is also my question. For a new server setup with hardware RAID (RAID 10 with 4x HDDs) and two Intel SSDs, I want to use the SSDs for caching (dm-cache). I would therefore repeat the question:
Is there someone who already has experience with LVM/dm-cache and Proxmox 4.x?
Can someone describe how he implemented it?
EDIT:
An important question for me in this case is:
If I perform a new setup with dm-cache, can I be sure that the next Proxmox update will not destroy everything again?
Is this a stable feature?

kind regards,
maxprox
 
I did it last week and it works flawlessly. VERY fast ZFS with ONE disk.

It is "hacked" in, because you need to start it manually before ZFS starts. There is, IMHO, no automatic configuration available in Debian Jessie at this moment, so it is of course not supported by Proxmox - but if you're familiar with Debian, it's very easy to implement.
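For anyone who wants to automate that manual start: one possible approach is a small systemd unit that activates the cached LV before the ZFS import services run. This is only a sketch, not what was used above; the LV name vg0/cached_disk is made up and the ZFS unit names depend on how your ZoL version is started:

Code:
# /etc/systemd/system/lvmcache-before-zfs.service  (sketch, adjust names)
[Unit]
Description=Activate LVM-cached LV before ZFS import
DefaultDependencies=no
Before=zfs-import-cache.service zfs-import-scan.service

[Service]
Type=oneshot
ExecStart=/sbin/lvchange -ay vg0/cached_disk

[Install]
WantedBy=zfs-import-cache.service

Enable it with "systemctl daemon-reload && systemctl enable lvmcache-before-zfs.service".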
 
BTW: my next step is to use hardware-accelerated encryption to build a fully encrypted ZFS with dm-cache on SSD.
It's in German: http://falkhusemann.de/blog/2014/01/proxmox-ve-mit-software-raid-und-full-disk-encryption/

At first I wanted to work with ZFS, but now I have got a motherboard with an LSI hardware RAID controller onboard. Therefore, I will build the RAID 10 via the hardware controller.
Is it possible - after a clean installation of Proxmox (without ZFS) - to then add two SSDs as a dm-cache?
Similar to the following instructions?
https://rwmj.wordpress.com/2014/05/22/using-lvms-new-cache-feature/
 

Thank you for the link, but I'm familiar with every kind of encryption in Linux; what I have not tried yet is ZFS on an encrypted dm-cache on an encrypted disk. Without having read the article: the setup from the provided link has been possible with the stock Debian installer since at least Debian Wheezy. So no news to me.

At first I wanted to work with ZFS, but now I have got a motherboard with an LSI hardware RAID controller onboard. Therefore, I will build the RAID 10 via the hardware controller.
Is it possible - after a clean installation of Proxmox (without ZFS) - to then add two SSDs as a dm-cache?
Similar to the following instructions?
https://rwmj.wordpress.com/2014/05/22/using-lvms-new-cache-feature/

Short answer: of course! Everything is possible in Linux, most of it even without a reboot :-D

Longer answer:
If you install on only a very small volume and leave the rest untouched, you can then create a dm-cached device on top of LVM.

It will be easier (and, performance-wise, faster) if you install Debian Jessie as you like, with one partition (4 GB) for Proxmox, one for swap (e.g. 16 GB) and one partition for the rest; this remainder is then used for creating the dm-cache afterwards. Then install Proxmox on top of Jessie and do the rest. Do this only if you're familiar with the topic, and do trial runs with Proxmox as a KVM VM itself, taking snapshots along the way so you can roll back if you're unsure how these things work. See the sketch below for an example layout.
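As a rough illustration of such a layout (a sketch only; the device name /dev/sda and the sizes are examples, and BIOS/UEFI boot partitions are ignored, whether you create this in the Debian installer or by hand):

Code:
parted -s /dev/sda mklabel gpt
parted -s /dev/sda mkpart primary 1MiB 5GiB     # small / for Debian + Proxmox
parted -s /dev/sda mkpart primary 5GiB 21GiB    # ~16 GB swap
parted -s /dev/sda mkpart primary 21GiB 100%    # the rest: PV for LVM + dm-cache
pvcreate /dev/sda3
vgcreate pve /dev/sda3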
 
LVM cache / dm-cache
My HowTo
With the hardware described in this post:
https://forum.proxmox.com/threads/zfs-or-hardware-raid-new-hardware-setup.26586/
I decided to use the following setup.
First of all I used the onboard LSI hardware RAID controller, an AVAGO 3108 MegaRAID.
Some tools for Debian Jessie:
1. sources.list:
Code:
 cat /etc/apt/sources.list
...
deb http://hwraid.le-vert.net/debian jessie main

2. install:
megacli and megactl (which contains megasasctl);
maybe have a look at the Thomas Krenn wiki or whatever you like:
https://www.thomas-krenn.com/de/wiki/MegaCLI
3. use it for a short overview:
Code:
megasasctl -B -t -v

I built a RAID 10 from the four 2 TB SAS HDDs.
Then I installed Proxmox as usual without ZFS (maxroot = 23 GB; maxswap = 4 GB, ext4).

Code:
=> lsblk
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda            8:0    0   3.7T  0 disk
├─sda1         8:1    0  1007K  0 part
├─sda2         8:2    0   127M  0 part
└─sda3         8:3    0   3.7T  0 part
  ├─pve-root 252:0    0    27G  0 lvm  /
  ├─pve-swap 252:1    0     8G  0 lvm  [SWAP]
  └─pve-data 252:2    0   3.6T  0 lvm  /var/lib/vz

After that I installed the Intel Enterprise SSDs
and built a RAID 0 with the same RAID controller.

Code:
 => parted -l
Model: AVAGO SMC3108 (scsi)
Disk /dev/sda: 4000GB
(Hardware Raid10 (4x2TB SAS))
.....
Disk /dev/sdb: 199GB
(Hardware Raid0 (2x100SSD))

According to the Red Hat admin documentation you can use a whole device, such as my RAID 0, as a physical volume for LVM2, without a partition table and without any partition:
"If you are using a whole disk device for your physical volume, the disk must have no partition table."
"You can remove an existing partition table by zeroing the first sector with the following command:"

Code:
# dd if=/dev/zero of=PhysicalVolume bs=512 count=1

Set up /dev/sdb as a physical volume:
Code:
# pvcreate /dev/sdb

# lvmdiskscan
  ...
  /dev/sdb  [  185.31 GiB] LVM physical volume
  ...
  1 LVM physical volume whole disk

As you know, the name of the Proxmox VG (volume group) is "pve".
Very important for using dm-cache: both the logical volume for the data
and the logical volumes for the cache have to be in the same volume group ("pve").
For that reason the existing volume group has to be extended
with the new cache device.

Code:
# vgscan
"Found volume group "pve" using metadata type lvm2"

# vgextend pve /dev/sdb
" Volume group "pve" successfully extended"

You can check it with "vgdisplay":
Code:
before:
# vgdisplay

VG Name  pve
  Metadata Areas  1
  Metadata Sequence No  4
  VG Access  read/write
  VG Status  resizable
  MAX LV  0
  Cur LV  3
  Open LV  3
  Max PV  0
  Cur PV  1
  Act PV  1
  VG Size  3.64 TiB
  PE Size  4.00 MiB
  Total PE  953567
  Alloc PE / Size  949472 / 3.62 TiB
  Free  PE / Size  4095 / 16.00 GiB
  VG UUID  QIzoZv-EoMX-ZWvR-LRj0-Eofo-o68H-i0vjMz

afterwards:
# vgdisplay
VG Name  pve
...
  Metadata Areas  2
  Metadata Sequence No  5
...
  Cur PV  2
  Act PV  2
  VG Size  3.82 TiB
  PE Size  4.00 MiB
  Total PE  1001006
  Alloc PE / Size  949472 / 3.62 TiB
  Free  PE / Size  51534 / 201.30 GiB
  VG UUID  QIzoZv-EoMX-ZWvR-LRj0-Eofo-o68H-i0vjMz

Now we create the important cache LVs. There are two different cache LVs:
A - the cache data LV, named CacheDataLV in my setup
B - the cache metadata LV, named CacheMetaLV in my setup
Have a look at "man lvmcache".

My PV (2x 100 GB SSDs) has a size of 185 GB; I will use about 0.5 GB / 512 MB for the CacheMetaLV
and 160 GB for the CacheDataLV. Nowhere did I find information that you have to calculate
the exact values, therefore I used estimated values.
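For what it is worth, my reading of "man lvmcache" is that the metadata LV should be roughly 1/1000 of the cache data LV, with a minimum of 8 MiB, so 512 MB is a comfortable estimate for a 160 GB cache (a quick check, not part of the original recipe):

Code:
# rule of thumb: metadata size ~ data size / 1000, minimum 8 MiB
echo $(( 160 * 1024 / 1000 ))   # -> 163 (MiB), well below the 512 MiB used here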

Code:
# lvcreate -n CacheDataLV -L CacheSize VG FastPVs
and
# lvcreate -n CacheMetaLV -L MetaSize VG FastPVs

For me:
Code:
# lvcreate -n CacheDataLV -L 160G pve /dev/sdb
" Logical volume "CacheDataLV" created."
# lvcreate -n CacheMetaLV -L 0.5G pve /dev/sdb
"Logical volume "CacheMetaLV" created."

The important next step is to combine the cache data LV
and the cache metadata LV into a single LV called a "cache pool",
a logical volume of type cache-pool.

Code:
# lvconvert --type cache-pool --cachemode writethrough --poolmetadata VG/lv_cache_meta VG/lv_cache

For me:
Code:
# lvconvert --type cache-pool --cachemode writethrough --poolmetadata pve/CacheMetaLV pve/CacheDataLV
"  WARNING: Converting logical volume pve/CacheDataLV and pve/CacheMetaLV to pool's data and metadata volumes.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert pve/CacheDataLV and pve/CacheMetaLV? [y/n]: y
  Converted pve/CacheDataLV to cache pool."

With the following command you can see the result:
Code:
# lvs -a -o +devices
  LV  VG  Attr  LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  CacheDataLV  pve  Cwi---C--- 160.00g  CacheDataLV_cdata(0)
  [CacheDataLV_cdata] pve  Cwi------- 160.00g  /dev/sdb(0)
  [CacheDataLV_cmeta] pve  ewi------- 512.00m  /dev/sdb(40960)
  data  pve  -wi-ao----  3.59t  /dev/sda3(8960)
  [lvol0_pmspare]  pve  ewi------- 512.00m  /dev/sda3(949472)
  root  pve  -wi-ao----  27.00g  /dev/sda3(2048)
  swap  pve  -wi-ao----  8.00g  /dev/sda3(0)

As you can see, the volumes have been renamed (_cdata; _cmeta) as described in the Red Hat documentation.
Before the conversion, the output of the above command was:

Code:
 ...
  CacheDataLV  pve  -wi-a----- 160.00g
  CacheMetaLV  pve  -wi-a----- 512.00m
  data  pve  -wi-ao----  3.59t
  ...

Also have a look at the attributes (yes, the "C" stands for cached ;-)).

The last step is attaching the cache pool to the actual data LV
(named "data" in Proxmox):
create the cached logical volume by combining the cache pool logical volume with the origin "data" logical volume.

Code:
# lvconvert --type cache --cachepool VG/lv_cache VG/lv
For me:
Code:
# lvconvert --type cache --cachepool pve/CacheDataLV pve/data
" Logical volume pve/data is now cached."

And with that, we are done. We can now continue using the Proxmox logical
volume "data" as before, but from now on as a cached volume using the cache
space on the SSDs.
Now you can see the successfully cached Proxmox LV "data":

Code:
# lvs -a -o +devices
  LV  VG  Attr  LSize  Pool  Origin  Data%  Meta%  Move Log Cpy%Sync Convert Devices
  [CacheDataLV]  pve  Cwi---C--- 160.00g  0.00  3.97  100.00  CacheDataLV_cdata(0)
  [CacheDataLV_cdata] pve  Cwi-ao---- 160.00g  /dev/sdb(0)
  [CacheDataLV_cmeta] pve  ewi-ao---- 512.00m  /dev/sdb(40960)
  data  pve  Cwi-aoC---  3.59t [CacheDataLV] [data_corig] 0.00  3.97  100.00  data_corig(0)
  [data_corig]  pve  owi-aoC---  3.59t  /dev/sda3(8960)
  [lvol0_pmspare]  pve  ewi------- 512.00m  /dev/sda3(949472)
  root  pve  -wi-ao----  27.00g  /dev/sda3(2048)
  swap  pve  -wi-ao----  8.00g  /dev/sda3(0)

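If you want to see how the cache actually behaves (hits, misses, dirty blocks), or detach it again later (for example before a reinstall), something like the following should work; the lvconvert options depend on the LVM version shipped with your Proxmox, so treat this as a sketch rather than part of the recipe above:

Code:
# dm-cache statistics for the cached LV pve/data (device-mapper name "pve-data")
dmsetup status pve-data
# detach the cache pool again, flushing dirty blocks back to the origin LV
lvconvert --splitcache pve/data
# or remove the cache pool completely
lvconvert --uncache pve/data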
main sources:
https://access.redhat.com/documenta...Administration/lvm_cache_volume_creation.html
and:
http://blog-vpodzime.rhcloud.com/?p=45
and the manpages, primarily man lvmcache

Now I have to test it.
I am always grateful for objections and suggestions.

best regards,
maxprox
 
Hi Maxprox,

thanks for the detailed documentation!
I'm new to Proxmox and I'm thinking of replacing our current homegrown KVM/libvirt solution with Proxmox during a necessary hardware replacement. The new hardware will have an NVMe SSD for caching purposes.

I could install the dm-cache the way you described; technically, it's possible.
But since we'd like to buy a support subscription, which is only available for bare-metal Proxmox, the question I have is:

Will your recipe break my support?

Regards
Volker
 
Important to know is the reason for this setup:
I got a server with two enterprise SSDs and a good hardware RAID controller....
If I can decide on the server and its hardware myself, I keep it simple and stupid: I mostly work without a hardware RAID controller, prefer a server with a lot of RAM, and would then create a ZFS RAID 10, with or without SSDs ...
Regards,
maxprox
 
LVM cache / dm-cache
My HowTo
.....
How well is this working, more than a year later?
 
