[SOLVED] LVM failure caused by cache SSD failure

richardk
Jul 21, 2020
Please, no comments about ZFS, etc. etc., as ZFS is not an option for my environment.

[BACKGROUND INFORMATION]
1. I configured a server with a large hardware RAID volume (Dell PERC H700 RAID controller).
2. Fresh install of Proxmox VE 6.2 to that single large volume, using the installer defaults for the LVM-thin config.
3. After installation, I configured a RAID-controller-attached SSD for LVM cache using the following commands:

Code:
    # add the SSD as a PV and extend the existing pve volume group onto it
    pvcreate /dev/sdb
    vgextend pve /dev/sdb
    # carve out data and metadata LVs for the cache on the SSD
    lvcreate -L 850G -n CacheDataLV pve /dev/sdb
    lvcreate -L 50G -n CacheMetaLV pve /dev/sdb
    # combine them into a cache pool and attach it to pve/data in writeback mode
    lvconvert --type cache-pool --poolmetadata pve/CacheMetaLV pve/CacheDataLV
    lvconvert --type cache --cachepool pve/CacheDataLV --cachemode writeback pve/data
    # make sure the dm-cache modules are included in the initramfs
    echo "dm_cache" >> /etc/initramfs-tools/modules
    echo "dm_cache_mq" >> /etc/initramfs-tools/modules
    echo "dm_persistent_data" >> /etc/initramfs-tools/modules
    echo "dm_bufio" >> /etc/initramfs-tools/modules
    update-initramfs -u

4. Everything ran fine for several months, and performance suggested the cache was functioning as expected.
5. Total failure of the SSD (the RAID controller no longer had access to the disk).
6. Running "vgchange --test -a y /dev/pve" returns "Refusing activation of partial LV" for /pve/data, pve/vm-###-disk-0, and so on, for all LVs.
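
For anyone diagnosing the same state: these standard LVM reporting commands (field names per pvs(8)/lvs(8); not from my original notes) show which PV is missing and which LVs are partial:

Code:
    # a failed device shows up as missing in the PV list
    pvs -o pv_name,vg_name,pv_missing
    # partial LVs carry a "p" in the attribute bits; "devices" shows which segments sit where
    lvs -o lv_name,lv_attr,devices pve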

[QUESTIONS]
I would have expected the system to continue functioning at reduced speed; however, all but two of my containers/VMs stopped working. Two containers were still running, but after shutting them down, I was unable to start them or any other containers/VMs.

I will update as I take steps to correct the problem, but my questions are:
a) Did I configure the cache correctly?
b) Is the failure of the LVM, as a result of the SSD failure, expected behavior?
c) If not, what could have caused the LVM failure?
 
You set up a writeback cache, so if the single device it relies on fails, you will probably lose everything, because the LVM has no consistent state without it.
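
You can even see how much data is at risk at any given moment: the dirty-block counters report writes that exist only on the cache device so far (reporting field names per lvs(8); a quick check, not something from your post):

Code:
    # dirty blocks = writes held only in the cache, not yet flushed to the origin LV
    lvs -o+cache_dirty_blocks,cache_used_blocks,cache_total_blocks pve/data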

That makes complete sense, and I have no idea how I didn't realize that immediately.

Is it considered best practice to configure the cache as write through, since it should still be nice and peppy on an SSD?

Follow up question: can I recover with a new cache disk before restoring VMs/CTs from backup, or do I have to reload PVE?
 
Is it considered best practice to configure the cache as write through, since it should still be nice and peppy on an SSD?

I'd use an mdadm RAID over at least two SSDs (ideally different brands, so the chance of simultaneous failure is minimized) and use that as the writeback device. There is nothing faster than writeback. You can also use multiple NVMe drives for an even faster writeback cache, but you _need_ redundancy in the writeback cache as well. For a writethrough cache it does not matter, because the data is always synchronized to the backend storage.
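
A minimal sketch of that layout, reusing the names and sizes from the first post (/dev/sdb and /dev/sdc stand in for the two SSDs):

Code:
    # mirror the two SSDs so the cache itself survives a single-device failure
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
    # build the cache pool on the mirror instead of on a bare SSD
    pvcreate /dev/md0
    vgextend pve /dev/md0
    lvcreate -L 850G -n CacheDataLV pve /dev/md0
    lvcreate -L 50G -n CacheMetaLV pve /dev/md0
    lvconvert --type cache-pool --poolmetadata pve/CacheMetaLV pve/CacheDataLV
    lvconvert --type cache --cachepool pve/CacheDataLV --cachemode writeback pve/data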

Follow up question: can I recover with a new cache disk before restoring VMs/CTs from backup, or do I have to reload PVE?

Since the cache is part of the LVM, I think you're out of luck and need to restore from backup, but I could be wrong.
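
If you want to try before reinstalling: lvmcache(7) and vgreduce(8) document a way to drop a dead cache device, though with writeback any unflushed blocks are gone, so the filesystems inside the LVs may still be inconsistent. A sketch (untested here; exact flags and order can differ between LVM versions):

Code:
    # detach the cache; --force discards dirty blocks that existed only on the dead SSD
    lvconvert --uncache --force pve/data
    # remove the failed PV from the volume group
    vgreduce --removemissing pve
    # then try activating the LVs again
    vgchange -a y pve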

In the past, I used an in-between cache layer consisting of a big, slow backing device and a fast caching device, forming e.g. a flashcache-based device-mapper device that was then used as a physical volume. In that setup, LVM is completely unaware of the cache disk, so it cannot fail that hard. The cache layer takes care of the data, and you can e.g. force a flush to disk in order to get a synchronous state on the backend device.
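
flashcache itself is no longer maintained; the same layering with bcache as the in-between layer would look roughly like this (my substitution, not what I ran back then; device names are placeholders):

Code:
    # big disk (-B) as backing device, SSD (-C) attached as its cache
    make-bcache -B /dev/sdb -C /dev/sdc
    # LVM only ever sees the combined device, never the SSD itself
    pvcreate /dev/bcache0
    vgcreate pve /dev/bcache0
    # switching to writethrough flushes dirty data, giving a consistent backing device
    echo writethrough > /sys/block/bcache0/bcache/cache_mode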

Another approach would be to use a RAID controller that already has a multi-tier layout and can use an SSD internally to speed things up. That is similar to the software approach I just described. I have only read about these controllers, never used one myself. We're currently SSD-only, or just use a SAN.
 
Another approach would be to use a RAID controller that already has a multi-tier layout and can use an SSD internally to speed things up. That is similar to the software approach I just described. I have only read about these controllers, never used one myself. We're currently SSD-only, or just use a SAN.

This is also part of my configuration - the RAID controller has an attached SSD that it uses for caching, in addition to its on-board RAM cache.


Thank you, SO MUCH for your insight; this has been very helpful!
 
I know I already marked this as solved, but is there a straightforward way to change the cache mode from write back to write through on another server where the cache SSD has not failed?
 
I know I already marked this as solved, but is there a straightforward way to change the cache mode from write back to write through on another server where the cache SSD has not failed?

According to lvmcache(7), it can be changed:

Code:
    dm-cache cache modes

        The  default  dm-cache cache mode is "writethrough".  Writethrough ensures that any data written will be stored both in the cache and on the origin LV.  The loss of a device associated with the cache in this case would
        not mean the loss of any data.

        A second cache mode is "writeback".  Writeback delays writing data blocks from the cache back to the origin LV.  This mode will increase performance, but the loss of a cache device can result in lost data.

        With the --cachemode option, the cache mode can be set when caching is started, or changed on an LV that is already cached.  The current cache mode can be displayed with the cache_mode reporting option:

        lvs -o+cache_mode VG/LV

        lvm.conf(5) allocation/cache_mode
        defines the default cache mode.

        $ lvconvert --type cache --cachepool fast \
             --cachemode writethrough vg/main
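
Applied to the setup from the first post, that would be something like the following (assuming the cached LV is pve/data; on the LVM versions I've used, lvchange also accepts --cachemode for an already-cached LV):

Code:
    # show the current cache mode
    lvs -o+cache_mode pve/data
    # switch the existing cached LV from writeback to writethrough
    lvchange --cachemode writethrough pve/data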
 
