LVM and dm-cache

Introduction

Device mapper has a caching feature (dm-cache), and LVM supports adding a cache to a logical volume. The idea is that you have a large, slow device (an HDD array) and you set up a fast device (SSD, NVMe) as a cache in front of it.

“By default dm-cache (as is currently upstream) is not going to cache
sequential IO, and it also isn’t going to cache IO that is first
written. It waits for hit counts to elevate to the promote threshold.
So dm-cache effectively acts as a hot-spot cache by default.” source

What this means is that, by default, you won’t see much improvement in benchmarks like fio or bonnie++. Instead, the improvement comes over time, from blocks that see a lot of activity. There are tunables that control how blocks get promoted to the cache, but the defaults assume that the “slow” device is a spinning HDD that does OK with streaming reads and writes, so the cache is reserved for non-streaming I/O. The best thing to do is have munin (or similar) graph your disk I/O stats under the real workload, then add the cache later and see how much things improve.
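If munin is not available yet, a rough baseline can be taken by hand from /proc/diskstats. The sketch below is only an illustration: the function names disk_counters and disk_delta are made up for this page, and the device name you pass (for example sda or dm-3) depends on your system.

```shell
#!/bin/sh
# Print "reads writes sectors_read sectors_written" for one block device.
# $1 = device name as it appears in /proc/diskstats (e.g. sda or dm-3)
# $2 = stats file; defaults to /proc/diskstats (overridable for testing)
disk_counters() {
    awk -v dev="$1" '$3 == dev { print $4, $8, $6, $10 }' "${2:-/proc/diskstats}"
}

# Sample the counters twice, $2 seconds apart (default 60), and print
# how much each counter grew during that interval.
disk_delta() {
    before=$(disk_counters "$1")
    sleep "${2:-60}"
    after=$(disk_counters "$1")
    echo "$before $after" | awk '{
        print "reads=" ($5 - $1), "writes=" ($6 - $2), \
              "sectors_read=" ($7 - $3), "sectors_written=" ($8 - $4)
    }'
}
```

Running something like disk_delta sda 300 under the normal workload before and after adding the cache gives a crude comparison of how much I/O still reaches the slow device.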

How to set up

Start with an existing VG on the “slow” device.
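Before carving up the fast device, it can help to sanity-check the metadata LV size. The sketch below assumes the rule of thumb from older lvmcache(7) man pages (metadata about 1/1000 of the cache data size, with an 8 MiB minimum); check the man page for your LVM version before relying on it. The function name cache_meta_mib is hypothetical.

```shell
#!/bin/sh
# Suggest a cache metadata LV size in MiB for a cache data size given in GiB.
# Assumption: metadata is roughly 1/1000 of the cache data size,
# and never smaller than 8 MiB.
cache_meta_mib() {
    data_gib=$1
    # convert GiB to MiB, divide by 1000, rounding up to a whole MiB
    meta=$(( (data_gib * 1024 + 999) / 1000 ))
    [ "$meta" -lt 8 ] && meta=8
    echo "$meta"
}

cache_meta_mib 208
```

By that estimate, the 1G metadata LV used for the 208G cache on this page is comfortably larger than the suggested minimum.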

# install the dm-cache metadata tools (cache_check etc.) that LVM calls
apt-get install thin-provisioning-tools

# add the encrypted SSD PV to the VG
vgextend vg_moltar1 /dev/mapper/fast_crypt

# create the cache metadata LV on the SSD PV
lvcreate -L 1G -n lv_cache_meta vg_moltar1 /dev/mapper/fast_crypt

# create the cache LV on the SSD PV
lvcreate -L 208G -n lv_cache vg_moltar1 /dev/mapper/fast_crypt

# convert the cache and cache metadata LVs to a cache pool
lvconvert --type cache-pool --poolmetadata vg_moltar1/lv_cache_meta vg_moltar1/lv_cache

# attach the cache pool to the LV you want
lvconvert --type cache --cachepool vg_moltar1/lv_cache vg_moltar1/srv

# display the cache status (device-mapper names use VG-LV, not VG/LV)
dmsetup status vg_moltar1-srv
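The raw status line is hard to read, so here is a minimal sketch that turns it into a few percentages. It assumes the field order described in the kernel's dm-cache documentation (used/total cache blocks, then read hits, read misses, write hits, write misses, demotions, promotions, dirty) and a line without a leading "name:" prefix, which is what dmsetup prints when given a single device name. The function name cache_summary is made up for this page.

```shell
#!/bin/sh
# Summarise one dm-cache status line read from stdin.
# Assumed field positions (from `dmsetup status <name>` output):
#   $7 = used/total cache blocks   $8 = read hits    $9  = read misses
#   $10 = write hits               $11 = write misses $14 = dirty blocks
cache_summary() {
    awk '$3 == "cache" {
        split($7, blk, "/")
        rd = $8 + $9
        wr = $10 + $11
        printf "occupancy=%d%% read_hit=%d%% write_hit=%d%% dirty=%d\n",
            (blk[2] ? 100 * blk[1] / blk[2] : 0),
            (rd ? 100 * $8 / rd : 0),
            (wr ? 100 * $10 / wr : 0),
            $14
    }'
}
```

Usage, as root: dmsetup status vg_moltar1-srv | cache_summary. Since dm-cache behaves as a hot-spot cache by default, expect the hit percentages to start low and climb over days of real workload rather than minutes.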

More stuff

  • munin plugins: dm_cache_statistics_ and dm_cache_occupancy_
  • lvcache: a nice utility for setting things up and displaying stats
  • LVM has an lvmetad service, but it appears to be mostly for automatically scanning devices when udev adds them, which is not something we need.