[SOLVED] ZFS eating more pool space than allocated

sbielski

Member
We are running a Proxmox 4.4 host containing a VM with a ZFS disk image whose size is set to 9.77 TiB, but it consumes more than 13 TiB of space from the ZFS pool:

Code:
host:~# pvesm list local-zfs
[...]
local-zfs:vm-500-disk-2   raw   10742228603372   500

host:~# zfs list -t all
[...]
rpool/data/vm-500-disk-2   13.7T   3.21G   13.7T   -

I don't see any reason for such a big difference. Is there any way to correct this and limit usage to the configured size?
 
Hi,

if you have snapshots, that would be normal.
 
I forgot to mention: no snapshots are in use. And of course I have checked whether any snapshots exist; there are none.
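For reference, checks along these lines (listing snapshots of the zvol and its snapshot-related space accounting) are what I mean; a minimal sketch using the dataset name from this thread:

Code:
host:~# zfs list -t snapshot -r rpool/data/vm-500-disk-2
host:~# zfs get usedbysnapshots,usedbychildren,usedbyrefreservation rpool/data/vm-500-disk-2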
 
For sure:
Code:
host:~# zfs get all rpool/data/vm-500-disk-2
NAME                      PROPERTY              VALUE                  SOURCE
rpool/data/vm-500-disk-2  type                  volume                 -
rpool/data/vm-500-disk-2  creation              Wed Jul 19 21:53 2017  -
rpool/data/vm-500-disk-2  used                  13.7T                  -
rpool/data/vm-500-disk-2  available             3.21G                  -
rpool/data/vm-500-disk-2  referenced            13.7T                  -
rpool/data/vm-500-disk-2  compressratio         1.00x                  -
rpool/data/vm-500-disk-2  reservation           none                   default
rpool/data/vm-500-disk-2  volsize               9.77T                  local
rpool/data/vm-500-disk-2  volblocksize          8K                     -
rpool/data/vm-500-disk-2  checksum              on                     default
rpool/data/vm-500-disk-2  compression           lz4                    inherited from rpool
rpool/data/vm-500-disk-2  readonly              off                    default
rpool/data/vm-500-disk-2  copies                1                      default
rpool/data/vm-500-disk-2  refreservation        none                   default
rpool/data/vm-500-disk-2  primarycache          all                    default
rpool/data/vm-500-disk-2  secondarycache        all                    default
rpool/data/vm-500-disk-2  usedbysnapshots       0                      -
rpool/data/vm-500-disk-2  usedbydataset         13.7T                  -
rpool/data/vm-500-disk-2  usedbychildren        0                      -
rpool/data/vm-500-disk-2  usedbyrefreservation  0                      -
rpool/data/vm-500-disk-2  logbias               latency                default
rpool/data/vm-500-disk-2  dedup                 off                    default
rpool/data/vm-500-disk-2  mlslabel              none                   default
rpool/data/vm-500-disk-2  sync                  standard               inherited from rpool
rpool/data/vm-500-disk-2  refcompressratio      1.00x                  -
rpool/data/vm-500-disk-2  written               13.7T                  -
rpool/data/vm-500-disk-2  logicalused           8.48T                  -
rpool/data/vm-500-disk-2  logicalreferenced     8.48T                  -
rpool/data/vm-500-disk-2  snapshot_limit        none                   default
rpool/data/vm-500-disk-2  snapshot_count        none                   default
rpool/data/vm-500-disk-2  snapdev               hidden                 default
rpool/data/vm-500-disk-2  context               none                   default
rpool/data/vm-500-disk-2  fscontext             none                   default
rpool/data/vm-500-disk-2  defcontext            none                   default
rpool/data/vm-500-disk-2  rootcontext           none                   default
rpool/data/vm-500-disk-2  redundant_metadata    all                    default


It's hard to suggest how to solve it. Maybe it is similar to https://github.com/zfsonlinux/zfs/issues/3255
 
The VM is running BTRFS, so maybe that's related. Still, it is weird that ZFS uses more space than allocated; there really should be some kind of limit enforcement. Sadly, the ZFS issue was closed, so maybe not many people are affected.
 
It is not recommended to run a CoW filesystem on top of another CoW filesystem.
This generally causes problems.
I'm not aware of this particular problem, but I don't test such setups.
 
The problem you are facing is that you are running one CoW filesystem (BTRFS) on top of another (ZFS), combined with the 8K zvol block size. This can be amplified if you have many small files that are modified very often, and the partitions inside the zvol may also be misaligned.

By the way, when you create a zvol, the size limit only exists from the guest's perspective, not from the ZFS storage's perspective. Take an exaggerated example: we create a zvol with an 8K block size and a size of 16K, and the guest writes a 12K file. From the guest's perspective that uses, say, 14K (file data + filesystem metadata). From the ZFS perspective, it needs to write, say, 15K (file data + filesystem metadata + checksums + ZFS metadata). Since the minimum allocation unit is the 8K block size, ZFS writes 2 x 8K = 16K (15K of data plus 1K of padding). So ZFS ends up writing more to the pool than the guest thinks it wrote, and with enough of this overhead the zvol's used space can exceed its nominal size.
I hope you now understand how you can end up using more space than the allocated size. There are other factors that can amplify this phenomenon as well (RAIDZ, for example).
 
Ah, understood. Thanks for the hints. Anyway, we will replace ZFS with a hardware RAID for now, but I will make a note not to use ZFS together with BTRFS in such a case.
 
I don't think BTRFS can break anything in ZFS. If the VM's filesystem does not run fstrim, ZFS will not shrink the zvol, but it shouldn't be able to exceed the limit. I remember Ceph doing the same thing that happened to you; I'm not sure how Ceph behaves today.

If you can, I suggest you re-save the zvol:

1. Create a new zvol from the Proxmox GUI as a SCSI disk with the discard option enabled.
2. Copy the old zvol's data to the new zvol with dd.
3. Look at the new zvol's usage, run fstrim inside the guest on the new zvol, and compare it with the zvol's status from before.

It would be interesting to see what happens; a rough sketch of these steps is below.
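This is only an untested sketch: the new zvol name (vm-500-disk-3) and the assumption that the VM is shut down during the copy are mine, not from this thread.

Code:
# 1. After creating a new SCSI disk with "Discard" enabled in the GUI, it shows up
#    as a new zvol (assumed name: rpool/data/vm-500-disk-3).
# 2. With the VM shut down, copy the old zvol onto the new one:
host:~# dd if=/dev/zvol/rpool/data/vm-500-disk-2 of=/dev/zvol/rpool/data/vm-500-disk-3 bs=1M
# 3. Boot the VM from the new disk and release unused blocks from inside the guest:
guest:~# fstrim -av
# 4. Compare the space accounting of both zvols:
host:~# zfs get used,logicalused,referenced rpool/data/vm-500-disk-2 rpool/data/vm-500-disk-3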
 
Sadly, there is not enough space to do such operations. We are currently migrating the user data off this node in order to recreate it with different hardware, so I cannot do any further tests.
 
