Proxmox Thin-Pool Full but reported only half full?

Jospeh Huber

Well-Known Member
Apr 18, 2016
Hi,


Today one of my VMs (backed by LVM thin on a RAID) ran out of disk space. It is the only VM without disk monitoring :-((
It seems that this has destroyed my other VMs in the thin pool. Several I/O errors have occurred.

My LVM thin pool is not overprovisioned: the total VM space is 871 GB and the device has 892 GB.
When I look at the storage in the Proxmox GUI, the used VM space is only 501 GB (56%) of the 892 GB.
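As far as I understand, the GUI shows how much data the pool has actually allocated, not the sum of the virtual disk sizes. Something like this should show both figures side by side ("ssd" is my VG/pool name, adjust as needed):
Code:
# LSize = provisioned (virtual) size, Data% = actually allocated blocks in the pool
lvs -a --units g -o lv_name,lv_size,data_percent,metadata_percent ssd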

In my syslog I found a "switching pool to out-of-data-space" message, and this was the beginning of the end...
Code:
Nov 25 05:38:37 wpx3 kernel: [85437823.532713] device-mapper: thin: 251:3: switching pool to out-of-data-space (queue IO) mode
Nov 25 05:38:37 wpx3 lvm[1052]: Thin ssd-ssd-tpool is now 100% full.
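If I read the device-mapper docs correctly, "queue IO" means writes to the full pool are held back for a timeout (60 seconds by default) before they start failing, which would match the I/O errors showing up exactly one minute later. Apparently the pool can also be switched to error out immediately instead of queueing (just quoting the man page here, I have not tried it):
Code:
# Return errors immediately when the pool is full instead of queueing IO for 60s
lvchange --errorwhenfull y ssd/ssd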

After that, several I/O errors occurred on the other VMs.

Code:
Nov 25 05:39:37 wpx3 kernel: [85437883.566354] Buffer I/O error on device dm-12, logical block 23020159
Nov 25 05:39:37 wpx3 kernel: [85437883.566614] Buffer I/O error on device dm-6, logical block 22077699
Nov 25 05:39:37 wpx3 kernel: [85437883.567171] EXT4-fs warning (device dm-6): ext4_end_bio:330: I/O error -28 writing to inode 6947209 (offset 343932928 size 8388608 starting block 22078720)
Nov 25 05:39:40 wpx3 kernel: [85437887.228228] JBD2: Detected IO errors while flushing file data on dm-6-8
Nov 25 05:39:41 wpx3 kernel: [85437888.145104] JBD2: Detected IO errors while flushing file data on dm-6-8
...
Nov 25 06:24:03 wpx3 kernel: [85440550.222865] dm-9: rw=0, want=35262332145712, limit=16777216
Nov 25 06:24:03 wpx3 kernel: [85440550.226850] dm-9: rw=0, want=35262332145712, limit=16777216
Nov 25 06:24:03 wpx3 kernel: [85440550.230841] attempt to access beyond end of device

The error reports "ssd-ssd-tpool (251:3)", so that means the thin pool itself is full, right?
Code:
root@wpx3:/var/log# dmsetup ls --tree
ssd-vm--111--disk--1 (251:12)
 └─ssd-ssd-tpool (251:3)
    ├─ssd-ssd_tdata (251:2)
    │  └─ (8:17)
    └─ssd-ssd_tmeta (251:1)
       └─ (8:17)
ssd-vm--126--disk--1 (251:9)
 └─ssd-ssd-tpool (251:3)
    ├─ssd-ssd_tdata (251:2)
    │  └─ (8:17)
    └─ssd-ssd_tmeta (251:1)
       └─ (8:17)
...
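If it helps, dmsetup can also show the pool usage directly; that is how I would double-check whether the pool itself ran full:
Code:
# thin-pool status: <transaction id> <used meta>/<total meta> <used data>/<total data> ...
dmsetup status ssd-ssd-tpool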

Any ideas? I do not know what to do here ...

Luckily I didn't lose any data, because the VMs are all part of clusters and fetch the current state when they start up again.

Thx for your help.
 
Can you post the output of lvs, pvs, and pvesm status?
 
Here it is:
Code:
 lvs
  LV                              VG     Attr       LSize   Pool Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  lvol0                           backup -wi-ao----   2.30t                                               
  data                            pve    twi-aotz--   1.00t                    0.00   0.42                 
  root                            pve    -wi-ao----  96.00g                                               
  swap                            pve    -wi-ao---- 251.00g                                               
  snap_vm-123-disk-1_Before_HTTP2 ssd    Vri---tz-k  15.00g ssd  vm-123-disk-1                             
  ssd                             ssd    twi-aotz-- 892.91g                    60.12  29.70               
  vm-102-disk-2                   ssd    Vwi-aotz--   8.00g ssd                95.85                       
  vm-111-disk-1                   ssd    Vwi-aotz-- 380.00g ssd                61.55                       
  vm-115-disk-1                   ssd    Vwi-aotz-- 256.00g ssd                39.72                       
  vm-116-disk-1                   ssd    Vwi-a-tz--   8.00g ssd                9.91                       
  vm-117-disk-2                   ssd    Vwi-aotz--   9.00g ssd                99.85                       
  vm-118-disk-2                   ssd    Vwi-aotz--  10.00g ssd                99.87                       
  vm-119-disk-2                   ssd    Vwi-aotz-- 120.00g ssd                99.71                       
  vm-123-disk-1                   ssd    Vwi-aotz--  20.00g ssd                99.84                       
  vm-126-disk-1                   ssd    Vwi-aotz--  60.00g ssd                36.66

Code:
 pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/sda3  pve    lvm2 a--    1.34t 4.00m
  /dev/sda4  backup lvm2 a--    2.30t    0
  /dev/sdb1  ssd    lvm2 a--  893.13g    0

Code:
pvesm status
BACKUP           dir 1      2429387420      1199276004      1106682552 52.51%
BACKUP_WPX1      nfs 1      2429387776       921167872      1384791040 40.45%
BACKUP_WPX2      nfs 1      2429387776       836751360      1469207552 36.79%
BACKUP_WPX3      nfs 1      2429387776      1199276032      1106682880 52.51%
REMOTE_BACKUP    nfs 1      1238411136       497142912       741149440 40.65%
SSD            lvmthin 1       936288256       562896499       373391756 60.62%
local            dir 1        98952796         3222248        90681000 3.93%
local-lvm      lvmthin 1      1073741824               0      1073741824 0.50%
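If my maths is right, the pvesm figures (in KiB) match the Data% that lvs reports for the pool:
Code:
# 562896499 KiB used according to pvesm ...
echo "scale=2; 562896499/1024/1024" | bc   # ~536.8 GiB
# ... which is roughly 892.91 GiB * 60.12% (Data% of the pool in lvs)
echo "scale=2; 892.91*0.6012" | bc         # ~536.8 GiB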
 
I think the above is not a good idea if the physical space is fully used ...

But I found another thing here that sounds good to me:
https://mellowhost.com/billing/inde...8/How-to-Extend-meta-data-of-a-thin-pool.html
When I run lvs -a I can see that my metadata pool is only 112 MB.
Code:
...
  ssd                             ssd    twi-aotz-- 892.91g                    60.15  29.71
  [ssd_tdata]                     ssd    Twi-ao---- 892.91g
  [ssd_tmeta]                     ssd    ewi-ao---- 112.00m
 ...
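As far as I understand, the required metadata size depends on the pool size, the chunk size and the number of thin volumes; the thin-provisioning-tools package seems to ship a small helper to estimate it (the chunk size and volume count below are just example values, I have not verified this against my pool):
Code:
# Rough estimate of the metadata space a ~893 GiB pool would need
# (example values; the actual chunk size can be checked with lvs -o+chunk_size)
thin_metadata_size -b64k -s893g -m50 -um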

Is it safe to extend it with the following command?
Code:
lvextend -L+128M /dev/ssd/ssd_tmeta
In other words: where does it take the space from, given that ssd_tdata is already using all of the physical space?
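From what I read in the lvextend man page, the metadata LV can also be grown through the pool itself with --poolmetadatasize, but either way the new extents have to come from free space in the VG. Since pvs shows 0 free on /dev/sdb1, I guess I would first have to add another PV (or grow the existing one). Roughly (untested, the extra device is just a placeholder):
Code:
# Grow the pool's metadata LV via the pool itself (needs free extents in VG "ssd")
lvextend --poolmetadatasize +128M ssd/ssd
# If the VG has no free space, a PV has to be added first, e.g. (placeholder device):
# vgextend ssd /dev/sdc1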

I have set up detailed monitoring of the metadata space, so in the future I will see when it is about to overflow...
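For the monitoring I basically poll the two percentages with lvs from cron, roughly along these lines (pool name and threshold are specific to my setup):
Code:
#!/bin/sh
# Warn when the thin pool's data or metadata usage crosses a threshold (example: 80%)
POOL="ssd/ssd"
THRESHOLD=80
lvs --noheadings -o data_percent,metadata_percent "$POOL" | while read data meta; do
    for val in "$data" "$meta"; do
        # strip the decimal part for the integer comparison
        if [ "${val%%.*}" -ge "$THRESHOLD" ]; then
            echo "WARNING: thin pool $POOL usage at ${val}%"
        fi
    done
done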
 
