DRBD9 wrong free space calculation

hjjg

New Member
Mar 27, 2014
9
0
1
Hi folks,


I have 2 nodes with this setup:
Code:
virt01 ~ # vgs
  VG          #PV #LV #SN Attr   VSize   VFree 
  drbdpool      1  13   0 wz--n-   8.19t  4.17t

virt01 ~ # lvs
  LV               VG          Attr       LSize   Pool         Origin Data%  Meta%  Move Log Cpy%Sync Convert
  .drbdctrl_0      drbdpool    -wi-ao----   4.00m                                                            
  .drbdctrl_1      drbdpool    -wi-ao----   4.00m                                                            
  drbdthinpool     drbdpool    twi-aotz--   4.01t                     17.15  68.91                           
  vm-101-disk-2_00 drbdpool    Vwi-aotz--  15.00g drbdthinpool        21.06                                  
  vm-102-disk-1_00 drbdpool    Vwi-aotz--   8.00g drbdthinpool        99.98                                  
  vm-103-disk-1_00 drbdpool    Vwi-aotz--  16.00g drbdthinpool        100.00                                 
  vm-103-disk-2_00 drbdpool    Vwi-aotz-- 300.07g drbdthinpool        100.00                                 
  vm-104-disk-1_00 drbdpool    Vwi-aotz--  20.01g drbdthinpool        0.02                                   
  vm-104-disk-2_00 drbdpool    Vwi-aotz--  50.01g drbdthinpool        100.00                                 
  vm-106-disk-1_00 drbdpool    Vwi-aotz--   5.00g drbdthinpool        43.15                                  
  vm-107-disk-1_00 drbdpool    Vwi-aotz--   8.00g drbdthinpool        99.98                                  
  vm-107-disk-2_00 drbdpool    Vwi-aotz-- 300.07g drbdthinpool        100.00                                 
  vm-110-disk-3_00 drbdpool    Vwi-aotz-- 180.04g drbdthinpool        9.79                                   

virt01 ~ # drbd-overview 
  0:.drbdctrl/0      Connected(2*) Secondary(2*) UpToDa/UpToDa 
  1:.drbdctrl/1      Connected(2*) Secondary(2*) UpToDa/UpToDa 
100:vm-103-disk-1/0  Connected(2*) Primar/Second UpToDa/UpToDa 
101:vm-106-disk-1/0  Connected(2*) Primar/Second UpToDa/UpToDa 
102:vm-102-disk-1/0  Connected(2*) Primar/Second UpToDa/UpToDa 
103:vm-103-disk-2/0  Connected(2*) Primar/Second UpToDa/UpToDa 
104:vm-104-disk-1/0  Connected(2*) Secondary(2*) UpToDa/UpToDa 
105:vm-104-disk-2/0  Connected(2*) Primar/Second UpToDa/UpToDa 
107:vm-101-disk-2/0  Connected(2*) Primar/Second UpToDa/UpToDa 
108:vm-107-disk-1/0  Connected(2*) Secondary(2*) UpToDa/UpToDa 
109:vm-107-disk-2/0  Connected(2*) Secondary(2*) UpToDa/UpToDa 
112:vm-110-disk-3/0  Connected(2*) Primar/Second UpToDa/Incons 


virt02 ~ # vgs
  VG          #PV #LV #SN Attr   VSize   VFree 
  drbdpool      1  13   0 wz--n-   8.19t  4.17t
virt02 ~ # lvs
  Couldn't find device with uuid 0cIwcx-nBiD-i9tR-nSDk-RRIs-ITaK-18ULTx.
  LV               VG          Attr       LSize   Pool         Origin Data%  Meta%  Move Log Cpy%Sync Convert
  .drbdctrl_0      drbdpool    -wi-ao----   4.00m                                                            
  .drbdctrl_1      drbdpool    -wi-ao----   4.00m                                                            
  drbdthinpool     drbdpool    twi-aotz--   4.01t                     18.25  75.79                           
  vm-101-disk-2_00 drbdpool    Vwi-aotz--  15.00g drbdthinpool        100.00                                 
  vm-102-disk-1_00 drbdpool    Vwi-aotz--   8.00g drbdthinpool        99.98                                  
  vm-103-disk-1_00 drbdpool    Vwi-aotz--  16.00g drbdthinpool        100.00                                 
  vm-103-disk-2_00 drbdpool    Vwi-aotz-- 300.07g drbdthinpool        100.00                                 
  vm-104-disk-1_00 drbdpool    Vwi-aotz--  20.01g drbdthinpool        99.98                                  
  vm-104-disk-2_00 drbdpool    Vwi-aotz--  50.01g drbdthinpool        100.00                                 
  vm-106-disk-1_00 drbdpool    Vwi-aotz--   5.00g drbdthinpool        99.95                                  
  vm-107-disk-1_00 drbdpool    Vwi-aotz--   8.00g drbdthinpool        99.98                                  
  vm-107-disk-2_00 drbdpool    Vwi-aotz-- 300.07g drbdthinpool        100.00                                 
  vm-110-disk-3_00 drbdpool    Vwi-aotz-- 180.04g drbdthinpool        15.58                                  

virt02 ~ # drbd-overview 
  0:.drbdctrl/0      Connected(2*) Secondary(2*) UpToDa/UpToDa 
  1:.drbdctrl/1      Connected(2*) Secondary(2*) UpToDa/UpToDa 
100:vm-103-disk-1/0  Connected(2*) Second/Primar UpToDa/UpToDa 
101:vm-106-disk-1/0  Connected(2*) Second/Primar UpToDa/UpToDa 
102:vm-102-disk-1/0  Connected(2*) Second/Primar UpToDa/UpToDa 
103:vm-103-disk-2/0  Connected(2*) Second/Primar UpToDa/UpToDa 
104:vm-104-disk-1/0  Connected(2*) Secondary(2*) UpToDa/UpToDa 
105:vm-104-disk-2/0  Connected(2*) Second/Primar UpToDa/UpToDa 
107:vm-101-disk-2/0  Connected(2*) Second/Primar UpToDa/UpToDa 
108:vm-107-disk-1/0  Connected(2*) Secondary(2*) UpToDa/UpToDa 
109:vm-107-disk-2/0  Connected(2*) Secondary(2*) UpToDa/UpToDa 
112:vm-110-disk-3/0  Connected(2*) Second/Primar Incons/UpToDa

Proxmox says that I have about 100G left. Redundancy is 2:

Code:
drbd: drbd1
	nodes virt01,virt02
	content images,rootdir
	redundancy 2
I've already tried drbdmanage update-pool. The drbdthinpool is about 4 TB in size, and lvs reports only about 18% of its data space in use.
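As a rough cross-check of those numbers, here is a minimal Python sketch (an illustration only, not output from any Proxmox or DRBD tool). It multiplies the thin pool size by the Data% that lvs prints for the drbdthinpool row on each node. The function name thin_pool_usage is made up for this example, and the "t" suffix is assumed to mean TiB.

```python
def thin_pool_usage(pool_size_tib, data_percent):
    """Return (used, free) in TiB for a thin pool, given its LSize and Data%."""
    used = pool_size_tib * data_percent / 100.0
    return used, pool_size_tib - used


# Values copied from the drbdthinpool row of the lvs output of each node.
for node, data_percent in (("virt01", 17.15), ("virt02", 18.25)):
    used, free = thin_pool_usage(4.01, data_percent)
    print(f"{node}: used {used:.2f} TiB, free {free:.2f} TiB")
```

By this estimate both nodes still have roughly 3.3 TiB free inside the thin pool, far more than the roughly 100G that Proxmox reports.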

What's wrong?

Thanks in advance,

hjjg
 

hjjg

After a reboot, the free space is 2.7TB. Let's see if this runs out of sync again.
 

seventh

New Member
Jan 28, 2016
22
2
3
35
Hi,

I have the same issue as you: the LVs are allocated differently on each node. I ran some tests that I think support my theory and may be useful to the community.

Scenario 1:
On node1 we create a VM with ID 101.
The hard disk, vm-101-disk-1, is created directly on the DRBD storage.
The LV shows Data% 0.05 on node1.
Node1 then starts syncing the DRBD resource to node2.
When the sync is complete, the LV for vm-101-disk-1 on node2 shows Data% 99.66.

Code:
root@node1:~# lvs
  LV               VG       Attr       LSize  Pool         Origin Data%  Meta%  Move Log Cpy%Sync Convert
  .drbdctrl_0      drbdpool -wi-ao----  4.00m
  .drbdctrl_1      drbdpool -wi-ao----  4.00m
  drbdthinpool     drbdpool twi-aotz--  1.69t                     0.00   1.57
  vm-101-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        0.05

root@node2:~# lvs
  LV               VG       Attr       LSize  Pool         Origin Data%  Meta%  Move Log Cpy%Sync Convert
  .drbdctrl_0      drbdpool -wi-ao----  4.00m
  .drbdctrl_1      drbdpool -wi-ao----  4.00m
  drbdthinpool     drbdpool twi-aotz--  1.69t                     0.06   1.68
  vm-101-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        99.66
Scenario 2:
On node1 we create a VM with ID 102.
The hard disk, vm-102-disk-1, is created on local storage.
Then we move the disk from local storage to the DRBD storage.
The LV then shows Data% 99.66 on node1.
Node1 starts syncing the DRBD resource to node2.
When the sync is complete, the LV for vm-102-disk-1 on node2 also shows Data% 99.66.

Code:
root@node1:~# lvs
  LV               VG       Attr       LSize  Pool         Origin Data%  Meta%  Move Log Cpy%Sync Convert
  .drbdctrl_0      drbdpool -wi-ao----  4.00m
  .drbdctrl_1      drbdpool -wi-ao----  4.00m
  drbdthinpool     drbdpool twi-aotz--  1.69t                     0.06   1.69
  vm-101-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        0.05
  vm-102-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        99.66

root@node2:~# lvs
  LV               VG       Attr       LSize  Pool         Origin Data%  Meta%  Move Log Cpy%Sync Convert
  .drbdctrl_0      drbdpool -wi-ao----  4.00m
  .drbdctrl_1      drbdpool -wi-ao----  4.00m
  drbdthinpool     drbdpool twi-aotz--  1.69t                     0.12   1.80
  vm-101-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        99.66
  vm-102-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        99.66
Conclusion:
It seems that when a hard disk is created directly on the DRBD storage, the LV Data% is not allocated consistently across the nodes.
So the workaround for now is to create or restore disks on local storage and then move them to the DRBD storage.

The command "drbdmanage list-nodes" will show different values for Pool Free when LVs are not fully allocated.
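To spot this kind of mismatch quickly, the two lvs outputs can be compared with a few lines of Python. This is only a sketch: the function names are hypothetical, it assumes the plain column layout shown in the listings (thin volumes without an Origin entry), and the sample text is a trimmed copy of the Scenario 2 output.

```python
def thin_lv_data_percent(lvs_output):
    """Map thin LV name -> Data% from plain `lvs` output.

    Only rows whose Attr field starts with 'V' (thin volumes) are used;
    for those rows the fields are LV, VG, Attr, LSize, Pool, Data%.
    """
    result = {}
    for line in lvs_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[2].startswith("V"):
            result[fields[0]] = float(fields[5])
    return result


def allocation_mismatches(node_a, node_b, tolerance=1.0):
    """List (name, a%, b%) for thin LVs whose Data% differs by more than tolerance."""
    a = thin_lv_data_percent(node_a)
    b = thin_lv_data_percent(node_b)
    return [(name, a[name], b[name])
            for name in sorted(a.keys() & b.keys())
            if abs(a[name] - b[name]) > tolerance]


NODE1 = """\
  LV               VG       Attr       LSize  Pool         Origin Data%  Meta%
  .drbdctrl_0      drbdpool -wi-ao----  4.00m
  drbdthinpool     drbdpool twi-aotz--  1.69t                     0.06   1.69
  vm-101-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        0.05
  vm-102-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        99.66
"""
NODE2 = """\
  LV               VG       Attr       LSize  Pool         Origin Data%  Meta%
  .drbdctrl_0      drbdpool -wi-ao----  4.00m
  drbdthinpool     drbdpool twi-aotz--  1.69t                     0.12   1.80
  vm-101-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        99.66
  vm-102-disk-1_00 drbdpool Vwi-aotz--  1.00g drbdthinpool        99.66
"""

print(allocation_mismatches(NODE1, NODE2))
```

With the sample above, only vm-101-disk-1_00 (the disk created directly on the DRBD storage) is flagged.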

Is this a problem with Proxmox or DRBD?
 

seventh

Hi again,

The issue doesn't seem to be that the LVs are at different allocated Data%.
I have found that it doesn't matter if drbdmanage list-nodes shows different Pool Free values. The LVM Data% differs because a disk created directly on the DRBD pool shows only the really used data on the local node, while the sync to the second node has to write every block to the LV there.

BUT!
If I create a disk on node1 that is, for example, 5 GB, LVM shows 5 GB, but after "drbdmanage update-pool" and "drbdmanage list-nodes", "Pool Free" shows that approximately 15 GB was used for the 5 GB resource...
So is drbdmanage calculating wrong, or what? I don't want to lose 3x of my disk space :(
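To put a number on that overhead, a small Python sketch like the following can be used. The function name is hypothetical, the two Pool Free readings are made-up example values (only their difference of about 15 GB comes from the observation above), and Pool Free is assumed to be reported in MiB.

```python
def pool_free_overhead(resource_gib, pool_free_before_mib, pool_free_after_mib):
    """Ratio of the space drbdmanage deducted to the size of the new resource."""
    consumed_gib = (pool_free_before_mib - pool_free_after_mib) / 1024.0
    return consumed_gib / resource_gib


# Hypothetical readings of "Pool Free" before and after creating a 5 GiB disk.
before_mib = 1_000_000
after_mib = before_mib - 15 * 1024   # about 15 GiB less, as observed

print(pool_free_overhead(5, before_mib, after_mib))
```

A result of 1.0 would mean drbdmanage deducts exactly the size of the resource; the readings described above correspond to a factor of 3.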

Here is my pveversion:
Code:
root@node1:~# pveversion -v
proxmox-ve: 4.1-34 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-5 (running version: 4.1-5/f910ef5c)
pve-kernel-4.2.6-1-pve: 4.2.6-34
pve-kernel-4.2.2-1-pve: 4.2.2-16
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-31
qemu-server: 4.0-49
pve-firmware: 1.1-7
libpve-common-perl: 4.0-45
libpve-access-control: 4.0-11
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-3
pve-container: 1.0-39
pve-firewall: 2.0-15
pve-ha-manager: 1.0-19
ksm-control-daemon: 1.2-1
glusterfs-client: 3.7.6-1
lxc-pve: 1.1.5-6
lxcfs: 0.13-pve3
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve7~jessie
drbdmanage: 0.91-1
 

seventh

Does anyone have a clue what might be wrong with the Pool Free value that drbdmanage reports? As it is now, I can't add any new disks because drbdmanage says the thin pool is full...

See the output of drbdmanage list-nodes and lvs below:
Code:
root@node1:/var/lib/drbd.d# drbdmanage n
+------------------------------------------------------------------------------------------------------------+
| Name    | Pool Size | Pool Free |                                                            | State |
|------------------------------------------------------------------------------------------------------------|
| node1   |   1776640 |         0 |                                                            |    ok |
| node1   |   1776640 |         0 |                                                            |    ok |
+------------------------------------------------------------------------------------------------------------+
root@node1:/var/lib/drbd.d# lvs
  LV               VG       Attr       LSize   Pool         Origin Data%  Meta%  Move Log Cpy%Sync Convert
  vm-200-disk-1    backup   -wi-------   2.24t
  .drbdctrl_0      drbdpool -wi-ao----   4.00m
  .drbdctrl_1      drbdpool -wi-ao----   4.00m
  drbdthinpool     drbdpool twi-aotz--   1.69t                     43.24  87.26
  vm-100-disk-1_00 drbdpool Vwi-aotz-- 200.04g drbdthinpool        100.00
  vm-100-disk-2_00 drbdpool Vwi-aotz-- 200.04g drbdthinpool        100.00
  vm-104-disk-1_00 drbdpool Vwi-aotz-- 200.04g drbdthinpool        100.00
  vm-200-disk-1_00 drbdpool Vwi-aotz-- 150.04g drbdthinpool        100.00
  root             pve      -wi-ao----  96.00g
  swap             pve      -wi-ao----  31.00g
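For comparison, here is a minimal Python sketch of what the pool should still have free according to lvs. The function names are made up for this example, and the Pool Size from drbdmanage is assumed to be in MiB (1776640 MiB is 1735 GiB, which matches the 1.69t that lvs shows for drbdthinpool).

```python
def expected_pool_free_mib(pool_size_mib, data_percent):
    """Free space the thin pool should still have, based on lvs Data%."""
    return pool_size_mib * (1 - data_percent / 100.0)


def fully_allocated_percent(thin_lv_sizes_gib, pool_size_mib):
    """Pool usage in percent if every listed thin LV were 100% allocated."""
    return sum(thin_lv_sizes_gib) / (pool_size_mib / 1024.0) * 100


POOL_SIZE_MIB = 1776640        # "Pool Size" from drbdmanage (unit assumed: MiB)
POOL_DATA_PERCENT = 43.24      # Data% of drbdthinpool from lvs
THIN_LV_SIZES_GIB = [200.04, 200.04, 200.04, 150.04]  # LSize of the four thin LVs

free_mib = expected_pool_free_mib(POOL_SIZE_MIB, POOL_DATA_PERCENT)
usage = fully_allocated_percent(THIN_LV_SIZES_GIB, POOL_SIZE_MIB)
print(f"expected free: {free_mib:.0f} MiB ({free_mib / 1024:.1f} GiB)")
print(f"usage if all thin LVs are full: {usage:.2f}%")
```

The four fully allocated thin LVs account for the 43.24% that lvs reports, which would leave roughly 985 GiB free, while drbdmanage shows a Pool Free of 0.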
 

argonius

Member
Jan 17, 2012
34
0
6
Hi,

Any update here? We are also encountering this problem, and I think it should be easy for the Proxmox guys to reproduce it in a lab.
I think for the moment it would be best to use DRBD the "old way": DRBD + LVM storage (no thin provisioning).
 

mmenaz

Member
Jun 25, 2009
736
5
18
Northern east Italy
Sigh, I have the same issue; see my unanswered posts from yesterday:
https://forum.proxmox.com/threads/restored-vm-on-thin-provisioned-drbd9-storage-is-no-more-thin.26165/
https://forum.proxmox.com/threads/drbd9-replication-is-not-thin-on-replicated-node.26169/

I'm really scared by all this DRBD9; it's my first cluster and it seems we are left on our own. I know it's a "technical preview", but if they want to implement it they should take more care of the issues.
 

seventh

Hey guys,

I would just like to let you know that I solved my issue with drbdthinpool by using the lvm.Lvm storage plugin instead of lvm_thinlv.LvmThinLv.
More information here: http://drbd.linbit.com/users-guide-9.0/s-drbdmanage-storage-plugins.html.

First make sure the cluster is up and running or SSH keys are configured.

In my example below I have two nodes:
PVE1 with IP: 10.255.255.2
PVE2 with IP: 10.255.255.4

Do the following if you would like to use a VG as the pool:

PVE1 and PVE2:
Code:
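# Switch the storage plugin in drbdmanage's server.py from thin LVM to plain LVM
sed -i "s/lvm_thinlv\.LvmThinLv/lvm.Lvm/g" /usr/lib/python2.7/dist-packages/drbdmanage/server.py
# Create the volume group; replace /dev/sdX with your actual device
vgcreate drbdpool /dev/sdX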

PVE1:
Code:
drbdmanage init 10.255.255.2
drbdmanage add-node pve2 10.255.255.4

Add the following to /etc/pve/storage.cfg:
Code:
drbd: drbd1
        content images,rootdir
        redundancy 2
Now you should have a VG named drbdpool, and your DRBD resources will be stored as LVs directly on that VG.

Regards,
seventh
 

mmenaz

I wonder whether not installing the thin-provisioning-tools package would have had the same effect.
In short, thin provisioning seems badly broken at the moment; better to avoid it.
 

Stefanauss

New Member
Jan 27, 2016
4
1
3
seventh said:
PVE1 and PVE2:
Code:
sed -i "s/lvm_thinlv.LvmThinLv/lvm.Lvm/g" /usr/lib/python2.7/dist-packages/drbdmanage/server.py
vgcreate drbdpool /dev/sdX

Obviously this will be overwritten at every drbdmanage upgrade, and then DRBD won't be able to operate on drbdpool because it won't find any LVM thin pool.

I did the substitution prior to cluster creation and then changed the global config with:

Code:
drbdmanage modify-config
# and uncommenting the storage plugin setting
apt-get install --reinstall drbdmanage -y
# drbd still works and it's over classic LVM
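Since the sed edit can be lost on a package upgrade, it may help to check which plugin string a file currently contains. The following is a minimal Python sketch of such a check; the function name and the two sample lines are hypothetical (they are not copied from server.py), and only the two plugin names come from this thread.

```python
THIN_PLUGIN = "lvm_thinlv.LvmThinLv"
PLAIN_PLUGIN = "lvm.Lvm"


def plugin_in_source(source_text):
    """Return the storage plugin string found in the text, or None."""
    if THIN_PLUGIN in source_text:
        return THIN_PLUGIN
    if PLAIN_PLUGIN in source_text:
        return PLAIN_PLUGIN
    return None


# Hypothetical sample lines, standing in for the relevant line of server.py.
sample_before = 'plugin = "drbdmanage.storage.lvm_thinlv.LvmThinLv"'
sample_after = sample_before.replace(THIN_PLUGIN, PLAIN_PLUGIN)

print(plugin_in_source(sample_before))
print(plugin_in_source(sample_after))
```

To use it on a real system, one would pass in the contents of the server.py file named in the sed command instead of the sample strings.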
 

argonius

hey guys,

The last posts look very interesting.
So what's the conclusion for now? Is using DRBD9 with the lvm.Lvm driver instead of lvm_thinlv.LvmThinLv safe?
I think that as long as Proxmox calls DRBD 9 a "technology preview" and Linbit says on its DRBD9 FAQ page:
"It is a pre-release version, not for production use. Please do not yet report bugs, we know that there's a lot to do."
(https://www.linbit.com/en/resources/technical-publications/11-en/products-and-services/drbd/250-drbd9-faq), it is best to stick with DRBD 8.
 

seventh

Stefanauss said:
Obviously this will be overwritten at every drbdmanage upgrade, and then DRBD won't be able to operate on drbdpool because it won't find any LVM thinpool.

I did the substitution prior to cluster creation and then changed the global config with

Code:
drbdmanage modify-config
# and uncommenting the storage plugin setting
apt-get install --reinstall drbdmanage -y
# drbd still works and it's over classic LVM

Thank you for the hint! But I do have a question regarding the drbdmanage modify-config part: when I uncomment the storage plugin setting and save/quit, I get the message:

Code:
Could not parse configuration, your changes will not be saved.
Do you have any idea why I can't save the changes?
 

Stefanauss

Not really, it never happened to me. Does it happen if you also try to uncomment other defaults, such as max peers? (It won't affect cluster operation if you didn't previously change the defaults, so you can try).
 

seventh

Yes it also happens on max-peers...
 

mmenaz

Drbdmanage 0.96 released at linbit! See the changes made to lvm_thinlv.py
http://git.linbit.com/drbdmanage.git/blobdiff/ed7bc09050e418aa6bb4eaf11ba3bce89b755b58..4ce35b0f9d5ef57794463e5ec1d68a97b6882dbc:/drbdmanage/storage/lvm_thinlv.py

Now let's hope Proxmox add this update asap so we can start testing drbd9 for real! :)
There is 0.97.3-1 in the repo, but after the upgrade the calculation is still wrong. Is there any command to run to make it "recalculate", or something further to do? I should have plenty of free space, but it is reported that I don't, and I will be in trouble if I have to create more VMs or enlarge a virtual disk!
 
