Is this the right way to add a disk to a VM?

lumox

Member
May 29, 2020
Hi,
I created a ZFS pool, set up one dataset for my VMs to run on, then created a second dataset to hold some data and set a quota on it.
So I restored my VMs into lukepool/vmstorage. So far so good.
Next, I wanted to create a disk for my Linux VM.


So, first Datacenter -> Add: ZFS, and I picked my dataset.
It created another storage (called disk_mint).
Then, in my Linux VM's hardware setup, I added a disk and picked that storage.
It worked. I can see it in my Linux VM's shell now.
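For reference, I believe the rough CLI equivalent of those GUI steps would be something like this (just a sketch; I actually did it in the GUI, and the VM ID 103, the scsi1 slot and the 150GB size are simply examples from my setup):

Code:
# add the dataset as a ZFS storage (what Datacenter -> Add: ZFS does)
pvesm add zfspool disk_mint --pool lukepool/share

# attach a new 150GB virtual disk from that storage to the VM (ID 103 here)
qm set 103 --scsi1 disk_mint:150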

fdisk -l

[Screenshot: fdisk -l output inside the VM]

Another screenshot:

[Screenshot: the VM's hardware view showing the new disk]
Hard disk: disk_mint is the disk I added out of my 250GB dataset storage.

Here are a few issues, though.
My dataset (lukepool/share) has a 250GB quota.
I wanted to assign the entire size to my new Linux VM's HDD,
but I could only set much less than that, only 150GB.

My updated zfs list:
Code:
root@pve:~# zfs list
NAME                               USED  AVAIL     REFER  MOUNTPOINT
lukepool                           277G   621G      128K  /lukepool
lukepool/share                     205G  45.3G      128K  /lukepool/share
lukepool/share/vm-103-disk-0       205G   250G     74.6K  -
lukepool/vmstorage                72.4G   621G      128K  /lukepool/vmstorage
lukepool/vmstorage/vm-101-disk-0  34.6G   621G     34.6G  -
lukepool/vmstorage/vm-102-disk-0  1.26G   621G     1.26G  -
lukepool/vmstorage/vm-103-disk-0  36.5G   621G     36.5G  -

I'm not sure what the line "lukepool/share  205G  45.3G  128K  /lukepool/share" means exactly.
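For reference, these are the standard zfs commands to inspect the quota and space on that dataset (nothing Proxmox-specific, just in case it helps):

Code:
# quota, space used and space still available on the dataset
zfs get quota,used,available lukepool/share

# per-dataset space breakdown, including space reserved for the zvol (refreservation)
zfs list -o space -r lukepool/share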

Your thoughts, please.
Thanks
 
ZFS is using 205GB to store your 150GB virtual disk. You always lose some space to additional metadata, parity, or padding.
How did you set up your pool? Is it raidz1/2/3? Before creating a virtual drive you should calculate the best volblocksize and set that value for your pool. The default volblocksize is 8K, and it isn't uncommon for everything to use something like 50% more space because of bad padding.

Have a look at the attached spreadsheet for the volblocksize and overhead.
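To see what you currently have, something like this should work (standard zpool/zfs commands; adjust the names to your pool and zvol):

Code:
# ashift is a pool property fixed at creation time
zpool get ashift lukepool

# volblocksize of an existing zvol (also fixed at creation, can't be changed later)
zfs get volblocksize lukepool/share/vm-103-disk-0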
 

Thank you for the file.

My pool is raidz1 with 3x 500GB disks. So I need to know the best volblocksize before creating a virtual drive, because otherwise I'd risk wasting space. Correct? Could you help me figure out the best disk size setup to get the most out of my 250GB dataset (lukepool/share) that I created the virtual disk from?
More importantly, is it worth creating and assigning a virtual disk to my Linux machine this way, or would I be better off simply adding a new disk in the hardware setup, setting a size for it, and putting it in my ZFS pool (lukepool) directly, without any fuss?
Anyway, my goal was to have a disk separate from my Linux VM so that I could assign it to another machine if needed, or reassign it back after a reformat. I also thought I could back it up manually via FTP.
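(My understanding, untested, is that handing the zvol to another VM would go roughly like this; the target VM ID 104 is just an example:)

Code:
# detach the disk from VM 103 first (Hardware -> Detach), then rename the zvol
zfs rename lukepool/share/vm-103-disk-0 lukepool/share/vm-104-disk-0

# let Proxmox pick it up as an unused disk of VM 104, then attach it in that VM's Hardware tab
qm rescan --vmid 104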
 
You can try 16K as the volblocksize if you created your pool with an ashift of 12. That should only waste 33% for padding/parity instead of 50% like now.

I don't really get your question.

If you want to back up your VMs, use the built-in backup features and make sure all virtual disks are included.
You can add several virtual disks to a VM if you don't want everything (swap/home/var) on the same virtual disk. I, for example, used 3 virtual disks so I was able to move my /var and swap to the HDD pool to save SSD wear.
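For example, something like this (a sketch; "hdd-pool" is just whatever storage you want the extra disks on, and the sizes are made up):

Code:
# give the VM two extra virtual disks, e.g. for /var and swap, on a slower storage
qm set 103 --scsi2 hdd-pool:32   # later mounted as /var inside the guest
qm set 103 --scsi3 hdd-pool:8    # later used as swap inside the guest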
 
You can try 16K as the volblocksize if you created your pool with an ashift of 12. That should only waste 33% for padding/parity instead of 50% like now.
Yes, I created my pool with an ashift of 12. Now I need to learn how to set the volblocksize.
I don't really get your question.

If you want to back up your VMs, use the built-in backup features and make sure all virtual disks are included.
You can add several virtual disks to a VM if you don't want everything (swap/home/var) on the same virtual disk. I, for example, used 3 virtual disks so I was able to move my /var and swap to the HDD pool to save SSD wear.

Yes, I could add virtual disks in the hardware setup and back up my VM and its virtual disks together. However, I'm trying to figure out whether there is really a difference between the two methods I mentioned above.
Also, I'd like to understand whether the "dataset" method I used (the first one) makes my new virtual disk more "independent/separated" than the classic way.
In a few words: is it really worth creating a dataset and then creating a virtual disk out of it for a VM?
Thanks
 
Yes, I created my pool with an ashift of 12. Now I need to learn how to set the volblocksize.
WebGUI: "Datacenter -> Storage -> YourPoolName -> Block Size: 16k"
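Or from the shell, something like this (the storage name is the one you created, disk_mint in your case):

Code:
# set the block size used for newly created zvols on that storage
pvesm set disk_mint --blocksize 16k

# verify: the option shows up in the storage configuration
cat /etc/pve/storage.cfg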
Yes, I could add virtual disks in the hardware setup and back up my VM and its virtual disks together. However, I'm trying to figure out whether there is really a difference between the two methods I mentioned above.
Also, I'd like to understand whether the "dataset" method I used (the first one) makes my new virtual disk more "independent/separated" than the classic way.
In a few words: is it really worth creating a dataset and then creating a virtual disk out of it for a VM?
Thanks
Creating datasets might be useful if you want to group virtual hard disks (zvols) for easier recursive snapshots/replication, or if you want different virtual hard disks to use different ZFS options.
You could, for example, create one dataset with encryption enabled for encrypted virtual disks and one dataset for non-encrypted virtual disks. That way you can choose whether a disk should be encrypted or not simply by creating it as a child of one of the two datasets. It is also useful if you want to change the compression algorithm or disable sync writes for specific virtual disks.
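A quick sketch of that idea (dataset and storage names are just examples; native encryption needs ZFS 0.8+ and you have to handle the key yourself):

Code:
# one dataset with native encryption, one without
zfs create -o encryption=on -o keyformat=passphrase lukepool/vm-encrypted
zfs create -o compression=lz4 lukepool/vm-plain

# add both as separate Proxmox storages; zvols created below them inherit the dataset's options
pvesm add zfspool vm-encrypted --pool lukepool/vm-encrypted
pvesm add zfspool vm-plain --pool lukepool/vm-plain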
 
WebGUI: "Datacenter -> Storage -> YourPoolName -> Block Size: 16k"

It is set to 8K. But I read on the Excel sheet that it wouldn't make any difference with a ZFS pool made of 3 disks. By the way, what is the difference between the "RAIDZ1 PARITY COST, PERCENTAGE OF TOTAL STORAGE" sheet and the "RAIDZ1 PARITY COST" sheet?

Creating datasets might be useful if you want to group virtual hard disks (zvols) for easier recursive snapshots/replication, or if you want different virtual hard disks to use different ZFS options.
You could, for example, create one dataset with encryption enabled for encrypted virtual disks and one dataset for non-encrypted virtual disks. That way you can choose whether a disk should be encrypted or not simply by creating it as a child of one of the two datasets. It is also useful if you want to change the compression algorithm or disable sync writes for specific virtual disks.

I think I got it now, thanks.

One last thing.

When I created the dataset and the virtual disk in it, I got only 2/3 of the size (lost 1/3 again) to parity.

Code:
root@pve:~# zfs list
NAME       USED  AVAIL     REFER  MOUNTPOINT
lukepool   480K   898G      128K  /lukepool

I thought I would lose space only once, when I first created the zpool.
Did I really lose 1/3 of 1500GB plus 1/3 of 250GB, that is, about 500GB + 84GB = 584GB?
It probably can't work like that, and I haven't understood yet how it does.

Here is how it looks now:

Code:
root@pve:~# zfs list
NAME                               USED  AVAIL     REFER  MOUNTPOINT
lukepool                           277G   621G      128K  /lukepool
lukepool/share                     205G  45.3G      128K  /lukepool/share
lukepool/share/vm-103-disk-0       205G   250G     74.6K  -
lukepool/vmstorage                72.4G   621G      128K  /lukepool/vmstorage
lukepool/vmstorage/vm-101-disk-0  34.6G   621G     34.6G  -
lukepool/vmstorage/vm-102-disk-0  1.26G   621G     1.26G  -
lukepool/vmstorage/vm-103-disk-0  36.5G   621G     36.5G  -
root@pve:~# zpool iostat -v
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
lukepool     109G  1.25T     20      9   120K  77.7K
  raidz1     109G  1.25T     20      9   120K  77.7K
    sda         -      -      6      3  40.9K  26.0K
    sdb         -      -      6      3  39.4K  25.8K
    sdc         -      -      6      3  39.8K  25.9K
----------  -----  -----  -----  -----  -----  -----

Could you help me make sense of it, please?
Thanks
 
It is set to 8K. But I read on the Excel sheet that it wouldn't make any difference with a ZFS pool made of 3 disks.
Where did you see that? The columns are "block size in sectors" and the rows are the number of HDDs your pool consists of. If you use ashift 12, each HDD uses a sector size of 4K. So your 8K volblocksize is "block size in sectors = 2" in the table, and a volblocksize of 16K would be "block size in sectors = 4". If you look at the "RAIDZ1 parity cost, percentage of total storage" tab, your 8K volblocksize (column "2") wastes 50% of the total raw capacity of your 3 drives, and a volblocksize of 16K (column "4") only wastes 33%.
By the way, what is the difference between the "RAIDZ1 PARITY COST, PERCENTAGE OF TOTAL STORAGE" sheet and the "RAIDZ1 PARITY COST" sheet?
It's the same thing, just expressed in a different relation. "RAIDZ1 parity cost, percentage of total storage" tells you how much space you lose relative to the total raw capacity of all your drives. For example, if your pool consists of 3x 500GB drives and you lose 50% of the 1500GB raw capacity, you can only use 750GB.
"RAIDZ1 parity cost" is relative to the real data: if you lose 50% of your total raw capacity, the data you store takes up 200% of its real size on disk. Both describe the same thing, and you lose 750GB of that 1500GB pool.
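Worked through with your numbers (ashift=12, so 4K sectors per disk):

Code:
volblocksize 8K  -> 8K / 4K  = 2 sectors -> ~50% of raw capacity lost (parity + padding)
volblocksize 16K -> 16K / 4K = 4 sectors -> ~33% of raw capacity lost (parity only)

raw capacity = 3 x 500GB = 1500GB
at 50% loss  -> 750GB usable; stored data takes ~200% of its real size on disk
at 33% loss  -> ~1000GB usable (the plain raidz1 parity cost)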
Did I really lose 1/3 of 1500GB plus 1/3 of 250GB, that is, about 500GB + 84GB = 584GB?
If you use raidz1 with 3x 500GB drives, you will always lose 33% of your total raw capacity to parity overhead. That's why you still lose 33% with a volblocksize of 16K. But with a volblocksize of 8K you get another 17% loss on top of that due to padding overhead. In other words: ZFS will show you a pool capacity of 1000GB, because 500GB are used for parity. But because of bad padding your data needs 33% more space on the drives, so if you write 750GB of real data to the pool, that data needs 1000GB to be stored on the drives.

And you shouldn't fill up a pool more than 80%, because it is a copy-on-write file system and always needs free space. So right now your usable real capacity is 600GB of that 1500GB raw capacity.
If you change the volblocksize to 16K and destroy and recreate every virtual disk (it can't be edited afterwards), you should be able to use 1000GB, and if you keep 20% free for copy-on-write you will be able to store 800GB of real data.

Also keep in mind that snapshots use space if you want to use them. So you might, for example, want to keep 50% of that 800GB free for snapshots, in which case all of your virtual hard disks combined shouldn't be bigger than 400GB.
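To put numbers on it:

Code:
raw capacity (3 x 500GB)                        = 1500GB
usable with 16K volblocksize (parity only)      = ~1000GB
keep 20% free for copy-on-write                 -> ~800GB of real data
optionally reserve ~50% of that for snapshots   -> ~400GB for the virtual disks themselves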
 
If you change the volblocksize to 16K and destroy and recreate every virtual disk (it can't be edited afterwards), you should be able to use 1000GB, and if you keep 20% free for copy-on-write you will be able to store 800GB of real data.
So, I should destroy and recreate my VMs and their added disks if I want to change the volblocksize?
Thanks
 
Yes, because the volblocksize is only set at creation time. So you need to recreate them after changing the volblocksize of the pool.
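Roughly, the procedure could look like this (a sketch, untested on your setup; the storage IDs and the scsi1 slot are placeholders for your actual ones, so double-check before deleting anything):

Code:
# 1. set the new block size on the storage; it only affects newly created zvols
pvesm set disk_mint --blocksize 16k

# 2. recreate a disk by moving it to another storage and back (each move writes a fresh zvol)
qm move_disk 103 scsi1 vmstorage --delete 1
qm move_disk 103 scsi1 disk_mint --delete 1

# 3. verify the recreated zvol (the disk number may have changed, e.g. vm-103-disk-1)
zfs get volblocksize lukepool/share/vm-103-disk-1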
 
