Changed RAID and now things have taken a turn for the worse

1slotrk

I wanted to rebuild my server for multiple reasons, so I did. One thing I wanted to change was to move my array from RAIDz2 to RAIDz1. I also moved my Proxmox install from the array to a single 10k drive. After installation the array (6x 4TB disks) came out to the expected size of 21.83 TiB. But when I go down to my storage tab, the usage maximum is 16.89 TiB. Then, when I build the disks for one of my VMs (1x 5000G, 1x 4000G and 1x 500G), that apparently equals 89.59% usage (15.13 TiB of my 16.89 TiB). How does any of this work out to be right? I feel like I'm missing something here. I moved to an array that is supposed to have more storage and I end up with less. I am pretty new to Linux and Proxmox. Thanks for the help.
Code:
root@proxmox:~# lsblk
NAME                          MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                             8:0    0   3.7T  0 disk
├─sda1                          8:1    0   3.7T  0 part
└─sda9                          8:9    0     8M  0 part
sdb                             8:16   0   3.7T  0 disk
├─sdb1                          8:17   0   3.7T  0 part
└─sdb9                          8:25   0     8M  0 part
sdc                             8:32   0   3.7T  0 disk
├─sdc1                          8:33   0   3.7T  0 part
└─sdc9                          8:41   0     8M  0 part
sdd                             8:48   0   3.7T  0 disk
├─sdd1                          8:49   0   3.7T  0 part
└─sdd9                          8:57   0     8M  0 part
sde                             8:64   0   3.7T  0 disk
├─sde1                          8:65   0   3.7T  0 part
└─sde9                          8:73   0     8M  0 part
sdf                             8:80   0   3.7T  0 disk
├─sdf1                          8:81   0   3.7T  0 part
└─sdf9                          8:89   0     8M  0 part
sdg                             8:96   0 279.5G  0 disk
├─sdg1                          8:97   0  1007K  0 part
├─sdg2                          8:98   0   512M  0 part
└─sdg3                          8:99   0   279G  0 part
  ├─pve-swap                  253:5    0     8G  0 lvm  [SWAP]
  ├─pve-root                  253:6    0  69.5G  0 lvm  /
  ├─pve-data_tmeta            253:7    0   1.9G  0 lvm
  │ └─pve-data-tpool          253:9    0 181.8G  0 lvm
  │   ├─pve-data              253:10   0 181.8G  0 lvm
  │   ├─pve-vm--100--disk--0  253:11   0     4G  0 lvm
  │   └─pve-vm--101--disk--0  253:12   0    32G  0 lvm
  └─pve-data_tdata            253:8    0 181.8G  0 lvm
    └─pve-data-tpool          253:9    0 181.8G  0 lvm
      ├─pve-data              253:10   0 181.8G  0 lvm
      ├─pve-vm--100--disk--0  253:11   0     4G  0 lvm
      └─pve-vm--101--disk--0  253:12   0    32G  0 lvm
sdh                             8:112  0 372.5G  0 disk
└─docker-vm--101--disk--0     253:0    0   320G  0 lvm
zd0                           230:0    0   4.9T  0 disk
zd16                          230:16   0   3.9T  0 disk
zd32                          230:32   0   500G  0 disk

root@proxmox:~# uname -a
Linux proxmox 5.4.73-1-pve #1 SMP PVE 5.4.73-1 (Mon, 16 Nov 2020 10:52:16 +0100) x86_64 GNU/Linux
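In case it helps, these are the commands I used to compare the numbers. My (possibly wrong) understanding is that zpool list counts raw capacity including parity, while zfs list shows usable space, which seems to be what the storage tab reports:

Code:
# raw pool size, parity included (this is where the ~21.8T figure comes from)
zpool list tank
# usable space after parity; USED + AVAIL is roughly the ~16.9T the storage tab shows
zfs list tank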
 
Can you please post the output of zpool status in [code][/code] tags?
 
Corrected, thanks for the pointer.

Code:
root@proxmox:~# zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 0 days 06:52:37 with 0 errors on Thu Dec  3 22:45:06 2020
config:

        NAME                                      STATE     READ WRITE CKSUM
        tank                                      ONLINE       0     0     0
          raidz1-0                                ONLINE       0     0     0
            ata-TOSHIBA_MG04ACA400N_46T8K123FVLC  ONLINE       0     0     0
            ata-TOSHIBA_MG04ACA400N_562BK134FVLC  ONLINE       0     0     0
            ata-TOSHIBA_MG04ACA400N_5623K1N0FVLC  ONLINE       0     0     0
            ata-TOSHIBA_MG04ACA400N_56JDK1TVFVLC  ONLINE       0     0     0
            ata-TOSHIBA_MG04ACA400N_46R3K1ILFVLC  ONLINE       0     0     0
            ata-TOSHIBA_MG04ACA400N_56L7K2N3FVLC  ONLINE       0     0     0

errors: No known data errors
 
Having the output of zfs list might help as well, but I do believe I know what is going on. We have a section in the docs about pool design and what to take into consideration [0].

The TL;DR is that you are using a raidz1, which needs to store parity data. Depending on the block size used in the zvol for the VM disk (volblocksize) and the sector size used underneath (ashift), you can see quite an explosion in data usage.

For a VM workload we recommend a pool of mirrored vdevs (RAID 10). The linked section in the documentation covers its performance benefits.

If you really need to use a raidz1 pool, you are lucky though: AFAICT those disks use native 512B sectors and not 4K. You could recreate the pool with an ashift of 9, which should give you a much better data-to-parity ratio (rough sketch below).


[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_zfs_raid_considerations
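If you do go down that road, it would roughly look like this. This is only a rough sketch: zpool destroy wipes everything on the pool, so make sure your backups are good, and use the disk IDs exactly as they show up in your zpool status.

Code:
# WARNING: this destroys the pool and all data on it!
zpool destroy tank

# recreate as raidz1 with 512B sectors (ashift=9), using the same disks as before
zpool create -o ashift=9 tank raidz1 \
  /dev/disk/by-id/ata-TOSHIBA_MG04ACA400N_46T8K123FVLC \
  /dev/disk/by-id/ata-TOSHIBA_MG04ACA400N_562BK134FVLC \
  /dev/disk/by-id/ata-TOSHIBA_MG04ACA400N_5623K1N0FVLC \
  /dev/disk/by-id/ata-TOSHIBA_MG04ACA400N_56JDK1TVFVLC \
  /dev/disk/by-id/ata-TOSHIBA_MG04ACA400N_46R3K1ILFVLC \
  /dev/disk/by-id/ata-TOSHIBA_MG04ACA400N_56L7K2N3FVLC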
 
I would like to use raidz1. The only VM I am running right now is OpenMediaVault for a media server. Sometimes I will mess around with other VMs and OSes, but not in a way where I need a lot of storage. I came from a raidz2. I also moved away from the RAID card I had to the LSI 9211i so I could use ZFS. I hope I can make this right. Currently the block size for "tank" is 8k.

If I use RAID 10 then, from my understanding, I will lose 50% of the overall pool capacity, compared to about 18% with raidz1.

Code:
root@proxmox:~# zfs list
NAME                 USED  AVAIL     REFER  MOUNTPOINT
tank                15.1T  1.76T      153K  /tank
tank/vm-101-disk-0  7.97T  3.34T     6.38T  -
tank/vm-101-disk-1  6.37T  7.21T      940G  -
tank/vm-101-disk-2   816G  2.55T     11.1M  -
 
Like aaron said, your volblocksize of 8K is causing a lot of padding overhead, and because of that everything is bigger and you lose space. A fix would be to destroy and recreate the pool to lower the ashift to 9, or to destroy and recreate the virtual disks after changing the volblocksize to something bigger like 32K. Right now, with raidz1, 6 drives, an ashift of 12 and a volblocksize of 8K, your pool should lose 50% of its total raw capacity to parity and padding, so it's the same as using RAID 10. With a volblocksize of 32K or higher you would only lose around 20%. Look here.

That is a common beginner problem, and every second day someone asks the same question. We should have a sticky FAQ thread for it so it is easier to point people to an already existing short explanation.
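One thing to keep in mind for the second option: the volblocksize can only be set when a zvol is created, so you have to recreate/restore the VM disks after raising the default. In PVE the default for newly created disks comes from the blocksize option of the ZFS storage (I'm assuming here that your storage entry is also called "tank"; adjust the ID if not):

Code:
# check what the pool and an existing disk currently use
zpool get ashift tank
zfs get volblocksize tank/vm-101-disk-0

# raise the default volblocksize for disks created on this storage from now on
pvesm set tank --blocksize 32k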
 
How does that math work out? I would really love to know. I am very new to virtual environments, but I love teaching myself things. Forums are like magic to me. I have read about ashift and volblocksize, but it is always clear as mud. A sticky explaining this to someone like me who is learning would save so much heartache.

Which would be better: changing the ashift or changing the volblocksize? What are the advantages of each? In both cases I have to lose my data, but it is backed up, so no worries.

Thank you, CHEERS
 
If you lower the ashift to 9 (so the pool is optimized for drives with a 512B LBA), you can run into problems later because you can't replace a damaged drive with a new one that has a 4K LBA. Drives keep getting bigger in capacity, and because of that fewer 512B LBA drives are produced. As soon as no manufacturer is producing 512B LBA drives any longer, you won't be able to replace your drive. The big advantage is that a smaller ashift wastes less space if you write a lot of small files, for example if you are running some kind of database.

Increasing the volblocksize will speed up reading/writing of big files, but you will lose capacity if you are storing a lot of small files.
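To make the math from before a bit more concrete, here is a simplified sketch of how raidz1 accounts for a single zvol block on your 6-disk pool (it ignores compression and metadata, but the ratios come out as described):

Code:
# raidz1 on 6 drives: up to 5 data sectors per stripe row get 1 parity sector,
# and each allocation is rounded up to a multiple of 2 sectors (parity + 1).

# ashift=12 (4K sectors), volblocksize=8K:
#   2 data + 1 parity = 3 sectors, padded to 4 sectors = 16K allocated for 8K of data
echo $(( 8 * 100 / 16 ))    # -> 50, i.e. only 50% of the allocation is your data

# ashift=12, volblocksize=32K:
#   8 data + 2 parity = 10 sectors = 40K allocated for 32K of data
echo $(( 32 * 100 / 40 ))   # -> 80% data, ~20% overhead

# ashift=9 (512B sectors), volblocksize=8K:
#   16 data + 4 parity = 20 sectors = 10K allocated for 8K of data
echo $(( 8 * 100 / 10 ))    # -> 80% data, ~20% overhead

So the 50% you are losing right now is not mostly parity (that would only be 1 drive out of 6); it is the padding that small 8K blocks force on an ashift=12 raidz1.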
 
Most of my writes to "tank" are one-time writes with a lot of reads, mostly multi-gigabyte files. So from what I can understand, a higher volblocksize will be better for me (64K maybe?). I would like to be able to upgrade my drives, but I will most likely destroy the pool and restore from a backup anyway. How low should I go on the ashift, assuming 9 is optimized for my drives?
 
It's no problem to use 4K on 512B LBA drives; you just shouldn't try to use 512B on a 4K LBA drive. If you don't care about disk replacements in the future, you can use ashift 9 for a 512B hard disk block size, and you can keep your 8K volblocksize or even lower it to 4K. In either case you should only lose around 20% of your total raw capacity to overhead.
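If you want to be sure before recreating the pool, you can check what the drives actually report. Both of these just read the reported logical/physical sector sizes (smartctl needs the smartmontools package):

Code:
lsblk -o NAME,LOG-SEC,PHY-SEC /dev/sd[a-f]
smartctl -i /dev/sda | grep -i sector

If they report 512 bytes logical and physical, ashift=9 matches the hardware; if the physical sector size shows 4096, you are better off staying at ashift=12.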
 
