Best Drive setup with my current hardware.

catbodi

Member
Oct 21, 2018
I need some help with my drive setup on my new dedicated server.

2x240GB SSD SATA Soft Raid
4x12TB HDD Soft Raid
32GB RAM DDR4 ECC


I'm going for the best performance plus reasonable safety. The server will be used for self-hosting many important VMs (web server, file storage, VPN, etc.). Speeds inside the VMs matter here. I cannot customize the hardware (add/remove drives), so that is out of the question. So what's the best way to go about this? I have spent a few hours researching and there is so much conflicting information...

Use the 4 HDDs in RAIDZ1, RAIDZ2, or RAID 5 for the Proxmox OS + storage and use the 2x SSDs for caching somehow, or use the SSDs for the Proxmox OS?
 
The two SSDs can be used for the OS and if you want to add a ZIL you could leave some space free during the installation to add another partition to be used as ZIL. Usually a few GB are enough for the ZIL.
For this, set the `hdsize` parameter in the advanced disk options during installation. (https://pve.proxmox.com/pve-docs/pve-admin-guide.html#advanced_zfs_options)
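
For illustration, a quick way to see how much unpartitioned space is left on the SSDs after installing with a reduced hdsize (the device names /dev/sda and /dev/sdb are just placeholders for your two SSDs):
Code:
# show the GPT partition table and remaining free space on the first SSD
sgdisk -p /dev/sda
# or list the block devices, partitions and their sizes
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT /dev/sda /dev/sdb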

If you want the best performance, don't go for any RAIDZ or RAID 5 setups. If you use ZFS, use a RAID 10-like setup with two mirror VDEVs.

Any RAIDZ VDEV will behave like a single disk in regard to IOPS, although with a lot of bandwidth.

When writing, a mirror VDEV will behave like a single disk (IOPS and bandwidth); when reading, it will behave like two disks (IOPS and bandwidth).

Thus with two mirror VDEVs you will have double the IOPS when writing and four times the IOPS when reading. And IOPS is what you want to optimize for when running multiple VMs on that storage.
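
As a rough sketch (the disk names below are placeholders, not your actual disks), the striped-mirror layout described above would be created and checked like this:
Code:
# create a pool of two mirror VDEVs (striped mirrors)
zpool create hddpool \
    mirror /dev/disk/by-id/wwn-disk1 /dev/disk/by-id/wwn-disk2 \
    mirror /dev/disk/by-id/wwn-disk3 /dev/disk/by-id/wwn-disk4

# the layout should then look roughly like this:
zpool status hddpool
        NAME           STATE     READ WRITE CKSUM
        hddpool        ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            wwn-disk1  ONLINE       0     0     0
            wwn-disk2  ONLINE       0     0     0
          mirror-1     ONLINE       0     0     0
            wwn-disk3  ONLINE       0     0     0
            wwn-disk4  ONLINE       0     0     0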

More RAM would probably be nice though, depending on how much the VMs will need.

You won't want to add a ZFS read cache (L2ARC) to this setup. You are already on the low end of RAM, and adding an L2ARC needs additional space in RAM to store its indexes and such.
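
If you want to keep an eye on how much of the 32GB the ARC is actually using (so you can judge whether ZFS and the VMs are fighting over RAM), something like this should work on a standard Proxmox/OpenZFS install:
Code:
# current ARC size, target size and hit rates
arc_summary | head -n 40
# or just the raw numbers
grep -E '^(size|c_max|c_min) ' /proc/spl/kstat/zfs/arcstats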
 

Here:
[Attachment: Screen Shot 2020-07-10 at 1.11.06 AM.png]

I used two separate commands: `sudo zpool create hddpool mirror wwn-drive1 wwn-drive2`, and then another command with `zpool add mirror`. In this configuration I can still have a hard drive fail and be OK, correct?
 
You could have done it in one command:
zpool create hddpool mirror /path/to/drive1 /path/to/drive2 mirror /path/to/drive3 /path/to/drive4

Did you set the ashift? If not, what disks are those? For HDDs you will most likely want to set the ashift to 12 in order to have the pool use 4k blocks (2^12 = 4096):

zpool create -o ashift=12 hddpool mirror /path/to/drive1 /path/to/drive2 mirror /path/to/drive3 /path/to/drive4

In this setup, the hddpool can suffer the loss of two disks, IF it is the right two disks: one per VDEV.

Once both disks in a VDEV are lost, the pool will be unusable as well.
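
To double-check what the disks report and which ashift the existing pool actually got, something along these lines should work (the pool name matches yours, the rest is generic):
Code:
# logical vs. physical sector size as reported by the kernel
lsblk -o NAME,PHY-SEC,LOG-SEC,MODEL
# ashift the existing pool was created with
zdb -C hddpool | grep ashift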
 
According to hdparm -I, here is a link to the matching model number: https://www.disctech.com/Hitachi-0F30145-12TB-SATA-Hard-Drive
The data sheet shows Sector Size: 512 / 512e, and hdparm -I shows:

Logical Sector size: 512 bytes
Physical Sector size: 4096 bytes

Here is the hdparm -I output of one of the disks:
Code:
Configuration:
    Logical        max    current
    cylinders    16383    16383
    heads        16    16
    sectors/track    63    63
    --
    CHS current addressable sectors:    16514064
    LBA    user addressable sectors:   268435455
    LBA48  user addressable sectors: 23437770752
    Logical  Sector size:                   512 bytes
    Physical Sector size:                  4096 bytes
    Logical Sector-0 offset:                  0 bytes
    device size with M = 1024*1024:    11444224 MBytes
    device size with M = 1000*1000:    12000138 MBytes (12000 GB)
    cache/buffer size  = unknown
    Form Factor: 3.5 inch
    Nominal Media Rotation Rate: 7200


ashift=12 is set on both the rpool and the hddpool (according to zdb -C hddpool). What would be optimal here? I'm not getting great performance inside the VMs. I'm using KVM Linux guests; what should the blocksize of the ZFS storage on the HDDs be? I noticed it defaulted to an 8k block size when creating the ZFS storage on the node. And what should the blocksize be inside the guest? For example, my imported guests are currently showing 4k.
 
If you look at the `Physical Sector size` you see that it is 4k. If a drive this big had 512b sectors I would have been surprised. Drives do like to lie about their sector sizes :-/

If you set ashift to 12, the blocks ZFS uses will be the same size as the sector size of the disk. The ashift cannot be changed after the pool has been created.

Thus, I recommend that you recreate the pool with ashift=12 now at the beginning when the pain of doing so is not as high. You will also be on the safer side if you have to replace a failed drive in the future. Performance should also be better when you align the block size of ZFS to the real sector size of the underlying disks.
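
If it did turn out that the pool was created with the wrong ashift, the recreate would look roughly like this (placeholder device names; this wipes the pool, so move any data off it first):
Code:
# WARNING: destroys all data on the pool
zpool destroy hddpool
zpool create -o ashift=12 hddpool \
    mirror /dev/disk/by-id/wwn-disk1 /dev/disk/by-id/wwn-disk2 \
    mirror /dev/disk/by-id/wwn-disk3 /dev/disk/by-id/wwn-disk4
# confirm the new ashift
zdb -C hddpool | grep ashift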
 
The output of zdb -C for hddpool and rpool shows that it is already ashift=12 by default. I did not manually specify it when making these, but the installer set ashift=12 automatically for the SSDs, and the hddpool shows ashift=12 even though I created it on the command line without specifying it. So no further steps are needed, right?

I will change the blocksize to 4k instead of 8k. Can this be changed on my hddpool without destroying it?
 
The output of zdb -C for hddpool and rpool shows that it is already ashift=12 by default.
Ah good to see that ZFS chose the right ashift automatically :)

I will change blocksize to 4k instead of 8k. Can this be changed on my hddpool without destroying?
On the ZVOLs (VM disks), you mean? I would leave the default. It's a multiple of 4k. Playing around with that would also require the filesystem of the guest to be aligned.

The blocksize of a ZVOL can also only be set during creation.
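
For reference, the volblocksize of an existing VM disk (ZVOL) can be checked like this; the dataset names below are just examples following the usual Proxmox naming:
Code:
# check the block size of an existing VM disk
zfs get volblocksize hddpool/vm-100-disk-0
# volblocksize can only be chosen when a new ZVOL is created, e.g.:
zfs create -V 32G -o volblocksize=8k hddpool/vm-100-disk-1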
 
How can I add a ZIL for the hddpool as you suggested? I used the hdsize parameter during installation and gave the SSD OS installation only 30GB of space, so there is ample free space on the SSDs... Do I need to create a partition for the ZIL? From researching online I can only find people using entire dedicated SSDs for the ZIL, etc., nobody who is using their SSD mirror as the Proxmox OS and also as cache.
 
Yes, you would have to create a new partition, and a few GB (<10) are usually way more than enough. So you could increase the size of partition 3, which holds the boot pool (rpool), and then create a small 4th partition which you could try to use as the ZIL. Performance will not be as good as on a dedicated SSD, but probably still better than without it.
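
A minimal sketch of what that could look like, assuming the SSDs are /dev/sda and /dev/sdb and the space after partition 3 is still free (double-check your own device names first):
Code:
# create an 8GB 4th partition in the free space on each SSD
sgdisk -n 4:0:+8G /dev/sda
sgdisk -n 4:0:+8G /dev/sdb
# reload the partition tables
partprobe /dev/sda /dev/sdb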
 
[Attachment: Screen Shot 2020-07-15 at 5.06.18 AM.png]

Should the ZIL be mirrored or not? I'm seeing a lot of contradictory information regarding this. Also, I should make the new partitions the ZFS partition type, correct?

Or should I just add a partition on a single SSD and then add it as a log device to the pool?
 
Should the ZIL be mirrored or not? I'm seeing a lot of contradictory information regarding this.
It's up to you. If you were getting a new SSD for it, I would say go unmirrored. Since you have two, why not mirror it?

Also, I should make the new partitions the ZFS partition type, correct?
Does not really matter TBH.

What does make me wonder, though, is that the SSDs are ~220GB in size and so far only about 40GB are used. Do you intend to use the remaining space for the ZIL? If so, that is way too much. A few GB are usually large enough for the ZIL, and a lot of the time it might not even use 1 GB. It only stores sync writes for a few seconds.
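
For a mirrored log, the two small partitions would be added as one mirror VDEV, roughly like this (assuming the partitions end up as /dev/sda4 and /dev/sdb4 as in the sketch above):
Code:
# add the two SSD partitions as a mirrored log device
zpool add hddpool log mirror /dev/sda4 /dev/sdb4
# verify that the log shows up in the pool layout
zpool status hddpool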
 
I just selected 40GB because I didn't think I'd need much more for just the Proxmox OS; I don't think I've even hit 10GB on that. Storage/VMs will be on the hddpool. I went ahead and created an 8GB partition on both sda and sdb, without formatting a filesystem on these new partitions. I added these to the hddpool with "zpool add hddpool log /dev/sda4 /dev/sdb4".

I believe performance has improved slightly; however, I noticed something that I stated incorrectly before. I transferred my KVM Linux guests from Proxmox, and the sector size inside the guest VM is showing "Sector size (logical/physical): 512 bytes / 512 bytes". This is likely hurting my performance, correct? My HDDs are 4k and the ZFS hddpool block size is 8k (which you said is OK since it's a multiple), but inside the guest VM it is 512... What's the way to convert? I only found an openSUSE guide about this online and I don't think it applies here.
 
