Of course there is. Just add another vdev via
Code:
zpool add rpool mirror /dev/sd[ef]
For more information, please refer to the manpage. That's basically it.
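You can then verify the new layout with a quick check (nothing Proxmox-specific here, and device names are just examples):
Code:
zpool status rpool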
And that's it? No further configuration is required in Proxmox?
Code:
zpool add rpool mirror /dev/sde /dev/sdf
Jup. In case it is your boot pool, you might want to make the new disks also bootable by adapting the instructions here:
https://pve.proxmox.com/wiki/ZFS_on_Linux#_zfs_administration -> "Changing a failed bootable device"
Not really needed, though: by default, when creating a raid10 via the webUI, it will also only create the boot partitions on the first mirror, but not on the second/third/... mirror. Once both disks of the first mirror fail at the same time, you wouldn't be able to boot anymore, as the other disks aren't bootable... but on the other hand, all data on that pool is then lost anyway...
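Adapting those instructions, the steps would look roughly like this (disk names and the ESP partition number are assumptions based on the usual PVE installer layout; UEFI case):
Code:
# hypothetical: replicate the partition layout from an existing bootable disk (sda) to a new one (sde)
sgdisk /dev/sda -R /dev/sde
sgdisk -G /dev/sde
# format and initialize the new ESP (partition 2 in the default PVE layout)
proxmox-boot-tool format /dev/sde2
proxmox-boot-tool init /dev/sde2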
That's basically it.
But...
1.) depending on the number of disks you add, you may want to increase the volblocksize for better performance (see the sketch after this list)
[..]
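For reference, on PVE the volblocksize of newly created zvols comes from the blocksize option of the ZFS storage, so something like this should change the default (the storage name is just an example; existing zvols keep their old volblocksize):
Code:
pvesm set local-zfs --blocksize 16k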
AFAIK, the volblocksize does not matter with a striped mirror setup as much as it matters with RAIDz*.
Just to clarify, with the following setup:
- 10 NVMe disks (RAID 10, striped & mirrored)
- ashift 12 (4k sectors)
5 mirrors * 4k = 20k
Is 16k or 20k recommended for volblocksize?
Code:
zfs set redundant_metadata=most rpool
zfs set xattr=sa rpool
zfs set atime=off rpool
zfs set compression=lz4 rpool
zfs set primarycache=metadata rpool   # only for testing, to not benchmark the RAM :)
ZVOL Block Size | SEQ READ   | RAND READ (IOPS) | SEQ WRITE  | RAND WRITE (IOPS)
16k             | 14.6 GiB/s | 140k             | 13.4 GiB/s | 59.5k
32k             | 21.6 GiB/s | 144k             | 24.0 GiB/s | 42.0k
64k             | 41.0 GiB/s | 146k             | 27.7 GiB/s | 42.4k
128k            | 59.0 GiB/s | 132k             | 30.6 GiB/s | 40.6k
256k            | 69.3 GiB/s | 130k             | 29.2 GiB/s | 41.7k
512k            | 58.1 GiB/s | 124k             | 27.9 GiB/s | 40.9k
1M              | 60.4 GiB/s | 69.9k            | 26.3 GiB/s | 38.5k
Code:
# sequential read/write: 1M blocks, 8 jobs, iodepth 16
fio --ioengine=libaio --direct=1 --name=test --filename=seq_read.fio \
--bs=1M --iodepth=16 --size=1G --rw=read --numjobs=8 --refill_buffers \
--time_based --runtime=30
fio --ioengine=libaio --direct=1 --name=test --filename=seq_write.fio \
--bs=1M --iodepth=16 --size=1G --rw=write --numjobs=8 --refill_buffers \
--time_based --runtime=30
# random read/write: 4k blocks, aggregate IOPS via --group_reporting
fio --ioengine=libaio --direct=1 --name=test --filename=rand_read.fio \
--bs=4K --iodepth=16 --size=1G --rw=randread --numjobs=8 --group_reporting \
--refill_buffers --time_based --runtime=30
fio --ioengine=libaio --direct=1 --name=test --filename=rand_write.fio \
--bs=4K --iodepth=16 --size=1G --rw=randwrite --numjobs=8 --group_reporting \
--refill_buffers --time_based --runtime=30
Based on those results, it seems the best performance could be reached with a block size of 128k - 256k, which is far away from the default of 8k (or the ZFS default of 16k).

Yes, for a synthetic benchmark this is normal: you can see that 4k random read/write cycles are much better with smaller volblocksizes, and they would be better still if you benchmarked with even lower volblocksizes, down to 4k.
Do you really recommend setting the block size to such a high value of 128k / 256k?

That depends heavily on the data used inside the VM. You also need to tune the filesystem in the guest to have aligned and proper blocksizes in order not to get huge read and/or write amplification. If you e.g. store a lot of large files, a bigger volblocksize is better, yet if you store a lot of small files < and << your volblocksize, you will get tremendous write amplification. If the writes come from a database, you will have a lot of sync writes of huge blocks and could potentially get a lot of read amplification if the blocks are not currently in memory.
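As a hedged illustration of what guest-side alignment can look like (device name and sizes are assumptions, not from this thread): ext4 can't use blocks larger than 4k on x86, but its allocator can be nudged to align to the volblocksize via the RAID-stripe options, e.g. for a 16k volblocksize:
Code:
# hypothetical: align ext4 allocations to 16k (4 x 4k filesystem blocks) inside the guest
mkfs.ext4 -b 4096 -E stride=4,stripe-width=4 /dev/vda1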
Code:
zpool add rpool mirror /dev/sd[ef]

Just be aware that this will NOT rebalance existing data in your pool. That means that further writes may have inconsistent and often "poor" performance due to the lack of vdev availability for full stroke writes. In an ideal world, you would export all data out to a temporary space, and then zfs receive it back.
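You can see how unevenly the data ends up spread after adding a vdev by checking the per-vdev allocation:
Code:
zpool list -v rpool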
Just send/receive the data from and to the same pool, delete the original data, and rename the dataset to the previous name. If you have a fine-granular setup regarding datasets, it's no problem.
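A minimal sketch of one such rebalancing pass for a single zvol (dataset names are hypothetical; the guest using the zvol has to be stopped for the switch):
Code:
# snapshot and copy the zvol within the same pool
zfs snapshot rpool/data/vm-100-disk-0@rebalance
zfs send rpool/data/vm-100-disk-0@rebalance | zfs receive rpool/data/vm-100-disk-0-new
# drop the original, give the copy its old name, clean up the snapshot
zfs destroy -r rpool/data/vm-100-disk-0
zfs rename rpool/data/vm-100-disk-0-new rpool/data/vm-100-disk-0
zfs destroy rpool/data/vm-100-disk-0@rebalance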
That assumes that the pool is less than half full; chances are that's not the case for a request to increase the pool size. Moreover, the resulting free space would probably not be distributed across all vdevs anyway, since the pool copy has to reside on the free space, which would be mostly on the added vdev.
Yes, it depends on a few assumptions, yet you can always run multiple rebalancing loops. The upside is that you can do that almost online, or with just a very small downtime window for each dataset, and you don't need double the space overall to copy everything off. If you have the space and a downtime window... do a send/receive to another pool.