How can I set the correct ashift on ZFS ?

batijuank
Nov 16, 2017
I bought 4 Seagate Barracuda ST2000DM008 drives (4096 bytes per sector according to the datasheet) to use in Proxmox 5.4 with ZFS. During installation I chose ashift=12, but after installation I decided to check the bytes per sector using: fdisk -l /dev/sd[abcd]
This gives me:
Code:
Disk /dev/sda: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 6CE03003-5475-4CFC-8ED1-8DBC1DEBB92E

Device       Start        End    Sectors  Size Type
/dev/sda1       34       2047       2014 1007K BIOS boot
/dev/sda2     2048    1050623    1048576  512M EFI System
/dev/sda3  1050624 3907029134 3905978511  1,8T Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.


Disk /dev/sdb: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 3582B747-5524-4BBC-92A3-7637941098DF

Device       Start        End    Sectors  Size Type
/dev/sdb1       34       2047       2014 1007K BIOS boot
/dev/sdb2     2048    1050623    1048576  512M EFI System
/dev/sdb3  1050624 3907029134 3905978511  1,8T Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.


Disk /dev/sdc: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 42B65B40-3F48-6345-8339-DAD622956949

Device          Start        End    Sectors  Size Type
/dev/sdc1        2048 3907012607 3907010560  1,8T Solaris /usr & Apple ZFS
/dev/sdc9  3907012608 3907028991      16384    8M Solaris reserved 1


Disk /dev/sdd: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D109985F-8F3D-CF45-B312-102D05B6C264

Device          Start        End    Sectors  Size Type
/dev/sdd1        2048 3907012607 3907010560  1,8T Solaris /usr & Apple ZFS
/dev/sdd9  3907012608 3907028991      16384    8M Solaris reserved 1

Here I see that each disk's sector size is 512 bytes / 4096 bytes (logical/physical), yet fdisk reports sector units of 512 bytes. My question is: should I have chosen ashift=9 instead of ashift=12 for these 4 HDDs? I'm using them in a RAID10 with ZFS.
 
This means your ashift=12 value is correct. You would get slower performance by using a smaller ashift value on 4K drives.
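For reference, ashift is the base-2 logarithm of the sector size ZFS assumes for a vdev, so the two values under discussion work out as follows (a quick arithmetic sketch):

```shell
# ashift is log2 of the sector size ZFS uses for the vdev
echo $((1 << 9))    # ashift=9  -> 512-byte sectors
echo $((1 << 12))   # ashift=12 -> 4096-byte sectors
```

With ashift=12, ZFS never issues a write smaller than the drive's 4K physical sector, which is why it is the right choice here.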

Thanks for helping

Then why is the "Units: sectors of 1 * 512 = 512 bytes" line saying that each sector is 512 bytes? What does that mean?
 
That is the logical size; the physical size is what matters for ZFS, and it has been 4K for almost a decade. Your drive uses the 512e format, which internally uses 4K sectors but presents 512-byte sectors to the OS for compatibility reasons. More information is here: https://en.wikipedia.org/wiki/Advanced_Format
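The two sizes fdisk prints come straight from the kernel's block layer, and you can read them from sysfs directly. A minimal sketch (the `sda` path is an assumption; substitute your own disk, and the fallback message covers machines without it):

```shell
# Read the sector sizes the kernel reports for a disk.
# On a 512e drive the logical size is 512 and the physical size is 4096.
for f in logical_block_size physical_block_size; do
  # fall back to a message on machines without /dev/sda
  cat "/sys/block/sda/queue/$f" 2>/dev/null || echo "$f: no sda on this machine"
done
```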
 

According to that link:
The translation process is more complicated when writing data that is either not a multiple of 4K or not aligned to a 4K boundary. In these instances, the hard drive must read the entire 4096-byte sector containing the targeted data into internal memory, integrate the new data into the previously existing data and then rewrite the entire 4096-byte sector onto the disk media.

I'm seeing here:
Partition 1 does not start on physical sector boundary.

That my partitions are not aligned to a 4K boundary. Before this setup I had 4 Seagate ST1000DM003 drives on ZFS with ashift=9, and I had better performance (a lower I/O delay percentage).

Am I missing something here?
 

Partition 1 is the BIOS boot partition; it's not used by ZFS. The other partitions are aligned correctly.
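The alignment warning can be checked by hand: a partition is 4K-aligned when its start sector multiplied by the 512-byte logical sector size is a multiple of 4096. A minimal sketch using the start sectors of sda1, sda2, and sda3 from the fdisk output above:

```shell
logical=512   # logical sector size reported by the drive
for start in 34 2048 1050624; do   # start sectors of sda1, sda2, sda3
  if [ $(( start * logical % 4096 )) -eq 0 ]; then
    echo "sector $start: 4K-aligned"
  else
    echo "sector $start: NOT 4K-aligned"
  fi
done
# sector 34: NOT 4K-aligned        <- the BIOS boot partition fdisk warns about
# sector 2048: 4K-aligned
# sector 1050624: 4K-aligned       <- the ZFS partition is fine
```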
 
Please post fdisk -l /dev/<yourdisk>

Bash:
# fdisk -l
Disk /dev/sda: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 6CE03003-5475-4CFC-8ED1-8DBC1DEBB92E

Device       Start        End    Sectors  Size Type
/dev/sda1       34       2047       2014 1007K BIOS boot
/dev/sda2     2048    1050623    1048576  512M EFI System
/dev/sda3  1050624 3907029134 3905978511  1,8T Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.


Disk /dev/sdb: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 3582B747-5524-4BBC-92A3-7637941098DF

Device       Start        End    Sectors  Size Type
/dev/sdb1       34       2047       2014 1007K BIOS boot
/dev/sdb2     2048    1050623    1048576  512M EFI System
/dev/sdb3  1050624 3907029134 3905978511  1,8T Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.


Disk /dev/sdc: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 42B65B40-3F48-6345-8339-DAD622956949

Device          Start        End    Sectors  Size Type
/dev/sdc1        2048 3907012607 3907010560  1,8T Solaris /usr & Apple ZFS
/dev/sdc9  3907012608 3907028991      16384    8M Solaris reserved 1


Disk /dev/sdd: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D109985F-8F3D-CF45-B312-102D05B6C264

Device          Start        End    Sectors  Size Type
/dev/sdd1        2048 3907012607 3907010560  1,8T Solaris /usr & Apple ZFS
/dev/sdd9  3907012608 3907028991      16384    8M Solaris reserved 1




Disk /dev/zd0: 74,4 GiB, 79856402432 bytes, 155969536 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0xd990d990

Device     Boot    Start       End  Sectors  Size Id Type
/dev/zd0p1 *        2048  61435903 61433856 29,3G  7 HPFS/NTFS/exFAT
/dev/zd0p2      61435904 155965439 94529536 45,1G  f W95 Ext'd (LBA)
/dev/zd0p5      61437952 155965439 94527488 45,1G  7 HPFS/NTFS/exFAT


Disk /dev/zd16: 32 GiB, 34359738368 bytes, 67108864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x90f4c784

Device      Boot    Start      End  Sectors Size Id Type
/dev/zd16p1 *        2048 62916607 62914560  30G 83 Linux
/dev/zd16p2      62918654 67106815  4188162   2G  5 Extended
/dev/zd16p5      62918656 67106815  4188160   2G 82 Linux swap / Solaris

Partition 2 does not start on physical sector boundary.


Disk /dev/zd32: 200 GiB, 214748364800 bytes, 419430400 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x000c400a

Device      Boot   Start       End   Sectors  Size Id Type
/dev/zd32p1 *       2048   2099199   2097152    1G 83 Linux
/dev/zd32p2      2099200 419430399 417331200  199G 83 Linux


Disk /dev/zd48: 40 GiB, 42949672960 bytes, 83886080 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x000c400a

Device      Boot   Start      End  Sectors Size Id Type
/dev/zd48p1 *       2048  2099199  2097152   1G 83 Linux
/dev/zd48p2      2099200 83886079 81786880  39G 8e Linux LVM


Disk /dev/zd64: 40 GiB, 42949672960 bytes, 83886080 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x000c400a

Device      Boot   Start      End  Sectors Size Id Type
/dev/zd64p1 *       2048  2099199  2097152   1G 83 Linux
/dev/zd64p2      2099200 83886079 81786880  39G 8e Linux LVM


Disk /dev/zd80: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x7b677b67

Device      Boot Start       End   Sectors  Size Id Type
/dev/zd80p1 *       56 209687519 209687464  100G  7 HPFS/NTFS/exFAT

Partition 1 does not start on physical sector boundary.


Disk /dev/zd96: 40 GiB, 42949672960 bytes, 83886080 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x000c400a

Device      Boot   Start      End  Sectors Size Id Type
/dev/zd96p1 *       2048  2099199  2097152   1G 83 Linux
/dev/zd96p2      2099200 83886079 81786880  39G 8e Linux LVM


Disk /dev/zd112: 40 GiB, 42949672960 bytes, 83886080 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x000c400a

Device       Boot   Start      End  Sectors Size Id Type
/dev/zd112p1 *       2048  2099199  2097152   1G 83 Linux
/dev/zd112p2      2099200 83886079 81786880  39G 8e Linux LVM


Disk /dev/zd128: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x000ca6fe

Device       Boot   Start       End   Sectors Size Id Type
/dev/zd128p1 *       2048   2099199   2097152   1G 83 Linux
/dev/zd128p2      2099200 209715199 207616000  99G 83 Linux


Disk /dev/zd144: 2 TiB, 2199023255552 bytes, 4294967296 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: gpt
Disk identifier: 341AE166-2D40-49D7-B407-738DC1F92D52

Device       Start        End    Sectors Size Type
/dev/zd144p1  2048 4294967262 4294965215   2T Linux filesystem


Disk /dev/zram0: 602,3 MiB, 631595008 bytes, 154198 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zram1: 602,3 MiB, 631595008 bytes, 154198 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zram2: 602,3 MiB, 631595008 bytes, 154198 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zram3: 602,3 MiB, 631595008 bytes, 154198 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zram4: 602,3 MiB, 631595008 bytes, 154198 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zram5: 602,3 MiB, 631595008 bytes, 154198 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zram6: 602,3 MiB, 631595008 bytes, 154198 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zram7: 602,3 MiB, 631595008 bytes, 154198 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

I'm having low performance; I would really appreciate any help.
 
Sorry, I hadn't looked at the post again; you had already provided the information.

Your ZFS partition starts at sector 2048 (so 1 MiB, which is a multiple of 4K), so it's totally fine.
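The 1 MiB figure comes from multiplying the start sector by the 512-byte logical sector size; the zero remainder against 4096 confirms the alignment:

```shell
echo $(( 2048 * 512 ))          # start offset in bytes: 1048576 = 1 MiB
echo $(( 2048 * 512 % 4096 ))   # remainder against 4K: 0, so aligned
```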

So what might be causing the low performance issue? Whenever I make a big copy (50 GB or more), the copy speed drops suddenly, and after a while it goes back up again; this repeats over and over. Some of my KVM guests also start to have low performance.

Could you help me?
 
So what might be causing the low performance issue?

The ashift is correct, so that's not it. The original question is therefore solved.

Whenever I make a big copy (50 GB or more), the copy speed drops suddenly, and after a while it goes back up again; this repeats over and over. Some of my KVM guests also start to have low performance.

That's another problem, and not so uncommon. Most of the time it's based on not-so-optimal hardware. Could you please post the output of zpool list and zpool status -v?
 
@LnxBil thanks for helping out, here is the output

Could you please post the output of zpool list

Code:
# zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
rpool  3,62T  1,73T  1,90T         -    16%    47%  1.00x  ONLINE  -

and zpool status -v

Code:
# zpool status -v
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 11h28m with 0 errors on Sun Sep 22 13:28:09 2019
config:

    NAME                                 STATE     READ WRITE CKSUM
    rpool                                ONLINE       0     0     0
      mirror-0                           ONLINE       0     0     0
        sda3                             ONLINE       0     0     0
        sdb3                             ONLINE       0     0     0
      mirror-1                           ONLINE       0     0     0
        sdc                              ONLINE       0     0     0
        ata-ST2000DM008-2FR102_WFL11A52  ONLINE       0     0     0

errors: No known data errors
 
Okay, no obvious pitfalls.

What you can try: copy the file again, and while it is running monitor your pool I/O with zpool iostat rpool 5 and ARC usage with arcstat 5, then post the results back.
 
Most of the time it's based on not so optimal hardware

This is the ST2000DM008 (I'm using 4 of them in a RAID10 configuration) datasheet and product manual.
I'm using these HDDs because the datasheet says they have 4096 bytes per sector and a 220 MB/s max sustained transfer rate (OD).

The motherboard is an ASUS P6T.
The CPU is an Intel(R) Core(TM) i7 950 @ 3.07 GHz.
RAM is 6 × 4 GB Kingston = 24 GB (datasheet).
 
The most obvious reason for your bad performance is that you are using ZFS against recommendation. A ZFS pool should always be made of whole disks, not partitions. I would add another disk for the system and boot partitions and then recreate the ZFS pool on the 4 whole disks.
 
The most obvious reason for your bad performance is that you are using ZFS against recommendation. A ZFS pool should always be made of whole disks, not partitions. I would add another disk for the system and boot partitions and then recreate the ZFS pool on the 4 whole disks.

This ZFS layout was the result of the Proxmox installation. I don't remember an option in the Proxmox installer to do what you are suggesting; could you please elaborate?
 
You probably expect too much from a COW filesystem on spinning disks. 24 GB of RAM is also not very much for ZFS + KVM usage, but you can troubleshoot further with iostat; most likely your disks are simply busy with random access to metadata.
 
This ZFS layout was the result of the Proxmox installation. I don't remember an option in the Proxmox installer to do what you are suggesting; could you please elaborate?
Just choose other disks for the system during installation. After installation you can create your data pool on the disks assigned for that purpose.
 
Just choose other disks for the system during installation. After installation you can create your data pool on the disks assigned for that purpose.

@mir do you mean that using the default PVE and ZoL setup with a GRUB + EFI partition is not good, or could lead to performance problems?
 
