Custom zfs arguments during installation?

Sakis

I know that we can change simple ext4 arguments in the Proxmox 4 ISO installer. Are there any equivalent options for ZFS? I am interested in the size and ashift values. Apparently I have the same problem as: https://forum.proxmox.com/threads/zfs-change-ashift-after-pool-creation.25613/

Software RAID is not supported.
I don't want to install on top of Debian since the Proxmox team doesn't recommend this.

I recently did several installations (around 20) with the default ZFS options, and already (after 1 month of uptime) I have 5 brand-new SSD failures! I can only blame the ashift value. The disks that broke are in nodes/pools that I used for a couple of VM backups on local storage. One node had both SSDs fail!

root@prox_n3:~# zpool get all | grep ashift
rpool ashift 12 local
root@prox_n3:~# hdparm -I /dev/sdb | grep Sector
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes
 
ashift=12 is the standard on ZFS on Linux. It's there for a good reason, and the design change was not made lightly.
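
For reference, ashift is the base-2 logarithm of the smallest block ZFS will write to a vdev, so ashift=12 means 4096-byte blocks and ashift=9 would match the 512-byte sectors reported by hdparm above. A minimal sketch of the conversion (the function name `ashift_to_bytes` is just an illustrative helper, not a ZFS tool):

```shell
# Convert an ashift value to the block size in bytes (2^ashift).
ashift_to_bytes() {
    echo $(( 1 << $1 ))
}

ashift_to_bytes 9    # prints 512
ashift_to_bytes 12   # prints 4096
```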

It has nothing to do with the SSDs failing. Perhaps you can tell us more about which SSDs you're using and what type of workload you have on them.

When you say failed, what do you mean? Did they fail physically, or did they reach end of life (max TBW)?
 
Here are some specs of my hardware:

The server is a Supermicro 1028U-TN10RT+ and the disks are 2 x SSD-DM064-PHI. These are SATA DOM (SuperDOM) disks that are used to boot the OS, which frees up 2 disk slots in the front. They use the following controller, which is quite popular: http://www.phison.com/English/newProductView.asp?ID=239&SortID=63

The only intensive disk activity I did (apart from 2-3 OS reinstalls) was 4 backups of a 40G KVM guest hosted in a Ceph cluster and then 5 to 7 restores to the same cluster. The Ceph nodes are different machines from the compute ones. I never reached more than 80% usage of the disk during backups.

I know that the ZFS ashift value doesn't hurt them a lot, but it needs to be aligned with the sectors. I might have had low IOPS if I had used this as storage for VMs.

The disks are physically damaged. Even the BIOS doesn't show them. The slots appear "free", except for one that is shown as a 20MB disk!
I used them for no more than 20 days.
 
You should use your warranty with Supermicro. There's nothing ZFS could do to break the disks. Why do you think that it was ashift in the first place? Why do you think that NTFS, ext4 or other filesystems will write in anything but 4K sectors?

And BTW, SSDs have physical block sizes ranging from 16K to 128K, depending on their flash memory and the exact way it's arranged. They still do extensive gimmicks to emulate 512-byte and 4K sectors to the upper layers... so no... it was not ashift=12 that killed them.

BTW: It is "automagically" aligned by default.
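
If you want to verify the alignment yourself, here is a rough sketch. It assumes the partition start is read from /sys/block/<disk>/<part>/start, which the kernel reports in 512-byte units; the helper name `is_aligned` and the choice of sda2 are only for illustration.

```shell
# Succeeds if a partition's start offset is a multiple of the ZFS block size.
# $1 = start sector in 512-byte units, $2 = ashift (12 means 4096-byte blocks)
is_aligned() {
    [ $(( ($1 * 512) % (1 << $2) )) -eq 0 ]
}

# Example: check sda2 if it exists on this machine (device name is an assumption).
start_file=/sys/block/sda/sda2/start
if [ -r "$start_file" ]; then
    if is_aligned "$(cat "$start_file")" 12; then
        echo "sda2 is aligned for ashift=12"
    else
        echo "sda2 is NOT aligned for ashift=12"
    fi
fi
```

A start sector of 2048 (1 MiB) passes for ashift=12, while an old-style start at sector 34 would not.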

Might have been bad firmware or bad batch or both, but they're supposed to last way longer than what you got out of them, under much worse conditions, so I vote for a bad batch.

Also, you did not give me the TBW value of the disks. It will tell you something. If possible get it from the other disks that are still alive.

Cheers.
 
Hey, I was a little busy. Unfortunately, 3 more disks have failed since my last reply. Here are some results from different machines:

In this server sdb has failed; sda keeps running.
Code:
root@prox_n7:/var/log# awk '/sd/ {print $3"\t"$10 / 2 / 1024}' /proc/diskstats
sda    82472.2
sda1    0
sda2    82472.2
sda9    0
sdb    53382.4
sdb1    0
sdb2    53382.4
sdb9    0

In an OK server:
Code:
root@prox_n3:~# awk '/sd/ {print $3"\t"$10 / 2 / 1024}' /proc/diskstats
sda    164347
sda1    0
sda2    164347
sda9    0
sdb    164347
sdb1    0
sdb2    164347
sdb9    0

In another OK server with less usage:
Code:
root@prox_n6:~# awk '/sd/ {print $3"\t"$10 / 2 / 1024}' /proc/diskstats
sda    93906.4
sda1    0
sda2    93906.4
sda9    0
sdb    93906.4
sdb1    0
sdb2    93906.4
sdb9    0
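
A note on reading these numbers: column 10 of /proc/diskstats is the count of sectors written since boot, in 512-byte units, so dividing by 2 and then by 1024 gives MiB written since the last reboot, not lifetime writes. A slightly tidier sketch of the same one-liner that skips the partition rows (the function name `written_mib` is just an illustrative helper):

```shell
# Print MiB written since boot for whole sdX disks only.
# Expects /proc/diskstats-formatted lines on stdin.
written_mib() {
    awk '$3 ~ /^sd[a-z]+$/ { printf "%s\t%.1f\n", $3, $10 / 2 / 1024 }'
}

# Run against the live counters when they are available.
if [ -r /proc/diskstats ]; then
    written_mib < /proc/diskstats
fi
```

Because the counters reset at boot, they only give a lower bound on what the disks have absorbed over their lifetime.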

smartctl on prox_n6
Code:
root@prox_n6:~# smartctl -a /dev/sdb
smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.2.6-1-pve] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     SATA SSD
Firmware Version: S9FM02.1
User Capacity:    64,023,257,088 bytes [64.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      MicroSSD
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Mar 18 09:27:17 2016 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       658
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       12
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
169 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       50
170 Unknown_Attribute       0x0013   100   100   010    Pre-fail  Always       -       31
173 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       13238285
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0023   070   070   000    Pre-fail  Always       -       30
196 Reallocated_Event_Count 0x0000   100   100   000    Old_age   Offline      -       0
218 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
231 Temperature_Celsius     0x0013   100   100   000    Pre-fail  Always       -       99
233 Media_Wearout_Indicator 0x0013   100   100   000    Pre-fail  Always       -       535935
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       287687
242 Total_LBAs_Read         0x0012   100   100   000    Old_age   Always       -       6072
246 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       114597

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%       658         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
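
To compare write counters across nodes, it may help to pull a single raw value out of the attribute table. A minimal sketch, assuming the 10-column attribute layout shown above (the helper name `smart_raw` is made up for illustration). Note that this drive is not in the smartctl database, so the unit of attribute 241 is vendor-specific and should not be assumed to be 512-byte LBAs.

```shell
# Print the RAW_VALUE column for one SMART attribute ID.
# Expects smartctl attribute-table lines on stdin.
smart_raw() {
    awk -v id="$1" '$1 == id { print $10 }'
}

# Typical use on a node (requires smartmontools and root):
#   smartctl -A /dev/sdb | smart_raw 241
```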

I contacted Supermicro. We are in an RMA procedure, but there is no reply from them yet on what could actually have caused it.

As I read more and more about ZFS, I also conclude that it shouldn't be the main reason. But at the rate the disks are failing, I will probably have to replace them all just to be sure.

I attached some logs from the node where one disk is still running but the other has failed.
 

I can't see any evidence that you did something wrong. They're brand new too (in terms of TBW). Probably a bad batch or a bad product.
 
Searching the BIOS settings, I saw that the SATA controller has an option where, for each drive that appears connected, you choose either Hard disk drive or Solid state disk. I am not sure what it changes, but I will choose Solid state and reboot all machines.

Luckily, there is also hardware RAID available for this controller (enabling it was not obvious at first glance) and I will evaluate this option too. Although the GParted live CD, CentOS etc. can see the RAID volume0, the Proxmox installation ISO can only detect /dev/sd[ab]... but that is irrelevant to this thread. pfff