ext4 + barriers=0: to do or not to do ?

jinjer

Hi,

I was investigating a difference I'm seeing between two Proxmox servers I work on. They have very similar hardware (SATA disks in soft-RAID, Proxmox 3.1, similar specs). One of them uses ext3 as the filesystem, while the other (the beefier machine) uses ext4.

The ext3 server did 560 fsyncs/sec. The ext4 server did 51 fsyncs/sec (ouch). I remembered the ext4 barriers issue and mounted with barrier=0. The result was an astonishing 1171 fsyncs/sec.

So, the default Proxmox partition is ext3, which is mounted without barriers, while ext4 has barriers enabled by default. I am thinking about using ext4 with no barriers. It should be no worse than the default ext3 everybody has been using for a decade. Sure, it's only soft-RAID, but the server is hosted in a web farm, so power supply should be no problem. Do you think this is a bad idea?

jinjer

ADDED: There is a problem with new lines in most of my posts, using different browsers. Is this a vBulletin problem or mine?
 

You should only use nobarrier if you have a means to ensure data integrity on power outages (BBU on HW RAID, for instance), or if you simply don't care. I have experienced multiple power outages affecting various hosting companies and wouldn't count on power availability. 51 fsyncs/sec does seem too slow, however. Did you optimize the stride/stripe-width already?

Cheers,
Stephan
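
For readers unfamiliar with the stride/stripe-width tuning mentioned above, here is a minimal sketch of the usual arithmetic. Both values are expressed in filesystem blocks. The function name `raid_geometry` and the example numbers (512 KiB chunk, 4 KiB blocks, two data-bearing disks) are assumptions for illustration and are not taken from this thread:

```shell
#!/bin/sh
# raid_geometry CHUNK_KIB BLOCK_KIB DATA_DISKS  (hypothetical helper)
#   stride       = RAID chunk size / filesystem block size
#   stripe-width = stride * number of data-bearing disks
raid_geometry() {
    stride=$(( $1 / $2 ))
    echo "stride=${stride},stripe-width=$(( stride * $3 ))"
}

raid_geometry 512 4 2   # prints stride=128,stripe-width=256
```

The resulting string is the kind of value normally passed to mke2fs via -E when the filesystem is created on a striped array.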
 
Since kernel 2.6.39, ext4 has a default setting of nobarrier due to a completely different VFS implementation. Red Hat has backported this patch to their 2.6.32 kernel, so using nobarrier for ext4 is as safe or unsafe as using nobarrier on ext3. Since the Proxmox team has changed the default mount options for ext3 to nobarrier, using it for ext4 should be equally safe as well. The reason why this is not the case for ext4, IMHO, is that the official Proxmox install uses ext3.
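
Rather than relying on what the defaults are supposed to be, the options actually in effect can be read from /proc/mounts. A minimal sketch, assuming the running kernel prints `barrier=0` or `nobarrier` for filesystems mounted without barriers; the function name `list_nobarrier` is made up for illustration. An entry with no barrier option at all means "kernel default", which this sketch does not try to interpret:

```shell
#!/bin/sh
# list_nobarrier [FILE]  (hypothetical helper)
# Print "mountpoint fstype" for every ext3/ext4 entry in a
# /proc/mounts-style FILE whose options explicitly disable barriers.
list_nobarrier() {
    awk '$3 == "ext3" || $3 == "ext4" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "barrier=0" || opts[i] == "nobarrier") {
                print $2, $3
                break
            }
    }' "${1:-/proc/mounts}"
}
```

Called without arguments, `list_nobarrier` reads /proc/mounts on the local host.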
 
You should use ext3, as this is supported and recommended by Proxmox.
I had ext4 on one host which ran fine for about 4 months, and then I ran into problems where backups did not work: they hung until the host was rebooted. There were some ext4 issues in the syslog, and I switched back to ext3.
 

I have not optimized stride/stripe-width, as this is a single RAID1 on two disks which are only used for booting Proxmox and hosting some service containers (monitoring and such). All the real VMs reside on a separate RAID array.

The RAID1 array should be able to do around 100 IOPS, so 50 fsyncs/sec means two writes per fsync: one journal write and one real write (provided caching is not enabled). Does that look right?
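
The back-of-the-envelope estimate above can be written out as a one-line calculation. This is only a sketch of the poster's reasoning (an assumed IOPS budget divided by an assumed number of writes per fsync), not a measurement; the function name `fsyncs_per_sec` is made up for illustration:

```shell
#!/bin/sh
# fsyncs_per_sec IOPS WRITES_PER_FSYNC  (hypothetical helper)
# Integer estimate of the fsync rate a device can sustain.
fsyncs_per_sec() {
    echo $(( $1 / $2 ))
}

fsyncs_per_sec 100 2   # journal write + data write per fsync; prints 50
```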
 
Thanks for sharing this information. I'll keep ext4 with no-barriers then. After all this is supposed to be a cluster, and one node can fail and still not affect the rest.
 
Another tip: if you are using SSDs, don't use the mount option discard. Using the discard mount option disables all write caching and makes two real writes per flush, and a flush will be used for all writes. The penalty is a performance degradation by a factor of 10. Instead, install a cron script like this:
Code:
# cat /etc/cron.daily/fstrim
#!/bin/sh
# Daily TRIM of the mounted filesystems, run at low I/O priority (ionice level 7).

PATH=/bin:/sbin:/usr/bin:/usr/sbin

ionice -n7 fstrim -v /
ionice -n7 fstrim -v /var/lib/vz
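
Before switching to the cron approach, it may help to find out which filesystems are still configured with the discard mount option. A minimal sketch, assuming a standard fstab layout with the options in the fourth field; the function name `list_discard_mounts` is made up for illustration:

```shell
#!/bin/sh
# list_discard_mounts [FILE]  (hypothetical helper)
# Print the mount point of every non-comment entry in an fstab-style
# FILE that carries the "discard" mount option.
list_discard_mounts() {
    awk '$1 !~ /^#/ && NF >= 4 {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "discard") {
                print $2
                break
            }
    }' "${1:-/etc/fstab}"
}
```

Called without arguments, `list_discard_mounts` reads /etc/fstab. Any mount point it prints would be a candidate for dropping the option and relying on the periodic fstrim run instead.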
 
Nice :)

 

Thanks mir for the hint, but when I did as suggested, I ran into this error:
Code:
fstrim: /: FITRIM ioctl failed: Operation not supported

Are there any ideas on how to fix it?
Thanks in advance!

Just for your reference, the 'pveversion' output:
Code:
# pveversion -v
proxmox-ve-2.6.32: 3.2-121 (running kernel: 2.6.32-27-pve)
pve-manager: 3.2-1 (running version: 3.2-1/1933730b)
pve-kernel-2.6.32-27-pve: 2.6.32-121
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-15
pve-firmware: 1.1-2
libpve-common-perl: 3.0-14
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-4
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
 
Are you sure you are using ext4 as the filesystem?

Paste the output of: cat /proc/mounts
Code:
# cat /proc/mounts
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=10240k,nr_inodes=4097993,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=3280388k,mode=755 0 0
/dev/disk/by-uuid/f42d2c8b-db58-44b4-8464-e4ef63f26619 / ext4 rw,noatime,errors=remount-ro,commit=200,barrier=0,data=ordered 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,noatime,size=5120k 0 0
tmpfs /run/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=8237660k 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
/dev/md1 /boot ext2 rw,relatime,errors=continue,user_xattr,acl 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,noatime 0 0
tmpfs /var/tmp tmpfs rw,nosuid,nodev,noatime 0 0
tmpfs /var/spool/postfix tmpfs rw,nosuid,nodev,noatime 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
/dev/fuse /etc/pve fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
beancounter /proc/vz/beancounter cgroup rw,relatime,blkio,name=beancounter 0 0
container /proc/vz/container cgroup rw,relatime,freezer,devices,name=container 0 0
fairsched /proc/vz/fairsched cgroup rw,relatime,cpuacct,cpu,cpuset,name=fairsched 0 0

In case it helps, here is the fstab as well:
Code:
# cat /etc/fstab
proc /proc proc defaults 0 0
/dev/md/0 none swap sw 0 0
/dev/md/1 /boot ext2 defaults 0 0
/dev/md/2 / ext4 noatime,barrier=0,errors=remount-ro,commit=200 0 1
tmpfs /tmp tmpfs nodev,nosuid,noatime,mode=1777 0 0
tmpfs /var/tmp tmpfs nodev,nosuid,noatime 0 0
tmpfs /var/lock tmpfs nodev,nosuid,noatime 0 0
tmpfs /var/spool/postfix tmpfs nodev,nosuid,noatime 0 0

This whole setup runs on mdadm RAID-1 on SSDs.
 
Yes, I ran the command as root and also made the 'fstrim' script file executable:
Code:
# chmod a+x /etc/cron.daily/fstrim
# /etc/cron.daily/fstrim
       fstrim: /: FITRIM ioctl failed: Operation not supported
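
One possible reason for "Operation not supported" is that a layer below the filesystem does not pass discard requests through. A minimal sketch for checking this, assuming the kernel exposes the `discard_max_bytes` queue attribute in sysfs for the device; the function name `supports_discard` and the device name `md2` in the usage note are assumptions for illustration:

```shell
#!/bin/sh
# supports_discard DEV [SYSFS_ROOT]  (hypothetical helper)
# Succeed if block device DEV (kernel name, e.g. sda) advertises a
# non-zero discard_max_bytes; fail if it is 0 or the file is missing.
supports_discard() {
    f="${2:-/sys/block}/$1/queue/discard_max_bytes"
    [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}
```

For example, `supports_discard md2 || echo "md2 does not advertise discard"` would report whether the RAID device itself accepts discards, independently of what the underlying SSDs support.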
 
OK, thank you for the link.

Is there a chance that another TRIM option, such as 'discard' in /etc/fstab, will work on mdraid instead?
 
My apologies if this further question goes beyond the original topic:
- Which TRIM option would you suggest enabling on a Proxmox 3.2 host, bearing in mind that it has 2x SSD running in 'mdadm' software RAID-1 with ext4 + barrier=0?
 