Really slow sequential write with 8 x 10kRPM SAS disks on HW raid

mailinglists

Renowned Member
Mar 14, 2012
Hi guys,

On an HP DL580 G5 with a P800 BBWC HW RAID controller in a RAID 10 setup using 8 x 10kRPM SAS disks, running 4.3 (4.3-9/f7c6f0cd, kernel 4.4.21-1-pve) with ext4 on thin LVM (default install), we get really poor disk write speeds on the host (as well as in VMs).

Here are a few simple tests which show write speeds of around 70 MB/s !?! inside the host (not from a VM):
Code:
fio --filename=brisi --sync=1 --rw=write --bs=10M --numjobs=1 --iodepth=1 --size=2000MB --name=test
...
Run status group 0 (all jobs):
  WRITE: io=2000.0MB, aggrb=69567KB/s, minb=69567KB/s, maxb=69567KB/s, mint=29439msec, maxt=29439msec
Disk stats (read/write):
    dm-0: ios=0/2668, merge=0/0, ticks=0/209376, in_queue=209904, util=87.09%, aggrios=126/2423, aggrmerge=0/261, aggrticks=272/210032, aggrin_queue=210300, aggrutil=87.07%
  cciss!c0d0: ios=126/2423, merge=0/261, ticks=272/210032, in_queue=210300, util=87.07%

dd if=/dev/zero of=brisi bs=100M count=30 oflag=dsync
30+0 records in
30+0 records out
3145728000 bytes (3.1 GB) copied, 36.3609 s, 86.5 MB/s

pveperf 
CPU BOGOMIPS:      115201.14
REGEX/SECOND:      913465
HD SIZE:           32.36 GB (/dev/dm-0)
BUFFERED READS:    360.37 MB/sec
AVERAGE SEEK TIME: 4.55 ms
FSYNCS/SECOND:     2600.34
DNS EXT:           34.94 ms
DNS INT:           1.62 ms

I was expecting sequential write speeds of around 400 MB/s and reads of around 800 MB/s.
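That expectation matches what simple striping math predicts (a back-of-envelope sketch; the ~100 MB/s per-drive streaming rate is an assumption, not a measured value):

```shell
# Rough RAID 10 throughput estimate for an 8-drive array; per_disk is
# an assumed streaming rate for a 10k SAS drive, not a measurement.
disks=8
per_disk=100   # MB/s, assumed
echo "RAID10 seq write ~ $(( disks / 2 * per_disk )) MB/s"   # mirrors halve write bandwidth
echo "RAID10 seq read  ~ $(( disks * per_disk )) MB/s"       # reads can hit all spindles
```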

The HW controller is configured for write-back caching.
Code:
=> controller slot=8 show detail
Smart Array P800 in Slot 8
  Bus Interface: PCI
  Slot: 8
  Serial Number: X
  Cache Serial Number: X
  RAID 6 (ADG) Status: Enabled
  Controller Status: OK
  Hardware Revision: E
  Firmware Version: 7.22
  Rebuild Priority: Medium
  Expand Priority: Medium
  Surface Scan Delay: 15 secs
  Surface Scan Mode: Idle
  Queue Depth: Automatic
  Monitor and Performance Delay: 60  min
  Elevator Sort: Enabled
  Degraded Performance Optimization: Disabled
  Inconsistency Repair Policy: Disabled
  Wait for Cache Room: Disabled
  Surface Analysis Inconsistency Notification: Disabled
  Post Prompt Timeout: 0 secs
  Cache Board Present: True
  Cache Status: OK
  Cache Ratio: 25% Read / 75% Write
  Drive Write Cache: Disabled
  Total Cache Size: 512 MB
  Total Cache Memory Available: 456 MB
  No-Battery Write Cache: Disabled
  Cache Backup Power Source: Batteries
  Battery/Capacitor Count: 2
  Battery/Capacitor Status: OK
  SATA NCQ Supported: True

I have no idea what else to check. I have disabled barriers in the FS:
Code:
cat /etc/fstab  | grep ext4
/dev/pve/root / ext4 errors=remount-ro,barrier=0 0 1
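fstab only states intent; whether barrier=0 actually took effect is visible in the kernel's own view of the mounts (a quick check, assuming the standard /proc layout):

```shell
# Show the root mount as the kernel sees it; the options field should
# contain "barrier=0" (or "nobarrier") if the setting took effect.
grep ' / ' /proc/mounts
```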

Any ideas would be greatly appreciated.
 
Read speed seems acceptable:
Code:
# /sbin/sysctl -w vm.drop_caches=3
# dd if=brisi of=/dev/null bs=10M
300+0 records in
300+0 records out
3145728000 bytes (3.1 GB) copied, 7.87702 s, 399 MB/s
 
Why would you do that considering OP's controller contains a working BBU?
A BBU (or supercapacitor) attached to the controller protects only the controller cache, NOT the hard-drive (or SSD) cache! Once the controller has sent data to a disk, it has no way of knowing whether the data is still in the disk's cache or already written to the platters. If a disk uses its own cache even for write ops, it reports data as "written" the moment it lands in the disk cache.

And that is exactly why some RAID controllers automatically turn the disks' write cache off: to avoid data corruption in case of power loss.
 
Hi guys. Thank you for your replies.

As you have already figured out, I have the write cache enabled on the BBU-backed RAID card, and I will keep the write cache on the drives themselves disabled, even though we have a parallel UPS cluster backed by a diesel generator.
Code:
  Cache Status: OK -> the cache on the RAID card.
  Cache Ratio: 25% Read / 75% Write
  Drive Write Cache: Disabled -> the cache on the disks themselves.

I did some more testing on this install, and what I found is that if I write directly to a partition, I get around 170 MB/s average sequential write speed. As soon as I put ext3 or ext4 on it, it drops to around 80 MB/s. I will continue to investigate. All ideas and suggestions are welcome.

Code:
# mount | grep data1
/dev/cciss/c0d0p4 on /data1 type ext3 (rw,relatime,data=ordered)
# dd if=/dev/zero of=/data1/brisi bs=100MB count=10 oflag=dsync
10+0 records in
10+0 records out
1000000000 bytes (1.0 GB) copied, 13.7555 s, 72.7 MB/s
# umount /data1
# dd if=/dev/zero of=/dev/cciss/c0d0p4  bs=100MB count=10 oflag=dsync
10+0 records in
10+0 records out
1000000000 bytes (1.0 GB) copied, 5.9946 s, 167 MB/s
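One way to narrow down whether per-block syncing is what the filesystem is amplifying would be to compare dsync (a flush after every output block) against a single fsync at the end (a sketch; the temp file and 64 MB size are arbitrary choices):

```shell
# oflag=dsync forces a flush per output block; conv=fsync flushes once
# at the end. A large gap between the two points at journal/flush
# overhead rather than raw device speed.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=1M count=64 oflag=dsync 2>&1 | tail -n1
dd if=/dev/zero of="$tmp" bs=1M count=64 conv=fsync 2>&1 | tail -n1
rm -f "$tmp"
```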
 
During my tests I will also try that and report back. If you want me to run other tests, just ask. Since I also have production-related work to do, it might not happen instantly. :-)
 
Uf... well... I looked at the RAID setup first before debugging Linux any further, and... I'm kinda ashamed to say it, but the server had a RAID 6 setup, not RAID 10 as I thought.

The results are as expected for RAID 6 with 8 x 300 GB SAS disks.
Sorry for wasting your time, guys. :-(
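For the record, the numbers line up with the RAID 6 read-modify-write penalty (a back-of-envelope sketch; the 450 MB/s aggregate figure is an assumption): each sub-stripe write costs roughly six disk I/Os (read data + P + Q, then write data + P + Q), so synced writes that never fill a whole stripe land near raw/6.

```shell
# Hypothetical aggregate streaming rate of the 8-disk set; each
# read-modify-write burns ~6 I/Os per logical write on RAID 6.
raw=450   # MB/s, assumed
echo "full-stripe write   ~ ${raw} MB/s"
echo "read-modify-write   ~ $(( raw / 6 )) MB/s"   # in the ballpark of the observed ~70-80 MB/s
```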
 