HW Raid performance query: LSI 3008

Discussion in 'Proxmox VE: Installation and configuration' started by fortechitsolutions, Apr 19, 2019.

    Hi,

    I wonder if anyone has experience with this and can comment.

    I've just spent some time reviewing a pair of Lenovo servers that have this HW RAID controller: 2 x identical nodes in a small Proxmox cluster, running Proxmox 5.latest.

    There is no problem with the controller being recognized or usable. There are 2 x RAID1 mirrors present:
    2 x 500 GB SAS drives for the main Proxmox install
    2 x 2 TB SATA drives for the /localraid mirror, which is extra VM storage; it is formatted as EXT4 and configured in Proxmox as 'directory' storage.
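
    For reference, the 'directory' storage entry for that mirror in /etc/pve/storage.cfg looks roughly like the sketch below (the storage ID and content types shown are my assumption, adjust to taste):

    Code:
    dir: localraid
            path /localraid
            content images,rootdir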

    Via CLI tools, we see

    Code:
    Listed in output from lspci and in dmesg:
    
    01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3008 [Fury] (rev 02)
    
    and
    
    ServeRAID M1215 SAS/SATA Controller
    
    and checking what we see from megaclisas-status: general setup of the raid volumes / health etc:
    
    root@pve252:/etc/apt/sources.list.d# megaclisas-status
    
    -- Controller information --
    
    -- ID | H/W Model       | RAM    | Temp | BBU    | Firmware
    c0    | ServeRAID M1215 | 0MB    | 80C  | Absent | FW: 24.16.0-0082
    
    
    
    -- Array information --
    -- ID | Type   |    Size |  Strpsz | Flags | DskCache |   Status |  OS Path | CacheCade |InProgress
    c0u0  | RAID-1 |    465G |   64 KB | RA,WT | Disabled |  Optimal | /dev/sda | None      |None
    c0u1  | RAID-1 |   1817G |   64 KB | RA,WT | Disabled |  Optimal | /dev/sdb | None      |None
    
    
    
    -- Disk information --
    
    -- ID  | Type | Drive Model                                   | Size     | Status          | Speed    | Temp | Slot ID  | LSI ID
    c0u0p0 | HDD  | 9XF47RH8ST9500620NS 00AJ137 00AJ140IBM LE2B   | 464.7 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [62:2]   | 8
    c0u0p1 | HDD  | 9XF47QYJST9500620NS 00AJ137 00AJ140IBM LE2B   | 464.7 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [62:0]   | 9
    c0u1p0 | HDD  | LENOVO-XST2000NX0433 LD4AW460GL89LD4ALD4ALD4A | 1.817 TB | Online, Spun Up | 12.0Gb/s | 34C  | [62:1]   | 10
    c0u1p1 | HDD  | LENOVO-XST2000NX0433 LD48W460AM3ALD48LD48LD48 | 1.817 TB | Online, Spun Up | 12.0Gb/s | 34C  | [62:3]   | 11

    So, when I run some basic performance tests with pveperf,

    first on /localraid,
    then on a gigabit-Ethernet NFS-mounted Synology storage pool,

    we see:

    Code:
    root@pve252:/localraid# pveperf /localraid
    
    CPU BOGOMIPS:      134392.00
    REGEX/SECOND:      2372072
    HD SIZE:           1831.49 GB (/dev/sdb1)
    BUFFERED READS:    53.70 MB/sec
    AVERAGE SEEK TIME: 15.07 ms
    FSYNCS/SECOND:     25.61
    DNS EXT:           79.67 ms
    DNS INT:           1.47 ms (prox.local)
    
    
    root@pve252:/localraid# pveperf /mnt/pve/nfs-prod-nic2
    CPU BOGOMIPS:      134392.00
    REGEX/SECOND:      2248736
    HD SIZE:           8553.00 GB (192.168.11.250:/volume3/PROD)
    FSYNCS/SECOND:     1342.24
    DNS EXT:           79.33 ms
    DNS INT:           1.54 ms (prox.local)
    root@pve252:/localraid#
    i.e., we get dreadful fsyncs per second on the local RAID controller, while the NFS share on the gig-Ethernet Synology performs far better. Woot.

    I tweaked the config of the local RAID slightly:

    Code:
    root@pve252:/localraid# megacli -LDSetProp EnDskCache -LAll -aAll
    
    Set Disk Cache Policy to Enabled on Adapter 0, VD 0 (target id: 0) success
    Set Disk Cache Policy to Enabled on Adapter 0, VD 1 (target id: 1) success
    
    After this, fsyncs per second jumped to an awe-inspiring ~230. Better than 26, but still pretty dreadful.
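
    For completeness, the cache policy can be inspected (and a write-back attempt made) with megacli as sketched below; on a cacheless, BBU-less controller like this M1215 the WB request will most likely be refused or ignored by the firmware, so this is an experiment rather than a fix:

    Code:
    # Show the current cache / disk-cache policy per logical drive
    # (the RA,WT flags above mean read-ahead, write-through)
    megacli -LDGetProp Cache -LAll -aAll
    megacli -LDGetProp DskCache -LAll -aAll

    # Attempt to switch to write-back; with 0MB RAM and no BBU the controller
    # will normally refuse this or fall back to write-through
    megacli -LDSetProp WB -LAll -aAll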

    I am curious whether anyone else has banged their head against this problem before, and whether there is a known-good workaround to make things less bad with a controller like this. Clearly, by design, this controller has no battery and no proper controller cache; it is, I believe, an entry-level RAID controller. But performance this utterly dreadful suggests something is actually wrong, not just 'mediocre' hardware.
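
    In case anyone wants to reproduce this outside of pveperf, a small synchronous-write fio run like the sketch below measures roughly the same thing (the job name, size and target directory are just choices for my setup):

    Code:
    # 4k writes with an fdatasync after every write,
    # similar in spirit to pveperf's FSYNCS/SECOND figure
    fio --name=fsync-test --directory=/localraid --rw=write --bs=4k \
        --size=256M --ioengine=sync --fdatasync=1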


    Plan B for this thing is to (a) migrate all VMs from host 2 onto host 1 (there are 2 nodes in the cluster here; see the sketch below), (b) install a new mid-range RAID controller that has BBU and cache, (c) blow away the old config on the host and set up the new RAID / controller, (d) migrate everything back to this host, upgrade the other Proxmox node in a similar manner, then once finished re-balance VMs across the 2 x Proxmox nodes.
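
    For step (a), the per-VM move is just qm migrate; the VMID and target node name below are made-up examples, and --with-local-disks is needed because /localraid is local 'directory' storage:

    Code:
    # Repeat per VM (or loop over the output of "qm list");
    # VMID 101 and node name pve251 are hypothetical
    qm migrate 101 pve251 --online --with-local-disks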

    Any comments or feedback are greatly appreciated.

    Thanks!

    Tim
     