Suggestions for SAN Config

Discussion in 'Proxmox VE: Installation and configuration' started by axion.joey, Feb 24, 2016.

  1. axion.joey

    axion.joey Member

    We use the DC S3510s in some of our servers. We did have one go bad after two weeks, but besides that they've been great.
     
  2. axion.joey

    axion.joey Member

    We've used both FreeNAS and NAS4Free. Both have been solid. I also helped a buddy deploy an iXsystems solution in a DC, and those guys were great to work with.

    SSDs are purely a reliability play. We're based in California, but our data centers are spread throughout the country. While we can always use the DCs' remote hands service, we prefer to deploy systems that will be as reliable as possible. And since we don't need a lot of storage, SSDs aren't cost prohibitive.
     
  3. mir

    mir Well-Known Member
    Proxmox Subscriber

    Before settling on FreeNAS/NAS4Free you should also consider OmniOS + napp-it. You get native in-kernel ZFS and an iSCSI implementation that is more stable and scalable, and that outperforms both istgt and ctld. The integration with Proxmox is also better than with the other two.
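    For reference, a ZFS over iSCSI storage definition in /etc/pve/storage.cfg pointing at an OmniOS/COMSTAR box might look roughly like this (storage ID, pool name, portal IP and target IQN below are just placeholders):

    zfs: omnios-zfs
            pool tank
            portal 192.168.10.10
            target iqn.2010-09.org.napp-it:tank
            iscsiprovider comstar
            blocksize 4k
            sparse 1
            content images

    With something like that in place, each VM disk created on the storage ends up as its own zvol on the pool.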
     
  4. axion.joey

    axion.joey Member

    Thanks Mir. Napp-it looks like another good option. We've had it running in our lab, but never put it into production. Do you run it in production with OpenVZ or LXC?
     
  5. mir

    mir Well-Known Member
    Proxmox Subscriber

    I have a 3.4 in production and a 4.1 in lab.

    For OpenVZ and LXC you simply export a dataset through NFS and you are good to go. For KVM, use zvols via ZFS_over_iSCSI: it supports snapshots and (linked) clones, as well as thin-provisioned zvols.
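    As a rough sketch of the NFS path, assuming a dataset called tank/ct and a storage box at 10.0.0.10 on a 10.0.0.0/24 network (all names and addresses are just examples):

    # on the OmniOS side: share the dataset over NFS
    zfs set sharenfs='rw=@10.0.0.0/24,root=@10.0.0.0/24' tank/ct

    # on the Proxmox side, in /etc/pve/storage.cfg:
    nfs: omnios-nfs
            server 10.0.0.10
            export /tank/ct
            path /mnt/pve/omnios-nfs
            content rootdir
            options vers=3

    The KVM side would then use a ZFS over iSCSI storage entry like the one sketched earlier in the thread.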
     
  6. mir

    mir Well-Known Member
    Proxmox Subscriber

    Forgot to mention: if you use thin-provisioned zvols you should choose SCSI disks and the virtio-scsi controller, because COMSTAR (the iSCSI daemon in OmniOS) supports the SCSI unmap command. This means the trim command is honored by COMSTAR, so trimmed blocks are released from the zvol back to the pool.
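    In /etc/pve/qemu-server/<vmid>.conf that combination comes down to something like the following (storage name, VMID and disk size are just examples); discard=on is what lets the guest's TRIM requests reach COMSTAR:

    scsihw: virtio-scsi-pci
    scsi0: omnios-zfs:vm-101-disk-1,discard=on,size=32G

    The guest then still needs to mount its filesystems with the discard option (or run fstrim) for the unmap to actually happen.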
     
  7. axion.joey

    axion.joey Member

    Thanks again Mir. Do you use redundant NASes?
     
  8. mir

    mir Well-Known Member
    Proxmox Subscriber

    No. A UPS is enough for me.
     
  9. axion.joey

    axion.joey Member

    That seems to be the general consensus. Obviously we'll build the NAS to be as reliable as possible, but it seems like the one component that would be a single point of failure for the entire cluster should be redundant.
     
  10. hec

    hec Member

    Sorry, but how does your UPS help if you have a fire or something similar?

    Remember Murphy's law, so always have the data mirrored to another datacenter.

    How will you do a firmware upgrade, or anything else on the storage system that requires rebooting the controller? I don't think the solution is to stop all VMs on all hosts. Simply switch to the other datacenter and do maintenance on the local storage controller.
     
  11. mir

    mir Well-Known Member
    Proxmox Subscriber

    For redundancy, check this out: http://www.znapzend.org/
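    A minimal plan with it, following the znapzendzetup syntax from its README (pool, dataset and host names below are placeholders), keeps rolling local snapshots and replicates them to a second box:

    znapzendzetup create --recursive --tsformat='%Y-%m-%d-%H%M%S' \
        SRC '7d=>1h,30d=>4h,90d=>1d' tank/vmdata \
        DST:a '7d=>1h,30d=>4h,90d=>1d,1y=>1w' root@backuphost:backup/vmdata

    That gives you an off-box copy, though it is asynchronous replication rather than a failover NAS.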
     
  12. axion.joey

    axion.joey Member

    Checked it out. Not a lot of documentation. And no one on the IRC channel. Has anyone ever been successful getting OpenVZ/LXC working on Gluster?
     
  13. axion.joey

    axion.joey Member


    Here's some I/O info from our busiest production server:

    root@proxmox:~# iostat -d -x 5 3
    Linux 2.6.32-39-pve (proxmox) 02/26/2016 _x86_64_ (24 CPU)

    Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
    sda 0.03 670.24 1.16 501.29 15.81 8258.38 32.94 0.81 1.62 6.30 1.61 0.10 5.27
    dm-0 0.00 0.00 0.01 11.72 0.16 46.88 8.02 0.03 2.14 10.56 2.13 0.07 0.08
    dm-1 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 1.13 1.11 1.13 0.07 0.00
    dm-2 0.00 0.00 1.18 1159.83 15.64 8211.50 14.17 0.46 0.39 6.57 0.39 0.04 5.21

    Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
    sda 0.00 474.20 0.00 390.60 0.00 7222.40 36.98 0.05 0.12 0.00 0.12 0.06 2.34
    dm-0 0.00 0.00 0.00 1.80 0.00 7.20 8.00 0.00 0.00 0.00 0.00 0.00 0.00
    dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    dm-2 0.00 0.00 0.00 863.00 0.00 7215.20 16.72 0.13 0.15 0.00 0.15 0.03 2.40

    Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
    sda 0.00 492.00 0.00 452.00 0.00 7472.00 33.06 0.26 0.57 0.00 0.57 0.05 2.42
    dm-0 0.00 0.00 0.00 122.60 0.00 490.40 8.00 0.38 3.13 0.00 3.13 0.02 0.20
    dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    dm-2 0.00 0.00 0.00 821.40 0.00 6981.60 17.00 0.15 0.19 0.00 0.19 0.03 2.34
     
  14. Q-wulf

    Q-wulf Active Member

    Q: Have you checked how much of your 8-12 TB of data is cold data and how much is hot data? That might make a big difference in terms of cache sizing or the RAID level needed on the NAS (RaidZ2 / Raid10).




    Am I reading this correctly? You are doing <=600 writes/s, and no reads?
    That would mean your 3-node cluster does less than 2k writes/reads, right?



    The following assumes you are doing the typical 5% ultra-hot, 5% hot and 90% cold data.

    If this is true, I'd do a self-built ZFS-based NAS or Gluster setup with enough HDDs in RaidZ2 (min. 5x 4 TB for your 12 TB goal) and dedicated SSDs for L2ARC and ZIL. I'd add at least 2 spare HDDs and a minimum of 2 SSDs for ZIL and L2ARC (256+ GB - no need to go crazy here) and make it all consumer-grade. This should be MORE than enough performance for what your iostat indicates and allow for growth in terms of additional IO.

    Not having easy 24/7 access to the server and having to rely on DC support staff might make me a bit wary though, which might convince me to shell out the $$$ for enterprise-grade SSDs; but even then I'd rather add additional spare SSDs and maybe a couple more spare HDDs. Consider that ZIL and L2ARC are adaptive, meaning that if they are not present your setup will still work, but it will only be able to rely on your RAM and will be unable to offload "overflow" to the ZIL / L2ARC.



    In short you want at least:
    2x OS drive
    7x 4 TB HDD (5x 4 TB for the 12 TB goal with RaidZ2, + 2 spares)
    2x 512 GB SSD (so you can have one fail, for ZIL and L2ARC)
    appropriately sized RAM
    This should all fit into a 12-bay 2U unit.
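    Creating a pool along those lines could look roughly like this (Linux-style device names, all placeholders; each SSD carries a small log partition and a large cache partition so either one can fail):

    zpool create tank raidz2 sda sdb sdc sdd sde
    zpool add tank spare sdf sdg
    zpool add tank log mirror sdh1 sdi1
    zpool add tank cache sdh2 sdi2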


    Or (if you want more redundancy in lieu of enterprise-grade drives):
    2x OS drive
    10x 4 TB HDD (12 TB RaidZ2 + 5 spares - or any other config)
    4x 512 GB SSD (2x for ZIL, 2x for L2ARC)
    appropriately sized RAM
    This fits a 16-bay 3U case.


    Either way it should still be cheaper than going all enterprise gear, or even all enterprise flash.



    Personal note: we have all our servers in local datacenters (3), operated by our own staff (24/7/365), so we can swap failed drives rather quickly ourselves. We also rely heavily on redundancy in terms of additional servers and think of "pod redundancy" instead of single-component redundancy, thereby accepting multiple servers being down at the same time due to failed disks, PSUs or network cards. That's why we always go consumer-grade rather than enterprise gear, dual PSUs, etc. A lot cheaper at scale. So keep that in mind when reading my suggestions :)
     
    #34 Q-wulf, Feb 29, 2016
  15. /Nils

    /Nils New Member

    While we're throwing out suggestions ... have you had a look at Open-E's cluster setup?
    It's basically DRBD, but with commercial support.

    I've been running it as redundant backing for VMware, because it's supported by them too. Rock solid!
     
  16. axion.joey

    axion.joey Member

    Hi Nils. Open-E looks interesting. Do you connect to it over iSCSI?
     
  17. /Nils

    /Nils New Member

    Yes, we use it as an iSCSI solution. I think it also supports FC.

    Their new version of the OS uses ZFS as the backend, but we haven't deployed that yet, since we'd like to see site resilience working with that setup first.
     