ZFS or BTRFS for Proxmox 3.1 ?

Discussion in 'Proxmox VE: Installation and configuration' started by jinjer, Oct 10, 2013.

  1. jinjer

    jinjer Member

    Joined:
    Oct 4, 2010
    Messages:
    194
    Likes Received:
    5
    I would like to create a self-contained node for proxmox and need to choose between ZFS and BTRFS for the local storage.

    I normally use ZFS for its snapshot capabilities, but I'm not sure if it's a viable solution for a self-contained proxmox node.

    I would try to keep both the root and the storage on the same disks, which is a bit of a no-no for zfs (it likes to use whole disks).

    Has anyone tried booting proxmox from a ZFS multidisk storage? This is a grub issue, but perhaps also an initrd issue for proxmox.

    On the other hand, btrfs can be booted from (although with the same issues as zfs), but I'm not sure how reliable it is.

    Any suggestions are welcome.

    jinjer.
     
  2. MatthiasF

    MatthiasF New Member

    Joined:
    Oct 8, 2013
    Messages:
    5
    Likes Received:
    0
  3. p3x-749

    p3x-749 Member

    Joined:
    Jan 19, 2010
    Messages:
    103
    Likes Received:
    0
    ...if you only have that *one* single disk, you could try to set up LVM on that disk first.
    You may need to install Proxmox via the Debian install method ...not sure whether the wheezy installer supports booting from LVM these days, though.

    The second option is to employ a small USB stick for booting.
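    A rough sketch of that first option, assuming a single disk at /dev/sda and the usual Proxmox volume group name "pve" (device names and sizes are only placeholders, adjust to your hardware):

    Code:
    parted -s /dev/sda mklabel msdos
    parted -s /dev/sda mkpart primary ext3 1MiB 512MiB   # small /boot kept outside LVM
    parted -s /dev/sda mkpart primary 512MiB 100%        # the rest becomes the LVM PV
    pvcreate /dev/sda2
    vgcreate pve /dev/sda2
    lvcreate -L 20G -n root pve
    lvcreate -L 8G -n swap pve
    lvcreate -l 100%FREE -n data pve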
     
  4. mir

    mir Well-Known Member
    Proxmox Subscriber

    Joined:
    Apr 14, 2012
    Messages:
    3,480
    Likes Received:
    96
    The ISO installer will by default create an LVM-based installation. As for Debian, booting from LVM has been supported since Squeeze or Lenny.
     
  5. jinjer

    jinjer Member

    Joined:
    Oct 4, 2010
    Messages:
    194
    Likes Received:
    5
    Well,

    something I did not think about is that I can always use a smaller partition and install proxmox on only two disks (softraid and lvm).

    Then I can use the rest for a mirrored vdev of the zfs pool.

    The rest of the disks can be used as whole disks for the other vdevs of the ZFS pool.
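    For what it's worth, a pool along those lines could be created roughly like this (a sketch only; "tank" and the device names are assumptions - sda3/sdb3 being the leftover partitions on the two OS disks, sdc-sdf whole disks):

    Code:
    zpool create tank \
        mirror /dev/sda3 /dev/sdb3 \
        mirror /dev/sdc /dev/sdd \
        mirror /dev/sde /dev/sdf
    zpool status tank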
     
  6. p3x-749

    p3x-749 Member

    Joined:
    Jan 19, 2010
    Messages:
    103
    Likes Received:
    0
    True, but the boot partition will not be LVM-based, and as far as I understand the OP, ZFS as the one-and-only filesystem is the requirement here.
     
  7. p3x-749

    p3x-749 Member

    Joined:
    Jan 19, 2010
    Messages:
    103
    Likes Received:
    0
    ...this is not a really big deal.
    Based on your first post, I gathered that you only want to employ a single disk.
    If you're happy with /boot being *not* on zfs, use the Debian way and put zfs on LVM for all the other filesystems, including root and storage.
     
  8. jinjer

    jinjer Member

    Joined:
    Oct 4, 2010
    Messages:
    194
    Likes Received:
    5
    Sorry... I just need to redo the /var/lib/vz mount, since I want to use what is left of the original LVM partition and add the rest of the disks.
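    One way to redo that mount once the pool exists - a sketch, assuming a pool called "tank" and that all containers/VMs are stopped first:

    Code:
    zfs create tank/vz
    mv /var/lib/vz/* /tank/vz/               # empty the old LVM-backed directory
    zfs set mountpoint=/var/lib/vz tank/vz
    df -h /var/lib/vz                        # should now show the ZFS dataset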

    BTW, is there a way to install Proxmox directly on top of a RAID-1 setup without having to hack it manually after installation?
     
  9. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,429
    Likes Received:
    298
    No problem if you use a HW raid controller.
     
  10. jinjer

    jinjer Member

    Joined:
    Oct 4, 2010
    Messages:
    194
    Likes Received:
    5
    I'm a die hard that hates hardware raid on linux :)

    Hey... I don't seem to find pve-headers for the pve-kernel and those are needed to install zfs on proxmox 3.1. Any hints?
     
  11. p3x-749

    p3x-749 Member

    Joined:
    Jan 19, 2010
    Messages:
    103
    Likes Received:
    0
    -> instructions as per wiki did not work? -> http://pve.proxmox.com/wiki/ZFS
    Edit: bummer!...what a coincidence..looks like you're not the only one... -> http://forum.proxmox.com/threads/16326-pve-headers-install-problem
    Edit2: check your /etc/apt/sources.list ...the pve repo might be missing
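    For reference, on PVE 3.x (wheezy) fixing the missing repository and installing the headers would look roughly like this (the no-subscription repo is shown as an example; use the enterprise repo if you have a subscription):

    Code:
    echo "deb http://download.proxmox.com/debian wheezy pve-no-subscription" \
        >> /etc/apt/sources.list
    apt-get update
    apt-get install pve-headers-$(uname -r)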
     
    #11 p3x-749, Oct 10, 2013
    Last edited: Oct 10, 2013
  12. jinjer

    jinjer Member

    Joined:
    Oct 4, 2010
    Messages:
    194
    Likes Received:
    5
    yes, pve was missing from the repo. fixed now.
     
  13. symmcom

    symmcom Active Member

    Joined:
    Oct 28, 2012
    Messages:
    1,062
    Likes Received:
    16
    Ceph is pretty much out of the question since you are shooting for a one-node setup, as far as I can tell. To set up Ceph you need more than one node, otherwise it is just a plain waste of time with hardly any benefit. For a single-node setup, ZFS can't be beat. I am not sure how mission-critical your setup is, but the fact that you are going for single-node Proxmox tells me redundancy is not a big issue in your case. A simple, headache-free setup would be putting Proxmox on 1 SSD, or 2 SSDs if you need some redundancy, then using the rest of the local HDDs to create a ZFS pool. ZFS is resilient enough to tackle just about any disaster on a single node.
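    On the resilience point, routine upkeep on such a single-node pool is mostly a matter of periodic scrubs and status checks (the pool name "tank" is an assumption):

    Code:
    zpool scrub tank          # run weekly or monthly, e.g. from cron
    zpool status -v tank      # shows scrub progress and any checksum errors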
     
  14. SamTzu

    SamTzu Member

    Joined:
    Mar 27, 2009
    Messages:
    356
    Likes Received:
    6
    Meanwhile, ZFS on Linux was only declared ready for production [theregister.co.uk] this spring (March), so it's not as old and stable as people like to think.
    Every year I hate LVM more and more and hope btrfs will finally come of age (i.e. be declared stable), but I fear they are in no hurry to do so.
    The Better File System (btrfs) has dedup, which would be a great help in reducing storage needs, since most OpenVZ containers are built from mostly the same files.
    Fact is, hard drives cost money and we are not all rich - so I for one would love to see solutions that save me money. There is an interesting thread about new filesystems here.
    And I really hate the way LVM reboots after it's been used for a while. Whose idea was it that it's acceptable for a server to spend an hour rebooting? Especially when it's a headless server and you can't really see what is taking so long.
     
    #14 SamTzu, Oct 11, 2013
    Last edited: Oct 11, 2013
  15. jinjer

    jinjer Member

    Joined:
    Oct 4, 2010
    Messages:
    194
    Likes Received:
    5
    I ended up making a node with 8 x 1TB 2.5" hard disks spinning at 7200rpm.
    Proxmox was installed on two of these disks by manually partitioning them and using only 64GB per disk. I then manually converted the install to Linux md soft RAID1.
    The rest of the disks went into the mirrored, striped ZFS pool as its vdevs. That is about 3.5TB of formatted capacity.

    The node is mission-critical, but downtime is a minor issue. I have a secondary backup node that receives daily snapshots of all the ZFS filesystems.

    BTW: I must look at Ceph for a more distributed solution. I can't make up my mind whether to use Ceph or GlusterFS, performance-wise.
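    The daily snapshot replication mentioned above can be done with zfs send/receive; a minimal sketch, with the pool, snapshot and host names being assumptions:

    Code:
    zfs snapshot -r tank@daily-2013-10-16
    zfs send -R -i tank@daily-2013-10-15 tank@daily-2013-10-16 | \
        ssh backup-node zfs receive -F backup/tank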
     
    #15 jinjer, Oct 16, 2013
    Last edited: Oct 16, 2013
  16. mir

    mir Well-Known Member
    Proxmox Subscriber

    Joined:
    Apr 14, 2012
    Messages:
    3,480
    Likes Received:
    96
    It is not recommended to share ZFS disks with other file systems as this can lead to performance issues and data loss.
    "ZFS can use individual slices or partitions, though the recommended mode of operation is to use whole disks"
    "For pools to be portable, you must give the zpool command whole disks, not just slices, so that ZFS can label the disks with portable EFI labels. Otherwise, disk drivers on platforms of different endianness will not recognize the disks."
    http://www.manpagez.com/man/8/zpool/
     
  17. jinjer

    jinjer Member

    Joined:
    Oct 4, 2010
    Messages:
    194
    Likes Received:
    5
    Thank you for pointing that out. My take is that some of this information is not current.

    The endianness issue between different platforms is only theoretical. I don't plan to mount this pool on some Motorola or RISC hardware, nor do I plan to mount it on Solaris, though I'm quite sure I could mount it on Solaris amd64.

    Regarding performance, it's a non-issue too. The Proxmox base distro eats around 10 IOPS from the disks, which is not a problem given the 800+ IOPS of the 8 disks - but that is only theoretical. Real-world bonnie++ performance gives around 400 seeks/sec and about 480MB/s read, 380MB/s write and 180MB/s rewrite for the pool (not bad for only 8 disks).
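    For comparison purposes, a bonnie++ run producing numbers like those above would look something like this (the mount point is an assumption; the file size should be well above the host's RAM so the ARC doesn't mask disk speed):

    Code:
    bonnie++ -d /tank/bench -s 64g -n 0 -u root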

    The race conditions for using ZFS disks across pools might still be there, but the problem lies within ZFS itself and not between different types of filesystems. I see no problems with running ext4+ZFS on a single set of disks. Also, it's not common practice to run log and cache devices on the same set of SSD disks with no issues (the log is a negligible portion of the SSD, and the rest is pure L2ARC cache).

    From a cost perspective, this is $100 saved on a $400 disk pool, and two SATA ports freed up for more disks.

    So far the array has passed several bonnie++ benchmarks and is now running a couple of KVM Windows machines. I have also copied a few TB of data back and forth to backup storage, with no issues so far.
     
  18. mir

    mir Well-Known Member
    Proxmox Subscriber

    Joined:
    Apr 14, 2012
    Messages:
    3,480
    Likes Received:
    96
    If you know of common practice, you must also be aware of the rule of thumb: as long as your hardware supports adding more RAM, add more RAM instead of putting the ZIL and L2ARC on separate physical devices - RAM will always be faster than a device.
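    If you do go the RAM route, note that ZFS on Linux caps the ARC via a module option; a sketch of raising the cap (the 16GiB figure is just an example):

    Code:
    # /etc/modprobe.d/zfs.conf
    options zfs zfs_arc_max=17179869184   # 16 GiB; takes effect after reloading the module or rebooting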
     
  19. jinjer

    jinjer Member

    Joined:
    Oct 4, 2010
    Messages:
    194
    Likes Received:
    5
    Oops. I wrote "not common practice" while I meant "it is now common practice".

    Sure, RAM is king for ZFS. In the end it all depends on your data and access pattern. Say you need a separate ZIL but are cost-conscious: it's a pain to use a big mirror of SSDs just to store 1-2GB (at best) of ZIL. So you partition the SSDs, make a mirrored ZIL on 2 (or 3) of them, and the rest goes to striped L2ARC: this is a standard setup.
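    That partitioned-SSD layout, sketched with assumed device names (sdg/sdh being the SSDs, partition 1 kept small for the log, partition 2 the remainder for cache):

    Code:
    zpool add tank log mirror /dev/sdg1 /dev/sdh1
    zpool add tank cache /dev/sdg2 /dev/sdh2     # cache devices stripe; they cannot be mirrored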
     
  20. mir

    mir Well-Known Member
    Proxmox Subscriber

    Joined:
    Apr 14, 2012
    Messages:
    3,480
    Likes Received:
    96