Hi Folks,
To start, I think it's awesome that ZFS has been in Proxmox for a while now. I love ZFS and really think it's one of the best filesystems out there. That being said, I've recently run into some important shortcomings with its implementation in Proxmox, and I really hope my concerns can get addressed in future development.
First, no alerts in the webGUI.
When I test-yanked a disk out of a ZFS mirror that is the OS pool (and local storage), the webGUI showed literally ZERO alerts or information about a disk being missing. This was very alarming to me: as an admin, I would have to go and manually check to see that such an event had occurred. We really need alerts for ZFS problem scenarios. Even if the disk is okay but disconnected, this kind of thing _NEEDS_ to be presented as an alarm in the webGUI, by email, both, whatever.
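Until something like this exists in the webGUI, a stopgap check is easy to script. Here is a minimal sketch, assuming `zpool list -H -o name,health` is available on the host (it prints one whitespace-separated `name health` pair per pool). The function names `unhealthy_pools` and `read_pool_health` are my own, and how you deliver the alert (email, monitoring system) is left out on purpose.

```python
import subprocess


def unhealthy_pools(listing: str) -> list[tuple[str, str]]:
    """Return (pool, health) pairs for every pool whose health is not ONLINE.

    `listing` is the text printed by `zpool list -H -o name,health`.
    """
    problems = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, health = fields[0], fields[1]
        if health != "ONLINE":
            problems.append((name, health))
    return problems


def read_pool_health() -> str:
    """Run zpool on the local host and return its raw output (needs ZFS installed)."""
    result = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `unhealthy_pools(read_pool_health())` from a cron job and mailing any non-empty result would at least cover the "disk silently missing" case I hit.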
Second, no hot spare options.
When making zpools, be it for booting the OS or for other local storage, having hot spares would help. I've found no way to set up spares in the installer or the GUI, and I think adding one would improve the quality of life for ZFS usage in the Proxmox VE ecosystem.
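For reference, the CLI route today is `zpool add <pool> spare <device>`. A small sketch of building that command, mostly to show how little the GUI would need to wrap. The helper name `spare_add_command` and the device path in the usage note are placeholders, not anything Proxmox ships.

```python
import shlex


def spare_add_command(pool: str, devices: list[str]) -> list[str]:
    """Build the argument list for adding one or more hot spares to a pool."""
    if not devices:
        raise ValueError("at least one spare device is required")
    return ["zpool", "add", pool, "spare", *devices]


def as_shell(command: list[str]) -> str:
    """Render an argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

For example, `as_shell(spare_add_command("tank", ["/dev/disk/by-id/EXAMPLE-SPARE"]))` gives a line you can review before running it as root. As far as I know, a spare only kicks in automatically on Linux when the ZFS event daemon (zed) is running, so treat that part as something to verify on your own setup.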
Third, no live action for re-insertion of disks.
In my mirror-yanking scenario, I plugged the disk back in, and.... nothing. zpool status showed that the pool did not see the disk as reconnected. The Linux OS did present the /dev/ device, yet ZFS had not reattached the disk to the pool, despite it being a member. This is a problem for intermittent interfaces, or if you accidentally yank a disk and put it back in. My "solution" was rebooting the Proxmox host. This _should not_ be the solution. My expectation is that ZFS should see the device as present and put it back in the pool without user interaction.
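In hindsight, the manual alternative to a reboot is probably `zpool online <pool> <device>`. Below is a rough sketch of what I would expect an automatic handler to do: scan `zpool status` output for devices that are not healthy and build the matching `zpool online` commands. The state list and the parsing are my assumptions about the usual status layout (name, state, then READ/WRITE/CKSUM counters), and the sample text is made up for illustration, not captured from a real host.

```python
# Device states that suggest a member is currently not attached or not in use.
OFFLINE_STATES = {"REMOVED", "UNAVAIL", "OFFLINE", "FAULTED"}


def missing_devices(status_text: str) -> list[str]:
    """Return device names from `zpool status` output whose state looks detached.

    Only config rows are considered: a name, a state, and at least three
    counter columns. Header lines such as "state: DEGRADED" are too short
    to match.
    """
    devices = []
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[1] in OFFLINE_STATES:
            devices.append(fields[0])
    return devices


def online_commands(pool: str, status_text: str) -> list[list[str]]:
    """Build one `zpool online` argument list per detached device."""
    return [["zpool", "online", pool, dev] for dev in missing_devices(status_text)]


# Illustrative sample only; real output has more sections.
SAMPLE_STATUS = """\
  pool: rpool
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    rpool       DEGRADED     0     0     0
      mirror-0  DEGRADED     0     0     0
        sda2    ONLINE       0     0     0
        sdb2    REMOVED      0     0     0
"""
```

With the sample above, `online_commands("rpool", SAMPLE_STATUS)` yields a single command for `sdb2`. Whether `zpool online` is enough depends on why the device dropped out, so this is a starting point rather than a fix.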
Fourth, no webGUI tools for interacting with ZFS zpools.
This part is a pretty big deal. Putting aside the lack of alerts, being unable to replace a disk in the webGUI makes ZFS unattractive to admins, and also potentially problematic. A webGUI method would not only make it convenient to administer ZFS, but would also ensure the replacement disk is correctly partitioned. I myself know why the partitioning is a good idea, but many other admins may not. Furthermore, webGUI functions to check the status of pools would be helpful for a quick look at their health. These tools should also cover things like adding L2ARC/ZIL devices, hot spares, etc. I don't think this should _necessarily_ be as robust as, say... FreeNAS' toolset, but it should at least be capable of replacing disks.
Fifth, no scrubbing, snapshot, or other scheduled task abilities at present.
Scrubbing a ZFS zpool is important, even if it's only done once a month, because it can pick up data corruption that normal reads and writes never touched. Over a long period of time, not scrubbing your pool can have compounding effects. At the bare minimum, the webGUI should have a way to schedule scrubs and view the results of the most recent one. If the webGUI could later be extended to do snazzy snapshotting stuff, I think this would make ZFS super awesome in the Proxmox VE environment (snapshots would rely on cron or some other scheduled-task mechanism).
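To make the "bare minimum" concrete, here is a sketch of the two pieces: generating a monthly `/etc/cron.d`-style line that runs `zpool scrub`, and pulling the `scan:` line out of `zpool status` to show the last result. The `/sbin/zpool` path, the default schedule, and the function names are my assumptions, so adjust them for your host.

```python
from typing import Optional


def scrub_cron_entry(pool: str, day_of_month: int = 1, hour: int = 2) -> str:
    """Return a cron.d line (with user field) that scrubs `pool` once a month."""
    if not 1 <= day_of_month <= 28:
        raise ValueError("day_of_month must be 1..28 so it exists in every month")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be 0..23")
    # Fields: minute hour day-of-month month day-of-week user command
    return f"0 {hour} {day_of_month} * * root /sbin/zpool scrub {pool}"


def last_scan_summary(status_text: str) -> Optional[str]:
    """Return the text after 'scan:' in `zpool status` output, or None if absent."""
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("scan:"):
            return stripped[len("scan:"):].strip()
    return None
```

A GUI page would only need to write the first string to a cron file and display the second one next to the pool.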
Despite these areas that I think would really benefit from dev love, I like how far ZFS has come in Proxmox VE. It's dead simple to set up in the installer, and I haven't seen any immediate failures with it. However, the current implementation leaves me wanting when I think about having to deal with failure scenarios. I don't think most of what I'm asking for here is too much, and I think a fair amount of it would be generally appreciated by the Proxmox VE community.
That being said, I would love to hear your thoughts. If there's any more I can do to help with this development, apart from writing it myself, please let me know. I work with ZFS as part of my business, so I've studied it heavily.
Thanks peeps!