As I've had to answer some questions in our company (being not a guru, but the most experienced ZFS user), I'd like to share some answers/best practices, especially regarding speeding things up with SSDs.
A really great source I've found that answers almost all of the basic questions: http://constantin.glez.de/blog/2011/02/frequently-asked-questions-about-flash-memory-ssds-and-zfs
Best Of:
[h=3]Should I mirror flash memory drives?[/h] Interesting question. On one hand, hardware breaks all the time, and SSDs are no exception. On the other hand, SSDs don't have any moving parts and so they're statistically much less susceptible to failures.
It really depends on your risk tolerance:
- The ZIL is the last resort to go to if the system crashes before data that was promised to the application to be "safe" is actually written to disk. Then, upon reboot, the system reads back the ZIL and performs the missing updates on the actual ZFS storage pool. Since the ZIL is so important for ensuring data integrity, it should therefore be mirrored and ZFS supports that quite nicely.
- The L2ARC is a read cache: It stores data for convenience and speed only, but every bit of data in the L2ARC is also available elsewhere. So mirroring an L2ARC SSD is not really necessary (though Marcelo has a very good point in that a dramatic loss of performance may actually justify an L2ARC mirror). Instead, ZFS will use any extra SSDs you give it for L2ARC in order to expand the amount of space for caching data.
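To make that concrete, here is a minimal sketch of how the two roles are attached to a pool (the pool name "tank" and the cXtYd0 device names are placeholders, not from the original article):
[code]
# Mirrored log (ZIL) device -- worth protecting:
zpool add tank log mirror c1t1d0 c1t2d0
# Cache (L2ARC) device -- no mirror needed, extra devices just add cache space:
zpool add tank cache c1t3d0
# The log mirror and the cache device show up in their own sections here:
zpool status tank
[/code]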
In theory you can split up an SSD into two slices through the format(1M) command.
In practice, this means that you'll have two streams of data (ZIL writes and L2ARC writes and reads) instead of one, competing for the limited resources of the SSD's connection and controller. That may compromise your ZIL performance as two mechanisms step on each other's feet.
Better try it out: split up the SSD, configure the ZIL part of it, see how much it improves your write performance, then hook up the L2ARC part, while observing if the ZIL performance is still good. Use zilstat for monitoring ZIL performance.
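A rough sketch of that experiment (device and slice names are made up; the slices would have been created with format(1M) beforehand, and zilstat's exact options depend on the version you have):
[code]
# Step 1: slice 0 of the SSD as the log device, then watch ZIL traffic under your normal write load
zpool add tank log c2t0d0s0
zilstat
# Step 2: slice 1 as L2ARC, then re-run zilstat and compare -- has ZIL throughput/latency suffered?
zpool add tank cache c2t0d0s1
zilstat
[/code]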
[h=3]How much space do I need for a ZIL?[/h] The role of the ZIL is to store a transaction group until it has safely been written to disk. After that, it can be safely discarded and the space reused for the next transaction group.
So the question becomes: How much transaction group data is "in flight" (i.e. not yet written to disk) at any time?
ZFS issues a new transaction group (and consequently a new pool update) every 5 seconds at the latest (more often if the load is higher). While one transaction group is written to the ZIL, the previous one may still be in the process of being written to disk, so we need enough space to store two transaction groups, which means 10 seconds' worth of data at maximum write throughput.
What's the maximum amount of data that your server writes in 10 seconds? Well, an upper boundary would be the maximum write speed of your SSD. At the time of this writing, that was about 170 MB/s for an Intel X25-E; times 10, that would be just short of 2 GB for a typical ZIL.
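The same back-of-the-envelope calculation, spelled out (the 170 MB/s figure is the X25-E example from above; plug in whatever your real write bottleneck is, e.g. your network link):
[code]
# ZIL sizing: roughly two in-flight transaction groups = ~10 seconds of peak write throughput
MAX_WRITE_MBS=170     # example: Intel X25-E sequential write speed
TXG_WINDOW_SECS=10    # two 5-second transaction groups potentially in flight
echo "$((MAX_WRITE_MBS * TXG_WINDOW_SECS)) MB"   # -> 1700 MB, i.e. just short of 2 GB
[/code]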
So for ZILs, a little can go a long way.
[h=3]How much space do I need for an L2ARC?[/h] This is more difficult, or easier, depending on how you look at it. More is always better, but too much is a waste if it's not used. Check your L2ARC usage with arc_summary, and if you still see a significant amount of ghosts after adding an L2ARC, you'll likely benefit from even more L2ARC space.
Another way to estimate L2ARC need is by looking at your working set: The amount of data that is used most frequently. Depending on your application, this could be your top 10 research projects, your top 20% of recurring customers, your most popular 100 products etc.
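To check whether a bigger L2ARC would actually help, something like this works (Solaris/illumos kstat names; the arc_summary script may be called arc_summary.pl depending on where you got it):
[code]
# Human-readable ARC/L2ARC report
arc_summary
# Ghost hits: reads that *would* have been cache hits if the (L2)ARC were larger
kstat -p -n arcstats | grep ghost
# Raw L2ARC counters: l2_hits, l2_misses, l2_size, ...
kstat -p -n arcstats | grep l2_
[/code]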
More hints on sharing an SSD for both jobs:
- ZIL devices should be low-capacity, low-latency devices capable of high IOPS. They are typically mirrored.
- L2ARC devices should be high-capacity (within reason: You need to add RAM as L2ARC size increases). They scale by striping.
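A quick way to see both roles at work, and how L2ARC scales by just adding devices (again, pool and device names are placeholders):
[code]
# Per-vdev I/O every 5 seconds; log and cache devices are listed in their own sections
zpool iostat -v tank 5
# Need more read cache? Additional cache devices are simply striped in:
zpool add tank cache c1t5d0
[/code]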
One thing to keep in mind is that the ZIL should be mirrored to protect the speed of the ZFS system. If the ZIL is not mirrored, and the drive that is being used as the ZIL drive fails, the system will revert to writing the data directly to the disk, severely hampering performance.
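If that does happen with an unmirrored log device, recovery is basically spotting the faulted SSD and swapping it out; a sketch (device names are made up, and the exact behaviour on log-device loss depends on your pool version):
[code]
# Show only pools that are not healthy
zpool status -x
# Replace the dead log SSD with a new one
zpool replace tank c1t1d0 c1t6d0
[/code]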