Hi all,
I have been testing KRBD vs standard librbd in KVM virtual machines. I fired up an LXC container last week, ran a quick throughput test, and it was much faster than all my VMs, so I did some digging and found some really weird results.
All thoughts welcome. I used the iometer profile from fio inside the KVM VMs.
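For reference, the job is essentially fio's bundled iometer-file-access-server example - roughly the following (the bssplit and iodepth values here are from memory, so double-check against your copy of fio's examples directory):

```ini
# Approximation of fio's iometer-file-access-server example profile:
# mixed random I/O, 80% reads, with an IOMeter-style block size split.
[global]
description=IOMeter File Server Access Pattern
bssplit=512/10:1k/5:2k/5:4k/60:8k/2:16k/4:32k/4:64k/10
rw=randrw
rwmixread=80
direct=1
size=4g
ioengine=libaio
# iodepth=64 corresponds to IOMeter's "Moderate" load
iodepth=64

[iometer]
stonewall
```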
The VM had a 32GB HDD, 1 CPU core, and 512MB of RAM.
Testing is on a 4-node all-SSD cluster.
I'll summarise the results - full fio output is in the pastebin link at the bottom.
First tests are librbd
I also ran one test with an iothread enabled on standard librbd - it didn't make any noticeable difference.
Results from LIBRBD cache=default(no cache) discard on
Jobs: 1 (f=1): [m(1)] [100.0% done] [125.5MB/31860KB/0KB /s] [26.4K/6495/0 iops] [eta 00m:00s]
read : io=5042.1MB, bw=172107KB/s, iops=25328, runt= 30004msec
The next one is weird - it goes against all the advice and recommendations. It's slower than no caching.
cache=writeback discard on
Jobs: 1 (f=1): [m(1)] [100.0% done] [103.7MB/26661KB/0KB /s] [19.2K/4899/0 iops] [eta 00m:00s]
read : io=4120.6MB, bw=140209KB/s, iops=19193, runt= 30094msec
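One thing I haven't ruled out for the slow librbd writeback numbers is the client-side cache settings in ceph.conf - in particular "rbd cache writethrough until flush", which keeps librbd in writethrough mode until the guest issues its first flush. The [client] section knobs look like this (values shown are the documented defaults as I understand them - verify against your version's docs):

```ini
[client]
# Enable the librbd client-side cache
rbd cache = true
# Safety default: behave as writethrough until the first guest flush
# is seen - this can mask writeback performance entirely
rbd cache writethrough until flush = true
# Cache sizing defaults (from memory - verify before relying on them)
rbd cache size = 33554432          # 32MB
rbd cache max dirty = 25165824     # 24MB
rbd cache target dirty = 16777216  # 16MB
```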
As expected, this is slower.
cache=writethrough discard on
Jobs: 1 (f=1): [m(1)] [100.0% done] [79111KB/21062KB/0KB /s] [11.8K/2985/0 iops] [eta 00m:00s]
read : io=2581.5MB, bw=88103KB/s, iops=10516, runt= 30003msec
The unsafe writeback option should be blisteringly quick - it isn't.
cache=writeback(unsafe)
Jobs: 1 (f=1): [m(1)] [100.0% done] [105.3MB/27021KB/0KB /s] [19.5K/4979/0 iops] [eta 00m:00s]
read : io=4171.9MB, bw=142364KB/s, iops=19570, runt= 30002msec
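For completeness, the cache mode in these librbd tests is just the cache= attribute on the libvirt disk definition - something along these lines (the pool/image name, monitor host, and auth secret are placeholders, not my actual config):

```xml
<disk type='network' device='disk'>
  <!-- cache= and discard= are the knobs varied in the tests above -->
  <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
  <source protocol='rbd' name='rbd/vm-disk'>
    <host name='mon1' port='6789'/>
  </source>
  <auth username='libvirt'>
    <secret type='ceph' usage='client.libvirt secret'/>
  </auth>
  <target dev='vda' bus='virtio'/>
</disk>
```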
Considering this is an all-SSD cluster, the write IOPS are pretty crap using librbd, so the kernel (KRBD) tests are next.
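For the KRBD runs, the image is mapped on the host and the resulting block device is passed to the VM, along these lines (pool/image names are placeholders):

```shell
# Map the image with the kernel RBD driver on the hypervisor
rbd map rbd/vm-disk        # creates e.g. /dev/rbd0
rbd showmapped             # confirm the mapping
# Then point the libvirt disk at the mapped device instead of
# using protocol='rbd', e.g.:
#   <disk type='block' device='disk'>
#     <driver name='qemu' type='raw' cache='none' discard='unmap'/>
#     <source dev='/dev/rbd0'/>
#     <target dev='vda' bus='virtio'/>
#   </disk>
```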
A pretty good improvement: with no cache, KRBD is already better than librbd with full caching.
cache=default (no cache) discard on
Jobs: 1 (f=1): [m(1)] [100.0% done] [118.2MB/30085KB/0KB /s] [25.9K/6352/0 iops] [eta 00m:00s]
read : io=5529.1MB, bw=188636KB/s, iops=28767, runt= 30019msec
Things start to get weird again. I would expect the next one to be a lot slower, but it's not - it's actually better.
cache=directsync discard on
Jobs: 1 (f=1): [m(1)] [100.0% done] [135.6MB/33683KB/0KB /s] [29.6K/7211/0 iops] [eta 00m:00s]
read : io=5519.4MB, bw=188352KB/s, iops=28700, runt= 30005msec
The next one shows the kind of jump in IOPS and throughput I would expect from caching - why is this not the case in the librbd tests?
cache=writeback discard on
Jobs: 1 (f=1): [m(1)] [100.0% done] [365.9MB/95586KB/0KB /s] [85.3K/21.4K/0 iops] [eta 00m:00s]
read : io=6553.9MB, bw=325181KB/s, iops=53259, runt= 20638msec
Tasty IOPS - and this is the caching mode recommended everywhere for Ceph RBD.
cache=writeback(unsafe) discard on
Jobs: 1 (f=1): [m(1)] [100.0% done] [457.2MB/115.4MB/0KB /s] [106K/26.5K/0 iops] [eta 00m:00s]
read : io=6553.9MB, bw=636786KB/s, iops=104295, runt= 10539msec
Roughly twice as fast, as you would expect from skipping flushes - certainly not worth losing your data for.
cache=none discard on
Jobs: 1 (f=1): [m(1)] [100.0% done] [134.6MB/34497KB/0KB /s] [29.4K/7266/0 iops] [eta 00m:00s]
read : io=5570.3MB, bw=190098KB/s, iops=29073, runt= 30005msec
This is better than all the librbd modes.
From these tests it appears that caching isn't doing anything when using librbd, and that the kernel RBD driver is absolutely screaming compared to librbd. I'd love to check iowait and CPU usage differences while running these tests - will do at some other time.
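When I get to the iowait/CPU comparison, a quick /proc/stat diff on the host should be enough for a rough number (the aggregate "cpu" line holds cumulative jiffies; the fifth value is iowait):

```shell
#!/bin/sh
# Sample host-wide iowait over a 1-second interval by diffing /proc/stat.
# The "cpu" line is: cpu user nice system idle iowait irq softirq ...
read -r cpu1 user1 nice1 sys1 idle1 iow1 rest1 < /proc/stat
sleep 1
read -r cpu2 user2 nice2 sys2 idle2 iow2 rest2 < /proc/stat
echo "iowait jiffies over interval: $((iow2 - iow1))"
```

For a per-process view of the qemu/fio threads, pidstat or iostat -x would give more detail, but the snippet above is enough to spot a gross iowait difference between the librbd and KRBD runs.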
Any experts got any thoughts on this?
Thanks!
Full FIO results --> http://pastebin.com/PyBWu6GV