Have any other Ceph users noticed weirdness with the performance graphs, where the read or write rate
does not seem to reflect the real situation? Mine currently shows this, and I think it's a bit off.
Specifically the reads: for roughly 50 VMs this looks strange.
One thing to note: it started after cloning a VM disk added a bit too much load, but it corrected itself once I stopped the clone (I'll do more research on this later on):
Code:
2019-09-11 14:44:25.523765 osd.57 osd.57 172.16.50.139:6800/4603 1883 : cluster [WRN] 23 slow requests, 5 included below; oldest blocked for > 30.008303 secs
2019-09-11 14:44:25.523775 osd.57 osd.57 172.16.50.139:6800/4603 1884 : cluster [WRN] slow request 30.007747 seconds old, received at 2019-09-11 14:43:55.515871: osd_op(client.25082543.0:2315 2.82b9d13 2:c8b9d410:::rbd_object_map.7f2be46b8b4567:head [call lock.assert_locked,call rbd.object_map_update] snapc 0=[] ack+ondisk+write+known_if_redirected e1139) currently waiting for rw locks
2019-09-11 14:44:25.523779 osd.57 osd.57 172.16.50.139:6800/4603 1885 : cluster [WRN] slow request 30.007696 seconds old, received at 2019-09-11 14:43:55.515923: osd_op(client.25082543.0:2316 2.82b9d13 2:c8b9d410:::rbd_object_map.7f2be46b8b4567:head [call lock.assert_locked,call rbd.object_map_update] snapc 0=[] ack+ondisk+write+known_if_redirected e1139) currently waiting for rw locks
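For when I dig into this later, here is the rough checklist of commands I plan to run to see where those slow requests pile up. This is only a sketch of the usual diagnostics, nothing specific to this issue; osd.57 is taken from the log above, so adjust for your own cluster, and run the daemon commands on the host that carries that OSD.
Code:
# Overall cluster health, including any current slow request warnings
ceph health detail

# Ops currently in flight on the OSD that reported the slow requests
ceph daemon osd.57 dump_ops_in_flight

# Recently completed ops with their durations, to spot the slow ones
ceph daemon osd.57 dump_historic_ops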
Interestingly, I don't see any issues with reads or writes inside the VMs, or any other signs of storage problems. The network monitor also shows traffic flowing both ways at roughly the rates I'd expect between the Ceph and hypervisor nodes.
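To cross-check whether the graph or the cluster is "right", I'd compare it against what Ceph itself reports as client I/O. A minimal sketch of that comparison (plain stock commands, nothing assumed beyond a working ceph CLI):
Code:
# Cluster status; the bottom lines include current client read/write throughput and IOPS
ceph -s

# Per-pool client I/O rates, to see which pool the reads actually hit
ceph osd pool stats

# Commit/apply latency per OSD, to rule out a single struggling disk
ceph osd perf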
At the moment I'm thinking about how to proceed with this. I need to run updates on both the Ceph and hypervisor nodes anyway, so possibly the reboots after that will fix it...
Any comments?
----------------------------------------------------------------------------
Update: well, everything seems to be working. Maybe the reads really are that periodic and low, or those stats are just off...
Also, one VM that was collecting and analyzing logs, and which was generating the majority of the reads, has pretty much been moved to legacy status in our systems. Possibly I had a minor heart attack over nothing...
Either way: solved, probably, as everything seems to work fine.