We were running all of our storage and LAN interfaces over a Dell PowerConnect, but wanted to move Ceph onto its own switch, so we got a Netgear smartplus JGS524E and installed it on the 6th. Starting at 10:33 am, seemingly at random, I began getting "no reply" heartbeat errors for osd.5, which is on my pm04 node. When I look at the node, it shows Ethernet 3 (eth3) going up and down.

For now, should I take eth3 out of the bond? This is disrupting Ceph, and the constant healing seems to be starting to affect the other OSDs. I am using round-robin on this bond and on all the other nodes. Should I just use active-backup on this node, and will that cause problems with the other nodes, which are all still using round-robin?
May 12 10:18:29 PM01 systemd-timesyncd[1655]: interval/delta/delay/jitter/drift 2048s/+0.002s/0.034s/0.005s/-26ppm
May 12 10:33:55 PM01 bash[12869]: 2017-05-12 10:33:55.850942 7f19ba912700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:55.226460 (cutoff 2017-05-12 10:33:35.850940)
May 12 10:33:56 PM01 bash[12869]: 2017-05-12 10:33:56.851119 7f19ba912700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:55.226460 (cutoff 2017-05-12 10:33:36.851116)
May 12 10:33:57 PM01 bash[12869]: 2017-05-12 10:33:57.526970 7f19a00bd700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:55.226460 (cutoff 2017-05-12 10:33:37.526969)
May 12 10:33:57 PM01 bash[12869]: 2017-05-12 10:33:57.851371 7f19ba912700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:55.226460 (cutoff 2017-05-12 10:33:37.851368)
May 12 10:33:58 PM01 bash[12869]: 2017-05-12 10:33:58.851575 7f19ba912700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:57.526822 (cutoff 2017-05-12 10:33:38.851573)
May 12 10:33:59 PM01 bash[12869]: 2017-05-12 10:33:59.827420 7f19a00bd700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:57.526822 (cutoff 2017-05-12 10:33:39.827418)
May 12 10:33:59 PM01 bash[12869]: 2017-05-12 10:33:59.851730 7f19ba912700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:59.827222 (cutoff 2017-05-12 10:33:39.851728)
May 12 10:34:00 PM01 bash[12869]: 2017-05-12 10:34:00.851904 7f19ba912700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:59.827222 (cutoff 2017-05-12 10:33:40.851901)
May 12 10:34:01 PM01 bash[12869]: 2017-05-12 10:34:01.852117 7f19ba912700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:59.827222 (cutoff 2017-05-12 10:33:41.852114)
May 12 10:34:02 PM01 bash[12869]: 2017-05-12 10:34:02.127829 7f19a00bd700 -1 osd.3 2210 heartbeat_check: no reply from osd.5 since back 2017-05-12 10:33:35.424086 front 2017-05-12 10:33:59.827222 (cutoff 2017-05-12 10:33:42.127828)
May 12 10:34:02 PM01 bash[12869]: 2017-0
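
To see which OSD is reporting which peer, and when the failures began, log lines like the ones above could be tallied with a short script. This is only a rough sketch: the regular expression is inferred from the excerpt in this post rather than from any Ceph specification, and the function name `summarize_heartbeat_failures` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the heartbeat_check lines in the excerpt above
# (an assumption, not an official Ceph log format definition).
HEARTBEAT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+ -1 "
    r"osd\.(?P<reporter>\d+) \d+ heartbeat_check: "
    r"no reply from osd\.(?P<target>\d+)"
)


def summarize_heartbeat_failures(lines):
    """Count 'no reply' reports per (reporting OSD, silent OSD) pair.

    Returns (counts, first_seen), where first_seen maps each pair to the
    timestamp string of its first report. Lines that do not match, such as
    truncated ones, are skipped.
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        m = HEARTBEAT_RE.search(line)
        if m is None:
            continue
        pair = (int(m["reporter"]), int(m["target"]))
        counts[pair] += 1
        first_seen.setdefault(pair, m["ts"])
    return counts, first_seen


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): pipe the node's syslog or journal output in on stdin.
    counts, first_seen = summarize_heartbeat_failures(sys.stdin)
    for (reporter, target), n in sorted(counts.items()):
        print(f"osd.{reporter} -> osd.{target}: {n} reports, "
              f"first at {first_seen[(reporter, target)]}")
```

Running this against the logs from each node would show whether only osd.5 is being reported as silent or whether other OSDs are starting to drop out as well.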