Yes, just using the Mellanox cards in Ethernet mode fixed the problem for me. I also moved to Mellanox switches for faster connections. Since making that change I have seen no slow reads. Ceph works great now.
I had a similar problem. I have Intel 10gb cards and was getting all sorts of slow requests.
So I decided to upgrade my network to InfiniBand. I purchased some Mellanox cards and 10gb adapters. I installed the new cards and started using the 10gb adapters with my existing 10gb fiber switch, and all my slow...
I found the following worked well for me:
ceph tell 'osd.*' injectargs '--osd-max-backfills 16'
ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
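Note that injectargs only changes the values on the running daemons; they revert on restart. If you want them to persist, a minimal sketch (assuming the usual /etc/ceph/ceph.conf layout, using the same values as the commands above) is to add them to the [osd] section on each node:

[osd]
osd max backfills = 16
osd recovery max active = 4

The OSDs then pick these up on their next restart.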
You have to make sure all the machines in your Ceph cluster are running the latest code. The previous build was a development branch and did not support nodes running different versions. The latest version is a release candidate (RC).
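One quick sanity check (assuming the standard ceph CLI is available on every node) is to compare the version strings:

ceph --version            # run on each node
ceph tell osd.0 version   # or ask a specific running daemon

If any node reports a different build, upgrade it before going further.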
I was bitten by this yesterday too. Had to shut everything down and...
It's not very "predictable" when everything gets renamed!
I added the following to /etc/default/grub:
GRUB_CMDLINE_LINUX="net.ifnames=0"
You need to run update-grub2, then reboot.
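On Proxmox (Debian-based) that's:

update-grub2
reboot
cat /proc/cmdline | grep net.ifnames   # after the reboot, confirm the flag took effect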
Now you can create a file /etc/udev/rules.d/70-persistent-net.rules:
SUBSYSTEM=="net", ACTION=="add"...
Anyone still seeing this (pve-manager/3.3-5/bfebec03, running kernel: 2.6.32-33-pve)? I have had my cluster of four identical nodes go down 4 times in the past week with this type of error. I'm using a managed 10gb switch. I have not found a way to make this reproducible. It seems to be happening at...