CEPH problem after upgrade to 5.1 / slow requests + stuck request

As long as you have inactive (activating) PGs, your cluster will be useless.
My cluster is now fixed and we are testing IO operations on it. I will post the solution, but I need to translate my notes first ;)
I'll be back with a reply.
 
I had a similar problem. I have Intel 10Gb cards. I had all sorts of slow requests.

So I decided to upgrade my network to InfiniBand. I purchased some Mellanox cards and 10Gb adapters. I installed the new cards and started using the 10Gb adapters with my existing 10Gb fiber switch, and all my slow requests went away. I've been running for two weeks without any issues. Before this I had slow requests multiple times a day!

So my suspicion is that the Intel 10Gb driver has issues... For months I tried to tweak the driver. It's nice to see everything working as expected.

I have a Mellanox switch, and I'm planning on migrating Ceph traffic to it.
 
I don't know the exact solution, but all problematic PGs got back onto a normal recovery track when I restarted the second monitor. I also have too many PGs per OSD, and maybe that was the reason. I set mon_max_pg_per_osd high enough for my current setup, and after the second monitor restart everything got back on track. ceph -s still complains about too many PGs. Previously I had done full node restarts without any success. I had also set the optimal tunables, which caused a major rebalance and peering issues.
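For reference, this is roughly what raising the limit looks like on a Proxmox node; the value 400 and the mon/node ids are only examples (anything above your actual PGs-per-OSD count works), and the change has to reach the monitors and the manager before the warning goes away:
Code:
# /etc/pve/ceph.conf -- add under [global]
[global]
    mon_max_pg_per_osd = 400

# then restart the monitors and the manager, one node at a time
systemctl restart ceph-mon@<mon-id>.service
systemctl restart ceph-mgr@<node>.service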
 
@black4: I upgraded from Hammer to Jewel about a month ago, and from Jewel to Luminous (PVE 4.4) on Wednesday; it worked fine for a few days. The issues started after the upgrade from PVE 4.4 to 5.1. I must admit that when I was upgrading PVE to 5.1, the Ceph health status was already WARN, but only with "too many PGs per OSD (323 > max 200)".

I did pretty much the same as @black4, except I didn't have the "too many PGs" message.

My troubles started when I executed this:
Code:
ceph osd crush tunables optimal

After that, only ~610 PGs stayed active; the other ~5000 went to activating+remapped. I changed the tunables back to hammer:
Code:
ceph osd crush tunables hammer

And now I have this:

Code:
 cluster:
    id:     089d3673-5607-404d-9351-2d4004043966
    health: HEALTH_ERR
            Reduced data availability: 127 pgs inactive
            Degraded data redundancy: 127 pgs unclean, 15 pgs degraded
            417 slow requests are blocked > 32 sec
            184 stuck requests are blocked > 4096 sec

  services:
    mon: 3 daemons, quorum 2,1,0
    mgr: tw-dwt-prx-05(active), standbys: tw-dwt-prx-03, tw-dwt-prx-07
    osd: 92 osds: 92 up, 92 in; 117 remapped pgs

  data:
    pools:   3 pools, 6144 pgs
    objects: 1412k objects, 5645 GB
    usage:   16969 GB used, 264 TB / 280 TB avail
    pgs:     2.067% pgs not active
             6017 active+clean
             112  activating+remapped
             10   activating+degraded
             5    activating+degraded+remapped
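In case it helps anyone in the same state, the standard Luminous tools show which PGs are stuck and why (the PG id below is just a placeholder):
Code:
ceph osd crush show-tunables        # confirm which tunables profile is actually in effect
ceph health detail                  # lists the inactive PGs and the OSDs with blocked requests
ceph pg dump_stuck inactive         # the PGs stuck in activating/remapped
ceph pg <pgid> query                # per-PG peering state for one of the stuck PGs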
 
Just a small update: mon_max_pg_per_osd was picked up by the monitors, but ceph status still complained about it until I restarted the active mgr daemon.
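For anyone wondering what that involves, finding and restarting the active manager is just (the node name is an example):
Code:
ceph mgr dump | grep active_name            # shows which node runs the active mgr
systemctl restart ceph-mgr@<node>.service   # run on that node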
 
I eventually fixed it by creating new pools within the current best-practice recommendations, migrating the data, and deleting the old ones. It still involved more than 60 hours of data movement.
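For reference, the usual sizing rule is roughly 100 PGs per OSD: with 92 OSDs and 3 replicas that is 92 x 100 / 3 ≈ 3072 PGs in total across all pools, i.e. well below the old 6144. A minimal sketch of the recreate-and-migrate approach, with example pool names (mon_allow_pool_delete must be enabled before the final delete):
Code:
ceph osd pool create rbd-new 2048 2048 replicated
ceph osd pool application enable rbd-new rbd
# add rbd-new as RBD storage in Proxmox, then move each VM disk:
qm move_disk <vmid> <disk> rbd-new          # e.g. qm move_disk 100 scsi0 rbd-new
# once everything is migrated and verified:
ceph osd pool delete rbd-old rbd-old --yes-i-really-really-mean-it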
 
I had a similar problem. I have Intel 10Gb cards. I had all sorts of slow requests.

So I decided to upgrade my network to InfiniBand. I purchased some Mellanox cards and 10Gb adapters. I installed the new cards and started using the 10Gb adapters with my existing 10Gb fiber switch, and all my slow requests went away. I've been running for two weeks without any issues. Before this I had slow requests multiple times a day!

So my suspicion is that the Intel 10Gb driver has issues... For months I tried to tweak the driver. It's nice to see everything working as expected.

I have a Mellanox switch, and I'm planning on migrating Ceph traffic to it.

Hello there,
apologies for replying to an old post.

We have a similar issue, and are currently using Intel 10G switches.

I do have 2 Mellanox switches and 10G cards in the storage room.

My question: did using Mellanox fix the issues for you?
 
Yes, just using the Mellanox cards in Ethernet mode fixed the problem for me. I did move to the Mellanox switches for faster connections. Since I made the switch I have seen no slow requests. Ceph works great now.
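In case it is useful to anyone else: on the dual-protocol (VPI) ConnectX-3 models the port protocol can be forced to Ethernet with Mellanox's MFT tools; the device path below is whatever mst status reports on your box:
Code:
mst start
mst status                                            # shows the /dev/mst/... device for the card
mlxconfig -d /dev/mst/mt4099_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2   # 2 = Ethernet, 1 = InfiniBand
# reboot (or reload the mlx4 modules) for the new port type to take effect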
 
Yes, just using the Mellanox cards in Ethernet mode fixed the problem for me. I did move to the Mellanox switches for faster connections. Since I made the switch I have seen no slow requests. Ceph works great now.

Hello,
which model Mellanox cards are you using?

We've got to buy a few more and I'm trying to get a list of which ones are good to get.
 
I'm using ConnectX-3 cards. Note that MCX314A-BCBT are Ethernet cards. If you want to run InfiniBand you need MCX354A-FCBT.
 
I'm using ConnectX-3 cards. Note that MCX314A-BCBT are Ethernet cards. If you want to run InfiniBand you need MCX354A-FCBT.
Thank you for the info.

I found 8 InfiniBand ConnectX (first generation) cards in storage. We'll start with those. We don't have a huge amount of data getting transferred, so hopefully ConnectX-1 works OK; otherwise we'll upgrade.
 
I'm using ConnectX-3 cards. Note that MCX314A-BCBT are Ethernet cards. If you want to run InfiniBand you need MCX354A-FCBT.

Hello. We run the following from a cron job daily to check for Ceph issues.

In the crontab:
Code:
# .----------------- minute (0 - 59)
# | .-------------- hour (0 - 23)
# | | .---------- day of month (1 - 31)
# | | |   .------- month (1 - 12) OR jan,feb,mar,apr ...
# | | |   |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# | | |   |  |
59 23 *   *  * root    grep "slow requests are blocked"   /var/log/ceph/ceph.log
                               # Note: we rotate the logs at 00:00. If you use the Debian default logrotate schedule, adjust the minute and hour accordingly.

On average we get one instance of that happening per day.

Do you mind trying that on your cluster? Not necessarily the cron job, just this, if you could:
Code:
grep "slow requests are blocked"   /var/log/ceph/ceph.log

Before switching to IB, I'd like to see whether others with a good IB setup see the same issue.


Thanks for the help. You've done a lot already, so no need for more if you can't get to it.
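When the grep does turn something up, the blocked requests can also be inspected live on the OSD named in the log line, via its admin socket (osd.12 is just an example id; run this on the node hosting that OSD):
Code:
ceph daemon osd.12 dump_blocked_ops     # requests currently blocked, with how long and on what
ceph daemon osd.12 dump_historic_ops    # recently completed slow ops, with per-step timings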
 
We switched over to Mellanox cards yesterday. Time will tell if the issue is fixed.

Looks good so far.

Update (2/15/19): using the Mellanox cards totally solved our issues.
 
