CEPH problem

gosha

Hi!

My Ceph storage reports an error:

# ceph health
HEALTH_ERR 27 pgs are stuck inactive for more than 300 seconds; 7 pgs down; 27 pgs incomplete; 27 pgs stuck inactive; 27 pgs stuck unclean; 1 requests are blocked > 32 sec

root@acn2:~# ceph health detail
HEALTH_ERR 27 pgs are stuck inactive for more than 300 seconds; 7 pgs down; 27 pgs incomplete; 27 pgs stuck inactive; 27 pgs stuck unclean; 1 requests are blocked > 32 sec; 1 osds have slow requests
pg 1.c7 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.c1 is stuck inactive since forever, current state incomplete, last acting [3,0]
pg 1.c0 is stuck inactive since forever, current state incomplete, last acting [3,0]
pg 1.af is stuck inactive since forever, current state incomplete, last acting [3,0]
pg 1.dd is stuck inactive since forever, current state incomplete, last acting [3,0]
pg 1.b7 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.e4 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.64 is stuck inactive since forever, current state incomplete, last acting [3,0]
pg 1.5b is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.9e is stuck inactive since forever, current state down+incomplete, last acting [3,0]
pg 1.f7 is stuck inactive since forever, current state down+incomplete, last acting [3,0]
pg 1.3f is stuck inactive since forever, current state down+incomplete, last acting [3,0]
pg 1.e8 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.48 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.1b is stuck inactive since forever, current state incomplete, last acting [3,0]
pg 1.f is stuck inactive since forever, current state down+incomplete, last acting [3,0]
pg 1.24 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.d4 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.19 is stuck inactive since forever, current state down+incomplete, last acting [3,0]
pg 1.a3 is stuck inactive since forever, current state down+incomplete, last acting [3,0]
pg 1.7c is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.84 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.87 is stuck inactive since forever, current state down+incomplete, last acting [3,0]
pg 1.c9 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.6b is stuck inactive since forever, current state incomplete, last acting [3,0]
pg 1.98 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.a0 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.c7 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.c1 is stuck unclean since forever, current state incomplete, last acting [3,0]
pg 1.c0 is stuck unclean since forever, current state incomplete, last acting [3,0]
pg 1.af is stuck unclean since forever, current state incomplete, last acting [3,0]
pg 1.a3 is stuck unclean since forever, current state down+incomplete, last acting [3,0]
pg 1.a0 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.9e is stuck unclean since forever, current state down+incomplete, last acting [3,0]
pg 1.87 is stuck unclean since forever, current state down+incomplete, last acting [3,0]
pg 1.84 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.7c is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.24 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.19 is stuck unclean since forever, current state down+incomplete, last acting [3,0]
pg 1.3f is stuck unclean since forever, current state down+incomplete, last acting [3,0]
pg 1.1b is stuck unclean since forever, current state incomplete, last acting [3,0]
pg 1.48 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.64 is stuck unclean since forever, current state incomplete, last acting [3,0]
pg 1.5b is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.98 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.6b is stuck unclean since forever, current state incomplete, last acting [3,0]
pg 1.c9 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.d4 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.dd is stuck unclean since forever, current state incomplete, last acting [3,0]
pg 1.b7 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.e4 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.e8 is stuck unclean since forever, current state incomplete, last acting [0,3]
pg 1.f is stuck unclean since forever, current state down+incomplete, last acting [3,0]
pg 1.f7 is stuck unclean since forever, current state down+incomplete, last acting [3,0]
pg 1.f7 is down+incomplete, acting [3,0]
pg 1.f is down+incomplete, acting [3,0]
pg 1.e8 is incomplete, acting [0,3]
pg 1.e4 is incomplete, acting [0,3]
pg 1.dd is incomplete, acting [3,0]
pg 1.d4 is incomplete, acting [0,3]
pg 1.c9 is incomplete, acting [0,3]
pg 1.5b is incomplete, acting [0,3]
pg 1.64 is incomplete, acting [3,0]
pg 1.48 is incomplete, acting [0,3]
pg 1.3f is down+incomplete, acting [3,0]
pg 1.1b is incomplete, acting [3,0]
pg 1.19 is down+incomplete, acting [3,0]
pg 1.24 is incomplete, acting [0,3]
pg 1.6b is incomplete, acting [3,0]
pg 1.7c is incomplete, acting [0,3]
pg 1.84 is incomplete, acting [0,3]
pg 1.87 is down+incomplete, acting [3,0]
pg 1.98 is incomplete, acting [0,3]
pg 1.9e is down+incomplete, acting [3,0]
pg 1.a0 is incomplete, acting [0,3]
pg 1.a3 is down+incomplete, acting [3,0]
pg 1.af is incomplete, acting [3,0]
pg 1.b7 is incomplete, acting [0,3]
pg 1.c0 is incomplete, acting [3,0]
pg 1.c1 is incomplete, acting [3,0]
pg 1.c7 is incomplete, acting [0,3]
1 ops are blocked > 16777.2 sec on osd.3
1 osds have slow requests
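
The per-PG summary lines at the end of the output above can be grouped mechanically to see which states and OSDs are involved. A minimal Python sketch, assuming the output has been saved as plain text; the function name summarize_pgs and the shortened sample string are made up for illustration and are not part of Ceph:

```python
import re
from collections import defaultdict

# Matches only the short summary lines, e.g. "pg 1.c7 is incomplete, acting [0,3]".
# The longer "is stuck inactive since forever, ..." lines do not match.
PG_LINE = re.compile(r"^pg (\S+) is ([a-z+]+), acting \[([\d,]+)\]$")


def summarize_pgs(health_detail):
    """Group PG ids by (state, acting set) from `ceph health detail` text."""
    groups = defaultdict(list)
    for line in health_detail.splitlines():
        match = PG_LINE.match(line.strip())
        if match:
            pgid, state, acting = match.groups()
            osds = tuple(int(x) for x in acting.split(","))
            groups[(state, osds)].append(pgid)
    return dict(groups)


# Shortened sample taken from the output above.
sample = """\
pg 1.c7 is stuck inactive since forever, current state incomplete, last acting [0,3]
pg 1.f7 is down+incomplete, acting [3,0]
pg 1.f is down+incomplete, acting [3,0]
pg 1.e8 is incomplete, acting [0,3]
1 ops are blocked > 16777.2 sec on osd.3
"""

summary = summarize_pgs(sample)
for (state, osds), pgids in sorted(summary.items()):
    print(state, osds, pgids)
```

Applied to the full output above, this should give three groups: 13 incomplete PGs with acting [0,3], 7 incomplete PGs with acting [3,0], and 7 down+incomplete PGs with acting [3,0]. Every affected PG is placed on osd.0 and osd.3 only.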

# ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 5.41745 root default
-2 2.70428     host acn1
 1 0.89999         osd.1       up  1.00000          1.00000
 2 0.89999         osd.2       up  1.00000          1.00000
 0 0.90430         osd.0       up  1.00000          1.00000
-3 2.71317     host acn2
 3 0.90439         osd.3       up  1.00000          1.00000
 4 0.90439         osd.4       up  1.00000          1.00000
 5 0.90439         osd.5       up  1.00000          1.00000

The cluster has three monitors.
mon.0 and mon.1 each run on a host with three OSDs.
mon.2 is for quorum only (no OSDs).

# ceph version
ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)

How can I fix this error?

Best regards,
Gosha
 
Are you running just two OSD hosts?

You should run at least 3 nodes and use a replication of 3. If you do not follow this, problems are to be expected.
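
To illustrate why three nodes with a replication of 3 are more forgiving, here is a rough Python sketch of the arithmetic. It is a simplified model, not Ceph code: it assumes one replica per host (CRUSH failure domain "host"), that a PG serves I/O only while at least min_size replicas are available, and it ignores recovery and backfill. The function name and the min_size values are illustrative assumptions; the thread does not state the pool's min_size.

```python
def after_host_failure(size, min_size, hosts, failed_hosts=1):
    """Worst-case replica count per PG after losing `failed_hosts` hosts.

    Assumes at most one replica per host, and that every failed host
    held a replica of the PG in question (the worst case).
    """
    placed = min(size, hosts)                 # replicas that could be placed at all
    remaining = max(placed - failed_hosts, 0)
    return {
        "remaining_replicas": remaining,
        "io_available": remaining >= min_size,  # the PG can still go active
        "still_redundant": remaining >= 2,      # more than one copy is left
    }


# Two hosts, replication of 2: one host down leaves a single copy.
print(after_host_failure(size=2, min_size=1, hosts=2))
print(after_host_failure(size=2, min_size=2, hosts=2))

# Three hosts, replication of 3, min_size 2: one host down is tolerated.
print(after_host_failure(size=3, min_size=2, hosts=3))
```

With two hosts and a replication of 2, losing one host either blocks I/O (min_size 2) or leaves the cluster writing to a single copy (min_size 1). The current values can be read with `ceph osd pool get <pool> size` and `ceph osd pool get <pool> min_size`.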
 

Yes, two OSD hosts and a replication of 2.
Does this mean that the problem cannot be solved with my configuration?

Best regards,
Gosha
 
It means that your data or configuration could break in some cases. It should be possible to recover; check the Ceph troubleshooting docs (ceph.com).

After that, make sure you have 3 nodes and a replication of 3.
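
As a starting point for that troubleshooting, a small Python sketch that only builds the read-only diagnostic command lines for a few of the stuck PGs; it does not run anything. `ceph pg dump_stuck inactive` and `ceph pg <pgid> query` are standard Ceph commands, while the helper name pg_query_commands and the choice of PG ids (taken from the health output in the first post) are illustrative.

```python
import shlex


def pg_query_commands(pgids):
    """Return one read-only `ceph pg <pgid> query` command line per PG id."""
    return [shlex.join(["ceph", "pg", pgid, "query"]) for pgid in pgids]


# A few of the down+incomplete PGs reported by `ceph health detail`.
stuck = ["1.f7", "1.f", "1.9e"]

commands = ["ceph pg dump_stuck inactive"] + pg_query_commands(stuck)
for cmd in commands:
    print(cmd)
```

The query output of an incomplete PG normally shows its peering state, which is where to look for the reason it cannot become active.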
 
