Ceph - lost access to VMs after recovery.

dpecile

Member
Nov 11, 2020
Hi
I have 3 nodes in a cluster.

After removing an OSD to get more speed (to use only SSDs), and then adding it back because of lack of space, I ended up with this error:


cluster:
id: XX
health: HEALTH_WARN
Reduced data availability: 12 pgs inactive
220 slow ops, oldest one blocked for 8642 sec, daemons [osd.0,osd.1,osd.2,osd.3,osd.5,mon.nube1,mon.nube2] have slow ops.

services:
mon: 3 daemons, quorum nube1,nube5,nube2 (age 56m)
mgr: nube1(active, since 57m)
osd: 6 osds: 6 up (since 55m), 6 in (since 6h)

data:
pools: 3 pools, 257 pgs
objects: 327.42k objects, 1.2 TiB
usage: 3.7 TiB used, 1.7 TiB / 5.3 TiB avail
pgs: 4.669% pgs unknown
245 active+clean
12 unknown

The cluster is very slow, and the VM disks are apparently locked.

When I start a VM, it hangs after the BIOS splash.
I tried mounting the disk from a recovery CD inside the VM, same story.

Any idea where to start solving the problem?

I have backups from yesterday morning, but I need to recover the last day of data.

Thanks

Demian
 
OK, I restored the 1-day-old backups on another Proxmox host without Ceph.
But now the Ceph nodes are unusable.
Any idea how to restore the nodes without completely reformatting them?
I still have some hope of restoring access to the old VM disks.

Thanks
 
Have you tried restarting the monitors?
Once they have been restarted, restart the OSDs.
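For example, something like this (assuming the standard systemd units on a Proxmox Ceph node; adjust the hostname and OSD IDs to match yours):
Code:
# restart the monitor on a node (the unit name matches the node's hostname)
systemctl restart ceph-mon@nube1.service

# then restart the OSDs running on that node, e.g. osd.0 and osd.1
systemctl restart ceph-osd@0.service
systemctl restart ceph-osd@1.service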

Please provide the output of the following command: pveversion -v
And also provide the latest ceph log (/var/log/ceph/ceph.log).
 
Hi Mira
Thanks for the answer.

I restarted the nodes, no success.

Below are the results of pveversion -v.

The log is huge, because it has a lot of cluster [WRN] slow request osd_op lines (22,811,414 of them) after my rebalance...

Code:
proxmox-ve: 6.4-1 (running kernel: 5.4.143-1-pve)
pve-manager: 6.4-13 (running version: 6.4-13/9f411e79)
pve-kernel-helper: 6.4-8
pve-kernel-5.4: 6.4-7
pve-kernel-5.4.143-1-pve: 5.4.143-1
pve-kernel-5.4.128-1-pve: 5.4.128-2
pve-kernel-5.4.114-1-pve: 5.4.114-1
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 15.2.15-pve1~bpo10
ceph-fuse: 15.2.15-pve1~bpo10
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve4~bpo10
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve1~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-4
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.6-pve1~bpo10+1
 
You could zip it and/or upload it somewhere and provide a download link if it is still too big to attach here.
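For example (a sketch; adjust the output path to wherever you have space):
Code:
# compress a copy of the log without touching the original
gzip -c /var/log/ceph/ceph.log > /root/ceph.log.gz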
 
Hi Mira
The problem started on November 5.

Below are the sizes.

-rw------- 1 ceph ceph 6.7G Nov 8 08:04 ceph.log
-rw------- 1 ceph ceph 1.5G Nov 8 00:00 ceph.log.1.gz
-rw------- 1 ceph ceph 1.1G Nov 6 23:59 ceph.log.2.gz
-rw------- 1 ceph ceph 1.5G Nov 5 23:59 ceph.log.3.gz
-rw------- 1 ceph ceph 20M Nov 4 23:59 ceph.log.4.gz
-rw------- 1 ceph ceph 910K Nov 3 23:59 ceph.log.5.gz

I will try to trim the one from that day, removing the slow-request messages, and attach it if possible.
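Something along these lines should do it (a sketch; the match pattern and output path are just examples):
Code:
# take the rotated log from Nov 5 and drop the slow-request noise
zcat /var/log/ceph/ceph.log.3.gz | grep -v 'slow request' > /root/ceph-nov5-trimmed.log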

Thanks

Demian
 
Could you provide the output of ip -details -statistics a for all nodes?
After rebooting the nodes, did you wait until everything was synced up again?

If some OSDs don't start, please also provide the logs for those. You can find them on each node under /var/log/ceph/ceph-osd.<num>.log.
And for a better overview of the current state of the cluster, please provide the current ceph.log and the output of ceph -s.
 
Hi Mira.

The nodes are synced.

The OSDs are running.

root@nube5:~# ceph -s
cluster:
id: 109ca7a2-7e4c-41b7-8693-398a7170f630
health: HEALTH_WARN
Reduced data availability: 12 pgs inactive
124 slow ops, oldest one blocked for 4642 sec, daemons [osd.0,osd.2,osd.3,osd.5,mon.nube1,mon.nube2,mon.nube5] have slow ops.

services:
mon: 3 daemons, quorum nube1,nube5,nube2 (age 98m)
mgr: nube1(active, since 2h)
osd: 6 osds: 6 up (since 98m), 6 in (since 3d)

data:
pools: 3 pools, 257 pgs
objects: 327.42k objects, 1.2 TiB
usage: 3.7 TiB used, 1.7 TiB / 5.3 TiB avail
pgs: 4.669% pgs unknown
245 active+clean
12 unknown

Current ceph.log uploaded here: ceph.log

Output of ip -details -statistics a uploaded here: ip details

Thanks

Demian
 
Thank you for the logs.
It seems the slow ops are slowly being reduced, but the 12 inactive PGs are still there.
Could you run the following commands?
Code:
ceph pg 2.1 query
ceph pg 2.3 query
ceph pg 2.9 query
ceph pg 2.35 query
ceph pg 2.37 query
ceph pg 2.51 query
ceph pg 2.53 query
ceph pg 2.54 query
ceph pg 2.5d query
ceph pg 2.6e query
ceph pg 2.77 query
ceph pg 4.0 query
 
Hi Mira

root@nube1:~# ceph pg 2.1 query
Error ENOENT: i don't have pgid 2.1
root@nube1:~# ceph pg 2.3 query
Error ENOENT: i don't have pgid 2.3
root@nube1:~# ceph pg 2.9 query
Error ENOENT: i don't have pgid 2.9
root@nube1:~# ceph pg 2.35 query
Error ENOENT: i don't have pgid 2.35
root@nube1:~# ceph pg 2.37 query
Error ENOENT: i don't have pgid 2.37
root@nube1:~# ceph pg 2.51 query
Error ENOENT: i don't have pgid 2.51
root@nube1:~# ceph pg 2.53 query
Error ENOENT: i don't have pgid 2.53
root@nube1:~# ceph pg 2.54 query
Error ENOENT: i don't have pgid 2.54
root@nube1:~# ceph pg 2.5d query
Error ENOENT: i don't have pgid 2.5d
root@nube1:~# ceph pg 2.6e query
Error ENOENT: i don't have pgid 2.6e
root@nube1:~# ceph pg 2.77 query
Error ENOENT: i don't have pgid 2.77
root@nube1:~# ceph pg 4.0 query
Error ENOENT: i don't have pgid 4.0

Regards

Demian
 
Hi Mira
Sure, I read some documentation. I am afraid of losing all the data, which is why I am looking for some advice.

pg 2.1 is stuck inactive for 22h, current state unknown, last acting []
pg 2.3 is stuck inactive for 22h, current state unknown, last acting []
pg 2.9 is stuck inactive for 22h, current state unknown, last acting []
pg 2.35 is stuck inactive for 22h, current state unknown, last acting []
pg 2.37 is stuck inactive for 22h, current state unknown, last acting []
pg 2.51 is stuck inactive for 22h, current state unknown, last acting []
pg 2.53 is stuck inactive for 22h, current state unknown, last acting []
pg 2.54 is stuck inactive for 22h, current state unknown, last acting []
pg 2.5d is stuck inactive for 22h, current state unknown, last acting []
pg 2.6e is stuck inactive for 22h, current state unknown, last acting []
pg 2.77 is stuck inactive for 22h, current state unknown, last acting []
pg 4.0 is stuck inactive for 22h, current state unknown, last acting []

I have the output of ceph health detail from before the node reboot.

2.6e stale+undersized+degraded+peered [5] 5 [5] 5
2.5d stale+undersized+degraded+peered [5] 5 [5] 5
2.53 stale+undersized+degraded+peered [5] 5 [5] 5
2.77 stale+undersized+degraded+peered [5] 5 [5] 5
2.51 stale+undersized+degraded+peered [5] 5 [5] 5
2.37 stale+undersized+degraded+peered [5] 5 [5] 5
2.9 stale+undersized+degraded+peered [5] 5 [5] 5
2.35 stale+undersized+degraded+peered [5] 5 [5] 5
4.0 stale+undersized+degraded+peered [5] 5 [5] 5
2.1 stale+undersized+degraded+peered [5] 5 [5] 5
2.54 stale+undersized+degraded+peered [5] 5 [5] 5
2.3 stale+undersized+degraded+peered [5] 5 [5] 5

As I thought the info was on node 3, I restarted it, and then the stale+undersized+degraded+peered PGs became unknown.

Thanks and regards.

Demian
 
Hi Mira
Any idea how to format the OSDs without reinstalling?
The information is too old now to be worth recovering.
The nodes have been in the same state for a week...
 
OK, I force-created the missing PGs, and the nodes started to work again...
Now it is rebalancing, so I have no idea what information is missing.

I used the command ceph osd force-create-pg.
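Something along these lines, using the PG IDs from the list above (a sketch; on this version the command may also ask for the --yes-i-really-mean-it confirmation flag):
Code:
# recreate each PG that was stuck in the unknown state
# note: this creates the PGs empty, so any data that only lived in them is lost
for pg in 2.1 2.3 2.9 2.35 2.37 2.51 2.53 2.54 2.5d 2.6e 2.77 4.0; do
    ceph osd force-create-pg $pg --yes-i-really-mean-it
done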
 
Sorry for the late reply.

How is the cluster now? Is everything up and running again?
 
Hi Mira, I had to delete all the VM disks, but yes, it is up and running.
I am waiting to harden it a little before putting it into production again.
I am changing from bridge networking to two switches, and adding two more nodes, for a total of 5.
 