OSDs down on reboot and "Failed to update device symlinks: Too many levels of symbolic links" after 6.4 to 7 update

Mikepop

Well-Known Member
Feb 6, 2018
Hello, since the update to version 7 we see these errors on every Ceph node:

Code:
Jul 20 08:27:14 int101 systemd-udevd[4773]: sdb2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:14 int101 systemd-udevd[4822]: sdd2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:14 int101 systemd-udevd[4970]: sde2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:14 int101 systemd-udevd[4770]: sdf2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4763]: sdc2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4764]: sdb2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4774]: sdh2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4773]: sdd2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4822]: sde2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4770]: sdc2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4764]: sdb2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4763]: sdh2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4773]: sdd2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4822]: sde2: Failed to update device symlinks: Too many levels of symbolic links
Jul 20 08:27:15 int101 systemd-udevd[4770]: sdc2: Failed to update device symlinks: Too many levels of symbolic links

However, I cannot find any symlink in /dev...


Code:
root@int101:/dev# ls -la sd*
brw-rw---- 1 root disk 8,   0 Jul 20 08:26 sda
brw-rw---- 1 root disk 8,  16 Jul 20 08:26 sdb
brw-rw---- 1 root disk 8,  17 Jul 20 08:26 sdb1
brw-rw---- 1 ceph ceph 8,  18 Jul 20 11:27 sdb2
brw-rw---- 1 root disk 8,  32 Jul 20 08:26 sdc
brw-rw---- 1 root disk 8,  33 Jul 20 08:26 sdc1
brw-rw---- 1 ceph ceph 8,  34 Jul 20 11:27 sdc2
brw-rw---- 1 root disk 8,  48 Jul 20 08:26 sdd
brw-rw---- 1 root disk 8,  49 Jul 20 08:26 sdd1
brw-rw---- 1 ceph ceph 8,  50 Jul 20 11:27 sdd2
brw-rw---- 1 root disk 8,  64 Jul 20 08:26 sde
brw-rw---- 1 root disk 8,  65 Jul 20 08:26 sde1
brw-rw---- 1 ceph ceph 8,  66 Jul 20 11:27 sde2
brw-rw---- 1 root disk 8,  80 Jul 20 08:26 sdf
brw-rw---- 1 root disk 8,  81 Jul 20 08:26 sdf1
brw-rw---- 1 ceph ceph 8,  82 Jul 20 11:27 sdf2
brw-rw---- 1 root disk 8,  96 Jul 20 08:26 sdg
brw-rw---- 1 root disk 8,  97 Jul 20 08:26 sdg1
brw-rw---- 1 ceph ceph 8,  98 Jul 20 11:27 sdg2
brw-rw---- 1 root disk 8, 112 Jul 20 08:26 sdh
brw-rw---- 1 root disk 8, 113 Jul 20 08:26 sdh1
brw-rw---- 1 ceph ceph 8, 114 Jul 20 11:27 sdh2
brw-rw---- 1 root disk 8, 128 Jul 20 08:26 sdi
brw-rw---- 1 root disk 8, 144 Jul 20 08:26 sdj
brw-rw---- 1 root disk 8, 160 Jul 20 08:26 sdk
brw-rw---- 1 root disk 8, 176 Jul 20 08:26 sdl
brw-rw---- 1 root disk 8, 192 Jul 20 08:26 sdm
brw-rw---- 1 root disk 8, 193 Jul 20 08:26 sdm1
brw-rw---- 1 root disk 8, 194 Jul 20 08:26 sdm2
brw-rw---- 1 root disk 8, 195 Jul 20 08:26 sdm3
brw-rw---- 1 root disk 8, 208 Jul 20 08:26 sdn

or


Code:
root@int101:/var/lib/ceph/osd# cd ceph-0/
root@int101:/var/lib/ceph/osd/ceph-0# ll
total 56
-rw-r--r-- 1 root root 402 Mar  1  2018 activate.monmap
-rw-r--r-- 1 ceph ceph   3 Mar  1  2018 active
lrwxrwxrwx 1 root root   9 Jul 20 08:26 block -> /dev/sdb2
-rw-r--r-- 1 ceph ceph  37 Mar  1  2018 block_uuid
-rw-r--r-- 1 ceph ceph   2 Mar  1  2018 bluefs
-rw-r--r-- 1 ceph ceph  37 Mar  1  2018 ceph_fsid
-rw-r--r-- 1 ceph ceph  37 Mar  1  2018 fsid
-rw------- 1 ceph ceph  56 Mar  1  2018 keyring
-rw-r--r-- 1 ceph ceph   8 Mar  1  2018 kv_backend
-rw-r--r-- 1 ceph ceph  21 Mar  1  2018 magic
-rw-r--r-- 1 ceph ceph   4 Mar  1  2018 mkfs_done
-rw-r--r-- 1 ceph ceph   6 Mar  1  2018 ready
-rw------- 1 ceph ceph   3 Jul 18 10:29 require_osd_release
-rw-r--r-- 1 ceph ceph   0 Jul 21  2019 systemd
-rw-r--r-- 1 ceph ceph  10 Mar  1  2018 type
-rw-r--r-- 1 ceph ceph   2 Mar  1  2018 whoami

Any idea how to find and clean the symlink loop?
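
For reference, some commands that could show which symlinks udev tries to create for these partitions and whether any of them are broken (just a diagnostic sketch; sdb2 is one of the partitions from the log above):

Code:
# which persistent symlinks udev wants to create for this partition
udevadm info --query=symlink /dev/sdb2

# dry-run the udev rules for the device and look at the symlink handling
udevadm test /sys/class/block/sdb2 2>&1 | grep -i link

# list dangling symlinks under /dev/disk
find /dev/disk -xtype l -ls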

Regards
 
Also, we have one or more OSDs down on node reboots:

Code:
2021-08-01T07:55:38.476+0200 7f22b9b69f00 -1 bluestore(/var/lib/ceph/osd/ceph-23/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-23/block: (13) Permission denied
2021-08-01T07:55:38.476+0200 7f22b9b69f00 1 bdev(0x55f022ef8400 /var/lib/ceph/osd/ceph-23/block) open path /var/lib/ceph/osd/ceph-23/block
2021-08-01T07:55:38.476+0200 7f22b9b69f00 -1 bdev(0x55f022ef8400 /var/lib/ceph/osd/ceph-23/block) open open got: (13) Permission denied
2021-08-01T07:55:38.476+0200 7f22b9b69f00 -1 osd.23 0 OSD:init: unable to mount object store
2021-08-01T07:55:38.476+0200 7f22b9b69f00 -1 ** ERROR: osd init failed: (13) Permission denied

I have other OSDs that started without any permission issue, with user and group root on the symbolic link. I tried to chown the link to ceph:ceph, but I get the same permission error.
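
One thing that might be worth trying (purely a hypothetical local workaround on my side, not an official fix) is pinning the ownership of the affected data partitions with a local udev rule, so that a later udev re-trigger cannot reset it:

Code:
# /etc/udev/rules.d/99-local-ceph-osd.rules
# hypothetical local rule; adjust the partition names to the OSD data partitions of this node
KERNEL=="sd[b-h]2", SUBSYSTEM=="block", OWNER="ceph", GROUP="ceph", MODE="0660"

Then reload the rules and re-trigger the block devices:

Code:
udevadm control --reload-rules
udevadm trigger --subsystem-match=block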

Regards
 
More info on this: sometimes I'm able to re-add the failing OSDs a bit later without changing anything at all:

Code:
2021-08-01T07:55:38.718+0200 7f35a7603f00 -1 bluestore(/var/lib/ceph/osd/ceph-22/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-22/block: (13) Permission denied
2021-08-01T07:55:38.718+0200 7f35a7603f00  1 bluestore(/var/lib/ceph/osd/ceph-22) _mount path /var/lib/ceph/osd/ceph-22
2021-08-01T07:55:38.718+0200 7f35a7603f00  0 bluestore(/var/lib/ceph/osd/ceph-22) _open_db_and_around read-only:0 repair:0
2021-08-01T07:55:38.718+0200 7f35a7603f00 -1 bluestore(/var/lib/ceph/osd/ceph-22/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-22/block: (13) Permission denied
2021-08-01T07:55:38.718+0200 7f35a7603f00  1 bdev(0x55f357306400 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
2021-08-01T07:55:38.718+0200 7f35a7603f00 -1 bdev(0x55f357306400 /var/lib/ceph/osd/ceph-22/block) open open got: (13) Permission denied
2021-08-01T07:55:38.718+0200 7f35a7603f00 -1 osd.22 0 OSD:init: unable to mount object store
2021-08-01T07:55:38.718+0200 7f35a7603f00 -1  ** ERROR: osd init failed: (13) Permission denied
2021-08-01T08:25:13.658+0200 7f8ef067af00  0 set uid:gid to 64045:64045 (ceph:ceph)
2021-08-01T08:25:13.658+0200 7f8ef067af00  0 ceph version 16.2.5 (9b9dd76e12f1907fe5dcc0c1fadadbb784022a42) pacific (stable), process ceph-osd, pid 76003
2021-08-01T08:25:13.658+0200 7f8ef067af00  0 pidfile_write: ignore empty --pid-file
2021-08-01T08:25:13.662+0200 7f8ef067af00  1 bdev(0x564ef318a800 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
2021-08-01T08:25:13.662+0200 7f8ef067af00  1 bdev(0x564ef318a800 /var/lib/ceph/osd/ceph-22/block) open size 960091197440 (0xdf89e51000, 894 GiB) block_size 4096 (4 KiB) non-rotational discard supported
2021-08-01T08:25:13.662+0200 7f8ef067af00  1 bluestore(/var/lib/ceph/osd/ceph-22) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
2021-08-01T08:25:13.662+0200 7f8ef067af00  1 bdev(0x564ef318ac00 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
2021-08-01T08:25:13.666+0200 7f8ef067af00  1 bdev(0x564ef318ac00 /var/lib/ceph/osd/ceph-22/block) open size 960091197440 (0xdf89e51000, 894 GiB) block_size 4096 (4 KiB) non-rotational discard supported
2021-08-01T08:25:13.666+0200 7f8ef067af00  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-22/block size 894 GiB
2021-08-01T08:25:13.666+0200 7f8ef067af00  1 bdev(0x564ef318ac00 /var/lib/ceph/osd/ceph-22/block) close
2021-08-01T08:25:13.986+0200 7f8ef067af00  1 bdev(0x564ef318a800 /var/lib/ceph/osd/ceph-22/block) close
2021-08-01T08:25:14.234+0200 7f8ef067af00  0 starting osd.22 osd_data /var/lib/ceph/osd/ceph-22 /var/lib/ceph/osd/ceph-22/journal
2021-08-01T08:25:14.262+0200 7f8ef067af00  0 load: jerasure load: lrc load: isa
2021-08-01T08:25:14.262+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
2021-08-01T08:25:14.262+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open size 960091197440 (0xdf89e51000, 894 GiB) block_size 4096 (4 KiB) non-rotational discard supported
2021-08-01T08:25:14.262+0200 7f8ef067af00  1 bluestore(/var/lib/ceph/osd/ceph-22) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
2021-08-01T08:25:14.262+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) close
2021-08-01T08:25:14.570+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
2021-08-01T08:25:14.570+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open size 960091197440 (0xdf89e51000, 894 GiB) block_size 4096 (4 KiB) non-rotational discard supported
2021-08-01T08:25:14.570+0200 7f8ef067af00  1 bluestore(/var/lib/ceph/osd/ceph-22) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
2021-08-01T08:25:14.570+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) close
2021-08-01T08:25:14.894+0200 7f8ef067af00  0 osd.22:0.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2021-08-01T08:25:14.894+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
2021-08-01T08:25:14.894+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open size 960091197440 (0xdf89e51000, 894 GiB) block_size 4096 (4 KiB) non-rotational discard supported
2021-08-01T08:25:14.894+0200 7f8ef067af00  1 bluestore(/var/lib/ceph/osd/ceph-22) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
2021-08-01T08:25:14.894+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) close
2021-08-01T08:25:15.226+0200 7f8ef067af00  0 osd.22:1.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2021-08-01T08:25:15.226+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
2021-08-01T08:25:15.226+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open size 960091197440 (0xdf89e51000, 894 GiB) block_size 4096 (4 KiB) non-rotational discard supported
2021-08-01T08:25:15.226+0200 7f8ef067af00  1 bluestore(/var/lib/ceph/osd/ceph-22) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
2021-08-01T08:25:15.226+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) close
2021-08-01T08:25:15.554+0200 7f8ef067af00  0 osd.22:2.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2021-08-01T08:25:15.554+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
2021-08-01T08:25:15.554+0200 7f8ef067af00  1 bdev(0x564ef3e54400 /var/lib/ceph/osd/ceph-22/block) open size 960091197440 (0xdf89e51000, 894 GiB) block_size 4096 (4 KiB) non-rotational discard supported

Any idea what's going on?
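
If it is a timing issue between udev and the OSD start at boot, one way to check that (just a guess on my side, not a confirmed diagnosis) would be:

Code:
# see what udev was doing to the disks around the failed OSD start
journalctl -b -u systemd-udevd --no-pager | tail -n 50

# wait for the udev queue to drain, then check the ownership the OSD actually sees
udevadm settle --timeout=30
ls -lL /var/lib/ceph/osd/ceph-22/block
systemctl restart ceph-osd@22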

Regards
 
@aaron this is the output for osd 22:

Code:
root@int101:~# ls -l /dev/sde*
brw-rw---- 1 root disk 8, 64 Aug  1 07:54 /dev/sde
brw-rw---- 1 root disk 8, 65 Aug  1 07:54 /dev/sde1
brw-rw---- 1 ceph ceph 8, 66 Aug  2 23:51 /dev/sde2

Regards
 
@aaron I have other OSDs, maybe the recreated ones, that have only one partition:

Code:
root@int101:~# ls -l /dev/sda*
brw-rw---- 1 root disk 8, 0 Aug  1 07:54 /dev/sda

Regards
 
If you are able to start the OSD at a later time, does it still show the second partition as being owned by ceph, as here?
Code:
root@int101:~# ls -l /dev/sde*
brw-rw---- 1 root disk 8, 64 Aug  1 07:54 /dev/sde
brw-rw---- 1 root disk 8, 65 Aug  1 07:54 /dev/sde1
brw-rw---- 1 ceph ceph 8, 66 Aug  2 23:51 /dev/sde2
 
I have other OSDs, maybe the recreated ones, that have only one partition:
If you recreate OSDs, they will be created with the latest BlueStore layout, which stores everything in one large LVM logical volume. Slightly older ones have two partitions: one for the metadata and one for the actual data.
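
For completeness, a quick way to see which layout an OSD on a node uses (just a sketch, the paths are examples) is to look at what ceph-volume knows about and where the block links point:

Code:
# LVM-based OSDs (new layout) show up here, partition-based ceph-disk OSDs do not
ceph-volume lvm list

# old-layout OSDs have "block" pointing at a partition (e.g. /dev/sdX2),
# new-layout OSDs point at an LVM logical volume under /dev/ceph-<uuid>/
ls -l /var/lib/ceph/osd/ceph-*/block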
 
Yes, it does. The ones with the new layout are owned by root:disk, but so are some of the old-layout ones; they have mixed permissions:

Code:
root@int101:~#  ls -l /dev/sde*
brw-rw---- 1 root disk 8, 64 Aug  4 11:17 /dev/sde
brw-rw---- 1 root disk 8, 65 Aug  4 11:17 /dev/sde1
brw-rw---- 1 ceph ceph 8, 66 Aug  4 11:45 /dev/sde2
root@int101:~#  ls -l /dev/sdg*
brw-rw---- 1 root disk 8, 96 Aug  4 11:17 /dev/sdg
brw-rw---- 1 root disk 8, 97 Aug  4 11:17 /dev/sdg1
brw-rw---- 1 root disk 8, 98 Aug  4 11:17 /dev/sdg2
root@int101:~#  ls -l /dev/sdd*
brw-rw---- 1 root disk 8, 48 Aug  4 11:17 /dev/sdd
brw-rw---- 1 root disk 8, 49 Aug  4 11:17 /dev/sdd1
brw-rw---- 1 ceph ceph 8, 50 Aug  4 11:45 /dev/sdd2
root@int101:~#  ls -l /dev/sda*
brw-rw---- 1 root disk 8, 0 Aug  4 11:17 /dev/sda
root@int101:~#  ls -l /dev/sdb*
brw-rw---- 1 root disk 8, 16 Aug  4 11:17 /dev/sdb

Regards
 
Hmm okay. What are the permissions inside the /var/lib/ceph/osd/ceph-<OSD ID>/ directory in both situations?
 
Here they are, same order:

Code:
root@int101:~# ll /var/lib/ceph/osd/ceph-22/
total 56
-rw-r--r-- 1 root root 402 Mar  5  2018 activate.monmap
-rw-r--r-- 1 ceph ceph   3 Mar  5  2018 active
lrwxrwxrwx 1 root root   9 Aug  4 11:17 block -> /dev/sde2
-rw-r--r-- 1 ceph ceph  37 Mar  5  2018 block_uuid
-rw-r--r-- 1 ceph ceph   2 Mar  5  2018 bluefs
-rw-r--r-- 1 ceph ceph  37 Mar  5  2018 ceph_fsid
-rw-r--r-- 1 ceph ceph  37 Mar  5  2018 fsid
-rw------- 1 ceph ceph  57 Mar  5  2018 keyring
-rw-r--r-- 1 ceph ceph   8 Mar  5  2018 kv_backend
-rw-r--r-- 1 ceph ceph  21 Mar  5  2018 magic
-rw-r--r-- 1 ceph ceph   4 Mar  5  2018 mkfs_done
-rw-r--r-- 1 ceph ceph   6 Mar  5  2018 ready
-rw------- 1 ceph ceph   3 Jul 18 10:29 require_osd_release
-rw-r--r-- 1 ceph ceph   0 Jul 21  2019 systemd
-rw-r--r-- 1 ceph ceph  10 Mar  5  2018 type
-rw-r--r-- 1 ceph ceph   3 Mar  5  2018 whoami
root@int101:~# ll /var/lib/ceph/osd/ceph-23/
total 56
-rw-r--r-- 1 root root 402 Mar  5  2018 activate.monmap
-rw-r--r-- 1 ceph ceph   3 Mar  5  2018 active
lrwxrwxrwx 1 root root   9 Jul 28 17:29 block -> /dev/sdg2
-rw-r--r-- 1 ceph ceph  37 Mar  5  2018 block_uuid
-rw-r--r-- 1 ceph ceph   2 Mar  5  2018 bluefs
-rw-r--r-- 1 ceph ceph  37 Mar  5  2018 ceph_fsid
-rw-r--r-- 1 ceph ceph  37 Mar  5  2018 fsid
-rw------- 1 ceph ceph  57 Mar  5  2018 keyring
-rw-r--r-- 1 ceph ceph   8 Mar  5  2018 kv_backend
-rw-r--r-- 1 ceph ceph  21 Mar  5  2018 magic
-rw-r--r-- 1 ceph ceph   4 Mar  5  2018 mkfs_done
-rw-r--r-- 1 ceph ceph   6 Mar  5  2018 ready
-rw------- 1 ceph ceph   3 Jul 18 10:29 require_osd_release
-rw-r--r-- 1 ceph ceph   0 Jul 21  2019 systemd
-rw-r--r-- 1 ceph ceph  10 Mar  5  2018 type
-rw-r--r-- 1 ceph ceph   3 Mar  5  2018 whoami
root@int101:~# ll /var/lib/ceph/osd/ceph-10/
total 56
-rw-r--r-- 1 root root 402 Mar  1  2018 activate.monmap
-rw-r--r-- 1 ceph ceph   3 Mar  1  2018 active
lrwxrwxrwx 1 root root   9 Aug  4 11:17 block -> /dev/sdd2
-rw-r--r-- 1 ceph ceph  37 Mar  1  2018 block_uuid
-rw-r--r-- 1 ceph ceph   2 Mar  1  2018 bluefs
-rw-r--r-- 1 ceph ceph  37 Mar  1  2018 ceph_fsid
-rw-r--r-- 1 ceph ceph  37 Mar  1  2018 fsid
-rw------- 1 ceph ceph  57 Mar  1  2018 keyring
-rw-r--r-- 1 ceph ceph   8 Mar  1  2018 kv_backend
-rw-r--r-- 1 ceph ceph  21 Mar  1  2018 magic
-rw-r--r-- 1 ceph ceph   4 Mar  1  2018 mkfs_done
-rw-r--r-- 1 ceph ceph   6 Mar  1  2018 ready
-rw------- 1 ceph ceph   3 Jul 18 10:29 require_osd_release
-rw-r--r-- 1 ceph ceph   0 Jul 21  2019 systemd
-rw-r--r-- 1 ceph ceph  10 Mar  1  2018 type
-rw-r--r-- 1 ceph ceph   3 Mar  1  2018 whoami
root@int101:~# ll /var/lib/ceph/osd/ceph-47/
total 28
lrwxrwxrwx 1 ceph ceph 93 Aug  4 11:17 block -> /dev/ceph-abbce406-397b-4f5b-b23f-f7ced4f9ca6f/osd-block-2effd9b2-37b0-4dc3-8bf6-e5e23a306f55
-rw------- 1 ceph ceph 37 Aug  4 11:17 ceph_fsid
-rw------- 1 ceph ceph 37 Aug  4 11:17 fsid
-rw------- 1 ceph ceph 56 Aug  4 11:17 keyring
-rw------- 1 ceph ceph  6 Aug  4 11:17 ready
-rw------- 1 ceph ceph  3 Aug  4 11:17 require_osd_release
-rw------- 1 ceph ceph 10 Aug  4 11:17 type
-rw------- 1 ceph ceph  3 Aug  4 11:17 whoami
root@int101:~# ll /var/lib/ceph/osd/ceph-0/
total 28
lrwxrwxrwx 1 ceph ceph 93 Aug  4 11:17 block -> /dev/ceph-5e966819-6026-40a8-8b9b-398c5c7abd6d/osd-block-85ef2815-1995-4a1b-b8a5-70fdcd3804aa
-rw------- 1 ceph ceph 37 Aug  4 11:17 ceph_fsid
-rw------- 1 ceph ceph 37 Aug  4 11:17 fsid
-rw------- 1 ceph ceph 55 Aug  4 11:17 keyring
-rw------- 1 ceph ceph  6 Aug  4 11:17 ready
-rw------- 1 ceph ceph  3 Aug  4 11:18 require_osd_release
-rw------- 1 ceph ceph 10 Aug  4 11:17 type
-rw------- 1 ceph ceph  2 Aug  4 11:17 whoami

Regards
 
Hmm, we could try to wait for a newer Ceph version to see if that fixes the problem, or try to dig deeper ourselves. Alternatively, you could also go ahead and recreate the OSDs one by one to have them use the latest on-disk format.

If you want to go ahead recreating the OSDs and want to avoid a full recovery/rebalance each time, first enable the "norecover" and "norebalance" OSD flags. Then stop the OSD and set it to OUT. Once it is stopped and out, you can destroy it (make sure the "Cleanup Disk" checkbox is active).

Then recreate the OSD and once it is back UP and IN, disable the previously set OSD flags to let Ceph recreate the data on that OSD. Once the cluster is healthy again, you can try to reboot that node and see if the problem persists.
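
For reference, the same procedure on the command line could look roughly like this (only a sketch; OSD id 22 and /dev/sde are placeholders, the GUI steps described above are equivalent):

Code:
# avoid a full recovery/rebalance while the OSD is being recreated
ceph osd set norecover
ceph osd set norebalance

# stop the OSD, mark it out, then destroy it including a disk cleanup
systemctl stop ceph-osd@22
ceph osd out 22
pveceph osd destroy 22 --cleanup

# recreate it on the same disk with the current LVM-based layout
pveceph osd create /dev/sde

# once it is back up and in, let Ceph rebuild the data on it
ceph osd unset norecover
ceph osd unset norebalance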
 
Thanks Aaron, I'm recreating them, but it takes about 2 or 3 hours each and I cannot degrade service performance too much.
It also seems nobody else is in the same situation, so digging may not be worth it.
I'll let you know if the issue persists once the OSDs have been updated.

Regards
 
Hi @Mikepop, did recreating all the OSDs fix the issue? Was the problem at reboot similar to this one: https://forum.proxmox.com/threads/p...ecomes-unstable-after-rebooting-a-node.96799/ ? Many thanks
 
