Hello
I've created a net-new PVE 7.1 cluster (3 nodes). While configuring Ceph 16.2 and adding OSDs, I got a seg-fault error:
Code:
create OSD on /dev/sdd (bluestore)
wiping block device /dev/sdd
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.30635 s, 161 MB/s
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new ef3e9862-3322-4a69-84bf-f63beb18eb3d
Running command: /sbin/vgcreate --force --yes ceph-e264d0e2-bbca-4f3a-8917-03f574fc8f88 /dev/sdd
stdout: Physical volume "/dev/sdd" successfully created.
stdout: Volume group "ceph-e264d0e2-bbca-4f3a-8917-03f574fc8f88" successfully created
Running command: /sbin/lvcreate --yes -l 286160 -n osd-block-ef3e9862-3322-4a69-84bf-f63beb18eb3d ceph-e264d0e2-bbca-4f3a-8917-03f574fc8f88
stdout: Logical volume "osd-block-ef3e9862-3322-4a69-84bf-f63beb18eb3d" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: /sbin/cryptsetup --batch-mode --key-file - luksFormat /dev/ceph-e264d0e2-bbca-4f3a-8917-03f574fc8f88/osd-block-ef3e9862-3322-4a69-84bf-f63beb18eb3d
Running command: /sbin/cryptsetup --key-file - --allow-discards luksOpen /dev/ceph-e264d0e2-bbca-4f3a-8917-03f574fc8f88/osd-block-ef3e9862-3322-4a69-84bf-f63beb18eb3d olrKyp-7EEP-1y3i-7x9x-19Tw-9Chn-49oUhr
Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-12
--> Executable selinuxenabled not in PATH: /sbin:/bin:/usr/sbin:/usr/bin
Running command: /bin/chown -h ceph:ceph /dev/mapper/olrKyp-7EEP-1y3i-7x9x-19Tw-9Chn-49oUhr
Running command: /bin/chown -R ceph:ceph /dev/dm-3
Running command: /bin/ln -s /dev/mapper/olrKyp-7EEP-1y3i-7x9x-19Tw-9Chn-49oUhr /var/lib/ceph/osd/ceph-12/block
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-12/activate.monmap
stderr: 2022-03-10T10:35:40.107-0500 7fa544b2a700 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.bootstrap-osd.keyring: (2) No such file or directory
2022-03-10T10:35:40.107-0500 7fa544b2a700 -1 AuthRegistry(0x7fa54005b2e8) no keyring found at /etc/pve/priv/ceph.client.bootstrap-osd.keyring, disabling cephx
stderr: got monmap epoch 3
Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-12/keyring --create-keyring --name osd.12 --add-key AQDCGipitcAaARAAHdsUNxg1tUaMyR66LQl5Qg==
stdout: creating /var/lib/ceph/osd/ceph-12/keyring
added entity osd.12 auth(key=AQDCGipitcAaARAAHdsUNxg1tUaMyR66LQl5Qg==)
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-12/keyring
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-12/
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 12 --monmap /var/lib/ceph/osd/ceph-12/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-12/ --osd-uuid ef3e9862-3322-4a69-84bf-f63beb18eb3d --setuser ceph --setgroup ceph
stderr: 2022-03-10T10:35:40.547-0500 7f09800ebf00 -1 bluestore(/var/lib/ceph/osd/ceph-12/) _read_fsid unparsable uuid
stderr: 2022-03-10T10:35:40.571-0500 7f09800ebf00 -1 bluefs _replay 0x0: stop: uuid 70144648-6562-4cc3-c13e-0f1e022a4795 != super.uuid 1320e201-073f-4b55-88a2-036f81185f14, block dump:
.... (see attachment) ....
stderr: 2022-03-10T10:35:41.355-0500 7f09800ebf00 -1 rocksdb: verify_sharding unable to list column families: NotFound:
stderr: 2022-03-10T10:35:41.355-0500 7f09800ebf00 -1 bluestore(/var/lib/ceph/osd/ceph-12/) _open_db erroring opening db:
stderr: 2022-03-10T10:35:41.891-0500 7f09800ebf00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
stderr: 2022-03-10T10:35:41.891-0500 7f09800ebf00 -1 ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-12/: (5) Input/output error
--> Was unable to complete a new OSD, will rollback changes
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.12 --yes-i-really-mean-it
stderr: 2022-03-10T10:35:42.035-0500 7fe8fd907700 -1 auth: unable to find a keyring on /etc/pve/priv/ceph.client.bootstrap-osd.keyring: (2) No such file or directory
2022-03-10T10:35:42.035-0500 7fe8fd907700 -1 AuthRegistry(0x7fe8f805b2e8) no keyring found at /etc/pve/priv/ceph.client.bootstrap-osd.keyring, disabling cephx
stderr: purged osd.12
--> RuntimeError: Command failed with exit code 250: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 12 --monmap /var/lib/ceph/osd/ceph-12/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-12/ --osd-uuid ef3e9862-3322-4a69-84bf-f63beb18eb3d --setuser ceph --setgroup ceph
TASK ERROR: command 'ceph-volume lvm create --cluster-fsid 9acff7fb-eae6-4b9d-a89f-119138f3b798 --data /dev/sdd --dmcrypt' failed: exit code 1
The OSD does not show up in the list, and appears unused. I noticed that the usual VG/LV had been created on the block device in question, so I removed them with vgremove / pvremove and ran wipefs -a for good measure.
The cleaned device is not listed in the PVE web console when I try to re-add it, and adding it with the pveceph CLI fails as well:
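For reference, the cleanup described above corresponds roughly to the sketch below. The VG name is the one from the log; the `run` helper and the `DRY_RUN`, `DEV`, and `VG` variables are only illustrative and not part of any Proxmox or Ceph tooling. By default it only prints the commands, since all three are destructive.

```shell
#!/bin/sh
# Sketch of the manual cleanup: remove the leftover VG, the PV label,
# and any remaining signatures on the disk.
# DEV and VG are taken from the log above; adjust them for another disk.
DEV="${DEV:-/dev/sdd}"
VG="${VG:-ceph-e264d0e2-bbca-4f3a-8917-03f574fc8f88}"

# Hypothetical helper: print the command in dry-run mode (the default),
# run it only when DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run vgremove --yes "$VG"   # drops the VG together with the osd-block LV
run pvremove --yes "$DEV"  # removes the LVM physical volume label
run wipefs --all "$DEV"    # erases remaining filesystem/LVM signatures
```

Running it with `DRY_RUN=0` would execute the commands for real.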
Code:
# pveceph osd create /dev/sdd
device '/dev/sdd' is already in use
Any idea where /dev/sdd could still be referenced? It's not showing up in the output of:
Code:
ceph device ls
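One place such a reference could still live is the kernel's holders list for the disk, for example if the dm-crypt mapping opened by luksOpen in the log above was never closed. This is only a guess on my part, not something the log confirms. A minimal sketch of that check follows; `list_holders` is a hypothetical helper name, and the second argument exists only so the function can be pointed at a different sysfs root.

```shell
#!/bin/sh
# Sketch: list the devices that the kernel reports as holding a block device
# (e.g. dm-3 for a device-mapper mapping stacked on top of the disk).
# Usage: list_holders sdd
list_holders() {
    dev_name="$1"                 # kernel name without /dev/, e.g. sdd
    sys_root="${2:-/sys/block}"   # sysfs root, overridable for testing
    dir="$sys_root/$dev_name/holders"
    if [ ! -d "$dir" ]; then
        echo "no holders directory for $dev_name" >&2
        return 1
    fi
    ls -1 "$dir"                  # empty output means nothing holds the device
}
```

If `list_holders sdd` prints something like dm-3, then `lsblk /dev/sdd` and `dmsetup ls` should show which mapping it is.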
My cluster:
Code:
# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-5-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-12
pve-kernel-5.13: 7.1-8
pve-kernel-5.13.19-5-pve: 5.13.19-13
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph: 16.2.7
ceph-fuse: 16.2.7
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-3
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-2
openvswitch-switch: 2.15.0+ds1-2
proxmox-backup-client: 2.1.5-1
proxmox-backup-file-restore: 2.1.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-7
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-5
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.1-2
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1