Ceph upgrade to Nautilus - error mount point and no "uuid"

Kaboom (Active Member, joined Mar 5, 2019)
Dear all,

I am in the middle of upgrading Ceph to Nautilus, following https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus

But I get this error when running the ceph-volume simple scan:

root@node002:/dev/disk/by-partuuid# ceph-volume simple scan /dev/sdc1
Running command: /sbin/cryptsetup status /dev/sdc1
Running command: /bin/mount -v /dev/sdc1 /tmp/tmpC53VLj
stderr: mount: /tmp/tmpC53VLj: /dev/sdc1 already mounted or mount point busy.
--> RuntimeError: command returned non-zero exit status: 32

root@node002:/dev/disk/by-partuuid# ceph-volume simple activate --all
--> activating OSD specified in /etc/ceph/osd/2-9fef792d-e0fd-4d9f-9b99-3040e636cf16.json
--> RuntimeError: Unable to activate OSD None - no "uuid" key found for data

Can anyone help me out?

Thanks!
 
root@node002:/dev/disk/by-partuuid# cat /etc/ceph/osd/2-9fef792d-e0fd-4d9f-9b99-3040e636cf16.json
{
    "active": "ok",
    "block": {
        "path": "/dev/disk/by-partuuid/8755dd67-fee5-46f2-b0eb-e9fd75725722",
        "uuid": "8755dd67-fee5-46f2-b0eb-e9fd75725722"
    },
    "block_uuid": "8755dd67-fee5-46f2-b0eb-e9fd75725722",
    "bluefs": 1,
    "ceph_fsid": "09935360-cfe7-48d4-ac76-c02e0fdd95de",
    "cluster_name": "ceph",
    "data": {
        "path": "../dm-8",
        "uuid": ""
    },
    "fsid": "9fef792d-e0fd-4d9f-9b99-3040e636cf16",
    "keyring": "AQBAUPVaHbCsFhAAkHybC1sITAfeFsCJTshPHA==",
    "kv_backend": "rocksdb",
    "magic": "ceph osd volume v026",
    "mkfs_done": "yes",
    "ready": "ready",
    "require_osd_release": 12,
    "systemd": "",
    "type": "bluestore",
    "whoami": 2
}
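The empty "uuid" under "data" is exactly what the activate error complains about. As a hedged sketch (the /etc/ceph/osd path is taken from the output above), one could scan all of the simple-mode JSON files to find which OSDs are affected before activating:

```python
import glob
import json

def osds_missing_data_uuid(pattern="/etc/ceph/osd/*.json"):
    """Return the metadata files whose 'data' section has an empty or
    missing 'uuid' -- these are the OSDs that fail to activate."""
    broken = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            meta = json.load(f)
        if not meta.get("data", {}).get("uuid"):
            broken.append(path)
    return broken

if __name__ == "__main__":
    for path in osds_missing_data_uuid():
        print("no data uuid:", path)
```

Any file this prints corresponds to an OSD that `ceph-volume simple activate --all` will refuse with the "no \"uuid\" key found for data" error.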
 
Code:
ceph-volume simple scan
ceph-volume simple activate --all
Run these commands to activate all OSDs. You can re-run them after the reboot as well. The only thing is that the OSDs won't start if that step was missed.
 
Thanks for your fast answer, but I get this error. Or is it not a problem?

root@node003:/dev# ceph-volume simple activate --all
--> activating OSD specified in /etc/ceph/osd/9-168c72e2-02a2-480e-818b-861f3e2dff0c.json
--> RuntimeError: Unable to activate OSD None - no "uuid" key found for data
 
--> activating OSD specified in /etc/ceph/osd/9-168c72e2-02a2-480e-818b-861f3e2dff0c.json
--> RuntimeError: Unable to activate OSD None - no "uuid" key found for data
OSD.9 doesn't seem to have a UUID for its partition. Check with lsblk -l -o NAME,UUID whether it has one like the others. If not, it may be easier to destroy and re-create the OSD.
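To make that lsblk check concrete, here is a small sketch that filters the `lsblk -l -o NAME,UUID` listing down to entries with no UUID. The sample listing is invented for illustration; on the node you would pipe the real command output in:

```shell
# Filter `lsblk -l -o NAME,UUID` output down to entries with no UUID.
# The sample below is made up; on a live node run instead:
#   lsblk -l -o NAME,UUID | awk 'NR > 1 && NF < 2 { print $1 }'
sample='NAME UUID
sdc1 8755dd67-fee5-46f2-b0eb-e9fd75725722
sdd1
sdd2 168c72e2-02a2-480e-818b-861f3e2dff0c'

echo "$sample" | awk 'NR > 1 && NF < 2 { print $1 }'
# any name printed (sdd1 in this sample) is a partition without a UUID
```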

Also check if you didn't miss a step on our upgrade guide.
https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus#Restart_the_OSD_daemon_on_all_nodes
 
I have destroyed the OSD (with hdparm, wipefs and zap), but when I try to add it again, Proxmox says 'No disks unused'. Should I partition it first?

=====

fdisk =>

Unpartitioned space /dev/sdd: 447.1 GiB, 480102932480 bytes, 937701040 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes

Start End Sectors Size
2048 937703087 937701040 447.1G

=====

pveceph createosd /dev/sdd =>

device '/dev/sdd' is already in use

=====

ls -la /dev/sdd =>

brw-rw---- 1 root disk 8, 48 Nov 5 10:12 sdd

There is no sdd1 or sdd2

=====

pvesm status =>

Name Type Status Total Used Available %
NFS008 nfs disabled 0 0 0 N/A
ceph_ssd rbd active 3883469120 3333127424 550341696 85.83%
local dir active 1120317312 536248448 584068864 47.87%
local-thin-lvm lvmthin disabled 0 0 0 N/A
local-zfs zfspool disabled 0 0 0

=====

Thanks!
 
Check with lsblk if the disk still shows partitions. The kernel might not have picked up the change. If so, run partprobe to tell the kernel that the layout changed.
 
lsblk =>

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.1T 0 disk
├─sda1 8:1 0 1007K 0 part
├─sda2 8:2 0 1.1T 0 part
└─sda9 8:9 0 8M 0 part
sdb 8:16 0 1.1T 0 disk
├─sdb1 8:17 0 1007K 0 part
├─sdb2 8:18 0 1.1T 0 part
└─sdb9 8:25 0 8M 0 part
sdc 8:32 0 447.1G 0 disk
└─355cd2e414e2e5527 253:0 0 447.1G 0 mpath
sdd 8:48 0 447.1G 0 disk
└─355cd2e414e2e5ec1 253:1 0 447.1G 0 mpath
sde 8:64 0 447.1G 0 disk
└─355cd2e414f491349 253:2 0 447.1G 0 mpath
sdf 8:80 0 447.1G 0 disk
└─355cd2e414f491f34 253:3 0 447.1G 0 mpath
sdg 8:96 0 447.1G 0 disk
└─355cd2e414f492d87 253:4 0 447.1G 0 mpath
sdh 8:112 0 447.1G 0 disk
└─355cd2e414f482739 253:5 0 447.1G 0 mpath
zd0 230:0 0 8G 0 disk [SWAP]

======

I ran partprobe, but Proxmox still shows 'No disks unused'. I want to add the SSDs sdc through sdh.
 
Please post such output in CODE tags (triple dot); it's hard to read otherwise.

sdd 8:48 0 447.1G 0 disk
└─355cd2e414e2e5ec1 253:1 0 447.1G 0 mpath
Our tooling doesn't allow iSCSI devices to be used. It is in any case not a good idea to use a SAN/NAS for OSDs. You need to use ceph-volume by itself.
 
└─355cd2e414f482739 253:5 0 447.1G 0 mpath
These seem to be multipathed. This usually originates from a SAN/NAS multipath disk. What does ls -lah /sys/block/sdd show?
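If these are in fact local SATA/SAS disks that multipathd has claimed by mistake, one common fix (a sketch only; verify the WWIDs with `multipath -ll` first) is to blacklist them in /etc/multipath.conf and flush the stale maps:

```
# /etc/multipath.conf -- keep multipathd away from the local OSD disks
# (WWIDs taken from the lsblk output above; confirm with `multipath -ll`)
blacklist {
    wwid "355cd2e414e2e5527"
    wwid "355cd2e414e2e5ec1"
}
```

After editing, flush the maps (`multipath -F`, or `multipath -f <name>` per map) and restart multipathd. lsblk should then show the disks without the mpath child, and pveceph/ceph-volume should be able to claim them.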
 
lrwxrwxrwx 1 root root 0 Nov 5 14:49 /sys/block/sdd -> ../devices/pci0000:ae/0000:ae:00.0/0000:af:00.0/host0/port-0:3/end_device-0:3/target0:0:3/0:0:3:0/block/sdd
 
And sfdisk -l /dev/sdd? If the disk is empty, does OSD creation on the CLI work: pveceph osd create /dev/sdd?

EDIT: otherwise run a sgdisk -Z /dev/sdd to remove any GPT or MBR leftover.
 
Code:
sgdisk -Z /dev/sdd

Creating new GPT entries.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.

====

Code:
pveceph osd create /dev/sdd

device '/dev/sdd' is already in use

====

Code:
sfdisk -l /dev/sdd

Disk /dev/sdd: 447.1 GiB, 480103981056 bytes, 937703088 sectors
Disk model: INTEL SSDSC2KB48
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

It keeps saying it is already in use.
 
Please reboot the node and try again; the kernel may have gotten stuck with an old partition layout.
 
I did that already several times, even tried to load an older kernel. Now running
Linux 5.0.21-3-pve #1 SMP PVE 5.0.21-7 (Mon, 30 Sep 2019 09:11:02 +0200)

Any other ideas?
 
Try to use ceph-volume directly. What version of Ceph are you running (ceph versions)?
 
Code:
ceph versions
{
    "mon": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 30
    },
    "mds": {},
    "overall": {
        "ceph version 14.2.4 (65249672c6e6d843510e7e01f8a4b976dcac3db1) nautilus (stable)": 36
    }
}

=====

Code:
Running command: /sbin/vgcreate -s 1G --force --yes ceph-e0618136-83c9-4bfd-b0a0-139ce1c72c39 /dev/sdd
 stderr: Device /dev/sdd excluded by a filter.
-->  RuntimeError: command returned non-zero exit status: 5

=====

Is this helpful?

Code:
ceph-volume inventory /dev/sdd

====== Device report /dev/sdd ======

     available                 False
     rejected reasons          locked
     path                      /dev/sdd
     scheduler mode            mq-deadline
     rotational                0
     vendor                    ATA
     human readable size       447.13 GB
     sas address               0x4433221103000000
     removable                 0
     model                     INTEL SSDSC2KB48
     ro                        0
 
stderr: Device /dev/sdd excluded by a filter.
So Ceph is blocking the creation for some reason. Is there anything more in the journal/syslog and ceph logs?
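"Device /dev/sdd excluded by a filter" is actually an LVM message (ceph-volume drives vgcreate under the hood), so besides the multipath angle it may be worth looking at the devices { } filter in /etc/lvm/lvm.conf. A purely illustrative excerpt (values are not taken from this node) of how a too-strict filter looks and how an explicit accept entry reads:

```
# /etc/lvm/lvm.conf (excerpt) -- illustrative values only
devices {
    # a global_filter like this would reject all plain sd* disks:
    #   global_filter = [ "r|/dev/sd.*|" ]
    # an explicit accept for the OSD disk, placed before any reject rule:
    global_filter = [ "a|/dev/sdd|", "r|.*|" ]
}
```

Note too that LVM's built-in multipath component detection rejects any disk that is part of an active mpath map, so the filter message can disappear on its own once the multipath maps are flushed.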
 
