[SOLVED] Problem after upgrade to ceph octopus

mikipn

Member
Jan 31, 2018
Hi, this evening I upgraded to Proxmox 6.3 and Ceph Octopus.
After restarting an OSD, the OSD process starts, but the OSD stays offline. I don't see anything in the OSD log suggesting the OSD is doing anything; it looks like a normal start, but the OSD is never considered up and in.
I will leave it like that during the night, to see if it is doing the format conversion that is mentioned in the manual for the upgrade from Nautilus to Octopus.

I also deleted one OSD and attempted to create it again.
The attempt failed with:

pveceph osd create /dev/sdb
create OSD on /dev/sdb (bluestore)
wipe disk/partition: /dev/sdb
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 0.506091 s, 414 MB/s
--> AttributeError: module 'ceph_volume.api.lvm' has no attribute 'is_lv'
command 'ceph-volume lvm create --cluster-fsid 4bcfed01-7c42-470f-99a7-dd54560eb61e --data /dev/sdb' failed: exit code 1
 

mikipn

I forgot to paste the output of pveversion:
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.78-1-pve: 5.4.78-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-4.15: 5.4-19
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.13.13-5-pve: 4.13.13-38
pve-kernel-4.13.13-3-pve: 4.13.13-34
pve-kernel-4.13.4-1-pve: 4.13.4-26
ceph: 15.2.8-pve2
ceph-fuse: 15.2.8-pve2
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
 

mikipn

Those OSDs were created under Nautilus, because I had to recreate them recently.
 

mikipn

But the whole Proxmox system went through upgrades from 5.x to 6.x.
During the upgrade to Nautilus I lost control of the OSDs in the GUI: the buttons for OSD commands do not work, but the command-line counterparts do.
I downgraded the packages to 15.2.6-pve1 and was then able to create an OSD, but it behaves the same way: the processes seem to run fine, but the Ceph cluster does not see them.
The OSDs have active TCP connections to the active mgr and mon nodes.
 

mikipn

I am now setting up a Proxmox cluster with version 6.3 to set up Octopus from scratch and see how that works.
 

mikipn

Problem solved.
I had
ceph osd require-osd-release mimic
and this blocked acceptance of the Octopus OSDs. After
ceph osd require-osd-release nautilus
all OSDs started without problems.
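For anyone hitting the same symptom, a minimal sketch of checking and raising that flag (the grep pattern is an assumption about the `ceph osd dump` output format; these commands change cluster state, so verify first):

```shell
# Show the currently required OSD release
# (look for a line like "require_osd_release mimic"):
ceph osd dump | grep require_osd_release

# Octopus OSDs are rejected while the requirement is older than
# nautilus, so raise it:
ceph osd require-osd-release nautilus

# Once every OSD actually runs Octopus, the upgrade notes say to
# finish with:
ceph osd require-osd-release octopus
```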
 
I had to reinstall a host from scratch and get the same error when adding a single OSD. The node is up and running with the Ceph services OK, but adding an OSD reports this error in the GUI:

create OSD on /dev/sdc (bluestore)
wipe disk/partition: /dev/sdc
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.83528 s, 114 MB/s
--> AttributeError: module 'ceph_volume.api.lvm' has no attribute 'is_lv'
TASK ERROR: command 'ceph-volume lvm create --cluster-fsid 0faf0a88-08b9-4143-b0bf-a8b7b9654527 --data /dev/sdc' failed: exit code 1

I zapped the disk with sgdisk, and it doesn't work. I tried from the console, with no luck. I tried another disk, and nothing. I suspect this is a bug in the Ceph libraries.
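For reference, a typical sequence for fully wiping a disk before `pveceph osd create` (the device path is just an example; double-check it, since these commands are destructive). It would not have helped here, because the error is inside ceph-volume itself:

```shell
# Remove GPT/MBR structures from the disk:
sgdisk --zap-all /dev/sdc

# Let ceph-volume tear down any leftover LVs/VGs/PVs it created:
ceph-volume lvm zap /dev/sdc --destroy

# Then recreate the OSD:
pveceph osd create /dev/sdc
```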

pveversion:
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph: 15.2.8-pve2
ceph-fuse: 15.2.8-pve2
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-4
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-4
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-3
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
 

mikipn

Hi,
have you checked ceph osd require-osd-release?
Although I am not sure it has anything to do with this problem; the problem you have I solved by downgrading Ceph to 15.2.6-pve1 on the node where I had that error.
Later I installed a PVE 6.3 cluster with Ceph 15 from scratch and everything went well, no such errors.
So maybe the value of ceph osd require-osd-release has something to do with it, if it is low enough.
Does the rest of your cluster also run Ceph 15.2.8?
 
Hi Miki

My whole cluster is on Ceph 15.2.8 and running OK. I posted the same error in this thread, and it seems to be a Ceph bug reported by Fabian.

I'll wait for the patch and hope it gets fixed soon. I can't downgrade Ceph and reinstall all nodes.

Greetings.
 

mikipn

Hi Julian,
for me the solution for the is_lv error was to downgrade Ceph on that node to 15.2.6-pve1, create the OSD, and afterwards upgrade back to 15.2.8-pve2. But a fix upstream is definitely the better solution.
And
ceph osd require-osd-release nautilus
solved my problem of not seeing OSDs after the upgrade to Octopus, because I had the requirement set to a release too old to be supported in Octopus anymore. The previous setting on my system was
ceph osd require-osd-release mimic
so ceph osd require-osd-release nautilus was not useful for the is_lv error.
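The downgrade workaround could look roughly like this. It is only a sketch: the exact set of Ceph packages to pin, and whether 15.2.6-pve1 is still available in the Proxmox Ceph repository, are assumptions you should verify on your node first:

```shell
# See which Ceph packages are installed before pinning anything:
dpkg -l | grep ceph

# Pin the affected node's Ceph packages back to 15.2.6-pve1
# (package set is an assumption -- adjust to what dpkg showed):
apt install ceph=15.2.6-pve1 ceph-base=15.2.6-pve1 \
    ceph-common=15.2.6-pve1 ceph-osd=15.2.6-pve1

# Create the OSD with the working ceph-volume:
pveceph osd create /dev/sdb

# Then return the node to the current version:
apt full-upgrade
```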
 

n0ll

New Member
Nov 11, 2020
I got the error when running pve6to7.

Bash:
sed -i 's/nautilus/octopus/' /etc/apt/sources.list.d/ceph.list
apt update
apt full-upgrade
reboot now

That resolved the error, and now I can upgrade to 7.
 
