Proxmox 5, Ceph Luminous observations and notes

alexskysilk

Apologies if this belongs in a different forum. I set up a cluster using Proxmox 5/stretch + Ceph 12 (Luminous) in the lab. Here are some observations that may be useful for UX purposes:

1. The default rbd pool has always been a needless nuisance, but it used to be easy to delete. With Luminous the default behavior is to deny pool deletion. That is generally the correct behavior, but it creates a UX problem: a default, useless pool is created at pveceph install and cannot be removed. The restriction also affects deletion of any other pool, so the global option for pool deletion (mon_allow_pool_delete) should either be set, or the GUI should provide an alternative, one-time way to delete a pool (a possible CLI workaround is sketched after this list).
2. Creating OSDs is a very iffy proposition: the process completes well enough from either the GUI or the CLI, but the OSDs are not added to the crush map. I got the OSDs from one node added ONCE; the other nodes did not work - they are not even showing up as hosts in the crush map. I am following the normal process of ceph-disk zap followed by pveceph createosd. The OSD creation does appear in the task log and completes successfully, but the OSD is neither mounted nor does it start a process.
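For reference, here is what I mean in commands. The pool-deletion part is only a sketch of the workaround I would expect to need on a test cluster (the injectargs call and the literal pool name "rbd" are assumptions from my setup); the last two lines are the OSD creation sequence I actually run, using /dev/sdc as an example:

Code:
# allow pool deletion on the running monitors (test cluster only)
ceph tell mon.* injectargs '--mon_allow_pool_delete=true'
# drop the default pool; Ceph wants the name twice plus the confirmation flag
ceph osd pool delete rbd rbd --yes-i-really-really-mean-it

# per-disk OSD creation as described above
ceph-disk zap /dev/sdc
pveceph createosd /dev/sdc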

pveversion -v
proxmox-ve: 5.0-6 (running kernel: 4.10.8-1-pve)
pve-manager: 5.0-9 (running version: 5.0-9/c7bdd872)
pve-kernel-4.4.19-1-pve: 4.4.19-66
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.4.49-1-pve: 4.4.49-86
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.10.8-1-pve: 4.10.8-6
pve-kernel-4.2.2-1-pve: 4.2.2-16
pve-kernel-4.2.8-1-pve: 4.2.8-41
pve-kernel-4.4.16-1-pve: 4.4.16-64
libpve-http-server-perl: 2.0-2
lvm2: 2.02.168-pve2
corosync: 2.4.2-pve2
libqb0: 1.0.1-1
pve-cluster: 5.0-4
qemu-server: 5.0-4
pve-firmware: 2.0-2
libpve-common-perl: 5.0-8
libpve-guest-common-perl: 2.0-1
libpve-access-control: 5.0-3
libpve-storage-perl: 5.0-3
pve-libspice-server1: 0.12.8-3
vncterm: 1.4-1
pve-docs: 5.0-1
pve-qemu-kvm: 2.9.0-1
pve-container: 2.0-6
pve-firewall: 3.0-1
pve-ha-manager: 2.0-1
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.0.7-500
lxcfs: 2.0.6-pve500
criu: 2.11.1-1~bpo90
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.6.5.9-pve16~bpo90
ceph: 12.0.1-pve1

pveceph status attached.
 


Apologies if this belongs in a different forum. I set up a cluster using Proxmox 5/stretch + Ceph 12 (Luminous) in the lab. Here are some observations that may be useful for UX purposes:

1. The default rbd pool has always been a needless nuisance, but it used to be easy to delete. With Luminous the default behavior is to deny pool deletion. That is generally the correct behavior, but it creates a UX problem: a default, useless pool is created at pveceph install and cannot be removed. The restriction also affects deletion of any other pool, so the global option for pool deletion (mon_allow_pool_delete) should either be set, or the GUI should provide an alternative, one-time way to delete a pool.

There is a proposal on pve-devel to enable pool deletion by default (when Ceph is set up using "pveceph init").

2. Creating OSDs is a very iffy proposition: the process completes well enough from either the GUI or the CLI, but the OSDs are not added to the crush map. I got the OSDs from one node added ONCE; the other nodes did not work - they are not even showing up as hosts in the crush map. I am following the normal process of ceph-disk zap followed by pveceph createosd. The OSD creation does appear in the task log and completes successfully, but the OSD is neither mounted nor does it start a process.

Could you post the complete journal output from such a failed OSD creation? I think there is a problem on some systems because udev already tries to activate the OSD before it is done initializing.

pveversion -v
proxmox-ve: 5.0-6 (running kernel: 4.10.8-1-pve)
..
ceph: 12.0.1-pve1

There'll be 12.0.2 packages shortly, if you want to test those ;)
 
Be glad to. How do I do that when the OSD isn't mounted?

Code:
journalctl --since "timestamp one minute before pveceph createosd" --until "timestamp 3 minutes after pveceph createosd"

where the timestamps look like "2017-05-03 08:30:00" or "2017-05-03 08:30". If you redo it "now", you can also do something like
Code:
journalctl --since "-5m"
to get the last 5 minutes of logs ;)
 
It does sound strange, but it happens every time I use ceph-disk zap. It doesn't appear to affect anything. Parted doesn't show anything wrong at all:

parted /dev/sdc
GNU Parted 3.2
Using /dev/sdc
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: ATA MM1000GBKAL (scsi)
Disk /dev/sdc: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name          Flags
 2      1049kB  5370MB  5369MB               ceph journal
 1      5370MB  1000GB  995GB   xfs          ceph data
 
The log indicates that something is not right with your bootstrap keys:

Code:
May 04 11:03:35 pve22 sh[1067736]: ceph_disk.main.Error: Error: ceph osd create failed: Command '/usr/bin/ceph' returned non-zero exit status 1: 2017-05-04 11:03:35.471823 7f71f4c9e700  0 librados: client.bootstrap-osd authentication error (1) Operation not permitted

I assume this is a test cluster?

If so, what does "ceph auth list | grep -v key" output?
Does the file /var/lib/ceph/bootstrap-osd/ceph.keyring exist? What permissions does it have?
Do "ceph auth get client.bootstrap-osd" and "ceph-authtool -l /var/lib/ceph/bootstrap-osd/ceph.keyring" print the same key? (No need to post the output here, just whether they are the same!)
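Collected in one place, the checks on the affected node would look like this (using ls -l for the existence/permissions question):

Code:
ceph auth list | grep -v key
ls -l /var/lib/ceph/bootstrap-osd/ceph.keyring
ceph auth get client.bootstrap-osd
ceph-authtool -l /var/lib/ceph/bootstrap-osd/ceph.keyring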
 
If so, what does "ceph auth list | grep -v key" output?
installed auth entries:

osd.0
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.1
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.2
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.3
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.4
        caps: [mon] allow rwx
        caps: [osd] allow *
client.admin
        auid: 0
        caps: [mds] allow
        caps: [mon] allow *
        caps: [osd] allow *
client.bootstrap-mds
        caps: [mon] allow profile bootstrap-mds
client.bootstrap-osd
        caps: [mon] allow profile bootstrap-osd
client.bootstrap-rgw
        caps: [mon] allow profile bootstrap-rgw
mgr.0
        caps: [mon] allow *
mgr.1
        caps: [mon] allow *
mgr.2
        caps: [mon] allow *
mgr.3
        caps: [mon] allow *

Does the file /var/lib/ceph/bootstrap-osd/ceph.keyring exist? What permissions does it have?
It does.
-rw-r--r-- 1 ceph ceph 113 Dec 14 2015 ceph.keyring
Do "ceph auth get client.bootstrap-osd" and "ceph-authtool -l /var/lib/ceph/bootstrap-osd/ceph.keyring" print the same key?
Interesting. They do not. I should note that these nodes were upgraded from 4.4 and I did a pveceph purge before reinstalling Ceph...
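Presumably the stale keyring could simply be replaced with the key the monitors currently hold; something like the following should do it, although I have not tried it:

Code:
# untested: re-export the current bootstrap-osd key over the stale keyring
ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring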
 
I went ahead and purged the config, then manually deleted /var/lib/ceph/* on each participating node, and the problem went away.
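Roughly what that amounted to, from memory (the re-setup afterwards is the usual pveceph install / pveceph init already mentioned above):

Code:
# wipe the Proxmox Ceph config, then the local Ceph state on every participating node
pveceph purge
rm -rf /var/lib/ceph/*
# afterwards, reinstall and re-initialize as usual
pveceph install
pveceph init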

I think "pveceph purge" needs to be more aggressive (maybe with an --aggressive switch) and also remove the node-related Ceph settings on purge.