microk8s connect-external-ceph error

l.ansaloni

Hi all,
I have a 3-node Proxmox cluster with Ceph storage. I installed microk8s version 1.28/stable on 3 VMs with the command:
sudo snap install microk8s --classic --channel=1.28/stable
I would like to use the Proxmox Ceph cluster as shared storage for the microk8s cluster, so I enabled rook-ceph. This is the status of the cluster:

Code:
$ microk8s status
microk8s is running
high-availability: yes
  datastore master nodes: 10.15.10.121:19001 10.15.10.122:19001 10.15.10.123:19001
  datastore standby nodes: none
addons:
  enabled:
    dns                  # (core) CoreDNS
    ha-cluster           # (core) Configure high availability on the current node
    helm                 # (core) Helm - the package manager for Kubernetes
    helm3                # (core) Helm 3 - the package manager for Kubernetes
    rook-ceph            # (core) Distributed Ceph storage using Rook
  disabled:
    cert-manager         # (core) Cloud native certificate management
    cis-hardening        # (core) Apply CIS K8s hardening
    community            # (core) The community addons repository
    dashboard            # (core) The Kubernetes dashboard
    gpu                  # (core) Automatic enablement of Nvidia CUDA
    host-access          # (core) Allow Pods connecting to Host services smoothly
    hostpath-storage     # (core) Storage class; allocates storage from host directory
    ingress              # (core) Ingress controller for external access
    kube-ovn             # (core) An advanced network fabric for Kubernetes
    mayastor             # (core) OpenEBS MayaStor
    metallb              # (core) Loadbalancer for your Kubernetes cluster
    metrics-server       # (core) K8s Metrics Server for API access to service metrics
    minio                # (core) MinIO object storage
    observability        # (core) A lightweight observability stack for logs, traces and metrics
    prometheus           # (core) Prometheus operator for monitoring and logging
    rbac                 # (core) Role-Based Access Control for authorisation
    registry             # (core) Private image registry exposed on localhost:32000
    storage              # (core) Alias to hostpath-storage add-on, deprecated

I then took the ceph.conf and ceph.client.admin.keyring files from the Proxmox cluster and used them for the connection command:
$ sudo microk8s connect-external-ceph --ceph-conf ceph.conf --keyring ceph.client.admin.keyring --rbd-pool microk8s-rbd
which however returns an error:

Code:
Attempting to connect to Ceph cluster
Successfully connected to cd8ad1c4-fde4-4a3a-b169-3f4ba4762d4b (10.15.15.121:0/3422868158)
WARNING: Pool microk8s-rbd already exists
Configuring pool microk8s-rbd for RBD
Successfully configured pool microk8s-rbd for RBD
Creating namespace rook-ceph-external
Error from server (AlreadyExists): namespaces "rook-ceph-external" already exists
Configuring Ceph CSI secrets
Traceback (most recent call last):
  File "/var/snap/microk8s/common/plugins/connect-external-ceph", line 184, in <module>
    main()
  File "/snap/microk8s/6089/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/snap/microk8s/6089/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/snap/microk8s/6089/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/snap/microk8s/6089/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/var/snap/microk8s/common/plugins/connect-external-ceph", line 169, in main
    import_external_ceph_cluster(ceph_conf, keyring, namespace, rbd_pool)
  File "/var/snap/microk8s/common/plugins/connect-external-ceph", line 109, in import_external_ceph_cluster
    p = subprocess.run(
  File "/snap/microk8s/6089/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/snap/microk8s/6089/usr/bin/python3', PosixPath('/var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py'), '--format=bash', '--rbd-data-pool-name=microk8s-rbd', '--ceph-conf=ceph.conf', '--keyring=ceph.client.admin.keyring']' returned non-zero exit status 1.

I am aware that the problem is not related to Proxmox, which works perfectly, but if anyone has already done a similar configuration and can help me I would be very grateful.
 
I deleted the microk8s-rbd pool from Proxmox and the rook-ceph-external namespace from microk8s, but the command still fails:

Code:
$ sudo microk8s connect-external-ceph --ceph-conf ceph.conf --keyring ceph.client.admin.keyring --rbd-pool microk8s-rbd
Attempting to connect to Ceph cluster
Successfully connected to cd8ad1c4-fde4-4a3a-b169-3f4ba4762d4b (10.15.15.121:0/3342210623)
Creating pool microk8s-rbd in Ceph cluster
Configuring pool microk8s-rbd for RBD
Successfully configured pool microk8s-rbd for RBD
Creating namespace rook-ceph-external
namespace/rook-ceph-external created
Configuring Ceph CSI secrets
Traceback (most recent call last):
  File "/var/snap/microk8s/common/plugins/connect-external-ceph", line 184, in <module>
    main()
  File "/snap/microk8s/6089/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/snap/microk8s/6089/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/snap/microk8s/6089/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/snap/microk8s/6089/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/var/snap/microk8s/common/plugins/connect-external-ceph", line 169, in main
    import_external_ceph_cluster(ceph_conf, keyring, namespace, rbd_pool)
  File "/var/snap/microk8s/common/plugins/connect-external-ceph", line 109, in import_external_ceph_cluster
    p = subprocess.run(
  File "/snap/microk8s/6089/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/snap/microk8s/6089/usr/bin/python3', PosixPath('/var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py'), '--format=bash', '--rbd-data-pool-name=microk8s-rbd', '--ceph-conf=ceph.conf', '--keyring=ceph.client.admin.keyring']' returned non-zero exit status 1.

The command successfully created the microk8s-rbd pool and the rook-ceph-external namespace, which means that the connection to the Ceph cluster is working.
 
The communication with the Ceph cluster obviously works.

Is there a debug option for microk8s that would output more info on what is failing here?

Can you run the following command?
Bash:
/snap/microk8s/6089/usr/bin/python3 /var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py --format=bash --rbd-data-pool-name=microk8s-rbd --ceph-conf=ceph.conf --keyring=ceph.client.admin.keyring
 
I'm sorry, but I'm not very experienced with microk8s and I can't find any other debug option.

Code:
$ /snap/microk8s/6089/usr/bin/python3 /var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py --format=bash --rbd-data-pool-name=microk8s-rbd --ceph-conf=ceph.conf --keyring=ceph.client.admin.keyring
Execution Failed: 'auth get-or-create client.healthchecker' command failed
Error: key for client.healthchecker exists but cap osd does not match
Traceback (most recent call last):
  File "/var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py", line 1819, in <module>
    raise err
  File "/var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py", line 1816, in <module>
    rjObj.main()
  File "/var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py", line 1798, in main
    generated_output = self.gen_shell_out()
  File "/var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py", line 1528, in gen_shell_out
    self._gen_output_map()
  File "/var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py", line 1435, in _gen_output_map
    self.out_map["ROOK_EXTERNAL_USER_SECRET"] = self.create_checkerKey()
  File "/var/snap/microk8s/common/plugins/.rook-create-external-cluster-resources.py", line 1148, in create_checkerKey
    raise ExecutionFailureException(
__main__.ExecutionFailureException: 'auth get-or-create client.healthchecker' command failed
Error: key for client.healthchecker exists but cap osd does not match

It seems like the error is this:
Error: key for client.healthchecker exists but cap osd does not match

On the Proxmox server, if I run the following command I get:
Code:
# ceph auth get-or-create client.healthchecker
[client.healthchecker]
        key = AQDS4M1lCmSgGxxxxxxxxxxxx

I'm not sure whether I can delete the client.healthchecker user from Ceph on Proxmox. I wouldn't want to damage Ceph: the Proxmox server is in production and there can't be any errors.
 
Please run "ceph auth get client.healthchecker" to show the capabilities of that key.
AFAIK this is not a standard Ceph client key and is used exclusively by rook. You could remove that key with "ceph auth rm client.healthchecker" and try to rerun the microk8s setup.
If all else fails, you can recreate the key with the capabilities from the "ceph auth get" output above.
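
Before removing the key on a production cluster, it may be worth saving it to a file first, so that the original key and capabilities can be put back. The following is a minimal sketch, assuming it runs on a Ceph node with admin rights. The function names and the backup file path are made up for illustration and are not part of Ceph or microk8s; only the "ceph auth get", "ceph auth rm" and "ceph auth import" commands are real.

```shell
#!/usr/bin/env bash
# Sketch: save a Ceph auth entity (key + caps) to a file, then remove it.
# Function names and the backup path are hypothetical helpers for this example.
set -euo pipefail

backup_and_remove_key() {
    local entity="$1"        # e.g. client.healthchecker
    local backup_file="$2"   # e.g. ./client.healthchecker.keyring.bak

    # Export the key together with its capabilities in keyring format.
    ceph auth get "$entity" -o "$backup_file"

    # Refuse to delete anything if the backup is missing or empty.
    [ -s "$backup_file" ] || { echo "backup failed, not removing $entity" >&2; return 1; }

    ceph auth rm "$entity"
}

restore_key() {
    local backup_file="$1"
    # Re-import the saved entity with its original key and caps.
    ceph auth import -i "$backup_file"
}
```

For example, "backup_and_remove_key client.healthchecker ./client.healthchecker.keyring.bak" followed by a rerun of the microk8s setup. If the old key is needed again, "restore_key ./client.healthchecker.keyring.bak" should bring it back.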
 
It worked! Thanks so much for the advice!

I deleted the client.healthchecker user from Ceph with the command ceph auth rm client.healthchecker, and the connect command then completed successfully (I only added the "-E" option because of another, previously reported problem):

Code:
$ sudo -E microk8s connect-external-ceph --ceph-conf ceph.conf --keyring ceph.client.admin.keyring --rbd-pool microk8s-rbd
Attempting to connect to Ceph cluster
Successfully connected to cd8ad1c4-fde4-4a3a-b169-3f4ba4762d4b (10.15.15.121:0/1182513807)
WARNING: Pool microk8s-rbd already exists
Configuring pool microk8s-rbd for RBD
Successfully configured pool microk8s-rbd for RBD
Creating namespace rook-ceph-external
Error from server (AlreadyExists): namespaces "rook-ceph-external" already exists
Configuring Ceph CSI secrets
Successfully configured Ceph CSI secrets
Importing Ceph CSI secrets into MicroK8s
secret rook-ceph-mon already exists
configmap rook-ceph-mon-endpoints already exists
secret rook-csi-rbd-node already exists
secret csi-rbd-provisioner already exists
secret csi-cephfs-node already exists
secret csi-cephfs-provisioner already exists
storageclass ceph-rbd already exists
storageclass cephfs already exists
Importing external Ceph cluster
NAME: rook-ceph-external
LAST DEPLOYED: Fri Feb 16 11:55:34 2024
NAMESPACE: rook-ceph-external
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The Ceph Cluster has been installed. Check its status by running:
  kubectl --namespace rook-ceph-external get cephcluster


Visit https://rook.io/docs/rook/latest/CRDs/ceph-cluster-crd/ for more information about the Ceph CRD.


Important Notes:
- You can only deploy a single cluster per namespace
- If you wish to delete this cluster and start fresh, you will also have to wipe the OSD disks using `sfdisk`


=================================================


Successfully imported external Ceph cluster. You can now use the following storageclass
to provision PersistentVolumes using Ceph CSI:


NAME       PROVISIONER                     RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ceph-rbd   rook-ceph.rbd.csi.ceph.com      Delete          Immediate           true                   22m
cephfs     rook-ceph.cephfs.csi.ceph.com   Delete          Immediate           true                   22m

I'd like to take this opportunity to ask something else. I discovered that with the Ceph storage class I can only create volumes in ReadWriteOnce mode, so I cannot have multiple replicated pods writing to the same volume. Did I miss something, or is there no way to have a volume in ReadWriteMany mode?
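
For what it's worth, RBD volumes mounted as a filesystem are single-writer, while CephFS is the Ceph CSI backend that generally supports ReadWriteMany. The output above lists a "cephfs" storage class next to "ceph-rbd", so a claim against that class may be what is needed. This is only a sketch: it assumes a CephFS filesystem (with an MDS) actually exists on the external Ceph cluster, which this thread does not confirm, and the claim name and size are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print a PVC manifest that requests ReadWriteMany from the "cephfs" class.
# The helper name, claim name and size are hypothetical; "cephfs" is the
# storage class name shown in the connect-external-ceph output.
set -euo pipefail

rwx_pvc_manifest() {
    local name="$1"   # claim name, e.g. shared-data
    local size="$2"   # requested capacity, e.g. 1Gi
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: cephfs
  resources:
    requests:
      storage: ${size}
EOF
}
```

It could be applied with "rwx_pvc_manifest shared-data 1Gi | microk8s kubectl apply -f -". If the claim stays Pending, the CephFS side of the external cluster is the first thing to check.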
 
