ceph warning post upgrade to v8

@Max Carrara Thank you for attempting to fix this issue. I've been following this thread for a few weeks now hoping that it would eventually bear some fruit; sad to see that's not the case.

As we don't have a dashboard (and seemingly won't get one for quite some time) I was hoping someone could point me in the direction of creating rgw (s3) buckets through the command line. All the documentation I've found so far says that you need to do it through the dashboard and I haven't found a single command that would create a bucket.

If I could create a bucket through the command line, I wouldn't even need the dashboard...
 
You can use the CLI tool radosgw-admin to manage users and buckets.
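
As a rough sketch of what that could look like without the dashboard: as far as I know, radosgw-admin creates users and manages existing buckets, while the bucket itself is created through the S3 API with the user's keys. The helper names, the example endpoint, and the use of the third-party boto3 package are my own assumptions here, not something from this thread.

```python
import json
import subprocess
from typing import List, Tuple


def radosgw_user_create_cmd(uid: str, display_name: str) -> List[str]:
    """Build the radosgw-admin command line that creates an RGW user."""
    return [
        "radosgw-admin", "user", "create",
        f"--uid={uid}",
        f"--display-name={display_name}",
    ]


def create_rgw_user(uid: str, display_name: str) -> Tuple[str, str]:
    """Run radosgw-admin on a Ceph node and return (access_key, secret_key)."""
    result = subprocess.run(
        radosgw_user_create_cmd(uid, display_name),
        check=True, capture_output=True, text=True,
    )
    # radosgw-admin prints the new user as JSON, including its S3 keys.
    key = json.loads(result.stdout)["keys"][0]
    return key["access_key"], key["secret_key"]


def create_bucket(endpoint_url: str, access_key: str, secret_key: str,
                  bucket_name: str) -> None:
    """Create a bucket through the RGW S3 endpoint (assumes boto3 is installed)."""
    import boto3  # third-party; imported lazily so the helpers above work without it

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,  # hypothetical, e.g. "http://pve1:7480"
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    s3.create_bucket(Bucket=bucket_name)
```

Usage would be something like calling create_rgw_user("demo", "Demo User") on a node with an admin keyring, then passing the returned keys to create_bucket together with the address of your RGW instance.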
 
To summarize: Upgrading to Proxmox v8 works without any issues regarding Ceph itself—only the Ceph dashboard (which I also enabled for a simple overview of RadosGW) doesn't work after upgrading. Is that correct?
 
Precisely.

However, in some instances the dashboard seems to continue to work, due to the dashboard not needing to be "bootstrapped" again. I haven't really had the time to reproduce that behaviour myself, but I think it's pretty much safe to say that the dashboard will be broken in most (if not all) circumstances.
 
Upgrade PVE 8.0 -> 8.1

Bash:
# pveversion
pve-manager/8.1.3/b46aac3b42da5d15 (running kernel: 6.5.11-4-pve)

Bash:
# ceph -s
health: HEALTH_WARN
Module 'dashboard' has failed dependency: PyO3 modules may only be initialized once per interpreter process

Bash:
# systemctl status ceph-mgr@pve1
Nov 25 12:58:16 pve1 ceph-mgr[3433]: File "/lib/python3/dist-packages/cryptography/x509/__init__.py", line 6, in <module>
Nov 25 12:58:16 pve1 ceph-mgr[3433]: from cryptography.x509 import certificate_transparency
Nov 25 12:58:16 pve1 ceph-mgr[3433]: File "/lib/python3/dist-packages/cryptography/x509/certificate_transparency.py", line 10, in <module>
Nov 25 12:58:16 pve1 ceph-mgr[3433]: from cryptography.hazmat.bindings._rust import x509 as rust_x509
Nov 25 12:58:16 pve1 ceph-mgr[3433]: ImportError: PyO3 modules may only be initialized once per interpreter process
Nov 25 12:58:16 pve1 ceph-mgr[3433]: 2023-11-25T12:58:16.439+0300 7f617dc1d000 -1 mgr[py] Class not found in module 'restful'
Nov 25 12:58:16 pve1 ceph-mgr[3433]: 2023-11-25T12:58:16.439+0300 7f617dc1d000 -1 mgr[py] Error loading module 'restful': (2) No such file or directory
Nov 25 12:58:16 pve1 ceph-mgr[3433]: 2023-11-25T12:58:16.611+0300 7f617dc1d000 -1 mgr[py] Module test_orchestrator has missing NOTIFY_TYPES member
Nov 25 12:58:16 pve1 ceph-mgr[3433]: 2023-11-25T12:58:16.691+0300 7f617dc1d000 -1 mgr[py] Module telegraf has missing NOTIFY_TYPES member
Nov 25 12:58:16 pve1 ceph-mgr[3433]: 2023-11-25T12:58:16.691+0300 7f617dc1d000 -1 log_channel(cluster) log [ERR] : Failed to load ceph-mgr modules: restful

Bash:
# ceph version
ceph version 17.2.7 (e303afc2e967a4705b40a7e5f76067c10eea0484) quincy (stable)
 
Hi everyone.

One of my viewers emailed me after a video about another Ceph module that had a dependency issue. I read through this thread, and as I understand it, the issue is that JWT depends on Rust code and that the interpreters don't handle this gracefully.

I might be totally out of line, and maybe I've misunderstood the issue.

But if we assume that the problem is with the JWT module and that JWT is a well-understood standard that could easily be implemented in 100 lines of code, why not replace it with a native implementation?

Something like this:
https://github.com/ceph/ceph/commit/e1d21fc7f8ffbe9a373dcf0f59317ef8cd293ab4
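
To give a rough idea of the size involved, here is a minimal sketch of HS256-only JWT encoding and decoding using nothing but the Python standard library. It is not the code from the commit above; the function names and the choice to support only HS256 and the exp claim are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as used by JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding first."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def _sign(signing_input: bytes, secret: str) -> bytes:
    """HMAC-SHA256 over 'header.payload', keyed with the shared secret."""
    return hmac.new(secret.encode("utf-8"), signing_input, hashlib.sha256).digest()


def jwt_encode(payload: dict, secret: str) -> str:
    """Return a compact JWT signed with HS256."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        _b64url_encode(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        _b64url_encode(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    ]
    signature = _sign(".".join(segments).encode("ascii"), secret)
    return ".".join(segments + [_b64url_encode(signature)])


def jwt_decode(token: str, secret: str) -> dict:
    """Verify an HS256 token and return its payload; raise ValueError otherwise."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(_b64url_decode(header_b64))
    # Only accept the one algorithm we implement; never trust "none".
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = _sign(f"{header_b64}.{payload_b64}".encode("ascii"), secret)
    # Constant-time comparison to avoid leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if "exp" in payload and time.time() >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

A token created with jwt_encode({"sub": "admin", "exp": time.time() + 3600}, secret) round-trips through jwt_decode with the same secret, and none of this touches the cryptography module.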

Best regards, a non-native Python programmer
Daniel
 
The problem is anything that uses the "cryptography" module (which in turn uses PyO3/Rust), and unfortunately that module is fairly widespread. The only real solution is either to fix PyO3 to support this use case (work is ongoing, but it's a pretty big change) or for Ceph to change the way it sets up and calls mgr modules.
 
Hi @fabian

I've read the thread, and I understand that the cryptography library could be a problem. But if you never use it, then you don't need it, IMHO. In this case, we never use the EllipticCurve curves or DSS signature, so why import it?

From the JWT library:

try:
    from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurve
    from cryptography.hazmat.primitives.asymmetric.utils import (
        decode_dss_signature,
        encode_dss_signature,
    )
except ModuleNotFoundError:
    pass

Again I might have misunderstood the problem.

Best regards
Daniel
 
Thank you all for your efforts in investigating this problem.

I ran into this problem too, and after reading this thread I'm not entirely sure whether I can safely disable the offending module (restful in this case).
Is it used just for the dashboard, or did I break something else by disabling it? :eek:
 
Hi again

The code I mentioned above has now been refined and tested, and I've created a PR against the Ceph repository.

https://github.com/ceph/ceph/pull/54710

Best regards
Daniel

Awesome work, thank you very much! I had absolutely not noticed that PyJWT is only barely used within the Dashboard, so this looks really promising!

I'll keep my eyes on the PR and on Ceph overall. If this gets merged, the PR will hopefully trickle downstream quickly - otherwise I'll see that we ship your patch ourselves in the meantime.

Fingers crossed! This is very good news.
 
Greetings. I am wondering if any progress has been made here? Thanks.
 
It's not merged upstream yet.
 
Thanks all for the fantastic work here to diagnose and troubleshoot.

I am currently in the process of upgrading our production cluster to v8 and am already missing my Ceph dashboard.

As a temporary fix - is there a working method to port Ceph data to the native Prox influx server? I've tried repeatedly in the past, but never managed to integrate it into our Grafana dashboard. Ceph's native influx module never cooperated for me.
 
I just found this thread after a fresh install of Proxmox 8.1 and Ceph 18.2 only to find out that the dashboard, due to said upstream dependency, doesn't work.

I understand that it is generally NOT recommended to use an older version (where and when security updates are concerned), but for those that might be just running it for their own homelab, where that may potentially be slightly less of an issue -- I just reinstalled Proxmox 8.1 and then installed Ceph version 17.2.7 and at least the dashboard isn't complaining about the same issue as Ceph version 18.2.

(Side note: it still complains about restful, unfortunately, but I just disabled that module to make the message go away, as I am not really certain what it is used for. I am equally certain that there are other people for whom this "solution" won't work.)
 
Hello everybody, I've got good news!

I returned from my Christmas holidays and just sent a patch that backports @kalaspuffar's pull request to our mailing list. Once / if the patch gets applied and shipped, the dashboard can be made to work again - though, a small workaround is required, for now.

In essence, the dashboard seems to only launch if TLS has been disabled in its configuration. You should then be able to put the dashboard behind a reverse proxy that does all the TLS termination work instead, at least for the time being. I will post more detailed instructions once the patch is actually applied (don't want to get too excited here ;)).

So, yes, the dashboard will be usable again, albeit with a small workaround!

I'll keep you posted.
 
Amazing news, can't wait!
 
Hello again everybody! This time I've got fantastic news.

In my previous post I had mentioned that the dashboard will only be able to be used if TLS is turned off. This is no longer the case; the dashboard will work again as intended. So, no reverse proxy or other workarounds needed. The patch series was recently applied, which means that you should eventually see updates trickle in.

Some more details: Besides the backport of the PyJWT replacement, I've found that there are only a couple usages of another module that uses PyO3. That module was PyOpenSSL. All other SSL/TLS-related functions use Python's built-in ssl module from the standard library. This module however doesn't expose everything OpenSSL can do, which is probably why PyOpenSSL helper functions were brought in.

One of those usages was a check during the dashboard's startup, that made sure that the TLS certificate and key match. In my opinion, it's very unlikely for such misconfiguration to happen, and if it does, your browser will warn you anyway.
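
For anyone who still wants to verify a certificate/key pair by hand, here is a minimal sketch that uses only the standard library's ssl module. It is not what the patch does (the patch simply drops the check); the function name is hypothetical, and it assumes PEM files with an unencrypted private key.

```python
import ssl


def cert_and_key_match(cert_path: str, key_path: str) -> bool:
    """Return True if the private key at key_path belongs to the certificate at cert_path."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # load_cert_chain raises ssl.SSLError when the key does not match
        # the certificate, and OSError when a file cannot be read.
        # Assumption: the key is not passphrase-protected (otherwise
        # OpenSSL would prompt for the passphrase interactively).
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except OSError:  # ssl.SSLError is a subclass of OSError
        return False
    return True
```

Calling cert_and_key_match with the paths of your certificate and key before handing them to the dashboard should catch a mismatched pair early.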

The only other caveat is that the ceph dashboard create-self-signed-cert command will no longer work. Instead, you'll have to manually provide a self-signed certificate and key - when you try to use the command, you will be shown a little help message on how to achieve that. It's almost frictionless. ;) Just make sure the cert and key match up, or your browser will complain (due to the removal of the aforementioned check). You will only need this command during setup of the dashboard anyway, so for existing users, you should see your dashboard come up again once updates are out and installed. If it doesn't come up or there's some other problem, please ping me!
 
