: Ceph auth=none required after reinstall on Proxmox 9 / Ceph 19 Squid — cephx breaks monitors, mgr, OSDs

n7qnm

Member
Jan 4, 2024
Prosser, WA, USA
www.n7qnm.net
Background: After a failed mixed Proxmox 8/9 upgrade broke Ceph, I did a full Ceph reinstall on a 5-node Proxmox 9.1 cluster running Ceph 19.2.3 (Squid). After the reinstall I could not get cephx working: enabling it causes the monitors to lose quorum and the OSDs and managers to fail to start.

Symptoms:

With cephx enabled, all daemons fail with either "handle_auth_bad_method server allowed_methods [1] but i only support [2]" or the reverse

Monitors lose quorum immediately when auth_*_required = none is removed from ceph.conf

OSDs fail under systemd with failed to fetch mon config but work fine when run in foreground as root

pvesm list <storage> returns rbd error: rbd: listing images failed: (95) Operation not supported with auth=none, or (13) Permission denied with cephx

What I found:

After the reinstall, the monitor keyrings were stored as [mon.] instead of [mon.pve0] etc., and were not registered in the cluster auth database — fixed by using ceph-authtool to create properly named keyrings and ceph auth add to register them

The ceph user was not in the www-data group, so it couldn't read /etc/pve/ceph.conf (owned by root:www-data, mode 640) — fixed with usermod -aG www-data ceph on each node

OSDs under systemd fail auth negotiation even with auth=none in ceph.conf — workaround is --no-mon-config in a systemd override

Proxmox RBD plugin (PVE::CephConfig) sets auth_supported=cephx if the storage keyring file exists, causing rados_connect to fail with error 95 — workaround is removing the keyring file so it falls back to auth_supported=none

Ceph config database auth settings (set via ceph config set) override ceph.conf and affect daemons differently — caused significant confusion during troubleshooting
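For anyone hitting the same thing, the keyring fix above looks roughly like this (mon id pve0 and the /tmp path are examples, and the caps shown are the stock monitor caps — not copied verbatim from my shell history):

```shell
# Create a keyring whose entity name matches the monitor id
# instead of the bare [mon.] that the reinstall left behind:
ceph-authtool --create-keyring /tmp/mon-pve0.keyring --gen-key \
    -n mon.pve0 --cap mon 'allow *'

# Register the key in the cluster auth database:
ceph auth add mon.pve0 -i /tmp/mon-pve0.keyring

# Verify it is now known to the cluster:
ceph auth get mon.pve0
```

Repeat per monitor (mon.pve1, mon.pve4, ...). These commands need a reachable monitor, which is why I had to do this while running with auth=none.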

Current workarounds in place:

auth_*_required = none in /etc/pve/ceph.conf

--no-mon-config systemd override for all OSDs on all nodes

Storage keyring files removed (renamed to .bak)

usermod -aG www-data ceph on all nodes
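The --no-mon-config override is a standard systemd drop-in. Mine is roughly this — note the ExecStart line restates the packaged unit's command, which can differ between Ceph versions, so check yours first with `systemctl cat ceph-osd@.service`:

```shell
# Drop-in applying to every OSD instance on the node (standard
# override directory for the ceph-osd@ template unit):
mkdir -p /etc/systemd/system/ceph-osd@.service.d
cat > /etc/systemd/system/ceph-osd@.service.d/no-mon-config.conf <<'EOF'
[Service]
# Clear the stock ExecStart, then restate it with --no-mon-config appended.
# The base command line below mirrors my packaged unit; verify against yours.
ExecStart=
ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph --no-mon-config
EOF
systemctl daemon-reload
```

After the daemon-reload, restart one OSD and confirm it comes up before rolling the drop-in out to the other nodes.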

Questions:

Why do monitors fail to form quorum with cephx even after properly registering the mon.pve0/pve1/pve4 keys in the auth database?

Is there a supported procedure for re-enabling cephx after a Ceph reinstall on an existing Proxmox cluster?

Is the www-data group membership for the ceph user supposed to be set automatically by Proxmox? If so, why wasn't it set during reinstall?

Should the OSD systemd unit be able to read /etc/pve/ceph.conf via the symlink at /etc/ceph/ceph.conf? The symlink exists but the file permissions prevent the ceph user from reading it without the www-data group.
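For reference, the permission check that led me to the www-data conclusion was roughly this (node-local commands; your output will vary):

```shell
# The symlink and the underlying pmxcfs file it points at:
ls -la /etc/ceph/ceph.conf /etc/pve/ceph.conf

# Can the ceph user actually read it? (fails without the www-data group)
sudo -u ceph cat /etc/pve/ceph.conf > /dev/null && echo readable || echo "not readable"

# Current group membership of the ceph user:
id -nG ceph
```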

Environment:

Proxmox VE 9.1.5

Ceph 19.2.3-pve4 (Squid)

5 nodes, 3 monitors (pve0, pve1, pve4), 6 OSDs across 3 hosts

Fresh Ceph reinstall (not upgrade)
 
Reactions: david.bowser
Thanks for documenting the work.

I ran into a similar issue going to 9.1 with squid running normally on 8.4. I upgraded the servers with OSDs first (no issues), but spread out over hours to make sure nothing died. Then upgraded 2 ceph mons before it started to bork. I noticed the mons and mgr services on the upgraded servers were failing and the ceph GUI and ceph -s were timing out, but the ceph storage was still working. I left the last mon/mgr on 8.4 and got some sleep to see if it worked itself out over night.

I ended up finding your post after several hours of troubleshooting.

My workarounds:
auth_*_required = none
storage keyring renamed - this was a weird one for me because some VMs/LXCs came back up before I made this change and some failed. No clue why.
usermod -aG www-data ceph on all nodes
had to use the ceph mon cleanup procedure because although the mgr would run as standby on 9.1, the upgraded mons would just crash out
created a new mon on an upgraded server (2 mons total) to get ceph quorum

Plan
I am going to rebuild the 2 broken mon/mgr nodes on 9.1 from scratch and add them back.
I will keep my 8.4 mon/mgr until I can confirm that everything is stable
 
Thanks for the response! Haven't had any others yet...

Have you done the rebuild yet? Did it work?
You mention "ceph command mon cleanup" I don't see a "mon cleanup" command. Can you share the specific commands you used ?
 
I rebuilt 1 broken mon from the 9.1-1 ISO. It has been running the ceph mon for 8 hours. I am letting it sit for a day (need to work on other things) to make sure it is fully stable.

The "mon cleanup" was supposed to be a link to the unhealthy-cluster section of the docs, but I think I put in the wrong link. I did steps 3-8 in that section exactly as written, with my node names. I was on the console, so steps 1-2 were moot.

Current state:
7 total Proxmox nodes
5 were upgraded from 8.4 (pve1-5)
pve6 was rebuilt from the 9.1-1 ISO (running a ceph mon, no VMs or extra services)
1 is still on 8.4 (pve7)
4 nodes on 9.1 with OSDs (pve1, 2, 3, 5) running VMs and LXCs
2 nodes with mgr (pve7 on 8.4, pve4 on 9.1)
3 nodes with mon (pve5 and 6 on 9.1, pve7 on 8.4)
Ceph 19.2 Squid
 
A few questions and observations that might help narrow this down:

How was the reinstall done?

PVE ships `pveceph purge` specifically for this scenario — it stops all Ceph services, wipes `/var/lib/ceph/{mon,mgr,mds,osd,...}/` (including the monitor's RocksDB), and removes the keyrings and config from `/etc/pve/priv/`. If the reinstall was done without running this first, the monitor data directories from the previous cluster would survive, and `ceph-mon --mkfs` would silently do nothing if it found them non-empty — causing the monitor to start against the old RocksDB with its old auth database, old config DB entries, and old keys. This is almost certainly the root cause of "allowed_methods [1] but i only support [2]" (CEPH_AUTH_NONE vs CEPH_AUTH_CEPHX — the old monitor's auth DB doesn't know the new keyrings).
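A quick way to check for surviving monitor data (a sketch; the function defaults to the standard monitor data path and takes an alternative root only so it can be exercised elsewhere):

```shell
# Report any monitor data directories and, if present, the mtime of their
# store.db. An mtime predating the reinstall means ceph-mon --mkfs found the
# directory non-empty and left the old RocksDB in place.
check_mon_leftovers() {
    local mon_root=${1:-/var/lib/ceph/mon}
    local d
    for d in "$mon_root"/ceph-*/; do
        [ -d "$d" ] || continue
        echo "== $d"
        if [ -d "$d/store.db" ]; then
            stat -c 'store.db mtime: %y' "$d/store.db"
        fi
    done
    return 0
}

check_mon_leftovers
```

If the mtimes predate the reinstall, the monitors are running the old cluster's auth database, which would explain everything else.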

Could you confirm whether `pveceph purge` was run? And were the OSD data directories (`/var/lib/ceph/osd/ceph-*/`) wiped as part of the reinstall?

On the `www-data` group fix:

The description "ceph user couldn't read `/etc/pve/ceph.conf`" is likely pointing at the right symptom but the wrong file. `ceph-osd` reads `ceph.conf` before dropping privileges (via `global_pre_init()` → `conf.parse_config_files()`, called before `setuid()`), so ceph.conf itself isn't the file that's inaccessible.

What the ceph user can't access after the privilege drop is a keyring: in `global_init()`, after `setuid(ceph)` runs, the code calls `mc_bootstrap.get_monmap_and_config()` to connect to the monitors and fetch the initial config. This is where "failed to fetch mon config" comes from — that exact error string is in `global_init.cc`, after the privilege drop. `get_monmap_and_config()` loads a keyring via `KeyRing::from_ceph_context()` to authenticate. For `ceph-osd`, the default keyring path is `$osd_data/keyring` (i.e., `/var/lib/ceph/osd/ceph-X/keyring`), which in a normal setup is owned `ceph:ceph` and accessible without any group membership.

I can't tell from here exactly what file required `www-data` access in your case — it might be that the OSD data directories had wrong ownership after the reinstall, or that something (a ceph.conf `[osd]` section or the config DB) overrode the keyring path to a location readable only by root. `ceph config dump | grep keyring` and `ls -la /var/lib/ceph/osd/ceph-*/keyring` would show which.

One thing that would help understand this: what led you to conclude that `/etc/pve/ceph.conf` specifically was the unreadable file? Was there an explicit permission error in the OSD logs pointing to that path, or was it inferred from the `www-data` group fix working? Knowing the actual error (e.g., from `journalctl -u ceph-osd@X --no-pager` at the time of the failure) would help pin down exactly where the permission boundary was.

On the `--no-mon-config` override for OSDs:

`--no-mon-config` skips the bootstrap `get_monmap_and_config()` call inside `global_init()` entirely — no connection to monitors is made at startup, and no config DB values are fetched or applied. Without it, the OSD fetches the config DB from the monitors early in startup and applies those values on top of ceph.conf. If the old monitor's RocksDB has `auth_cluster_required = cephx` (or similar) in the config DB, that would override `auth_*_required = none` from ceph.conf after the fetch, causing the OSD to switch back to requiring cephx for subsequent operations. Skipping the config DB fetch keeps ceph.conf settings in effect throughout.
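If you want to confirm that the config DB is what's overriding ceph.conf, something like this should show it (read-only commands, assuming the ceph CLI can still reach a monitor; `osd.0` is an example daemon id):

```shell
# Auth settings stored in the monitor config database — these are applied
# on top of ceph.conf once a daemon fetches them at startup:
ceph config dump | grep -E 'auth_(cluster|service|client)_required' \
    || echo "no auth overrides in the config DB"

# Effective values as seen by one running daemon:
ceph config show osd.0 | grep -E 'auth_.*_required'
```

If the dump shows cephx values while ceph.conf says none, that mismatch is exactly what `--no-mon-config` is papering over.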

The right path forward:

If `pveceph purge` wasn't used, the cluster is still in an inconsistent state. The correct procedure:

1. Remove remaining OSDs, MDS, MGR, and extra monitors (GUI or CLI)
2. On the last remaining node: `pveceph purge` — this wipes the RocksDB stores and keyrings from `/etc/pve/priv/`
3. Reinitialize: `pveceph init --network <ceph-cidr>` then `pveceph createmon` on each node

If the cluster is too damaged for PVE tooling to run cleanly, `pveceph purge` handles an unreachable RADOS gracefully (wraps the RADOS calls in eval and proceeds with file cleanup regardless).

@david.bowser: your case is different in an important way — you started from an upgrade, not a reinstall, so the old monitor data directories surviving is expected. If you're seeing the same cephx breakage after a PVE 8→9 upgrade, that points to a bug in the upgrade path itself rather than a user procedure issue. It would be very useful to understand what actually went wrong:
  • At what point did the upgrade fail or leave Ceph in a broken state? Was it during the `apt dist-upgrade` on the first node, or when trying to start Ceph services on the upgraded node?
  • Did `ceph -s` still show a healthy cluster immediately after the upgrade completed on the first node, before you noticed auth issues?
  • Were the monitors on the upgraded node(s) the ones that broke, or did upgrading an OSD-only node trigger the problem?
  • What does `ceph mon dump` show — are all monitor entries still present with correct addresses?

This kind of failure during a supported upgrade path is something the PVE team would want to investigate and fix.