I am having problems configuring Ceph within PVE. The blocker is that, having obviously made a configuration mistake somewhere, I am not able to purge the installation and start over.
In other words, I have a borked Ceph installation on my newly installed 6.0 node. These are the things I have tried, in order, to reset the configuration and restart the Ceph installation/configuration:
* pveceph purge - "unable to get monitor info from DNS SRV with service name: ceph-mon"
* rm -Rf /etc/ceph /etc/pve/ceph.conf /etc/pve/priv/ceph* /var/lib/ceph
* apt remove ceph ceph-base ceph-mon ceph-mgr ceph-osd ceph-mgr-dashboard ceph-mgr-diskprediction-local ceph-mgr-ssh (the extra packages were installed during one of the partially successful attempts at Ceph installation) - apt fails because the ceph*.prerm scripts in /var/lib/dpkg/info fail to stop the services
* rm ceph-{base,mds,mgr,mon,osd}.prerm in /var/lib/dpkg/info
* retry of the above apt remove - successful
* rm ceph-{base,mds,mgr,mon,osd}.* in /var/lib/dpkg/info
* rm -Rf /etc/ceph /etc/pve/ceph.conf /etc/pve/priv/ceph* /var/lib/ceph
I've been using posts from https://forum.proxmox.com/threads/ceph-config-broken.54122/page-2 as inspiration.
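For anyone wanting to reproduce this, the reset sequence above could be sketched as the script below. It is a dry run by default and only prints the commands. The `run` helper, the `reset_ceph` function, and the `RUN` variable are hypothetical names used for illustration. The initial `systemctl stop ceph.target` is an assumption and not one of the steps I actually tried; it is there because the prerm scripts failed while stopping the services. I have not verified that it avoids that failure.

```shell
#!/bin/sh
# Sketch of the reset sequence described above. Dry run by default:
# commands are only printed unless RUN=1 is set in the environment.
# WARNING: destructive on a real node; Ceph config, keys and data are removed.

# Hypothetical helper: print the command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

reset_ceph() {
    # Assumption (not one of the original steps): stop all Ceph units first,
    # since the package prerm scripts failed while stopping the services.
    run systemctl stop ceph.target || return 1

    # Package list exactly as used in the original attempt.
    run apt remove ceph ceph-base ceph-mon ceph-mgr ceph-osd \
        ceph-mgr-dashboard ceph-mgr-diskprediction-local ceph-mgr-ssh || return 1

    # Remove leftover configuration, keyrings and daemon state.
    run rm -Rf /etc/ceph /etc/pve/ceph.conf /etc/pve/priv/ceph* /var/lib/ceph || return 1
}

reset_ceph
```

Running it as-is only lists the three commands; `RUN=1 sh script.sh` would execute them and stop at the first one that fails.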
After the above steps I tried installing Ceph cleanly, with these results:
* pveceph install (122MB additional disk space etc etc.) - installed Ceph Nautilus successfully
* configure Ceph in the GUI with the settings below - error with cfs lock 'file-ceph_conf': command 'cp /etc/pve/priv/ceph.client.admin.keyring /etc/ceph/ceph.client.admin.keyring' failed: exit code 1 (500)
  public network set to default network of node
  cluster network set to default network of node (although I have a separate network intended for the cluster)
  monitor node = pve node
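The error only says that `cp` exited with code 1, so it does not tell me which side of the copy is the problem. One possible cause, and this is only a guess, is that /etc/ceph no longer exists after the `rm -Rf` above, so `cp` has no directory to write into. A minimal check for that could look like the sketch below; `check_copy` is a hypothetical helper name, not part of pveceph.

```shell
#!/bin/sh
# Hypothetical helper: report the two obvious reasons a plain
# `cp SRC DST` can fail: unreadable source or missing target directory.
check_copy() {
    src=$1
    dst=$2
    status=0

    if [ ! -r "$src" ]; then
        echo "source missing or unreadable: $src"
        status=1
    fi

    dst_dir=$(dirname "$dst")
    if [ ! -d "$dst_dir" ]; then
        echo "target directory missing: $dst_dir"
        status=1
    fi

    if [ "$status" -eq 0 ]; then
        echo "ok: $src can be copied into $dst_dir"
    fi
    return "$status"
}

# Paths taken from the error message; the call never aborts the script.
check_copy /etc/pve/priv/ceph.client.admin.keyring \
    /etc/ceph/ceph.client.admin.keyring || true
```

If it reports the target directory as missing, recreating /etc/ceph before retrying the GUI step would be the next thing to try, though I have not confirmed that this is the actual cause.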
Suggestions? Or is reinstallation of the node the only solution when the ceph installation gets borked?