Hey dear Proxmox Pros,
I have an issue with my Ceph cluster after attempting to migrate it to a dedicated network.
I have a 3-node Proxmox cluster with Ceph enabled. Since I only had one network connection on each node, I wanted to create a dedicated, separate network just for the Ceph cluster.
But my attempt was ill-fated, because now my whole cluster isn't working anymore.
What I have tried:
- I gave every node a separate IP address on the new network: 192.168.100.1, 192.168.100.2, 192.168.100.3.
- Since they are the only nodes on this network, there is no gateway or anything else.
- After that, I changed the cluster_network and the public_network from 192.168.200.4/24 (the old IP address) to 192.168.100.1/24 in my /etc/pve/ceph.conf.
- Afterwards, I restarted the nodes.
- Then I changed the IP address of the monitor from 192.168.200.18 to 192.168.100.2.
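As a sanity check on the addressing in the steps above, here is a minimal Python sketch using only the standard `ipaddress` module. It confirms which of the addresses mentioned fall inside the new and old subnets. The helper name `in_network` is made up for illustration, and the sketch only does arithmetic on the addresses; it does not talk to Ceph at all.

```python
import ipaddress

# Values copied from the steps above. strict=False accepts a CIDR string
# whose host bits are set (e.g. 192.168.100.1/24) and normalises it to
# the enclosing network.
new_net = ipaddress.ip_network("192.168.100.1/24", strict=False)
old_net = ipaddress.ip_network("192.168.200.4/24", strict=False)

node_ips = ["192.168.100.1", "192.168.100.2", "192.168.100.3"]
old_mon_ip = "192.168.200.18"


def in_network(ip, net):
    """Hypothetical helper: True if the address string lies inside net."""
    return ipaddress.ip_address(ip) in net


for ip in node_ips:
    print(ip, "in", new_net, "->", in_network(ip, new_net))

# The old monitor address belongs to the old subnet only.
print(old_mon_ip, "in", new_net, "->", in_network(old_mon_ip, new_net))
print(old_mon_ip, "in", old_net, "->", in_network(old_mon_ip, old_net))
```

The point of the check is that a monitor still registered under 192.168.200.18 is outside the 192.168.100.0/24 range, so changing only the network lines in the config leaves the monitor address and the configured networks out of step.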
Then I tried to revert my changes and go back to where I was at the beginning.
But even after reverting everything, I get the same issues.
I'm pretty sure I didn't have the option mon_host = 192.168.200.18 in my /etc/pve/ceph.conf before I switched, but when this option isn't present, I get the following error message when I run ceph --status:
Bash:
root@pve-02:~# ceph --status
failed to get an address for mon.pve-02: error -2
unable to get monitor info from DNS SRV with service name: ceph-mon
2024-11-02T22:49:22.369+0100 72d5402006c0 -1 failed for service _ceph-mon._tcp
2024-11-02T22:49:22.369+0100 72d5402006c0 -1 monclient: get_monmap_and_config cannot identify monitors to contact
[errno 2] RADOS object not found (error connecting to the cluster)
And if this option is present, I get this error message:
Bash:
root@pve-02:~# ceph status
2024-11-02T22:42:07.468+0100 7c96a50006c0 0 monclient(hunting): authenticate timed out after 300
[errno 110] RADOS timed out (error connecting to the cluster)
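A timeout like the one above could mean that nothing answers at the configured monitor address, although that is only one possible cause. A minimal Python sketch to test plain TCP reachability follows. It assumes the monitors listen on Ceph's default ports, 3300 (msgr2) and 6789 (msgr1); the helper name `port_open` is made up for illustration and is not part of any Ceph tooling.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Hypothetical helper: True if a TCP connection to host:port succeeds
    within the timeout, False on refusal, timeout or any other socket error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

It could be called from any node as, for example, `port_open("192.168.200.18", 3300)` and `port_open("192.168.200.18", 6789)`. A `False` result only says that no TCP connection could be made; a `True` result says a process is listening there, not that authentication will succeed.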
I have been working on this issue for 12 hours now, and because I'm a noob when it comes to Ceph, I don't know what else I can do.
I have played around with the ceph.conf here and there, mostly on the monitor side, so I no longer know what the original looked like. But here is the current one anyway:
Code:
root@pve-02:~# cat /etc/pve/ceph.conf
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 192.168.200.4/24
fsid = cf5fdd35-0db9-4936-b163-a36e1457fb1e
mon_allow_pool_delete = true
mon_host = 192.168.200.18 192.168.200.15
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 192.168.200.4/24
[client]
keyring = /etc/pve/priv/$cluster.$name.keyring
[client.crash]
keyring = /etc/pve/ceph/$cluster.$name.keyring
[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring
[mds.pve-01]
host = pve-01
mds_standby_for_name = pve
[mds.pve-03]
host = pve-03
mds_standby_for_name = pve
[mon.pve-02]
public_addr = 192.168.200.18
[mon.pve-03]
public_addr = 192.168.200.15
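To cross-check the config above for internal consistency, here is a minimal Python sketch that parses an excerpt of it with the standard `configparser` module and reports whether every monitor address lies inside `public_network`. The function name `check_mon_addresses` is made up for illustration. The sketch assumes `mon_host` is a plain space-separated list of IPv4 addresses, as it is in this file, and would need adjusting for other notations.

```python
import configparser
import ipaddress

# Excerpt of the ceph.conf shown above (only the keys this check needs).
CEPH_CONF = """
[global]
cluster_network = 192.168.200.4/24
mon_host = 192.168.200.18 192.168.200.15
public_network = 192.168.200.4/24

[mon.pve-02]
public_addr = 192.168.200.18

[mon.pve-03]
public_addr = 192.168.200.15
"""


def check_mon_addresses(conf_text):
    """Hypothetical helper: map each monitor address found in the config
    (mon_host plus every [mon.*] public_addr) to True/False depending on
    whether it lies inside public_network."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)

    # strict=False tolerates host bits in the CIDR, e.g. 192.168.200.4/24.
    public_net = ipaddress.ip_network(
        parser["global"]["public_network"], strict=False
    )

    addresses = set(parser["global"].get("mon_host", "").split())
    for section in parser.sections():
        if section.startswith("mon.") and "public_addr" in parser[section]:
            addresses.add(parser[section]["public_addr"])

    return {a: ipaddress.ip_address(a) in public_net for a in sorted(addresses)}


for address, ok in check_mon_addresses(CEPH_CONF).items():
    print(address, "inside public_network:", ok)
```

This only validates the text of the config file. It says nothing about the addresses the monitors themselves have stored, which may differ from what ceph.conf claims after several rounds of edits.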
I'm really out of ideas here and in need of help.
Can someone help me?
If you need more information, please ask. I have tried so many things that I no longer know what is important.
Cheers and thanks for reading this.