Ceph completely broken - Error got timeout (500)

spamsam

I have a test cluster of 3 VMs in VirtualBox on my PC.

Their IPs changed as follows:
A: 172.16.0.150 > 192.168.100.1 (NOT a gateway)
B: 172.16.0.151 > 192.168.100.2
C: 172.16.0.152 > 192.168.100.3

Now I get the following message when I run 'service ceph-mon@local status':
Code:
Processor -- bind unable to bind to v2:172.16.0.150:3300/0: (99) Cannot assign requested address.

And this is driving me mad, as it downed Ceph on all 3 nodes with the same error.

I updated the address in /etc/ceph/ceph.conf (which is just a symlink to /etc/pve/ceph.conf). It is as follows:
Code:
[global]
     auth_client_required = cephx
     auth_cluster_required = cephx
     auth_service_required = cephx
     cluster_network = 192.168.100.1/24
     fsid = bbc8efc5-af69-4460-8e18-c5d5e76d0c9e
     mon_allow_pool_delete = true
     mon_host =  192.168.100.1
     ms_bind_ipv4 = true
     ms_bind_ipv6 = false
     osd_pool_default_min_size = 2
     osd_pool_default_size = 3
     public_network = 192.168.100.1/24

[client]
     keyring = /etc/pve/priv/$cluster.$name.keyring

[mon.localhost]
     public_addr = 192.168.100.1

I rebooted the node multiple times, same error. I scrubbed the whole system of every instance of 172.16.0.150, same error.

It is still a fresh cluster that I am using to test Ceph Quincy on Proxmox 7.3 prior to going into production, though at the moment I am wary.

Any advice here?
 
Just as further info, commands like 'ceph -s' hang indefinitely. All pveceph commands time out.
 
Hi,

Is the above Ceph config from /etc/pve/ceph.conf? If yes, can you compare it with /etc/ceph/ceph.conf as well?
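
For example, something like this would show whether the two diverge (a quick sketch; on PVE, /etc/ceph/ceph.conf is normally just a symlink to /etc/pve/ceph.conf):
Code:
readlink -f /etc/ceph/ceph.conf              # should resolve to /etc/pve/ceph.conf
diff /etc/ceph/ceph.conf /etc/pve/ceph.conf  # no output means they match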

Did you see anything interesting in the ceph monitor logs?
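
Also note that a monitor binds to the address recorded in its monmap, not the one in ceph.conf, so updating the config alone will not move it off 172.16.0.150. A sketch of rewriting the monmap in place (assuming the mon ID is `localhost`, matching your [mon.localhost] section; stop the mon first and keep a backup of the extracted map):
Code:
systemctl stop ceph-mon@localhost
ceph-mon -i localhost --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap                        # should still list 172.16.0.150
monmaptool --rm localhost /tmp/monmap                 # drop the stale entry
monmaptool --add localhost 192.168.100.1 /tmp/monmap  # newer releases may want --addv with explicit v1/v2 addrs
ceph-mon -i localhost --inject-monmap /tmp/monmap
systemctl start ceph-mon@localhost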
 
Did you find any resolution for this? I am in the same boat and getting help from folks here, but I am wondering what fixed it for you, if anything did.
 
I am using Proxmox version 8.2.4. I installed 3 nodes and installed Ceph, and it was working.
I then purchased three physical machines with the same specs and installed them the same way, but I can't install Ceph.
I have tried multiple times, still the same.

Does anyone have a solution?

Screenshot 2024-07-31 at 1.00.28 PM.png
 

Attachments

  • Screenshot 2024-07-31 at 1.00.06 PM.png
In the screenshot you attached it looks like Ceph Reef is installed! Did you refresh your browser? What is the output of `pveceph status`?
 
In the screenshot you attached it looks like Ceph Reef is installed! Did you refresh your browser? What is the output of `pveceph status`?
root@pve1:~# pveceph status
Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')
command 'ceph -s' failed: exit code 1
root@pve1:~#

Did you refresh your browser?
I have tried Safari and Chrome, and it is all still the same.
 
Does anyone have a solution?
I feel it was such a waste buying three nodes with the same specs.
 
Node 1
16 x AMD Ryzen 7 5800U with Radeon Graphics
Kernel Version: Linux 6.8.8-4-pve (2024-07-26T11:15Z)
Manager Version: pve-manager/8.2.4/faa83925c9641325
HDD1: 256GB M.2 SSD
HDD2: 1TB SSD

Node 2
16 x AMD Ryzen 7 5800U with Radeon Graphics
Kernel Version: Linux 6.8.8-4-pve (2024-07-26T11:15Z)
Manager Version: pve-manager/8.2.4/faa83925c9641325
HDD1: 256GB M.2 SSD
HDD2: 1TB SSD

Node 3
16 x AMD Ryzen 7 5800U with Radeon Graphics
Kernel Version: Linux 6.8.8-4-pve (2024-07-26T11:15Z)
Manager Version: pve-manager/8.2.4/faa83925c9641325
HDD1: 256GB M.2 SSD
HDD2: 1TB SSD


root@pve1:~# pveceph status
command 'ceph -s' failed: got timeout
root@pve1:~#


Screenshot 2024-08-01 at 4.09.15 PM.png
Screenshot 2024-08-01 at 4.10.43 PM.png
 
Hello,

What is the output of

- `pvecm status`
- `cat /etc/ceph/ceph.conf`
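
As a side note, the `Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')` you posted earlier usually means /etc/ceph/ceph.conf is missing or unreadable on that node. A quick check (a sketch):
Code:
ls -l /etc/ceph/ceph.conf    # on PVE this should be a symlink into /etc/pve
stat /etc/pve/ceph.conf      # confirms the cluster-wide config exists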
 
root@pve1:~# pvecm status
Cluster information
-------------------
Name: Sync-Cluster
Config Version: 5
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Thu Aug 1 16:41:29 2024
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000001
Ring ID: 1.1ac
Quorate: Yes

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.201.50 (local)
0x00000002 1 192.168.201.60
0x00000003 1 192.168.201.70
root@pve1:~#
root@pve1:~# `cat /etc/ceph/ceph.conf`
-bash: [global]: command not found
root@pve1:~# cat /etc/ceph/ceph.conf
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 192.168.202.50/24
fsid = f290bf78-5b29-4109-8ae7-0d88e52785d2
mon_allow_pool_delete = true
mon_host = 192.168.201.50,192.168.201.60,192.168.201.70
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 192.168.202.50/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

Screenshot 2024-08-01 at 4.43.56 PM.png
 
Both `public_network` and `cluster_network` are set to `192.168.202.X/24` while the MONs are at

Code:
192.168.201.50,192.168.201.60,192.168.201.70

(201 vs 202). Do you have any valuable data in the Ceph cluster already? If not, I would suggest destroying the MONs and creating new ones using IPs that are part of the `public_network` subnet.
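
If the cluster is still empty, that could look roughly like this after correcting `public_network`/`cluster_network` in /etc/pve/ceph.conf (e.g. `192.168.201.0/24` and `192.168.202.0/24`). A sketch; `pve1` and the address are taken from your outputs, repeat per node, and you may need `pveceph purge` instead if the MONs no longer respond at all:
Code:
pveceph mon destroy pve1
pveceph mon create --mon-address 192.168.201.50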
 
Could you please share the new config? What's the state of Ceph (`ceph -s`)? Are the mon services running (`systemctl status ceph-mon@NODE_HOSTNAME.service`)?
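
If a mon service is failing, its journal usually says why; for example (a sketch):
Code:
journalctl -b -u ceph-mon@$(hostname).service | tail -n 50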
 
Hello, same problem for me.
PVE 8.2.2 with 3 identical servers in a cluster.
Ceph installed on the first node, and the two others got a timeout.
I tried to purge the config many times and reinstall via GUI or CLI, with the same result.
Did someone find something?
 
I gave up troubleshooting and uninstalled Ceph.
 
