So, for the record: I've succeeded in getting Ceph working again.
First, there were some ghost monitors that I managed to delete from the monmap.
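For anyone hitting the same thing, the ghost-monitor removal can be sketched like this (the mon id "server" and the ghost name "ghost1" are placeholders from my setup; the monmaptool workflow itself is the standard one):

```shell
# stop the surviving monitor before touching its store (mon id "server" is assumed)
systemctl stop ceph-mon@server

# extract the current monmap from the stopped monitor
ceph-mon -i server --extract-monmap /tmp/monmap

# list the monitors, then remove the ghost entry by name ("ghost1" is a placeholder)
monmaptool --print /tmp/monmap
monmaptool --rm ghost1 /tmp/monmap

# inject the cleaned monmap back and restart the monitor
ceph-mon -i server --inject-monmap /tmp/monmap
systemctl start ceph-mon@server
```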
Then I had some permission issues on the directory structure:
rocksdb: IO error: While opening a file for sequentially reading...
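That rocksdb error usually means the monitor's store isn't readable by the ceph user; what fixed it for me was roughly this (the mon path is an assumption based on the default layout):

```shell
# give the monitor's data dir back to the ceph user (default path assumed)
chown -R ceph:ceph /var/lib/ceph/mon/ceph-server
systemctl restart ceph-mon@server
```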
Hi @fabian, just for my information: do you think official Proxmox support could cover a case like this?
I mean, changing the Ceph configuration so that it works as it did until recently.
Regards
Additional info:
The single-node Ceph setup was working perfectly; it only failed recently and I'm trying to find the cause.
A recent update changed the NIC name, and maybe that's the reason, but I can't find any evidence for this hypothesis.
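To test the NIC hypothesis, one quick check is to compare the current interface names and addresses against what Ceph and the network config expect (a read-only sketch; the paths are the Debian/Proxmox defaults):

```shell
# current interface names and addresses
ip -br addr

# addresses/networks Ceph is configured to bind to
grep -E 'public_network|cluster_network|mon_host' /etc/ceph/ceph.conf

# interface names referenced by the network config (an old name would still show here)
cat /etc/network/interfaces
```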
Understood.
Sadly, I have some VMs on this server that haven't been backed up for a long time. They're not critical, but they have a long setup process.
If I reinstall everything, will I be able to recover the OSDs?
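For what it's worth, if the OSD disks themselves are intact, a fresh install can usually re-detect and activate them with ceph-volume (a sketch, assuming the OSDs were LVM-based, which is the default on recent PVE, and that the cluster keyrings have been restored first):

```shell
# list OSD metadata found on the attached disks
ceph-volume lvm list

# activate all OSDs discovered on this host
ceph-volume lvm activate --all
```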
Hi @fabian
Thanks for the clarification, you're right.
So, to sum up: Ceph can't run on a single node? Or do we also need to adapt the quorum settings for Ceph?
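From what I understand, a single monitor does form a quorum of one as long as it is the only entry in the monmap; the part that has to be adapted is the pool replication, since the defaults expect multiple hosts. A hedged ceph.conf sketch (these values trade away all redundancy and are assumptions for a one-node lab, not the defaults):

```
[global]
# single-node sketch: allow pools to go active with one replica
osd_pool_default_size = 1
osd_pool_default_min_size = 1
```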
Thanks for taking the time to read and answer.
So maybe the issue isn't corosync but Ceph.
The Ceph logs show:
e13 handle_auth_request failed to assign global_id
2024-12-11T14:55:53.720+0100 7f815b4c5700 -1 mon.server@1(probing) e13 get_health_metrics reporting 4 slow ops, oldest is auth(proto 0 29 bytes epoch 0)...
I've also tried turning this into a standalone server following this post: https://forum.proxmox.com/threads/proxmox-ve-6-removing-cluster-configuration.56259/#post-259203
Ceph is still not starting, but I no longer have pmxcfs issues at boot.
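For reference, the separation procedure from that thread boils down to roughly this (the same steps as "separate a node without reinstalling" in the Proxmox docs; run only on a node you really want out of the cluster):

```shell
# stop the cluster stack, then restart pmxcfs in local mode
systemctl stop pve-cluster corosync
pmxcfs -l

# remove the corosync configuration so the node boots standalone
rm /etc/pve/corosync.conf
rm -rf /etc/corosync/*

# stop the local-mode pmxcfs and start the normal service again
killall pmxcfs
systemctl start pve-cluster
```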
Hi everyone,
I have a single-node server that used to be part of a 4-node cluster.
The current server has 2 disks with OSDs, plus VMs and CTs using Ceph.
A few days ago, Ceph became unresponsive (question mark in the web UI) and requests timed out (500).
We updated PVE 6.4 to the latest release, all the...
And here are all the logs between the OSD start and stop:
osd.4 pg_epoch: 34842 pg[2.1d7( v 34623'14234948 (34483'14233294,34623'14234948] lb MIN local-lis/les=34622/34623 n=0 ec=48/48 lis/c=34622/34620 les/c/f=34623/34621/0 sis=34838) [7,1] r=-1 lpr=34842 pi=[33949,34838)/1 crt=34623'14234948 lcod 0'0...
Hi, thanks for your support @spirit.
Here are the logs that look significant to me:
2024-05-18T16:18:18.292+0000 71872b0006c0 0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.4 down, but it is still running
2024-05-18T16:18:18.292+0000 71872b0006c0 0 log_channel(cluster)...