So I see monmap is correct
I see fsmap cephfs on node2 with 2 up standby
I see mgrmap stack1 active with node2 and node7 standby
but it still shows mon.node900 down, not in quorum...
Ceph commands still fail... no quorum is shown in the Proxmox GUI, but the logs show a quorum of stack1, node2 and node7 all exist -...
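One thing that does still work without quorum is asking a single monitor what it thinks, over its local admin socket - for example (mon name node2 is just from my setup, and the default cluster name "ceph" is assumed):

# run on the node that hosts mon.node2; works even when "ceph -s" hangs
ceph daemon mon.node2 mon_status
ceph daemon mon.node2 quorum_status

# same thing via the socket path directly
ceph --admin-daemon /var/run/ceph/ceph-mon.node2.asok mon_status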
well that wasn't that helpful tbh... when the cluster was running (it still is running - just not the Ceph pools) I was having no issues at all and it was just churning along like it should... and I was learning... so that IS worth my time.
Anyhow - if anyone has ideas on how to help me get node 900...
Yes - you're right. I have 6 x 1Gb network cards in each of the R210ii servers and can isolate the traffic later... right now it was all working fine for my limited home use... I have 2x 24-port unmanaged switches (10/100/1000) and am looking for ideas on cards to install to up the node-to-node...
Got all monitors to start I think... still no quorum.
node900 was weird - it is the only one that has two mon targets... it shows ceph-mon@node900.service as well as ceph-mon@stack1.service
so I stopped them all
then restarted with mon900 and whammo - monitor node900 finally active...
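In case anyone else hits the same duplicate-unit thing, this is roughly the sequence I ran on node900 (from memory, so treat the unit names as placeholders for your own):

# list every ceph-mon unit on the box to spot the duplicates
systemctl list-units 'ceph-mon@*'

# stop and disable the stray unit so it can't grab the mon data dir again
systemctl stop ceph-mon@stack1.service
systemctl disable ceph-mon@stack1.service

# start the correct monitor for this host and watch it come up
systemctl start ceph-mon@node900.service
journalctl -u ceph-mon@node900.service -f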
now I am trying to...
monitor services seem up - but not talking.
I mean maybe they are talking, but no MDS or FS is there anymore... seems the monitors see the same number of tasks... show active... idk... been deep diving into the Ceph boards trying to learn more - seems this has happened a few times to others but no...
no it was running fine - I did a PM update and upgrade and started having all sorts of issues with root out of space on the pve drives... then the entire CephFS died and I don't see any file system for ceph left... monitors are not talking and no MDS is up.
Within Proxmox I can see all the physical...
systemd-timesyncd was gone after the update - chrony is installed now and running... that might have broken something... but why is no MDS server running? Any way to restore it? Gotta be something simple I am missing...
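For reference, this is roughly how I'm checking whether an MDS unit even exists on a node (the instance name "node2" is just a guess at the naming; the ceph fs commands only answer once the monitors have quorum again):

# see whether any MDS units exist on this node and what state they're in
systemctl list-units 'ceph-mds@*' --all
systemctl status ceph-mds@node2.service

# if the unit exists but isn't running, try starting it and watch the log
systemctl start ceph-mds@node2.service
journalctl -u ceph-mds@node2.service -f

# once the monitors have quorum, check the filesystem and MDS map
ceph fs status
ceph mds stat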
I posted all those logs and screenshots... the only issues I am seeing are mons out of space...
Looking at the logs, mon stack1 went critical on space... so I go to /var/lib/ceph... I see the Proxmox setup has /var on the root partition... but root isn't full... kinda lost on why mon.stack1 would be reporting 1% available
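For anyone looking at the same thing: the MON_DISK_LOW warning is about the filesystem holding the monitor's store.db, which on a stock Proxmox install sits under /var/lib/ceph/mon/ceph-<name> on the root partition. A quick way to see what the mon actually sees (default cluster name "ceph" assumed):

# which filesystem the mon store lives on, and how full it really is
df -h /var/lib/ceph/mon/ceph-stack1

# how big the monitor's own database has grown
du -sh /var/lib/ceph/mon/ceph-stack1/store.db

# if the store itself has ballooned, this should compact it via the local admin socket
ceph daemon mon.stack1 compact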
Can you give me the steps and commands you use to revert to a backup and reinject the monmap, so I can get my managers and MDS back and hopefully rebuild data from the OSD locations?
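From the Ceph docs, the generic extract/edit/inject sequence seems to be roughly the following - the mon ids, the bad mon name and the IP below are placeholders, and every monitor touched has to be stopped while its map is extracted or injected - but I'd like confirmation before running it:

# stop the monitors you are going to touch
systemctl stop ceph-mon@node2.service
systemctl stop ceph-mon@node900.service

# pull the monmap out of a monitor that has a good copy
ceph-mon -i node2 --extract-monmap /tmp/monmap

# inspect it, and remove/add entries as needed (example name/IP only)
monmaptool --print /tmp/monmap
monmaptool --rm badmon /tmp/monmap
monmaptool --add node900 10.0.0.9:6789 /tmp/monmap

# inject the edited map into the broken monitor, then start everything again
ceph-mon -i node900 --inject-monmap /tmp/monmap
systemctl start ceph-mon@node2.service
systemctl start ceph-mon@node900.service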
I have a similar issue in another thread with all my details... If you have any help I would appreciate it. How did you fix...
node5 still times out
ceph -s just hangs too
also, node5's HDD does not show the OSD anymore... as if the drive is initialized but not used for an OSD store...
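For reference, this is how I'm checking whether the OSD is actually gone from node5's disk or just not activated (ceph-volume is what Proxmox uses underneath):

# show any OSDs that ceph-volume knows about on this node's disks
ceph-volume lvm list

# check whether the disk still carries an LVM volume for an OSD
lsblk
lvs

# if the OSD volume is still there but not running, try re-activating
ceph-volume lvm activate --all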
node5 systemctl status
root@node5:~# systemctl status
● node5
State: running
Jobs: 1 queued
Failed: 0 units
Since: Sun...
Looks like mon node7 went critical on space
MON_DISK_LOW: mon stack1 is 1% avail
so a couple of them ran out of space... how?
Where are they filling up? Is the map on the pve root filling, or just the OSDs, with the map being stored on the OSDs?
Wondering how exactly it stores data for the MDS and mons - and where...?
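From what I've read so far: each mon keeps its own database (store.db) under /var/lib/ceph/mon/ on that node's root filesystem, which is why a full pve root triggers MON_DISK_LOW, while the MDS keeps basically nothing local - CephFS metadata lives in the cephfs metadata pool on the OSDs. A quick sanity check of the local footprints, assuming default paths:

# monitor database - lives on the node's local (root) filesystem on a default Proxmox setup
du -sh /var/lib/ceph/mon/*

# MDS dir - just keyring/config; the real metadata is in the RADOS metadata pool
du -sh /var/lib/ceph/mds/* 2>/dev/null

# OSD dirs - small tmpfs of symlinks; the actual data is on the OSD disks themselves
du -sh /var/lib/ceph/osd/* 2>/dev/null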
You did ask for most of that I think... lol, yeah I know... just frustrated with this now and cannot seem to figure out how to get any manager back and get Ceph to respond
ceph -s just sits there and hangs/freezes, then times out