Search results

  1. ceph not working monitors and managers lost

    I rebooted them all one by one, running apt update and upgrade, then pvecm updatecerts. All but node900 now show systemctl status as running OK. Node 3 is now off too..? All others are still looking OK. Still no MDS server; I still need to see if the data map exists anywhere to be restored. No managers seem...
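
    A minimal sketch of the per-node update pass described in that post, assuming stock Proxmox VE systemd units for Ceph (the poster's exact commands are not shown):

      apt update && apt full-upgrade -y                # bring packages up to date
      pvecm updatecerts                                # refresh Proxmox cluster certificates
      systemctl status ceph-mon.target ceph-mgr.target ceph-osd.target --no-pager
      ceph -s                                          # confirm health before the next node (hangs if there is no mon quorum)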
  2. ceph not working monitors and managers lost

    Node 2 came back to life a little after the reboot... hmm
  3. ceph not working monitors and managers lost

    ceph log on node2 2022-02-21T11:23:39.813865-0600 osd.0 (osd.0) 4244643 : cluster [WRN] slow request osd_op(client.471789829.0:1612332 7.dc 7.2b14fadc (undecoded) ondisk+retry+write+known_if_redirected e967246) initiated 2022-02-20T22:30:29.742607-0600 currently delayed...
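
    For reference, a hedged way to dig into slow requests like the one logged above; the admin-socket commands work on the node hosting osd.0 even when the cluster is unhealthy:

      ceph health detail                       # summarises which OSDs report slow ops (needs mon quorum)
      ceph daemon osd.0 dump_ops_in_flight     # requests currently stuck on this OSD, via its local admin socket
      ceph daemon osd.0 dump_historic_ops      # recently completed (including slow) requests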
  4. ceph not working monitors and managers lost

    Here is the startup log for node 2 -> well, at least some of it...
  5. ceph not working monitors and managers lost

    So for giggles I rebooted node 2 and am looking at the logs on reboot: Feb 28 21:23:36 node2 systemd[1]: Started The Proxmox VE cluster filesystem. Feb 28 21:23:36 node2 systemd[1]: Started Ceph metadata server daemon. Feb 28 21:23:36 node2 systemd[1]: Started Ceph cluster manager daemon. Feb 28...
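
    One way to pull just the Ceph units out of that boot, assuming the daemon IDs follow the node name as in a default Proxmox VE setup (e.g. ceph-mon@node2):

      journalctl -b -u ceph-mon@node2 -u ceph-mgr@node2 -u ceph-mds@node2 --no-pager
      systemctl list-units 'ceph*' --all       # shows which Ceph daemons actually stayed running after boot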
  6. ceph not working monitors and managers lost

    Just tried to create a manager on node1 (stack1); no managers will start... I don't see OSDs listed on any node other than on the drive assignments themselves... wondering if the map is lost, and whether there is any way to restore the managers and metadata before I lose it from all the nodes
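
    A hedged sketch of recreating and checking a manager on Proxmox VE; this assumes the unit is named after the node (ceph-mgr@stack1) and that a monitor quorum is reachable, which later posts suggest it is not:

      pveceph mgr create                               # create a mgr on the local node
      systemctl status ceph-mgr@stack1 --no-pager
      journalctl -u ceph-mgr@stack1 -n 50 --no-pager   # why it refuses to start (typically: cannot reach any monitor)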
  7. ceph not working monitors and managers lost

    All nodes are identical on packages and are up to date: proxmox-ve: 7.1-1 (running kernel: 5.13.19-4-pve) pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe) pve-kernel-helper: 7.1-12 pve-kernel-5.13: 7.1-7 pve-kernel-5.11: 7.0-10 pve-kernel-5.4: 6.4-4 pve-kernel-5.13.19-4-pve: 5.13.19-9...
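
    That listing matches the output of pveversion; a quick hedged way to compare the Ceph-relevant packages across nodes:

      pveversion -v | grep -E 'proxmox-ve|pve-manager|ceph'   # run on every node and diff the results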
  8. ceph not working monitors and managers lost

    I guess I forgot to add an OSD on node 5, or the drive was purged or something... not sure... "node 5" no longer has an OSD on that 1TB HDD like I have on all the others... I may not have set it up, skipped it or something, but I'm pretty sure I had it in there... but the other nodes are all...
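
    To check whether the 1TB disk on node 5 still carries an OSD even though none is listed, one could inspect the disk directly; a sketch, assuming the OSDs were created with ceph-volume (the Proxmox default):

      ceph-volume lvm list                     # any OSD volumes still present on local disks
      lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT     # the 1TB HDD should show an LVM member if the OSD survived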
  9. ceph not working monitors and managers lost

    OSD Layout: Node 'node900', Node 'node8', Node 'node7', Node 'node5', Node 'node4', Node 'node3', Node 'node2', Node 'stack1'
  10. ceph not working monitors and managers lost

    node6 - removed from cluster (fixing server - will add back asap) node 7 root@node7:~# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo...
  11. ceph not working monitors and managers lost

    10.0.1.3 root@node3:~# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host...
  12. ceph not working monitors and managers lost

    10.0.1.1 root@stack1:~# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host...
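
    Since the monitors bind to these addresses, it may be worth cross-checking them against the Ceph configuration; a sketch, assuming the standard Proxmox VE path:

      grep -Ei 'mon|network' /etc/pve/ceph.conf   # mon_host / public_network entries should match the node IPs above
      ss -tlnp | grep -E '3300|6789'              # a running monitor listens on these ports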
  13. ceph not working monitors and managers lost

    root@node7:/etc/pve# ceph auth ls 2022-02-27T17:17:55.616-0600 7f907359e700 0 monclient(hunting): authenticate timed out after 300 2022-02-27T17:22:55.618-0600 7f907359e700 0 monclient(hunting): authenticate timed out after 300 2022-02-27T17:27:55.615-0600 7f907359e700 0 monclient(hunting)...
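
    Those monclient(hunting) timeouts mean no monitor could be reached over the network; the local admin socket still answers without quorum, so a hedged check, assuming the default socket path and a mon named after the node (if this node runs one):

      ceph daemon mon.node7 mon_status            # what the local monitor thinks, no quorum required
      ceph --admin-daemon /var/run/ceph/ceph-mon.node7.asok quorum_status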
  14. Old server cluster - 6 x 1gb nic - best way to config?

    Also, it is weird that on most nodes it shows eno1 as the active port... but on node 7 it shows enps0f0 as the active NIC port... I currently only have one port connected, the on-board port, which SHOULD report as eno1, but for some odd reason Proxmox thinks it's the physical 4-port card...
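
    A quick hedged check of which port is actually up and which one the bridge is bound to, assuming the default vmbr0 bridge of a Proxmox VE install:

      ip -br link                                      # one line per NIC with its UP/DOWN state
      grep -A3 'iface vmbr0' /etc/network/interfaces   # the bridge-ports line names the physical port in use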
  15. Old server cluster - 6 x 1gb nic - best way to config?

    ok so how are you saying to do this?
  16. ceph not working monitors and managers lost

    I see host node900 is down (the bulk of the OSD storage), so that may be one big thing that, if I can get it back online, will help rebalance things again...
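
    Once node900 is powered back on, a hedged way to watch its OSDs rejoin and the data rebalance (both need monitor quorum first):

      ceph osd tree       # look for OSDs still marked down
      ceph -w             # follow recovery and rebalance progress live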
