Recent content by jaykavathe

  1. Network crash during PVE cluster backups onto PBS

    Trying to figure out why the backup process is crashing my network and what a better long-term strategy would be. My setup is a 3-node Ceph HA cluster (2x 1G, 2x 10G): node 1: 10.10.40.11, node 2: 10.10.40.12, node 3: 10.10.40.13. Only the 3 above form the HA cluster. Each has a 4-port NIC, 2 are taken by...
  2. Replacing TrueNAS server with PBS in a CT (or bare metal)

    I have a 3-node (S1, S2, S3) Ceph cluster running plenty of VMs and CTs. At the moment I have a 4th server (S4/50TB) running bare-metal TrueNAS, which provides an NFS share for the whole cluster for backups. I also run a media server in a CT that uses S4 for data storage (no backups for media storage...
  3. [SOLVED] Clean boot stuck at EFI stub: Loaded initrd ....

    I actually found my issue. The drives had an old zpool from a previous installation on them. I loaded a GParted live boot, wiped everything clean, and then had no issue at all. Also, pretty much every time, adding "nomodeset" to the launch params helps me.
  4. 3 node cluster replication error (networking)

    Thank you, updating the 10G subnet to /24 helped and at least the nodes can ping each other, but I am still not able to migrate. "2023-12-03 18:39:02 100-0: end replication job with error: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=myst2' root@192.168.1.202 pvecm mtunnel...
  5. 3 node cluster replication error (networking)

    Spent a while today learning a bit about networking and traffic, and set up 3 different networks on my 3-node cluster. Each of the 3 nodes has 2x 1G and 2x 10G ports. I used eno3 for management, eno4 for cluster traffic, and eno1/eno2 for a point-to-point full mesh. Now when I try to start replication, I am...
  6. [SOLVED] Clean boot stuck at EFI stub: Loaded initrd ....

    I have a 3-node cluster of 3 identical Dell R630s. All were running Proxmox 7 and I eventually updated them to 8 ... all good. Having played with stuff, I decided to upgrade storage and do a clean installation on all 3. Got the Dell BIOS/iDRAC updated on all, etc. Now 2 of my nodes got the installation done...
  7. Designing a Proxmox HA cluster with 4 nodes on 2 remote sites - Quorum and impact

    Bumping this thread; I am interested to hear the answers as well.
  8. Moving Ceph to an IPv6 ring network

    I have a 3-node Proxmox cluster on a managed switch (192.168.1.150/151/152). All VMs are on VLAN 10. All nodes have 2x 1G + 2x 10G and are running a Ceph cluster. On each node, 1x 1G (vmbr0) is used for management (192.168.1.150/151/152) and the 2x 10G are used for an IPv6/ospf6 ring network, but the network is unused...
  9. [SOLVED] Help: Ceph pool non-responsive/inactive after moving to a new house/new connection

    I asked about this during setup and people said it won't help much to use a second interface on the server. If I am hearing you right... should I do this and use those 4 extra 10G ports on my Netgear managed switch? Does it look right? Or should I just skip the 1G connection and use the 10G ports only?
  10. [SOLVED] Help: Ceph pool non-responsive/inactive after moving to a new house/new connection

    I guess I won't be angry if you put my name up as most stupid person of the month at the top of this forum. One of the nodes was not fully updated and the repo was bad on that one. The thing is that when you update Proxmox from the web shell (running 3 nodes) and switch nodes while one node is updating... you don't...
  11. [SOLVED] Help: Ceph pool non-responsive/inactive after moving to a new house/new connection

    I definitely didn't mess with external tools until I had exhausted the other options. Ceph was non-responsive before. You think a reinstall and reimporting the OSDs is possible? I just don't want to lose the data that I thought was copied on 6 disks :)
  12. [SOLVED] Help: Ceph pool non-responsive/inactive after moving to a new house/new connection

    Another bit of info: a friend said RADOS is corrupted because of this message. root@mystic1:/var/log/ceph# systemctl status ceph-radosgw.target Unit ceph-radosgw.target could not be found. root@mystic1:/var/log/ceph# systemctl status...
  13. [SOLVED] Help: Ceph pool non-responsive/inactive after moving to a new house/new connection

    Hmmm, now suspecting that this whole thing also has something to do with the following error. Should I try to update Ceph to Reef or something? root@mystic1:~# ./cephadm install Installing packages ['cephadm']... Non-zero exit code 100 from apt-get install -y cephadm apt-get: stdout Reading...
  14. [SOLVED] Help: Ceph pool non-responsive/inactive after moving to a new house/new connection

    Also, I tried to install cephadm and upgrade Proxmox, but now I am getting this error. Wondering if some packages are damaged, maybe? Can I reinstall Ceph without losing the data on my disks? Setting up proxmox-kernel-6.2 (6.2.16-12) ... Errors were encountered while processing: cephadm E: Sub-process...
  15. Ceph completely broken - Error got timeout (500)

    Did you find any resolution for this? I am in the same boat and getting help from folks here, but wondering what fixed it for you... if it did.
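On the backup question in the first thread above ("Network crash during PVE cluster backups onto PBS"), one commonly suggested mitigation for backups saturating a shared link is capping backup bandwidth node-wide in vzdump's defaults. A minimal sketch of /etc/vzdump.conf; the numbers are illustrative assumptions, not recommendations for this cluster:

```
# /etc/vzdump.conf — node-wide vzdump defaults (sketch; values are assumptions)
# bwlimit is in KiB/s: 250000 KiB/s ≈ 2 Gbit/s, leaving headroom on a 10G link
bwlimit: 250000
# cap parallel backup I/O workers so one job cannot monopolize the storage path
performance: max-workers=4
```

A bwlimit can also be set per storage or per backup job, which overrides this node-wide default.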
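On the replication/migration errors in threads 4 and 5: `pvecm mtunnel` failing over a management IP usually means migration and replication traffic is still routed over the default (management) network. A sketch of the relevant line in /etc/pve/datacenter.cfg, assuming the 10G mesh subnet from thread 1; substitute the actual mesh network:

```
# /etc/pve/datacenter.cfg — sketch; 10.10.40.0/24 is an assumed mesh subnet
# send migration/replication traffic over the 10G mesh instead of vmbr0
migration: secure,network=10.10.40.0/24
```

Each node also needs an address inside that network and working SSH host keys for its peers, which is worth verifying with a manual ssh from node to node.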
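On moving Ceph onto the 10G ring in thread 8: Ceph's data paths are governed by public_network and cluster_network in /etc/pve/ceph.conf. A minimal sketch, assuming an IPv6 ULA prefix routed over the ospf6 mesh; the fd00:10::/64 prefix is a placeholder, not taken from the thread:

```
# /etc/pve/ceph.conf — sketch; fd00:10::/64 is a placeholder prefix
[global]
    ms_bind_ipv6 = true
    ms_bind_ipv4 = false
    public_network = fd00:10::/64
    cluster_network = fd00:10::/64
```

Note that existing monitors keep the address they were created with, so on a running cluster the mons must be destroyed and recreated one at a time on the new network; editing ceph.conf alone does not move them.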