Recent content by gradinaruvasile

  1. G

    Backup job is stuck and I cannot stop it or even kill it

    The server that hosts the NFS share is dead at this moment.
  2. G

    Backup job is stuck and I cannot stop it or even kill it

    One server is out because of the aforementioned error, another will be out for the duration of the reboot. I will add 2 votes for a node temporarily to make sure quorum is maintained during this time.
  3. G

    Backup job is stuck and I cannot stop it or even kill it

    Regarding the phantom process that we see in the GUI, which belonged to the server that is in error: what can be done about it? It is not present on any physical server; can it be removed somehow?
  4. G

    Backup job is stuck and I cannot stop it or even kill it

    Luckily we have the resources to move the running VMs to other servers in the cluster. I think it still needs some quorum trickery to prevent quorum loss, as it is a 4-node cluster.
  5. G

    Backup job is stuck and I cannot stop it or even kill it

    The hosting server threw a CPU error according to the iLO IML logs. It cannot be powered on from iLO in this state, and nobody is on site until Sunday to try a hard reset.
  6. G

    Backup job is stuck and I cannot stop it or even kill it

    Same issue here. At backup time one of the servers died. The thing is, this server was the one that shared its storage via NFS, and the backups were written to that NFS share (mounted with the "hard" option). Now we have 2 backup processes in the GUI log, one was running on the server that died, the...
  7. G

    Hostnames changed to uppercase caused cluster outage after reboot (PVE 5.3)

    Hello, we are running Proxmox 5.3 on a cluster (we did not have time to upgrade this one yet), and last weekend we had a planned downtime (changed switches). After starting up, nothing was working: the web UI came up, login failed with "No Proxmox VE services running", no VMs were started, cluster...
  8. G

    Can i prevent mixed 6/7 PVE cluster from changing storage/backup config to the new version?

    Hello. We have a 5-member cluster that originally consisted of 3 servers running PVE 6.1. We have now added 2 more servers that were installed with the latest version, 7.3 at that time. The idea is to upgrade gradually, and the new servers were meant to test the compatibility between versions. Unfortunately there were...
  9. G

    [SOLVED] Active directory to use UPN instead of samaccountname?

    Hello, I am not sure which field is used for user matching in the AD connector; I suppose it is sAMAccountName. Is there an option somewhere to change it to UPN? Thanks.
  10. G

    [SOLVED] Corosync stopped working on one PVE node (out of 4) after restart

    Hmm. I did that, and now it seems to be working. I had restarted the pve-ha-crm and pve-ha-lrm services before, but I had reservations about corosync. So basically, all corosync instances have to be restarted in these cases, not just the non-working ones. I suppose the pve-ha-crm and pve-ha-lrm...
  11. G

    [SOLVED] Corosync stopped working on one PVE node (out of 4) after restart

    This is the syslog from a working node. It contains the message "[TOTEM ] Failed to receive the leave message. failed: 4"; I don't know if it's relevant.

    tail -f /var/log/syslog | grep -i corosync
    Feb 4 15:53:59 ndi-srv-021 corosync[13960]: [QUORUM] Members[3]: 1 2 3
    Feb 4 15:53:59 ndi-srv-021...
  12. G

    [SOLVED] Corosync stopped working on one PVE node (out of 4) after restart

    systemctl status pve-cluster
    systemctl status corosync.service

    This is the non-working node, pveversion -v:
    proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
    pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
    pve-kernel-5.4: 6.3-3
    pve-kernel-helper: 6.3-3
    pve-kernel-5.3: 6.1-6...
  13. G

    [SOLVED] Corosync stopped working on one PVE node (out of 4) after restart

    Other than /etc/pve/corosync.conf, what else do I need?

    /etc/pve/corosync.conf:
    logging {
      debug: off
      to_syslog: yes
    }
    nodelist {
      node {
        name: ndi-srv-021
        nodeid: 2
        quorum_votes: 1
        ring0_addr: 10.10.10.54
      }
      node {
        name: ndi-srv-024
        nodeid: 4...
  14. G

    [SOLVED] Corosync stopped working on one PVE node (out of 4) after restart

    Yes, but as of now not many VMs are marked as HA. This specific node has nothing on it, since it was stopped and everything on it was transferred beforehand.
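
The quorum concern in the backup-job posts above (a 4-node cluster with one node dead and another about to reboot) can be checked with simple arithmetic. The following is only a rough sketch, assuming corosync's default majority rule, where quorum is more than half of the total configured votes. The node names are hypothetical placeholders, and the functions are illustrative helpers, not part of any Proxmox or corosync API.

```python
def quorum_threshold(total_votes):
    """Votes needed for a strict majority: floor(total / 2) + 1."""
    return total_votes // 2 + 1


def is_quorate(votes, online):
    """Return True if the online nodes together hold a majority of all votes.

    votes:  mapping of node name -> configured quorum_votes
    online: set of node names that are currently reachable
    """
    total = sum(votes.values())
    live = sum(v for name, v in votes.items() if name in online)
    return live >= quorum_threshold(total)


# Hypothetical 4-node cluster, one vote per node (as in the corosync.conf excerpt).
default_votes = {"node1": 1, "node2": 1, "node3": 1, "node4": 1}

# Assumed scenario: node3 is dead and node4 is rebooting.
online = {"node1", "node2"}

# 2 of 4 votes are online, but 3 are needed.
print(is_quorate(default_votes, online))

# Temporarily give one surviving node 2 votes: 3 of 5 votes online, 3 needed.
boosted = dict(default_votes, node1=2)
print(is_quorate(boosted, online))
```

Under these assumptions the first check fails and the second passes, which matches the plan of temporarily giving one node 2 votes for the duration of the reboot. The actual change would be made through the `quorum_votes` setting in `/etc/pve/corosync.conf`, and the extra vote should be reverted once all nodes are back.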
