Recent content by Dragonn

  1. D

    Duplicate MAC Addresses Generated for Bonded Interfaces on Identical Server Hardware

    Debian Bullseye keeps the machine-id in two places, and it's probably a good idea to keep them in sync. When cloning servers, I do something like: rm /etc/machine-id /var/lib/dbus/machine-id; /usr/bin/dbus-uuidgen --ensure; /bin/systemd-machine-id-setup
  2. D

    Migration loop when removing host from HA group

    Sure, all hypervisors are in the given group, cluster, with priority 1. When I want to do maintenance on a host, I just set its priority to 0. The interesting part of groups.cfg looked like this: group: cluster comment Whole cluster group nodes...
  3. D

    Migration loop when removing host from HA group

    Hello guys, I am reinstalling two Proxmox hypervisors on new hardware, so I need to migrate all VMs off the old hypervisors. I have lowered the HA priority (from 1 to 0) of the given hosts, and the VMs started to get migrated out (as expected). Unfortunately, the migration of a single VM (ID 204) failed (unable to...
  4. D

    Hypervisor rebooted when VM memory resized

    Thanks @aaron for all your answers. Yeah, I understand that there are some cases where this behavior is not intended. I am currently trying to build a cluster that is as reliable as possible. I don't mind reserving RAM and never using it, rather than unexpected VM failure because of OOM...
  5. D

    Hypervisor rebooted when VM memory resized

    Hello there, yesterday one of our hypervisors crashed (actually, it was probably rebooted by the watchdog) when a VM had its memory resized. I would really appreciate it if you could give me some insight into the VM placement algorithm, because I am still unable to completely understand how node selection...
  6. D

    Cluster Losing Quorum

    Hi, the current status probably won't show you that much. Try searching the logs instead. You can start with the corosync logs, which will tell you whether any cluster communication link was lost. If so, there is probably a networking issue to investigate.
  7. D

    [SOLVED] Unable to properly remove node from cluster

    Thank you very much @dylanw, it looks like deleting the folders was enough to clean it up. I cannot find it anywhere else.
  8. D

    [SOLVED] Unable to properly remove node from cluster

    Sure, no problem. But I can see no traces of virt98 in corosync configurations or runtime. P virt3[root](15:20:22)-(~) -> pvecm status Cluster information ------------------- Name: virt Config Version: 21 Transport: knet Secure auth: on Quorum information...
  9. D

    [SOLVED] Unable to properly remove node from cluster

    Hello, I am struggling to remove a single Proxmox node from the cluster properly. I am following the guide in the docs https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_remove_a_cluster_node and it looks like the node is only partially removed. Basically I did something like # ensured no VMs are on node...
  10. D

    Executing agent command via pvesh failed

    Done - https://bugzilla.proxmox.com/show_bug.cgi?id=3037. Thanks for the info.
  11. D

    Executing agent command via pvesh failed

    Hello, I am trying to execute a command via qemu-ga with pvesh, but it fails when I try to do it from a remote node. Executing the command with qm works fine: A ovirt9[root](14:01:56)-(~) -> qm guest exec 158 -- date { "exitcode" : 0, "exited" : 1, "out-data" : "Wed 23 Sep 2020...
  12. D

    Quorum lost when adding new 7th node

    A sequential restart of the corosync daemons in the cluster fixed the quorum issue.
  13. D

    Quorum lost when adding new 7th node

    Update: It happened again and it's still broken right now. History: added nodes ovirt6 -> ovirt5; disabling HA as workaround, everything was okay; removed nodes ovirt97, ovirt98, ovirt99; nodes powered off, discs cleaned; removed nodes from cluster; pvecm delnode ovirt97; restarted corosync on 2...
  14. D

    Quorum lost when adding new 7th node

    Thank you for pointing out network QoS. I hadn't thought about that before and will definitely discuss it with our network colleagues. The current network setup is not final; we plan to upgrade to MLAG, but we are not there yet. I am currently not afraid of filling up the network, but I know it can happen...
  15. D

    Quorum lost when adding new 7th node

    Hello there, I would like to ask for some help and guidance in debugging an issue where our cluster loses corosync quorum (and reboots completely) when a new node is added. It has already happened twice on a 7-node cluster (when adding the 4th and now the 7th node). Deployment context: every server is fresh...
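
The machine-id reset mentioned in the first post of this list (the one about duplicate MAC addresses on bonded interfaces) can be written out as a small script. This is only a sketch of those three commands, not the poster's actual tooling: the `DRY_RUN` switch and the `run` helper are hypothetical additions for rehearsing the steps safely, and `rm -f` is used instead of plain `rm` so that a missing file is not an error.

```shell
#!/bin/sh
# Sketch: give a freshly cloned Debian host its own machine-id.
# DRY_RUN is a hypothetical switch for this example; with the default of 1
# the commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reset_machine_id() {
    # Remove both copies so the clone cannot keep the template's ID.
    run rm -f /etc/machine-id /var/lib/dbus/machine-id
    # Recreate the D-Bus machine ID.
    run /usr/bin/dbus-uuidgen --ensure
    # Recreate /etc/machine-id.
    run /bin/systemd-machine-id-setup
}

reset_machine_id
```

Run as shown, the script only prints the three commands. To apply them on a clone, it would be run as root with `DRY_RUN=0`, ideally before the host's network configuration is brought up for the first time.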