aaron's latest activity

  • aaron
    aaron replied to the thread [SOLVED] 3-node cluster.
    Should you ever plan to have more nodes per room, the following CRUSH rule would be better, as it makes sure that replicas end up on different hosts: rule replicate_3rooms { id {RULE ID} type replicated step take default...
  • aaron
    aaron replied to the thread [SOLVED] 3-node cluster.
    Name: main_3, Size: 2, Min Size: 2. There you go. That pool has a size of 2. That means that some PGs only have one replica present because the only other one was on the lost node. Ceph should recover those once the DOWN OSDs are set to...
  • aaron
    aaron replied to the thread [SOLVED] 3-node cluster.
    The problem is this: pgs: 64.341% pgs not active 793382/2444972 objects degraded (32.450%) 83 undersized+degraded+peered Some PGs are not active, and therefore you have IO issues. Was the cluster healthy before...
  • aaron
    aaron replied to the thread [SOLVED] 3-node cluster.
    Well, as others mentioned, if one node is down, the Ceph MONs and Proxmox VE nodes should still have a quorum with 2 out of 3. Data-wise, if you have set size/min_size to 3/2 in all the pools, things should keep working as you should still have 2...
  • aaron
    Hmm. It seems that the detection of which files or directories are present in the /var/lib/rrdcached/db directory is coming to the wrong conclusions. Would you mind posting the output of the following command? for i in pve2-vm pve-vm-9.0; do echo...
  • aaron
    See https://pve.proxmox.com/wiki/Upgrade_from_8_to_9#VM_Memory_Consumption_Shown_is_Higher Is the Ballooning Device enabled and is the BallooningService running?
  • aaron
    In general, what happened there sounds a bit odd. Check whether you still have /etc/pve/nodes/{old nodes}/qemu-server directories and whether the configs are still in there. If so, you can move them into the correct directory with mv.
  • aaron
    To get more debug output from the processing side, can you please install the following build of pve-cluster? http://download.proxmox.com/temp/pve-cluster-9-rrd-debug/ wget...
  • aaron
    Thanks. That looks good and is as it should be. So I will have to take a look at the code that is receiving and processing that data.
  • aaron
    This is curious. Would it be okay for you to gather a bit more information? Because it seems that for some reason, the pvestatd service still collects and distributes the old pre-PVE 9 metric format, but under the new key... So to further see...
  • aaron
    See also https://pve.proxmox.com/wiki/Upgrade_from_8_to_9#VM_Memory_Consumption_Shown_is_Higher
  • aaron
    aaron replied to the thread 100% Swap Usage.
    Swap is more than just an escape for low memory: https://chrisdown.name/2018/01/02/in-defence-of-swap.html But given that the host has ~185G of memory, you could consider disabling swap, as that is a lot of memory, and if you run out of memory...
  • aaron
    Hmm, those versions look new enough. Can you please restart the pvestatd service on the hosts? Either in the Node→System panel, or with systemctl restart pvestatd. Does that help to get rid of the log messages?
  • aaron
    Yeah. I assume you have one interface for everything on the hosts, that goes to the switch, right? The single point of failure there is the switch. If you can add a direct cable between the hosts, without a switch in between, you can configure a...
  • aaron
    Well, those long timeouts are most likely the explanation. If corosync takes too long to form a new quorum with just the QDevice, it might take longer than the 60s timeout of the LRM! Please set it back to defaults, from one of my test clusters...
  • aaron
    Can you please post your /etc/pve/corosync.conf file? And make sure that the /etc/pve/corosync.conf and /etc/corosync/corosync.conf files are the same.
  • aaron
    aaron replied to the thread how P2V in Proxmox ?.
    Any tool that allows booting a live system on the physical host and the target VM to transfer disk contents. That can be just a regular Linux live system and dd + ssh on both sides. Or something more guided like Clonezilla. There are surely also...
  • aaron
    aaron replied to the thread PVE9 Memory management problems.
    Was that host installed a while ago? Because, IIRC, since about 8.1, the installer limits the ARC by default. If you installed earlier, you can manually set a limit on the ARC...
  • aaron
    Yeah, if you can do another test, I am interested in the pvecm status output of node pve/Node1 a few seconds after you disconnected pve1/Node2, but before it eventually fences itself (if something is wrong). I think that info is not yet in...
  • aaron
    There is a misunderstanding in how fencing works. It is handled by the LRM on each node. If it is in "active" mode and the host lost the connection to the quorum for more than 60 seconds, it will not renew the watchdog. Once the watchdog runs...
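
The fencing condition described in the last reply can be sketched as a small predicate. This is only an illustrative model of the stated rule (LRM in "active" mode, more than 60 seconds without quorum), not Proxmox VE code. The function name, the constant name, and the "idle" mode label used in the demo are hypothetical and not taken from the post.

```python
# Illustrative model only; names are hypothetical, not Proxmox VE APIs.
LRM_QUORUM_TIMEOUT_S = 60  # the 60 second timeout mentioned in the replies above


def stops_renewing_watchdog(lrm_mode: str, seconds_without_quorum: float) -> bool:
    """Return True when, per the description above, the LRM stops renewing the watchdog.

    The rule as stated: the LRM is in "active" mode AND the host has been
    without a connection to the quorum for more than 60 seconds.
    """
    return lrm_mode == "active" and seconds_without_quorum > LRM_QUORUM_TIMEOUT_S


if __name__ == "__main__":
    # "idle" is an assumed label for any non-active LRM mode.
    for mode, secs in [("active", 10), ("active", 61), ("idle", 300)]:
        print(mode, secs, stops_renewing_watchdog(mode, secs))
```

Under this model, long corosync timeouts matter because any delay in re-forming a quorum counts against the same 60 second budget.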