Search results

  1.

    After unplanned reboot locked VMs don't start

    @tom have you read my post? Where did I ask how to prevent a spontaneous reboot? (Also there would be no point, as there can be many reasons, from kernel errors to power outages to hardware malfunction.) I was asking: is it necessary for VM backup locks to persist across reboots? If not, it would...
  2.

    KVM disk has disappeared from Ceph pool

    Okay, so I re-added pool2 in the storage UI (did not touch the Ceph pool itself), and checked keyrings: root@proxmox:~# cat /etc/pve/priv/ceph/pool2.keyring [client.admin] key = PQDDRU9YX9u7HhAAEo3wLAFVCgVL+JsrEcs6HA== root@proxmox:~# cat /etc/pve/priv/ceph/pool3.keyring [client.admin]...
  3.

    After unplanned reboot locked VMs don't start

    There is a bug on the Proxmox bugzilla that is (partly) about this case: https://bugzilla.proxmox.com/show_bug.cgi?id=1024
  4.

    After unplanned reboot locked VMs don't start

    Why would I be happy that you spam my thread with off-topic nonsense? Why do you keep posting instead of reading the original question? How can you be so ignorant to tell me what to do after hijacking my thread? - no one asked your advice about how to prevent reboots - no one was blaming...
  5.

    After unplanned reboot locked VMs don't start

    Next time why don't you try to actually read the post before replying to it... It helps, especially if you already missed reading the title.
  6.

    After unplanned reboot locked VMs don't start

    Occasionally we experience unplanned, spontaneous reboots on our Proxmox nodes installed on ZFS. The problem we are having is related to vzdump backups: if a reboot happens during an active vzdump backup that locks a VM, after reboot the locked guest will not start, and needs to be manually...
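
    A minimal sketch of the kind of manual recovery this situation usually requires, assuming a hypothetical VMID 102 left locked by vzdump (qm config, qm unlock and qm start are standard Proxmox commands; the VMID is only an example):

    # the leftover backup lock is visible in the guest configuration
    qm config 102 | grep '^lock:'

    # remove the stale vzdump lock, then start the guest again
    qm unlock 102
    qm start 102
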
  7.

    KVM disk has disappeared from Ceph pool

    It does not give any output (the PVE storage is named the same as the RBD pool): root@proxmox:~# pvesm list pool2 root@proxmox:~# I have even tried to remove and re-add the RBD storage with different monitor IPs, to the same effect: Proxmox does not see the content of the RBD pools.
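
    A short checking sketch for this state, assuming the storage ID equals the pool name (pool2) as described above; the monitor list in storage.cfg and the keyring file name are the usual suspects when a pool's contents stop showing up:

    # compare what Ceph itself reports with what the PVE storage layer sees
    rbd ls -l pool2
    pvesm list pool2
    pvesm status

    # the external RBD storage definition (monhost, pool, username)
    grep -A 6 'rbd: pool2' /etc/pve/storage.cfg

    # the keyring has to be named after the storage ID
    ls -l /etc/pve/priv/ceph/pool2.keyring
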
  8.

    KVM disk has disappeared from Ceph pool

    This took about 1 second: root@proxmox:~# rbd ls -l pool2 NAME SIZE PARENT FMT PROT LOCK vm-102-disk-1 43008M 2 vm-112-disk-1 102400M 2 vm-126-disk-1 6144G 2 The GUI list is empty since the spontaneous node reboot Saturday night, and neither OSD repairs...
  9.

    KVM disk has disappeared from Ceph pool

    @tom @fabian @wolfgang any idea what might have happened here? Proxmox still can't access my VM disks on the rbd pool...
  10.

    KVM disk has disappeared from Ceph pool

    Also here is my ceph.conf, I remember disabling the cache writethrough until flush setting (probably not connected to this issue). This is clearly a Proxmox problem, since the data is there on the rbd pools, only Proxmox does not seem to see them. root@proxmox:~# cat /etc/pve/ceph.conf...
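
    For reference, the setting mentioned above is the Ceph client option rbd cache writethrough until flush; disabling it would look roughly like this in the [client] section (a sketch only, the surrounding option is an assumption):

    [client]
        rbd cache = true
        # when set to false, the client starts in writeback mode immediately
        # instead of staying in writethrough until the first flush arrives
        rbd cache writethrough until flush = false
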
  11.

    KVM disk has disappeared from Ceph pool

    Sure, here you go: root@proxmox:/etc/pve/qemu-server# cat 126.conf balloon: 0 bootdisk: virtio0 cores: 2 cpu: SandyBridge ide2: none,media=cdrom memory: 1024 name: OMV3 net0: virtio=3A:A9:18:97:A9:75,bridge=vmbr2 net1: virtio=0A:57:30:EB:EF:01,bridge=vmbr0 numa: 0 onboot: 1 ostype: l26 scsihw...
  12.

    KVM disk has disappeared from Ceph pool

    root@proxmox:~# rbd ls pool2 vm-102-disk-1 vm-112-disk-1 vm-126-disk-1 Isn't that interesting? The virtual disks are there, yet Proxmox does not see them. I have since updated and restarted all cluster nodes, but the problem persists: Proxmox does not see the contents of the Ceph pools.
  13.

    KVM disk has disappeared from Ceph pool

    I have a small 5 node Ceph (hammer) test cluster. Every node runs Proxmox, a Ceph MON and 1 or 2 OSDs. There are two pools defined, one with 2 copies (pool2), and one with 3 copies of data (pool3). Ceph has a dedicated 1Gbps network. There are a few RAW disks stored on pool2 at the moment...
  14.

    Node rebooting without apparent reason

    Are you on ZFS? Is your system low on free memory? If your answer to both questions is yes, then your spontaneous reboot can be prevented by the following: 1. ZFS ARC size: We aggressively limit the ZFS ARC size, as it has led to several spontaneous reboots in the past when left unlimited...
  15.

    Frequent CPU stalls in KVM guests during high IO on host

    Yes, create the file with the two lines. Include the values in bytes, so 5GB equals 5x1024x1024x1024 bytes. Limit the ARC and set the values for rpool/swap as soon as you can, and your server will not reset anymore. You can also upgrade Proxmox to 4.4 right now; it will work fine. You should...
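
    A sketch of what such a file could look like, assuming the two lines are the ZFS ARC module limits and using the 5 GB example above (5 x 1024 x 1024 x 1024 = 5368709120 bytes); the minimum value and file name are assumptions:

    # /etc/modprobe.d/zfs.conf -- values are given in bytes
    options zfs zfs_arc_min=536870912
    options zfs zfs_arc_max=5368709120

    On a ZFS-root install the initramfs has to be regenerated afterwards (update-initramfs -u) for the limit to apply at boot; at runtime the same value can also be written to /sys/module/zfs/parameters/zfs_arc_max.
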
  16.

    Frequent CPU stalls in KVM guests during high IO on host

    1. ZFS ARC size: We aggressively limit the ZFS ARC size, as it has led to several spontaneous reboots in the past when unlimited. Basically, we add up all the memory the system uses without caches and buffers (like all the KVM maximum RAM combined), subtract that from total host RAM, and set the...
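
    As a worked example of that calculation (all numbers hypothetical): on a 64 GB host whose guests have at most 48 GB of RAM assigned in total, with about 4 GB kept aside for the hypervisor itself, the resulting ARC cap would be 64 - 48 - 4 = 12 GB:

    # hypothetical sizing: total host RAM minus VM maximum RAM minus host overhead
    host_ram=$((64 * 1024**3))
    vm_ram=$((48 * 1024**3))
    overhead=$((4 * 1024**3))
    echo "zfs_arc_max=$((host_ram - vm_ram - overhead))"    # prints 12884901888 (12 GB)
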
  17.

    Frequent CPU stalls in KVM guests during high IO on host

    These tweaks are for the host mainly, but I'm using vm.swappiness=1 in all my VMs as well to avoid unnecessary swapping. Just as I expected: this CPU stall issue is happening when the host is using ZFS and IO load is high.
  18.

    Frequent CPU stalls in KVM guests during high IO on host

    @e100 were you having problems / applying this solution on the host, in the guests, or both? I have come to apply the same solution to a similar set of problems as well. Unfortunately I have no idea why it helps, but IO buffers and KVM memory allocation together with the heavy memory use of ZFS in...
  19.

    Frequent CPU stalls in KVM guests during high IO on host

    I think it's fine. I have a couple of other sysctl tweaks that are all supposed to decrease IO load through better use of memory, and thus prevent stalls and OOM kills: vm.min_free_kbytes = 262144 vm.dirty_background_ratio = 5 vm.swappiness = 1 This is about why dirty_background_ratio needs to be...
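
    Collected into a drop-in file, those three values would look like this (the file name is arbitrary; apply with sysctl --system or a reboot):

    # /etc/sysctl.d/90-io-tuning.conf -- values quoted from the post above
    # keep roughly 256 MB free for atomic allocations
    vm.min_free_kbytes = 262144
    # start background writeback at 5% dirty memory
    vm.dirty_background_ratio = 5
    # swap only under real memory pressure
    vm.swappiness = 1
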
  20.

    Frequent CPU stalls in KVM guests during high IO on host

    Thank you for this report. In our case, almost all of our VMs were created under 4.3 (and their contents synced from OpenVZ), so I doubt that the version of the node that created the VM would have any effect. One possible solution: on the other hand, we have found one thing that probably helps...