Search results

  1. P

    Ceph ghost OSDs in configuration database?

    Hi, I have a three-node home lab PVE + Ceph cluster. Recently, I replaced one node with another server. While I removed the old node from the PVE cluster config, I missed removing the OSDs from Ceph first. So for a while they were shown (together with the old node ("bucket")) as down...
  2. P

    24 Scrub Errors, 3pgs inconsistent

    I just went through upgrading my cluster to PVE version 8. After the upgrade, OSD.3 did not come back online again (it kept being shown as "in" but "down", irrespective of how often I started it or the entire node). I then checked the SMART values for the drive again and this time I did find...
  3. P

    nVIDIA vGPU mdev setup not working (as per wiki)

    Hi, I am trying to set up an NVIDIA A5000 as vGPU as per this wiki article: https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE_7.x Got to the point where I enabled SR-IOV and can list the virtual functions via lspci. But then it says under guest configuration that I should pick an mdev in...
  4. P

    24 Scrub Errors, 3pgs inconsistent

    No scrub information available for pg 3.36 error 2: (2) No such file or directory No scrub information available for pg 3.48 error 2: (2) No such file or directory Correct. Thank you for the recommendation. I am aware of this problem and am in the process of rectifying. But this being a...
  5. P

    24 Scrub Errors, 3pgs inconsistent

    "data_digest": "0xec9c58e1", "omap_digest": "0xffffffff", "expected_object_size": 4194304, "expected_write_size": 4194304, "alloc_hint_flags": 0, "manifest": { "type": 0 }...
  6. P

    24 Scrub Errors, 3pgs inconsistent

    { "object": { "name": "rbd_data.453d0387ce7766.00000000001028e4", "nspace": "", "locator": "", "snap": "head", "version": 371918 }, "errors": [], "union_shard_errors"...
  7. P

    24 Scrub Errors, 3pgs inconsistent

    Done. Sorry, I don't understand. Where/when wasn't I paying attention? And what do you mean by "'inactive' obviously doesn't fit"? Sorry, my Ceph knowledge is at "noob" level... PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY 3.76...
  8. P

    24 Scrub Errors, 3pgs inconsistent

    Sorry, I had missed that. I edited my post above accordingly.
  9. P

    24 Scrub Errors, 3pgs inconsistent

    Yes, they were on another node that I have since removed but missed the window to remove the OSDs first... I haven't gotten around to deleting them from <wherever>. They don't play a role here. ceph health detail HEALTH_ERR 26 scrub errors; Possible data damage: 3 pgs inconsistent [ERR]...
  10. P

    24 Scrub Errors, 3pgs inconsistent

    So, scrubbing has continued and apparently, it identified two more errors: ceph -s cluster: id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx health: HEALTH_ERR 26 scrub errors Possible data damage: 3 pgs inconsistent services: mon: 3 daemons, quorum...
  11. P

    24 Scrub Errors, 3pgs inconsistent

    Hi, On my 3 node home lab cluster, Ceph tells me that it has discovered 24 scrub errors and that 3 pgs are inconsistent. That does not sound overly promising. More importantly, however, I have no idea what to do with this information... I take it that the nature of this error prevents Ceph...
  12. P

    Help configuring vGPU?

    Hmm. So disabling the card's displayport seems to have worked (the benefit of this is that I got my KVM video feed back because the card does not override the onboard KVM anymore) - yay! I then removed the blacklist entry from GRUB and nvidia was loaded again. But lspci -d 10de: still does...
  13. P

    Help configuring vGPU?

    Ah, got it disabled by adding module_blacklist=nvidia to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub. I'll cross my fingers that disabling the displayport is going to do the trick...
  14. P

    Help configuring vGPU?

    Hi, I am trying to get vGPU to work by following this guide: https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE_7.x#cite_ref-3 My card is an RTX A5000, the same as used in the guide, so should work. I went through with the setup but it doesn't work. lspci -d 10de: results in a very short...
  15. P

    Trying to understand nVidia vGPU

    Hi, so far, I have only used PCIe passthrough to assign a single graphics card to a single VM - that works well as long as the CPU supports IOMMU, which isn't the case with my budget Xeon E3-12xx servers. But I understand there is now another way: vGPU where I can assign (parts of) the same GPU...
  16. P

    Mailcow + PMG make sense?

    If I am not mistaken, then the temporary downtime of your mail server should not result in missing (or even the loss of) emails. The sending mail server should keep trying for a while (up to two days or so) before giving up.
  17. P

    Mailcow + PMG make sense?

    And where do you see the benefits of combining the two (they seem to be duplicating some functionality between them)?
  18. P

    rbd error: rbd: listing images failed: (2) No such file or directory (500)

    Hi, I have a three node PVE cluster with Ceph with (currently) three pools. This is about one of them. When I click in the GUI on one of the pools and want to view the VM Disks or CT Volumes, I just get the error message "rbd error: rbd: listing images failed: (2) No such file or directory...
  19. P

    Ceph: Balancing disk space unequally!?!?!?!

    One more question for my understanding please: So when I have two disks on a node and one drive goes down, Ceph will try to push its content to the other drive (and if that one isn't large enough, I have a problem). If I only have one drive on a node that doesn't happen. Understood. But what...
  20. P

    [SOLVED] PBS Metrics?

    Bonus question: Are there any compatible Grafana dashboards? I can only find two in the Grafana dashboard library. One requires Prometheus and the other a custom script on the host. Anything that works with the PBS metrics server (out of the box)?