Search results

  1. UPS - Shutdown entire cluster

    We have nut-server successfully monitoring a UPS, with nut-client running on all nodes. When power goes away it correctly and simultaneously initiates 'init 0' on all nodes, but this then causes problems: nodes that only provide Ceph storage shut down before VMs are given a chance (yes, qemu...
  2. Ceph Nautilus to Octopus upgrade gotchas?

    Set autoscale to warn instead of auto, until the bug with monitors not trimming osdmaps is resolved. https://forum.proxmox.com/threads/ceph-octopus-upgrade-notes-think-twice-before-enabling-auto-scale.80105/
  3. Ceph Octopus upgrade notes - Think twice before enabling auto scale

    The fundamental problem is that the osdmaps are not trimmed by the monitors at each step where reducing placement groups ends with all of them in a clean state. The monitors only prune the osdmaps once all of them have been restarted and all placement groups are clean at that moment in time, i.e.: Large...
  4. PVE 6.3 with HA Ceph iSCSI

    Ceph local SSD RBD - Flushing each write:
    Command Line: diskspd.exe -b8k -d120 -Suw -L -o2 -t4 -r -w30 -c250M c:\io.dat
    Input parameters:
      timespan: 1
      -------------
      duration: 120s
      warm up time: 5s
      cool down time: 0s
      measuring latency...
  5. PVE 6.3 with HA Ceph iSCSI

    Ceph iSCSI remote HDD EC exported and accessed via MultiPath iSCSI - Flushing each write:
    Command Line: diskspd.exe -b8k -d120 -Suw -L -o2 -t4 -r -w30 -c250M e:\io.dat
    Input parameters:
      timespan: 1
      -------------
      duration: 120s
      warm up time: 5s
      cool...
  6. PVE 6.3 with HA Ceph iSCSI

    We used Microsoft's TechNet Diskspd utility (https://gallery.technet.microsoft.com/DiskSpd-a-robust-storage-6cd2f223) to benchmark performance. The tool is command line based; herewith the parameters it was run with:
    @echo off
    cd "C:\Users\Administrator\Desktop\Diskspd-v2.0.17\amd64fre"
    echo...
  7. Ceph-iscsi howto ?

    You could now consider running Ceph's multipath (highly available) iSCSI gateways, herewith the thread: https://forum.proxmox.com/threads/pve-6-3-with-ha-ceph-iscsi.81991/
  8. Ceph IScsi initiator?

    You could now consider running Ceph's multipath (highly available) iSCSI gateways, herewith the thread: https://forum.proxmox.com/threads/pve-6-3-with-ha-ceph-iscsi.81991/
  9. Proxmox 6 and CEPH iSCSI ?

    You could now consider running Ceph's multipath (highly available) iSCSI gateways, herewith the thread: https://forum.proxmox.com/threads/pve-6-3-with-ha-ceph-iscsi.81991/
  10. ISCSI gateway with proxmox

    You could now consider running Ceph's multipath (highly available) iSCSI gateways, herewith the thread: https://forum.proxmox.com/threads/pve-6-3-with-ha-ceph-iscsi.81991/ Depending on your use you may however prefer to simply deploy a minimal Debian VM and then feed it additional images...
  11. PVE 6.3 with HA Ceph iSCSI

    Hi, To start, I would not recommend that people use this to somehow cook together PVE using a remote cluster via iSCSI as storage for VMs. In our case we have a secondary cluster which used to host a multi-tenant internet-based backup service, comprising 6 servers with 310 TiB available...
  12. TASK ERROR: timeout waiting on systemd

    We also appear to be experiencing the same problem. We have some Windows VMs (it appears to primarily affect Active Directory role-related hosts, such as DCs and a dedicated Azure Connect instance, although we primarily host Linux VMs on the affected cluster). We are most probably also not...
  13. Show SSD wearout - SAS connected SSDs

    Hi, Please may I ask that the disk health display show the media wearout indicator for SAS-connected SSDs? I presume the 'Disks' information is parsed via smartctl and subsequently displays N/A due to SAS-connected SSDs not showing raw value data. Herewith a snippet of the SSDs which connect via...
  14. ceph storage all pgs snaptrim every night slowing down vms

    It appears there is motivation to submit a pull request to change the default back to 'true'. Whilst there are some workloads that benefit from this having been disabled, the majority of people may ultimately prefer to enable this again, especially smaller clusters comprising...
  15. Ceph Octopus - Monitor sometimes inconsistent

    kvm6a then goes on to indicate that osd 22, the OSD that had just been restarted, had slow ops, but again only from kvm6a's monitor's perspective...
  16. Ceph Octopus - Monitor sometimes inconsistent

    kvm6c:
    2020-12-04T12:52:04.363+0200 7ff0f1dc4700 1 mon.kvm6c@2(peon).osd e57825 e57825: 24 total, 24 up, 24 in
    2020-12-04T12:52:04.815+0200 7ff0f45c9700 1 mon.kvm6c@2(peon).osd e57825 _set_new_cache_sizes cache_size:1020054731 inc_alloc: 71303168 full_alloc: 553648128 kv_alloc: 390070272...
  17. Ceph Octopus - Monitor sometimes inconsistent

    Hi Alwin, The slow ops were reported by only one of the 3 monitors after having restarted OSD 22. The warnings continued for 4+ days until only that monitor was restarted. During this time all placement groups were active+clean, as shown in the first message above. My understanding of the logs...
  18. Ceph Octopus - Monitor sometimes inconsistent

    Hi Alwin, Thanks for the feedback. The issue appears to affect Octopus monitors, so I would expect more reports along these lines as more and more people upgrade from Nautilus. Outstanding operations were stuck for almost 4 days whilst everything was active+clean with virtually no load. To answer...
  19. Ceph Octopus - Monitor sometimes inconsistent

    We appear to have an inconsistent experience, with one of the monitors sometimes appearing to misbehave. Ceph health shows a warning with slow operations:
    [admin@kvm6b ~]# ceph -s
      cluster:
        id:     2a554db9-5d56-4d6a-a1e2-e4f98ef1052f
        health: HEALTH_WARN
                17 slow ops, oldest...
  20. Ceph Octopus upgrade notes - Think twice before enabling auto scale

    Hi, We've been working through upgrading individual nodes to PVE 6.3 and have extended this to now include hyper-converged Ceph cluster nodes. The upgrades themselves went very smoothly, but the recommendation around setting all storage pools to autoscale can cause some headaches. The last paragraph...
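
Several of the results above (2, 3 and 20) recommend setting the PG autoscaler to warn rather than letting it rescale automatically until the monitor osdmap-trimming bug is resolved. A minimal sketch of doing that with the standard Ceph CLI, assuming it is run on a node with monitor admin access; the pool loop covers existing pools, the config setting covers pools created later:

```shell
# Switch every existing pool's autoscaler to advisory mode ('warn'):
# the autoscaler will then only raise a health warning instead of
# changing pg_num on its own.
for pool in $(ceph osd pool ls); do
    ceph osd pool set "$pool" pg_autoscale_mode warn
done

# Make 'warn' the default autoscale mode for newly created pools too.
ceph config set global osd_pool_default_pg_autoscale_mode warn
```

Once the cluster has been fully upgraded and the trimming issue addressed, the same commands with `on` instead of `warn` re-enable automatic scaling.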