Recent content by aychprox

  1. Ceph 19.2.1 2 OSD(s) experiencing slow operations in BlueStore

    Observed for quite some time; it only happens on SSD OSDs, never on NVMe or HDD OSDs. I'm still using ceph: 17.2.8-pve2. When it happens, running ceph config set osd.x bluestore_slow_ops_warn_threshold 120 in the CLI makes the error go away. However, it comes back again randomly on a different SSD OSD... (see the sketch after this list)
  2. Ceph 19.2.1 2 OSD(s) experiencing slow operations in BlueStore

    Yes, no plan to upgrade to 19.2.1 yet. Only seen this on SSDs so far. Initially I thought it was caused by bluestore_cache_size and bluestore_cache_kv_ratio, but no luck even after adjusting them.
  3. Ceph 19.2.1 2 OSD(s) experiencing slow operations in BlueStore

    Same issue here with ceph: 17.2.8-pve2, proxmox-ve: 8.4.0, pve-manager: 8.4.1. The slow-ops warning appears at least 2-3 times a day from a random SSD OSD; restarting the OSD temporarily removes the warning (see the restart sketch after this list).
  4. Unsupported feature(s) vdev_zaps_v2, Cannot import rpool after upgrade

    Hi all, today I tried to upgrade one node from pve-manager/8.3.5 to the latest release. Unfortunately the node is unable to start due to a ZFS issue: "This pool uses the following feature(s) not supported by this system: com.klarasystems:vdev_zaps_v2 / cannot import 'rpool': unsupported..." (a diagnostic sketch follows after this list)
  5. [RESOLVED] is using msgr V1 protocol filling daemon.log/syslog

    Today I rebooted one of the nodes and noticed this line filling the daemon.log/syslog: Feb 18 02:27:55 node102 ceph-mgr[19158]: 2024-02-18T02:27:55.038+0800 7f472ccea700 -1 --2- 10.xx.xx.102:0/241056175 >> [v2:10.xx.xx.105:6801/4159,v1:10.xx.xx.105:6803/4159] conn(0x557fe2544800 0x557fe0fed080...
  6. Ceph HDDs slow

    I'm interested to know your OSD apply & commit latency before and after adding the Optane (one way to read those numbers is sketched after this list).
  7. How to change virtio-pci tx/rx queue size

    Just reviving this post to see whether there is any roadmap to implement rx_queue_size and tx_queue_size for KVM guests. 256 is just too low for a high-traffic guest, which eventually leads to heavy packet drops (a hedged workaround sketch follows after this list).
  8. Shutdown Standby Node in Cluster, Any Impact?

    Thanks Alwin. Sorry, it is 10 nodes in total: 5 nodes for Ceph and 5 nodes for VMs. Corosync is on a different VLAN, with 2 rings on 2 different switches. The reason for keeping this node is just to cover emergency spikes on the other nodes, so I can easily boot up this spare node and migrate over. This...
  9. Shutdown Standby Node in Cluster, Any Impact?

    I am running a 9-node cluster with 5 nodes for Ceph and 4 nodes for VMs. Since the capacity is not fully utilized, and to save electricity, I shut down 1 of the compute nodes and use it as a standby node. May I know, in terms of HA and corosync, whether there will be any impact in the long run... (a quorum-check sketch follows after this list)
  10. After configuring metric server, VMs show status unknown.

    Try running: service pveproxy restart && service pvestatd restart
  11. Sometimes backups to PBS fail with a timeout at 99% with a message "qmp command 'query-backup' failed - got timeout"

    Encountered the same issue. Does this require a downgrade on each PVE host as well?
  12. [SOLVED] unable to parse active worker status

    Yes, it works. Sorry, maybe I missed this known issue.
  13. [SOLVED] unable to parse active worker status

    Hi, just upgraded to the latest 0.8-11 release. Trying to run prune, the following error pops up in PBS: unable to parse active worker status 'UPID:pbs:0000021F:000**MASK**:0000000A:**MASK**:termproxy::root:' - not a valid user id. Please advise...
  14. GC error

    I have opened a bug on this issue. By the way, may I know how to manually remove the unwanted backups, aka the backup "garbage"? I tried to use prune to remove them, but the overall disk usage still remains the same... (a prune/GC sketch follows after this list)
  15. GC error

    Hi, the output is as below:
    root@pbs:~# proxmox-backup-client garbage-collect --repository vmbackup
    starting garbage collection on store vmbackup
    TASK ERROR: unable to get exclusive lock - EACCES: Permission denied
    root@pbs:~# ls -lha /vmbackup
    total 4.5K
    drwxr-xr-x 2 backup backup 0 Jul 16...
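
For the BlueStore slow-ops warning in item 1, a minimal sketch of the workaround quoted there, assuming the affected OSD id is read from the warning itself (osd.12 below is a placeholder):

    # Raise the slow-ops warning threshold for one OSD (placeholder id: osd.12)
    ceph config set osd.12 bluestore_slow_ops_warn_threshold 120

    # Confirm the override is in place
    ceph config get osd.12 bluestore_slow_ops_warn_threshold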
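
Item 3 notes that restarting the affected OSD temporarily clears the warning; a sketch assuming a systemd-managed OSD on the PVE node that hosts it (again, 12 is a placeholder id):

    # Run on the node that hosts the OSD named in the warning
    systemctl restart ceph-osd@12.service

    # Watch the cluster settle afterwards
    ceph -s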
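
For the rpool import failure in item 4, a diagnostic sketch, assuming a rescue or live environment whose OpenZFS build already knows vdev_zaps_v2 (OpenZFS 2.2 or newer); the pool and feature names come from the quoted error:

    # List importable pools and any feature complaints without importing anything
    zpool import

    # Import without mounting datasets, then check the feature state
    zpool import -N rpool
    zpool get feature@vdev_zaps_v2 rpool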
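
For the apply/commit latency comparison asked about in item 6, one standard way to read those numbers (generic Ceph commands, not taken from the quoted thread):

    # Per-OSD commit/apply latency in milliseconds
    ceph osd perf

    # Same data as JSON, handy for comparing before/after the Optane change
    ceph osd perf -f json-pretty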
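
Item 7 asks about rx_queue_size/tx_queue_size for KVM guests. A heavily hedged sketch of one way to experiment, assuming QEMU's virtio-net-pci rx_queue_size property and PVE's raw args option; VMID 101, the netdev id, and the 1024 ring size are placeholders, and the extra NIC bypasses the one Proxmox generates, so treat this as a test bench rather than a production config:

    # Attach an extra virtio NIC with a larger RX ring via raw QEMU arguments
    qm set 101 --args "-netdev user,id=testnet0 -device virtio-net-pci,netdev=testnet0,rx_queue_size=1024"

    # Inspect the full QEMU command line that will be used on next start
    qm showcmd 101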
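
For the standby-node question in items 8 and 9, a quick way to see what shutting one node down does to corosync quorum (standard pvecm commands, not taken from the quoted thread):

    # Check expected votes, total votes and quorum state before and after powering the node off
    pvecm status

    # List cluster members and their vote counts
    pvecm nodes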
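
Items 14 and 15 touch on how pruned backups actually free space. A sketch assuming the repository name vmbackup from item 15 and a hypothetical backup group vm/100: prune only drops snapshot indexes, and the chunk data is reclaimed by a later garbage collection:

    # Drop old snapshots of one group according to a keep policy (group name is a placeholder)
    proxmox-backup-client prune vm/100 --keep-last 3 --repository vmbackup

    # Reclaim the now-unreferenced chunks; on-disk usage only shrinks after this completes
    proxmox-backup-client garbage-collect --repository vmbackup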