Recent content by k4y53r

  1.

    Strange Ceph behavior

    Something similar happens to me. No answer yet, and the MDS and CephFS keep failing most weekends.
  2.

    Ceph MDS OOM killed on weekends

    Hi, things seem to be getting worse: 3 MDS errors in the last 24 h, the latest a couple of minutes ago. The MDS gets stuck in clientreplay state and logs errors to syslog (I cannot paste the full log due to post limits, see the attached txt file): May 12 14:29:19 zpveo2 ceph-mds[2264075]: ./src/mds/CDentry.h: In function...
  3.

    Ceph MDS OOM killed on weekends

    Hi again, another MDS error today, not on a weekend this time. One MDS stays frozen while still reporting active status, with almost no requests processed: ceph fs status zk8scephfso1 - 78 clients ============ RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS 0 active zpveo3 Reqs: 4 /s 83.7k...
  4.

    Ceph MDS OOM killed on weekends

    Hi, no ZFS and no swap in use at all. It doesn't seem to be about how much RAM is available: today the MDS failed on a node with 192 GB of RAM, as you can see below. I checked the logs and something breaks at midnight: -- Journal begins at Fri 2022-10-28 12:10:38 CEST, ends at Tue 2025-04-29 23:49:34 CEST. -- Apr 26...
  5.

    Ceph MDS OOM killed on weekends

    Hi, I have a 4-node PVE cluster with CephFS deployed, and for the past couple of months I have been getting MDS OOM kills. Sometimes the MDS is redeployed on another node and gets stuck in clientreplay state, so I need to restart that MDS again to regain access to CephFS from all clients. Checked scheduled jobs or...
  6.

    [TUTORIAL] HOWTO : Wrapper Script to Use Fedora CoreOS Ignition with Proxmox cloud-init system for Docker workloads

    Hi, I was able to deploy the Fedora CoreOS template using the geco-it scripts on a 4-node cluster with shared storage (Ceph for block devices and NFS for snippets), but I cannot live-migrate a VM due to a hookscript error during start of the virtual machine on the target node: TASK ERROR: hookscript error for <VM_ID> on...