ceph-osd OOM

Fug1

I have a 3-node PVE 7.4-18 cluster running Ceph 15.2.17. There is one OSD per node, so pretty simple. I'm using 3 replicas, so the data should basically be mirrored across all OSDs in the cluster.

Everything has been running fine for months, but I've suddenly lost the ability to get my OSDs up and running.

The ceph-osd on each node keeps crashing on startup, and it looks like it's being killed by the Linux OOM killer:

[ 4530.421204] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=system-ceph\x2dosd.slice,mems_allowed=0,global_oom,task_memcg=/system.slice/system-ceph\x2dosd.slice/ceph-osd@4.service,task=ceph-osd,pid=37704,uid=64045
[ 4530.421315] Out of memory: Killed process 37704 (ceph-osd) total-vm:39459496kB, anon-rss:31373092kB, file-rss:756kB, shmem-rss:0kB, UID:64045 pgtables:76112kB oom_score_adj:0

In the ceph-osd log, the last message before the crash is like this:

2025-01-07T14:35:42.066-0500 7f3bf2418d80 0 osd.4 36581 load_pgs

This seemed to come on suddenly and it's affecting all OSDs in the cluster. So I guess it must be something to do with the data, and I wonder if there's a way to recover it.

I found this article and wonder if I should go through this process, but wanted to find out if anyone had experienced anything similar.

https://www.croit.io/blog/how-to-solve-the-oom-killer-process-from-killing-your-osds

Happy to provide any other detail, but I'm not sure what else would be helpful.

TIA!
 
A couple of additional data points:

Two of the nodes have 32GB of memory, the other has 64GB of memory. All three nodes are experiencing the ceph-osd OOM issue.

osd_memory_target for the OSDs appears to be the default 4 GiB:

ceph config get osd osd_memory_target
4294967296
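
For what it's worth, my understanding is that osd_memory_target mainly steers BlueStore cache autotuning, so it may not cap memory used while replaying pg logs during load_pgs. Lowering it temporarily is easy to try, though; a sketch using the standard ceph config commands (revert afterwards):

# Quick test (sketch only): lower the OSD memory target cluster-wide to 2 GiB.
# Note: this only steers BlueStore cache autotuning; it may not help if the
# memory is being consumed by pg_log replay during load_pgs.
ceph config set osd osd_memory_target 2147483648

# Confirm the stored value
ceph config get osd osd_memory_target

# Revert to the default once things are healthy again
ceph config set osd osd_memory_target 4294967296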
 
Ceph version 15 is already quite a few years old. I do remember that there used to be an occasional issue with OOM, but it has been too long for me to remember any details.

I found this article and wonder if I should go through this process, but wanted to find out if anyone had experienced anything similar.
Doesn't hurt to test it on one of the OSDs.
 
Yes, I really need to upgrade but can't do that while the cluster is unhealthy.

I tried to go through the process documented on that webpage, but it's lacking some detail.

The command:

while read pg; do echo $pg; ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-{OSD-ID} --op log --pgid $pg > pglog.json; jq '(.pg_log_t.log|length),(.pg_log_t.dups|length)' < pglog.json; done < /root/osd.{OSD-ID}.pgs.txt 2>&1 | tee dups.log
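
For reference, the /root/osd.{OSD-ID}.pgs.txt file that the loop reads has to be generated first, with the OSD stopped; something like this should produce it (list-pgs is a standard ceph-objectstore-tool op):

# Stop the OSD first; ceph-objectstore-tool needs exclusive access to the data path.
systemctl stop ceph-osd@{OSD-ID}

# Dump the list of PGs stored on this OSD into the file the loop reads.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-{OSD-ID} \
    --op list-pgs > /root/osd.{OSD-ID}.pgs.txt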

creates a file called dups.log that contains, for each pg, the pg id followed by two numbers: the first appears to be the number of pg log entries and the second the number of duplicate log entries.

In my case, it output the below information for pg_id 18.10:
18.10
2363
9486541

That suggests pg 18.10 has 2363 log entries and 9486541 duplicate log entries. Other pgs also have high duplicate counts, but this one is the highest.
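
Since dups.log ends up with three lines per pg (the pg id, the log length, and the dups length), a quick way to rank the worst offenders is something like:

# Collapse each three-line group (pgid, log entries, dup entries) onto one
# tab-separated line, then sort numerically by the dup count, highest first.
paste - - - < dups.log | sort -k3,3nr | head -20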

The next step in the process is to trim the duplicate log entries with the command:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-{OSD-ID} --op trim-pg-log-dups --pgid {PG-ID} --osd_max_pg_log_entries=100 --osd_pg_log_dups_tracked=100 --osd_pg_log_trim_max=500000
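
For example, substituting my values (osd.4, pg 18.10), the command would presumably look like this, again with the OSD stopped:

# Hypothetical substitution of my OSD id and the worst pg into the article's command.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 \
    --op trim-pg-log-dups --pgid 18.10 \
    --osd_max_pg_log_entries=100 --osd_pg_log_dups_tracked=100 \
    --osd_pg_log_trim_max=500000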

But it doesn't really indicate what an acceptable number of duplicates would be. Should I just run it on one pg at a time, starting with the one with the most duplicates, and see if the OSD is able to start after trimming each pg? Or should there normally be 0 duplicates and I need to trim any pg with any duplicates?
 
Apparently Ceph Quincy has a log entry that suggests running this command if the number of duplicates exceeds 6,000.

 
I ran the trim on all PGs in all OSDs where the duplicate entries were greater than 6,000. That seems to have done the trick: my OSDs can now start and my cluster is healthy.
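
In case it helps anyone else, the per-OSD trimming can be scripted along these lines (a rough sketch only; it assumes the three-line dups.log produced above, a stopped OSD, and the hypothetical OSD_ID variable below):

# Rough sketch: trim every pg on this OSD whose dup count exceeds 6000.
# Assumes dups.log was produced by the earlier loop and the OSD is stopped.
OSD_ID=4
paste - - - < dups.log | while read -r pg entries dups; do
    if [ "$dups" -gt 6000 ]; then
        echo "trimming $pg ($dups dups)"
        ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-$OSD_ID \
            --op trim-pg-log-dups --pgid "$pg" \
            --osd_max_pg_log_entries=100 --osd_pg_log_dups_tracked=100 \
            --osd_pg_log_trim_max=500000
    fi
done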
 
