Proxmox 9 eating all the memory (even without any VM on the node, no ZFS)

seb4itik
Sep 21, 2025
Hey everyone,

This is my first time posting here.

I'm facing a big and strange problem: a few weeks ago I installed a brand-new Proxmox VE 9 cluster (fresh install, no upgrade) of 3, then 4 nodes. There is no ZFS, but there is a Ceph cluster.
Three nodes have 128 GB of memory and 4 × 1 TB NVMe drives each; the fourth has 192 GB of memory and 2 × 1 TB NVMe drives.

I discovered that memory gets eaten up every day until the node hits OOM after 8 or 9 days, and this happens on every node.
I have checked everything and searched the Internet extensively. I found nothing.
Even a node running no VMs (only two stopped VMs) eats all its memory day by day, hour by hour.
The Linux buffers are not that big.
The OSDs use 2 to 4 GB each (see the check below).
The only way to get the memory back is to reboot the node.
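For reference, 2 to 4 GB per OSD is in line with the default BlueStore memory target of 4 GiB; it can be checked, and lowered if memory is tight, with ceph config (the 2 GiB value below is only an example):

Code:
# Current OSD memory target (the default is 4 GiB = 4294967296 bytes):
ceph config get osd osd_memory_target

# Optionally lower it cluster-wide, e.g. to 2 GiB (example value only):
ceph config set osd osd_memory_target 2147483648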

When I add up the processes' memory, the Linux buffers and the caches, the total is far below what the system reports as used.
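Here is how kernel-side allocations (which free counts as "used" but no process owns) can be inspected; if a kernel driver is leaking, SUnreclaim or one of the slab caches should keep growing between checks (a rough diagnostic, not a definitive one):

Code:
# Kernel allocations that "free" reports as used but that no process owns:
grep -E 'Slab|SReclaimable|SUnreclaim|PageTables|VmallocUsed|Percpu' /proc/meminfo

# Largest slab caches by size; re-run a few hours apart to see which one grows:
slabtop -o -s c | head -n 15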

I use the PVE firewall, PVE SDN and Ceph, but they all seem to use a normal amount of memory.

I don't use HA or ZFS.

Thanks for any help.

Code:
root@pve3:~# free -h
               total        used        free      shared  buff/cache   available
Mem:           125Gi        69Gi        41Gi        67Mi        15Gi        55Gi
Swap:          2.0Gi          0B       2.0Gi


root@pve3:~# ps faux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
.../...
root           1  0.9  0.0  27360 17760 ?        Ss   Sep15  56:18 /sbin/init
root         483  0.0  0.1 252568 174060 ?       Ss   Sep15   0:34 /usr/lib/systemd/systemd-journald
root         532  0.0  0.0  36660 10716 ?        Ss   Sep15   0:00 /usr/lib/systemd/systemd-udevd
root         680  0.0  0.0   7508  4172 ?        Ss   Sep15   0:02 /usr/sbin/mdadm --monitor --scan
_rpc         906  0.0  0.0   6564  3780 ?        Ss   Sep15   0:00 /usr/sbin/rpcbind -f -w
root         940  0.0  0.0   5256  1456 ?        Ss   Sep15   0:00 /usr/sbin/blkmapd
ceph        1065  0.0  0.0  22272 13968 ?        Ss   Sep15   0:00 /usr/bin/python3 /usr/bin/ceph-crash
root        1067  0.0  0.0   8548  4240 ?        Ss   Sep15   0:17 /bin/bash /usr/bin/check_mk_agent
root     2634788  0.0  0.0   6012  1900 ?        S    07:54   0:00  \_ sleep 60
cmk-age+    1087  0.0  0.0  19384  5808 ?        Ssl  Sep15   0:54 /usr/bin/cmk-agent-ctl daemon
_chrony     1093  0.0  0.0  19916  3560 ?        S    Sep15   0:01 /usr/sbin/chronyd -F 1
_chrony     1094  0.0  0.0  11760  3264 ?        S    Sep15   0:00  \_ /usr/sbin/chronyd -F 1
message+    1099  0.0  0.0   8220  4264 ?        Ss   Sep15   0:18 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root        1132  0.0  0.0   5516  2428 ?        Ss   Sep15   0:00 /usr/libexec/lxc/lxc-monitord --daemon
root        1141  0.0  0.0 276132  3616 ?        Ssl  Sep15   0:00 /usr/lib/x86_64-linux-gnu/pve-lxc-syscalld/pve-lxc-syscalld --system /run/pve-lxc-syscalld/socket
root        1183  0.0  0.0 617828  8084 ?        Ssl  Sep15   1:01 /usr/bin/rrdcached -g
root        1199  0.0  0.0   9688  5640 ?        Ss   Sep15   0:00 /usr/sbin/smartd -n -q never
root        1212  0.1  0.0 658484 72964 ?        Ssl  Sep15   6:41 /usr/bin/pmxcfs
root        1228  0.0  0.0  18616  8756 ?        Ss   Sep15   0:11 /usr/lib/systemd/systemd-logind
root        1235  0.0  0.0   2432  1256 ?        Ss   Sep15   0:08 /usr/sbin/watchdog-mux
root        1279  0.0  0.0 104396  6144 ?        Ssl  Sep15   0:00 /usr/sbin/zed -F
root        1288  0.0  0.0 159064  2392 ?        Ssl  Sep15   0:00 /usr/bin/lxcfs /var/lib/lxcfs
root        1294  0.0  0.0  11768  8000 ?        Ss   Sep15   0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
root     2634953  0.0  0.0  19340 12296 ?        Ss   07:55   0:00  \_ sshd-session: root [priv]
root     2635015  0.1  0.0  19464  7000 ?        S    07:55   0:00      \_ sshd-session: root@pts/0
root     2635034  0.0  0.0   8904  5632 pts/0    Ss   07:55   0:00          \_ -bash
root     2635641  0.0  0.0   6536  3756 pts/0    R+   07:55   0:00              \_ ps faux
root        1320  0.0  0.0   2572  1252 ?        S<   Sep15   0:47 /usr/sbin/atopacctd
root        1337  0.0  0.0  39856 21920 ?        Ss   Sep15   0:19 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
root        1347  0.0  0.0   8160  2656 tty1     Ss+  Sep15   0:00 /sbin/agetty -o -- \u --noreset --noclear - linux
root        1348  0.0  0.0   8204  2760 ttyS0    Ss+  Sep15   0:00 /sbin/agetty -o -- \u --noreset --noclear --keep-baud 115200,57600,38400,9600 - vt220
root        1368  0.0  0.0  43984  4488 ?        Ss   Sep15   0:00 /usr/lib/postfix/sbin/master -w
postfix     1370  0.0  0.0  44060  7424 ?        S    Sep15   0:00  \_ qmgr -l -t unix -u
postfix   994423  0.0  0.0  52900 13532 ?        S    Sep16   0:00  \_ tlsmgr -l -t unix -u -c
postfix  2606784  0.0  0.0  44008  7252 ?        S    07:25   0:00  \_ pickup -l -t unix -u -c
ceph        1376  0.1  0.0 315764 53684 ?        Ssl  Sep15   7:00 /usr/bin/ceph-mds -f --cluster ceph --id pve3 --setuser ceph --setgroup ceph
ceph        1377  0.4  0.4 860148 628392 ?       Ssl  Sep15  27:08 /usr/bin/ceph-mon -f --cluster ceph --id pve3 --setuser ceph --setgroup ceph
root        1445  0.7  0.1 577396 184580 ?       SLsl Sep15  44:16 /usr/sbin/corosync -f
root        1449  0.0  0.0   7268  2676 ?        Ss   Sep15   0:01 /usr/sbin/cron -f
root        1452  0.0  0.0  13808  6228 ?        Ss   Sep15   0:19 /usr/libexec/proxmox/proxmox-firewall start
root        1669  0.3  0.1 174868 137596 ?       Ss   Sep15  19:01 pvestatd
root        1671  0.4  0.0 177308 111988 ?       Ss   Sep15  27:01 pve-firewall
root        1983  0.0  0.1 216964 175344 ?       Ss   Sep15   0:04 pvedaemon
root     2600969  0.0  0.1 228092 157576 ?       S    07:19   0:00  \_ pvedaemon worker
root     2600970  0.0  0.1 228160 157396 ?       S    07:19   0:00  \_ pvedaemon worker
root     2600971  0.0  0.1 228272 157772 ?       S    07:19   0:00  \_ pvedaemon worker
ceph        2268  4.1  2.0 4166004 2718636 ?     Ssl  Sep15 238:55 /usr/bin/ceph-osd -f --cluster ceph --id 10 --setuser ceph --setgroup ceph
ceph        2269  3.7  2.1 4120872 2784396 ?     Ssl  Sep15 217:36 /usr/bin/ceph-osd -f --cluster ceph --id 11 --setuser ceph --setgroup ceph
ceph        2270  3.5  1.9 3871128 2513548 ?     Ssl  Sep15 205:22 /usr/bin/ceph-osd -f --cluster ceph --id 8 --setuser ceph --setgroup ceph
ceph        2304  4.3  1.9 4033356 2627164 ?     Ssl  Sep15 249:24 /usr/bin/ceph-osd -f --cluster ceph --id 9 --setuser ceph --setgroup ceph
www-data    2774  0.0  0.1 218180 176748 ?       Ss   Sep15   0:06 pveproxy
www-data 2600981  0.0  0.1 227452 160268 ?       S    07:19   0:00  \_ pveproxy worker
www-data 2600982  0.0  0.1 226904 158236 ?       S    07:19   0:00  \_ pveproxy worker
www-data 2600983  0.0  0.1 227408 160572 ?       S    07:19   0:00  \_ pveproxy worker
www-data    2821  0.0  0.0  91768 67652 ?        Ss   Sep15   0:04 spiceproxy
www-data 2600980  0.0  0.0  92052 56224 ?        S    07:19   0:00  \_ spiceproxy worker
root        3136  0.0  0.1 199316 150768 ?       Ss   Sep15   0:27 pvescheduler
root     2185665  0.0  0.0  44012 43500 ?        S<Ls 00:00   0:03 /usr/bin/atop -w /var/log/atop/atop_20250919 600
root     2207239  0.0  0.0  79612  2380 ?        Ssl  00:22   0:02 /usr/sbin/pvefw-logger
root     2600907  0.0  0.0   5736  1408 ?        Ss   07:19   0:00 /usr/sbin/qmeventd /var/run/qmeventd.sock
root     2601001  0.0  0.0 202732 119756 ?       Ss   07:19   0:00 pve-ha-lrm
root     2601041  0.0  0.0 203476 120808 ?       Ss   07:19   0:00 pve-ha-crm
root     2634958  0.4  0.0  22688 13000 ?        Ss   07:55   0:00 /usr/lib/systemd/systemd --user
root     2634960  0.0  0.0  22384  3412 ?        S    07:55   0:00  \_ (sd-pam)
 

Attachments

  • Capture d’écran 2025-09-20 à 23.37.16.png
  • Capture d’écran 2025-09-19 à 06.57.02b.jpg
Are you using Intel 810 cards? There are similar reports about memory leaks in combination with Intel 810 NICs.
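If so, it may be worth checking which driver and firmware version the cards run, and whether unreclaimable kernel memory keeps growing; a quick check (replace eno1 with the actual interface name, it is only an example here):

Code:
# Driver name/version and NIC firmware (the E810 uses the "ice" driver):
ethtool -i eno1

# Unreclaimable kernel memory; a steady climb with no matching process
# growth points at a kernel/driver leak:
grep SUnreclaim /proc/meminfo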
 
Thanks.
You are incredible:

Code:
root@pve3:~# lspci |grep Intel
0a:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
0a:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)

Is there any workaround?
 