Memory Leak

smyers119

Member
Jan 15, 2021
I have a hyperconverged setup with 3 nodes, all running Debian 10 on the same hardware. There appears to be a memory leak on all of the nodes, and I could use help tracking it down. Here is some pertinent information:

Package Information:
Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 14.2.15-pve3
ceph-fuse: 14.2.15-pve3
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
Memory leak (the dips are restarts): memleakpve2.PNG
 
Processes by memory:
Code:
root@pve2:~# ps auwx --sort rss
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           2  0.0  0.0      0     0 ?        S    Jan04   0:00 [kthreadd]
root           3  0.0  0.0      0     0 ?        I<   Jan04   0:00 [rcu_gp]
root           4  0.0  0.0      0     0 ?        I<   Jan04   0:00 [rcu_par_gp]
root           6  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kworker/0:0H-kblockd]
root           9  0.0  0.0      0     0 ?        I<   Jan04   0:00 [mm_percpu_wq]
root          10  0.0  0.0      0     0 ?        S    Jan04   0:12 [ksoftirqd/0]
root          11  0.0  0.0      0     0 ?        I    Jan04   4:18 [rcu_sched]
root          12  0.0  0.0      0     0 ?        S    Jan04   0:01 [migration/0]
root          13  0.0  0.0      0     0 ?        S    Jan04   0:00 [idle_inject/0]
root          14  0.0  0.0      0     0 ?        S    Jan04   0:00 [cpuhp/0]
root          15  0.0  0.0      0     0 ?        S    Jan04   0:00 [cpuhp/1]
root          16  0.0  0.0      0     0 ?        S    Jan04   0:00 [idle_inject/1]
root          17  0.0  0.0      0     0 ?        S    Jan04   0:01 [migration/1]
root          18  0.0  0.0      0     0 ?        S    Jan04   0:16 [ksoftirqd/1]
root          20  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kworker/1:0H-kblockd]
root          21  0.0  0.0      0     0 ?        S    Jan04   0:00 [cpuhp/2]
root          22  0.0  0.0      0     0 ?        S    Jan04   0:00 [idle_inject/2]
root          23  0.0  0.0      0     0 ?        S    Jan04   0:01 [migration/2]
root          24  0.0  0.0      0     0 ?        S    Jan04   0:11 [ksoftirqd/2]
root          26  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kworker/2:0H-kblockd]
root          27  0.0  0.0      0     0 ?        S    Jan04   0:00 [cpuhp/3]
root          28  0.0  0.0      0     0 ?        S    Jan04   0:00 [idle_inject/3]
root          29  0.0  0.0      0     0 ?        S    Jan04   0:01 [migration/3]
root          30  0.0  0.0      0     0 ?        S    Jan04   0:12 [ksoftirqd/3]
root          32  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kworker/3:0H-kblockd]
root          33  0.0  0.0      0     0 ?        S    Jan04   0:00 [cpuhp/4]
root          34  0.0  0.0      0     0 ?        S    Jan04   0:00 [idle_inject/4]
root          35  0.0  0.0      0     0 ?        S    Jan04   0:01 [migration/4]
root          36  0.0  0.0      0     0 ?        S    Jan04   0:12 [ksoftirqd/4]
root          38  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kworker/4:0H-kblockd]
root          39  0.0  0.0      0     0 ?        S    Jan04   0:00 [cpuhp/5]
root          40  0.0  0.0      0     0 ?        S    Jan04   0:00 [idle_inject/5]
root          41  0.0  0.0      0     0 ?        S    Jan04   0:01 [migration/5]
root          42  0.0  0.0      0     0 ?        S    Jan04   0:10 [ksoftirqd/5]
root          44  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kworker/5:0H-kblockd]
root          45  0.0  0.0      0     0 ?        S    Jan04   0:00 [kdevtmpfs]
root          46  0.0  0.0      0     0 ?        I<   Jan04   0:00 [netns]
root          47  0.0  0.0      0     0 ?        S    Jan04   0:00 [rcu_tasks_kthre]
root          48  0.0  0.0      0     0 ?        S    Jan04   0:00 [kauditd]
root          50  0.0  0.0      0     0 ?        S    Jan04   0:00 [khungtaskd]
root          51  0.0  0.0      0     0 ?        S    Jan04   0:00 [oom_reaper]
root          52  0.0  0.0      0     0 ?        I<   Jan04   0:00 [writeback]
root          53  0.0  0.0      0     0 ?        S    Jan04   0:00 [kcompactd0]
root          54  0.0  0.0      0     0 ?        SN   Jan04   0:00 [ksmd]
root          55  0.0  0.0      0     0 ?        SN   Jan04   0:00 [khugepaged]
root         101  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kintegrityd]
root         102  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kblockd]
root         103  0.0  0.0      0     0 ?        I<   Jan04   0:00 [blkcg_punt_bio]
root         106  0.0  0.0      0     0 ?        I<   Jan04   0:00 [tpm_dev_wq]
root         107  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ata_sff]
root         108  0.0  0.0      0     0 ?        I<   Jan04   0:00 [md]
root         109  0.0  0.0      0     0 ?        I<   Jan04   0:00 [edac-poller]
root         110  0.0  0.0      0     0 ?        I<   Jan04   0:00 [devfreq_wq]
root         111  0.0  0.0      0     0 ?        S    Jan04   0:00 [watchdogd]
root         114  0.0  0.0      0     0 ?        S    Jan04   0:00 [kswapd0]
root         115  0.0  0.0      0     0 ?        S    Jan04   0:00 [ecryptfs-kthrea]
root         117  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kthrotld]
root         120  0.0  0.0      0     0 ?        I<   Jan04   0:00 [acpi_thermal_pm]
root         121  0.0  0.0      0     0 ?        I<   Jan04   0:00 [nvme-wq]
root         122  0.0  0.0      0     0 ?        I<   Jan04   0:00 [nvme-reset-wq]
root         123  0.0  0.0      0     0 ?        I<   Jan04   0:00 [nvme-delete-wq]
root         124  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ipv6_addrconf]
root         133  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kstrp]
root         134  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kworker/u13:0]
root         149  0.0  0.0      0     0 ?        I<   Jan04   0:00 [charger_manager]
root         206  0.0  0.0      0     0 ?        I<   Jan04   0:00 [bnx2x]
root         207  0.0  0.0      0     0 ?        I<   Jan04   0:00 [bnx2x_iov]
root         220  0.0  0.0      0     0 ?        S    Jan04   0:00 [scsi_eh_0]
root         221  0.0  0.0      0     0 ?        I<   Jan04   0:00 [scsi_tmf_0]
root         222  0.0  0.0      0     0 ?        S    Jan04   0:00 [scsi_eh_1]
root         223  0.0  0.0      0     0 ?        I<   Jan04   0:00 [scsi_tmf_1]
root         224  0.0  0.0      0     0 ?        S    Jan04   0:00 [scsi_eh_2]
root         225  0.0  0.0      0     0 ?        I<   Jan04   0:00 [scsi_tmf_2]
root         226  0.0  0.0      0     0 ?        S    Jan04   0:00 [scsi_eh_3]
root         227  0.0  0.0      0     0 ?        I<   Jan04   0:00 [scsi_tmf_3]
root         228  0.0  0.0      0     0 ?        S    Jan04   0:00 [scsi_eh_4]
root         229  0.0  0.0      0     0 ?        I<   Jan04   0:00 [scsi_tmf_4]
root         230  0.0  0.0      0     0 ?        S    Jan04   0:00 [scsi_eh_5]
root         231  0.0  0.0      0     0 ?        I<   Jan04   0:00 [scsi_tmf_5]
root         232  0.0  0.0      0     0 ?        S    Jan04   0:00 [scsi_eh_6]
root         233  0.0  0.0      0     0 ?        I<   Jan04   0:00 [scsi_tmf_6]
root         234  0.0  0.0      0     0 ?        S    Jan04   0:00 [scsi_eh_7]
root         235  0.0  0.0      0     0 ?        I<   Jan04   0:00 [scsi_tmf_7]
root         272  0.0  0.0      0     0 ?        I<   Jan04   0:05 [kworker/0:1H-kblockd]
root         273  0.0  0.0      0     0 ?        I<   Jan04   0:04 [kworker/2:1H-kblockd]
root         274  0.0  0.0      0     0 ?        I<   Jan04   0:05 [kworker/5:1H-kblockd]
root         300  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kdmflush]
root         306  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kdmflush]
root         312  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kdmflush]
root         318  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kdmflush]
root         320  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kdmflush]
root         325  0.0  0.0      0     0 ?        I<   Jan04   0:00 [dm_bufio_cache]
root         329  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kdmflush]
root         330  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kdmflush]
root         347  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kdmflush]
root         349  0.0  0.0      0     0 ?        I<   Jan04   0:00 [kcopyd]
root         350  0.0  0.0      0     0 ?        I<   Jan04   0:00 [dm-thin]
root         375  0.0  0.0      0     0 ?        I<   Jan04   0:04 [kworker/1:1H-kblockd]
root         376  0.0  0.0      0     0 ?        S    Jan04   3:06 [jbd2/dm-4-8]
root         377  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ext4-rsv-conver]
root         378  0.0  0.0      0     0 ?        I<   Jan04   0:42 [kworker/4:1H-kblockd]
root         380  0.0  0.0      0     0 ?        I<   Jan04   0:03 [kworker/3:1H-kblockd]
root         436  0.0  0.0      0     0 ?        I<   Jan04   0:00 [iscsi_eh]
root         441  0.0  0.0      0     0 ?        I<   Jan04   0:00 [rpciod]
root         442  0.0  0.0      0     0 ?        I<   Jan04   0:00 [xprtiod]
root         445  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ib-comp-wq]
root         446  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ib-comp-unb-wq]
root         447  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ib_mcast]
root         448  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ib_nl_sa_wq]
root         449  0.0  0.0      0     0 ?        I<   Jan04   0:00 [rdma_cm]
root         592  0.0  0.0      0     0 ?        S<   Jan04   0:00 [spl_system_task]
root         593  0.0  0.0      0     0 ?        S<   Jan04   0:00 [spl_delay_taskq]
root         594  0.0  0.0      0     0 ?        S<   Jan04   0:00 [spl_dynamic_tas]
root         595  0.0  0.0      0     0 ?        S<   Jan04   0:00 [spl_kmem_cache]
root         633  0.0  0.0      0     0 ?        I<   Jan04   0:00 [cryptd]
root         653  0.0  0.0      0     0 ?        S    Jan04   0:00 [irq/144-mei_me]
root         655  0.0  0.0      0     0 ?        S    Jan04   0:00 [irq/145-mei_me]
root         659  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ttm_swap]
root         688  0.0  0.0      0     0 ?        S<   Jan04   0:00 [zvol]
root         692  0.0  0.0      0     0 ?        S    Jan04   0:00 [arc_prune]
root         693  0.0  0.0      0     0 ?        SN   Jan04   0:00 [zthr_procedure]
root         694  0.0  0.0      0     0 ?        SN   Jan04   0:06 [zthr_procedure]
root         695  0.0  0.0      0     0 ?        S    Jan04   0:00 [dbu_evict]
root         696  0.0  0.0      0     0 ?        SN   Jan04   0:06 [dbuf_evict]
root         719  0.0  0.0      0     0 ?        SN   Jan04   0:00 [z_vdev_file]
root         720  0.0  0.0      0     0 ?        S    Jan04   0:05 [l2arc_feed]
root        1875  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ceph-msgr]
root        1880  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ceph-watch-noti]
root        1881  0.0  0.0      0     0 ?        I<   Jan04   0:00 [ceph-completion]
root        2514  0.0  0.0      0     0 ?        S    Jan04   0:06 [kvm-nx-lpage-re]
root        2563  0.0  0.0      0     0 ?        S    Jan04   7:48 [kvm-pit/2490]
 
Processes by memory consumption (part 2 of 2):
Code:
root      241086  0.0  0.0      0     0 ?        I    08:16   0:00 [kworker/u12:3-dm-thin]
root      243602  0.0  0.0      0     0 ?        I    08:25   0:00 [kworker/u12:2-events_unbound]
root      247412  0.0  0.0      0     0 ?        I    08:39   0:00 [kworker/5:1-cgroup_destroy]
root      250196  0.0  0.0      0     0 ?        I    08:49   0:00 [kworker/1:0-mm_percpu_wq]
root      255217  0.0  0.0      0     0 ?        I    09:07   0:00 [kworker/0:3-ceph-msgr]
root      255495  0.0  0.0      0     0 ?        I    09:08   0:00 [kworker/3:1-events]
root      257332  0.0  0.0      0     0 ?        I    09:13   0:00 [kworker/5:3-events]
root      257509  0.0  0.0      0     0 ?        I    09:13   0:00 [kworker/u12:0-dm-thin]
root      257611  0.0  0.0      0     0 ?        I    09:14   0:00 [kworker/0:0-events]
root      257893  0.0  0.0      0     0 ?        I    09:15   0:00 [kworker/2:3-memcg_kmem_cache]
root      258452  0.0  0.0      0     0 ?        I    09:17   0:00 [kworker/3:2-cgroup_destroy]
root      259008  0.0  0.0      0     0 ?        I    09:19   0:00 [kworker/4:1-events]
root      259009  0.0  0.0      0     0 ?        I    09:19   0:00 [kworker/4:3-cgroup_destroy]
root      259290  0.0  0.0      0     0 ?        I    09:20   0:00 [kworker/1:2]
root      259571  0.0  0.0      0     0 ?        I    09:21   0:00 [kworker/2:2-events]
root      260405  0.0  0.0      0     0 ?        I    09:24   0:00 [kworker/3:0-events]
root      260685  0.0  0.0      0     0 ?        I    09:25   0:00 [kworker/4:0-cgroup_destroy]
root      260686  0.0  0.0      0     0 ?        I    09:25   0:00 [kworker/4:2-events]
root      260965  0.0  0.0      0     0 ?        I    09:26   0:00 [kworker/0:1-events]
root      261243  0.0  0.0      0     0 ?        I    09:27   0:00 [kworker/5:0-events]
root      261244  0.0  0.0      0     0 ?        I    09:27   0:00 [kworker/5:2-cgroup_destroy]
root      261673  0.0  0.0      0     0 ?        I    09:28   0:00 [kworker/u12:1-dm-thin]
root      261689  0.0  0.0      0     0 ?        I    09:28   0:00 [kworker/2:0-events]
root      261702  0.0  0.0      0     0 ?        I    09:28   0:00 [kworker/0:2-cgroup_destroy]
root         752  0.0  0.0   4088   136 ?        Ss   Jan04   0:00 /usr/sbin/qmeventd /var/run/qmeventd.sock
root         898  0.0  0.0   6888   300 ?        Ss   Jan04   0:09 /sbin/iscsid
root      261470  0.0  0.0   5256   692 ?        S    09:27   0:00 sleep 60
root         732  0.0  0.0   2140   748 ?        Ss   Jan04   0:16 /usr/sbin/watchdog-mux
root         907  0.0  0.0   2272   748 ?        S    Jan04   0:39 bpfilter_umh
root         879  0.0  0.0   3816  1192 ?        Ss   Jan04   0:00 /usr/lib/x86_64-linux-gnu/lxc/lxc-monitord --daemon
root         935  0.0  0.0   5608  1560 tty1     Ss+  Jan04   0:00 /sbin/agetty -o -p -- \u --noclear tty1 linux
root         733  0.0  0.0  85332  2000 ?        Ssl  Jan04   0:00 /usr/bin/lxcfs /var/lib/lxcfs
root      102249  0.0  0.0  86172  2328 ?        Ssl  00:00   0:01 /usr/sbin/pvefw-logger
root         784  0.0  0.0   6724  2620 ?        S    Jan04   0:09 /bin/bash /usr/sbin/ksmtuned
root        1130  0.0  0.0   8500  2808 ?        Ss   Jan04   0:01 /usr/sbin/cron -f
root         740  0.0  0.0 411024  3048 ?        Ssl  Jan04   0:00 /usr/lib/x86_64-linux-gnu/pve-lxc-syscalld/pve-lxc-syscalld --system /run/pv
root      261678  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261679  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261680  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261681  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261682  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261683  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261686  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261687  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261688  0.0  0.0  23492  3080 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261677  0.0  0.0  23492  3228 ?        S    09:28   0:00 /lib/systemd/systemd-udevd
root      261703  0.0  0.0   6920  3424 pts/0    Ss   09:28   0:00 /bin/login -f
root      261712  0.0  0.0  10960  3436 pts/0    R+   09:28   0:00 ps auwx --sort rss
root         998  0.0  0.0 511564  3696 ?        Ssl  Jan04   2:07 /usr/bin/rrdcached -B -b /var/lib/rrdcached/db/ -j /var/lib/rrdcached/journa
_rpc         712  0.0  0.0   6820  3732 ?        Ss   Jan04   0:00 /sbin/rpcbind -f -w
message+     737  0.0  0.0   9108  4416 ?        Ss   Jan04   0:11 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --syst
root      261708  0.3  0.0   7780  4492 pts/0    S    09:28   0:00 -bash
root         734  0.0  0.0 166756  4652 ?        Ssl  Jan04   0:00 /usr/sbin/zed -F
root        1089  0.0  0.0  43472  4848 ?        Ss   Jan04   0:01 /usr/lib/postfix/sbin/master -w
root         735  0.0  0.0 225820  4896 ?        Ssl  Jan04   0:05 /usr/sbin/rsyslogd -n -iNONE
root         899  0.0  0.0   7392  5020 ?        S<Ls Jan04   0:00 /sbin/iscsid
root      261685  0.0  0.0 174260  5480 ?        S    09:28   0:00 (sd-pam)
root         450  0.0  0.0  23492  5840 ?        Ss   Jan04   0:56 /lib/systemd/systemd-udevd
root         743  0.0  0.0  12644  6236 ?        Ss   Jan04   0:00 /usr/sbin/smartd -n
systemd+     710  0.0  0.0  93080  6416 ?        Ssl  Jan04   0:00 /lib/systemd/systemd-timesyncd
root         949  0.0  0.0  15848  7072 ?        Ss   Jan04   0:00 /usr/sbin/sshd -D
root         739  0.0  0.0  19516  7344 ?        Ss   Jan04   0:06 /lib/systemd/systemd-logind
postfix   250067  0.0  0.0  43828  7912 ?        S    08:48   0:00 pickup -l -t unix -u -c
postfix     1091  0.0  0.0  43876  8008 ?        S    Jan04   0:00 qmgr -l -t unix -u
root      261674  0.0  0.0  16896  8228 ?        Ss   09:28   0:00 sshd: root@pts/0
root      261684  0.6  0.0  21404  8840 ?        Ss   09:28   0:00 /lib/systemd/systemd --user
root         729  0.0  0.0  21540 11000 ?        Ss   Jan04   0:00 /usr/bin/python2.7 /usr/bin/ceph-crash
root           1  0.0  0.0 173296 13092 ?        Ss   Jan04   0:45 /sbin/init
root         443  0.0  0.1  81052 23700 ?        SLsl Jan04   4:21 /sbin/dmeventd -f
ceph        1128  0.0  0.2 315944 40696 ?        Ssl  Jan04   1:47 /usr/bin/ceph-mds -f --cluster ceph --id pve2 --setuser ceph --setgroup ceph
root         429  0.0  0.3  95564 49328 ?        Ss   Jan04   0:13 /lib/systemd/systemd-journald
www-data  276133  0.0  0.3  70508 52364 ?        S    Jan05   0:14 spiceproxy worker
www-data    1570  0.0  0.3  70372 57896 ?        Ss   Jan04   0:06 spiceproxy
root        1093  0.1  0.3 697196 63048 ?        Ssl  Jan04  30:55 /usr/bin/pmxcfs
root        1391  0.1  0.5 274884 89044 ?        Ss   Jan04  21:06 pve-firewall
root        1392  0.2  0.5 273180 89576 ?        Ss   Jan04  36:12 pvestatd
root        1479  0.0  0.6 338016 101588 ?       Ss   Jan04   3:34 pve-ha-crm
root        1589  0.0  0.6 337644 101708 ?       Ss   Jan04   3:48 pve-ha-lrm
root        1471  0.0  0.7 354668 121684 ?       Ss   Jan04   0:05 pvedaemon
root        1472  0.0  0.7 363236 129580 ?       S    Jan04   0:14 pvedaemon worker
root        1473  0.0  0.8 363676 130208 ?       S    Jan04   0:15 pvedaemon worker
root        1474  0.0  0.8 364236 130876 ?       S    Jan04   0:14 pvedaemon worker
www-data  276135  0.0  0.8 364492 131964 ?       S    Jan05   0:13 pveproxy worker
www-data  276136  0.0  0.8 364504 131964 ?       S    Jan05   0:13 pveproxy worker
www-data  276137  0.0  0.8 364628 132248 ?       S    Jan05   0:13 pveproxy worker
www-data    1558  0.0  0.8 356164 145784 ?       Ss   Jan04   0:06 pveproxy
root        1134  2.0  1.0 571456 176068 ?       SLsl Jan04 320:47 /usr/sbin/corosync -f
ceph        1127  0.0  1.1 498832 178904 ?       Ssl  Jan04   6:39 /usr/bin/ceph-mgr -f --cluster ceph --id pve2 --setuser ceph --setgroup ceph
ceph        1131  0.2  6.1 1430064 996340 ?      Ssl  Jan04  40:32 /usr/bin/ceph-mon -f --cluster ceph --id pve2 --setuser ceph --setgroup ceph
ceph        1430  0.4  7.6 2052320 1243808 ?     Ssl  Jan04  66:29 /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph
ceph        1429  0.6  9.4 2413056 1541980 ?     Ssl  Jan04 102:18 /usr/bin/ceph-osd -f --cluster ceph --id 4 --setuser ceph --setgroup ceph
root        2490  4.2 13.3 2850168 2170080 ?     Sl   Jan04 682:15 /usr/bin/kvm -id 100 -name *** -no-shutdown -chardev socket,id=qmp,path=/var
ceph        1428  1.1 17.2 3746236 2801368 ?     Ssl  Jan04 189:36 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph
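
To see which programs dominate, the RSS column can be totalled per command name. This is only a minimal sketch: the helper name `rss_by_command` is hypothetical (it is not part of any Proxmox tooling), it assumes procps `ps`, and it keeps only the first word of each command name.

```shell
# Hypothetical helper: total resident memory (kB) per command name.
# stdin: lines of "<rss in kB> <command name>", e.g. from: ps -eo rss=,comm=
rss_by_command() {
    # Sum column 1 (RSS) keyed by column 2 (first word of the command name),
    # then list the largest totals first.
    awk '{ total[$2] += $1 }
         END { for (c in total) printf "%d %s\n", total[c], c }' |
        sort -k1,1nr
}
```

Usage: `ps -eo rss=,comm= | rss_by_command | head`. In the listing above, the three ceph-osd processes together come to 5587156 kB, ahead of the kvm process at 2170080 kB. RSS counts shared pages once per process, so these totals are an upper bound rather than an exact figure.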
 
Free Memory:
Code:
root@pve2:~# free -m
              total        used        free      shared  buff/cache   available
Mem:          15856        9999        4248         135        1608        5373
Swap:          8191           0        8191

Memory Usage:
Code:
root@pve2:~# cat /proc/meminfo
MemTotal:       16236936 kB
MemFree:         4518316 kB
MemAvailable:    5673228 kB
Buffers:          552124 kB
Cached:           584300 kB
SwapCached:            0 kB
Active:         10353156 kB
Inactive:         303308 kB
Active(anon):    9571024 kB
Inactive(anon):   100564 kB
Active(file):     782132 kB
Inactive(file):   202744 kB
Unevictable:      176736 kB
Mlocked:          176736 kB
SwapTotal:       8388604 kB
SwapFree:        8388604 kB
Dirty:               408 kB
Writeback:             0 kB
AnonPages:       9696816 kB
Mapped:           208636 kB
Shmem:            138804 kB
KReclaimable:     512176 kB
Slab:             712932 kB
SReclaimable:     512176 kB
SUnreclaim:       200756 kB
KernelStack:        8128 kB
PageTables:        28536 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    16507072 kB
Committed_AS:   16380568 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       63808 kB
VmallocChunk:          0 kB
Percpu:             8384 kB
HardwareCorrupted:     0 kB
AnonHugePages:   2103296 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:      375840 kB
DirectMap2M:    12046336 kB
DirectMap1G:     4194304 kB
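
As a rough cross-check of the figures above, the main /proc/meminfo categories can be summed and compared with MemTotal. This is a minimal sketch with a hypothetical helper name, `meminfo_gap`; the chosen set of fields is an approximation, not exact kernel accounting.

```shell
# Hypothetical helper: print how many kB of MemTotal are covered by the
# main /proc/meminfo categories, and how many are left over.
meminfo_gap() {
    # $1: file in /proc/meminfo format (defaults to the live file)
    awk '
        { sub(":", "", $1); kb[$1] = $2 }   # e.g. kb["MemFree"] = 4518316
        END {
            # Cached already includes Shmem; AnonPages excludes it.
            accounted = kb["MemFree"] + kb["Buffers"] + kb["Cached"] \
                      + kb["AnonPages"] + kb["Slab"] \
                      + kb["KernelStack"] + kb["PageTables"]
            printf "accounted=%d kB unaccounted=%d kB\n", \
                   accounted, kb["MemTotal"] - accounted
        }
    ' "${1:-/proc/meminfo}"
}
```

With the values above this prints `accounted=16101152 kB unaccounted=135784 kB`, so nearly all of the memory falls into these categories, with AnonPages (9696816 kB) the largest.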

Lastly, the only VM running on this node:
vminfo.PNG
 
We had a similar-looking issue over the past 2 months (as determined from the chart) on one of two almost identical servers; one was equipped with SSD drives and one was not. The one with SSD drives crashed every 30 days. We had trouble accounting for the memory. Eventually I opened slabtop and noticed that Acpi-State was consuming 300 MB and kmalloc-4k 20 GB, and that total kernel memory consumption was ~30% of total system RAM. I then suspected the SSDs, since they were the only known hardware difference. After a flash of the disk-controller firmware, the system started to behave normally. Our hardware was an HP ProLiant DL360 Gen10 equipped with OEM SSDs.
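
For anyone wanting to run the same check without an interactive slabtop session, the per-cache totals can be estimated from /proc/slabinfo (readable by root). This is a minimal sketch: the helper name `slab_by_size` is hypothetical, and it assumes the slabinfo 2.1 column layout (name, active_objs, num_objs, objsize, ...). The size is approximated as num_objs times objsize and ignores per-slab overhead.

```shell
# Hypothetical helper: rank slab caches by approximate size in kB.
slab_by_size() {
    # $1: file in /proc/slabinfo format (defaults to the live file)
    # Skip the two header lines, then print "<kB> <cache name>" per cache,
    # where kB is num_objs (column 3) * objsize in bytes (column 4) / 1024.
    awk '$1 != "slabinfo" && $1 != "#" {
             printf "%d %s\n", $3 * $4 / 1024, $1
         }' "${1:-/proc/slabinfo}" |
        sort -k1,1nr
}
```

Usage (as root): `slab_by_size | head`. Caches such as kmalloc-4k or Acpi-State growing steadily between runs would point at kernel-side consumption rather than a user-space process.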
 
