Proxmox balloon does not seem to work properly in PVE8

guruevi

Member
Apr 7, 2022
Hi,

This is a weird one. After upgrading to Proxmox 8, I noticed that even though the balloon is properly inflated in the VMs (down to the Linux kernel reporting 16GB 'installed'), the memory isn't released to the host (the KVM process for that VM keeps using 128GB). I assume this is what causes the Proxmox balloon system to keep 'reclaiming' memory to get under 80%.

There are only 6 VMs, each with 128GB of memory, and the host has 1TB of RAM. There was a brief period where we had a ton of VMs due to a failure in another system, but after things returned to normal, the VMs never regained their memory. I can manually set the memory back to 128GB using the monitor, and that works, but Proxmox slowly reclaims the memory again each time the system goes to 81% RAM.

The other "issue" is that Proxmox continuously keeps RAM usage at 80%, even though in our case that means 200GB of memory stays 'available'. Is there any way of tuning those values for "large" hosts (either through a configuration variable, or by calculating 20% or 64GB, whichever is smaller)?

Thanks
 
This is indeed weird. Could you post the output of the following commands (on the host) for a VM for which ballooned memory is not properly released to the host (replace the VMID accordingly)?
Code:
pveversion -v
qm config VMID --current
pvesh create /nodes/localhost/qemu/VMID/monitor -command "info balloon"
cat /proc/$(cat /var/run/qemu-server/VMID.pid)/status
The other "issue" is that Proxmox continuously keeps RAM usage at 80%, even though in our case that means 200GB of memory stays 'available'. Is there any way of tuning those values for "large" hosts (either through a configuration variable, or by calculating 20% or 64GB, whichever is smaller)?
Currently it is not possible to configure this threshold, see this open feature request: https://bugzilla.proxmox.com/show_bug.cgi?id=2413
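For illustration only, the rule proposed in the question (reserve 20% of host RAM or 64GB, whichever is smaller) could be sketched as below. This is a hypothetical helper, not something Proxmox VE provides; the function name reserve_mib and the use of MiB as the unit are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of Proxmox VE.
# Prints the memory reserve in MiB: min(20% of total RAM, 64 GiB).
reserve_mib() {
    local total_mib=$1
    local pct_reserve=$(( total_mib * 20 / 100 ))   # 20% of host RAM
    local cap_mib=$(( 64 * 1024 ))                  # 64 GiB cap
    if (( pct_reserve < cap_mib )); then
        echo "$pct_reserve"
    else
        echo "$cap_mib"
    fi
}
```

For a host with 1031500 MiB of RAM (about 1TB), `reserve_mib 1031500` prints 65536, so the 64GB cap would apply instead of the roughly 200GB kept free by a fixed 80% target.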
 
pveversion
Code:
proxmox-ve: 8.0.2 (running kernel: 6.2.16-12-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
proxmox-kernel-helper: 8.0.3
pve-kernel-5.15: 7.4-6
proxmox-kernel-6.2.16-12-pve: 6.2.16-12
proxmox-kernel-6.2: 6.2.16-12
pve-kernel-5.15.116-1-pve: 5.15.116-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph: 17.2.6-pve1+3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.0
libpve-access-control: 8.0.4
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.7
libpve-guest-common-perl: 5.0.3
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.4
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.2
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-5
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1

config
Code:
agent: 1
balloon: 65535
bios: ovmf
boot: order=virtio0
cores: 16
cpu: Cascadelake-Server-noTSX
efidisk0: ceph-data:vm-160-disk-0,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: mapping=A40-GPU-8G,mdev=nvidia-562,pcie=1
machine: q35
memory: 262144
meta: creation-qemu=7.1.0,ctime=1680184358
name: <redacted>
net0: virtio=0E:82:96:88:B9:49,bridge=vmbr0,firewall=1,tag=122
numa: 1
ostype: l26
scsihw: virtio-scsi-single
smbios1: uuid=357fc19e-759a-4309-a4b6-d818f0602dc6
sockets: 2
tags: GPU
tpmstate0: ceph-data:vm-160-disk-1,size=4M,version=v2.0
virtio0: ceph-data:vm-160-disk-2,cache=writeback,discard=on,iothread=1,size=1000G
virtio1: ceph-data:vm-160-disk-3,cache=writeback,discard=on,iothread=1,size=4026G
vmgenid: 04813660-f282-4a8a-8a06-61e981bcee26

balloon info
Code:
balloon: actual=65535 max_mem=262144 total_mem=61273 free_mem=48162 mem_swapped_in=0 mem_swapped_out=0 major_page_faults=11740 minor_page_faults=99021479 last_update=1695992436

/proc/PID/status
Code:
Name:   kvm
Umask:  0027
State:  S (sleeping)
Tgid:   441591
Ngid:   2155007
Pid:    441591
PPid:   1
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 512
Groups:
NStgid: 441591
NSpid:  441591
NSpgid: 441589
NSsid:  441589
VmPeak: 277830624 kB
VmSize: 277288516 kB
VmLck:  268435264 kB
VmPin:         0 kB
VmHWM:  269983252 kB
VmRSS:  269800624 kB
RssAnon:        269792512 kB
RssFile:            8112 kB
RssShmem:              0 kB
VmData: 271924244 kB
VmStk:       724 kB
VmExe:      6220 kB
VmLib:     46360 kB
VmPTE:    532808 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
CoreDumping:    0
THP_enabled:    1
Threads:        43
SigQ:   4/4125729
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000010002240
SigIgn: 0000000000381000
SigCgt: 0000000100004243
CapInh: 0000000000000000
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
NoNewPrivs:     0
Seccomp:        0
Seccomp_filters:        0
Speculation_Store_Bypass:       thread vulnerable
SpeculationIndirectBranch:      conditional enabled
Cpus_allowed:   ffffffff,ffffffff,ffffffff
Cpus_allowed_list:      0-95
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000000f
Mems_allowed_list:      0-3
voluntary_ctxt_switches:        1266470
nonvoluntary_ctxt_switches:     8782

uname -a (host)
Code:
Linux <redacted> 6.2.16-12-pve #1 SMP PREEMPT_DYNAMIC PMX 6.2.16-12 (2023-09-04T13:21Z) x86_64 GNU/Linux

free -m (host)
Code:
               total        used        free      shared  buff/cache   available
Mem:         1031500      828464        7635          72      201043      203035
Swap:              0           0           0

uname -a (guest)
Code:
Linux <redacted> 6.2.0-33-generic #33~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Sep  7 10:33:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

free -m (guest)
Code:
               total        used        free      shared  buff/cache   available
Mem:           61273        3720       48310          34        9242       55728
Swap:            975           0         975
 
Thanks. The culprit is that the VM uses PCIe passthrough:
Code:
hostpci0: mapping=A40-GPU-8G,mdev=nvidia-562,pcie=1
Ballooning doesn't work in combination with PCI(e) passthrough: As PCI(e) devices might use Direct Memory Access (DMA), the complete guest memory needs to be mapped. In other words, if a PCI(e) device is passed through to a VM, the VM will always require the full amount of configured memory.
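One way to see this on the host is to compare VmLck and VmRSS in /proc/PID/status of the kvm process with the configured memory. The sketch below is a hypothetical helper (the name locked_and_resident_mib is made up for illustration) that converts both values to MiB. It assumes that the guest memory pinned for passthrough is accounted as locked memory, which is consistent with the output posted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the contents of /proc/PID/status on stdin
# and prints the locked (VmLck) and resident (VmRSS) memory in MiB.
locked_and_resident_mib() {
    awk '
        $1 == "VmLck:" { lck = int($2 / 1024) }   # values are reported in kB
        $1 == "VmRSS:" { rss = int($2 / 1024) }
        END { printf "VmLck=%d MiB VmRSS=%d MiB\n", lck, rss }'
}
```

It could be used as `locked_and_resident_mib < /proc/$(cat /var/run/qemu-server/VMID.pid)/status`. For the status output posted above, it reports VmLck=262143 MiB, which is essentially the full 262144 MiB from the VM config, even though the balloon reports actual=65535.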
 
What about printing a warning when enabling ballooning, or preventing the user from enabling it in the first place?
Yes, some kind of warning, or at least a note in the docs/wiki, sounds sensible and should be doable. I'll look into it.
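Until something like that exists, a check along these lines could be run by hand. This is only a sketch and not an existing Proxmox feature: the function name warn_balloon_with_passthrough is hypothetical, and it assumes that a hostpciN line together with a non-zero balloon value in the VM config is the combination worth flagging.

```shell
#!/usr/bin/env bash
# Hypothetical check, NOT an existing Proxmox VE command.
# Reads `qm config VMID` output on stdin. Prints a warning and returns 0
# if the config has a hostpciN entry and a non-zero balloon value;
# returns 1 otherwise.
warn_balloon_with_passthrough() {
    awk -F': ' '
        /^hostpci[0-9]+:/ { pci = 1 }
        /^balloon:/       { if ($2 + 0 > 0) balloon = 1 }
        END {
            if (pci && balloon) {
                print "warning: ballooning is enabled together with PCI(e) passthrough"
                exit 0
            }
            exit 1
        }'
}
```

Usage would be `qm config VMID | warn_balloon_with_passthrough`, replacing VMID accordingly.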

I'm assuming you're not letting customers edit the Wiki :)
The wiki is open to the community, but there are some minor caveats -- see [1] for more details.

[1] https://forum.proxmox.com/threads/how-can-we-contribute-to-the-wiki.93970/#post-408790
 
Off-topic: Please also remove the pre-5.15 kernel intel_iommu remark from the manual, as it was only true for a very limited time in the previous Proxmox version. Also, IOMMU groups are a common pitfall but are not really explained in the PCIe passthrough section of the manual. I'll stop now and save the other mistakes for another time.
 