/proc read-only in lxc when trying to set /proc/sys/vm/overcommit_memory

symcbean

May 20, 2020
I want to disable memory overcommit on an LXC container (it runs multiple greedy cron jobs which will grab a lot of memory and use it all). However, when I try to set this on the command line, the system reports that /proc is a read-only filesystem. There are several other containers running on the same Proxmox node, and I'd prefer not to apply the setting to all containers on the node, only to this one.

Is this possible?
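
For anyone wanting to check the same thing: a "read-only filesystem" error points at how /proc (or /proc/sys) is mounted inside the container rather than at file permissions. Below is a minimal Python sketch, assuming the usual whitespace-separated /proc/mounts layout (device, mount point, type, options, ...). The helper names and the SAMPLE_MOUNTS text are made up for illustration and are not captured from a real container; on a live system you would pass in the contents of /proc/mounts instead.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list of the last entry for mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Later entries shadow earlier ones, so keep the last match.
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point):
    """True if the mount point exists and carries the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


# Hypothetical sample for illustration only (assumption, not real output).
SAMPLE_MOUNTS = """proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
proc /proc/sys proc ro,nosuid,nodev,noexec,relatime 0 0
"""

for point in ("/proc", "/proc/sys"):
    print(point, "read-only:", is_read_only(SAMPLE_MOUNTS, point))
```

If /proc/sys shows up as `ro`, writes to /proc/sys/vm/overcommit_memory will fail regardless of which user runs them.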
 
Forgive me if I'm missing something, but why don't you just assign this container a limited amount of memory? The ability to over-commit memory is more relevant to the PVE host, as it's managing the rest of the processes.
It's a security risk to have the /proc filesystem writable from the container, as it enables altering of the host kernel.
 
I suspect we're talking at cross purposes here.

I have assigned the container a limited amount of memory. But because memory overcommit is using the default heuristics, the OS allows applications in the container to malloc() more memory than is available, hoping that the applications won't actually *use* all that memory. However, the applications (or one program in particular) are trying to use all the memory the OS promised them.

On a physical machine or VM, I would simply tell the OS not to be so generous. But the way I would do that there does not work from inside the LXC container. I don't want to apply the config to the Proxmox host as there are lots of other containers on this machine, and memory overcommit is usually beneficial (that's why it's enabled by default).

Given that it's a shared kernel, I don't know if what I am attempting to do is possible, but I can't find any documentation either way.

If I simply give the container more memory, the OS will still allow the program in question to allocate more memory than it can supply.
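
To make the "promised versus available" point concrete, here is a rough Python sketch that interprets the overcommit mode and compares the CommitLimit and Committed_AS fields from /proc/meminfo. The mode meanings (0 heuristic, 1 always, 2 strict) are the documented values of vm.overcommit_memory; the helper names and the sample text are assumptions for illustration, not output from my system. Note that CommitLimit is only enforced in mode 2, and inside a container these figures may describe the whole host or be masked, so treat them as a hint only.

```python
# Documented meanings of the vm.overcommit_memory values.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def commit_headroom_kb(meminfo):
    """CommitLimit minus Committed_AS; negative means more has been
    promised than the limit would allow under strict accounting."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]


# Hypothetical sample for illustration only (assumption, not real output).
SAMPLE_MEMINFO = """MemTotal:        8388608 kB
SwapTotal:       2097152 kB
CommitLimit:     6291456 kB
Committed_AS:    7000000 kB
"""

info = parse_meminfo(SAMPLE_MEMINFO)
mode = 0  # on a live system, read this from /proc/sys/vm/overcommit_memory
print("mode:", OVERCOMMIT_MODES[mode])
print("headroom kB:", commit_headroom_kb(info))
```

With the sample numbers the headroom is negative, which is exactly the situation described above: under the default heuristic the kernel has handed out more address space than the limit it would enforce in mode 2.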
 
Okay, thanks for clarifying. I understand what you mean now. Unfortunately I don't think it's possible to communicate this information in the direction you want. As the container doesn't have any information on the host's memory distribution, it's unable to prevent itself from trying to use memory which the host doesn't have available. It will simply feel that it can use what has been assigned to it. The best you can do is give it a maximum memory amount that won't interfere with the other running containers when it reaches its capacity, or increase the system's RAM or container's swap so that it can handle such spikes.

"On a physical machine or VM, I would simply tell the OS not to be so generous."
This same logic applies to VMs. If you have a running VM, there is no way to tell it that even though it thinks it has memory, the host doesn't have enough memory to meet the request, so it shouldn't use it. The closest it has is the ballooning driver [1], which causes a VM to release free memory if the host is running out. I believe that this isn't required in containers as the host has a better view of the containers' memory usage.

[1] https://rwmj.wordpress.com/2010/07/17/virtio-balloon/
 
I don't think we're quite at consensus. Actually, a VM can have a very different overcommit policy to the hypervisor. It is a similar problem to thin disk provisioning, but in that model the resource is promised by the admin and potentially consumed by resources outside the virtual host; the virtual host doesn't know there isn't that much free space until it tries to assign a new extent. OTOH with memory overcommit, the kernel *knows* it does not have the memory it tells the applications they can use. I suspect that it might not be possible for containers on the same hypervisor, and therefore the same kernel instance, to have different settings.

Fortunately this is a dev machine so I can force daily restart of the relevant programs. For live it will go on a VM with overcommit set to 2.
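
For sizing that VM: with overcommit set to 2 the kernel refuses allocations once the total committed address space would exceed the commit limit, which the kernel documentation gives as swap plus a percentage (vm.overcommit_ratio, default 50) of RAM, excluding huge pages. A small Python sketch of that arithmetic follows; the function name and the example sizes are hypothetical, and it ignores the alternative vm.overcommit_kbytes setting.

```python
def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50, hugetlb_kb=0):
    """Approximate CommitLimit in kB for vm.overcommit_memory = 2.

    (RAM - huge pages) * overcommit_ratio / 100 + swap
    """
    return (mem_total_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_total_kb


# Hypothetical VM: 8 GiB of RAM and 2 GiB of swap.
ram_kb = 8 * 1024 * 1024
swap_kb = 2 * 1024 * 1024

print("ratio 50: ", commit_limit_kb(ram_kb, swap_kb))
print("ratio 100:", commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=100))
```

With the default ratio, a VM of that size can commit only 6 GiB in total, so the ratio or the swap size usually needs adjusting alongside the mode.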
 
"I don't think we're quite at consensus."
You are correct ;) I took your previous note about VMs to mean that the hypervisor had some built in property which would allow only certain processes to over-commit memory. I now understand that you meant that VMs can control their own virtualized kernel's over-commit, and it's this same internal over-commit which you would like to prevent in the container.
Unfortunately, due to the sharing of the host kernel, I don't believe it's possible with LXC containers. However, it does still sound to me that the key problem here is that the container simply doesn't have enough memory to carry out whatever task is run in the cron job, and that assigning it enough memory to handle this should be the answer. But as I have nothing to base this assumption on, I will not argue it :)