RAM Disk in LXC not showing usage in PVE

AWillUS

New Member
Dec 31, 2024
Hey, I've been working on setting up a new PVE lab at home for multiple use cases, including finally getting off my Windows Plex Server setup.

I have my initial basic cluster up and running with two nodes (one more to come soon) and have been testing all sorts of LXCs and VMs to make sure I understand everything I'm trying to do before actually migrating over the production environments.

As part of my testing, I have been using a RAM disk mounted in a privileged LXC (default Debian 12 template) as the temporary transcoder location for a Plex server, and it works beautifully: it saves SSD life and gives rapid initialization of transcode sessions for live TV from an HDHomeRun Prime (FiOS cable) source.
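For reference, the RAM disk is nothing fancy, just a tmpfs entry in the container's /etc/fstab along these lines (the mount point is a made-up example; the 12G size is what I used):

Code:
# /etc/fstab inside the CT -- /mnt/transcode is just an example path
tmpfs /mnt/transcode tmpfs defaults,size=12G,mode=1777 0 0

Plex's transcoder temp directory is then pointed at that mount point in its settings.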

However, this is where my issue arises: as the server amasses data in the RAM disk auto-mounted inside the LXC (12 GB of the 16 GB allocated to the LXC), the usage does not show in PVE for the LXC itself, the node, or the cluster. Can anyone point me to why this might be? I can see it becoming a problem if anyone were to use RAM disks in an HA environment, where migration decisions could be affected by memory consumption that PVE does not show.

Screenshots below show the usage of the RAM disk inside the LXC (approx. 50%, ~6 GB) and the PVE dashboard memory metrics for the LXC and the node at the same moment in time.

[Screenshots: RAM disk usage inside the LXC; PVE dashboard memory metrics for the LXC and the node]
 
Not sure why you didn't just use the existing /dev/shm ... ?
First up, thanks, I learned something today; I did not know of /dev/shm as a concept (still learning Linux every day).

Second, after a quick look into this: while I can see your point that this is a potentially viable solution to the monitoring problem, it could pose another issue. The application using /dev/shm for temp storage, in this case my Plex transcoder service, might not limit itself to X GB, which could introduce a new set of problems for the rest of the LXC.

Note: it appears that the Plex transcoder does limit itself; at least in my current test it seems to back off from filling the allotted 12 GB and starts flushing the earlier part of the recorded live stream when it hits some 85-89%. However, one cannot trust all applications to be so well-behaved.

Regardless, this still does not answer why the manually mounted RAM disk's usage does not show up in the PVE metrics.
 
LXC is a bit weird; if you need real constraints I would move this application into a VM.
Yeah, I prefer LXCs due to the hardware passthrough limitations in VMs.

Also, I did some further research into /dev/shm, and it appears that, at least for a privileged container like this one, it can grab memory from the host/node itself rather than from the LXC's allotted memory, which is an interesting issue as well. I get that privileged containers generally are not the way to go for real-world production, but I am curious whether this is also the case for unprivileged containers.

PS: running a free -h command in the LXC actually yields the proper metric; this is getting weird :)
[Screenshot: free -h output inside the LXC]
 
Couple of updates:
1: I tested with /dev/shm as the Plex temp transcoder path, which works out of the box, so kudos on that suggestion. For others, this is an easy approach and avoids any fstab editing. However, it also does not report memory usage in the PVE dashboard.

2: It would appear that the node CLI ("free -h") as well as the Plex application itself do see the memory usage; see the screenshots below. My guess (can someone with advanced Debian/Linux experience confirm?) is that while the data is stored in /dev/shm (or another RAM disk), it is not considered blocked/reserved and the system can still claw it back if needed, or something along those lines. If not, I think this would very much be a required metric to report to PVE, to ensure proper allocation of resources and sound cluster migration decisions.
[Screenshots: free -h on the node and the Plex application showing the memory usage]
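If anyone wants to dig into where those tmpfs pages actually land, the container's cgroup can be inspected from the node. A rough sketch, assuming cgroup v2 and a placeholder CT ID of 101 (paths may differ on other setups):

Code:
# On the PVE node; 101 is an example CT ID
cat /sys/fs/cgroup/lxc/101/memory.current                    # total memory charged to the CT
grep -E '^(shmem|file) ' /sys/fs/cgroup/lxc/101/memory.stat  # tmpfs data is accounted as shmem / page cache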
 
Update for anyone monitoring this approach.

This morning I was readjusting my Plex LXC's resource allocation and encountered an interesting situation after reducing the PVE-allotted memory for the LXC.

I reduced the memory to 4 GB (more than plenty for the base Plex application including any transcoding needs) and continued to watch live TV with the temp transcoder set to /dev/shm, as previously discussed in the thread. After some 20 minutes the LXC ran into a full swap and practically halted. As far as I can tell, the PVE-allotted memory (in this case reduced from 16 GB, ~50% of the host/node's total memory, to 4 GB) actually is enforced when using /dev/shm as a tmpfs (here for temp transcoding of .ts stream files). However, the LXC still sees 16 GB available in /dev/shm when using "df", so it thinks it can keep throwing data into it during the live TV session. When it hits the PVE memory limit it starts using swap, as a Linux machine should when it runs out of memory, and since this is video file transcoding it runs out pretty damn fast.

So in conclusion:
  1. PVE-allotted memory is actually enforced on the LXC's usage of /dev/shm, regardless of what the LXC sees as available.
  2. The PVE reporting is clearly a problem here: the Linux system is operating as it should, but PVE is simply not recognizing this as memory usage in the classical sense.
As I see it, PVE should be able to monitor this and report the actual memory use for both the LXC and the host node, and (this could be possible?) the LXC should be configured with a /dev/shm size in line with the PVE-allotted memory, to avoid a rundown of memory into swap without having to over-allocate memory. Alternatively, you do just that and over-allocate memory (I have adjusted the LXC memory up to 20 GB to test that hypothesis).
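For what it's worth, the mismatch is easy to see side by side; a rough check, again assuming cgroup v2 and a placeholder CT ID of 101:

Code:
# Inside the CT: /dev/shm still advertises roughly half of the host's RAM
df -h /dev/shm
# On the PVE node: the hard limit actually enforced by the cgroup (in bytes)
cat /sys/fs/cgroup/lxc/101/memory.max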

Finally, I am curious what would happen if other LXCs were using /dev/shm (or their actual allotted memory) to the point where the LXC in question (in my case the Plex temp transcoding server) is utilizing /dev/shm space it "thinks" is free but which is in fact used elsewhere, and how that could affect these runs of memory into swap.
 
Update for anyone monitoring this approach.


Finally, I am curious what would happen if other LXCs were using /dev/shm (or their actual allotted memory) to the point where the LXC in question (in my case the Plex temp transcoding server) is utilizing /dev/shm space it "thinks" is free but which is in fact used elsewhere, and how that could affect these runs of memory into swap.
I have run into exactly this problem.

/dev/shm appears across all LXC containers.

While memory usage is enforced by cgroup2 as configured in the LXC config, any application or user on the LXC can easily write more than the LXC should hold. When this happens, the oom-killer shuts down the LXC, in our case.

So a user not knowing any better (any user, it doesn't have to be root), or an application using /dev/shm, can effectively get the LXC killed.

It would be nice if you could enforce a maximum size on /dev/shm so that it cannot exceed the LXC memory size (and cause the LXC to get shut down).

In our case, we have an 8 GB limit on an LXC, but /dev/shm shows 64 GB (half of the 128 GB host)... if you copy, say, 10 GB of data in, the LXC crashes.

It would be nice to maybe have a way to limit /dev/shm to 4G (or whatever)

Right now, I don't see a way to just disable /dev/shm at the LXC level.

Maybe an alternative is to create a mountpoint for tmpfs with a sane limit here...
 
/dev/shm appears across all LXC containers.

While memory usage is enforced by cgroup2 as configured in the LXC config, any application or user on the LXC can easily write more than the LXC should hold. When this happens, the oom-killer shuts down the LXC, in our case.

Right now, I don't see a way to just disable /dev/shm at the LXC level.

Maybe an alternative is to create a mountpoint for tmpfs with a sane limit here...
You can change how much space the LXC mounts at /dev/shm:

Edit /etc/fstab
Add/change:
Code:
none /dev/shm tmpfs defaults,size=4G 0 0

(Change 4G to whatever you want; the LXC will not write more to /dev/shm than you allow it.)
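If I am not mistaken, the new size can also be applied on the fly, without rebooting the CT (same 4G example):

Code:
mount -o remount,size=4G /dev/shm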

It still does not show in the utilization graphs, and it is not working great for my Plex setup: Plex fills it up and only releases small chunks, which stops the DVR from letting scheduled recordings start once live TV has filled up the temp RAM disk. But that is a Plex issue, not a Proxmox or Debian /dev/shm issue, obviously.
 
You can change how much space the LXC mounts at /dev/shm:

Edit /etc/fstab
Add/change:
Code:
none /dev/shm tmpfs defaults,size=4G 0 0

(Change 4G to whatever you want; the LXC will not write more to /dev/shm than you allow it.)

It still does not show in the utilization graphs, and it is not working great for my Plex setup: Plex fills it up and only releases small chunks, which stops the DVR from letting scheduled recordings start once live TV has filled up the temp RAM disk. But that is a Plex issue, not a Proxmox or Debian /dev/shm issue, obviously.

Right, I also finally found that you can add it to the LXC conf as well:

lxc.mount.entry: tmpfs dev/shm tmpfs size=4G,nosuid,nodev,noexec,create=dir 0 0


At least this will allow the application to get a "disk full" error and fail at the application level, rather than having the oom-killer completely shut down the CT.
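For anyone following along on Proxmox: that line goes into the container's config file on the node and takes effect on the next container start (101 being a placeholder CT ID):

Code:
# /etc/pve/lxc/101.conf -- 101 is an example CT ID
lxc.mount.entry: tmpfs dev/shm tmpfs size=4G,nosuid,nodev,noexec,create=dir 0 0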
 
Right, I also finally found that you can add it to the LXC conf as well:
lxc.mount.entry: tmpfs dev/shm tmpfs size=4G,nosuid,nodev,noexec,create=dir 0 0
Thanks for that; I'm updating my documentation to incorporate it into future deployments.

This should probably be a GUI config parameter under Advanced when setting up LXCs, since it can be added directly to the .conf file; that would make it much easier for less command-line-savvy users to configure. Proxmox could even add a warning if you allow a /dev/shm larger than the system/container memory, or something along those lines.
 
Thanks for that; I'm updating my documentation to incorporate it into future deployments.

This should probably be a GUI config parameter under Advanced when setting up LXCs, since it can be added directly to the .conf file; that would make it much easier for less command-line-savvy users to configure. Proxmox could even add a warning if you allow a /dev/shm larger than the system/container memory, or something along those lines.
I totally agree on that. There certainly should be a knob there. /dev/shm should NOT be larger than (or even equal to) the total memory allowed in the LXC; if you exceed that, the LXC is shut down/killed. A sane default might be to set /dev/shm to 50% of the memory limit allowed in the CT.
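Just to make that rule of thumb concrete, a sketch of what the two settings might look like together in a CT config (ID and sizes are made up):

Code:
# /etc/pve/lxc/101.conf -- an 8 GB CT with /dev/shm capped at half of that
memory: 8192
lxc.mount.entry: tmpfs dev/shm tmpfs size=4G,nosuid,nodev,noexec,create=dir 0 0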
 
I totally agree on that. There certainly should be a knob there. /dev/shm should NOT be larger than (or even equal to) the total memory allowed in the LXC; if you exceed that, the LXC is shut down/killed. A sane default might be to set /dev/shm to 50% of the memory limit allowed in the CT.
/dev/shm is actually intended for inter-process communication, in particular POSIX shared memory (see "man shm_open"). It isn't really intended to be a dumping ground for all of your temporary files. In fact /dev/shm may not even exist if the kernel was compiled without POSIX shared memory.

What you guys are doing will work, but it isn't really a supported use-case.
 
/dev/shm is actually intended for inter-process communication, in particular POSIX shared memory (see "man shm_open"). It isn't really intended to be a dumping ground for all of your temporary files. In fact /dev/shm may not even exist if the kernel was compiled without POSIX shared memory.

What you guys are doing will work, but it isn't really a supported use-case.
Totally get that, and yes, you could just create a tmpfs ramdisk as well, and accomplish the same thing.

The bad thing about /dev/shm is, as you say, that it can become a DANGEROUS dumping ground: anything dumped there that exceeds the memory limit of the CT will cause it to crash/shut down. A sane limit should be put on the /dev/shm tmpfs to prevent this.