Hi everyone, I have a question regarding LXC container (CT) memory configuration.
Currently, I'm running several LXC containers on a PVE host with various RAM allocations. The largest one is assigned about half of the host's total physical RAM. The total memory allocated across all containers doesn't exceed the host's capacity, and I've specifically reserved about 5GB of RAM for the PVE host itself, so the system should theoretically have enough headroom.
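To make the arithmetic concrete, here is a rough sketch of the budget check I have in mind. The numbers are placeholders, not my real values, and the function name is just something I made up for illustration. It only counts the configured RAM limits and the host reserve; it does not account for swap or anything else the host itself might use.

```python
def ram_headroom_gib(host_total_gib, host_reserve_gib, ct_allocations_gib):
    """Return the GiB left over after the container RAM limits and the host reserve."""
    return host_total_gib - host_reserve_gib - sum(ct_allocations_gib)


# Placeholder values for illustration only (assumptions, not my actual setup):
host_total = 64               # total physical RAM on the PVE host, in GiB
host_reserve = 5              # RAM kept free for the PVE host itself, in GiB
ct_allocations = [32, 16, 8]  # per-container RAM limits; the largest is half the host

headroom = ram_headroom_gib(host_total, host_reserve, ct_allocations)
print(f"headroom: {headroom} GiB")
```

A non-negative result is what I mean by "doesn't exceed the host's capacity"; it says nothing about what happens once swap comes into play.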
However, this issue has already happened twice now: whenever that high-load CT experiences a spike and its memory usage exceeds 90%, the entire PVE physical host crashes abruptly without warning. I have monitoring alerts set up, so I receive notifications just before it happens, but I have no time to react. Honestly, I haven't been able to pinpoint the exact root cause yet.
My current hypothesis is that even if the CT's RAM seems sufficient, the default Swap setting might be the culprit. When the CT starts utilizing Swap, it’s actually tapping into the PVE host's swap resources. Could this be the primary reason for the host's instability? I’m planning to set the Swap to 0 for all containers. Is this the right approach to take, or am I completely looking in the wrong direction?
Thanks in advance for any advice.