I periodically have issues with LXC containers crashing the host node.
The errors on the node are the classic nmi_watchdog "stuck" messages, and I believe that so far I have been treating the symptom instead of the cause.
Today I had a very interesting "customer". His container was using 100% of its CPU allocation (1 core), and the node crashed.
I moved him to a fresh node (thinking the initial node was overloaded) and, surprise, the new node crashed with the same error. I could see he was forking a lot of apache2 processes, and I'm assuming that's what is causing the issues I'm continuously having with LXC.
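For reference, this is roughly how the process count can be checked from the host. It is a minimal sketch, not the exact command I ran: the function name `count_by_name` is made up for illustration, and it assumes a Linux `/proc` where each numeric directory has a `comm` file. It counts every process visible on the host, so it does not separate one container from another.

```python
import os
from collections import Counter


def count_by_name(proc_root="/proc"):
    """Return a Counter mapping command name -> number of running processes.

    Hypothetical helper: reads <proc_root>/<pid>/comm for every numeric entry.
    """
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories correspond to processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                counts[f.read().strip()] += 1
        except OSError:
            continue  # process exited (or entry unreadable) in the meantime
    return counts


if __name__ == "__main__":
    # Print the five most common command names and their process counts.
    for name, n in count_by_name().most_common(5):
        print(f"{n:6d}  {name}")
```

A runaway apache2 shows up here as one name with a count far above everything else on the node.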
Is there any way I can limit the number of processes he can spawn? Or are there other ideas for how to contain this?
All suggestions are welcome!
Thank you