Sunday, October 20, 2013

cgroups: Maintain the sanity of your development system while you screw it up

I am sure many of us developers run into an unresponsive machine while doing our usual work (developing/debugging) because a rogue or buggy process is consuming all of the system's RAM and swap. It may not be very common, but it happens fairly often for me. For example, if you start allocating arrays in a loop in an ipython shell and add an extra zero to the size, or simply store the arrays so they are never garbage collected, it is easy to bog down the system if you are not careful.

For me, the most common instance of this has been using gdb. I usually install all the relevant debuginfo packages available in the Fedora repos to enhance my debugging experience. One downside of this is that gdb consumes a lot of memory. Coupled with Python pretty-printing and the `thread apply all bt full` command, gdb can go haywire, especially when the stack/heap is corrupted after a segfault. Previously, I would have no recourse but to hard-reboot my machine; I once timed my system as being unresponsive for more than half an hour.

The Cause:

Linux is generally good at time-slicing and not letting rogue processes DoS the system, except for the memory/swap part. The swap system in Linux is still not very intelligent about which portions of memory it swaps out when under pressure: it seems to swap out the data and the executable code of processes equally aggressively.
Here's what happens when you start a process which consumes a lot of memory (more than the total amount of RAM on your system) and forces the system to swap:
 First, the data of inactive processes is swapped out; you can see your machine's swap usage increasing, but it does not yet seem to affect the system's responsiveness.
 Next, the executable code of processes is swapped out too, and this is when things get out of control.
When the latter happens, your machine is screwed: your terminal emulator, bash, your window manager, gnome-shell and the X server have all likely had their executable code swapped out, so none of them can respond to you any time soon. There are only two ways to recover: 1) Hope the rogue process dies quickly, which is not very likely on my system with 8GB of RAM plus swap (it takes a looong time to allocate that much memory while your executable code is itself swapped out); 2) Hard reboot.

Not any more. I present to you a mechanism to maintain the sanity of the system and limit the amount of RAM your development processes can consume.

A Workaround:

Welcome to cgroups (Control Groups), a Linux kernel feature for managing the resources of processes. You can read more about cgroups in the Fedora Resource Management Guide at http://docs.fedoraproject.org/en-US/Fedora/17/html/Resource_Management_Guide/index.html .

Here, I will describe a simple way to restrict the cumulative memory (RAM) consumption of all processes started in a bash terminal, so that essential system processes have sufficient RAM available to keep the system responsive.

The various tools we will use are provided by the libcgroup-tools package, so install it first using yum or the equivalent package on your distro.
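On Fedora that looks like the following (use your distro's package manager and package name if they differ):

sudo yum install libcgroup-tools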

Two important services I will use are cgconfig.service and cgred.service, but before enabling and starting them we need to configure them. Their configuration files are located at /etc/cgconfig.conf and /etc/cgrules.conf respectively.

Here's what we will configure cgroups for:
1) Create a bash_memory group for the memory subsystem and limit its total RAM consumption to slightly less than the total available RAM; my system has 8GB, so I have set the limit to 6GB. This is done via the /etc/cgconfig.conf file. Add the following content to it:

group bash_memory {
        memory {
                memory.soft_limit_in_bytes="5583457480";
                memory.limit_in_bytes="6442450944";
        }
}

The first line in the memory subsystem states that under memory pressure, when the system is actively thrashing, memory of the processes in the bash_memory cgroup will be reclaimed (by discarding caches and swapping out dirty pages) to reduce their physical memory usage to about 5.2GB.
The second line states that the total physical memory consumption of all processes in the bash_memory cgroup will never exceed 6GB; beyond that, their memory will be swapped out instead.

You can use similar mechanisms (with subsystems other than memory) to limit other resources, such as CPU usage, disk IO, network bandwidth, etc., as sketched below.
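For instance, a minimal sketch of a CPU group in the same /etc/cgconfig.conf might look like this (the group name bash_cpu is just an illustration; cpu.shares assigns a relative CPU weight rather than a hard cap):

group bash_cpu {
        cpu {
                cpu.shares="512";
        }
}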

2) Now all we need to do is add processes to the bash_memory cgroup. We will do this via the cgred service by adding a rule to the /etc/cgrules.conf file. Add the following line to the file:

*:bash          memory          bash_memory/

The first column says that the rule applies to the bash process of any user, the second says that the memory controller is being set, and the third says that the bash_memory group is applied for the memory controller. Now the cgred.service will take care of automatically placing every bash process into the bash_memory group as soon as it starts. Due to the inheritance of cgroups, all subprocesses started by bash (and their subprocesses too) will belong to the bash_memory cgroup, thus limiting their cumulative RAM consumption.
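Once the services from step 3 are running, you can verify that a freshly opened terminal has been placed into the group; the memory line of the following output should end in /bash_memory:

cat /proc/$$/cgroup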

You can add more lines to limit specific users or processes.
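For example (the user name alice is purely illustrative), rules of this form restrict a single user's shells or a specific binary such as gdb:

alice:bash      memory          bash_memory/
*:gdb           memory          bash_memory/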

3) Now all we need to do is start the services and reap the benefits.

sudo systemctl enable cgconfig.service
sudo systemctl start cgconfig.service
sudo systemctl enable cgred.service
sudo systemctl start cgred.service
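
If cgconfig started cleanly, the group and its limits should be visible through the cgroup filesystem; the mount point below is the usual one on Fedora but may differ on your setup:

cat /sys/fs/cgroup/memory/bash_memory/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/bash_memory/memory.soft_limit_in_bytes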

The Linux init daemon systemd uses cgroups to control its services, and you can set specific limits on any service on a systemd-based system. Also, the LXC project (lightweight Linux containers) uses cgroups and the related namespace functionality to sandbox containers. This feature of Linux is being put to great use by various projects.
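As a rough illustration (the service name and value here are just examples), a systemd unit can be capped with the MemoryLimit= directive, e.g. in a drop-in file:

# /etc/systemd/system/httpd.service.d/memory.conf (hypothetical drop-in)
[Service]
MemoryLimit=1G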

Please comment if you find this useful or have some better suggestions :)

PS: It seems the gdb memory hog bug which started all this adventure has been fixed: https://bugzilla.redhat.com/show_bug.cgi?id=1013453