One of the most important responsibilities of the Linux kernel is efficient memory management. Early algorithms allowed applications to allocate only memory that was physically available on the system, but it quickly turned out that this approach had significant drawbacks. A large part of the allocated memory was wasted on duplicates of the same data used by different processes. In addition, many applications reserved more memory than they actually used. The solution was a new approach called virtual memory. It replaced the old methodology entirely and introduced the concept of a virtual address. Virtual memory addresses the duplication problem by letting processes share pages, and it extends the range of available addresses by using disk-based swap space and disk files (via mmap()) to store data.
In traditional UNIX systems, when applications require additional memory, they call the malloc() function to get it. If there is not enough memory and the system cannot make the reservation, the function returns NULL. Most properly implemented applications can handle this situation; it is a normal error condition. In that case they usually write an appropriate message to the log file and start shutting down. When an application runs on Linux, the behavior is not that clear. The Linux kernel was designed to let processes reserve more memory than is actually available on the system. In practice, this means that almost all malloc() calls succeed, even if the requested amount of memory does not exist!
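The gap between reserved and actually used memory can be observed for any running process. The following is a minimal shell sketch, not a definitive tool: it assumes the VmSize (virtual size) and VmRSS (resident size) fields of /proc/PID/status, and the function name vm_gap_kb is made up for this illustration. The result is only a rough indicator, because VmSize also counts mappings such as shared libraries.

```shell
# Print VmSize minus VmRSS (in kB) for status text read from stdin.
# vm_gap_kb is a hypothetical helper name, not an existing command.
vm_gap_kb() {
    awk '$1 == "VmSize:" { size = $2 }
         $1 == "VmRSS:"  { rss = $2 }
         END {
             # Kernel threads have no VmSize line; report failure for them.
             if (size == "") exit 1
             print size - rss
         }'
}
```

For the current shell it could be called as: vm_gap_kb < /proc/$$/status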
Is it a good strategy?
Well, it depends. This approach was implemented so that the kernel can manage memory more effectively. The problem appears when processes really start using all the memory they reserved. The system then starts swapping, and if it cannot keep up with the demand for memory, it calls the OOM killer for help.
What is the OOM killer?
Have you ever encountered the problem where, as if by magic, your processes start to disappear from the system and the only trace of them is a log message beginning with the words “kernel, Out of memory: Kill…”? If so, you have met the OOM killer. When the killer starts, it selects the most suitable process to kill. The algorithm takes several factors into account, for example how much memory the process uses, whether it is a root process, or how much CPU time it has consumed (the exact heuristics depend on the kernel version). Based on this information, it chooses and kills the least important (in its opinion, of course) process to reclaim memory. In practice, this does not always end happily.
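To check whether the OOM killer has already been active, the kernel log can be searched for the message quoted above. This is a small sketch, assuming the log line contains the text "Out of memory: Kill" (the exact wording varies between kernel versions); the function name oom_events is hypothetical.

```shell
# Print only the log lines that report an OOM kill.
# Reads log text on stdin, so it works with dmesg output or a log file.
oom_events() {
    # -F treats the pattern as a fixed string, not a regular expression.
    grep -F 'Out of memory: Kill'
}
```

A typical call would be: dmesg | oom_events (reading the kernel log may require root). Like grep itself, the function returns a non-zero status when no OOM kill is found.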
So what should we do?
The fastest and simplest solution is to disable the OOM killer. To do this, we set a new value in the /proc/sys/vm/panic_on_oom file. The two basic values are:
0 – Default setting. The kernel does not panic, and the OOM killer is invoked when the OOM appears.
1 – The kernel will panic when the OOM appears.
root@danielvm:~# cat /proc/sys/vm/panic_on_oom
0
root@danielvm:~# echo 1 > /proc/sys/vm/panic_on_oom
root@danielvm:~# cat /proc/sys/vm/panic_on_oom
1
But does this actually solve our problem? No, not always. With this setting, instead of killing processes randomly, the kernel simply panics, and if the system is correctly configured (for example, set to reboot automatically after a panic), it boots again. In some cases, this makes more sense than killing processes. After a fresh reboot, there is a better chance that all the required processes will start correctly. Once the OOM killer starts its game of roulette, we can never be sure which processes will be killed, and there is always the fear that it will kill processes that affect the stability of the application.
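One way to make the panic lead to a reboot, and to keep both settings across restarts, can be sketched as follows. This is an illustration under assumptions rather than a recommended setup: it relies on the kernel.panic sysctl (the number of seconds the kernel waits before rebooting after a panic), and the helper name oom_panic_settings and its 10-second default are invented here.

```shell
# Print sysctl settings that make the kernel panic on OOM and then reboot.
# $1: seconds to wait before rebooting after a panic (assumed default: 10)
oom_panic_settings() {
    delay="${1:-10}"
    printf 'vm.panic_on_oom = 1\n'
    printf 'kernel.panic = %s\n' "$delay"
}
```

The output could be written to a file under /etc/sysctl.d/ (the name 90-oom-panic.conf would only be an example) and loaded with sysctl --system.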
Fortunately, Linux is very flexible and allows us to solve this problem in another way as well. One of the most useful parameters for changing how Linux manages memory is overcommit_memory. To change this setting, we set one of the following values in the /proc/sys/vm/overcommit_memory file:
0 – Default setting. The system allows overcommitting memory, but with some restrictions. The kernel estimates how much memory it can hand to the application, taking into account all the memory that other running processes could release, provided they do not start using more. In practice, this means that the calculation is based on free swap space, free RAM, and cache space, which can be cleared at any time.
1 – The kernel allows allocating more memory than exists in the system, without any restriction. A malloc() call always succeeds.
2 – The kernel denies any request that would raise the total committed memory above the sum of the total swap space and the percentage of physical RAM specified in overcommit_ratio. The overcommit_ratio parameter lets the user specify how much of the physical memory is used in this calculation. The default value is 50 percent.
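The limit used in mode 2 can be estimated with simple arithmetic. The sketch below is a simplification of what the kernel does (it ignores details such as huge pages), and the function name commit_limit_kb is made up for this example.

```shell
# Estimate the mode 2 commit limit in kB: swap + RAM * ratio / 100.
# $1: physical RAM in kB, $2: swap in kB, $3: overcommit_ratio in percent
commit_limit_kb() {
    ram_kb="$1"
    swap_kb="$2"
    ratio="$3"
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}
```

For a hypothetical machine with 8388608 kB of RAM and 2097152 kB of swap, commit_limit_kb 8388608 2097152 50 prints 6291456.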
How can we adjust the overcommit value to our own needs?
Our example system has 4 GB of RAM physically installed and no swap space. The moment we change the setting from the default mode (0) to mode 2, the amount of memory available for allocation is halved, from 4 GB to 2 GB (we can check the current limit in the /proc/meminfo file). Because we also want to use the rest of our memory, we must change the value of overcommit_ratio to 100, and that is all! In this quick way, we secure our system by not allowing it to allocate more memory than is physically available. Of course, the right overcommit value differs between environments and applications, and it should be adjusted to the requirements of the running processes.
root@danielvm:~# echo 2 > /proc/sys/vm/overcommit_memory
root@danielvm:~# cat /proc/sys/vm/overcommit_memory
2
root@danielvm:~# cat /proc/sys/vm/overcommit_ratio
50
root@danielvm:~# cat /proc/meminfo | grep Comm
CommitLimit: 2014656 kB
Committed_AS: 1824592 kB
root@danielvm:~# echo 100 > /proc/sys/vm/overcommit_ratio
root@danielvm:~# cat /proc/meminfo | grep Comm
CommitLimit: 4029312 kB
Committed_AS: 2857600 kB
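The two meminfo fields shown in the session can be combined into a single number: the amount of memory that can still be committed before mode 2 starts refusing requests. The following is a minimal sketch; the helper name commit_headroom_kb is an assumption of this example, and it reads meminfo-formatted text from stdin.

```shell
# Print CommitLimit minus Committed_AS (in kB) for meminfo text on stdin.
commit_headroom_kb() {
    awk '$1 == "CommitLimit:"  { limit = $2 }
         $1 == "Committed_AS:" { used = $2 }
         END { print limit - used }'
}
```

On a live system it could be called as: commit_headroom_kb < /proc/meminfo. With the values from the first reading in the session (ratio 50), the result is 190064 kB; with the second reading (ratio 100) it is 1171712 kB.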
In addition, it is possible to manually set a priority for each process in the system, to protect it against the OOM killer. Two files located in /proc/PID/ are used for that: oom_score and oom_score_adj. The oom_score file shows the current score of the process; the greater the value, the lower the priority of the process and the higher the probability that it will be chosen and killed. For example, the init process has a score of 0, which in practice means that it will never be chosen by the OOM killer; it is excluded from the roulette game. In comparison, the Firefox process, which is running under a non-root account, has a score of 51.
root@danielvm:~# cat /proc/1/oom_score
0
root@danielvm:~# ps -elf | grep firefox
0 S daniel 1780 1 2 80 0 - 209165 poll_s 08:24 ? 00:00:46 /usr/lib/firefox/firefox
root@danielvm:~# cat /proc/1780/oom_score
51
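Rather than checking one process at a time, the scores of all processes can be listed to see which ones the OOM killer would prefer. This is a sketch only: the function name list_oom_scores is hypothetical, and the optional directory argument exists purely so that the function can be tried against a copy of the files instead of the real /proc.

```shell
# Print "score pid" pairs for all processes, highest oom_score first.
# $1: proc root directory (defaults to /proc)
list_oom_scores() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        [ -r "$dir/oom_score" ] || continue
        # A process may exit between the glob and the read; skip it quietly.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        printf '%s %s\n' "$score" "${dir##*/}"
    done | sort -rn
}
```

Calling list_oom_scores | head -n 5 would show the five most likely victims at that moment.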
To modify the score of a process, we set an appropriate value in its oom_score_adj file. Acceptable values range from -1000 to 1000, where a lower value means a higher rank for the process and a smaller chance that it will be killed. The lowest value (-1000) excludes the process from OOM killer selection entirely.
root@danielvm:~# echo -10 > /proc/1780/oom_score_adj
root@danielvm:~# cat /proc/1780/oom_score_adj
-10
root@danielvm:~# cat /proc/1780/oom_score
43
root@danielvm:~# echo -1000 > /proc/1780/oom_score_adj
root@danielvm:~# cat /proc/1780/oom_score
0
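Because a mistyped value is easy to write, the adjustment can be wrapped in a small helper that checks the accepted range first. This is a hedged sketch: the name set_oom_score_adj and the optional proc root argument are assumptions of this example, and on a real system lowering the value typically requires root privileges.

```shell
# Write an oom_score_adj value for a process after validating it.
# $1: PID, $2: value between -1000 and 1000, $3: proc root (default /proc)
set_oom_score_adj() {
    pid="$1"
    value="$2"
    proc_root="${3:-/proc}"
    # Accept only an optional leading minus sign followed by digits.
    case "$value" in
        ''|-|*[!0-9-]*|?*-*)
            echo "invalid value: $value" >&2
            return 1
            ;;
    esac
    if [ "$value" -lt -1000 ] || [ "$value" -gt 1000 ]; then
        echo "value out of range (-1000..1000): $value" >&2
        return 1
    fi
    printf '%s\n' "$value" > "$proc_root/$pid/oom_score_adj"
}
```

For the Firefox process from the session, set_oom_score_adj 1780 -10 would have the same effect as the first echo command.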
Keep in mind that the OOM killer is the default strategy used to release memory. Unless we change this behavior, there will always be a risk that some important process of our application is accidentally killed. Therefore, before we deploy our application in a production environment, we must carefully consider how the system should react when the OOM appears.