There are many OOM messages in the system log file on the node about some container. What does "OOM killer in rage" mean?
Nov 24 12:17:04 hwnode kernel: [758955.653217] OOM killer in rage, 16 tasks killed in ub 1234
Nov 24 12:17:04 hwnode kernel: [758955.653313] Out of memory in UB 1234: OOM killed process 21738 (php-cgi) score 0 vm:281988kB, rss:17004kB, swap:0kB
Nov 24 12:17:07 hwnode kernel: [758957.803617] Out of memory in UB 1234: OOM killed process 21747 (php-cgi) score 0 vm:281988kB, rss:17008kB, swap:0kB
The resources seem to be sufficient: there is plenty of memory left in the container. How can this situation be resolved?
If the container has hit its own memory limit, the recommendations are:
- increase the memory limit for the container
- reconfigure services to decrease memory usage, for both static and dynamic needs
If there is a global memory shortage on the node, the resource commitments should be reviewed or some containers should be moved to other nodes.
The OOM killer is invoked when processes of a container try to use memory over the limit. That memory was allocated earlier, and applications assume that previously allocated memory can be used.
Usage limit is set by swappages. There is an allocation limit, controlled by privvmpages, and if this limit is not set, in vSwap mode it is calculated automatically as:
privvmpages = (physpages + swappages) * vm_overcommit
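As a worked example of the formula above, here is a minimal sketch. The container values (1 GiB of RAM, 512 MiB of swap, an overcommit factor of 1.5), the 4 KiB page size, and the function name `auto_privvmpages` are assumptions chosen for illustration. They are not values read from a real node, and truncating the result to whole pages is also an assumption.

```python
def auto_privvmpages(physpages: int, swappages: int, vm_overcommit: float) -> int:
    """Apply privvmpages = (physpages + swappages) * vm_overcommit."""
    # Truncating to a whole number of pages is an assumption of this sketch.
    return int((physpages + swappages) * vm_overcommit)


PAGE_KIB = 4  # assumed page size: 4 KiB

# Hypothetical container: 1 GiB RAM, 512 MiB swap, overcommit factor 1.5.
physpages = 262144   # 262144 pages * 4 KiB = 1 GiB
swappages = 131072   # 131072 pages * 4 KiB = 512 MiB

limit = auto_privvmpages(physpages, swappages, 1.5)
print(limit, "pages =", limit * PAGE_KIB // 1024, "MiB")
# prints: 589824 pages = 2304 MiB
```

With these assumed numbers, the container could allocate up to about 2.25 GiB while being able to actually use only 1.5 GiB, which is the gap in which the OOM killer is triggered.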
If memory usage has reached the limit and a process tries to update a page over the limit, the OOM killer is invoked in the container.
Each time the OOM killer is called again within a short period of time (the OOM relaxation period), an in-kernel counter is increased. On the 10th consecutive call, it enters "berserker" mode and kills extra processes. The number of extra processes doubles with every further call, so 15 consecutive OOM killer calls kill the following number of processes per call:
- 1st-9th time: 1 process each
- 10th time: 2 processes
- 11th time: 3 processes
- 12th time: 5 processes
- 13th time: 9 processes
- 14th time: 17 processes
- 15th time: 33 processes
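The progression in the list above can be reproduced with a short model. This is a sketch inferred from the listed numbers (one regular kill plus a doubling number of extra kills), not the kernel's actual code. The names `RAGE_THRESHOLD` and `kills_for_call` are hypothetical.

```python
RAGE_THRESHOLD = 10  # consecutive call at which extra kills begin (from the list above)


def kills_for_call(n: int) -> int:
    """Processes killed on the n-th consecutive OOM killer call (model only)."""
    if n < 1:
        raise ValueError("call number starts at 1")
    if n < RAGE_THRESHOLD:
        return 1
    # One regular kill plus extra kills that double on each call: 1, 2, 4, ...
    return 1 + 2 ** (n - RAGE_THRESHOLD)


per_call = [kills_for_call(n) for n in range(1, 16)]
print(per_call[9:])   # calls 10..15 -> [2, 3, 5, 9, 17, 33]
print(sum(per_call))  # total over 15 calls -> 78
```

Under this model, 15 consecutive calls terminate 78 processes in total, which shows how quickly a container running many small workers can be emptied.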
When extra processes are killed, the following message is logged:
Nov 24 12:17:04 hwnode kernel: [758955.653217] OOM killer in rage, 16 tasks killed in ub 1234
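To find out which containers are affected, the log can be scanned for these messages. The sketch below matches only the two message formats quoted in this article; the regular expressions and the `summarize` helper are illustrative assumptions, and other kernel versions may word the messages differently.

```python
import re

# Patterns derived from the sample log lines in this article.
RAGE_RE = re.compile(r"OOM killer in rage, (\d+) tasks killed in ub (\d+)")
KILL_RE = re.compile(r"Out of memory in UB (\d+): OOM killed process (\d+) \(([^)]+)\)")


def summarize(lines):
    """Collect rage events and killed processes per container (UB) ID."""
    summary = {}

    def entry(ub):
        return summary.setdefault(ub, {"rage_events": 0, "killed": []})

    for line in lines:
        m = RAGE_RE.search(line)
        if m:
            entry(m.group(2))["rage_events"] += 1
            continue
        m = KILL_RE.search(line)
        if m:
            entry(m.group(1))["killed"].append((int(m.group(2)), m.group(3)))
    return summary


sample = [
    "Nov 24 12:17:04 hwnode kernel: [758955.653217] OOM killer in rage, 16 tasks killed in ub 1234",
    "Nov 24 12:17:04 hwnode kernel: [758955.653313] Out of memory in UB 1234: OOM killed process 21738 (php-cgi) score 0 vm:281988kB, rss:17004kB, swap:0kB",
    "Nov 24 12:17:07 hwnode kernel: [758957.803617] Out of memory in UB 1234: OOM killed process 21747 (php-cgi) score 0 vm:281988kB, rss:17008kB, swap:0kB",
]
print(summarize(sample))
```

For real use, the same function can be fed the lines of the node's system log file instead of the `sample` list.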
The OOM relaxation period is set to 1000 in-kernel ticks; in the Virtuozzo kernel, 1 tick occurs every millisecond. With this configuration, the "rage" counter is reset if no OOM happens for more than 1 second.
Though it is not recommended, the relaxation period can be adjusted:
~# sysctl -w vm.oom_relaxation=500
This means that it is considered safe to have 2 OOMs per second.
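The effect of the relaxation period on the "rage" counter can be illustrated with a simplified model. This is a sketch of the behavior described in this article, not the kernel implementation. The function name `rage_counter`, the counter starting at 1 on the first call, and the sample timestamps are all assumptions.

```python
def rage_counter(oom_times_ms, relaxation_ticks=1000, ms_per_tick=1):
    """Return the modeled counter value at each OOM event.

    oom_times_ms: ascending timestamps of OOM killer calls, in milliseconds.
    """
    window_ms = relaxation_ticks * ms_per_tick
    counts = []
    count = 0
    last = None
    for t in oom_times_ms:
        if last is not None and t - last > window_ms:
            count = 0  # quiet for longer than the relaxation period: reset
        count += 1
        counts.append(count)
        last = t
    return counts


# Hypothetical load: one OOM every 600 ms.
events = [0, 600, 1200, 1800]
print(rage_counter(events, relaxation_ticks=1000))  # [1, 2, 3, 4]
print(rage_counter(events, relaxation_ticks=500))   # [1, 1, 1, 1]
```

In this model, OOMs spaced 600 ms apart keep accumulating toward "berserker" mode with the default period of 1000 ticks, but never accumulate once the period is lowered to 500 ticks.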