From: David Rientjes

Adds a new sysctl, 'oom_kill_allocating_task', which will automatically kill
the OOM-triggering task instead of scanning through the tasklist to find a
memory-hogging target.  This is helpful for systems with an insanely large
number of tasks where scanning the tasklist significantly degrades
performance.

Cc: Andrea Arcangeli
Acked-by: Christoph Lameter
Signed-off-by: David Rientjes
Signed-off-by: Andrew Morton
---

 Documentation/sysctl/vm.txt |   22 ++++++++++++++++++++++
 kernel/sysctl.c             |    9 +++++++++
 mm/oom_kill.c               |   13 ++++++++-----
 3 files changed, 39 insertions(+), 5 deletions(-)

diff -puN Documentation/sysctl/vm.txt~oom-add-oom_kill_allocating_task-sysctl Documentation/sysctl/vm.txt
--- a/Documentation/sysctl/vm.txt~oom-add-oom_kill_allocating_task-sysctl
+++ a/Documentation/sysctl/vm.txt
@@ -31,6 +31,7 @@ Currently, these files are in /proc/sys/
 - min_unmapped_ratio
 - min_slab_ratio
 - panic_on_oom
+- oom_kill_allocating_task
 - mmap_min_address
 - numa_zonelist_order
 
@@ -220,6 +221,27 @@ The default value is 0.
 1 and 2 are for failover of clustering. Please select either
 according to your policy of failover.
 
+=============================================================
+
+oom_kill_allocating_task
+
+This enables or disables killing the OOM-triggering task in
+out-of-memory situations.
+
+If this is set to zero, the OOM killer will scan through the entire
+tasklist and select a task based on heuristics to kill.  This normally
+selects a rogue memory-hogging task that frees up a large amount of
+memory when killed.
+
+If this is set to non-zero, the OOM killer simply kills the task that
+triggered the out-of-memory condition.  This avoids the expensive
+tasklist scan.
+
+If panic_on_oom is selected, it takes precedence over whatever value
+is used in oom_kill_allocating_task.
+
+The default value is 0.
+
 ==============================================================
 
 mmap_min_addr
diff -puN kernel/sysctl.c~oom-add-oom_kill_allocating_task-sysctl kernel/sysctl.c
--- a/kernel/sysctl.c~oom-add-oom_kill_allocating_task-sysctl
+++ a/kernel/sysctl.c
@@ -63,6 +63,7 @@ extern int print_fatal_signals;
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern int sysctl_panic_on_oom;
+extern int sysctl_oom_kill_allocating_task;
 extern int max_threads;
 extern int core_uses_pid;
 extern int suid_dumpable;
@@ -773,6 +774,14 @@ static ctl_table vm_table[] = {
 		.proc_handler	= &proc_dointvec,
 	},
 	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "oom_kill_allocating_task",
+		.data		= &sysctl_oom_kill_allocating_task,
+		.maxlen		= sizeof(sysctl_oom_kill_allocating_task),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+	{
 		.ctl_name	= VM_OVERCOMMIT_RATIO,
 		.procname	= "overcommit_ratio",
 		.data		= &sysctl_overcommit_ratio,
diff -puN mm/oom_kill.c~oom-add-oom_kill_allocating_task-sysctl mm/oom_kill.c
--- a/mm/oom_kill.c~oom-add-oom_kill_allocating_task-sysctl
+++ a/mm/oom_kill.c
@@ -27,6 +27,7 @@
 #include
 
 int sysctl_panic_on_oom;
+int sysctl_oom_kill_allocating_task;
 static DEFINE_MUTEX(zone_scan_mutex);
 /* #define DEBUG */
 
@@ -471,14 +472,16 @@ void out_of_memory(struct zonelist *zone
 			"No available memory (MPOL_BIND)");
 		break;
 
-	case CONSTRAINT_CPUSET:
-		oom_kill_process(current, points,
-				"No available memory in cpuset");
-		break;
-
 	case CONSTRAINT_NONE:
 		if (sysctl_panic_on_oom)
 			panic("out of memory. panic_on_oom is selected\n");
+		/* Fall-through */
+	case CONSTRAINT_CPUSET:
+		if (sysctl_oom_kill_allocating_task) {
+			oom_kill_process(current, points,
+					"Out of memory (oom_kill_allocating_task)");
+			break;
+		}
 retry:
 		/*
 		 * Rambo mode: Shoot down a process and hope it solves whatever
_
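
For reviewers, the control flow that the mm/oom_kill.c hunk produces can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: the `MODEL_*` enum names and `model_oom_action()` are hypothetical, the memory-policy case is inferred from the "(MPOL_BIND)" context lines, and anything `out_of_memory()` does outside the quoted hunk is not modeled.

```c
#include <assert.h>

/* Hypothetical stand-ins for the constraint cases visible in the hunk. */
enum model_constraint {
	MODEL_CONSTRAINT_NONE,
	MODEL_CONSTRAINT_CPUSET,
	MODEL_CONSTRAINT_MEMORY_POLICY,	/* the "(MPOL_BIND)" case */
};

/* Hypothetical labels for the three outcomes the hunk can reach. */
enum model_action {
	MODEL_KILL_CURRENT,	/* oom_kill_process(current, ...) */
	MODEL_SCAN_TASKLIST,	/* falls to the retry: label and scans */
	MODEL_PANIC,		/* panic("out of memory. ...") */
};

/*
 * Mirrors the patched switch statement: panic_on_oom is checked first
 * for the unconstrained case, then control falls through to the cpuset
 * case, where the new sysctl decides between killing the allocating
 * task and the tasklist scan.
 */
enum model_action model_oom_action(enum model_constraint constraint,
				   int panic_on_oom,
				   int oom_kill_allocating_task)
{
	switch (constraint) {
	case MODEL_CONSTRAINT_MEMORY_POLICY:
		return MODEL_KILL_CURRENT;

	case MODEL_CONSTRAINT_NONE:
		if (panic_on_oom)
			return MODEL_PANIC;
		/* Fall-through */
	case MODEL_CONSTRAINT_CPUSET:
		if (oom_kill_allocating_task)
			return MODEL_KILL_CURRENT;
		return MODEL_SCAN_TASKLIST;
	}

	/* Only reachable with a value outside the enum. */
	assert(!"unknown constraint");
	return MODEL_SCAN_TASKLIST;
}
```

Two things stand out in this model. With the sysctl left at 0, a cpuset-constrained OOM now reaches the tasklist scan rather than killing current unconditionally as the removed lines did. And the panic_on_oom check in this hunk sits only on the unconstrained path, so within the quoted code it does not affect the cpuset case.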