oom: avoid race for oom killed tasks detaching mm prior to exit

A task detaches its ->mm prior to exiting, so in-progress oom kills or
already exiting tasks may be missed during the oom killer's tasklist
scan. When an eligible task is found with either TIF_MEMDIE or
PF_EXITING set, the oom killer is supposed to be a no-op to avoid
needlessly killing additional tasks.

Moving the !p->mm check after the TIF_MEMDIE and PF_EXITING checks
closes the race between a task detaching its ->mm and being removed from
the tasklist.

Out of memory conditions as the result of memory controllers will
automatically filter tasks that have detached their ->mm (since
task_in_mem_cgroup() will return 0). This is acceptable, however, since
memcg-constrained ooms aren't the result of a lack of memory resources
but rather a limit imposed by userspace that requires a task be killed
regardless.

Acked-by: Nick Piggin
Signed-off-by: David Rientjes
---
 mm/oom_kill.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -290,12 +290,6 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 	for_each_process(p) {
 		unsigned int points;
 
-		/*
-		 * skip kernel threads and tasks which have already released
-		 * their mm.
-		 */
-		if (!p->mm)
-			continue;
 		/* skip the init task */
 		if (is_global_init(p))
 			continue;
@@ -336,6 +330,12 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 			*ppoints = 1000;
 		}
 
+		/*
+		 * skip kernel threads and tasks which have already released
+		 * their mm.
+		 */
+		if (!p->mm)
+			continue;
 		if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
 			continue;
 
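
For reviewers who want to see the effect of the reordering in isolation,
the following is a minimal userspace sketch, not kernel code. The names
struct model_task, enum scan_result and model_scan() are hypothetical and
exist only for this illustration. The model is deliberately simplified:
it treats any task with TIF_MEMDIE or PF_EXITING as a reason to abort the
scan and leaves out the memcg and cpuset filtering and the other details
of select_bad_process().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task_struct fields the scan looks at. */
struct model_task {
	bool is_init;        /* models is_global_init(p) */
	bool memdie;         /* models TIF_MEMDIE being set */
	bool exiting;        /* models PF_EXITING being set */
	bool has_mm;         /* models p->mm != NULL */
	bool score_adj_min;  /* models oom_score_adj == OOM_SCORE_ADJ_MIN */
	unsigned int points; /* models the badness score */
};

enum scan_result { SCAN_NOTHING, SCAN_ABORT, SCAN_CHOSEN };

/*
 * Simplified model of the tasklist scan.
 * mm_check_first == true  models the ordering before this patch,
 * mm_check_first == false models the ordering after it.
 */
static enum scan_result model_scan(const struct model_task *tasks, size_t n,
				   bool mm_check_first,
				   const struct model_task **chosen)
{
	unsigned int best = 0;

	*chosen = NULL;
	for (size_t i = 0; i < n; i++) {
		const struct model_task *p = &tasks[i];

		/* old position of the check: hides dying tasks without an mm */
		if (mm_check_first && !p->has_mm)
			continue;
		if (p->is_init)
			continue;
		/* an oom kill is in progress or the task is exiting: no-op */
		if (p->memdie || p->exiting)
			return SCAN_ABORT;
		/* new position of the check */
		if (!p->has_mm)
			continue;
		if (p->score_adj_min)
			continue;
		if (p->points > best) {
			*chosen = p;
			best = p->points;
		}
	}
	return *chosen ? SCAN_CHOSEN : SCAN_NOTHING;
}
```

With a task list that holds one TIF_MEMDIE task that has already
detached its mm plus one ordinary task, the old ordering skips the dying
task and returns SCAN_CHOSEN for the ordinary one, while the new ordering
returns SCAN_ABORT and selects nothing.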