From: Oleg Nesterov

No changes in fs/exec.o

The for_each_process() loop in zap_threads() is very subtle, it is not
clear why we don't race with fork/exit/exec. Add the fat comment.

Also, change the code to use while_each_thread().

Signed-off-by: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
---

 fs/exec.c |   39 ++++++++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 7 deletions(-)

diff -puN fs/exec.c~coredump-zap_threads-comments-use-while_each_thread fs/exec.c
--- a/fs/exec.c~coredump-zap_threads-comments-use-while_each_thread
+++ a/fs/exec.c
@@ -1516,7 +1516,7 @@ static void zap_process(struct task_stru
 			sigaddset(&t->pending.signal, SIGKILL);
 			signal_wake_up(t, 1);
 		}
-	} while ((t = next_thread(t)) != start);
+	} while_each_thread(start, t);
 }
 
 static inline int zap_threads(struct task_struct *tsk, struct mm_struct *mm,
@@ -1538,7 +1538,36 @@ static inline int zap_threads(struct tas
 
 	if (atomic_read(&mm->mm_users) == mm->core_waiters + 1)
 		goto done;
-
+	/*
+	 * We should find and kill all tasks which use this mm, and we should
+	 * count them correctly into mm->core_waiters. We don't take tasklist
+	 * lock, but this is safe wrt:
+	 *
+	 * fork:
+	 *	None of sub-threads can fork after zap_process(leader). All
+	 *	processes which were created before this point should be
+	 *	visible to zap_threads() because copy_process() adds the new
+	 *	process to the tail of init_task.tasks list, and lock/unlock
+	 *	of ->siglock provides a memory barrier.
+	 *
+	 * do_exit:
+	 *	The caller holds mm->mmap_sem. This means that the task which
+	 *	uses this mm can't pass exit_mm(), so it can't exit or clear
+	 *	its ->mm.
+	 *
+	 * de_thread:
+	 *	It does list_replace_rcu(&leader->tasks, &current->tasks),
+	 *	we must see either old or new leader, this does not matter.
+	 *	However, it can change p->sighand, so lock_task_sighand(p)
+	 *	must be used. Since p->mm != NULL and we hold ->mmap_sem
+	 *	it can't fail.
+	 *
+	 *	Note also that "g" can be the old leader with ->mm == NULL
+	 *	and already unhashed and thus removed from ->thread_group.
+	 *	This is OK, __unhash_process()->list_del_rcu() does not
+	 *	clear the ->next pointer, we will find the new leader via
+	 *	next_thread().
+	 */
 	rcu_read_lock();
 	for_each_process(g) {
 		if (g == tsk->group_leader)
@@ -1549,17 +1578,13 @@ static inline int zap_threads(struct tas
 		do {
 			if (p->mm) {
 				if (unlikely(p->mm == mm)) {
-					/*
-					 * p->sighand can't disappear, but
-					 * may be changed by de_thread()
-					 */
 					lock_task_sighand(p, &flags);
 					zap_process(p);
 					unlock_task_sighand(p, &flags);
 				}
 				break;
 			}
-		} while ((p = next_thread(p)) != g);
+		} while_each_thread(g, p);
 	}
 	rcu_read_unlock();
 done:
_