From: Peter Williams

Problem:

As the comment above the calculation of max_pull in find_busiest_group()
states, negative results of the subtractions must not be allowed to wrap
around to large numbers. This has not been implemented for the
(max_load - busiest_load_per_task) expression, and the possible
consequence is undesirable movement of tasks from one group to another.
E.g. consider a NUMA system with two nodes, each node containing four
processors. If there are two processes in node-0 and node-1 is completely
idle, one of those processes will be moved to node-1, whereas the desired
behavior is to retain both processes in node-0.

Fix:

Make sure that max_load is greater than busiest_load_per_task before
making the calculation. If it isn't, max_pull will be zero and we skip
directly to out_balanced.

Signed-off-by: Peter Williams
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
---

 kernel/sched.c |    2 ++
 1 files changed, 2 insertions(+)

diff -puN kernel/sched.c~sched-protect-calculation-of-max_pull-from-integer-wrap kernel/sched.c
--- devel/kernel/sched.c~sched-protect-calculation-of-max_pull-from-integer-wrap	2006-04-05 21:28:25.000000000 -0700
+++ devel-akpm/kernel/sched.c	2006-04-05 21:28:25.000000000 -0700
@@ -2183,6 +2183,8 @@ find_busiest_group(struct sched_domain *
 	 * by pulling tasks to us.  Be careful of negative numbers as they'll
 	 * appear as very large values with unsigned longs.
 	 */
+	if (max_load <= busiest_load_per_task)
+		goto out_balanced;
 
 	/* Don't want to pull so many tasks that a group would go idle */
 	max_pull = min(max_load - avg_load, max_load - busiest_load_per_task);
_
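
For illustration only, the wrap described above can be reproduced outside
the kernel with a minimal sketch. The two helper functions and the load
values in the note that follows are hypothetical and are not part of
kernel/sched.c; returning 0 stands in for the "goto out_balanced" path.

```c
/*
 * Hypothetical stand-in for the unguarded expression. All operands are
 * unsigned long, so (max_load - busiest_load_per_task) wraps to a very
 * large value whenever max_load < busiest_load_per_task.
 */
static unsigned long max_pull_unguarded(unsigned long max_load,
					unsigned long avg_load,
					unsigned long busiest_load_per_task)
{
	unsigned long a = max_load - avg_load;
	unsigned long b = max_load - busiest_load_per_task;

	/* Equivalent of min(a, b): a wrapped b never wins the comparison. */
	return a < b ? a : b;
}

/*
 * Same calculation with the check from the patch in front of it.
 * Returning 0 here models skipping to out_balanced.
 */
static unsigned long max_pull_guarded(unsigned long max_load,
				      unsigned long avg_load,
				      unsigned long busiest_load_per_task)
{
	if (max_load <= busiest_load_per_task)
		return 0;

	return max_pull_unguarded(max_load, avg_load, busiest_load_per_task);
}
```

With the made-up values max_load = 100, avg_load = 50 and
busiest_load_per_task = 128, the unguarded helper returns 50 because the
second operand has wrapped and min() falls back to the first, so a pull
would still be attempted. The guarded helper returns 0 for the same input.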