From: Peter Williams <pwil3058@bigpond.net.au>

Problem:

As the comment above the calculation of max_pull in
find_busiest_group() states, negative results of the subtractions must
not be allowed to wrap around to large unsigned values.  This has not
been implemented for the (max_load - busiest_load_per_task) expression,
and the possible consequence is undesirable movement of tasks from one
group to another.  E.g. consider a NUMA system with two nodes, each
containing four processors.  If there are two processes in node-0 and
node-1 is completely idle, one of those processes will be moved to
node-1, whereas the desired behavior is to retain both processes in
node-0.

Fix:

Make sure that max_load is greater than busiest_load_per_task before
making the calculation.  If it isn't, max_pull would be zero, so we skip
directly to out_balanced.

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 kernel/sched.c |    2 ++
 1 files changed, 2 insertions(+)

diff -puN kernel/sched.c~sched-protect-calculation-of-max_pull-from-integer-wrap kernel/sched.c
--- devel/kernel/sched.c~sched-protect-calculation-of-max_pull-from-integer-wrap	2006-04-05 21:28:25.000000000 -0700
+++ devel-akpm/kernel/sched.c	2006-04-05 21:28:25.000000000 -0700
@@ -2183,6 +2183,8 @@ find_busiest_group(struct sched_domain *
 	 * by pulling tasks to us.  Be careful of negative numbers as they'll
 	 * appear as very large values with unsigned longs.
 	 */
+	if (max_load <= busiest_load_per_task)
+		goto out_balanced;
 
 	/* Don't want to pull so many tasks that a group would go idle */
 	max_pull = min(max_load - avg_load, max_load - busiest_load_per_task);
_