mempolicies: fix policy_zone check

There is a check in zonelist_policy() that compares the bitmap obtained
from the gfp mask by an AND with GFP_ZONETYPES against a zone number.

The bitmap is an ORed mask of __GFP_DMA, __GFP_DMA32 and __GFP_HIGHMEM.
The policy_zone is a zone number with the possible values of ZONE_DMA,
ZONE_DMA32, ZONE_HIGHMEM and ZONE_NORMAL. These are two different domains
of values.

For some reason this worked before (mostly??) when we always had 4 zones.
With the zone reduction patchset this check fails on systems with only
ZONE_DMA and ZONE_NORMAL if the system actually has memory in both zones.

This is because ZONE_NORMAL is selected using no __GFP flag at all
(and thus gfp_zone(...) == 0). ZONE_DMA is selected when __GFP_DMA is
set. __GFP_DMA is 0x01. policy_zone is set to ZONE_NORMAL (==1) if
ZONE_NORMAL and ZONE_DMA are populated.

gfp_zone() yields 0, which is < ZONE_NORMAL, and so policy is not applied
to regular memory allocations! Instead gfp_zone(__GFP_DMA) == 1, which
results in policy being applied to DMA allocations!

What we really want in that place is to establish the highest allowable
zone for a given gfp_mask. If the highest zone is higher than or equal to
the policy_zone then memory policies need to be applied. We have such
a highest_zone() function in page_alloc.c.

So move the highest_zone() function from mm/page_alloc.c into
include/linux/gfp.h. On the way we simplify the function, use the new
zone_type that was also introduced with the zone reduction patchset, and
specify the right type for the gfp flags parameter.

Signed-off-by: Christoph Lameter
Signed-off-by: Lee Schermerhorn

Index: test/mm/mempolicy.c
===================================================================
--- test.orig/mm/mempolicy.c	2006-07-15 14:53:08.000000000 -0700
+++ test/mm/mempolicy.c	2006-08-04 12:31:17.000000000 -0700
@@ -1096,7 +1096,7 @@
 	case MPOL_BIND:
 		/* Lower zones don't get a policy applied */
 		/* Careful: current->mems_allowed might have moved */
-		if (gfp_zone(gfp) >= policy_zone)
+		if (highest_zone(gfp) >= policy_zone)
 			if (cpuset_zonelist_valid_mems_allowed(policy->v.zonelist))
 				return policy->v.zonelist;
 		/*FALL THROUGH*/
Index: test/include/linux/gfp.h
===================================================================
--- test.orig/include/linux/gfp.h	2006-08-04 12:16:03.000000000 -0700
+++ test/include/linux/gfp.h	2006-08-04 12:31:14.000000000 -0700
@@ -85,6 +85,21 @@
 	return zone;
 }
 
+static inline enum zone_type highest_zone(gfp_t flags)
+{
+	if (flags & __GFP_DMA)
+		return ZONE_DMA;
+#ifdef CONFIG_ZONE_DMA32
+	if (flags & __GFP_DMA32)
+		return ZONE_DMA32;
+#endif
+#ifdef CONFIG_HIGHMEM
+	if (flags & __GFP_HIGHMEM)
+		return ZONE_HIGHMEM;
+#endif
+	return ZONE_NORMAL;
+}
+
 /*
  * There is only one page-allocator function, and two main namespaces to
  * it. The alloc_page*() variants return 'struct page *' and as such
Index: test/mm/page_alloc.c
===================================================================
--- test.orig/mm/page_alloc.c	2006-08-04 12:16:13.000000000 -0700
+++ test/mm/page_alloc.c	2006-08-04 12:55:21.000000000 -0700
@@ -1466,22 +1466,6 @@
 	return nr_zones;
 }
 
-static inline int highest_zone(int zone_bits)
-{
-	int res = ZONE_NORMAL;
-#ifdef CONFIG_HIGHMEM
-	if (zone_bits & (__force int)__GFP_HIGHMEM)
-		res = ZONE_HIGHMEM;
-#endif
-#ifdef CONFIG_ZONE_DMA32
-	if (zone_bits & (__force int)__GFP_DMA32)
-		res = ZONE_DMA32;
-#endif
-	if (zone_bits & (__force int)__GFP_DMA)
-		res = ZONE_DMA;
-	return res;
-}
-
 #ifdef CONFIG_NUMA
 #define MAX_NODE_LOAD (num_online_nodes())
 static int __meminitdata node_load[MAX_NUMNODES];
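
The two-zone failure described in the changelog can be modelled in
userspace. The sketch below is not kernel code: the enum, the
GFP_DMA_BIT constant and all function names are hypothetical stand-ins,
and the values (ZONE_DMA == 0, ZONE_NORMAL == 1, __GFP_DMA == 0x01) are
taken from the description above for a configuration with only ZONE_DMA
and ZONE_NORMAL.

```c
#include <assert.h>

/* Hypothetical model of a two-zone configuration (values from the changelog). */
enum model_zone { ZONE_DMA = 0, ZONE_NORMAL = 1 };

/* Stand-in for __GFP_DMA, which the changelog gives as 0x01. */
#define GFP_DMA_BIT 0x01u

/* Old behaviour as described: the zone bits of the mask are used as a zone number. */
static inline int model_gfp_zone(unsigned int flags)
{
	return (int)(flags & GFP_DMA_BIT);
}

/* New behaviour: map the mask to the highest zone it may allocate from. */
static inline enum model_zone model_highest_zone(unsigned int flags)
{
	if (flags & GFP_DMA_BIT)
		return ZONE_DMA;
	return ZONE_NORMAL;
}

/* Old check: compares a flag bitmap against a zone number. */
static inline int old_policy_applies(unsigned int flags, enum model_zone policy_zone)
{
	return model_gfp_zone(flags) >= (int)policy_zone;
}

/* New check: compares a zone number against a zone number. */
static inline int new_policy_applies(unsigned int flags, enum model_zone policy_zone)
{
	return model_highest_zone(flags) >= policy_zone;
}

/* Both zones populated, so policy_zone is ZONE_NORMAL. */
void check_two_zone_case(void)
{
	enum model_zone policy_zone = ZONE_NORMAL;

	/* Old check: regular allocations skip the policy, DMA ones get it. */
	assert(!old_policy_applies(0u, policy_zone));
	assert(old_policy_applies(GFP_DMA_BIT, policy_zone));

	/* New check: regular allocations get the policy, DMA ones do not. */
	assert(new_policy_applies(0u, policy_zone));
	assert(!new_policy_applies(GFP_DMA_BIT, policy_zone));
}
```

Calling check_two_zone_case() from a small main() exercises both
variants. The model leaves out ZONE_DMA32 and ZONE_HIGHMEM, which the
real highest_zone() handles under their config options.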