From: Christoph Lameter

On Tue, 10 Jan 2006, Andrew Morton wrote:

> RECLAIM_DISTANCE should have a comment describing wtf it does, what units
> it is in, etc.

Added a comment.

> Variable zone_reclaim_mode seems poorly named.  If it's boolean then it
> should be perform_zone_reclaim or something.  If we intend to extend it to
> more than two values later then those values should be enumerated now and
> we should explicitly compare for them.  Right now, I look at a statement
> like "zone_reclaim_mode = 1" and I don't know what it _does_.  Does it turn
> it on, or off, or what?

I think there is a chance that we will add flags to it later. However, these
flags should not be checked in the page allocator. It only should check
!= 0. Added a comment to that effect.

> last_unsuccessful_zone_reclaim could use a comment: what it does, what
> units it's in.

Added a comment.

> I dislike the open-coded initialisation of scan_control members.  It's just
> asking for us to get uninitialised variables if we later add something.
> But that's Nick's fault.  Doing
>
> 	struct scan_control sc = {
> 		...
> 	}
>
> is so much nicer, but sometimes inconvenient.

Changed.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
---

 include/linux/mmzone.h   |    5 +++++
 include/linux/topology.h |    5 +++++
 mm/vmscan.c              |   22 +++++++++++++---------
 3 files changed, 23 insertions(+), 9 deletions(-)

diff -puN include/linux/mmzone.h~zone-reclaim-reclaim-logic-tweaks include/linux/mmzone.h
--- 25/include/linux/mmzone.h~zone-reclaim-reclaim-logic-tweaks	Tue Jan 17 16:24:04 2006
+++ 25-akpm/include/linux/mmzone.h	Tue Jan 17 16:24:04 2006
@@ -152,6 +152,11 @@ struct zone {
 	/* A count of how many reclaimers are scanning this zone */
 	atomic_t		reclaim_in_progress;
 
+	/*
+	 * timestamp (in jiffies) of the last zone reclaim that did not
+	 * result in freeing of pages. This is used to avoid repeated scans
+	 * if all memory in the zone is in use.
+	 */
 	unsigned long		last_unsuccessful_zone_reclaim;
 
 	/*
diff -puN include/linux/topology.h~zone-reclaim-reclaim-logic-tweaks include/linux/topology.h
--- 25/include/linux/topology.h~zone-reclaim-reclaim-logic-tweaks	Tue Jan 17 16:24:04 2006
+++ 25-akpm/include/linux/topology.h	Tue Jan 17 16:24:04 2006
@@ -57,6 +57,11 @@
 #define node_distance(from,to)	((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE)
 #endif
 #ifndef RECLAIM_DISTANCE
+/*
+ * If the distance between nodes in a system is larger than RECLAIM_DISTANCE
+ * (in whatever arch specific measurement units returned by node_distance())
+ * then switch on zone reclaim on boot.
+ */
 #define RECLAIM_DISTANCE 20
 #endif
 #ifndef PENALTY_FOR_NODE_WITH_CPUS
diff -puN mm/vmscan.c~zone-reclaim-reclaim-logic-tweaks mm/vmscan.c
--- 25/mm/vmscan.c~zone-reclaim-reclaim-logic-tweaks	Tue Jan 17 16:24:04 2006
+++ 25-akpm/mm/vmscan.c	Tue Jan 17 16:24:04 2006
@@ -1579,6 +1579,10 @@ module_init(kswapd_init)
  *
  * If non-zero call zone_reclaim when the number of free pages falls below
  * the watermarks.
+ *
+ * In the future we may add flags to the mode. However, the page allocator
+ * should only have to check that zone_reclaim_mode != 0 before calling
+ * zone_reclaim().
  */
 int zone_reclaim_mode __read_mostly;
 
@@ -1591,10 +1595,18 @@ int zone_reclaim_mode __read_mostly;
  */
 int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 {
-	struct scan_control sc;
 	int nr_pages = 1 << order;
 	struct task_struct *p = current;
 	struct reclaim_state reclaim_state;
+	struct scan_control sc = {
+		.gfp_mask = gfp_mask,
+		.may_writepage = 0,
+		.may_swap = 0,
+		.nr_mapped = read_page_state(nr_mapped),
+		.nr_scanned = 0,
+		.nr_reclaimed = 0,
+		.priority = 0
+	};
 
 	if (!(gfp_mask & __GFP_WAIT) ||
 		zone->zone_pgdat->node_id != numa_node_id() ||
@@ -1606,14 +1618,6 @@ int zone_reclaim(struct zone *zone, gfp_
 		zone->last_unsuccessful_zone_reclaim + ZONE_RECLAIM_INTERVAL))
 			return 0;
 
-	sc.gfp_mask = gfp_mask;
-	sc.may_writepage = 0;
-	sc.may_swap = 0;
-	sc.nr_mapped = read_page_state(nr_mapped);
-	sc.nr_scanned = 0;
-	sc.nr_reclaimed = 0;
-	/* scan at the highest priority */
-	sc.priority = 0;
 	disable_swap_token();
 
 	if (nr_pages > SWAP_CLUSTER_MAX)
_