Additional features for zone reclaim

This patch adds the ability to shrink the slab caches if a zone runs out of
memory, or to start swapping out pages on a node.

The slab shrink has some issues since it is global and not related to the
zone. One could add support for zone specifications to the shrinker to make
that work.

Signed-off-by: Christoph Lameter

Index: linux-2.6.16-rc1-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.16-rc1-mm1.orig/mm/vmscan.c	2006-01-18 17:36:48.000000000 -0800
+++ linux-2.6.16-rc1-mm1/mm/vmscan.c	2006-01-18 17:54:01.000000000 -0800
@@ -1816,13 +1816,14 @@ module_init(kswapd_init)
  *
  * If non-zero call zone_reclaim when the number of free pages falls below
  * the watermarks.
- *
- * In the future we may add flags to the mode. However, the page allocator
- * should only have to check that zone_reclaim_mode != 0 before calling
- * zone_reclaim().
  */
 int zone_reclaim_mode __read_mostly;
 
+#define RECLAIM_OFF 0
+#define RECLAIM_ZONE (1<<0)	/* Run shrink_cache on the zone */
+#define RECLAIM_SWAP (1<<1)	/* Shrink_cache with swap out */
+#define RECLAIM_SLAB (1<<2)	/* Do a global slab shrink if the zone is out of memory */
+
 /*
  * Mininum time between zone reclaim scans
  */
@@ -1850,7 +1851,7 @@ int zone_reclaim(struct zone *zone, gfp_
 		return 0;
 
 	sc.may_writepage = 0;
-	sc.may_swap = 0;
+	sc.may_swap = !!(zone_reclaim_mode & RECLAIM_SWAP);
 	sc.nr_scanned = 0;
 	sc.nr_reclaimed = 0;
 	sc.priority = 0;
@@ -1869,7 +1870,11 @@ int zone_reclaim(struct zone *zone, gfp_
 	p->flags |= PF_MEMALLOC;
 	reclaim_state.reclaimed_slab = 0;
 	p->reclaim_state = &reclaim_state;
-	shrink_zone(zone, &sc);
+
+	if (zone_reclaim_mode & (RECLAIM_ZONE|RECLAIM_SWAP))
+		shrink_zone(zone, &sc);
+	if (sc.nr_reclaimed < nr_pages && (zone_reclaim_mode & RECLAIM_SLAB))
+		sc.nr_reclaimed = shrink_slab(sc.nr_scanned, gfp_mask, order);
 	p->reclaim_state = NULL;
 	current->flags &= ~PF_MEMALLOC;
 
Index: linux-2.6.16-rc1-mm1/Documentation/sysctl/vm.txt
===================================================================
--- linux-2.6.16-rc1-mm1.orig/Documentation/sysctl/vm.txt	2006-01-18 17:41:29.000000000 -0800
+++ linux-2.6.16-rc1-mm1/Documentation/sysctl/vm.txt	2006-01-18 17:47:00.000000000 -0800
@@ -126,6 +126,15 @@ the high water marks for each per cpu pa
 
 zone_reclaim_mode:
 
+This allows setting more or less aggressive forms of reclaiming memory
+when a zone runs out of memory.
+
+This is an ORed together value of
+
+1 = Zone reclaim without swapout
+2 = Zone reclaim with swapping out pages
+4 = Slab reclaim when zone is out of memory
+
 This is set during bootup to 1 if it is determined that pages from
 remote zones will cause a significant performance reduction. The
 page allocator will then reclaim easily reusable pages (those page
@@ -135,8 +144,10 @@ The user can override this setting. It m
 off zone reclaim if the system is used for a file server and all
 of memory should be used for caching files from disk.
 
-It may be beneficial to switch this on if one wants to do zone
-reclaim regardless of the numa distances in the system.
+It may be advisable to set Slab reclaim if the system makes heavy
+use of files and builds up large slab caches and no longer has
+sufficient local memory available. Note that the slab shrink is global
+and may free slab entries on other nodes.
 
 ================================================================
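
Note for reviewers: the following is a minimal userspace sketch, not part of
the patch, of how an ORed zone_reclaim_mode value is meant to decode. It
assumes the bit values listed in the vm.txt text (1, 2, 4). The names
struct reclaim_plan, plan_for_mode() and wants_slab_shrink() are hypothetical
and only mirror the three tests the patch adds to zone_reclaim().

```c
#include <stdbool.h>

/* Mode bits, assuming the values documented in vm.txt (1, 2, 4). */
#define RECLAIM_OFF	0
#define RECLAIM_ZONE	(1 << 0)	/* zone reclaim without swapout */
#define RECLAIM_SWAP	(1 << 1)	/* zone reclaim with swapout */
#define RECLAIM_SLAB	(1 << 2)	/* global slab shrink */

/* Hypothetical helper type for illustration; not a kernel structure. */
struct reclaim_plan {
	bool may_swap;		/* mirrors sc.may_swap */
	bool shrink_zone;	/* would shrink_zone() be called? */
	bool slab_allowed;	/* is the slab shrink enabled at all? */
};

/* Decode a mode value the same way the patched zone_reclaim() tests it. */
static struct reclaim_plan plan_for_mode(int mode)
{
	struct reclaim_plan plan;

	plan.may_swap = !!(mode & RECLAIM_SWAP);
	plan.shrink_zone = !!(mode & (RECLAIM_ZONE | RECLAIM_SWAP));
	plan.slab_allowed = !!(mode & RECLAIM_SLAB);
	return plan;
}

/*
 * The slab shrink only runs when the zone scan did not free enough
 * pages and the RECLAIM_SLAB bit is set.
 */
static bool wants_slab_shrink(int mode, unsigned long nr_reclaimed,
			      unsigned long nr_pages)
{
	return nr_reclaimed < nr_pages && (mode & RECLAIM_SLAB) != 0;
}
```

With these values, a mode of 5 (1 | 4) requests zone reclaim without swapout
plus a slab shrink as fallback, while a mode of 4 alone skips the zone scan
and goes straight to the slab shrink whenever pages are still needed.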