SLUB: Free slabs and sort partial slab lists in kmem_cache_shrink

At kmem_cache_shrink, check if we have any empty slabs on the partial
lists and, if so, remove them.  Also, as an anti-fragmentation measure,
sort the partial slabs so that the most fully allocated ones come first
and the least allocated ones last.

The next allocations may fill up the nearly full slabs.  Having the
least allocated slabs last gives them the maximum chance that their
remaining objects may be freed.  Thus we can hopefully minimize the
number of partial slabs.

I think this is the best one can do in terms of anti-fragmentation
measures.  Real defragmentation (meaning moving objects out of the
slabs with the fewest objects in use into those that are almost full)
can be implemented by reverse scanning through the list produced here,
but that would mean that we need to provide a callback at slab cache
creation that allows the deletion or moving of an object.  This will
involve slab API changes, so defer for now.

Signed-off-by: Christoph Lameter

Index: linux-2.6.21-rc6/mm/slub.c
===================================================================
--- linux-2.6.21-rc6.orig/mm/slub.c	2007-04-16 21:35:32.000000000 -0700
+++ linux-2.6.21-rc6/mm/slub.c	2007-04-16 21:43:51.000000000 -0700
@@ -2178,6 +2178,77 @@ void kfree(const void *x)
 }
 EXPORT_SYMBOL(kfree);
 
+/*
+ * kmem_cache_shrink removes empty slabs from the partial lists
+ * and then sorts the partially allocated slabs by the number
+ * of items in use. The slabs with the most items in use
+ * come first. New allocations will remove these from the
+ * partial list because they are full. The slabs with the
+ * least items are placed last. If it happens that the objects
+ * are freed then the page can be returned to the page allocator.
+ */
+int kmem_cache_shrink(struct kmem_cache *s)
+{
+	int node;
+	int i;
+	struct kmem_cache_node *n;
+	struct page *page;
+	struct page *t;
+	struct list_head *slabs_by_inuse =
+		kmalloc(sizeof(struct list_head) * s->objects, GFP_KERNEL);
+	unsigned long flags;
+
+	if (!slabs_by_inuse)
+		return -ENOMEM;
+
+	flush_all(s);
+	for_each_node(node) {
+		n = get_node(s, node);
+
+		/*
+		 * If there are just a minimum number of partial slabs
+		 * then do not bother.
+		 */
+		if (n->nr_partial <= MIN_PARTIAL)
+			continue;
+
+		for (i = 0; i < s->objects; i++)
+			INIT_LIST_HEAD(slabs_by_inuse + i);
+
+		spin_lock_irqsave(&n->list_lock, flags);
+
+		/*
+		 * Build lists indexed by the items in use in
+		 * each slab or free slabs if empty.
+		 *
+		 * Note that concurrent frees may occur while
+		 * we hold the list_lock. page->inuse here is
+		 * the upper limit.
+		 */
+		list_for_each_entry_safe(page, t, &n->partial, lru) {
+			if (!page->inuse) {
+				list_del(&page->lru);
+				discard_slab(s, page);
+			} else
+				list_move(&page->lru,
+					slabs_by_inuse + page->inuse);
+		}
+
+		/*
+		 * Rebuild the partial list with the slabs filled up
+		 * most first and the least used slabs at the end.
+		 */
+		for (i = s->objects - 1; i > 0; i--)
+			list_splice(slabs_by_inuse + i, n->partial.prev);
+
+		spin_unlock_irqrestore(&n->list_lock, flags);
+	}
+
+	kfree(slabs_by_inuse);
+	return 0;
+}
+EXPORT_SYMBOL(kmem_cache_shrink);
+
 /**
  * krealloc - reallocate memory. The contents will remain unchanged.
  *
@@ -2422,17 +2493,6 @@ static struct notifier_block __cpuinitda
 
 #endif
 
-/***************************************************************
- *		Compatiblility definitions
- **************************************************************/
-
-int kmem_cache_shrink(struct kmem_cache *s)
-{
-	flush_all(s);
-	return 0;
-}
-EXPORT_SYMBOL(kmem_cache_shrink);
-
 #ifdef CONFIG_NUMA
 
 /*****************************************************************
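
For readers who want to play with the sorting scheme outside the kernel,
here is a minimal user-space sketch of the same idea.  It is an
illustration only: fake_slab, bucket_slabs, rebuild_partial and
OBJECTS_PER_SLAB are made-up names standing in for struct page, the
partial list walk and s->objects; the real code above uses list_head,
list_move/list_splice and holds n->list_lock while it runs.  Slabs are
distributed into an array of lists indexed by their inuse count, empty
slabs are freed, and the partial list is rebuilt fullest-first in O(n).

#include <stdio.h>
#include <stdlib.h>

#define OBJECTS_PER_SLAB 8			/* stands in for s->objects */

struct fake_slab {
	int inuse;				/* objects allocated in this slab */
	struct fake_slab *next;
};

/* Free empty slabs, push the rest onto the bucket matching their inuse count. */
static void bucket_slabs(struct fake_slab **partial, struct fake_slab **buckets)
{
	struct fake_slab *slab, *next;

	for (slab = *partial; slab; slab = next) {
		next = slab->next;
		if (!slab->inuse) {
			free(slab);		/* empty slab: give it back */
			continue;
		}
		slab->next = buckets[slab->inuse];
		buckets[slab->inuse] = slab;
	}
	*partial = NULL;
}

/* Concatenate the buckets from highest inuse to lowest into one list. */
static struct fake_slab *rebuild_partial(struct fake_slab **buckets)
{
	struct fake_slab *head = NULL;
	struct fake_slab **tail = &head;
	int i;

	for (i = OBJECTS_PER_SLAB - 1; i > 0; i--) {
		struct fake_slab *slab;

		for (slab = buckets[i]; slab; slab = slab->next) {
			*tail = slab;
			tail = &slab->next;
		}
	}
	*tail = NULL;
	return head;
}

int main(void)
{
	static const int counts[] = { 3, 0, 7, 1, 5, 0, 2 };
	struct fake_slab *buckets[OBJECTS_PER_SLAB] = { NULL };
	struct fake_slab *partial = NULL, *slab, *next;
	size_t i;

	/* Build a partial list mixing empty and partly used slabs. */
	for (i = 0; i < sizeof(counts) / sizeof(counts[0]); i++) {
		slab = malloc(sizeof(*slab));
		if (!slab)
			return 1;
		slab->inuse = counts[i];
		slab->next = partial;
		partial = slab;
	}

	bucket_slabs(&partial, buckets);
	partial = rebuild_partial(buckets);

	/* Prints: 7 5 3 2 1 (empties freed, fullest slabs first). */
	for (slab = partial; slab; slab = slab->next)
		printf("%d ", slab->inuse);
	printf("\n");

	for (slab = partial; slab; slab = next) {
		next = slab->next;
		free(slab);
	}
	return 0;
}

The point of bucketing by inuse rather than doing a comparison sort is
that the whole pass stays linear in the number of partial slabs, which
matters since the kernel version does this while holding the per-node
list_lock with interrupts disabled.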