SLUB: Fix cpu slab flushing behavior

Currently we check for keventd in slab_alloc() and only schedule the
event that checks for inactive cpu slabs if keventd is up. This was
done in the belief that later allocations would start the checking for
all slabs. However, that is not true for slab caches that are only
allocated from during boot: we then have per cpu slabs but
s->cpu_slabs is zero. As a result flush_all() will not flush the cpu
slabs of these slab caches, and slab validation will report counter
inconsistencies.

Fix that by removing the check for keventd from slab_alloc(). Instead
we set cpu_slabs to 1 during boot so that slab_alloc() believes a
check is already scheduled and will not try to schedule one via
keventd during early boot. Later, when sysfs is brought up, we have to
scan through the list of boot caches anyway. At that point we simply
flush all active slabs, which sets cpu_slabs back to zero. Any new cpu
slab allocated after sysfs init will then trigger the inactive cpu
slab check.

Signed-off-by: Christoph Lameter

---
 mm/slub.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

Index: linux-2.6.21-rc6/mm/slub.c
===================================================================
--- linux-2.6.21-rc6.orig/mm/slub.c	2007-04-11 12:27:13.000000000 -0700
+++ linux-2.6.21-rc6/mm/slub.c	2007-04-11 13:00:08.000000000 -0700
@@ -65,7 +65,7 @@
  * SLUB assigns one slab for allocation to each processor.
  * Allocations only occur from these slabs called cpu slabs.
  *
- * If a cpu slab exists then a workqueue thread checks every 10
+ * If a cpu slab exists then a workqueue thread checks every 30
 * seconds if the cpu slab is still in use. The cpu slab is pushed back
 * to the list if inactive [only needed for SMP].
 *
@@ -1201,7 +1201,7 @@
 have_slab:
 	SetPageActive(page);
 #ifdef CONFIG_SMP
-	if (!atomic_read(&s->cpu_slabs) && keventd_up()) {
+	if (!atomic_read(&s->cpu_slabs)) {
 		atomic_inc(&s->cpu_slabs);
 		schedule_delayed_work(&s->flush, 30 * HZ);
 	}
@@ -1709,7 +1709,20 @@ static int kmem_cache_open(struct kmem_c
 #ifdef CONFIG_SMP
 	mutex_init(&s->flushing);
-	atomic_set(&s->cpu_slabs, 0);
+	if (slab_state >= SYSFS)
+		atomic_set(&s->cpu_slabs, 0);
+	else
+		/*
+		 * Keventd may not be up yet. Pretend that we have active
+		 * per_cpu slabs so that there will be no attempt to
+		 * schedule a flusher in slab_alloc.
+		 *
+		 * We fix the situation up later when sysfs is brought up
+		 * by flushing all slabs (which puts the slab caches that
+		 * are mostly/only used in a nice quiet state).
+		 */
+		atomic_set(&s->cpu_slabs, 1);
+
 	INIT_DELAYED_WORK(&s->flush, flusher);
 #endif
 	if (init_kmem_cache_nodes(s, gfpflags & ~SLUB_DMA))
@@ -3239,6 +3252,13 @@ int __init slab_sysfs_init(void)
 		err = sysfs_slab_add(s);
 		BUG_ON(err);
+		/*
+		 * Start the periodic checks for inactive cpu slabs.
+		 * flush_all() will zero s->cpu_slabs which will cause
+		 * any allocation of a new cpu slab to schedule an event
+		 * via keventd to watch for inactive cpu slabs.
+		 */
+		flush_all(s);
 	}

 	while (alias_list) {