To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: suresh.b.siddha@intel.com Cc: corey.d.gough@intel.com Cc: Pekka Enberg Cc: akpm@linux-foundation.org Subject: [RFC] SLUB patches for more functionality, performance and maintenance This series here contains a number of patches that need some discussion or evaluation. They apply on top of 2.6.22-rc6-mm1 + slub patches already in mm. 1. Page allocator pass through SLOB does pass through all larger kmalloc requests directly to the page allocator. The advantage is that the allocator overhead is eliminated and that we do not need to provide kmalloc slabs >= PAGE_SIZE. This patch does the same thing in SLUB. For allocation whose size is know the call to the slab may be converted to a call to the page allocator at compile time. The result of doing so is also that the behavior of SLUB for pagesize and higher kmalloc allocation will conform to SLAB. Meaning large kmalloc allocations are no longer debuggable and are guaranteed to be page aligned. Do we want this? 2. A series of performance enhancements patches The patches improve the producer / consumer scenario. If objects are always allocated on one processor and released on another then both will use distinct cachelines to store their information in order to avoid a bouncing cacheline. In order to do so we have to introduce a per cpu structure to keep per cpu allocation lists in distinct cachelines from the remote free information in the page struct. If we introduce a per cpu structure then we also need to allocate that in a NUMA aware fashion from the local node. Having a per cpu structure allows to avoid the use of certain fields in the page struct which in turn allows us to avoid using page->mapping and increasing the maximum number of objects per slab. More optimizations become possible by shifting information from the kmem_cache structure that is used in the hotpath to the per cpu structure thereby minimizing cacheline coverage. Finally there is an implementation of slab_alloc/slab_free using a cmpxchg instead of current interrupt enable disable approach. This was inspired by LTTng's approach. A cmpxchg is less costly than interrupt enabe/disable but means more complexity in managing the resulting race conditions. The disadvantage is that the allocation / free paths become very complex and fragile. All of these patches need to be evaluated as to what impact they have on a variety of loads. 3. Removal of SLOB and SLAB. We would like to consolidate and only have one slab allocator in the future. Two patches are included that remove SLOB and SLAB. There is only minimal justification for retaining SLOB. So I think we could remove SLOB for 2.6.23. SLAB is the reference that SLUB must be measured against to avoid regressions. On the other hand it will be a problem to support new functionality like Slab defragmentation since its design makes it difficult to implement comparable features. So I think that we need to keep SLAB around for one more cycle and then we may be able to get rid of it. Or we can keep it in the tree for awhile and produce more and more shims for new slab functionality.