Check for virtual memmap node/node overlaps

There is the danger of a virtual memory map block reaching from the end
of one node into the next. In that case the page struct for a page on
one node would live on another node, which could impact performance.

Note that an overlap can only occur if:

1. The hole between nodes is smaller than the memory covered by one
   page of the vmemmap. That is:

	x86_64:  4K page = 256KB
	x86_64:  2M page = 128MB
	IA64:   16K page = 2MB
	IA64:   16M page = 4GB

2. The tail of the prior node does not properly align with the vmemmap
   page covering it.

3. The start of the node does not properly align with the vmemmap page.

The effect of an overlap may not be a problem at all because:

A. The early pages in a node are allocated for system use (node tables,
   slab pages, vmemmap blocks for the node). Their page structs are
   never or only extremely infrequently referenced.

B. NUMA distances do not matter if we are running x86_64 with NUMA
   emulation.

The check is run shortly before the system spawns init. It scans the
early page structs of all nodes and verifies that they are on the
correct node. If it finds page structs that reside on the preceding
node, it reports them, distinguishing between system pages (which are
not a problem) and user pages (which could impact performance).

Signed-off-by: Christoph Lameter

Index: linux-2.6.21-rc5-mm4/mm/sparse.c
===================================================================
--- linux-2.6.21-rc5-mm4.orig/mm/sparse.c	2007-04-04 10:32:39.000000000 -0700
+++ linux-2.6.21-rc5-mm4/mm/sparse.c	2007-04-04 10:43:57.000000000 -0700
@@ -585,3 +585,74 @@ out:
 	__kfree_section_memmap(memmap, nr_pages);
 	return ret;
 }
+
+#if defined(CONFIG_SPARSE_VIRTUAL) && defined(CONFIG_NUMA)
+/*
+ * This function checks the virtual memmap for possible overlaps
+ * caused by a block of page structs that describes pages belonging
+ * to another node. It is run late in boot so that the system has
+ * had ample opportunity to consume potentially overlapping pages
+ * for other purposes. We distinguish between pages available to
+ * the user and those used by the system. Of those, only the user
+ * pages may pose a performance issue because their page struct is
+ * on a different node than the page itself.
+ */
+static unsigned long report_mismatching_page_structs(unsigned long start,
+		unsigned long end, int free, int node)
+{
+	unsigned long pfn = start;
+	const char *memtype;
+
+	while (pfn < end && pfn_valid(pfn)) {
+		struct page *page = pfn_to_page(pfn);
+
+		if (PageBuddy(page) != free)
+			break;
+
+		if (early_pfn_to_nid(pfn) != node)
+			break;
+
+		if (PageBuddy(page))
+			pfn += 1 << page->private;
+		else
+			pfn++;
+	}
+
+	if (free)
+		memtype = "User";
+	else
+		memtype = "System";
+
+	printk(KERN_INFO "%s pages (%p-%p) have page_struct on node %d\n", memtype,
+		pfn_to_kaddr(start), pfn_to_kaddr(pfn), early_pfn_to_nid(start));
+
+	return pfn;
+}
+
+static int check_vmemmap(void)
+{
+	int node;
+
+	for_each_online_node(node) {
+		unsigned long start = NODE_DATA(node)->node_start_pfn;
+		unsigned long end = start +
+			NODE_DATA(node)->node_spanned_pages;
+
+		printk(KERN_INFO "VMEMMAP: Checking page_struct/page "
+			"mismatches on node %d\n", node);
+
+		while (pfn_valid(start) &&
+				early_pfn_to_nid(start) != node) {
+
+			int free = PageBuddy(pfn_to_page(start));
+
+			start = report_mismatching_page_structs(start,
+					end, free, node);
+		}
+	}
+	return 0;
+}
+
+late_initcall(check_vmemmap);
+#endif
+
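
A note on the coverage figures quoted in the description (not part of the
patch): the memory described by one vmemmap block is the number of page
structs that fit into the block times the base page size. The standalone
userspace sketch below reproduces the x86_64 figures; the 64-byte struct
page size and the 4K base page size are assumptions for illustration, and
the IA64 figures follow from the same formula with that architecture's
struct page and base page sizes.

#include <stdio.h>

/*
 * Memory described by one vmemmap block:
 * (page structs per block) * (memory covered by each page struct)
 */
static unsigned long long vmemmap_coverage(unsigned long long block_size,
					   unsigned long long page_struct_size,
					   unsigned long long base_page_size)
{
	return (block_size / page_struct_size) * base_page_size;
}

int main(void)
{
	/* Assumed: 64-byte struct page, 4K base pages on x86_64 */
	printf("x86_64, 4K vmemmap block covers %lluKB\n",
	       vmemmap_coverage(4096, 64, 4096) >> 10);
	printf("x86_64, 2M vmemmap block covers %lluMB\n",
	       vmemmap_coverage(2ULL << 20, 64, 4096) >> 20);
	return 0;
}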