From andrea@qumranet.com Tue Jan 29 06:25:37 2008 Date: Tue, 29 Jan 2008 15:24:52 +0100 From: Andrea Arcangeli To: Christoph Lameter Cc: Robin Holt , Avi Kivity , Izik Eidus , Nick Piggin , kvm-devel@lists.sourceforge.net, Benjamin Herrenschmidt , Peter Zijlstra , steiner@sgi.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, daniel.blueman@quadrics.com, Hugh Dickins Subject: Re: [patch 4/6] MMU notifier: invalidate_page callbacks using Linux rmaps This should fix the aging bugs you introduced through the faulty cpp expansion. This is hard to write for me, given any time somebody does a ptep_clear_flush_young w/o manually cpp-expandin "| mmu_notifier_age_page" after it, it's always a bug that needs fixing, similar bugs can emerge with time for ptep_clear_flush too. What will happen is that somebody will cleanup in 26+ and we'll remain with a #ifdef KERNEL_VERSION() < 2.6.26 in ksm.c to call mmu_notifier(invalidate_page) explicitly. Performance and optimizations or unnecessary invalidate_page are a red-herring, it can be fully optimized both ways. 99% of the time when somebody calls ptep_clear_flush and ptep_clear_flush_young, the respective mmu notifier can't be forgotten (and calling them once more even if a later invalidate_range is invoked, is always safer and preferable than not calling them at all) so I fail to see how this will not be cleaned up eventually, the same way the tlb flushes have been cleaned up already. Nevertheless I back your implementation and I'm not even trying at changing it with the risk to slowdown merging. Signed-off-by: Andrea Arcangeli --- mm/rmap.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) Index: linux-2.6/mm/rmap.c =================================================================== --- linux-2.6.orig/mm/rmap.c 2008-01-28 18:41:19.000000000 -0800 +++ linux-2.6/mm/rmap.c 2008-01-29 11:52:32.000000000 -0800 @@ -285,10 +285,8 @@ static int page_referenced_one(struct pa if (!pte) goto out; - if (ptep_clear_flush_young(vma, address, pte)) - referenced++; - - if (mmu_notifier_age_page(mm, address)) + if (ptep_clear_flush_young(vma, address, pte) | + mmu_notifier_age_page(mm, address)) referenced++; /* Pretend the page is referenced if the task has the @@ -684,7 +682,7 @@ static int try_to_unmap_one(struct page * skipped over this mm) then we should reactivate it. */ if (!migration && ((vma->vm_flags & VM_LOCKED) || - (ptep_clear_flush_young(vma, address, pte) || + (ptep_clear_flush_young(vma, address, pte) | mmu_notifier_age_page(mm, address)))) { ret = SWAP_FAIL; goto out_unmap; @@ -818,10 +816,8 @@ static void try_to_unmap_cluster(unsigne page = vm_normal_page(vma, address, *pte); BUG_ON(!page || PageAnon(page)); - if (ptep_clear_flush_young(vma, address, pte)) - continue; - - if (mmu_notifier_age_page(mm, address)) + if (ptep_clear_flush_young(vma, address, pte) | + mmu_notifier_age_page(mm, address)) continue; /* Nuke the page table entry. */