From: Nick Piggin

lock_page needs the caller to have a reference on the page->mapping inode
due to sync_page, ergo set_page_dirty_lock is obviously buggy according to
its comments.

Solve it by introducing a new lock_page_nosync which does not do a sync_page.

akpm: unpleasant solution to an unpleasant problem.  If it goes wrong it could
cause great slowdowns while the lock_page() caller waits for kblockd to
perform the unplug.  And if a filesystem has special sync_page() requirements
(none presently do), permanent hangs are possible.

otoh, set_page_dirty_lock() is usually (always?) called against userspace
pages.  They are always up-to-date, so there shouldn't be any pending read I/O
against these pages.

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
---

 include/linux/pagemap.h |   15 +++++++++++++++
 mm/filemap.c            |   17 +++++++++++++++++
 mm/page-writeback.c     |    2 +-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff -puN include/linux/pagemap.h~mm-non-syncing-lock_page include/linux/pagemap.h
--- a/include/linux/pagemap.h~mm-non-syncing-lock_page
+++ a/include/linux/pagemap.h
@@ -130,14 +130,29 @@ static inline pgoff_t linear_page_index(
 }

 extern void FASTCALL(__lock_page(struct page *page));
+extern void FASTCALL(__lock_page_nosync(struct page *page));
 extern void FASTCALL(unlock_page(struct page *page));

+/*
+ * lock_page may only be called if we have the page's inode pinned.
+ */
 static inline void lock_page(struct page *page)
 {
 	might_sleep();
 	if (TestSetPageLocked(page))
 		__lock_page(page);
 }
+
+/*
+ * lock_page_nosync should only be used if we can't pin the page's inode.
+ * Doesn't play quite so well with block device plugging.
+ */
+static inline void lock_page_nosync(struct page *page)
+{
+	might_sleep();
+	if (TestSetPageLocked(page))
+		__lock_page_nosync(page);
+}

 /*
  * This is exported only for wait_on_page_locked/wait_on_page_writeback.
diff -puN mm/filemap.c~mm-non-syncing-lock_page mm/filemap.c
--- a/mm/filemap.c~mm-non-syncing-lock_page
+++ a/mm/filemap.c
@@ -488,6 +488,12 @@ struct page *page_cache_alloc_cold(struc
 EXPORT_SYMBOL(page_cache_alloc_cold);
 #endif

+static int __sleep_on_page_lock(void *word)
+{
+	io_schedule();
+	return 0;
+}
+
 /*
  * In order to wait for pages to become available there must be
  * waitqueues associated with pages. By using a hash table of
@@ -577,6 +583,17 @@ void fastcall __lock_page(struct page *p
 }
 EXPORT_SYMBOL(__lock_page);

+/*
+ * Variant of lock_page that does not require the caller to hold a reference
+ * on the page's mapping.
+ */
+void fastcall __lock_page_nosync(struct page *page)
+{
+	DEFINE_WAIT_BIT(wait, &page->flags, PG_locked);
+	__wait_on_bit_lock(page_waitqueue(page), &wait, __sleep_on_page_lock,
+							TASK_UNINTERRUPTIBLE);
+}
+
 /**
  * find_get_page - find and get a page reference
  * @mapping: the address_space to search
diff -puN mm/page-writeback.c~mm-non-syncing-lock_page mm/page-writeback.c
--- a/mm/page-writeback.c~mm-non-syncing-lock_page
+++ a/mm/page-writeback.c
@@ -838,7 +838,7 @@ int set_page_dirty_lock(struct page *pag
 {
 	int ret;

-	lock_page(page);
+	lock_page_nosync(page);
 	ret = set_page_dirty(page);
 	unlock_page(page);
 	return ret;
_