From: Balbir Singh

We need to save a reference to the s_umount read write semaphore.  The
dentry can be freed by prune_one_dentry().  Dereferencing
dentry->d_sb->s_umount is not safe after that point.

I hit an Oops while running 2.6.17-rc1-mm1

DMA free:3584kB min:68kB low:84kB high:100kB active:10448kB inactive:0kB
presentOops: 0002 [#1]
PREEMPT SMP
last sysfs file: /devices/pci0000:00/0000:00:0a.0/power/state
Modules linked in: loop dm_mod ide_cd cdrom ohci_hcd usbcore serverworks generii
CPU:    1
EIP:    0060:[]    Not tainted VLI
EFLAGS: 00010212   (2.6.17-rc1-mm1cpum #2)
EIP is at prune_dcache+0x91/0x1d0
eax: 6b6b6ba7   ebx: e45918e0   ecx: 00000001   edx: ffffffff
esi: e45918e8   edi: 00000058   ebp: e4cfcbe0   esp: e4cfcbbc
ds: 007b   es: 007b   ss: 0068
Process hackbench (pid: 11183, threadinfo=e4cfc000 task=e4d076b0)
Stack: <0>c12fb400 e4cfcbd0 c122f5ed c2288504 00000000 00000000 0000283c 000a0f
       c2259404 e4cfcbe8 c108266e e4cfcc28 c104fe9b 00000080 000000d0 0000000b
       00000021 00000000 e4cfc000 00000000 0000008c e4cfc000 00000080 00004db7
Call Trace:
 show_stack_log_lvl+0xad/0xe0
 show_registers+0x1c7/0x250
 die+0x13a/0x330
 do_page_fault+0x2d0/0x750
 error_code+0x4f/0x54
 shrink_dcache_memory+0x3e/0x50
 shrink_slab+0x17b/0x240
 try_to_free_pages+0x1bf/0x2b0
 __alloc_pages+0x136/0x310
 cache_alloc_refill+0x40c/0x70
 __kmalloc_track_caller+0xc6/0xf0
 __alloc_skb+0x5f/0x110
 sock_alloc_send_skb+0x1a7/0x200
 unix_stream_sendmsg+0x0
 do_sock_write+0xb4/0xc0
 sock_aio_write+0x67/0x70
 do_sync_write+0xb9/0xf0
 vfs_write+0x181/0x190
 sys_write+0x47/0x70
 sysenter_past_esp+0x54/0x75
Code: 0a 75 f3 85 c0 0f 88 fe 00 00 00 8b 4b 60 8b 41 38 85 c0 0f 84 de 00 00 0

Signed-off-by: Balbir Singh
Cc: Jan Blunck
Cc: Kirill Korotaev
Cc: Olaf Hering
Cc: Neil Brown
Signed-off-by: Andrew Morton
---

 fs/dcache.c |   25 +++++++++++++++----------
 1 files changed, 15 insertions(+), 10 deletions(-)

diff -puN fs/dcache.c~fix-dcache-race-during-umount-fix fs/dcache.c
--- 25/fs/dcache.c~fix-dcache-race-during-umount-fix	Thu Apr  6 15:35:09 2006
+++ 25-akpm/fs/dcache.c	Thu Apr  6 15:40:07 2006
@@ -400,6 +400,7 @@ static void prune_dcache(int count, stru
 	for (; count ; count--) {
 		struct dentry *dentry;
 		struct list_head *tmp;
+		struct rw_semaphore *s_umount;
 
 		cond_resched_lock(&dcache_lock);
 
@@ -449,26 +450,30 @@ static void prune_dcache(int count, stru
 		 * we want to shrink.
 		 */
 		/*
-		 * If this dentry is for "my" filesystem, then I can
-		 * prune it without taking the s_umount lock (I already hold it).
+		 * If this dentry is for "my" filesystem, then I can prune it
+		 * without taking the s_umount lock (I already hold it).
 		 */
 		if (sb && dentry->d_sb == sb) {
 			prune_one_dentry(dentry);
 			continue;
 		}
-		/* ...otherwise we need to be sure this filesystem isn't being
-		 * unmounted, otherwise we could race with generic_shutdown_super,
-		 * and end up holding a reference to an inode while the
-		 * filesystem is unmounted.
-		 * So we try to get s_umount, and make sure s_root isn't NULL
+		/*
+		 * ...otherwise we need to be sure this filesystem isn't being
+		 * unmounted, otherwise we could race with
+		 * generic_shutdown_super(), and end up holding a reference to
+		 * an inode while the filesystem is unmounted.
+		 * So we try to get s_umount, and make sure s_root isn't NULL.
+		 * (Take a local copy of s_umount to avoid a use-after-free of
+		 * `dentry').
 		 */
-		if (down_read_trylock(&dentry->d_sb->s_umount)) {
+		s_umount = &dentry->d_sb->s_umount;
+		if (down_read_trylock(s_umount)) {
 			if (dentry->d_sb->s_root != NULL) {
 				prune_one_dentry(dentry);
-				up_read(&dentry->d_sb->s_umount);
+				up_read(s_umount);
 				continue;
 			}
-			up_read(&dentry->d_sb->s_umount);
+			up_read(s_umount);
 		}
 		spin_unlock(&dentry->d_lock);
 		/* Cannot remove the first dentry, and it isn't appropriate _