commit acb34e4d6b5d0601530e1dd10121a1ae209e3a44
Author: Greg Kroah-Hartman
Date:   Mon Dec 14 09:47:25 2009 -0800

    Linux 2.6.32.1

commit abb247066f9769e90df87ab65e5c2bb4dbdb529c
Author: Theodore Ts'o
Date:   Wed Dec 9 21:30:02 2009 -0500

    ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem)

    (cherry picked from commit fab3a549e204172236779f502eccb4f9bf0dc87d)

    Fix the following potential circular locking dependency between
    mm->mmap_sem and ei->i_data_sem:

    =======================================================
    [ INFO: possible circular locking dependency detected ]
    2.6.32-04115-gec044c5 #37
    -------------------------------------------------------
    ureadahead/1855 is trying to acquire lock:
     (&mm->mmap_sem){++++++}, at: [] might_fault+0x5c/0xac

    but task is already holding lock:
     (&ei->i_data_sem){++++..}, at: [] ext4_fiemap+0x11b/0x159

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&ei->i_data_sem){++++..}:
           [] __lock_acquire+0xb67/0xd0f
           [] lock_acquire+0xdc/0x102
           [] down_read+0x51/0x84
           [] ext4_get_blocks+0x50/0x2a5
           [] ext4_get_block+0xab/0xef
           [] do_mpage_readpage+0x198/0x48d
           [] mpage_readpages+0xd0/0x114
           [] ext4_readpages+0x1d/0x1f
           [] __do_page_cache_readahead+0x12f/0x1bc
           [] ra_submit+0x21/0x25
           [] filemap_fault+0x19f/0x32c
           [] __do_fault+0x55/0x3a2
           [] handle_mm_fault+0x327/0x734
           [] do_page_fault+0x292/0x2aa
           [] page_fault+0x25/0x30
           [] clear_user+0x38/0x3c
           [] padzero+0x20/0x31
           [] load_elf_binary+0x8bc/0x17ed
           [] search_binary_handler+0xc2/0x259
           [] load_script+0x1b8/0x1cc
           [] search_binary_handler+0xc2/0x259
           [] do_execve+0x1ce/0x2cf
           [] sys_execve+0x43/0x5a
           [] stub_execve+0x6a/0xc0

    -> #0 (&mm->mmap_sem){++++++}:
           [] __lock_acquire+0xa11/0xd0f
           [] lock_acquire+0xdc/0x102
           [] might_fault+0x89/0xac
           [] fiemap_fill_next_extent+0x95/0xda
           [] ext4_ext_fiemap_cb+0x138/0x157
           [] ext4_ext_walk_space+0x178/0x1f1
           [] ext4_fiemap+0x13c/0x159
           [] do_vfs_ioctl+0x348/0x4d6
           [] sys_ioctl+0x56/0x79
           [] system_call_fastpath+0x16/0x1b

    other info
    that might help us debug this:

    1 lock held by ureadahead/1855:
     #0:  (&ei->i_data_sem){++++..}, at: [] ext4_fiemap+0x11b/0x159

    stack backtrace:
    Pid: 1855, comm: ureadahead Not tainted 2.6.32-04115-gec044c5 #37
    Call Trace:
     [] print_circular_bug+0xa8/0xb7
     [] __lock_acquire+0xa11/0xd0f
     [] ? sched_clock+0x9/0xd
     [] lock_acquire+0xdc/0x102
     [] ? might_fault+0x5c/0xac
     [] might_fault+0x89/0xac
     [] ? might_fault+0x5c/0xac
     [] ? __kmalloc+0x13b/0x18c
     [] fiemap_fill_next_extent+0x95/0xda
     [] ext4_ext_fiemap_cb+0x138/0x157
     [] ? ext4_ext_fiemap_cb+0x0/0x157
     [] ext4_ext_walk_space+0x178/0x1f1
     [] ext4_fiemap+0x13c/0x159
     [] ? might_fault+0x5c/0xac
     [] do_vfs_ioctl+0x348/0x4d6
     [] ? __up_read+0x8d/0x95
     [] ? retint_swapgs+0x13/0x1b
     [] sys_ioctl+0x56/0x79
     [] system_call_fastpath+0x16/0x1b

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 0fd023ecf102ab0bd070d5affd73b18e6704ff0f
Author: Akira Fujita
Date:   Sun Dec 6 23:38:31 2009 -0500

    ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT

    (cherry picked from commit 4a58579b9e4e2a35d57e6c9c8483e52f6f1b7fd6)

    This patch fixes three problems in the handling of the
    EXT4_IOC_MOVE_EXT ioctl:

    1. In the current EXT4_IOC_MOVE_EXT, there are read access mode
       checks for the original and donor files, but they allow illegal
       write access to the donor file, since the donor file is
       overwritten by the original file's data. To fix this problem,
       change the access mode checks of the original (r->r/w) and donor
       (r->w) files.

    2. Disallow the use of donor files that have the setuid or setgid
       bit set.

    3. Call mnt_want_write() and mnt_drop_write() before and after
       calling ext4_move_extents() to get write access to the mount.
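The first two checks in the list above can be sketched in user-space terms. This is only an analogue, not the kernel code: the kernel tests the open file's mode bits and the donor inode's mode, whereas the sketch below works on `open(2)` flags and a `mode_t`. All function names here are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Original file: must be open for both reading and writing (r -> r/w). */
static bool orig_access_ok(int open_flags)
{
	return (open_flags & O_ACCMODE) == O_RDWR;
}

/* Donor file: must be open for writing (r -> w); read access is optional. */
static bool donor_access_ok(int open_flags)
{
	int acc = open_flags & O_ACCMODE;

	return acc == O_WRONLY || acc == O_RDWR;
}

/* Donor file: reject anything with the setuid or setgid bit set. */
static bool donor_mode_ok(mode_t mode)
{
	return (mode & (S_ISUID | S_ISGID)) == 0;
}

/* Hypothetical combined check; true means the request may proceed. */
static bool move_ext_args_ok(int orig_flags, int donor_flags,
			     mode_t donor_mode)
{
	return orig_access_ok(orig_flags) &&
	       donor_access_ok(donor_flags) &&
	       donor_mode_ok(donor_mode);
}
```

With these helpers, a read-only original, a read-only donor, or a setuid donor is rejected, while a read/write original paired with a write-only donor of mode 0644 is accepted.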
    Signed-off-by: Akira Fujita
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit eebb744d30006474a8f63af098bc71f0cb209677
Author: Jan Kara
Date:   Tue Dec 8 23:51:10 2009 -0500

    ext4: Wait for proper transaction commit on fsync

    (cherry picked from commit b436b9bef84de6893e86346d8fbf7104bc520645)

    We cannot rely on buffer dirty bits during fsync because pdflush can
    come before fsync is called and clear dirty bits without forcing a
    transaction commit. Instead, we track which transaction last changed
    the inode and which transaction last changed allocation, and force
    it to disk on fsync.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit caa305aa349212c285ad9564b9ff2ffa040b193c
Author: Dmitry Monakhov
Date:   Tue Dec 8 22:42:28 2009 -0500

    ext4: fix incorrect block reservation on quota transfer.

    (cherry picked from commit 194074acacebc169ded90a4657193f5180015051)

    Inside the ->setattr() call both ATTR_UID and ATTR_GID may be valid.
    This means that we may end up transferring all quotas. And we have
    to reserve QUOTA_DEL_BLOCKS for all quotas, as we do in the case of
    QUOTA_INIT_BLOCKS.

    Signed-off-by: Dmitry Monakhov
    Reviewed-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit da2068b384bbfaae98ff55f1424be88c65bf801b
Author: Dmitry Monakhov
Date:   Tue Dec 8 22:42:15 2009 -0500

    ext4: quota macros cleanup

    (cherry picked from commit 5aca07eb7d8f14d90c740834d15ca15277f4820c)

    Currently all quota block reservation macros contain a hard-coded
    "2", aka the MAXQUOTAS value. This is no good because in some places
    it is not obvious what this digit represents. Let's introduce a new
    macro with a self-descriptive name.
    Signed-off-by: Dmitry Monakhov
    Acked-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 6798788a72ee430761aa41c02f770ee3afb9c212
Author: Dmitry Monakhov
Date:   Tue Dec 8 22:41:52 2009 -0500

    ext4: ext4_get_reserved_space() must return bytes instead of blocks

    (cherry picked from commit 8aa6790f876e81f5a2211fe1711a5fe3fe2d7b20)

    Signed-off-by: Dmitry Monakhov
    Reviewed-by: Eric Sandeen
    Acked-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 637b13106b744398b530fe916eb1556aeb7f6bca
Author: Curt Wohlgemuth
Date:   Tue Dec 8 22:18:25 2009 -0500

    ext4: remove blocks from inode prealloc list on failure

    (cherry picked from commit b844167edc7fcafda9623955c05e4c1b3c32ebc7)

    This fixes a leak of blocks in an inode prealloc list if device
    failures cause ext4_mb_mark_diskspace_used() to fail.

    Signed-off-by: Curt Wohlgemuth
    Acked-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 1cd3f1980ce02bd814879ce1ac9cde5eaceb5f13
Author: Josef Bacik
Date:   Tue Dec 8 21:48:58 2009 -0500

    ext4: wait for log to commit when umounting

    (cherry picked from commit d4edac314e9ad0b21ba20ba8bc61b61f186f79e1)

    There is a potential race when a transaction is committing right
    when the file system is being unmounted. This could result in a race
    because EXT4_SB(sb)->s_group_info could be freed in ext4_put_super
    before the commit code calls a callback so the mballoc code can
    release freed blocks in the transaction, resulting in a panic trying
    to access the freed s_group_info.

    The fix is to wait for the transaction to finish committing before
    we shut down the multiblock allocator.
    Signed-off-by: Josef Bacik
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 35a6f7824919816fca466997531885044d290b59
Author: Jan Kara
Date:   Tue Dec 8 21:24:33 2009 -0500

    ext4: Avoid data / filesystem corruption when write fails to copy data

    (cherry picked from commit b9a4207d5e911b938f73079a83cc2ae10524ec7f)

    When ext4_write_begin fails after allocating some blocks or
    generic_perform_write fails to copy data to write, we truncate
    blocks already instantiated beyond i_size. Although these blocks
    were never inside i_size, we have to truncate the pagecache of these
    blocks so that corresponding buffers get unmapped. Otherwise
    subsequent __block_prepare_write (called because we are retrying the
    write) will find the buffers mapped, not call ->get_block, and thus
    the page will be backed by already freed blocks leading to
    filesystem and data corruption.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 66c3a718335ccd37ac14c4f0f2d0a7551af562ed
Author: Roel Kluin
Date:   Mon Dec 7 10:38:16 2009 -0500

    ext4: Return the PTR_ERR of the correct pointer in setup_new_group_blocks()

    (cherry picked from commit c09eef305dd43846360944ad072f051f964fa383)

    Signed-off-by: Roel Kluin
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 86d291a39650e05c9db7c0d849e48beaf39efcdd
Author: Theodore Ts'o
Date:   Tue Dec 1 09:04:42 2009 -0500

    jbd2: Add ENOMEM checking in and for jbd2_journal_write_metadata_buffer()

    (cherry picked from commit e6ec116b67f46e0e7808276476554727b2e6240b)

    OOM happens.
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit ce5cf38ef17272ad2641c723c6d0d8eaf1c34eab
Author: Akira Fujita
Date:   Tue Nov 24 10:31:56 2009 -0500

    ext4: move_extent_per_page() cleanup

    (cherry picked from commit ac48b0a1d068887141581bea8285de5fcab182b0)

    Integrate duplicate lines (acquire/release semaphore and invalidate
    extent cache in move_extent_per_page()) into
    mext_replace_branches(), to reduce source and object code size.

    Signed-off-by: Akira Fujita
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 3369cbb6bedae7e8c7600b33803a8e69b2a6e99f
Author: Kazuya Mio
Date:   Tue Nov 24 10:28:48 2009 -0500

    ext4: initialize moved_len before calling ext4_move_extents()

    (cherry picked from commit 446aaa6e7e993b38a6f21c6acfa68f3f1af3dbe3)

    The move_extent.moved_len field is used to pass the number of
    exchanged blocks back to user space. Currently the caller must clear
    this field; but we spend more code space checking for this
    requirement than simply zeroing the field ourselves, so let's just
    make life easier for everyone all around.

    Signed-off-by: Kazuya Mio
    Signed-off-by: Akira Fujita
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 74920c74ad3802bc81f5ccbb737d351073dd3ff9
Author: Akira Fujita
Date:   Tue Nov 24 10:19:57 2009 -0500

    ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXT

    (cherry picked from commit 94d7c16cbbbd0e03841fcf272bcaf0620ad39618)

    At the beginning of ext4_move_extents(), we call
    ext4_discard_preallocations() to discard inode PAs of orig and donor
    inodes. But in the following case, blocks can be double freed, so
    move ext4_discard_preallocations() to the end of
    ext4_move_extents().

    1. Discard inode PAs of orig and donor inodes with
       ext4_discard_preallocations() in ext4_move_extents().

       orig : [ DATA1 ]
       donor: [ DATA2 ]

    2. While data blocks are being exchanged between orig and donor
       inodes, new inode PAs are created for orig by another process's
       block allocation.
       (Since there are semaphore gaps in ext4_move_extents().)
       And the new inode PAs are used partially (2-1).

    2-1 Create new inode PAs for orig inode
       orig : [ DATA1 | used PA1 | free PA1 ]
       donor: [ DATA2 ]

    3. The donor inode, which now has the old orig inode's blocks, is
       deleted after EXT4_IOC_MOVE_EXT finishes (3-1, 3-2). So the
       block bitmap entries that correspond to the old orig inode's
       blocks are freed.

    3-1 After EXT4_IOC_MOVE_EXT finished
       orig : [ DATA2 | free PA1 ]
       donor: [ DATA1 | used PA1 ]

    3-2 Delete donor inode
       orig : [ DATA2 | free PA1 ]
       donor: [ FREE SPACE(DATA1) | FREE SPACE(used PA1) ]

    4. The double-free of blocks occurs when close() is called on the
       orig inode, because ext4_discard_preallocations() for the orig
       inode frees used PA1 and free PA1, though used PA1 was already
       freed in 3.

    4-1 Double-free of blocks occurs
       orig : [ DATA2 | FREE SPACE(free PA1) ]
       donor: [ FREE SPACE(DATA1) | DOUBLE FREE(used PA1) ]

    Signed-off-by: Akira Fujita
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 6011d0baad8d5229445b1e0f7b6bdfcaf9cbff26
Author: Eric Sandeen
Date:   Thu Nov 19 14:28:50 2009 -0500

    ext4: make "norecovery" an alias for "noload"

    (cherry picked from commit e3bb52ae2bb9573e84c17b8e3560378d13a5c798)

    Users on the linux-ext4 list recently complained about differences
    across filesystems w.r.t. how to mount without a journal replay.

    In the discussion it was noted that xfs's "norecovery" option is
    perhaps more descriptively accurate than "noload," so let's make
    that an alias for ext4.

    Also show this status in /proc/mounts.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit feac39ba7f50e851f4e20466e5d390f046932287
Author: Eric Sandeen
Date:   Thu Nov 19 14:25:42 2009 -0500

    ext4: make trim/discard optional (and off by default)

    (cherry picked from commit 5328e635315734d42080de9a5a1ee87bf4cae0a4)

    It is anticipated that when sb_issue_discard starts doing real work
    on trim-capable devices, we may see issues.
    Make this mount-time optional, and default it to off until we know
    that things are working out OK.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit cd9c823a6830d6c84f50fe7dabaf281c542d13ef
Author: Jan Kara
Date:   Mon Nov 23 07:24:48 2009 -0500

    ext4: fix error handling in ext4_ind_get_blocks()

    (cherry picked from commit 2bba702d4f88d7b010ec37e2527b552588404ae7)

    When an error happened in ext4_splice_branch we failed to notice
    that in ext4_ind_get_blocks and mapped the buffer anyway. Fix the
    problem by checking for error properly.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 43e932d3116f3a74f1b5ff0a38cca5e0bac1099a
Author: Theodore Ts'o
Date:   Mon Nov 23 07:24:57 2009 -0500

    ext4: avoid issuing unnecessary barriers

    (cherry picked from commit 6b17d902fdd241adfa4ce780df20547b28bf5801)

    We don't need to issue an I/O barrier on an error or if we force
    commit because we are doing data journaling.

    Signed-off-by: "Theodore Ts'o"
    Cc: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

commit 31299e22be94e77b6df41d022e8f0bd075537d58
Author: Theodore Ts'o
Date:   Sun Nov 15 15:29:56 2009 -0500

    ext4: fix block validity checks so they work correctly with meta_bg

    (cherry picked from commit 1032988c71f3f85483b2b4319684d1205a704c02)

    The block validity checks used by ext4_data_block_valid() weren't
    correctly written to check file systems with the meta_bg feature.
    Fix this.

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 58ffefbe7bf2bdfc6f91b79eef9d670df6eab4d3
Author: Theodore Ts'o
Date:   Mon Nov 23 07:24:38 2009 -0500

    ext4: fix uninit block bitmap initialization when s_meta_first_bg is non-zero

    (cherry picked from commit 8dadb198cb70ef811916668fe67eeec82e8858dd)

    The number of old-style block group descriptor blocks is
    s_meta_first_bg when the meta_bg feature flag is set.
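The rule stated above can be written as a one-line selection. This is a sketch only: the function and parameter names are hypothetical, and the "otherwise" branch assumes that without meta_bg every group descriptor block uses the old-style layout.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: how many group descriptor blocks use the
 * old-style layout.
 *
 * has_meta_bg      - whether the meta_bg feature flag is set
 * meta_first_bg    - the superblock value the message calls s_meta_first_bg
 * total_gdb_blocks - total number of group descriptor blocks
 */
static unsigned long old_style_gdb_blocks(bool has_meta_bg,
					   unsigned long meta_first_bg,
					   unsigned long total_gdb_blocks)
{
	/* Assumption: without meta_bg, every descriptor block is old-style. */
	return has_meta_bg ? meta_first_bg : total_gdb_blocks;
}
```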
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 9e9ddddfe74189e2f2514ed2d8eb9ef3f015118f
Author: Theodore Ts'o
Date:   Mon Nov 23 07:24:52 2009 -0500

    ext4: don't update the superblock in ext4_statfs()

    (cherry picked from commit 3f8fb9490efbd300887470a2a880a64e04dcc3f5)

    commit a71ce8c6c9bf269b192f352ea555217815cf027e updated ext4_statfs()
    to update the on-disk superblock counters, but modified this buffer
    directly without any journaling of the change. This is one of the
    accesses that was causing the crc errors in journal replay as seen
    in kernel.org bugzilla #14354.

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 023c1d20ed1d29d5f5e40e79fde1f8f6b6a42e6f
Author: Eric Sandeen
Date:   Sun Nov 15 15:30:52 2009 -0500

    ext4: journal all modifications in ext4_xattr_set_handle

    (cherry picked from commit 86ebfd08a1930ccedb8eac0aeb1ed4b8b6a41dbc)

    ext4_xattr_set_handle() was zeroing out an inode outside of
    journaling constraints; this is one of the accesses that was causing
    the crc errors in journal replay as seen in kernel.org bugzilla
    #14354.

    Reviewed-by: Andreas Dilger
    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 70095d96ced4dbe2ebc7cc0d248313e507dda31a
Author: Julia Lawall
Date:   Sun Nov 15 15:30:58 2009 -0500

    ext4: fix i_flags access in ext4_da_writepages_trans_blocks()

    (cherry picked from commit 30c6e07a92ea4cb87160d32ffa9bce172576ae4c)

    We need to be testing the i_flags field in the ext4 specific portion
    of the inode, instead of the (confusingly aliased) i_flags field in
    the generic struct inode.
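The aliasing described above is easy to reproduce in a small model. The sketch below is not the kernel code: the two structs are stripped-down, hypothetical stand-ins that keep only the two same-named fields, `EXT4_I()` is re-implemented with `offsetof`, and the value of `EXT4_EXTENTS_FL` is an assumption taken from the ext4 headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXT4_EXTENTS_FL 0x00080000u	/* assumed value of the ext4 flag */

/* Stand-in for the generic inode: i_flags holds generic VFS flags. */
struct inode {
	unsigned int i_flags;
};

/* Stand-in for the ext4-specific inode, which embeds the generic one. */
struct ext4_inode_info {
	unsigned int i_flags;		/* ext4 flags, e.g. EXT4_EXTENTS_FL */
	struct inode vfs_inode;
};

/* Recover the ext4-specific inode from a pointer to the embedded one. */
static struct ext4_inode_info *EXT4_I(struct inode *inode)
{
	return (struct ext4_inode_info *)
		((char *)inode - offsetof(struct ext4_inode_info, vfs_inode));
}

/* Buggy pattern: tests an ext4 flag against the generic field. */
static bool uses_extents_wrong(struct inode *inode)
{
	return (inode->i_flags & EXT4_EXTENTS_FL) != 0;
}

/* Fixed pattern: tests the ext4 flag in the ext4-specific field. */
static bool uses_extents_right(struct inode *inode)
{
	return (EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL) != 0;
}
```

Both versions compile cleanly because the field names match, which is why this kind of mistake is hard to spot by eye.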
    Signed-off-by: Julia Lawall
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 0f9036c7eed145cdd8c8ed9e899c61f499278259
Author: Theodore Ts'o
Date:   Mon Nov 23 07:17:34 2009 -0500

    ext4: make sure directory and symlink blocks are revoked

    (cherry picked from commit 50689696867d95b38d9c7be640a311494a04fb86)

    When an inode gets unlinked, the functions ext4_clear_blocks() and
    ext4_remove_blocks() call ext4_forget() for all the buffer heads
    corresponding to the deleted inode's data blocks. If the inode is a
    directory or a symlink, the is_metadata parameter must be non-zero
    so ext4_forget() will revoke them via jbd2_journal_revoke().
    Otherwise, if these blocks are reused for a data file, and the
    system crashes before a journal checkpoint, the journal replay could
    end up corrupting these data blocks.

    Thanks to Curt Wohlgemuth for pointing out potential problems in
    this area.

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 21a4b3aaa2180ca6748446c4b06e91f3da244dca
Author: Theodore Ts'o
Date:   Sat Nov 14 08:19:05 2009 -0500

    ext4: plug a buffer_head leak in an error path of ext4_iget()

    (cherry picked from commit 567f3e9a70d71e5c9be03701b8578be77857293b)

    One of the invalid error paths in ext4_iget() forgot to brelse() the
    inode buffer head. Fix it by adding a brelse() in the common error
    return path, which also simplifies the function.

    Thanks to Andi Kleen for reporting the problem.

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit a752b960d4b57e0d307e1c40b1b47ab849832c0f
Author: Akira Fujita
Date:   Mon Nov 23 07:24:41 2009 -0500

    ext4: fix possible recursive locking warning in EXT4_IOC_MOVE_EXT

    (cherry picked from commit 49bd22bc4d603a2a4fc2a6a60e156cbea52eb494)

    If CONFIG_PROVE_LOCKING is enabled, the double_down_write_data_sem()
    will trigger a false-positive warning of a recursive lock. Since we
    take i_data_sem for the two inodes ordered by their inode numbers,
    this isn't a problem.
    Use of down_write_nested() will notify the lock dependency checker
    machinery that there is no problem here.

    This problem was reported by Brian Rogers:

    	http://marc.info/?l=linux-ext4&m=125115356928011&w=1

    Reported-by: Brian Rogers
    Signed-off-by: Akira Fujita
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 52a4345d3d82b77bea320bf223716298468be3de
Author: Akira Fujita
Date:   Mon Nov 23 07:24:43 2009 -0500

    ext4: fix lock order problem in ext4_move_extents()

    (cherry picked from commit fc04cb49a898c372a22b21fffc47f299d8710801)

    ext4_move_extents() checks the logical block contiguousness of the
    original file with ext4_ext_find_extent() and mext_next_extent().
    Therefore the extent which the ext4_ext_path structure indicates
    must not be changed between the above functions.

    But in the current implementation, there is no i_data_sem protection
    between ext4_ext_find_extent() and mext_next_extent(). So the extent
    which the ext4_ext_path structure indicates may be overwritten by
    delalloc. As a result, ext4_move_extents() will exchange the wrong
    blocks between the original and donor files. I changed the place
    where i_data_sem is acquired and released to solve this problem.

    Moreover, I changed move_extent_per_page() to start the transaction
    first, and then acquire i_data_sem. Without this change, there is a
    possibility of a deadlock between mmap() and ext4_move_extents():

    * NOTE: "A", "B" and "C" mean different processes

    A-1: ext4_move_extents() acquires i_data_sem of two inodes.

    B:   do_page_fault() starts the transaction (T), and then tries to
         acquire i_data_sem. But process "A" is already holding it, so
         it is kept waiting.

    C:   While "A" and "B" are running, kjournald2 tries to commit
         transaction (T), but it is still being updated, so kjournald2
         waits for it.

    A-2: Calls ext4_journal_start() while holding i_data_sem, but
         transaction (T) is locked.
    Signed-off-by: Akira Fujita
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit a4a87a7f39020ea18f220c9e9c8ab112ebd6764a
Author: Akira Fujita
Date:   Mon Nov 23 07:25:48 2009 -0500

    ext4: fix the returned block count if EXT4_IOC_MOVE_EXT fails

    (cherry picked from commit f868a48d06f8886cb0367568a12367fa4f21ea0d)

    If the EXT4_IOC_MOVE_EXT ioctl fails, the number of blocks that were
    exchanged before the failure should be returned to the userspace
    caller. Unfortunately, currently if the block size is not the same
    as the page size, the block count that is returned is the
    page-aligned block count instead of the actual block count. This
    commit addresses this bug.

    Signed-off-by: Akira Fujita
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 8ed33ff5203300ac8878042bc4d4954e2f40c488
Author: Theodore Ts'o
Date:   Mon Nov 23 07:24:46 2009 -0500

    ext4: avoid divide by zero when trying to mount a corrupted file system

    (cherry picked from commit 503358ae01b70ce6909d19dd01287093f6b6271c)

    If s_log_groups_per_flex is greater than 31, then groups_per_flex
    will overflow and cause a divide by zero error. This can cause a
    kernel BUG if such a file system is mounted.

    Thanks to Nageswara R Sastry for analyzing the failure and providing
    an initial patch.

    http://bugzilla.kernel.org/show_bug.cgi?id=14287

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

commit 6662a8d03104516c53170257b837d8a03e86db39
Author: Theodore Ts'o
Date:   Mon Nov 23 07:25:49 2009 -0500

    ext4: fix potential buffer head leak when add_dirent_to_buf() returns ENOSPC

    (cherry picked from commit 2de770a406b06dfc619faabbf5d85c835ed3f2e1)

    Previously add_dirent_to_buf() did not free its passed-in buffer
    head in the case of ENOSPC, since in some cases the caller still
    needed it. However, this led to potential buffer head leaks since
    not all callers dealt with this correctly.
    Fix this by simplifying the freeing convention; now
    add_dirent_to_buf() *never* frees the passed-in buffer head, and
    leaves that to the responsibility of its caller. This makes things
    cleaner and makes it easier to prove that the code is neither
    leaking buffer heads nor calling brelse() one time too many.

    Signed-off-by: "Theodore Ts'o"
    Cc: Curt Wohlgemuth
    Signed-off-by: Greg Kroah-Hartman

commit d10a8f05208b66c0b29562a9601601b9a7b5d9ac
Author: Yang, Bo
Date:   Tue Oct 6 14:52:20 2009 -0600

    SCSI: megaraid_sas: fix 64 bit sense pointer truncation

    commit 7b2519afa1abd1b9f63aa1e90879307842422dae upstream.

    The current sense pointer is cast to a u32 pointer, which can
    truncate on 64 bits. Fix by using unsigned long instead.

    Signed-off-by: Bo Yang
    Signed-off-by: James Bottomley
    Signed-off-by: Greg Kroah-Hartman

commit 79daedf8b665473f5adf6d0d80b4f2c6ff524bca
Author: Martin Michlmayr
Date:   Mon Nov 16 20:49:25 2009 +0200

    SCSI: osd_protocol.h: Add missing #include

    commit 0899638688f223fd9e9fee60d662665e11693d12 upstream.

    include/scsi/osd_protocol.h uses ALIGN() without an #include
    <linux/kernel.h>, leading to:

    | include/scsi/osd_protocol.h:362: error: implicit declaration of function 'ALIGN'

    Signed-off-by: Martin Michlmayr
    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley
    Signed-off-by: Greg Kroah-Hartman

commit d888b1a2d5c7a9fbdc01e7395ea6a8d75cd729f5
Author: James Bottomley
Date:   Thu Nov 5 13:33:12 2009 -0600

    SCSI: scsi_lib_dma: fix bug with dma maps on nested scsi objects

    commit d139b9bd0e52dda14fd13412e7096e68b56d0076 upstream.

    Some of our virtual SCSI hosts don't have a proper bus parent at the
    top, which can be a problem for doing DMA on them.

    This patch makes the host device cache a pointer to the physical bus
    device and provides an extra API for setting it (the normal API
    picks it up from the parent). This patch also modifies the qla2xxx
    and lpfc vport logic to use the new DMA host setting API.
    Acked-By: James Smart
    Signed-off-by: James Bottomley
    Signed-off-by: Greg Kroah-Hartman

commit 98d338a7028dbcfc98a7d64856798882b4fbcc21
Author: Sebastian Andrzej Siewior
Date:   Sun Oct 25 15:37:58 2009 +0100

    signal: Fix alternate signal stack check

    commit 2a855dd01bc1539111adb7233f587c5c468732ac upstream.

    All architectures in the kernel increment/decrement the stack
    pointer before storing values on the stack.

    On architectures which have the stack grow down sas_ss_sp == sp is
    not on the alternate signal stack while sas_ss_sp + sas_ss_size ==
    sp is on the alternate signal stack.

    On architectures which have the stack grow up sas_ss_sp == sp is on
    the alternate signal stack while sas_ss_sp + sas_ss_size == sp is
    not on the alternate signal stack.

    The current implementation fails for architectures which have the
    stack grow down on the corner case where sas_ss_sp == sp. This was
    reported as Debian bug #544905 on AMD64.

    Simplified test case: http://download.breakpoint.cc/tc-sig-stack.c

    The test case creates the following stack scenario:

       0xn0300	stack top
       0xn0200	alt stack pointer top (when switching to alt stack)
       0xn01ff	alt stack end
       0xn0100	alt stack start == stack pointer

    If the signal is sent the stack pointer is pointing to the base
    address of the alt stack and the kernel erroneously decides that it
    has already switched to the alternate stack because of the current
    check for "sp - sas_ss_sp < sas_ss_size"

    On parisc (stack grows up) the scenario would be:

       0xn0200	stack pointer
       0xn01ff	alt stack end
       0xn0100	alt stack start = alt stack pointer base
       		(when switching to alt stack)
       0xn0000	stack base

    This is handled correctly by the current implementation.

    [ tglx: Modified for archs which have the stack grow up (parisc)
      	    which would fail with the correct implementation for stack
      	    grows down.
      	    Added a check for sp >= current->sas_ss_sp which is
      	    strictly not necessary but makes the code symmetric for
      	    both variants ]

    Signed-off-by: Sebastian Andrzej Siewior
    Cc: Oleg Nesterov
    Cc: Roland McGrath
    Cc: Kyle McMartin
    LKML-Reference: <20091025143758.GA6653@Chamillionaire.breakpoint.cc>
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman
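The boundary conditions in the signal stack commit above can be checked with a small user-space model. This is a sketch derived from the message text, not a copy of the kernel source: `struct alt_stack` and the three function names are hypothetical, and only the quoted old check `sp - sas_ss_sp < sas_ss_size` and the added `sp >= sas_ss_sp` test come from the message itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task alternate signal stack fields. */
struct alt_stack {
	unsigned long sas_ss_sp;	/* lowest address of the alt stack */
	size_t sas_ss_size;		/* size of the alt stack in bytes */
};

/* Old check quoted in the message: treats sp == sas_ss_sp as "on stack". */
static bool on_sig_stack_old(const struct alt_stack *s, unsigned long sp)
{
	return sp - s->sas_ss_sp < s->sas_ss_size;
}

/*
 * Stack grows down: sp == sas_ss_sp is NOT on the alt stack, while
 * sp == sas_ss_sp + sas_ss_size is, i.e. the range (base, base + size].
 */
static bool on_sig_stack_grows_down(const struct alt_stack *s,
				    unsigned long sp)
{
	return sp > s->sas_ss_sp && sp - s->sas_ss_sp <= s->sas_ss_size;
}

/*
 * Stack grows up (e.g. parisc): sp == sas_ss_sp IS on the alt stack,
 * while sp == sas_ss_sp + sas_ss_size is not, i.e. [base, base + size).
 */
static bool on_sig_stack_grows_up(const struct alt_stack *s,
				  unsigned long sp)
{
	return sp >= s->sas_ss_sp && sp - s->sas_ss_sp < s->sas_ss_size;
}
```

Taking the scenario from the message with an alt stack at 0x0100 of size 0x100, the old check reports sp == 0x0100 as already on the alternate stack, while the grows-down variant rejects 0x0100 and accepts 0x0200.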