commit 2992c4bd5742b31a0ee00a76eee9c1c284507418 Merge: e08f6d4 1650add Author: Linus Torvalds Date: Tue Jun 21 18:20:55 2011 -0700 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix decode_secinfo_maxsz NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test NFSv4.1: Fix some issues with pnfs_generic_pg_test NFSv4.1: file layout must consider pg_bsize for coalescing pnfs-obj: No longer needed to take an extra ref at add_device SUNRPC: Ensure the RPC client only quits on fatal signals NFSv4: Fix a readdir regression nfs4.1: mark layout as bad on error path in _pnfs_return_layout nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path NFS: (d)printks should use %zd for ssize_t arguments NFSv4.1: fix break condition in pnfs_find_lseg nfs4.1: fix several problems with _pnfs_return_layout NFSv4.1: allow zero fh array in filelayout decode layout NFSv4.1: allow nfs_fhget to succeed with mounted on fileid NFSv4.1: Fix a refcounting issue in the pNFS device id cache NFSv4.1: deprecate headerpadsz in CREATE_SESSION NFS41: do not update isize if inode needs layoutcommit NLM: Don't hang forever on NLM unlock requests NFS: fix umount of pnfs filesystems commit e08f6d4131ab964420f0bcabecc68d75fb49df79 Merge: 890879c c7d74b0 Author: Linus Torvalds Date: Tue Jun 21 10:36:06 2011 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/qib: Ensure that LOS and DFE are being turned off RDMA/cxgb4: Couple of abort fixes RDMA/cxgb4: Don't truncate MR lengths RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs commit 890879cfa08f5ceaa09810611f46e890f7d57ff6 Merge: 5629937 de1b794 Author: Linus Torvalds Date: Tue Jun 21 10:22:35 2011 -0700 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: Fix oops in jbd2_journal_remove_journal_head() jbd2: Remove obsolete parameters in the comments for some jbd2 functions ext4: fixed tracepoints cleanup ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap ext4: Fix max file size and logical block counting of extent format file ext4: correct comments for ext4_free_blocks() commit 1650add23578b5ca35c1f1e863987180a8c03779 Author: Bryan Schumaker Date: Thu Jun 2 15:07:35 2011 -0400 NFS: Fix decode_secinfo_maxsz I initially did the calculation in bytes, and not words Signed-off-by: Bryan Schumaker Signed-off-by: Trond Myklebust commit 19982ba8562e33083cb5bbb59a74855d8a9624ea Author: Trond Myklebust Date: Fri Jun 10 13:30:23 2011 -0400 NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test And document what is going on there... Signed-off-by: Trond Myklebust commit 8f7d5efbef8718a774ac5e347b4ec069f17fd9b4 Author: Trond Myklebust Date: Fri Jun 10 13:30:22 2011 -0400 NFSv4.1: Fix some issues with pnfs_generic_pg_test 1. If the intention is to coalesce requests 'prev' and 'req' then we have to ensure at least that we have a layout starting at req_offset(prev). 2. If we're only requesting a minimal layout of length desc->pg_count, we need to test the length actually returned by the server before we allow the coalescing to occur. 3. We need to deal correctly with (pgio->lseg == NULL) 4. Fixup the test guarding the pnfs_update_layout. Signed-off-by: Trond Myklebust commit 19345cb299e8234006c5125151ab723e851a1d24 Author: Benny Halevy Date: Sun Jun 19 18:33:46 2011 -0400 NFSv4.1: file layout must consider pg_bsize for coalescing Otherwise we end up overflowing the rpc buffer size on the receive end. Signed-off-by: Benny Halevy Signed-off-by: Trond Myklebust commit df18d127f4fed7a0284bcfa8d2843800cdb63b72 Author: Boaz Harrosh Date: Fri Jun 17 16:25:51 2011 -0400 pnfs-obj: No longer needed to take an extra ref at add_device Andy's last device_cache patches, already take an extra reference on the newly inserted device_id. So we can remove it from obj-io. Without this patch the device_ids are leaked. Andy's patches are not in Linus tree yet. So I'm not sure if they are scheduled for this Kernel or the next. This patch should be added as part of these. CC: Andy Adamson Signed-off-by: Boaz Harrosh Signed-off-by: Trond Myklebust commit c7d74b090913102e7917dd02bb574ef060e1e930 Merge: 8da7e7a 3126448 Author: Roland Dreier Date: Fri Jun 17 11:57:55 2011 -0700 Merge branches 'cxgb4' and 'qib' into for-next commit 3126448451105fae59de0058c68692aa09aa4c37 Author: Mitko Haralanov Date: Thu Jun 9 20:27:26 2011 +0000 IB/qib: Ensure that LOS and DFE are being turned off Due to timing, it is possible for the LOS and DFE to remain on. This is due to the link progressing to LinkUP prior to the driver getting the first Status Changed interrupt. By expanding the conditions under which LOS is turned off and DFE timeout is being set, timing is no longer an issue. Signed-off-by: Mitko Haralanov Signed-off-by: Mike Marciniszyn Signed-off-by: Roland Dreier commit 8da7e7a55231543b84ac84e93ad5ca9d340773d7 Author: Steve Wise Date: Tue Jun 14 20:59:27 2011 +0000 RDMA/cxgb4: Couple of abort fixes - fix a race where the driver could end up sending a close_con_req after an abort_rpl. In c4iw_ep_disconnect(), send abort or close request with the ep mutex held. - fix a hang where driver fails to wake up when a connection is reset during a normal close. Wake up any waiters in the interrupt path, and correctly cleanup after rdma_fini() failures. Signed-off-by: Steve Wise Signed-off-by: Roland Dreier commit 301c2c3f039a1f9478f6cbef60f2ccd4da9bd4a1 Author: Steve Wise Date: Tue Jun 14 20:59:21 2011 +0000 RDMA/cxgb4: Don't truncate MR lengths Remove left-over code from T3 that limited MR sizes to 32b. Signed-off-by: Steve Wise Signed-off-by: Roland Dreier commit 2ff7d09a1b0f20f2d9c1bde0e003d4e384de2313 Author: Steve Wise Date: Wed Jun 1 17:49:14 2011 +0000 RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs Memory allocated for user CQs gets rounded up to the next page boundary. And after rounding, we recalculate the resulting IQ depth and we need to make sure we don't exceed the HW limits. This bug can result a much smaller CQ allocated than was expected if the HW size field is exceeded, resulting in CQ overflow failures. Signed-off-by: Steve Wise Signed-off-by: Roland Dreier commit 5afa9133cfe67f1bfead6049a9640c9262a7101c Author: Trond Myklebust Date: Fri Jun 17 10:14:59 2011 -0400 SUNRPC: Ensure the RPC client only quits on fatal signals Fix a couple of instances where we were exiting the RPC client on arbitrary signals. We should only do so on fatal signals. Cc: stable@kernel.org Signed-off-by: Trond Myklebust commit ee7b75fc4f3ae49e1f25bf56219bb5de3c29afaf Author: Trond Myklebust Date: Thu Jun 16 13:15:41 2011 -0400 NFSv4: Fix a readdir regression Commit 7ebb9315 (NFS: use secinfo when crossing mountpoints) introduces a regression when decoding an NFSv4 readdir entry that sets the rdattr_error field. By treating the resulting value as if it is a decoding error, the current code may cause us to skip valid readdir entries. Reported-by: Andy Adamson Cc: stable@kernel.org [2.6.39] Signed-off-by: Trond Myklebust commit 9e2dfdb3081edfae66a49013517e80dd8a0469fa Author: Fred Isaman Date: Wed Jun 15 14:32:02 2011 -0400 nfs4.1: mark layout as bad on error path in _pnfs_return_layout Signed-off-by: Fred Isaman Signed-off-by: Trond Myklebust commit ea0ded748bdea78f9e2fefb571f7d6ce9edb4f89 Author: Fred Isaman Date: Wed Jun 15 12:31:02 2011 -0400 nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout mark_matching_lsegs_invalid could put the last ref to the layout, so the get_layout_hdr needs to be called first. Signed-off-by: Fred Isaman Signed-off-by: Trond Myklebust commit 1ed3a8539af7b36aa5c977f304e80f7fc8d27bfc Author: Benny Halevy Date: Wed Jun 15 11:39:57 2011 -0400 NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path We always get a reference on the layout header and we rely on nfs4_layoutreturn_release to put it. If we hit an allocation error before starting the rpc proc we bail out early without dereferncing the layout header properly. Signed-off-by: Benny Halevy Signed-off-by: Trond Myklebust commit c7fd06228b994190d8369a2a0acf5224e4e13d1a Author: David Howells Date: Wed Jun 15 00:55:44 2011 +0100 NFS: (d)printks should use %zd for ssize_t arguments (d)printks should use %zd for ssize_t arguments not %ld, otherwise they might get a warning. I see the following with MN10300. fs/nfs/objlayout/objlayout.c: In function 'objlayout_read_done': fs/nfs/objlayout/objlayout.c:294: warning: format '%ld' expects type 'long int', but argument 3 has type 'ssize_t' Signed-off-by: David Howells cc: Trond Myklebust cc: linux-nfs@vger.kernel.org Signed-off-by: Trond Myklebust commit d771e3a43e23a37398b7e05a9d1b1036d698263c Author: Benny Halevy Date: Tue Jun 14 16:30:16 2011 -0400 NFSv4.1: fix break condition in pnfs_find_lseg The break condition to skip out of the loop got broken when cmp_layout was change. Essentially, we want to stop looking once we know no layout on the remainder of the list can match the first byte of the looked-up range. Reported-by: Peng Tao Signed-off-by: Benny Halevy Signed-off-by: Trond Myklebust commit a2e1d4f2e5ed83850de92a491ef225824cb457bd Author: Fred Isaman Date: Mon Jun 13 18:54:53 2011 -0400 nfs4.1: fix several problems with _pnfs_return_layout _pnfs_return_layout had the following problems: - it did not call pnfs_free_lseg_list on all paths - it unintentionally did a forgetful return when there was no outstanding io - it raced with concurrent LAYOUTGETS Signed-off-by: Fred Isaman Signed-off-by: Trond Myklebust commit cec765cf5891c7fc3d905832b481bfb6fd55825d Author: Andy Adamson Date: Mon Jun 13 18:36:17 2011 -0400 NFSv4.1: allow zero fh array in filelayout decode layout Signed-off-by: Andy Adamson cc:stable@kernel.org [2.6.39] Signed-off-by: Trond Myklebust commit 533eb4611c9eea53072eb6a61d5a6393b6a77ed7 Author: Andy Adamson Date: Mon Jun 13 18:25:56 2011 -0400 NFSv4.1: allow nfs_fhget to succeed with mounted on fileid Commit 28331a46d88459788c8fca72dbb0415cd7f514c9 "Ensure we request the ordinary fileid when doing readdirplus" changed the meaning of NFS_ATTR_FATTR_FILEID which used to be set when FATTR4_WORD1_MOUNTED_ON_FILED was requested. Allow nfs_fhget to succeed with only a mounted on fileid when crossing a mountpoint or a referral. Ask for the fileid of the absent file system if mounted_on_fileid is not supported. Signed-off-by: Andy Adamson cc:stable@kernel.org [2.6.39] Signed-off-by: Trond Myklebust commit 1d92a08da23848a38eece4df7eaa4e8ec0e6c699 Author: Trond Myklebust Date: Tue Jun 14 12:07:38 2011 -0400 NFSv4.1: Fix a refcounting issue in the pNFS device id cache When we add something to the global device id cache, we need to bump the reference count, so that the cache itself holds a reference. Signed-off-by: Trond Myklebust commit c9c30dd5f73dccaa326a54dfcf490316946aea87 Author: Benny Halevy Date: Sat Jun 11 17:08:39 2011 -0400 NFSv4.1: deprecate headerpadsz in CREATE_SESSION We don't support header padding yet so better off ditching it Reported-by: Sid Moore Signed-off-by: Benny Halevy Signed-off-by: Trond Myklebust commit 0f66b5984df2fe1617c05900a39a7ef493ca9de9 Author: Peng Tao Date: Sat Oct 16 22:07:46 2010 -0700 NFS41: do not update isize if inode needs layoutcommit nfs_update_inode will update isize if there is no queued pages. For pNFS, layoutcommit is supposed to change file size on server, the same effect as queued pages. nfs_update_inode may be called when dirty pages are written back (nfsi->npages==0) but layoutcommit is not sent, and it will change client file size according to server file size. Then client ends up losing what it just writes back in pNFS path. So we should skip updating client file size if file needs layoutcommit. Signed-off-by: Peng Tao Cc: stable@kernel.org [2.6.39] Signed-off-by: Trond Myklebust commit 0b760113a3a155269a3fba93a409c640031dd68f Author: Trond Myklebust Date: Tue May 31 15:15:34 2011 -0400 NLM: Don't hang forever on NLM unlock requests If the NLM daemon is killed on the NFS server, we can currently end up hanging forever on an 'unlock' request, instead of aborting. Basically, if the rpcbind request fails, or the server keeps returning garbage, we really want to quit instead of retrying. Tested-by: Vasily Averin Signed-off-by: Trond Myklebust Cc: stable@kernel.org commit 9e3bd4e24e94d60d2e0762e919aab6c9a7fc0c5b Author: Weston Andros Adamson Date: Tue May 31 21:46:50 2011 -0400 NFS: fix umount of pnfs filesystems Unmounting a pnfs filesystem hangs using filelayout and possibly others. This fixes the use of the rcu protected node by making use of a new 'tmpnode' for the temporary purge list. Also, the spinlock shouldn't be held when calling synchronize_rcu(). Signed-off-by: Weston Andros Adamson Signed-off-by: Andy Adamson Signed-off-by: Trond Myklebust commit de1b794130b130e77ffa975bb58cb843744f9ae5 Author: Jan Kara Date: Mon Jun 13 15:38:22 2011 -0400 jbd2: Fix oops in jbd2_journal_remove_journal_head() jbd2_journal_remove_journal_head() can oops when trying to access journal_head returned by bh2jh(). This is caused for example by the following race: TASK1 TASK2 jbd2_journal_commit_transaction() ... processing t_forget list __jbd2_journal_refile_buffer(jh); if (!jh->b_transaction) { jbd_unlock_bh_state(bh); jbd2_journal_try_to_free_buffers() jbd2_journal_grab_journal_head(bh) jbd_lock_bh_state(bh) __journal_try_to_free_buffer() jbd2_journal_put_journal_head(jh) jbd2_journal_remove_journal_head(bh); jbd2_journal_put_journal_head() in TASK2 sees that b_jcount == 0 and buffer is not part of any transaction and thus frees journal_head before TASK1 gets to doing so. Note that even buffer_head can be released by try_to_free_buffers() after jbd2_journal_put_journal_head() which adds even larger opportunity for oops (but I didn't see this happen in reality). Fix the problem by making transactions hold their own journal_head reference (in b_jcount). That way we don't have to remove journal_head explicitely via jbd2_journal_remove_journal_head() and instead just remove journal_head when b_jcount drops to zero. The result of this is that [__]jbd2_journal_refile_buffer(), [__]jbd2_journal_unfile_buffer(), and __jdb2_journal_remove_checkpoint() can free journal_head which needs modification of a few callers. Also we have to be careful because once journal_head is removed, buffer_head might be freed as well. So we have to get our own buffer_head reference where it matters. Signed-off-by: Jan Kara Signed-off-by: "Theodore Ts'o" commit 1fb74cda1b5e9c6207225fda5ef7504e815ce0e0 Author: Tao Ma Date: Sun Jun 12 22:44:10 2011 -0400 jbd2: Remove obsolete parameters in the comments for some jbd2 functions credits isn't a parameter for jbd2_journal_get_write_access and jbd2_journal_get_undo_access. So remove the corresponding comments. Acked-by: Jan Kara Cc: Randy Dunlap Signed-off-by: Tao Ma Signed-off-by: "Theodore Ts'o" commit a9c667f8f0656631ee5438baaf21bf30d5f67375 Author: Lukas Czerner Date: Mon Jun 6 09:51:52 2011 -0400 ext4: fixed tracepoints cleanup While creating fixed tracepoints for ext3, basically by porting them from ext4, I found a lot of useless retyping, wrong type usage, useless variable passing and other inconsistencies in the ext4 fixed tracepoint code. This patch cleans the fixed tracepoint code for ext4 and also simplify some of them. Signed-off-by: Lukas Czerner Signed-off-by: "Theodore Ts'o" commit c03f8aa9abdd517477c2021ea1251939b4da49e6 Author: Lukas Czerner Date: Mon Jun 6 00:06:52 2011 -0400 ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap Currently we are not marking the extent as the last one (FIEMAP_EXTENT_LAST) if there is a hole at the end of the file. This is because we just do not check for it right now and continue searching for next extent. But at the point we hit the hole at the end of the file, it is too late. This commit adds check for the allocated block in subsequent extent and if there is no more extents (block = EXT_MAX_BLOCKS) just flag the current one as the last one. This behaviour has been spotted unintentionally by 252 xfstest, when the test hangs out, because of wrong loop condition. However on other filesystems (like xfs) it will exit anyway, because we notice the last extent flag and exit. With this patch xfstest 252 does not hang anymore, ext4 fiemap implementation still reports bad extent type in some cases, however this seems to be different issue. Signed-off-by: Lukas Czerner Signed-off-by: "Theodore Ts'o" commit f17722f917b2f21497deb6edc62fb1683daa08e6 Author: Lukas Czerner Date: Mon Jun 6 00:05:17 2011 -0400 ext4: Fix max file size and logical block counting of extent format file Kazuya Mio reported that he was able to hit BUG_ON(next == lblock) in ext4_ext_put_gap_in_cache() while creating a sparse file in extent format and fill the tail of file up to its end. We will hit the BUG_ON when we write the last block (2^32-1) into the sparse file. The root cause of the problem lies in the fact that we specifically set s_maxbytes so that block at s_maxbytes fit into on-disk extent format, which is 32 bit long. However, we are not storing start and end block number, but rather start block number and length in blocks. It means that in order to cover extent from 0 to EXT_MAX_BLOCK we need EXT_MAX_BLOCK+1 to fit into len (because we counting block 0 as well) - and it does not. The only way to fix it without changing the meaning of the struct ext4_extent members is, as Kazuya Mio suggested, to lower s_maxbytes by one fs block so we can cover the whole extent we can get by the on-disk extent format. Also in many places EXT_MAX_BLOCK is used as length instead of maximum logical block number as the name suggests, it is all a bit messy. So this commit renames it to EXT_MAX_BLOCKS and change its usage in some places to actually be maximum number of blocks in the extent. The bug which this commit fixes can be reproduced as follows: dd if=/dev/zero of=/mnt/mp1/file bs= count=1 seek=$((2**32-2)) sync dd if=/dev/zero of=/mnt/mp1/file bs= count=1 seek=$((2**32-1)) Reported-by: Kazuya Mio Signed-off-by: Lukas Czerner Signed-off-by: "Theodore Ts'o" commit 5def1360252b974faeb438775c19c14338bc1903 Author: Yongqiang Yang Date: Sun Jun 5 23:26:40 2011 -0400 ext4: correct comments for ext4_free_blocks() metadata is not parameter of ext4_free_blocks() any more. Signed-off-by: Yongqiang Yang Signed-off-by: "Theodore Ts'o"