commit bccaeafd7c117acee36e90d37c7e05c19be9e7bf
Merge: 68d0080 ecc9046
Author: Linus Torvalds
Date:   Wed Jun 22 21:49:07 2011 -0700

    Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
      jfs: agstart field must be 64 bits
      JFS: Don't save agno in the inode
      jfs: Update agstart when resizing volume
      jfs: old_agsize should be 64 bits in jfs_extendfs

commit 68d0080f1e222757c85606d3eaf81b5c4aa7719f
Merge: f957db4 a5f76d5
Author: Linus Torvalds
Date:   Wed Jun 22 21:08:52 2011 -0700

    Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6

    * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
      PCI / PM: Block races between runtime PM and system sleep
      PM / Domains: Update documentation
      PM / Runtime: Handle clocks correctly if CONFIG_PM_RUNTIME is unset
      PM: Fix async resume following suspend failure
      PM: Free memory bitmaps if opening /dev/snapshot fails
      PM: Rename dev_pm_info.in_suspend to is_prepared
      PM: Update documentation regarding sysdevs
      PM / Runtime: Update doc: usage count no longer incremented across system PM

commit f957db4fcdd8f03e186aa8f041f4049e76ab741c
Author: David Rientjes
Date:   Wed Jun 22 18:13:04 2011 -0700

    mm, hotplug: protect zonelist building with zonelists_mutex

    Commit 959ecc48fc75 ("mm/memory_hotplug.c: fix building of node hotplug
    zonelist") does not protect the build_all_zonelists() call with
    zonelists_mutex as needed.  This can lead to races in constructing
    zonelist ordering if a concurrent build is underway.  Protecting this
    with lock_memory_hotplug() is insufficient since zonelists can be
    rebuilt through sysfs as well.
    Signed-off-by: David Rientjes
    Reviewed-by: KOSAKI Motohiro
    Signed-off-by: Linus Torvalds

commit 7553e8f2d5161a2b7a9b7a9f37be1b77e735552f
Author: David Rientjes
Date:   Wed Jun 22 18:13:01 2011 -0700

    mm, hotplug: fix error handling in mem_online_node()

    The error handling in mem_online_node() is incorrect: hotadd_new_pgdat()
    returns NULL if the new pgdat could not have been allocated and a
    pointer to it otherwise.

    mem_online_node() should fail if hotadd_new_pgdat() fails, not the
    inverse.  This fixes an issue when memoryless nodes are not onlined and
    their sysfs interface is not registered when their first cpu is brought
    up.

    The bug was introduced by commit cf23422b9d76 ("cpu/mem hotplug: enable
    CPUs online before local memory online") iow v2.6.35.

    Signed-off-by: David Rientjes
    Reviewed-by: KOSAKI Motohiro
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

commit b1d7dd80aadb9042e83f9778b484a2f92e0b04d4
Author: David Howells
Date:   Tue Jun 21 14:32:05 2011 +0100

    KEYS: Fix error handling in construct_key_and_link()

    Fix error handling in construct_key_and_link().

    If construct_alloc_key() returns an error, it shouldn't pass out through
    the normal path as the key_serial() called by the kleave() statement
    will oops when it gets an error code in the pointer:

      BUG: unable to handle kernel paging request at ffffffffffffff84
      IP: [] request_key_and_link+0x4d7/0x52f
      ..
      Call Trace:
       [] request_key+0x41/0x75
       [] cifs_get_spnego_key+0x206/0x226 [cifs]
       [] CIFS_SessSetup+0x511/0x1234 [cifs]
       [] cifs_setup_session+0x90/0x1ae [cifs]
       [] cifs_get_smb_ses+0x34b/0x40f [cifs]
       [] cifs_mount+0x13f/0x504 [cifs]
       [] cifs_do_mount+0xc4/0x672 [cifs]
       [] mount_fs+0x69/0x155
       [] vfs_kern_mount+0x63/0xa0
       [] do_kern_mount+0x4d/0xdf
       [] do_mount+0x63c/0x69f
       [] sys_mount+0x88/0xc2
       [] system_call_fastpath+0x16/0x1b

    Signed-off-by: David Howells
    Acked-by: Jeff Layton
    Signed-off-by: Linus Torvalds

commit 35052cffe0081904f3362c05818db900dd9dc7de
Author: David Howells
Date:   Tue Jun 21 10:29:51 2011 +0100

    MN10300: asm/uaccess.h needs to #include linux/kernel.h for might_sleep()

    MN10300's asm/uaccess.h needs to #include linux/kernel.h to get
    might_sleep() otherwise it fails to build on MN10300 allyesconfig.

    This fails in a few places with messages like the following:

      In file included from security/keys/trusted.c:14:
      include/linux/uaccess.h: In function '__copy_from_user_nocache':
      include/linux/uaccess.h:52: error: implicit declaration of function 'might_sleep'

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

commit 2992c4bd5742b31a0ee00a76eee9c1c284507418
Merge: e08f6d4 1650add
Author: Linus Torvalds
Date:   Tue Jun 21 18:20:55 2011 -0700

    Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6

    * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
      NFS: Fix decode_secinfo_maxsz
      NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test
      NFSv4.1: Fix some issues with pnfs_generic_pg_test
      NFSv4.1: file layout must consider pg_bsize for coalescing
      pnfs-obj: No longer needed to take an extra ref at add_device
      SUNRPC: Ensure the RPC client only quits on fatal signals
      NFSv4: Fix a readdir regression
      nfs4.1: mark layout as bad on error path in _pnfs_return_layout
      nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout
      NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path
      NFS: (d)printks should use %zd for ssize_t arguments
      NFSv4.1: fix break condition in pnfs_find_lseg
      nfs4.1: fix several problems with _pnfs_return_layout
      NFSv4.1: allow zero fh array in filelayout decode layout
      NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
      NFSv4.1: Fix a refcounting issue in the pNFS device id cache
      NFSv4.1: deprecate headerpadsz in CREATE_SESSION
      NFS41: do not update isize if inode needs layoutcommit
      NLM: Don't hang forever on NLM unlock requests
      NFS: fix umount of pnfs filesystems

commit a5f76d5eba157bf637beb2dd18026db2917c512e
Author: Rafael J. Wysocki
Date:   Tue Jun 21 23:47:15 2011 +0200

    PCI / PM: Block races between runtime PM and system sleep

    After commit e8665002477f0278f84f898145b1f141ba26ee26
    (PM: Allow pm_runtime_suspend() to succeed during system suspend) it
    is possible that a device resumed by the pm_runtime_resume(dev) in
    pci_pm_prepare() will be suspended immediately from a work item,
    timer function or otherwise, defeating the very purpose of calling
    pm_runtime_resume(dev) from there.  To prevent that from happening
    it is necessary to increment the runtime PM usage counter of the
    device by replacing pm_runtime_resume() with pm_runtime_get_sync().
    Moreover, the incremented runtime PM usage counter has to be
    decremented by the corresponding pci_pm_complete(), via
    pm_runtime_put_sync().

    Signed-off-by: Rafael J. Wysocki
    Cc: stable@kernel.org
    Acked-by: Jesse Barnes

commit ca9c6890b598997165a7c85c001f382c910f12b0
Author: Rafael J. Wysocki
Date:   Tue Jun 21 23:25:32 2011 +0200

    PM / Domains: Update documentation

    Commit 4d27e9dcff00a6425d779b065ec8892e4f391661 (PM: Make power
    domain callbacks take precedence over subsystem ones) forgot to
    update the device power management documentation to take changes
    made by it into account.  Correct that mistake.

    Signed-off-by: Rafael J. Wysocki

commit 4d1518f5668ef1b3dff6c3b30fa761fe5573cdaa
Author: Rafael J. Wysocki
Date:   Tue Jun 21 23:24:33 2011 +0200

    PM / Runtime: Handle clocks correctly if CONFIG_PM_RUNTIME is unset

    Commit 85eb8c8d0b0900c073b0e6f89979ac9c439ade1a (PM / Runtime:
    Generic clock manipulation rountines for runtime PM (v6)) converted
    the shmobile platform to using generic code for runtime PM clock
    management, but it changed the behavior for CONFIG_PM_RUNTIME unset
    incorrectly.

    Specifically, for CONFIG_PM_RUNTIME unset pm_runtime_clk_notify()
    should enable clocks for action equal to BUS_NOTIFY_BIND_DRIVER and
    it should disable them for action equal to BUS_NOTIFY_UNBOUND_DRIVER
    (instead of BUS_NOTIFY_ADD_DEVICE and BUS_NOTIFY_DEL_DEVICE,
    respectively).  Make this function behave as appropriate.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Magnus Damm

commit 6d0e0e84f66d32c33511984dd3badd32364b863c
Author: Alan Stern
Date:   Sat Jun 18 22:42:09 2011 +0200

    PM: Fix async resume following suspend failure

    The PM core doesn't handle suspend failures correctly when it comes
    to asynchronously suspended devices.  These devices are moved onto
    the dpm_suspended_list as soon as the corresponding async thread is
    started up, and they remain on the list even if they fail to
    suspend or the sleep transition is cancelled before they get
    suspended.  As a result, when the PM core unwinds the transition,
    it tries to resume the devices even though they were never
    suspended.

    This patch (as1474) fixes the problem by adding a new "is_suspended"
    flag to dev_pm_info.  Devices are resumed only if the flag is set.

    [rjw:
     * Moved the dev->power.is_suspended check into device_resume(),
       because we need to complete dev->power.completion and clear
       dev->power.is_prepared too for devices whose
       dev->power.is_suspended flags are unset.
     * Fixed __device_suspend() to avoid setting dev->power.is_suspended
       if async_error is different from zero.]

    Signed-off-by: Alan Stern
    Signed-off-by: Rafael J. Wysocki
    Cc: stable@kernel.org

commit 8440f4b19494467883f8541b7aa28c7bbf6ac92b
Author: Michal Kubecek
Date:   Sat Jun 18 20:34:01 2011 +0200

    PM: Free memory bitmaps if opening /dev/snapshot fails

    When opening /dev/snapshot device, snapshot_open() creates memory
    bitmaps which are freed in snapshot_release(). But if any of the
    callbacks called by pm_notifier_call_chain() returns NOTIFY_BAD, open()
    fails, snapshot_release() is never called and bitmaps are not freed.
    Next attempt to open /dev/snapshot then triggers BUG_ON() check in
    create_basic_memory_bitmaps(). This happens e.g. when vmwatchdog module
    is active on s390x.

    Signed-off-by: Michal Kubecek
    Signed-off-by: Rafael J. Wysocki
    Cc: stable@kernel.org

commit f76b168b6f117a49d36307053e1acbe30580ea5b
Author: Alan Stern
Date:   Sat Jun 18 20:22:23 2011 +0200

    PM: Rename dev_pm_info.in_suspend to is_prepared

    This patch (as1473) renames the "in_suspend" field in struct
    dev_pm_info to "is_prepared", in preparation for an upcoming change.
    The new name is more descriptive of what the field really means.

    Signed-off-by: Alan Stern
    Signed-off-by: Rafael J. Wysocki
    Cc: stable@kernel.org

commit 78420884e680da8fbc3240de2d3106437042381e
Author: Rafael J. Wysocki
Date:   Sat Jun 18 19:53:57 2011 +0200

    PM: Update documentation regarding sysdevs

    The part of Documentation/power/devices.txt regarding sysdevs is not
    valid any more after commit 2e711c04dbbf7a7732a3f7073b1fc285d12b369d
    (PM: Remove sysdev suspend, resume and shutdown operations), so
    remove it.

    Signed-off-by: Rafael J. Wysocki

commit 129b656a0de9a229a72fe4bb6bacd134a1477b44
Author: Kevin Hilman
Date:   Fri Jun 10 16:05:51 2011 -0700

    PM / Runtime: Update doc: usage count no longer incremented across system PM

    Commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow
    pm_runtime_suspend() to succeed during system suspend) removed usage
    count increment across system PM.

    Update doc to reflect this.

    Signed-off-by: Kevin Hilman
    Signed-off-by: Rafael J. Wysocki

commit e08f6d4131ab964420f0bcabecc68d75fb49df79
Merge: 890879c c7d74b0
Author: Linus Torvalds
Date:   Tue Jun 21 10:36:06 2011 -0700

    Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
      IB/qib: Ensure that LOS and DFE are being turned off
      RDMA/cxgb4: Couple of abort fixes
      RDMA/cxgb4: Don't truncate MR lengths
      RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs

commit 890879cfa08f5ceaa09810611f46e890f7d57ff6
Merge: 5629937 de1b794
Author: Linus Torvalds
Date:   Tue Jun 21 10:22:35 2011 -0700

    Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
      jbd2: Fix oops in jbd2_journal_remove_journal_head()
      jbd2: Remove obsolete parameters in the comments for some jbd2 functions
      ext4: fixed tracepoints cleanup
      ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
      ext4: Fix max file size and logical block counting of extent format file
      ext4: correct comments for ext4_free_blocks()

commit 1650add23578b5ca35c1f1e863987180a8c03779
Author: Bryan Schumaker
Date:   Thu Jun 2 15:07:35 2011 -0400

    NFS: Fix decode_secinfo_maxsz

    I initially did the calculation in bytes, and not words

    Signed-off-by: Bryan Schumaker
    Signed-off-by: Trond Myklebust

commit 19982ba8562e33083cb5bbb59a74855d8a9624ea
Author: Trond Myklebust
Date:   Fri Jun 10 13:30:23 2011 -0400

    NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test

    And document what is going on there...

    Signed-off-by: Trond Myklebust

commit 8f7d5efbef8718a774ac5e347b4ec069f17fd9b4
Author: Trond Myklebust
Date:   Fri Jun 10 13:30:22 2011 -0400

    NFSv4.1: Fix some issues with pnfs_generic_pg_test

    1. If the intention is to coalesce requests 'prev' and 'req' then we
       have to ensure at least that we have a layout starting at
       req_offset(prev).

    2. If we're only requesting a minimal layout of length desc->pg_count,
       we need to test the length actually returned by the server before
       we allow the coalescing to occur.

    3. We need to deal correctly with (pgio->lseg == NULL)

    4. Fixup the test guarding the pnfs_update_layout.

    Signed-off-by: Trond Myklebust

commit ecc90462b428db2ad2ee5081c45496ed10f3a633
Author: Dave Kleikamp
Date:   Mon Jun 20 17:53:24 2011 -0500

    jfs: agstart field must be 64 bits

    The previous patch added the agstart field to jfs_ip, but declared it
    a long. We need to make sure it's 64 bits on every platform.

    Signed-off-by: Dave Kleikamp

commit 19345cb299e8234006c5125151ab723e851a1d24
Author: Benny Halevy
Date:   Sun Jun 19 18:33:46 2011 -0400

    NFSv4.1: file layout must consider pg_bsize for coalescing

    Otherwise we end up overflowing the rpc buffer size on the receive end.

    Signed-off-by: Benny Halevy
    Signed-off-by: Trond Myklebust

commit d31b53e3cd069e02290ed8a648aa8c7618d6fe77
Author: Dave Kleikamp
Date:   Mon Jun 20 10:53:46 2011 -0500

    JFS: Don't save agno in the inode

    Resizing the file system can result in an in-memory inode being
    remapped to a different aggregate group (AG). A cached AG number can
    cause problems when trying to free or allocate inodes. Instead, save
    the IAG's agstart address and calculate the agno when we need it.

    Signed-off-by: Dave Kleikamp

commit 28e0fa894cd5996d3007ce82f07226f79beb7286
Author: Dave Kleikamp
Date:   Mon Jun 20 10:32:46 2011 -0500

    jfs: Update agstart when resizing volume

    A comment indicates that the IAG's agstart does not need to be updated
    since it will always point to a block in the same aggregate group, but
    jfs_fsck isn't so forgiving and reports it as an error.

    I'm fixing this in jfsutils as well, so either a new kernel or new
    utilities will be sufficient to fix the problem.
    Signed-off-by: Dave Kleikamp

commit 206b6310fd0268a6ca50cf36f03b0f4eee5602ec
Author: Dave Kleikamp
Date:   Mon Jun 20 10:30:04 2011 -0500

    jfs: old_agsize should be 64 bits in jfs_extendfs

    Signed-off-by: Dave Kleikamp

commit df18d127f4fed7a0284bcfa8d2843800cdb63b72
Author: Boaz Harrosh
Date:   Fri Jun 17 16:25:51 2011 -0400

    pnfs-obj: No longer needed to take an extra ref at add_device

    Andy's last device_cache patches already take an extra reference on
    the newly inserted device_id, so we can remove it from obj-io.

    Without this patch the device_ids are leaked.

    Andy's patches are not in Linus tree yet, so I'm not sure if they are
    scheduled for this Kernel or the next. This patch should be added as
    part of these.

    CC: Andy Adamson
    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

commit c7d74b090913102e7917dd02bb574ef060e1e930
Merge: 8da7e7a 3126448
Author: Roland Dreier
Date:   Fri Jun 17 11:57:55 2011 -0700

    Merge branches 'cxgb4' and 'qib' into for-next

commit 3126448451105fae59de0058c68692aa09aa4c37
Author: Mitko Haralanov
Date:   Thu Jun 9 20:27:26 2011 +0000

    IB/qib: Ensure that LOS and DFE are being turned off

    Due to timing, it is possible for the LOS and DFE to remain on.  This
    is due to the link progressing to LinkUP prior to the driver getting
    the first Status Changed interrupt.  By expanding the conditions under
    which LOS is turned off and DFE timeout is being set, timing is no
    longer an issue.

    Signed-off-by: Mitko Haralanov
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

commit 8da7e7a55231543b84ac84e93ad5ca9d340773d7
Author: Steve Wise
Date:   Tue Jun 14 20:59:27 2011 +0000

    RDMA/cxgb4: Couple of abort fixes

    - fix a race where the driver could end up sending a close_con_req
      after an abort_rpl.  In c4iw_ep_disconnect(), send abort or close
      request with the ep mutex held.

    - fix a hang where driver fails to wake up when a connection is reset
      during a normal close.  Wake up any waiters in the interrupt path,
      and correctly cleanup after rdma_fini() failures.

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

commit 301c2c3f039a1f9478f6cbef60f2ccd4da9bd4a1
Author: Steve Wise
Date:   Tue Jun 14 20:59:21 2011 +0000

    RDMA/cxgb4: Don't truncate MR lengths

    Remove left-over code from T3 that limited MR sizes to 32b.

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

commit 2ff7d09a1b0f20f2d9c1bde0e003d4e384de2313
Author: Steve Wise
Date:   Wed Jun 1 17:49:14 2011 +0000

    RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs

    Memory allocated for user CQs gets rounded up to the next page
    boundary.  And after rounding, we recalculate the resulting IQ depth
    and we need to make sure we don't exceed the HW limits.  This bug can
    result in a much smaller CQ allocated than was expected if the HW size
    field is exceeded, resulting in CQ overflow failures.

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

commit 5afa9133cfe67f1bfead6049a9640c9262a7101c
Author: Trond Myklebust
Date:   Fri Jun 17 10:14:59 2011 -0400

    SUNRPC: Ensure the RPC client only quits on fatal signals

    Fix a couple of instances where we were exiting the RPC client on
    arbitrary signals. We should only do so on fatal signals.

    Cc: stable@kernel.org
    Signed-off-by: Trond Myklebust

commit ee7b75fc4f3ae49e1f25bf56219bb5de3c29afaf
Author: Trond Myklebust
Date:   Thu Jun 16 13:15:41 2011 -0400

    NFSv4: Fix a readdir regression

    Commit 7ebb9315 (NFS: use secinfo when crossing mountpoints) introduces
    a regression when decoding an NFSv4 readdir entry that sets the
    rdattr_error field.
    By treating the resulting value as if it is a decoding error, the
    current code may cause us to skip valid readdir entries.
    Reported-by: Andy Adamson
    Cc: stable@kernel.org [2.6.39]
    Signed-off-by: Trond Myklebust

commit 9e2dfdb3081edfae66a49013517e80dd8a0469fa
Author: Fred Isaman
Date:   Wed Jun 15 14:32:02 2011 -0400

    nfs4.1: mark layout as bad on error path in _pnfs_return_layout

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

commit ea0ded748bdea78f9e2fefb571f7d6ce9edb4f89
Author: Fred Isaman
Date:   Wed Jun 15 12:31:02 2011 -0400

    nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout

    mark_matching_lsegs_invalid could put the last ref to the layout, so
    the get_layout_hdr needs to be called first.

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

commit 1ed3a8539af7b36aa5c977f304e80f7fc8d27bfc
Author: Benny Halevy
Date:   Wed Jun 15 11:39:57 2011 -0400

    NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path

    We always get a reference on the layout header and we rely on
    nfs4_layoutreturn_release to put it.  If we hit an allocation error
    before starting the rpc proc we bail out early without dereferencing
    the layout header properly.

    Signed-off-by: Benny Halevy
    Signed-off-by: Trond Myklebust

commit c7fd06228b994190d8369a2a0acf5224e4e13d1a
Author: David Howells
Date:   Wed Jun 15 00:55:44 2011 +0100

    NFS: (d)printks should use %zd for ssize_t arguments

    (d)printks should use %zd for ssize_t arguments not %ld, otherwise they
    might get a warning.  I see the following with MN10300.

      fs/nfs/objlayout/objlayout.c: In function 'objlayout_read_done':
      fs/nfs/objlayout/objlayout.c:294: warning: format '%ld' expects type 'long int', but argument 3 has type 'ssize_t'

    Signed-off-by: David Howells
    cc: Trond Myklebust
    cc: linux-nfs@vger.kernel.org
    Signed-off-by: Trond Myklebust

commit d771e3a43e23a37398b7e05a9d1b1036d698263c
Author: Benny Halevy
Date:   Tue Jun 14 16:30:16 2011 -0400

    NFSv4.1: fix break condition in pnfs_find_lseg

    The break condition to skip out of the loop got broken when cmp_layout
    was changed.

    Essentially, we want to stop looking once we know no layout on the
    remainder of the list can match the first byte of the looked-up range.

    Reported-by: Peng Tao
    Signed-off-by: Benny Halevy
    Signed-off-by: Trond Myklebust

commit a2e1d4f2e5ed83850de92a491ef225824cb457bd
Author: Fred Isaman
Date:   Mon Jun 13 18:54:53 2011 -0400

    nfs4.1: fix several problems with _pnfs_return_layout

    _pnfs_return_layout had the following problems:

    - it did not call pnfs_free_lseg_list on all paths
    - it unintentionally did a forgetful return when there was no
      outstanding io
    - it raced with concurrent LAYOUTGETS

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

commit cec765cf5891c7fc3d905832b481bfb6fd55825d
Author: Andy Adamson
Date:   Mon Jun 13 18:36:17 2011 -0400

    NFSv4.1: allow zero fh array in filelayout decode layout

    Signed-off-by: Andy Adamson
    cc:stable@kernel.org [2.6.39]
    Signed-off-by: Trond Myklebust

commit 533eb4611c9eea53072eb6a61d5a6393b6a77ed7
Author: Andy Adamson
Date:   Mon Jun 13 18:25:56 2011 -0400

    NFSv4.1: allow nfs_fhget to succeed with mounted on fileid

    Commit 28331a46d88459788c8fca72dbb0415cd7f514c9 "Ensure we request the
    ordinary fileid when doing readdirplus"
    changed the meaning of NFS_ATTR_FATTR_FILEID which used to be set when
    FATTR4_WORD1_MOUNTED_ON_FILEID was requested.

    Allow nfs_fhget to succeed with only a mounted on fileid when crossing
    a mountpoint or a referral.

    Ask for the fileid of the absent file system if mounted_on_fileid is
    not supported.

    Signed-off-by: Andy Adamson
    cc:stable@kernel.org [2.6.39]
    Signed-off-by: Trond Myklebust

commit 1d92a08da23848a38eece4df7eaa4e8ec0e6c699
Author: Trond Myklebust
Date:   Tue Jun 14 12:07:38 2011 -0400

    NFSv4.1: Fix a refcounting issue in the pNFS device id cache

    When we add something to the global device id cache, we need to bump
    the reference count, so that the cache itself holds a reference.
    Signed-off-by: Trond Myklebust

commit c9c30dd5f73dccaa326a54dfcf490316946aea87
Author: Benny Halevy
Date:   Sat Jun 11 17:08:39 2011 -0400

    NFSv4.1: deprecate headerpadsz in CREATE_SESSION

    We don't support header padding yet so better off ditching it

    Reported-by: Sid Moore
    Signed-off-by: Benny Halevy
    Signed-off-by: Trond Myklebust

commit 0f66b5984df2fe1617c05900a39a7ef493ca9de9
Author: Peng Tao
Date:   Sat Oct 16 22:07:46 2010 -0700

    NFS41: do not update isize if inode needs layoutcommit

    nfs_update_inode will update isize if there are no queued pages. For
    pNFS, layoutcommit is supposed to change file size on server, the same
    effect as queued pages. nfs_update_inode may be called when dirty pages
    are written back (nfsi->npages==0) but layoutcommit is not sent, and it
    will change client file size according to server file size. Then client
    ends up losing what it just wrote back in pNFS path. So we should skip
    updating client file size if file needs layoutcommit.

    Signed-off-by: Peng Tao
    Cc: stable@kernel.org [2.6.39]
    Signed-off-by: Trond Myklebust

commit 0b760113a3a155269a3fba93a409c640031dd68f
Author: Trond Myklebust
Date:   Tue May 31 15:15:34 2011 -0400

    NLM: Don't hang forever on NLM unlock requests

    If the NLM daemon is killed on the NFS server, we can currently end up
    hanging forever on an 'unlock' request, instead of aborting. Basically,
    if the rpcbind request fails, or the server keeps returning garbage, we
    really want to quit instead of retrying.

    Tested-by: Vasily Averin
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

commit 9e3bd4e24e94d60d2e0762e919aab6c9a7fc0c5b
Author: Weston Andros Adamson
Date:   Tue May 31 21:46:50 2011 -0400

    NFS: fix umount of pnfs filesystems

    Unmounting a pnfs filesystem hangs using filelayout and possibly others.
    This fixes the use of the rcu protected node by making use of a new
    'tmpnode' for the temporary purge list. Also, the spinlock shouldn't be
    held when calling synchronize_rcu().
    Signed-off-by: Weston Andros Adamson
    Signed-off-by: Andy Adamson
    Signed-off-by: Trond Myklebust

commit de1b794130b130e77ffa975bb58cb843744f9ae5
Author: Jan Kara
Date:   Mon Jun 13 15:38:22 2011 -0400

    jbd2: Fix oops in jbd2_journal_remove_journal_head()

    jbd2_journal_remove_journal_head() can oops when trying to access
    journal_head returned by bh2jh(). This is caused for example by the
    following race:

      TASK1                                   TASK2
      jbd2_journal_commit_transaction()
        ...
        processing t_forget list
          __jbd2_journal_refile_buffer(jh);
          if (!jh->b_transaction) {
            jbd_unlock_bh_state(bh);
                                              jbd2_journal_try_to_free_buffers()
                                                jbd2_journal_grab_journal_head(bh)
                                                jbd_lock_bh_state(bh)
                                                __journal_try_to_free_buffer()
                                                jbd2_journal_put_journal_head(jh)
            jbd2_journal_remove_journal_head(bh);

    jbd2_journal_put_journal_head() in TASK2 sees that b_jcount == 0 and
    buffer is not part of any transaction and thus frees journal_head
    before TASK1 gets to doing so. Note that even buffer_head can be
    released by try_to_free_buffers() after
    jbd2_journal_put_journal_head() which adds even larger opportunity for
    oops (but I didn't see this happen in reality).

    Fix the problem by making transactions hold their own journal_head
    reference (in b_jcount). That way we don't have to remove journal_head
    explicitly via jbd2_journal_remove_journal_head() and instead just
    remove journal_head when b_jcount drops to zero. The result of this is
    that [__]jbd2_journal_refile_buffer(),
    [__]jbd2_journal_unfile_buffer(), and
    __jbd2_journal_remove_checkpoint() can free journal_head which needs
    modification of a few callers. Also we have to be careful because once
    journal_head is removed, buffer_head might be freed as well. So we
    have to get our own buffer_head reference where it matters.
    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

commit 1fb74cda1b5e9c6207225fda5ef7504e815ce0e0
Author: Tao Ma
Date:   Sun Jun 12 22:44:10 2011 -0400

    jbd2: Remove obsolete parameters in the comments for some jbd2 functions

    credits isn't a parameter for jbd2_journal_get_write_access and
    jbd2_journal_get_undo_access. So remove the corresponding comments.

    Acked-by: Jan Kara
    Cc: Randy Dunlap
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

commit a9c667f8f0656631ee5438baaf21bf30d5f67375
Author: Lukas Czerner
Date:   Mon Jun 6 09:51:52 2011 -0400

    ext4: fixed tracepoints cleanup

    While creating fixed tracepoints for ext3, basically by porting them
    from ext4, I found a lot of useless retyping, wrong type usage, useless
    variable passing and other inconsistencies in the ext4 fixed tracepoint
    code.

    This patch cleans up the fixed tracepoint code for ext4 and also
    simplifies some of them.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

commit c03f8aa9abdd517477c2021ea1251939b4da49e6
Author: Lukas Czerner
Date:   Mon Jun 6 00:06:52 2011 -0400

    ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap

    Currently we are not marking the extent as the last one
    (FIEMAP_EXTENT_LAST) if there is a hole at the end of the file. This is
    because we just do not check for it right now and continue searching
    for next extent. But at the point we hit the hole at the end of the
    file, it is too late.

    This commit adds a check for the allocated block in the subsequent
    extent and if there are no more extents (block = EXT_MAX_BLOCKS) just
    flags the current one as the last one.

    This behaviour has been spotted unintentionally by xfstest 252, when
    the test hangs because of a wrong loop condition. However on other
    filesystems (like xfs) it will exit anyway, because we notice the last
    extent flag and exit.

    With this patch xfstest 252 does not hang anymore. The ext4 fiemap
    implementation still reports bad extent type in some cases, however
    this seems to be a different issue.
    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

commit f17722f917b2f21497deb6edc62fb1683daa08e6
Author: Lukas Czerner
Date:   Mon Jun 6 00:05:17 2011 -0400

    ext4: Fix max file size and logical block counting of extent format file

    Kazuya Mio reported that he was able to hit BUG_ON(next == lblock)
    in ext4_ext_put_gap_in_cache() while creating a sparse file in extent
    format and filling the tail of the file up to its end. We will hit the
    BUG_ON when we write the last block (2^32-1) into the sparse file.

    The root cause of the problem lies in the fact that we specifically set
    s_maxbytes so that block at s_maxbytes fit into on-disk extent format,
    which is 32 bit long. However, we are not storing start and end block
    number, but rather start block number and length in blocks. It means
    that in order to cover extent from 0 to EXT_MAX_BLOCK we need
    EXT_MAX_BLOCK+1 to fit into len (because we are counting block 0 as
    well) - and it does not.

    The only way to fix it without changing the meaning of the struct
    ext4_extent members is, as Kazuya Mio suggested, to lower s_maxbytes
    by one fs block so we can cover the whole extent we can get by the
    on-disk extent format.

    Also in many places EXT_MAX_BLOCK is used as length instead of maximum
    logical block number as the name suggests, it is all a bit messy. So
    this commit renames it to EXT_MAX_BLOCKS and changes its usage in some
    places to actually be maximum number of blocks in the extent.

    The bug which this commit fixes can be reproduced as follows:

     dd if=/dev/zero of=/mnt/mp1/file bs= count=1 seek=$((2**32-2))
     sync
     dd if=/dev/zero of=/mnt/mp1/file bs= count=1 seek=$((2**32-1))

    Reported-by: Kazuya Mio
    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

commit 5def1360252b974faeb438775c19c14338bc1903
Author: Yongqiang Yang
Date:   Sun Jun 5 23:26:40 2011 -0400

    ext4: correct comments for ext4_free_blocks()

    metadata is not parameter of ext4_free_blocks() any more.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"
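
Note on commit f17722f917b2 ("ext4: Fix max file size and logical block counting of extent format file"): the off-by-one it describes can be checked with a few lines of arithmetic. The following is only a Python sketch, not kernel code. It assumes, as the commit message states, that the extent is stored as a start block plus a length and that the length has to fit in a 32-bit quantity. The names FIELD_MAX, extent_length and fits_in_field are hypothetical and do not appear in the kernel.

```python
# Illustrative sketch only; these names are hypothetical, not kernel identifiers.
FIELD_BITS = 32
FIELD_MAX = 2**FIELD_BITS - 1  # largest value an unsigned 32-bit quantity can hold


def extent_length(first_block: int, last_block: int) -> int:
    """Number of blocks in the inclusive range [first_block, last_block]."""
    return last_block - first_block + 1


def fits_in_field(value: int) -> bool:
    """True if value can be stored in an unsigned 32-bit quantity."""
    return 0 <= value <= FIELD_MAX


# Covering blocks 0 .. 2^32-1 needs a length of 2^32 (block 0 counts too),
# which is one more than a 32-bit quantity can hold.
full_range = extent_length(0, FIELD_MAX)

# Lowering the last usable block by one brings the length back into range.
reduced_range = extent_length(0, FIELD_MAX - 1)

if __name__ == "__main__":
    print(fits_in_field(full_range))     # False
    print(fits_in_field(reduced_range))  # True
```

This mirrors the reasoning in the commit message for lowering s_maxbytes by one filesystem block. It says nothing about how the kernel actually lays out struct ext4_extent.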