GIT 35cb2bfed00c012bb06e43bd5a20b02f6b3d845e git://git.linux-nfs.org/pub/linux/nfs-2.6.git commit 35cb2bfed00c012bb06e43bd5a20b02f6b3d845e Author: J. Bruce Fields Date: Tue Jan 3 09:56:01 2006 +0100 SUNRPC: Make krb5 report unsupported encryption types Print messages when an unsupported encrytion algorthm is requested or there is an error locating a supported algorthm. Signed-off-by: Kevin Coffman Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit 95898e55da60f0a95cc2e43b7362597aead794f4 Author: J. Bruce Fields Date: Tue Jan 3 09:56:01 2006 +0100 SUNRPC: Make spkm3 report unsupported encryption types Print messages when an unsupported encrytion algorthm is requested or there is an error locating a supported algorthm. Signed-off-by: Kevin Coffman Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit 84fb17174a5c462c37ffef76543b7cd643e5cca9 Author: J. Bruce Fields Date: Tue Jan 3 09:56:00 2006 +0100 SUNRPC: Update the spkm3 code to use the make_checksum interface Also update the tokenlen calculations to accomodate g_token_size(). Signed-off-by: Andy Adamson Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit 88df1fe8d2375235dc36919f5292d398ab36cec9 Author: Trond Myklebust Date: Tue Jan 3 09:55:58 2006 +0100 NFSv4: Fix an Oops in nfs_do_expire_all_delegations If the loop errors, we need to exit. Signed-off-by: Trond Myklebust commit 18bae8b68e433c0bb0b24d5ec4f6737c920427fa Author: Trond Myklebust Date: Tue Jan 3 09:55:57 2006 +0100 NFSv4: Allow entries in the idmap cache to expire If someone changes the uid/gid mapping in userland, then we do eventually want those changes to be propagated to the kernel. Currently the kernel assumes that it may cache entries forever. Add an expiration time + garbage collector for idmap entries. Signed-off-by: Trond Myklebust commit e853fbf5787deae0e549fcdfafc55d735b4c1f9a Author: Trond Myklebust Date: Tue Jan 3 09:55:56 2006 +0100 SUNRPC: Clean up xprt_destroy() We ought never to be calling xprt_destroy() if there are still active rpc_tasks. Optimise away the broken code that attempts to "fix" that case. Signed-off-by: Trond Myklebust commit 24f7b4aa7ae698888e6fa663e020b7fad25812be Author: Trond Myklebust Date: Tue Jan 3 09:55:55 2006 +0100 SUNRPC: Ensure client closes the socket when server initiates a close If the server decides to close the RPC socket, we currently don't actually respond until either another RPC call is scheduled, or until xprt_autoclose() gets called by the socket expiry timer (which may be up to 5 minutes later). This patch ensures that xprt_autoclose() is called much sooner if the server closes the socket. Signed-off-by: Trond Myklebust commit 4e7d099731898483ff408c5594d66e269cc38bcc Author: Trond Myklebust Date: Tue Jan 3 09:55:54 2006 +0100 NFS: get rid of some needless code obfuscation in xdr_encode_sattr(). Signed-off-by: Trond Myklebust commit 2419e859e39f9e66b065981173a27d7ebece1dc6 Author: Trond Myklebust Date: Tue Jan 3 09:55:53 2006 +0100 NFS: Send valid mode bits to the server inode->i_mode contains a lot more than just the mode bits. Make sure that we mask away this extra stuff in SETATTR calls to the server. Signed-off-by: Trond Myklebust commit 469c4ebc7df3416328de5c808b80bb5b20ff6835 Author: Chuck Lever Date: Tue Jan 3 09:55:52 2006 +0100 SUNRPC: get rid of cl_chatty Clean up: Every ULP that uses the in-kernel RPC client, except the NLM client, sets cl_chatty. There's no reason why NLM shouldn't set it, so just get rid of cl_chatty and always be verbose. Test-plan: Compile with CONFIG_NFS enabled. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit b2fd211907c14cff95f99956dbb7c30302e613d4 Author: Chuck Lever Date: Tue Jan 3 09:55:51 2006 +0100 SUNRPC: transport switch API for setting port number At some point, transport endpoint addresses will no longer be IPv4. To hide the structure of the rpc_xprt's address field from ULPs and port mappers, add an API for setting the port number during an RPC bind operation. Test-plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked. Probably need to rig a server where certain services aren't running, or that returns an error for some typical operation. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 9c200eb5368adda5c94ef79f3861f9338fd4dfd8 Author: Chuck Lever Date: Tue Jan 3 09:55:50 2006 +0100 SUNRPC: new interface to force an RPC rebind We'd like to hide fields in rpc_xprt and rpc_clnt from upper layer protocols. Start by creating an API to force RPC rebind, replacing logic that simply sets cl_port to zero. Test-plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked. Probably need to rig a server where certain services aren't running, or that returns an error for some typical operation. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit a15b86fd4b7113ccbe27f90eb4fd38239df24094 Author: Chuck Lever Date: Tue Jan 3 09:55:49 2006 +0100 SUNRPC: switchable buffer allocation Add RPC client transport switch support for replacing buffer management on a per-transport basis. In the current IPv4 socket transport implementation, RPC buffers are allocated as needed for each RPC message that is sent. Some transport implementations may choose to use pre-allocated buffers for encoding, sending, receiving, and unmarshalling RPC messages, however. For transports capable of direct data placement, the buffers can be carved out of a pre-registered area of memory rather than from a slab cache. Test-plan: Millions of fsx operations. Performance characterization with "sio" and "iozone". Use oprofile and other tools to look for significant regression in CPU utilization. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 265e2ef45c1b16f683ca46c6dbe3a85fc2c7cfbd Author: J. Bruce Fields Date: Tue Jan 3 09:55:48 2006 +0100 NFSv3: try get_root user-supplied security_flavor Thanks to Ed Keizer for bug and root cause. He says: "... we could only mount the top-level Solaris share. We could not mount deeper into the tree. Investigation showed that Solaris allows UNIX authenticated FSINFO only on the top level of the share. This is a problem because we share/export our home directories one level higher than we mount them. I.e. we share the partition and not the individual home directories. This prevented access to home directories." We still may need to try auth_sys for the case where the client doesn't have appropriate credentials. Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit 83040edba85aa1b19bce64f3d61523255de4c2d7 Author: J. Bruce Fields Date: Tue Jan 3 09:55:46 2006 +0100 NLM: fix parsing of sm notify procedure The procedure that decodes statd sm_notify call seems to be skipping a few arguments. How did this ever work? >From folks at Polyserve. Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit d3f340e9574563570c26d3c7f1c40883092c17f9 Author: J. Bruce Fields Date: Tue Jan 3 09:55:46 2006 +0100 NLM: Further cancel fixes If the server receives an NLM cancel call and finds no waiting lock to cancel, then chances are the lock has already been applied, and the client just hadn't yet processed the NLM granted callback before it sent the cancel. The Open Group text, for example, perimts a server to return either success (LCK_GRANTED) or failure (LCK_DENIED) in this case. But returning an error seems more helpful; the client may be able to use it to recognize that a race has occurred and to recover from the race. So, modify the relevant functions to return an error in this case. Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit 0454b98b8dfe69b8c41aab4396f85fc2a35ff553 Author: J. Bruce Fields Date: Tue Jan 3 09:55:45 2006 +0100 NLM: clean up nlmsvc_delete_block The fl_next check here is superfluous (and possibly a layering violation). Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit 2f57144a476d1ed038e78ca34d72912c5dfdad58 Author: J. Bruce Fields Date: Tue Jan 3 09:55:44 2006 +0100 NLM: don't unlock on cancel requests Currently when lockd gets an NLM_CANCEL request, it also does an unlock for the same range. This is incorrect. The Open Group documentation says that "This procedure cancels an *outstanding* blocked lock request." (Emphasis mine.) Also, consider a client that holds a lock on the first byte of a file, and requests a lock on the entire file. If the client cancels that request (perhaps because the requesting process is signalled), the server shouldn't apply perform an unlock on the entire file, since that will also remove the previous lock that the client was already granted. Or consider a lock request that actually *downgraded* an exclusive lock to a shared lock. Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit fa364876bd8696f5271c30f12e3dba57d38ce13a Author: J. Bruce Fields Date: Tue Jan 3 09:55:42 2006 +0100 NLM: Clean up nlmsvc_grant_reply locking Slightly simpler logic here makes it more trivial to verify that the up's and down's are balanced here. Break out an assignment from a conditional while we're at it. Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust commit a855e95004c0f3aeb2afd495663389fee15b0dea Author: Adrian Bunk Date: Tue Jan 3 09:55:41 2006 +0100 SUNRPC: net/sunrpc/xdr.c: remove xdr_decode_string() This patch removes ths unused function xdr_decode_string(). Signed-off-by: Adrian Bunk Acked-by: Neil Brown Acked-by: Charles Lever Signed-off-by: Trond Myklebust commit a2a225d5a41161a65fdffce14ccdc1c45c1f9819 Author: Trond Myklebust Date: Tue Jan 3 09:55:41 2006 +0100 NFSv4: Allow user to set the port used by the NFSv4 callback channel Signed-off-by: Trond Myklebust commit a8786690111ec8ee04edd60be57d393864c2b749 Author: Trond Myklebust Date: Tue Jan 3 09:55:40 2006 +0100 NFS: Clean up weak cache consistency code ...and ensure that nfs_update_inode() respects wcc Signed-off-by: Trond Myklebust commit 791882385c1b5e3631e5ca899f04c52efa906f7d Author: Trond Myklebust Date: Tue Jan 3 09:55:38 2006 +0100 NFSv4: Ensure DELEGRETURN returns attributes Upon return of a write delegation, the server will almost always bump the change attribute. Ensure that we pick up that change so that we don't invalidate our data cache unnecessarily. Signed-off-by: Trond Myklebust commit 6f89bb4486319d314e9832f814e03425c92156f2 Author: Trond Myklebust Date: Tue Jan 3 09:55:37 2006 +0100 NFSv4: Ensure change attribute returned by GETATTR callback conforms to spec According to RFC3530 we're supposed to cache the change attribute at the time the client receives a write delegation. If the inode is clean, a CB_GETATTR callback by the server to the client is supposed to return the cached change attribute. If, OTOH, the inode is dirty, the client should bump the cached change attribute by 1. Signed-off-by: Trond Myklebust commit 212501c20a1b2d3ea0559f546a49d5e0b2284c01 Author: Trond Myklebust Date: Tue Jan 3 09:55:36 2006 +0100 SUNRPC: Fix a potential race in rpc_pipefs. Signed-off-by: Trond Myklebust commit 1aac4619e12a091d39d906f5b832462f11ea8d90 Author: Trond Myklebust Date: Tue Jan 3 09:55:35 2006 +0100 NFS: Make directIO aware of compound pages... ...and avoid calling set_page_dirty on them Signed-off-by: Trond Myklebust commit 321b058d9f9f6f31d3a5e6ed8b278fdab32606d8 Author: Trond Myklebust Date: Tue Jan 3 09:55:34 2006 +0100 NFS: Make stat() return updated mtimes after a write() The SuS states that a call to write() will cause mtime to be updated on the file. In order to satisfy that requirement, we need to flush out any cached writes in nfs_getattr(). Speed things up slightly by not committing the writes. Signed-off-by: Trond Myklebust commit 234a7f1300d6ed0f284d4ac1642bbab1a49e5ffd Author: Trond Myklebust Date: Tue Jan 3 09:55:33 2006 +0100 NFSv4: Ensure that we return the delegation on the target of a rename too. Signed-off-by: Trond Myklebust commit e5a17cdffefc458219524dc2638d922c0084dac9 Author: Chuck Lever Date: Wed Nov 30 18:09:02 2005 -0500 NFS: support large reads and writes on the wire Most NFS server implementations allow up to 64KB reads and writes on the wire. The Solaris NFS server allows up to a megabyte, for instance. Now the Linux NFS client supports transfer sizes up to 1MB, too. This will help reduce protocol and context switch overhead on read/write intensive NFS workloads, and support larger atomic read and write operations on servers that support them. Test-plan: Connectathon and iozone on mount point with wsize=rsize>32768 over TCP. Tests with NFS over UDP to verify the maximum RPC payload size cap. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 12dd5c6f49f06d9c8ad6a3b43e62f8eb0bc9209e Author: Chuck Lever Date: Wed Nov 30 18:08:55 2005 -0500 NFS: make "inode number mismatch" message more useful To help NFS users and server developers, make the "inode number mismatch" message display more useful information. Test-plan: None. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 8fc1059dbd1b5dac88c23cbc4bcd55d63002cf6d Author: Chuck Lever Date: Wed Nov 30 18:08:57 2005 -0500 NFS: get rid of useless kernel log message nfs_statfs() generates a log message when GETATTR returns an error. This is usually a useless message. Make it a dprintk. Test plan: None Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit add1db6c7ac6169e3ae126731f473799907b0c8a Author: Chuck Lever Date: Wed Nov 30 18:08:59 2005 -0500 NFS: simplify inlined bit ops in nfs_page.h Minor cleanup: inlined bit ops in nfs_page.h can be simpler. Test plan: Write-intensive workload against a server that requires COMMITs. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit bcc2509680dd604df7ef1041558f58baa6f8921a Author: Chuck Lever Date: Wed Nov 30 18:08:19 2005 -0500 NFS: Fix error recovery code in fs/nfs/inode.c:__init_nfs() Red Hat found a problem in the error recovery logic in __init_nfs. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 37026e5ad053a590bf641c63d9baa3b62a5f06d2 Author: Chuck Lever Date: Wed Nov 30 18:08:17 2005 -0500 NFS: use generic_write_checks() to sanity check direct writes Replace ad hoc write parameter sanity checking in nfs_file_direct_write() with a call to generic_write_checks(). This should make the proper checks modulo the O_LARGEFILE flag, and should catch NFSv2-specific limitations by virtue of i_sb->s_maxbytes. Test plan: Posix compliance testing with both NFSv2 and NFSv3. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 7607788b83084aa1f2ff94e64159235e08a1ba07 Author: Trond Myklebust Date: Tue Jan 3 09:55:26 2006 +0100 NFSv4: Remove requirement for machine creds for the "setclientid" operation Use a cred from the nfs4_client->cl_state_owners list. Signed-off-by: Trond Myklebust commit dfbd9c86a288447d590359d74c9297091869c279 Author: Trond Myklebust Date: Tue Jan 3 09:55:25 2006 +0100 NFSv4: Remove requirement for machine creds for the "renew" operation In RFC3530, the RENEW operation is allowed to use either the same principal, RPC security flavour and (if RPCSEC_GSS), the same mechanism and service that was used for SETCLIENTID_CONFIRM OR Any principal, RPC security flavour and service combination that currently has an OPEN file on the server. Choose the latter since that doesn't require us to keep credentials for the same principal for the entire duration of the mount. Signed-off-by: Trond Myklebust commit d4e4f9d728cd18910a8a7a270803157538ccbc83 Author: Trond Myklebust Date: Tue Jan 3 09:55:24 2006 +0100 NFSv4: Send RENEW requests to the server only when we're holding state Signed-off-by: Trond Myklebust commit 93d2d07dfb39fe6fc9804cba00fc1ca98a8bd0e4 Author: Trond Myklebust Date: Tue Jan 3 09:55:23 2006 +0100 NFS: Convert instances of kernel_thread() to kthread() Convert private implementations in NFSv4 state recovery and delegation code to use kthreads. Signed-off-by: Trond Myklebust commit c99a396ecd396c32443d47461700f8847e0caf6c Author: Trond Myklebust Date: Tue Jan 3 09:55:22 2006 +0100 NFSv4: State recovery cleanup Use wait_on_bit() when waiting for state recovery to complete. Signed-off-by: Trond Myklebust commit a41744f26627dfd6bc0540d3cd065cf4412fdad0 Author: Trond Myklebust Date: Tue Jan 3 09:55:21 2006 +0100 NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 lease Cut down on the number of unnecessary RENEW requests on the wire. Signed-off-by: Trond Myklebust commit d1e2f7fca459308fb3cf93d57fb2c3a96fcfa749 Author: Trond Myklebust Date: Tue Jan 3 09:55:19 2006 +0100 SUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call. ...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to interrupt RPC calls too (as per the Solaris implementation). Signed-off-by: Trond Myklebust commit 95e37fd1e50eee330e17116cea4858741ed37acd Author: Trond Myklebust Date: Tue Jan 3 09:55:18 2006 +0100 NFSv4: Make DELEGRETURN an interruptible operation. Signed-off-by: Trond Myklebust commit 123dd8981c4487e54f13037563723cc02bf3365b Author: Trond Myklebust Date: Tue Jan 3 09:55:17 2006 +0100 NFSv4: Convert LOCK rpc call into an asynchronous RPC call In order to allow users to interrupt/cancel it. Signed-off-by: Trond Myklebust commit 03945ec5a7c04281156a85ee346d46788df6a5af Author: Trond Myklebust Date: Tue Jan 3 09:55:16 2006 +0100 NFSv4: locking XDR cleanup Get rid of some unnecessary intermediate structures Signed-off-by: Trond Myklebust commit 9cdaa13bddb1ee97371c593dcc5883fab6a508d1 Author: Trond Myklebust Date: Tue Jan 3 09:55:15 2006 +0100 NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctly When recovering from a delegation recall or a network partition, we need to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately. Signed-off-by: Trond Myklebust commit 40e73791f6ffe56e1830f887624dee5e7348a243 Author: Trond Myklebust Date: Tue Jan 3 09:55:13 2006 +0100 NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separately A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always specify a access modes that have been the argument of a previous OPEN operation. IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden unless the user called OPEN(O_WRONLY) In order to fix that, we really need to track the three possible open states separately. Signed-off-by: Trond Myklebust commit 359afa7aee808774abcc090ac8660bc95dfd5f55 Author: Trond Myklebust Date: Tue Jan 3 09:55:12 2006 +0100 NFSv4: Make open_confirm() asynchronous too Signed-off-by: Trond Myklebust commit 47042c668bc3656163255530877cdb8086c90a76 Author: Trond Myklebust Date: Tue Jan 3 09:55:11 2006 +0100 NFSv4: Convert open() into an asynchronous RPC call OPEN is a stateful operation, so we must ensure that it always completes. In order to allow users to interrupt the operation, we need to make the RPC call asynchronous, and then wait on completion (or cancel). Signed-off-by: Trond Myklebust commit 691b7acfa816dc7939541df68a580649abbd1443 Author: Trond Myklebust Date: Tue Jan 3 09:55:10 2006 +0100 SUNRPC: rpc_execute should not return task->tk_status; Signed-off-by: Trond Myklebust commit b3e465233b8a538701eb0723321f3896148583e1 Author: Trond Myklebust Date: Tue Jan 3 09:55:09 2006 +0100 SUNRPC: Get rid of some unused exports Signed-off-by: Trond Myklebust commit d8a3e5e7460e775cf2bbcf93b35f0f750c22edfd Author: Trond Myklebust Date: Tue Jan 3 09:55:08 2006 +0100 NFSv4: Allocate OPEN call RPC arguments using kmalloc() Cleanup in preparation for making OPEN calls interruptible by the user. Signed-off-by: Trond Myklebust commit f0a3b02641ad2bc401309edfc9b006b511c1aacd Author: Trond Myklebust Date: Tue Jan 3 09:55:07 2006 +0100 NFSv4: Make locku use the new RPC "wait on completion" interface. Signed-off-by: Trond Myklebust commit eddd798c96c006167375a95565f031543557202d Author: Trond Myklebust Date: Tue Jan 3 09:55:06 2006 +0100 NFSv4: stateful NFSv4 RPC call interface The NFSv4 model requires us to complete all RPC calls that might establish state on the server whether or not the user wants to interrupt it. We may also need to schedule new work (including new RPC calls) in order to cancel the new state. The asynchronous RPC model will allow us to ensure that RPC calls always complete, but in order to allow for "synchronous" RPC, we want to add the ability to wait for completion. The waits are, of course, interruptible. Signed-off-by: Trond Myklebust commit 1a573e8d0e5b8f43d343c0815a1fd36738125ab5 Author: Trond Myklebust Date: Tue Jan 3 09:55:05 2006 +0100 SUNRPC: Further cleanups Signed-off-by: Trond Myklebust commit f37d53325f16026e9da3741902b286b713b9d07c Author: Trond Myklebust Date: Tue Jan 3 09:55:04 2006 +0100 RPC: Clean up RPC task structure Shrink the RPC task structure. Instead of storing separate pointers for task->tk_exit and task->tk_release, put them in a structure. Also pass the user data pointer as a parameter instead of passing it via task->tk_calldata. This enables us to nest callbacks. Signed-off-by: Trond Myklebust commit e6c90d17e16c878c2e2f4ac7ff3571dc277abf48 Author: Trond Myklebust Date: Tue Jan 3 09:55:03 2006 +0100 SUNRPC: Yet more RPC cleanups Signed-off-by: Trond Myklebust commit c9c23cf783a731b8b0eff8161fc43c268dc219ac Author: Trond Myklebust Date: Tue Jan 3 09:55:02 2006 +0100 NFS: Work correctly with single-page ->writepage() calls Ensure that we always initiate flushing of data before we exit a single-page ->writepage() call. Signed-off-by: Trond Myklebust commit 19ffd103e4e8a7bd4bb107950d13afe1682c1832 Author: Andrew Morton Date: Wed Nov 16 15:07:01 2005 -0800 identify multipage ->writepages() calls NFS needs to be able to distinguish between single-page ->writepage() calls and multipage ->writepages() calls. For the single-page writepage calls NFS can kick off the I/O within the context of ->writepage(). For multipage ->writepages calls, nfs_writepage() will leave the I/O pending and nfs_writepages() will kick off the I/O when it all has been queued up within NFS. Cc: Trond Myklebust Signed-off-by: Andrew Morton Signed-off-by: Trond Myklebust --- diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 5dffcfe..117523c 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -902,6 +902,14 @@ running once the system is up. nfsroot= [NFS] nfs root filesystem for disk-less boxes. See Documentation/nfsroot.txt. + nfs.callback_tcpport= + [NFS] set the TCP port on which the NFSv4 callback + channel should listen. + + nfs.idmap_cache_timeout= + [NFS] set the maximum lifetime for idmapper cache + entries. + nmi_watchdog= [KNL,BUGS=IA-32] Debugging features for SMP kernels no387 [BUGS=IA-32] Tells the kernel to use the 387 maths diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c index c5a3364..1455240 100644 --- a/fs/lockd/clntproc.c +++ b/fs/lockd/clntproc.c @@ -26,11 +26,12 @@ static int nlmclnt_test(struct nlm_rqst *, struct file_lock *); static int nlmclnt_lock(struct nlm_rqst *, struct file_lock *); static int nlmclnt_unlock(struct nlm_rqst *, struct file_lock *); -static void nlmclnt_unlock_callback(struct rpc_task *); -static void nlmclnt_cancel_callback(struct rpc_task *); static int nlm_stat_to_errno(u32 stat); static void nlmclnt_locks_init_private(struct file_lock *fl, struct nlm_host *host); +static const struct rpc_call_ops nlmclnt_unlock_ops; +static const struct rpc_call_ops nlmclnt_cancel_ops; + /* * Cookie counter for NLM requests */ @@ -221,8 +222,7 @@ nlmclnt_proc(struct inode *inode, int cm goto done; } clnt->cl_softrtry = nfssrv->client->cl_softrtry; - clnt->cl_intr = nfssrv->client->cl_intr; - clnt->cl_chatty = nfssrv->client->cl_chatty; + clnt->cl_intr = nfssrv->client->cl_intr; } /* Keep the old signal mask */ @@ -399,8 +399,7 @@ in_grace_period: /* * Generic NLM call, async version. */ -int -nlmsvc_async_call(struct nlm_rqst *req, u32 proc, rpc_action callback) +int nlmsvc_async_call(struct nlm_rqst *req, u32 proc, const struct rpc_call_ops *tk_ops) { struct nlm_host *host = req->a_host; struct rpc_clnt *clnt; @@ -419,13 +418,12 @@ nlmsvc_async_call(struct nlm_rqst *req, msg.rpc_proc = &clnt->cl_procinfo[proc]; /* bootstrap and kick off the async RPC call */ - status = rpc_call_async(clnt, &msg, RPC_TASK_ASYNC, callback, req); + status = rpc_call_async(clnt, &msg, RPC_TASK_ASYNC, tk_ops, req); return status; } -static int -nlmclnt_async_call(struct nlm_rqst *req, u32 proc, rpc_action callback) +static int nlmclnt_async_call(struct nlm_rqst *req, u32 proc, const struct rpc_call_ops *tk_ops) { struct nlm_host *host = req->a_host; struct rpc_clnt *clnt; @@ -448,7 +446,7 @@ nlmclnt_async_call(struct nlm_rqst *req, /* Increment host refcount */ nlm_get_host(host); /* bootstrap and kick off the async RPC call */ - status = rpc_call_async(clnt, &msg, RPC_TASK_ASYNC, callback, req); + status = rpc_call_async(clnt, &msg, RPC_TASK_ASYNC, tk_ops, req); if (status < 0) nlm_release_host(host); return status; @@ -664,7 +662,7 @@ nlmclnt_unlock(struct nlm_rqst *req, str if (req->a_flags & RPC_TASK_ASYNC) { status = nlmclnt_async_call(req, NLMPROC_UNLOCK, - nlmclnt_unlock_callback); + &nlmclnt_unlock_ops); /* Hrmf... Do the unlock early since locks_remove_posix() * really expects us to free the lock synchronously */ do_vfs_lock(fl); @@ -692,10 +690,9 @@ nlmclnt_unlock(struct nlm_rqst *req, str return -ENOLCK; } -static void -nlmclnt_unlock_callback(struct rpc_task *task) +static void nlmclnt_unlock_callback(struct rpc_task *task, void *data) { - struct nlm_rqst *req = (struct nlm_rqst *) task->tk_calldata; + struct nlm_rqst *req = data; int status = req->a_res.status; if (RPC_ASSASSINATED(task)) @@ -722,6 +719,10 @@ die: rpc_restart_call(task); } +static const struct rpc_call_ops nlmclnt_unlock_ops = { + .rpc_call_done = nlmclnt_unlock_callback, +}; + /* * Cancel a blocked lock request. * We always use an async RPC call for this in order not to hang a @@ -750,8 +751,7 @@ nlmclnt_cancel(struct nlm_host *host, st nlmclnt_setlockargs(req, fl); - status = nlmclnt_async_call(req, NLMPROC_CANCEL, - nlmclnt_cancel_callback); + status = nlmclnt_async_call(req, NLMPROC_CANCEL, &nlmclnt_cancel_ops); if (status < 0) { nlmclnt_release_lockargs(req); kfree(req); @@ -765,10 +765,9 @@ nlmclnt_cancel(struct nlm_host *host, st return status; } -static void -nlmclnt_cancel_callback(struct rpc_task *task) +static void nlmclnt_cancel_callback(struct rpc_task *task, void *data) { - struct nlm_rqst *req = (struct nlm_rqst *) task->tk_calldata; + struct nlm_rqst *req = data; if (RPC_ASSASSINATED(task)) goto die; @@ -807,6 +806,10 @@ retry_cancel: rpc_delay(task, 30 * HZ); } +static const struct rpc_call_ops nlmclnt_cancel_ops = { + .rpc_call_done = nlmclnt_cancel_callback, +}; + /* * Convert an NLM status code to a generic kernel errno */ diff --git a/fs/lockd/host.c b/fs/lockd/host.c index c4c8601..82f7a0b 100644 --- a/fs/lockd/host.c +++ b/fs/lockd/host.c @@ -177,7 +177,7 @@ nlm_bind_host(struct nlm_host *host) if ((clnt = host->h_rpcclnt) != NULL) { xprt = clnt->cl_xprt; if (time_after_eq(jiffies, host->h_nextrebind)) { - clnt->cl_port = 0; + rpc_force_rebind(clnt); host->h_nextrebind = jiffies + NLM_HOST_REBIND; dprintk("lockd: next rebind in %ld jiffies\n", host->h_nextrebind - jiffies); @@ -217,7 +217,7 @@ nlm_rebind_host(struct nlm_host *host) { dprintk("lockd: rebind host %s\n", host->h_name); if (host->h_rpcclnt && time_after_eq(jiffies, host->h_nextrebind)) { - host->h_rpcclnt->cl_port = 0; + rpc_force_rebind(host->h_rpcclnt); host->h_nextrebind = jiffies + NLM_HOST_REBIND; } } diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c index 2d144ab..0edc03e 100644 --- a/fs/lockd/mon.c +++ b/fs/lockd/mon.c @@ -123,7 +123,6 @@ nsm_create(void) if (IS_ERR(clnt)) goto out_err; clnt->cl_softrtry = 1; - clnt->cl_chatty = 1; clnt->cl_oneshot = 1; return clnt; diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c index 12a857c..71a30b4 100644 --- a/fs/lockd/svc.c +++ b/fs/lockd/svc.c @@ -178,6 +178,8 @@ lockd(struct svc_rqst *rqstp) } + flush_signals(current); + /* * Check whether there's a new lockd process before * shutting down the hosts and clearing the slot. @@ -192,8 +194,6 @@ lockd(struct svc_rqst *rqstp) "lockd: new process, skipping host shutdown\n"); wake_up(&lockd_exit); - flush_signals(current); - /* Exit the RPC thread */ svc_exit_thread(rqstp); diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c index 489670e..4063095 100644 --- a/fs/lockd/svc4proc.c +++ b/fs/lockd/svc4proc.c @@ -22,7 +22,8 @@ #define NLMDBG_FACILITY NLMDBG_CLIENT static u32 nlm4svc_callback(struct svc_rqst *, u32, struct nlm_res *); -static void nlm4svc_callback_exit(struct rpc_task *); + +static const struct rpc_call_ops nlm4svc_callback_ops; /* * Obtain client and file from arguments @@ -470,7 +471,6 @@ nlm4svc_proc_granted_res(struct svc_rqst } - /* * This is the generic lockd callback for async RPC calls */ @@ -494,7 +494,7 @@ nlm4svc_callback(struct svc_rqst *rqstp, call->a_host = host; memcpy(&call->a_args, resp, sizeof(*resp)); - if (nlmsvc_async_call(call, proc, nlm4svc_callback_exit) < 0) + if (nlmsvc_async_call(call, proc, &nlm4svc_callback_ops) < 0) goto error; return rpc_success; @@ -504,10 +504,9 @@ nlm4svc_callback(struct svc_rqst *rqstp, return rpc_system_err; } -static void -nlm4svc_callback_exit(struct rpc_task *task) +static void nlm4svc_callback_exit(struct rpc_task *task, void *data) { - struct nlm_rqst *call = (struct nlm_rqst *) task->tk_calldata; + struct nlm_rqst *call = data; if (task->tk_status < 0) { dprintk("lockd: %4d callback failed (errno = %d)\n", @@ -517,6 +516,10 @@ nlm4svc_callback_exit(struct rpc_task *t kfree(call); } +static const struct rpc_call_ops nlm4svc_callback_ops = { + .rpc_call_done = nlm4svc_callback_exit, +}; + /* * NLM Server procedures. */ diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c index 49f9597..9cfced6 100644 --- a/fs/lockd/svclock.c +++ b/fs/lockd/svclock.c @@ -41,7 +41,8 @@ static void nlmsvc_insert_block(struct nlm_block *block, unsigned long); static int nlmsvc_remove_block(struct nlm_block *block); -static void nlmsvc_grant_callback(struct rpc_task *task); + +static const struct rpc_call_ops nlmsvc_grant_ops; /* * The list of blocked locks to retry @@ -226,31 +227,27 @@ failed: * It is the caller's responsibility to check whether the file * can be closed hereafter. */ -static void +static int nlmsvc_delete_block(struct nlm_block *block, int unlock) { struct file_lock *fl = &block->b_call.a_args.lock.fl; struct nlm_file *file = block->b_file; struct nlm_block **bp; + int status = 0; dprintk("lockd: deleting block %p...\n", block); /* Remove block from list */ nlmsvc_remove_block(block); - if (fl->fl_next) - posix_unblock_lock(file->f_file, fl); - if (unlock) { - fl->fl_type = F_UNLCK; - posix_lock_file(file->f_file, fl); - block->b_granted = 0; - } + if (unlock) + status = posix_unblock_lock(file->f_file, fl); /* If the block is in the middle of a GRANT callback, * don't kill it yet. */ if (block->b_incall) { nlmsvc_insert_block(block, NLM_NEVER); block->b_done = 1; - return; + return status; } /* Remove block from file's list of blocks */ @@ -265,6 +262,7 @@ nlmsvc_delete_block(struct nlm_block *bl nlm_release_host(block->b_host); nlmclnt_freegrantargs(&block->b_call); kfree(block); + return status; } /* @@ -275,6 +273,7 @@ int nlmsvc_traverse_blocks(struct nlm_host *host, struct nlm_file *file, int action) { struct nlm_block *block, *next; + /* XXX: Will everything get cleaned up if we don't unlock here? */ down(&file->f_sema); for (block = file->f_blocks; block; block = next) { @@ -444,6 +443,7 @@ u32 nlmsvc_cancel_blocked(struct nlm_file *file, struct nlm_lock *lock) { struct nlm_block *block; + int status = 0; dprintk("lockd: nlmsvc_cancel(%s/%ld, pi=%d, %Ld-%Ld)\n", file->f_file->f_dentry->d_inode->i_sb->s_id, @@ -454,9 +454,9 @@ nlmsvc_cancel_blocked(struct nlm_file *f down(&file->f_sema); if ((block = nlmsvc_lookup_block(file, lock, 1)) != NULL) - nlmsvc_delete_block(block, 1); + status = nlmsvc_delete_block(block, 1); up(&file->f_sema); - return nlm_granted; + return status ? nlm_lck_denied : nlm_granted; } /* @@ -562,7 +562,7 @@ callback: /* Call the client */ nlm_get_host(block->b_call.a_host); if (nlmsvc_async_call(&block->b_call, NLMPROC_GRANTED_MSG, - nlmsvc_grant_callback) < 0) + &nlmsvc_grant_ops) < 0) nlm_release_host(block->b_call.a_host); up(&file->f_sema); } @@ -575,10 +575,9 @@ callback: * chain once more in order to have it removed by lockd itself (which can * then sleep on the file semaphore without disrupting e.g. the nfs client). */ -static void -nlmsvc_grant_callback(struct rpc_task *task) +static void nlmsvc_grant_callback(struct rpc_task *task, void *data) { - struct nlm_rqst *call = (struct nlm_rqst *) task->tk_calldata; + struct nlm_rqst *call = data; struct nlm_block *block; unsigned long timeout; struct sockaddr_in *peer_addr = RPC_PEERADDR(task->tk_client); @@ -614,6 +613,10 @@ nlmsvc_grant_callback(struct rpc_task *t nlm_release_host(call->a_host); } +static const struct rpc_call_ops nlmsvc_grant_ops = { + .rpc_call_done = nlmsvc_grant_callback, +}; + /* * We received a GRANT_RES callback. Try to find the corresponding * block. @@ -633,11 +636,12 @@ nlmsvc_grant_reply(struct svc_rqst *rqst file->f_count++; down(&file->f_sema); - if ((block = nlmsvc_find_block(cookie,&rqstp->rq_addr)) != NULL) { + block = nlmsvc_find_block(cookie, &rqstp->rq_addr); + if (block) { if (status == NLM_LCK_DENIED_GRACE_PERIOD) { /* Try again in a couple of seconds */ nlmsvc_insert_block(block, 10 * HZ); - block = NULL; + up(&file->f_sema); } else { /* Lock is now held by client, or has been rejected. * In both cases, the block should be removed. */ @@ -648,8 +652,6 @@ nlmsvc_grant_reply(struct svc_rqst *rqst nlmsvc_delete_block(block, 1); } } - if (!block) - up(&file->f_sema); nlm_release_file(file); } diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c index 757e344..3bc437e 100644 --- a/fs/lockd/svcproc.c +++ b/fs/lockd/svcproc.c @@ -23,7 +23,8 @@ #define NLMDBG_FACILITY NLMDBG_CLIENT static u32 nlmsvc_callback(struct svc_rqst *, u32, struct nlm_res *); -static void nlmsvc_callback_exit(struct rpc_task *); + +static const struct rpc_call_ops nlmsvc_callback_ops; #ifdef CONFIG_LOCKD_V4 static u32 @@ -518,7 +519,7 @@ nlmsvc_callback(struct svc_rqst *rqstp, call->a_host = host; memcpy(&call->a_args, resp, sizeof(*resp)); - if (nlmsvc_async_call(call, proc, nlmsvc_callback_exit) < 0) + if (nlmsvc_async_call(call, proc, &nlmsvc_callback_ops) < 0) goto error; return rpc_success; @@ -528,10 +529,9 @@ nlmsvc_callback(struct svc_rqst *rqstp, return rpc_system_err; } -static void -nlmsvc_callback_exit(struct rpc_task *task) +static void nlmsvc_callback_exit(struct rpc_task *task, void *data) { - struct nlm_rqst *call = (struct nlm_rqst *) task->tk_calldata; + struct nlm_rqst *call = data; if (task->tk_status < 0) { dprintk("lockd: %4d callback failed (errno = %d)\n", @@ -541,6 +541,10 @@ nlmsvc_callback_exit(struct rpc_task *ta kfree(call); } +static const struct rpc_call_ops nlmsvc_callback_ops = { + .rpc_call_done = nlmsvc_callback_exit, +}; + /* * NLM Server procedures. */ diff --git a/fs/lockd/xdr4.c b/fs/lockd/xdr4.c index ae4d6b4..fdcf105 100644 --- a/fs/lockd/xdr4.c +++ b/fs/lockd/xdr4.c @@ -354,7 +354,9 @@ nlm4svc_decode_reboot(struct svc_rqst *r return 0; argp->state = ntohl(*p++); /* Preserve the address in network byte order */ - argp->addr = *p++; + argp->addr = *p++; + argp->vers = *p++; + argp->proto = *p++; return xdr_argsize_check(rqstp, p); } diff --git a/fs/locks.c b/fs/locks.c index 250ef53..fb32d62 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1958,22 +1958,18 @@ EXPORT_SYMBOL(posix_block_lock); * * lockd needs to block waiting for locks. */ -void +int posix_unblock_lock(struct file *filp, struct file_lock *waiter) { - /* - * A remote machine may cancel the lock request after it's been - * granted locally. If that happens, we need to delete the lock. - */ + int status = 0; + lock_kernel(); - if (waiter->fl_next) { + if (waiter->fl_next) __locks_delete_block(waiter); - unlock_kernel(); - } else { - unlock_kernel(); - waiter->fl_type = F_UNLCK; - posix_lock_file(filp, waiter); - } + else + status = -ENOENT; + unlock_kernel(); + return status; } EXPORT_SYMBOL(posix_unblock_lock); diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile index 8b3bb71..ec61fd5 100644 --- a/fs/nfs/Makefile +++ b/fs/nfs/Makefile @@ -13,4 +13,5 @@ nfs-$(CONFIG_NFS_V4) += nfs4proc.o nfs4x delegation.o idmap.o \ callback.o callback_xdr.o callback_proc.o nfs-$(CONFIG_NFS_DIRECTIO) += direct.o +nfs-$(CONFIG_SYSCTL) += sysctl.o nfs-objs := $(nfs-y) diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c index f2ca782..10e9234 100644 --- a/fs/nfs/callback.c +++ b/fs/nfs/callback.c @@ -31,6 +31,7 @@ static struct nfs_callback_data nfs_call static DECLARE_MUTEX(nfs_callback_sema); static struct svc_program nfs4_callback_program; +unsigned int nfs_callback_set_tcpport; unsigned short nfs_callback_tcpport; /* @@ -95,7 +96,7 @@ int nfs_callback_up(void) if (!serv) goto out_err; /* FIXME: We don't want to register this socket with the portmapper */ - ret = svc_makesock(serv, IPPROTO_TCP, 0); + ret = svc_makesock(serv, IPPROTO_TCP, nfs_callback_set_tcpport); if (ret < 0) goto out_destroy; if (!list_empty(&serv->sv_permsocks)) { diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h index a0db2d4..b252e7f 100644 --- a/fs/nfs/callback.h +++ b/fs/nfs/callback.h @@ -65,6 +65,7 @@ extern unsigned nfs4_callback_recall(str extern int nfs_callback_up(void); extern int nfs_callback_down(void); +extern unsigned int nfs_callback_set_tcpport; extern unsigned short nfs_callback_tcpport; #endif /* __LINUX_FS_NFS_CALLBACK_H */ diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c index 65f1e19..462cfce 100644 --- a/fs/nfs/callback_proc.c +++ b/fs/nfs/callback_proc.c @@ -35,7 +35,9 @@ unsigned nfs4_callback_getattr(struct cb if (delegation == NULL || (delegation->type & FMODE_WRITE) == 0) goto out_iput; res->size = i_size_read(inode); - res->change_attr = NFS_CHANGE_ATTR(inode); + res->change_attr = delegation->change_attr; + if (nfsi->npages != 0) + res->change_attr++; res->ctime = inode->i_ctime; res->mtime = inode->i_mtime; res->bitmap[0] = (FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE) & diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index 618a327..c6f07c1 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -8,6 +8,7 @@ */ #include #include +#include #include #include #include @@ -130,6 +131,7 @@ int nfs_inode_set_delegation(struct inod sizeof(delegation->stateid.data)); delegation->type = res->delegation_type; delegation->maxsize = res->maxsize; + delegation->change_attr = nfsi->change_attr; delegation->cred = get_rpccred(cred); delegation->inode = inode; @@ -157,8 +159,6 @@ static int nfs_do_return_delegation(stru { int res = 0; - __nfs_revalidate_inode(NFS_SERVER(inode), inode); - res = nfs4_proc_delegreturn(inode, delegation->cred, &delegation->stateid); nfs_free_delegation(delegation); return res; @@ -231,6 +231,49 @@ restart: spin_unlock(&clp->cl_lock); } +int nfs_do_expire_all_delegations(void *ptr) +{ + struct nfs4_client *clp = ptr; + struct nfs_delegation *delegation; + struct inode *inode; + + allow_signal(SIGKILL); +restart: + spin_lock(&clp->cl_lock); + if (test_bit(NFS4CLNT_STATE_RECOVER, &clp->cl_state) != 0) + goto out; + if (test_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state) == 0) + goto out; + list_for_each_entry(delegation, &clp->cl_delegations, super_list) { + inode = igrab(delegation->inode); + if (inode == NULL) + continue; + spin_unlock(&clp->cl_lock); + nfs_inode_return_delegation(inode); + iput(inode); + goto restart; + } +out: + spin_unlock(&clp->cl_lock); + nfs4_put_client(clp); + module_put_and_exit(0); +} + +void nfs_expire_all_delegations(struct nfs4_client *clp) +{ + struct task_struct *task; + + __module_get(THIS_MODULE); + atomic_inc(&clp->cl_count); + task = kthread_run(nfs_do_expire_all_delegations, clp, + "%u.%u.%u.%u-delegreturn", + NIPQUAD(clp->cl_addr)); + if (!IS_ERR(task)) + return; + nfs4_put_client(clp); + module_put(THIS_MODULE); +} + /* * Return all delegations following an NFS4ERR_CB_PATH_DOWN error. */ diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h index 2fcc30d..7a0b2bf 100644 --- a/fs/nfs/delegation.h +++ b/fs/nfs/delegation.h @@ -21,6 +21,7 @@ struct nfs_delegation { #define NFS_DELEGATION_NEED_RECLAIM 1 long flags; loff_t maxsize; + __u64 change_attr; }; int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct nfs_openres *res); @@ -30,6 +31,7 @@ int nfs_async_inode_return_delegation(st struct inode *nfs_delegation_find_inode(struct nfs4_client *clp, const struct nfs_fh *fhandle); void nfs_return_all_delegations(struct super_block *sb); +void nfs_expire_all_delegations(struct nfs4_client *clp); void nfs_handle_cb_pathdown(struct nfs4_client *clp); void nfs_delegation_mark_reclaim(struct nfs4_client *clp); diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index c0d1a21..e925519 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1550,8 +1550,10 @@ go_ahead: } nfs_inode_return_delegation(old_inode); - if (new_inode) + if (new_inode != NULL) { + nfs_inode_return_delegation(new_inode); d_delete(new_dentry); + } nfs_begin_data_update(old_dir); nfs_begin_data_update(new_dir); diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index 0792288..10ae377 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -122,9 +122,10 @@ nfs_free_user_pages(struct page **pages, { int i; for (i = 0; i < npages; i++) { - if (do_dirty) - set_page_dirty_lock(pages[i]); - page_cache_release(pages[i]); + struct page *page = pages[i]; + if (do_dirty && !PageCompound(page)) + set_page_dirty_lock(page); + page_cache_release(page); } kfree(pages); } @@ -154,6 +155,7 @@ static struct nfs_direct_req *nfs_direct struct list_head *list; struct nfs_direct_req *dreq; unsigned int reads = 0; + unsigned int rpages = (rsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; dreq = kmem_cache_alloc(nfs_direct_cachep, SLAB_KERNEL); if (!dreq) @@ -167,7 +169,7 @@ static struct nfs_direct_req *nfs_direct list = &dreq->list; for(;;) { - struct nfs_read_data *data = nfs_readdata_alloc(); + struct nfs_read_data *data = nfs_readdata_alloc(rpages); if (unlikely(!data)) { while (!list_empty(list)) { @@ -268,8 +270,6 @@ static void nfs_direct_read_schedule(str NFS_PROTO(inode)->read_setup(data); data->task.tk_cookie = (unsigned long) inode; - data->task.tk_calldata = data; - data->task.tk_release = nfs_readdata_release; data->complete = nfs_direct_read_result; lock_kernel(); @@ -433,7 +433,7 @@ static ssize_t nfs_direct_write_seg(stru struct nfs_writeverf first_verf; struct nfs_write_data *wdata; - wdata = nfs_writedata_alloc(); + wdata = nfs_writedata_alloc(NFS_SERVER(inode)->wpages); if (!wdata) return -ENOMEM; @@ -662,10 +662,10 @@ nfs_file_direct_read(struct kiocb *iocb, .iov_len = count, }; - dprintk("nfs: direct read(%s/%s, %lu@%lu)\n", + dprintk("nfs: direct read(%s/%s, %lu@%Ld)\n", file->f_dentry->d_parent->d_name.name, file->f_dentry->d_name.name, - (unsigned long) count, (unsigned long) pos); + (unsigned long) count, (long long) pos); if (!is_sync_kiocb(iocb)) goto out; @@ -718,9 +718,7 @@ out: ssize_t nfs_file_direct_write(struct kiocb *iocb, const char __user *buf, size_t count, loff_t pos) { - ssize_t retval = -EINVAL; - loff_t *ppos = &iocb->ki_pos; - unsigned long limit = current->signal->rlim[RLIMIT_FSIZE].rlim_cur; + ssize_t retval; struct file *file = iocb->ki_filp; struct nfs_open_context *ctx = (struct nfs_open_context *) file->private_data; @@ -728,35 +726,32 @@ nfs_file_direct_write(struct kiocb *iocb struct inode *inode = mapping->host; struct iovec iov = { .iov_base = (char __user *)buf, - .iov_len = count, }; - dfprintk(VFS, "nfs: direct write(%s/%s(%ld), %lu@%lu)\n", + dfprintk(VFS, "nfs: direct write(%s/%s, %lu@%Ld)\n", file->f_dentry->d_parent->d_name.name, - file->f_dentry->d_name.name, inode->i_ino, - (unsigned long) count, (unsigned long) pos); + file->f_dentry->d_name.name, + (unsigned long) count, (long long) pos); + retval = -EINVAL; if (!is_sync_kiocb(iocb)) goto out; - if (count < 0) - goto out; - if (pos < 0) + + retval = generic_write_checks(file, &pos, &count, 0); + if (retval) goto out; - retval = -EFAULT; - if (!access_ok(VERIFY_READ, iov.iov_base, iov.iov_len)) + + retval = -EINVAL; + if ((ssize_t) count < 0) goto out; - retval = -EFBIG; - if (limit != RLIM_INFINITY) { - if (pos >= limit) { - send_sig(SIGXFSZ, current, 0); - goto out; - } - if (count > limit - (unsigned long) pos) - count = limit - (unsigned long) pos; - } retval = 0; if (!count) goto out; + iov.iov_len = count, + + retval = -EFAULT; + if (!access_ok(VERIFY_READ, iov.iov_base, iov.iov_len)) + goto out; retval = nfs_sync_mapping(mapping); if (retval) @@ -766,7 +761,7 @@ nfs_file_direct_write(struct kiocb *iocb if (mapping->nrpages) invalidate_inode_pages2(mapping); if (retval > 0) - *ppos = pos + retval; + iocb->ki_pos = pos + retval; out: return retval; diff --git a/fs/nfs/idmap.c b/fs/nfs/idmap.c index ffb8df9..821edd3 100644 --- a/fs/nfs/idmap.c +++ b/fs/nfs/idmap.c @@ -54,7 +54,11 @@ #define IDMAP_HASH_SZ 128 +/* Default cache timeout is 10 minutes */ +unsigned int nfs_idmap_cache_timeout = 600 * HZ; + struct idmap_hashent { + unsigned long ih_expires; __u32 ih_id; int ih_namelen; char ih_name[IDMAP_NAMESZ]; @@ -149,6 +153,8 @@ idmap_lookup_name(struct idmap_hashtable if (he->ih_namelen != len || memcmp(he->ih_name, name, len) != 0) return NULL; + if (time_after(jiffies, he->ih_expires)) + return NULL; return he; } @@ -164,6 +170,8 @@ idmap_lookup_id(struct idmap_hashtable * struct idmap_hashent *he = idmap_id_hash(h, id); if (he->ih_id != id || he->ih_namelen == 0) return NULL; + if (time_after(jiffies, he->ih_expires)) + return NULL; return he; } @@ -192,6 +200,7 @@ idmap_update_entry(struct idmap_hashent memcpy(he->ih_name, name, namelen); he->ih_name[namelen] = '\0'; he->ih_namelen = namelen; + he->ih_expires = jiffies + nfs_idmap_cache_timeout; } /* diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 432f41c..e7bd0d9 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -40,6 +40,7 @@ #include #include "nfs4_fs.h" +#include "callback.h" #include "delegation.h" #define NFSDBG_FACILITY NFSDBG_VFS @@ -221,10 +222,10 @@ nfs_calc_block_size(u64 tsize) static inline unsigned long nfs_block_size(unsigned long bsize, unsigned char *nrbitsp) { - if (bsize < 1024) - bsize = NFS_DEF_FILE_IO_BUFFER_SIZE; - else if (bsize >= NFS_MAX_FILE_IO_BUFFER_SIZE) - bsize = NFS_MAX_FILE_IO_BUFFER_SIZE; + if (bsize < NFS_MIN_FILE_IO_SIZE) + bsize = NFS_DEF_FILE_IO_SIZE; + else if (bsize >= NFS_MAX_FILE_IO_SIZE) + bsize = NFS_MAX_FILE_IO_SIZE; return nfs_block_bits(bsize, nrbitsp); } @@ -307,20 +308,15 @@ nfs_sb_init(struct super_block *sb, rpc_ max_rpc_payload = nfs_block_size(rpc_max_payload(server->client), NULL); if (server->rsize > max_rpc_payload) server->rsize = max_rpc_payload; - if (server->wsize > max_rpc_payload) - server->wsize = max_rpc_payload; - + if (server->rsize > NFS_MAX_FILE_IO_SIZE) + server->rsize = NFS_MAX_FILE_IO_SIZE; server->rpages = (server->rsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; - if (server->rpages > NFS_READ_MAXIOV) { - server->rpages = NFS_READ_MAXIOV; - server->rsize = server->rpages << PAGE_CACHE_SHIFT; - } + if (server->wsize > max_rpc_payload) + server->wsize = max_rpc_payload; + if (server->wsize > NFS_MAX_FILE_IO_SIZE) + server->wsize = NFS_MAX_FILE_IO_SIZE; server->wpages = (server->wsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; - if (server->wpages > NFS_WRITE_MAXIOV) { - server->wpages = NFS_WRITE_MAXIOV; - server->wsize = server->wpages << PAGE_CACHE_SHIFT; - } if (sb->s_blocksize == 0) sb->s_blocksize = nfs_block_bits(server->wsize, @@ -417,7 +413,6 @@ nfs_create_client(struct nfs_server *ser clnt->cl_intr = 1; clnt->cl_softrtry = 1; - clnt->cl_chatty = 1; return clnt; @@ -575,11 +570,10 @@ nfs_statfs(struct super_block *sb, struc buf->f_namelen = server->namelen; out: unlock_kernel(); - return 0; out_err: - printk(KERN_WARNING "nfs_statfs: statfs error = %d\n", -error); + dprintk("%s: statfs error = %d\n", __FUNCTION__, -error); buf->f_bsize = buf->f_blocks = buf->f_bfree = buf->f_bavail = -1; goto out; @@ -958,6 +952,8 @@ int nfs_getattr(struct vfsmount *mnt, st int need_atime = NFS_I(inode)->cache_validity & NFS_INO_INVALID_ATIME; int err; + /* Flush out writes to the server in order to update c/mtime */ + nfs_sync_inode(inode, 0, 0, FLUSH_WAIT|FLUSH_NOCOMMIT); if (__IS_FLG(inode, MS_NOATIME)) need_atime = 0; else if (__IS_FLG(inode, MS_NODIRATIME) && S_ISDIR(inode->i_mode)) @@ -1252,6 +1248,33 @@ void nfs_end_data_update(struct inode *i atomic_dec(&nfsi->data_updates); } +static void nfs_wcc_update_inode(struct inode *inode, struct nfs_fattr *fattr) +{ + struct nfs_inode *nfsi = NFS_I(inode); + + if ((fattr->valid & NFS_ATTR_PRE_CHANGE) != 0 + && nfsi->change_attr == fattr->pre_change_attr) { + nfsi->change_attr = fattr->change_attr; + nfsi->cache_change_attribute = jiffies; + } + + /* If we have atomic WCC data, we may update some attributes */ + if ((fattr->valid & NFS_ATTR_WCC) != 0) { + if (timespec_equal(&inode->i_ctime, &fattr->pre_ctime)) { + memcpy(&inode->i_ctime, &fattr->ctime, sizeof(inode->i_ctime)); + nfsi->cache_change_attribute = jiffies; + } + if (timespec_equal(&inode->i_mtime, &fattr->pre_mtime)) { + memcpy(&inode->i_mtime, &fattr->mtime, sizeof(inode->i_mtime)); + nfsi->cache_change_attribute = jiffies; + } + if (inode->i_size == fattr->pre_size && nfsi->npages == 0) { + inode->i_size = fattr->size; + nfsi->cache_change_attribute = jiffies; + } + } +} + /** * nfs_check_inode_attributes - verify consistency of the inode attribute cache * @inode - pointer to inode @@ -1268,22 +1291,20 @@ static int nfs_check_inode_attributes(st int data_unstable; + if ((fattr->valid & NFS_ATTR_FATTR) == 0) + return 0; + /* Are we in the process of updating data on the server? */ data_unstable = nfs_caches_unstable(inode); - if (fattr->valid & NFS_ATTR_FATTR_V4) { - if ((fattr->valid & NFS_ATTR_PRE_CHANGE) != 0 - && nfsi->change_attr == fattr->pre_change_attr) - nfsi->change_attr = fattr->change_attr; - if (nfsi->change_attr != fattr->change_attr) { - nfsi->cache_validity |= NFS_INO_INVALID_ATTR; - if (!data_unstable) - nfsi->cache_validity |= NFS_INO_REVAL_PAGECACHE; - } - } + /* Do atomic weak cache consistency updates */ + nfs_wcc_update_inode(inode, fattr); - if ((fattr->valid & NFS_ATTR_FATTR) == 0) { - return 0; + if ((fattr->valid & NFS_ATTR_FATTR_V4) != 0 && + nfsi->change_attr != fattr->change_attr) { + nfsi->cache_validity |= NFS_INO_INVALID_ATTR; + if (!data_unstable) + nfsi->cache_validity |= NFS_INO_REVAL_PAGECACHE; } /* Has the inode gone and changed behind our back? */ @@ -1295,14 +1316,6 @@ static int nfs_check_inode_attributes(st cur_size = i_size_read(inode); new_isize = nfs_size_to_loff_t(fattr->size); - /* If we have atomic WCC data, we may update some attributes */ - if ((fattr->valid & NFS_ATTR_WCC) != 0) { - if (timespec_equal(&inode->i_ctime, &fattr->pre_ctime)) - memcpy(&inode->i_ctime, &fattr->ctime, sizeof(inode->i_ctime)); - if (timespec_equal(&inode->i_mtime, &fattr->pre_mtime)) - memcpy(&inode->i_mtime, &fattr->mtime, sizeof(inode->i_mtime)); - } - /* Verify a few of the more important attributes */ if (!timespec_equal(&inode->i_mtime, &fattr->mtime)) { nfsi->cache_validity |= NFS_INO_INVALID_ATTR; @@ -1410,14 +1423,8 @@ static int nfs_update_inode(struct inode if ((fattr->valid & NFS_ATTR_FATTR) == 0) return 0; - if (nfsi->fileid != fattr->fileid) { - printk(KERN_ERR "%s: inode number mismatch\n" - "expected (%s/0x%Lx), got (%s/0x%Lx)\n", - __FUNCTION__, - inode->i_sb->s_id, (long long)nfsi->fileid, - inode->i_sb->s_id, (long long)fattr->fileid); - goto out_err; - } + if (nfsi->fileid != fattr->fileid) + goto out_fileid; /* * Make sure the inode's type hasn't changed. @@ -1436,6 +1443,9 @@ static int nfs_update_inode(struct inode if (data_stable) nfsi->cache_validity &= ~(NFS_INO_INVALID_ATTR|NFS_INO_INVALID_ATIME); + /* Do atomic weak cache consistency updates */ + nfs_wcc_update_inode(inode, fattr); + /* Check if our cached file size is stale */ new_isize = nfs_size_to_loff_t(fattr->size); cur_isize = i_size_read(inode); @@ -1539,6 +1549,13 @@ static int nfs_update_inode(struct inode */ nfs_invalidate_inode(inode); return -ESTALE; + + out_fileid: + printk(KERN_ERR "NFS: server %s error: fileid changed\n" + "fsid %s: expected fileid 0x%Lx, got 0x%Lx\n", + NFS_SERVER(inode)->hostname, inode->i_sb->s_id, + (long long)nfsi->fileid, (long long)fattr->fileid); + goto out_err; } /* @@ -1820,25 +1837,10 @@ static int nfs4_fill_super(struct super_ } clnt->cl_intr = 1; clnt->cl_softrtry = 1; - clnt->cl_chatty = 1; clp->cl_rpcclient = clnt; - clp->cl_cred = rpcauth_lookupcred(clnt->cl_auth, 0); - if (IS_ERR(clp->cl_cred)) { - up_write(&clp->cl_sem); - err = PTR_ERR(clp->cl_cred); - clp->cl_cred = NULL; - goto out_fail; - } memcpy(clp->cl_ipaddr, server->ip_addr, sizeof(clp->cl_ipaddr)); nfs_idmap_new(clp); } - if (list_empty(&clp->cl_superblocks)) { - err = nfs4_init_client(clp); - if (err != 0) { - up_write(&clp->cl_sem); - goto out_fail; - } - } list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); clnt = rpc_clone_client(clp->cl_rpcclient); if (!IS_ERR(clnt)) @@ -2033,6 +2035,35 @@ static struct file_system_type nfs4_fs_t .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, }; +static const int nfs_set_port_min = 0; +static const int nfs_set_port_max = 65535; +static int param_set_port(const char *val, struct kernel_param *kp) +{ + char *endp; + int num = simple_strtol(val, &endp, 0); + if (endp == val || *endp || num < nfs_set_port_min || num > nfs_set_port_max) + return -EINVAL; + *((int *)kp->arg) = num; + return 0; +} + +module_param_call(callback_tcpport, param_set_port, param_get_int, + &nfs_callback_set_tcpport, 0644); + +static int param_set_idmap_timeout(const char *val, struct kernel_param *kp) +{ + char *endp; + int num = simple_strtol(val, &endp, 0); + int jif = num * HZ; + if (endp == val || *endp || num < 0 || jif < num) + return -EINVAL; + *((int *)kp->arg) = jif; + return 0; +} + +module_param_call(idmap_cache_timeout, param_set_idmap_timeout, param_get_int, + &nfs_idmap_cache_timeout, 0644); + #define nfs4_init_once(nfsi) \ do { \ INIT_LIST_HEAD(&(nfsi)->open_states); \ @@ -2040,8 +2071,25 @@ static struct file_system_type nfs4_fs_t nfsi->delegation_state = 0; \ init_rwsem(&nfsi->rwsem); \ } while(0) -#define register_nfs4fs() register_filesystem(&nfs4_fs_type) -#define unregister_nfs4fs() unregister_filesystem(&nfs4_fs_type) + +static inline int register_nfs4fs(void) +{ + int ret; + + ret = nfs_register_sysctl(); + if (ret != 0) + return ret; + ret = register_filesystem(&nfs4_fs_type); + if (ret != 0) + nfs_unregister_sysctl(); + return ret; +} + +static inline void unregister_nfs4fs(void) +{ + unregister_filesystem(&nfs4_fs_type); + nfs_unregister_sysctl(); +} #else #define nfs4_init_once(nfsi) \ do { } while (0) @@ -2166,11 +2214,11 @@ out: #ifdef CONFIG_PROC_FS rpc_proc_unregister("nfs"); #endif - nfs_destroy_writepagecache(); #ifdef CONFIG_NFS_DIRECTIO -out0: nfs_destroy_directcache(); +out0: #endif + nfs_destroy_writepagecache(); out1: nfs_destroy_readpagecache(); out2: diff --git a/fs/nfs/mount_clnt.c b/fs/nfs/mount_clnt.c index 0e82617..db99b8f 100644 --- a/fs/nfs/mount_clnt.c +++ b/fs/nfs/mount_clnt.c @@ -82,7 +82,6 @@ mnt_create(char *hostname, struct sockad RPC_AUTH_UNIX); if (!IS_ERR(clnt)) { clnt->cl_softrtry = 1; - clnt->cl_chatty = 1; clnt->cl_oneshot = 1; clnt->cl_intr = 1; } diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c index 59049e8..7fc0560 100644 --- a/fs/nfs/nfs2xdr.c +++ b/fs/nfs/nfs2xdr.c @@ -146,23 +146,23 @@ xdr_decode_fattr(u32 *p, struct nfs_fatt return p; } -#define SATTR(p, attr, flag, field) \ - *p++ = (attr->ia_valid & flag) ? htonl(attr->field) : ~(u32) 0 static inline u32 * xdr_encode_sattr(u32 *p, struct iattr *attr) { - SATTR(p, attr, ATTR_MODE, ia_mode); - SATTR(p, attr, ATTR_UID, ia_uid); - SATTR(p, attr, ATTR_GID, ia_gid); - SATTR(p, attr, ATTR_SIZE, ia_size); + const u32 not_set = __constant_htonl(0xFFFFFFFF); + + *p++ = (attr->ia_valid & ATTR_MODE) ? htonl(attr->ia_mode) : not_set; + *p++ = (attr->ia_valid & ATTR_UID) ? htonl(attr->ia_uid) : not_set; + *p++ = (attr->ia_valid & ATTR_GID) ? htonl(attr->ia_gid) : not_set; + *p++ = (attr->ia_valid & ATTR_SIZE) ? htonl(attr->ia_size) : not_set; if (attr->ia_valid & ATTR_ATIME_SET) { p = xdr_encode_time(p, &attr->ia_atime); } else if (attr->ia_valid & ATTR_ATIME) { p = xdr_encode_current_server_time(p, &attr->ia_atime); } else { - *p++ = ~(u32) 0; - *p++ = ~(u32) 0; + *p++ = not_set; + *p++ = not_set; } if (attr->ia_valid & ATTR_MTIME_SET) { @@ -170,12 +170,11 @@ xdr_encode_sattr(u32 *p, struct iattr *a } else if (attr->ia_valid & ATTR_MTIME) { p = xdr_encode_current_server_time(p, &attr->ia_mtime); } else { - *p++ = ~(u32) 0; - *p++ = ~(u32) 0; + *p++ = not_set; + *p++ = not_set; } return p; } -#undef SATTR /* * NFS encode functions diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index 92c870d..ed67567 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -68,27 +68,39 @@ nfs3_async_handle_jukebox(struct rpc_tas return 1; } -/* - * Bare-bones access to getattr: this is for nfs_read_super. - */ static int -nfs3_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle, - struct nfs_fsinfo *info) +do_proc_get_root(struct rpc_clnt *client, struct nfs_fh *fhandle, + struct nfs_fsinfo *info) { int status; dprintk("%s: call fsinfo\n", __FUNCTION__); nfs_fattr_init(info->fattr); - status = rpc_call(server->client_sys, NFS3PROC_FSINFO, fhandle, info, 0); + status = rpc_call(client, NFS3PROC_FSINFO, fhandle, info, 0); dprintk("%s: reply fsinfo: %d\n", __FUNCTION__, status); if (!(info->fattr->valid & NFS_ATTR_FATTR)) { - status = rpc_call(server->client_sys, NFS3PROC_GETATTR, fhandle, info->fattr, 0); + status = rpc_call(client, NFS3PROC_GETATTR, fhandle, info->fattr, 0); dprintk("%s: reply getattr: %d\n", __FUNCTION__, status); } return status; } /* + * Bare-bones access to getattr: this is for nfs_read_super. + */ +static int +nfs3_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle, + struct nfs_fsinfo *info) +{ + int status; + + status = do_proc_get_root(server->client, fhandle, info); + if (status && server->client_sys != server->client) + status = do_proc_get_root(server->client_sys, fhandle, info); + return status; +} + +/* * One function for each procedure in the NFS protocol. */ static int @@ -732,19 +744,23 @@ nfs3_proc_pathconf(struct nfs_server *se extern u32 *nfs3_decode_dirent(u32 *, struct nfs_entry *, int); -static void -nfs3_read_done(struct rpc_task *task) +static void nfs3_read_done(struct rpc_task *task, void *calldata) { - struct nfs_read_data *data = (struct nfs_read_data *) task->tk_calldata; + struct nfs_read_data *data = calldata; if (nfs3_async_handle_jukebox(task)) return; /* Call back common NFS readpage processing */ if (task->tk_status >= 0) nfs_refresh_inode(data->inode, &data->fattr); - nfs_readpage_result(task); + nfs_readpage_result(task, calldata); } +static const struct rpc_call_ops nfs3_read_ops = { + .rpc_call_done = nfs3_read_done, + .rpc_release = nfs_readdata_release, +}; + static void nfs3_proc_read_setup(struct nfs_read_data *data) { @@ -762,23 +778,26 @@ nfs3_proc_read_setup(struct nfs_read_dat flags = RPC_TASK_ASYNC | (IS_SWAPFILE(inode)? NFS_RPC_SWAPFLAGS : 0); /* Finalize the task. */ - rpc_init_task(task, NFS_CLIENT(inode), nfs3_read_done, flags); + rpc_init_task(task, NFS_CLIENT(inode), flags, &nfs3_read_ops, data); rpc_call_setup(task, &msg, 0); } -static void -nfs3_write_done(struct rpc_task *task) +static void nfs3_write_done(struct rpc_task *task, void *calldata) { - struct nfs_write_data *data; + struct nfs_write_data *data = calldata; if (nfs3_async_handle_jukebox(task)) return; - data = (struct nfs_write_data *)task->tk_calldata; if (task->tk_status >= 0) nfs_post_op_update_inode(data->inode, data->res.fattr); - nfs_writeback_done(task); + nfs_writeback_done(task, calldata); } +static const struct rpc_call_ops nfs3_write_ops = { + .rpc_call_done = nfs3_write_done, + .rpc_release = nfs_writedata_release, +}; + static void nfs3_proc_write_setup(struct nfs_write_data *data, int how) { @@ -806,23 +825,26 @@ nfs3_proc_write_setup(struct nfs_write_d flags = (how & FLUSH_SYNC) ? 0 : RPC_TASK_ASYNC; /* Finalize the task. */ - rpc_init_task(task, NFS_CLIENT(inode), nfs3_write_done, flags); + rpc_init_task(task, NFS_CLIENT(inode), flags, &nfs3_write_ops, data); rpc_call_setup(task, &msg, 0); } -static void -nfs3_commit_done(struct rpc_task *task) +static void nfs3_commit_done(struct rpc_task *task, void *calldata) { - struct nfs_write_data *data; + struct nfs_write_data *data = calldata; if (nfs3_async_handle_jukebox(task)) return; - data = (struct nfs_write_data *)task->tk_calldata; if (task->tk_status >= 0) nfs_post_op_update_inode(data->inode, data->res.fattr); - nfs_commit_done(task); + nfs_commit_done(task, calldata); } +static const struct rpc_call_ops nfs3_commit_ops = { + .rpc_call_done = nfs3_commit_done, + .rpc_release = nfs_commit_release, +}; + static void nfs3_proc_commit_setup(struct nfs_write_data *data, int how) { @@ -840,7 +862,7 @@ nfs3_proc_commit_setup(struct nfs_write_ flags = (how & FLUSH_SYNC) ? 0 : RPC_TASK_ASYNC; /* Finalize the task. */ - rpc_init_task(task, NFS_CLIENT(inode), nfs3_commit_done, flags); + rpc_init_task(task, NFS_CLIENT(inode), flags, &nfs3_commit_ops, data); rpc_call_setup(task, &msg, 0); } diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c index 0498bd3..b6c0b50 100644 --- a/fs/nfs/nfs3xdr.c +++ b/fs/nfs/nfs3xdr.c @@ -182,7 +182,7 @@ xdr_encode_sattr(u32 *p, struct iattr *a { if (attr->ia_valid & ATTR_MODE) { *p++ = xdr_one; - *p++ = htonl(attr->ia_mode); + *p++ = htonl(attr->ia_mode & S_IALLUGO); } else { *p++ = xdr_zero; } diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index b7f262d..0f5e4e7 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -38,7 +38,8 @@ struct idmap; ((err) != NFSERR_NOFILEHANDLE)) enum nfs4_client_state { - NFS4CLNT_OK = 0, + NFS4CLNT_STATE_RECOVER = 0, + NFS4CLNT_LEASE_EXPIRED, }; /* @@ -67,7 +68,6 @@ struct nfs4_client { atomic_t cl_count; struct rpc_clnt * cl_rpcclient; - struct rpc_cred * cl_cred; struct list_head cl_superblocks; /* List of nfs_server structs */ @@ -76,7 +76,6 @@ struct nfs4_client { struct work_struct cl_renewd; struct work_struct cl_recoverd; - wait_queue_head_t cl_waitq; struct rpc_wait_queue cl_rpcwaitq; /* used for the setclientid verifier */ @@ -182,8 +181,9 @@ struct nfs4_state { nfs4_stateid stateid; - unsigned int nreaders; - unsigned int nwriters; + unsigned int n_rdonly; + unsigned int n_wronly; + unsigned int n_rdwr; int state; /* State on the server (R,W, or RW) */ atomic_t count; }; @@ -210,10 +210,10 @@ extern ssize_t nfs4_listxattr(struct den /* nfs4proc.c */ extern int nfs4_map_errors(int err); -extern int nfs4_proc_setclientid(struct nfs4_client *, u32, unsigned short); -extern int nfs4_proc_setclientid_confirm(struct nfs4_client *); -extern int nfs4_proc_async_renew(struct nfs4_client *); -extern int nfs4_proc_renew(struct nfs4_client *); +extern int nfs4_proc_setclientid(struct nfs4_client *, u32, unsigned short, struct rpc_cred *); +extern int nfs4_proc_setclientid_confirm(struct nfs4_client *, struct rpc_cred *); +extern int nfs4_proc_async_renew(struct nfs4_client *, struct rpc_cred *); +extern int nfs4_proc_renew(struct nfs4_client *, struct rpc_cred *); extern int nfs4_do_close(struct inode *inode, struct nfs4_state *state); extern struct dentry *nfs4_atomic_open(struct inode *, struct dentry *, struct nameidata *); extern int nfs4_open_revalidate(struct inode *, struct dentry *, int, struct nameidata *); @@ -237,8 +237,8 @@ extern void init_nfsv4_state(struct nfs_ extern void destroy_nfsv4_state(struct nfs_server *); extern struct nfs4_client *nfs4_get_client(struct in_addr *); extern void nfs4_put_client(struct nfs4_client *clp); -extern int nfs4_init_client(struct nfs4_client *clp); extern struct nfs4_client *nfs4_find_client(struct in_addr *); +struct rpc_cred *nfs4_get_renew_cred(struct nfs4_client *clp); extern u32 nfs4_alloc_lockowner_id(struct nfs4_client *); extern struct nfs4_state_owner * nfs4_get_state_owner(struct nfs_server *, struct rpc_cred *); diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index f988a94..984ca34 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -57,11 +57,13 @@ #define NFS4_POLL_RETRY_MIN (1*HZ) #define NFS4_POLL_RETRY_MAX (15*HZ) -static int _nfs4_proc_open_confirm(struct rpc_clnt *clnt, const struct nfs_fh *fh, struct nfs4_state_owner *sp, nfs4_stateid *stateid, struct nfs_seqid *seqid); +struct nfs4_opendata; +static int _nfs4_proc_open(struct nfs4_opendata *data); static int nfs4_do_fsinfo(struct nfs_server *, struct nfs_fh *, struct nfs_fsinfo *); static int nfs4_async_handle_error(struct rpc_task *, const struct nfs_server *); static int _nfs4_proc_access(struct inode *inode, struct nfs_access_entry *entry); static int nfs4_handle_exception(const struct nfs_server *server, int errorcode, struct nfs4_exception *exception); +static int nfs4_wait_clnt_recover(struct rpc_clnt *clnt, struct nfs4_client *clp); extern u32 *nfs4_decode_dirent(u32 *p, struct nfs_entry *entry, int plus); extern struct rpc_procinfo nfs4_procedures[]; @@ -173,8 +175,7 @@ static void nfs4_setup_readdir(u64 cooki kunmap_atomic(start, KM_USER0); } -static void -renew_lease(struct nfs_server *server, unsigned long timestamp) +static void renew_lease(const struct nfs_server *server, unsigned long timestamp) { struct nfs4_client *clp = server->nfs4_state; spin_lock(&clp->cl_lock); @@ -194,21 +195,123 @@ static void update_changeattr(struct ino spin_unlock(&inode->i_lock); } +struct nfs4_opendata { + atomic_t count; + struct nfs_openargs o_arg; + struct nfs_openres o_res; + struct nfs_open_confirmargs c_arg; + struct nfs_open_confirmres c_res; + struct nfs_fattr f_attr; + struct nfs_fattr dir_attr; + struct dentry *dentry; + struct dentry *dir; + struct nfs4_state_owner *owner; + struct iattr attrs; + unsigned long timestamp; + int rpc_status; + int cancelled; +}; + +static struct nfs4_opendata *nfs4_opendata_alloc(struct dentry *dentry, + struct nfs4_state_owner *sp, int flags, + const struct iattr *attrs) +{ + struct dentry *parent = dget_parent(dentry); + struct inode *dir = parent->d_inode; + struct nfs_server *server = NFS_SERVER(dir); + struct nfs4_opendata *p; + + p = kzalloc(sizeof(*p), GFP_KERNEL); + if (p == NULL) + goto err; + p->o_arg.seqid = nfs_alloc_seqid(&sp->so_seqid); + if (p->o_arg.seqid == NULL) + goto err_free; + atomic_set(&p->count, 1); + p->dentry = dget(dentry); + p->dir = parent; + p->owner = sp; + atomic_inc(&sp->so_count); + p->o_arg.fh = NFS_FH(dir); + p->o_arg.open_flags = flags, + p->o_arg.clientid = server->nfs4_state->cl_clientid; + p->o_arg.id = sp->so_id; + p->o_arg.name = &dentry->d_name; + p->o_arg.server = server; + p->o_arg.bitmask = server->attr_bitmask; + p->o_arg.claim = NFS4_OPEN_CLAIM_NULL; + p->o_res.f_attr = &p->f_attr; + p->o_res.dir_attr = &p->dir_attr; + p->o_res.server = server; + nfs_fattr_init(&p->f_attr); + nfs_fattr_init(&p->dir_attr); + if (flags & O_EXCL) { + u32 *s = (u32 *) p->o_arg.u.verifier.data; + s[0] = jiffies; + s[1] = current->pid; + } else if (flags & O_CREAT) { + p->o_arg.u.attrs = &p->attrs; + memcpy(&p->attrs, attrs, sizeof(p->attrs)); + } + p->c_arg.fh = &p->o_res.fh; + p->c_arg.stateid = &p->o_res.stateid; + p->c_arg.seqid = p->o_arg.seqid; + return p; +err_free: + kfree(p); +err: + dput(parent); + return NULL; +} + +static void nfs4_opendata_free(struct nfs4_opendata *p) +{ + if (p != NULL && atomic_dec_and_test(&p->count)) { + nfs_free_seqid(p->o_arg.seqid); + nfs4_put_state_owner(p->owner); + dput(p->dir); + dput(p->dentry); + kfree(p); + } +} + /* Helper for asynchronous RPC calls */ -static int nfs4_call_async(struct rpc_clnt *clnt, rpc_action tk_begin, - rpc_action tk_exit, void *calldata) +static int nfs4_call_async(struct rpc_clnt *clnt, + const struct rpc_call_ops *tk_ops, void *calldata) { struct rpc_task *task; - if (!(task = rpc_new_task(clnt, tk_exit, RPC_TASK_ASYNC))) + if (!(task = rpc_new_task(clnt, RPC_TASK_ASYNC, tk_ops, calldata))) return -ENOMEM; - - task->tk_calldata = calldata; - task->tk_action = tk_begin; rpc_execute(task); return 0; } +static int nfs4_wait_for_completion_rpc_task(struct rpc_task *task) +{ + sigset_t oldset; + int ret; + + rpc_clnt_sigmask(task->tk_client, &oldset); + ret = rpc_wait_for_completion_task(task); + rpc_clnt_sigunmask(task->tk_client, &oldset); + return ret; +} + +static inline void update_open_stateflags(struct nfs4_state *state, mode_t open_flags) +{ + switch (open_flags) { + case FMODE_WRITE: + state->n_wronly++; + break; + case FMODE_READ: + state->n_rdonly++; + break; + case FMODE_READ|FMODE_WRITE: + state->n_rdwr++; + } +} + static void update_open_stateid(struct nfs4_state *state, nfs4_stateid *stateid, int open_flags) { struct inode *inode = state->inode; @@ -218,41 +321,134 @@ static void update_open_stateid(struct n spin_lock(&state->owner->so_lock); spin_lock(&inode->i_lock); memcpy(&state->stateid, stateid, sizeof(state->stateid)); - if ((open_flags & FMODE_WRITE)) - state->nwriters++; - if (open_flags & FMODE_READ) - state->nreaders++; + update_open_stateflags(state, open_flags); nfs4_state_set_mode_locked(state, state->state | open_flags); spin_unlock(&inode->i_lock); spin_unlock(&state->owner->so_lock); } +static struct nfs4_state *nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data) +{ + struct inode *inode; + struct nfs4_state *state = NULL; + + if (!(data->f_attr.valid & NFS_ATTR_FATTR)) + goto out; + inode = nfs_fhget(data->dir->d_sb, &data->o_res.fh, &data->f_attr); + if (inode == NULL) + goto out; + state = nfs4_get_open_state(inode, data->owner); + if (state == NULL) + goto put_inode; + update_open_stateid(state, &data->o_res.stateid, data->o_arg.open_flags); +put_inode: + iput(inode); +out: + return state; +} + +static struct nfs_open_context *nfs4_state_find_open_context(struct nfs4_state *state) +{ + struct nfs_inode *nfsi = NFS_I(state->inode); + struct nfs_open_context *ctx; + + spin_lock(&state->inode->i_lock); + list_for_each_entry(ctx, &nfsi->open_files, list) { + if (ctx->state != state) + continue; + get_nfs_open_context(ctx); + spin_unlock(&state->inode->i_lock); + return ctx; + } + spin_unlock(&state->inode->i_lock); + return ERR_PTR(-ENOENT); +} + +static int nfs4_open_recover_helper(struct nfs4_opendata *opendata, mode_t openflags, nfs4_stateid *stateid) +{ + int ret; + + opendata->o_arg.open_flags = openflags; + ret = _nfs4_proc_open(opendata); + if (ret != 0) + return ret; + memcpy(stateid->data, opendata->o_res.stateid.data, + sizeof(stateid->data)); + return 0; +} + +static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *state) +{ + nfs4_stateid stateid; + struct nfs4_state *newstate; + int mode = 0; + int delegation = 0; + int ret; + + /* memory barrier prior to reading state->n_* */ + smp_rmb(); + if (state->n_rdwr != 0) { + ret = nfs4_open_recover_helper(opendata, FMODE_READ|FMODE_WRITE, &stateid); + if (ret != 0) + return ret; + mode |= FMODE_READ|FMODE_WRITE; + if (opendata->o_res.delegation_type != 0) + delegation = opendata->o_res.delegation_type; + smp_rmb(); + } + if (state->n_wronly != 0) { + ret = nfs4_open_recover_helper(opendata, FMODE_WRITE, &stateid); + if (ret != 0) + return ret; + mode |= FMODE_WRITE; + if (opendata->o_res.delegation_type != 0) + delegation = opendata->o_res.delegation_type; + smp_rmb(); + } + if (state->n_rdonly != 0) { + ret = nfs4_open_recover_helper(opendata, FMODE_READ, &stateid); + if (ret != 0) + return ret; + mode |= FMODE_READ; + } + clear_bit(NFS_DELEGATED_STATE, &state->flags); + if (mode == 0) + return 0; + if (opendata->o_res.delegation_type == 0) + opendata->o_res.delegation_type = delegation; + opendata->o_arg.open_flags |= mode; + newstate = nfs4_opendata_to_nfs4_state(opendata); + if (newstate != NULL) { + if (opendata->o_res.delegation_type != 0) { + struct nfs_inode *nfsi = NFS_I(newstate->inode); + int delegation_flags = 0; + if (nfsi->delegation) + delegation_flags = nfsi->delegation->flags; + if (!(delegation_flags & NFS_DELEGATION_NEED_RECLAIM)) + nfs_inode_set_delegation(newstate->inode, + opendata->owner->so_cred, + &opendata->o_res); + else + nfs_inode_reclaim_delegation(newstate->inode, + opendata->owner->so_cred, + &opendata->o_res); + } + nfs4_close_state(newstate, opendata->o_arg.open_flags); + } + if (newstate != state) + return -ESTALE; + return 0; +} + /* * OPEN_RECLAIM: * reclaim state on the server after a reboot. */ -static int _nfs4_open_reclaim(struct nfs4_state_owner *sp, struct nfs4_state *state) +static int _nfs4_do_open_reclaim(struct nfs4_state_owner *sp, struct nfs4_state *state, struct dentry *dentry) { - struct inode *inode = state->inode; - struct nfs_server *server = NFS_SERVER(inode); - struct nfs_delegation *delegation = NFS_I(inode)->delegation; - struct nfs_openargs o_arg = { - .fh = NFS_FH(inode), - .id = sp->so_id, - .open_flags = state->state, - .clientid = server->nfs4_state->cl_clientid, - .claim = NFS4_OPEN_CLAIM_PREVIOUS, - .bitmask = server->attr_bitmask, - }; - struct nfs_openres o_res = { - .server = server, /* Grrr */ - }; - struct rpc_message msg = { - .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OPEN_NOATTR], - .rpc_argp = &o_arg, - .rpc_resp = &o_res, - .rpc_cred = sp->so_cred, - }; + struct nfs_delegation *delegation = NFS_I(state->inode)->delegation; + struct nfs4_opendata *opendata; + int delegation_type = 0; int status; if (delegation != NULL) { @@ -262,38 +458,27 @@ static int _nfs4_open_reclaim(struct nfs set_bit(NFS_DELEGATED_STATE, &state->flags); return 0; } - o_arg.u.delegation_type = delegation->type; + delegation_type = delegation->type; } - o_arg.seqid = nfs_alloc_seqid(&sp->so_seqid); - if (o_arg.seqid == NULL) + opendata = nfs4_opendata_alloc(dentry, sp, 0, NULL); + if (opendata == NULL) return -ENOMEM; - status = rpc_call_sync(server->client, &msg, RPC_TASK_NOINTR); - /* Confirm the sequence as being established */ - nfs_confirm_seqid(&sp->so_seqid, status); - nfs_increment_open_seqid(status, o_arg.seqid); - if (status == 0) { - memcpy(&state->stateid, &o_res.stateid, sizeof(state->stateid)); - if (o_res.delegation_type != 0) { - nfs_inode_reclaim_delegation(inode, sp->so_cred, &o_res); - /* Did the server issue an immediate delegation recall? */ - if (o_res.do_recall) - nfs_async_inode_return_delegation(inode, &o_res.stateid); - } - } - nfs_free_seqid(o_arg.seqid); - clear_bit(NFS_DELEGATED_STATE, &state->flags); - /* Ensure we update the inode attributes */ - NFS_CACHEINV(inode); + opendata->o_arg.claim = NFS4_OPEN_CLAIM_PREVIOUS; + opendata->o_arg.fh = NFS_FH(state->inode); + nfs_copy_fh(&opendata->o_res.fh, opendata->o_arg.fh); + opendata->o_arg.u.delegation_type = delegation_type; + status = nfs4_open_recover(opendata, state); + nfs4_opendata_free(opendata); return status; } -static int nfs4_open_reclaim(struct nfs4_state_owner *sp, struct nfs4_state *state) +static int nfs4_do_open_reclaim(struct nfs4_state_owner *sp, struct nfs4_state *state, struct dentry *dentry) { struct nfs_server *server = NFS_SERVER(state->inode); struct nfs4_exception exception = { }; int err; do { - err = _nfs4_open_reclaim(sp, state); + err = _nfs4_do_open_reclaim(sp, state, dentry); if (err != -NFS4ERR_DELAY) break; nfs4_handle_exception(server, err, &exception); @@ -301,63 +486,36 @@ static int nfs4_open_reclaim(struct nfs4 return err; } +static int nfs4_open_reclaim(struct nfs4_state_owner *sp, struct nfs4_state *state) +{ + struct nfs_open_context *ctx; + int ret; + + ctx = nfs4_state_find_open_context(state); + if (IS_ERR(ctx)) + return PTR_ERR(ctx); + ret = nfs4_do_open_reclaim(sp, state, ctx->dentry); + put_nfs_open_context(ctx); + return ret; +} + static int _nfs4_open_delegation_recall(struct dentry *dentry, struct nfs4_state *state) { struct nfs4_state_owner *sp = state->owner; - struct inode *inode = dentry->d_inode; - struct nfs_server *server = NFS_SERVER(inode); - struct dentry *parent = dget_parent(dentry); - struct nfs_openargs arg = { - .fh = NFS_FH(parent->d_inode), - .clientid = server->nfs4_state->cl_clientid, - .name = &dentry->d_name, - .id = sp->so_id, - .server = server, - .bitmask = server->attr_bitmask, - .claim = NFS4_OPEN_CLAIM_DELEGATE_CUR, - }; - struct nfs_openres res = { - .server = server, - }; - struct rpc_message msg = { - .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OPEN_NOATTR], - .rpc_argp = &arg, - .rpc_resp = &res, - .rpc_cred = sp->so_cred, - }; - int status = 0; + struct nfs4_opendata *opendata; + int ret; if (!test_bit(NFS_DELEGATED_STATE, &state->flags)) - goto out; - if (state->state == 0) - goto out; - arg.seqid = nfs_alloc_seqid(&sp->so_seqid); - status = -ENOMEM; - if (arg.seqid == NULL) - goto out; - arg.open_flags = state->state; - memcpy(arg.u.delegation.data, state->stateid.data, sizeof(arg.u.delegation.data)); - status = rpc_call_sync(server->client, &msg, RPC_TASK_NOINTR); - nfs_increment_open_seqid(status, arg.seqid); - if (status != 0) - goto out_free; - if(res.rflags & NFS4_OPEN_RESULT_CONFIRM) { - status = _nfs4_proc_open_confirm(server->client, NFS_FH(inode), - sp, &res.stateid, arg.seqid); - if (status != 0) - goto out_free; - } - nfs_confirm_seqid(&sp->so_seqid, 0); - if (status >= 0) { - memcpy(state->stateid.data, res.stateid.data, - sizeof(state->stateid.data)); - clear_bit(NFS_DELEGATED_STATE, &state->flags); - } -out_free: - nfs_free_seqid(arg.seqid); -out: - dput(parent); - return status; + return 0; + opendata = nfs4_opendata_alloc(dentry, sp, 0, NULL); + if (opendata == NULL) + return -ENOMEM; + opendata->o_arg.claim = NFS4_OPEN_CLAIM_DELEGATE_CUR; + memcpy(opendata->o_arg.u.delegation.data, state->stateid.data, + sizeof(opendata->o_arg.u.delegation.data)); + ret = nfs4_open_recover(opendata, state); + nfs4_opendata_free(opendata); + return ret; } int nfs4_open_delegation_recall(struct dentry *dentry, struct nfs4_state *state) @@ -382,82 +540,202 @@ int nfs4_open_delegation_recall(struct d return err; } -static int _nfs4_proc_open_confirm(struct rpc_clnt *clnt, const struct nfs_fh *fh, struct nfs4_state_owner *sp, nfs4_stateid *stateid, struct nfs_seqid *seqid) +static void nfs4_open_confirm_prepare(struct rpc_task *task, void *calldata) { - struct nfs_open_confirmargs arg = { - .fh = fh, - .seqid = seqid, - .stateid = *stateid, - }; - struct nfs_open_confirmres res; - struct rpc_message msg = { - .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OPEN_CONFIRM], - .rpc_argp = &arg, - .rpc_resp = &res, - .rpc_cred = sp->so_cred, + struct nfs4_opendata *data = calldata; + struct rpc_message msg = { + .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OPEN_CONFIRM], + .rpc_argp = &data->c_arg, + .rpc_resp = &data->c_res, + .rpc_cred = data->owner->so_cred, }; + data->timestamp = jiffies; + rpc_call_setup(task, &msg, 0); +} + +static void nfs4_open_confirm_done(struct rpc_task *task, void *calldata) +{ + struct nfs4_opendata *data = calldata; + + data->rpc_status = task->tk_status; + if (RPC_ASSASSINATED(task)) + return; + if (data->rpc_status == 0) { + memcpy(data->o_res.stateid.data, data->c_res.stateid.data, + sizeof(data->o_res.stateid.data)); + renew_lease(data->o_res.server, data->timestamp); + } + nfs_increment_open_seqid(data->rpc_status, data->c_arg.seqid); + nfs_confirm_seqid(&data->owner->so_seqid, data->rpc_status); +} + +static void nfs4_open_confirm_release(void *calldata) +{ + struct nfs4_opendata *data = calldata; + struct nfs4_state *state = NULL; + + /* If this request hasn't been cancelled, do nothing */ + if (data->cancelled == 0) + goto out_free; + /* In case of error, no cleanup! */ + if (data->rpc_status != 0) + goto out_free; + nfs_confirm_seqid(&data->owner->so_seqid, 0); + state = nfs4_opendata_to_nfs4_state(data); + if (state != NULL) + nfs4_close_state(state, data->o_arg.open_flags); +out_free: + nfs4_opendata_free(data); +} + +static const struct rpc_call_ops nfs4_open_confirm_ops = { + .rpc_call_prepare = nfs4_open_confirm_prepare, + .rpc_call_done = nfs4_open_confirm_done, + .rpc_release = nfs4_open_confirm_release, +}; + +/* + * Note: On error, nfs4_proc_open_confirm will free the struct nfs4_opendata + */ +static int _nfs4_proc_open_confirm(struct nfs4_opendata *data) +{ + struct nfs_server *server = NFS_SERVER(data->dir->d_inode); + struct rpc_task *task; int status; - status = rpc_call_sync(clnt, &msg, RPC_TASK_NOINTR); - /* Confirm the sequence as being established */ - nfs_confirm_seqid(&sp->so_seqid, status); - nfs_increment_open_seqid(status, seqid); - if (status >= 0) - memcpy(stateid, &res.stateid, sizeof(*stateid)); + atomic_inc(&data->count); + task = rpc_run_task(server->client, RPC_TASK_ASYNC, &nfs4_open_confirm_ops, data); + if (IS_ERR(task)) { + nfs4_opendata_free(data); + return PTR_ERR(task); + } + status = nfs4_wait_for_completion_rpc_task(task); + if (status != 0) { + data->cancelled = 1; + smp_wmb(); + } else + status = data->rpc_status; + rpc_release_task(task); return status; } -static int _nfs4_proc_open(struct inode *dir, struct nfs4_state_owner *sp, struct nfs_openargs *o_arg, struct nfs_openres *o_res) +static void nfs4_open_prepare(struct rpc_task *task, void *calldata) { - struct nfs_server *server = NFS_SERVER(dir); + struct nfs4_opendata *data = calldata; + struct nfs4_state_owner *sp = data->owner; struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OPEN], - .rpc_argp = o_arg, - .rpc_resp = o_res, + .rpc_argp = &data->o_arg, + .rpc_resp = &data->o_res, .rpc_cred = sp->so_cred, }; - int status; + + if (nfs_wait_on_sequence(data->o_arg.seqid, task) != 0) + return; + /* Update sequence id. */ + data->o_arg.id = sp->so_id; + data->o_arg.clientid = sp->so_client->cl_clientid; + if (data->o_arg.claim == NFS4_OPEN_CLAIM_PREVIOUS) + msg.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OPEN_NOATTR]; + data->timestamp = jiffies; + rpc_call_setup(task, &msg, 0); +} - /* Update sequence id. The caller must serialize! */ - o_arg->id = sp->so_id; - o_arg->clientid = sp->so_client->cl_clientid; +static void nfs4_open_done(struct rpc_task *task, void *calldata) +{ + struct nfs4_opendata *data = calldata; - status = rpc_call_sync(server->client, &msg, RPC_TASK_NOINTR); - if (status == 0) { - /* OPEN on anything except a regular file is disallowed in NFSv4 */ - switch (o_res->f_attr->mode & S_IFMT) { + data->rpc_status = task->tk_status; + if (RPC_ASSASSINATED(task)) + return; + if (task->tk_status == 0) { + switch (data->o_res.f_attr->mode & S_IFMT) { case S_IFREG: break; case S_IFLNK: - status = -ELOOP; + data->rpc_status = -ELOOP; break; case S_IFDIR: - status = -EISDIR; + data->rpc_status = -EISDIR; break; default: - status = -ENOTDIR; + data->rpc_status = -ENOTDIR; } + renew_lease(data->o_res.server, data->timestamp); } + nfs_increment_open_seqid(data->rpc_status, data->o_arg.seqid); +} + +static void nfs4_open_release(void *calldata) +{ + struct nfs4_opendata *data = calldata; + struct nfs4_state *state = NULL; - nfs_increment_open_seqid(status, o_arg->seqid); + /* If this request hasn't been cancelled, do nothing */ + if (data->cancelled == 0) + goto out_free; + /* In case of error, no cleanup! */ + if (data->rpc_status != 0) + goto out_free; + /* In case we need an open_confirm, no cleanup! */ + if (data->o_res.rflags & NFS4_OPEN_RESULT_CONFIRM) + goto out_free; + nfs_confirm_seqid(&data->owner->so_seqid, 0); + state = nfs4_opendata_to_nfs4_state(data); + if (state != NULL) + nfs4_close_state(state, data->o_arg.open_flags); +out_free: + nfs4_opendata_free(data); +} + +static const struct rpc_call_ops nfs4_open_ops = { + .rpc_call_prepare = nfs4_open_prepare, + .rpc_call_done = nfs4_open_done, + .rpc_release = nfs4_open_release, +}; + +/* + * Note: On error, nfs4_proc_open will free the struct nfs4_opendata + */ +static int _nfs4_proc_open(struct nfs4_opendata *data) +{ + struct inode *dir = data->dir->d_inode; + struct nfs_server *server = NFS_SERVER(dir); + struct nfs_openargs *o_arg = &data->o_arg; + struct nfs_openres *o_res = &data->o_res; + struct rpc_task *task; + int status; + + atomic_inc(&data->count); + task = rpc_run_task(server->client, RPC_TASK_ASYNC, &nfs4_open_ops, data); + if (IS_ERR(task)) { + nfs4_opendata_free(data); + return PTR_ERR(task); + } + status = nfs4_wait_for_completion_rpc_task(task); + if (status != 0) { + data->cancelled = 1; + smp_wmb(); + } else + status = data->rpc_status; + rpc_release_task(task); if (status != 0) - goto out; + return status; + if (o_arg->open_flags & O_CREAT) { update_changeattr(dir, &o_res->cinfo); nfs_post_op_update_inode(dir, o_res->dir_attr); } else nfs_refresh_inode(dir, o_res->dir_attr); if(o_res->rflags & NFS4_OPEN_RESULT_CONFIRM) { - status = _nfs4_proc_open_confirm(server->client, &o_res->fh, - sp, &o_res->stateid, o_arg->seqid); + status = _nfs4_proc_open_confirm(data); if (status != 0) - goto out; + return status; } - nfs_confirm_seqid(&sp->so_seqid, 0); + nfs_confirm_seqid(&data->owner->so_seqid, 0); if (!(o_res->f_attr->valid & NFS_ATTR_FATTR)) - status = server->rpc_ops->getattr(server, &o_res->fh, o_res->f_attr); -out: - return status; + return server->rpc_ops->getattr(server, &o_res->fh, o_res->f_attr); + return 0; } static int _nfs4_do_access(struct inode *inode, struct rpc_cred *cred, int openflags) @@ -488,6 +766,15 @@ out: return -EACCES; } +int nfs4_recover_expired_lease(struct nfs_server *server) +{ + struct nfs4_client *clp = server->nfs4_state; + + if (test_and_clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) + nfs4_schedule_state_recovery(clp); + return nfs4_wait_clnt_recover(server->client, clp); +} + /* * OPEN_EXPIRED: * reclaim state on the server after a network partition. @@ -495,77 +782,31 @@ out: */ static int _nfs4_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state, struct dentry *dentry) { - struct dentry *parent = dget_parent(dentry); - struct inode *dir = parent->d_inode; struct inode *inode = state->inode; - struct nfs_server *server = NFS_SERVER(dir); struct nfs_delegation *delegation = NFS_I(inode)->delegation; - struct nfs_fattr f_attr, dir_attr; - struct nfs_openargs o_arg = { - .fh = NFS_FH(dir), - .open_flags = state->state, - .name = &dentry->d_name, - .bitmask = server->attr_bitmask, - .claim = NFS4_OPEN_CLAIM_NULL, - }; - struct nfs_openres o_res = { - .f_attr = &f_attr, - .dir_attr = &dir_attr, - .server = server, - }; - int status = 0; + struct nfs4_opendata *opendata; + int openflags = state->state & (FMODE_READ|FMODE_WRITE); + int ret; if (delegation != NULL && !(delegation->flags & NFS_DELEGATION_NEED_RECLAIM)) { - status = _nfs4_do_access(inode, sp->so_cred, state->state); - if (status < 0) - goto out; + ret = _nfs4_do_access(inode, sp->so_cred, openflags); + if (ret < 0) + return ret; memcpy(&state->stateid, &delegation->stateid, sizeof(state->stateid)); set_bit(NFS_DELEGATED_STATE, &state->flags); - goto out; + return 0; } - o_arg.seqid = nfs_alloc_seqid(&sp->so_seqid); - status = -ENOMEM; - if (o_arg.seqid == NULL) - goto out; - nfs_fattr_init(&f_attr); - nfs_fattr_init(&dir_attr); - status = _nfs4_proc_open(dir, sp, &o_arg, &o_res); - if (status != 0) - goto out_nodeleg; - /* Check if files differ */ - if ((f_attr.mode & S_IFMT) != (inode->i_mode & S_IFMT)) - goto out_stale; - /* Has the file handle changed? */ - if (nfs_compare_fh(&o_res.fh, NFS_FH(inode)) != 0) { - /* Verify if the change attributes are the same */ - if (f_attr.change_attr != NFS_I(inode)->change_attr) - goto out_stale; - if (nfs_size_to_loff_t(f_attr.size) != inode->i_size) - goto out_stale; - /* Lets just pretend that this is the same file */ - nfs_copy_fh(NFS_FH(inode), &o_res.fh); - NFS_I(inode)->fileid = f_attr.fileid; - } - memcpy(&state->stateid, &o_res.stateid, sizeof(state->stateid)); - if (o_res.delegation_type != 0) { - if (!(delegation->flags & NFS_DELEGATION_NEED_RECLAIM)) - nfs_inode_set_delegation(inode, sp->so_cred, &o_res); - else - nfs_inode_reclaim_delegation(inode, sp->so_cred, &o_res); + opendata = nfs4_opendata_alloc(dentry, sp, openflags, NULL); + if (opendata == NULL) + return -ENOMEM; + ret = nfs4_open_recover(opendata, state); + if (ret == -ESTALE) { + /* Invalidate the state owner so we don't ever use it again */ + nfs4_drop_state_owner(sp); + d_drop(dentry); } -out_nodeleg: - nfs_free_seqid(o_arg.seqid); - clear_bit(NFS_DELEGATED_STATE, &state->flags); -out: - dput(parent); - return status; -out_stale: - status = -ESTALE; - /* Invalidate the state owner so we don't ever use it again */ - nfs4_drop_state_owner(sp); - d_drop(dentry); - /* Should we be trying to close that stateid? */ - goto out_nodeleg; + nfs4_opendata_free(opendata); + return ret; } static inline int nfs4_do_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state, struct dentry *dentry) @@ -584,26 +825,19 @@ static inline int nfs4_do_open_expired(s static int nfs4_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state) { - struct nfs_inode *nfsi = NFS_I(state->inode); struct nfs_open_context *ctx; - int status; + int ret; - spin_lock(&state->inode->i_lock); - list_for_each_entry(ctx, &nfsi->open_files, list) { - if (ctx->state != state) - continue; - get_nfs_open_context(ctx); - spin_unlock(&state->inode->i_lock); - status = nfs4_do_open_expired(sp, state, ctx->dentry); - put_nfs_open_context(ctx); - return status; - } - spin_unlock(&state->inode->i_lock); - return -ENOENT; + ctx = nfs4_state_find_open_context(state); + if (IS_ERR(ctx)) + return PTR_ERR(ctx); + ret = nfs4_do_open_expired(sp, state, ctx->dentry); + put_nfs_open_context(ctx); + return ret; } /* - * Returns an nfs4_state + an extra reference to the inode + * Returns a referenced nfs4_state if there is an open delegation on the file */ static int _nfs4_open_delegated(struct inode *inode, int flags, struct rpc_cred *cred, struct nfs4_state **res) { @@ -616,6 +850,14 @@ static int _nfs4_open_delegated(struct i int open_flags = flags & (FMODE_READ|FMODE_WRITE); int err; + err = -ENOMEM; + if (!(sp = nfs4_get_state_owner(server, cred))) { + dprintk("%s: nfs4_get_state_owner failed!\n", __FUNCTION__); + return err; + } + err = nfs4_recover_expired_lease(server); + if (err != 0) + goto out_put_state_owner; /* Protect against reboot recovery - NOTE ORDER! */ down_read(&clp->cl_sem); /* Protect against delegation recall */ @@ -625,10 +867,6 @@ static int _nfs4_open_delegated(struct i if (delegation == NULL || (delegation->type & open_flags) != open_flags) goto out_err; err = -ENOMEM; - if (!(sp = nfs4_get_state_owner(server, cred))) { - dprintk("%s: nfs4_get_state_owner failed!\n", __FUNCTION__); - goto out_err; - } state = nfs4_get_open_state(inode, sp); if (state == NULL) goto out_err; @@ -636,39 +874,34 @@ static int _nfs4_open_delegated(struct i err = -ENOENT; if ((state->state & open_flags) == open_flags) { spin_lock(&inode->i_lock); - if (open_flags & FMODE_READ) - state->nreaders++; - if (open_flags & FMODE_WRITE) - state->nwriters++; + update_open_stateflags(state, open_flags); spin_unlock(&inode->i_lock); goto out_ok; } else if (state->state != 0) - goto out_err; + goto out_put_open_state; lock_kernel(); err = _nfs4_do_access(inode, cred, open_flags); unlock_kernel(); if (err != 0) - goto out_err; + goto out_put_open_state; set_bit(NFS_DELEGATED_STATE, &state->flags); update_open_stateid(state, &delegation->stateid, open_flags); out_ok: nfs4_put_state_owner(sp); up_read(&nfsi->rwsem); up_read(&clp->cl_sem); - igrab(inode); *res = state; - return 0; + return 0; +out_put_open_state: + nfs4_put_open_state(state); out_err: - if (sp != NULL) { - if (state != NULL) - nfs4_put_open_state(state); - nfs4_put_state_owner(sp); - } up_read(&nfsi->rwsem); up_read(&clp->cl_sem); if (err != -EACCES) nfs_inode_return_delegation(inode); +out_put_state_owner: + nfs4_put_state_owner(sp); return err; } @@ -689,7 +922,7 @@ static struct nfs4_state *nfs4_open_dele } /* - * Returns an nfs4_state + an referenced inode + * Returns a referenced nfs4_state */ static int _nfs4_do_open(struct inode *dir, struct dentry *dentry, int flags, struct iattr *sattr, struct rpc_cred *cred, struct nfs4_state **res) { @@ -697,73 +930,46 @@ static int _nfs4_do_open(struct inode *d struct nfs4_state *state = NULL; struct nfs_server *server = NFS_SERVER(dir); struct nfs4_client *clp = server->nfs4_state; - struct inode *inode = NULL; + struct nfs4_opendata *opendata; int status; - struct nfs_fattr f_attr, dir_attr; - struct nfs_openargs o_arg = { - .fh = NFS_FH(dir), - .open_flags = flags, - .name = &dentry->d_name, - .server = server, - .bitmask = server->attr_bitmask, - .claim = NFS4_OPEN_CLAIM_NULL, - }; - struct nfs_openres o_res = { - .f_attr = &f_attr, - .dir_attr = &dir_attr, - .server = server, - }; /* Protect against reboot recovery conflicts */ - down_read(&clp->cl_sem); status = -ENOMEM; if (!(sp = nfs4_get_state_owner(server, cred))) { dprintk("nfs4_do_open: nfs4_get_state_owner failed!\n"); goto out_err; } - if (flags & O_EXCL) { - u32 *p = (u32 *) o_arg.u.verifier.data; - p[0] = jiffies; - p[1] = current->pid; - } else - o_arg.u.attrs = sattr; - /* Serialization for the sequence id */ + status = nfs4_recover_expired_lease(server); + if (status != 0) + goto err_put_state_owner; + down_read(&clp->cl_sem); + status = -ENOMEM; + opendata = nfs4_opendata_alloc(dentry, sp, flags, sattr); + if (opendata == NULL) + goto err_put_state_owner; - o_arg.seqid = nfs_alloc_seqid(&sp->so_seqid); - if (o_arg.seqid == NULL) - return -ENOMEM; - nfs_fattr_init(&f_attr); - nfs_fattr_init(&dir_attr); - status = _nfs4_proc_open(dir, sp, &o_arg, &o_res); + status = _nfs4_proc_open(opendata); if (status != 0) - goto out_err; + goto err_opendata_free; status = -ENOMEM; - inode = nfs_fhget(dir->i_sb, &o_res.fh, &f_attr); - if (!inode) - goto out_err; - state = nfs4_get_open_state(inode, sp); - if (!state) - goto out_err; - update_open_stateid(state, &o_res.stateid, flags); - if (o_res.delegation_type != 0) - nfs_inode_set_delegation(inode, cred, &o_res); - nfs_free_seqid(o_arg.seqid); + state = nfs4_opendata_to_nfs4_state(opendata); + if (state == NULL) + goto err_opendata_free; + if (opendata->o_res.delegation_type != 0) + nfs_inode_set_delegation(state->inode, cred, &opendata->o_res); + nfs4_opendata_free(opendata); nfs4_put_state_owner(sp); up_read(&clp->cl_sem); *res = state; return 0; +err_opendata_free: + nfs4_opendata_free(opendata); +err_put_state_owner: + nfs4_put_state_owner(sp); out_err: - if (sp != NULL) { - if (state != NULL) - nfs4_put_open_state(state); - nfs_free_seqid(o_arg.seqid); - nfs4_put_state_owner(sp); - } /* Note: clp->cl_sem must be released before nfs4_put_open_state()! */ up_read(&clp->cl_sem); - if (inode != NULL) - iput(inode); *res = NULL; return status; } @@ -830,6 +1036,7 @@ static int _nfs4_do_setattr(struct nfs_s .rpc_argp = &arg, .rpc_resp = &res, }; + unsigned long timestamp = jiffies; int status; nfs_fattr_init(fattr); @@ -841,6 +1048,8 @@ static int _nfs4_do_setattr(struct nfs_s memcpy(&arg.stateid, &zero_stateid, sizeof(arg.stateid)); status = rpc_call_sync(server->client, &msg, 0); + if (status == 0 && state != NULL) + renew_lease(server, timestamp); return status; } @@ -865,12 +1074,13 @@ struct nfs4_closedata { struct nfs_closeargs arg; struct nfs_closeres res; struct nfs_fattr fattr; + unsigned long timestamp; }; -static void nfs4_free_closedata(struct nfs4_closedata *calldata) +static void nfs4_free_closedata(void *data) { - struct nfs4_state *state = calldata->state; - struct nfs4_state_owner *sp = state->owner; + struct nfs4_closedata *calldata = data; + struct nfs4_state_owner *sp = calldata->state->owner; nfs4_put_open_state(calldata->state); nfs_free_seqid(calldata->arg.seqid); @@ -878,12 +1088,14 @@ static void nfs4_free_closedata(struct n kfree(calldata); } -static void nfs4_close_done(struct rpc_task *task) +static void nfs4_close_done(struct rpc_task *task, void *data) { - struct nfs4_closedata *calldata = (struct nfs4_closedata *)task->tk_calldata; + struct nfs4_closedata *calldata = data; struct nfs4_state *state = calldata->state; struct nfs_server *server = NFS_SERVER(calldata->inode); + if (RPC_ASSASSINATED(task)) + return; /* hmm. we are done with the inode, and in the process of freeing * the state_owner. we keep this around to process errors */ @@ -892,6 +1104,7 @@ static void nfs4_close_done(struct rpc_t case 0: memcpy(&state->stateid, &calldata->res.stateid, sizeof(state->stateid)); + renew_lease(server, calldata->timestamp); break; case -NFS4ERR_STALE_STATEID: case -NFS4ERR_EXPIRED: @@ -904,12 +1117,11 @@ static void nfs4_close_done(struct rpc_t } } nfs_refresh_inode(calldata->inode, calldata->res.fattr); - nfs4_free_closedata(calldata); } -static void nfs4_close_begin(struct rpc_task *task) +static void nfs4_close_prepare(struct rpc_task *task, void *data) { - struct nfs4_closedata *calldata = (struct nfs4_closedata *)task->tk_calldata; + struct nfs4_closedata *calldata = data; struct nfs4_state *state = calldata->state; struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_CLOSE], @@ -918,10 +1130,8 @@ static void nfs4_close_begin(struct rpc_ .rpc_cred = state->owner->so_cred, }; int mode = 0, old_mode; - int status; - status = nfs_wait_on_sequence(calldata->arg.seqid, task); - if (status != 0) + if (nfs_wait_on_sequence(calldata->arg.seqid, task) != 0) return; /* Recalculate the new open mode in case someone reopened the file * while we were waiting in line to be scheduled. @@ -929,26 +1139,34 @@ static void nfs4_close_begin(struct rpc_ spin_lock(&state->owner->so_lock); spin_lock(&calldata->inode->i_lock); mode = old_mode = state->state; - if (state->nreaders == 0) - mode &= ~FMODE_READ; - if (state->nwriters == 0) - mode &= ~FMODE_WRITE; + if (state->n_rdwr == 0) { + if (state->n_rdonly == 0) + mode &= ~FMODE_READ; + if (state->n_wronly == 0) + mode &= ~FMODE_WRITE; + } nfs4_state_set_mode_locked(state, mode); spin_unlock(&calldata->inode->i_lock); spin_unlock(&state->owner->so_lock); if (mode == old_mode || test_bit(NFS_DELEGATED_STATE, &state->flags)) { - nfs4_free_closedata(calldata); - task->tk_exit = NULL; - rpc_exit(task, 0); + /* Note: exit _without_ calling nfs4_close_done */ + task->tk_action = NULL; return; } nfs_fattr_init(calldata->res.fattr); if (mode != 0) msg.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OPEN_DOWNGRADE]; calldata->arg.open_flags = mode; + calldata->timestamp = jiffies; rpc_call_setup(task, &msg, 0); } +static const struct rpc_call_ops nfs4_close_ops = { + .rpc_call_prepare = nfs4_close_prepare, + .rpc_call_done = nfs4_close_done, + .rpc_release = nfs4_free_closedata, +}; + /* * It is possible for data to be read/written from a mem-mapped file * after the sys_close call (which hits the vfs layer as a flush). @@ -981,8 +1199,7 @@ int nfs4_do_close(struct inode *inode, s calldata->res.fattr = &calldata->fattr; calldata->res.server = server; - status = nfs4_call_async(server->client, nfs4_close_begin, - nfs4_close_done, calldata); + status = nfs4_call_async(server->client, &nfs4_close_ops, calldata); if (status == 0) goto out; @@ -1034,7 +1251,7 @@ nfs4_atomic_open(struct inode *dir, stru d_add(dentry, NULL); return (struct dentry *)state; } - res = d_add_unique(dentry, state->inode); + res = d_add_unique(dentry, igrab(state->inode)); if (res != NULL) dentry = res; nfs4_intent_set_file(nd, dentry, state); @@ -1046,7 +1263,6 @@ nfs4_open_revalidate(struct inode *dir, { struct rpc_cred *cred; struct nfs4_state *state; - struct inode *inode; cred = rpcauth_lookupcred(NFS_SERVER(dir)->client->cl_auth, 0); if (IS_ERR(cred)) @@ -1070,9 +1286,7 @@ nfs4_open_revalidate(struct inode *dir, } goto out_drop; } - inode = state->inode; - iput(inode); - if (inode == dentry->d_inode) { + if (state->inode == dentry->d_inode) { nfs4_intent_set_file(nd, dentry, state); return 1; } @@ -1508,11 +1722,13 @@ static int _nfs4_proc_write(struct nfs_w wdata->args.bitmask = server->attr_bitmask; wdata->res.server = server; + wdata->timestamp = jiffies; nfs_fattr_init(fattr); status = rpc_call_sync(server->client, &msg, rpcflags); dprintk("NFS reply write: %d\n", status); if (status < 0) return status; + renew_lease(server, wdata->timestamp); nfs_post_op_update_inode(inode, fattr); return wdata->res.count; } @@ -1547,8 +1763,11 @@ static int _nfs4_proc_commit(struct nfs_ cdata->args.bitmask = server->attr_bitmask; cdata->res.server = server; + cdata->timestamp = jiffies; nfs_fattr_init(fattr); status = rpc_call_sync(server->client, &msg, 0); + if (status >= 0) + renew_lease(server, cdata->timestamp); dprintk("NFS reply commit: %d\n", status); if (status >= 0) nfs_post_op_update_inode(inode, fattr); @@ -1601,7 +1820,7 @@ nfs4_proc_create(struct inode *dir, stru status = PTR_ERR(state); goto out; } - d_instantiate(dentry, state->inode); + d_instantiate(dentry, igrab(state->inode)); if (flags & O_EXCL) { struct nfs_fattr fattr; status = nfs4_do_setattr(NFS_SERVER(dir), &fattr, @@ -2125,10 +2344,9 @@ static int nfs4_proc_pathconf(struct nfs return err; } -static void -nfs4_read_done(struct rpc_task *task) +static void nfs4_read_done(struct rpc_task *task, void *calldata) { - struct nfs_read_data *data = (struct nfs_read_data *) task->tk_calldata; + struct nfs_read_data *data = calldata; struct inode *inode = data->inode; if (nfs4_async_handle_error(task, NFS_SERVER(inode)) == -EAGAIN) { @@ -2138,9 +2356,14 @@ nfs4_read_done(struct rpc_task *task) if (task->tk_status > 0) renew_lease(NFS_SERVER(inode), data->timestamp); /* Call back common NFS readpage processing */ - nfs_readpage_result(task); + nfs_readpage_result(task, calldata); } +static const struct rpc_call_ops nfs4_read_ops = { + .rpc_call_done = nfs4_read_done, + .rpc_release = nfs_readdata_release, +}; + static void nfs4_proc_read_setup(struct nfs_read_data *data) { @@ -2160,14 +2383,13 @@ nfs4_proc_read_setup(struct nfs_read_dat flags = RPC_TASK_ASYNC | (IS_SWAPFILE(inode)? NFS_RPC_SWAPFLAGS : 0); /* Finalize the task. */ - rpc_init_task(task, NFS_CLIENT(inode), nfs4_read_done, flags); + rpc_init_task(task, NFS_CLIENT(inode), flags, &nfs4_read_ops, data); rpc_call_setup(task, &msg, 0); } -static void -nfs4_write_done(struct rpc_task *task) +static void nfs4_write_done(struct rpc_task *task, void *calldata) { - struct nfs_write_data *data = (struct nfs_write_data *) task->tk_calldata; + struct nfs_write_data *data = calldata; struct inode *inode = data->inode; if (nfs4_async_handle_error(task, NFS_SERVER(inode)) == -EAGAIN) { @@ -2179,9 +2401,14 @@ nfs4_write_done(struct rpc_task *task) nfs_post_op_update_inode(inode, data->res.fattr); } /* Call back common NFS writeback processing */ - nfs_writeback_done(task); + nfs_writeback_done(task, calldata); } +static const struct rpc_call_ops nfs4_write_ops = { + .rpc_call_done = nfs4_write_done, + .rpc_release = nfs_writedata_release, +}; + static void nfs4_proc_write_setup(struct nfs_write_data *data, int how) { @@ -2214,14 +2441,13 @@ nfs4_proc_write_setup(struct nfs_write_d flags = (how & FLUSH_SYNC) ? 0 : RPC_TASK_ASYNC; /* Finalize the task. */ - rpc_init_task(task, NFS_CLIENT(inode), nfs4_write_done, flags); + rpc_init_task(task, NFS_CLIENT(inode), flags, &nfs4_write_ops, data); rpc_call_setup(task, &msg, 0); } -static void -nfs4_commit_done(struct rpc_task *task) +static void nfs4_commit_done(struct rpc_task *task, void *calldata) { - struct nfs_write_data *data = (struct nfs_write_data *) task->tk_calldata; + struct nfs_write_data *data = calldata; struct inode *inode = data->inode; if (nfs4_async_handle_error(task, NFS_SERVER(inode)) == -EAGAIN) { @@ -2231,9 +2457,14 @@ nfs4_commit_done(struct rpc_task *task) if (task->tk_status >= 0) nfs_post_op_update_inode(inode, data->res.fattr); /* Call back common NFS writeback processing */ - nfs_commit_done(task); + nfs_commit_done(task, calldata); } +static const struct rpc_call_ops nfs4_commit_ops = { + .rpc_call_done = nfs4_commit_done, + .rpc_release = nfs_commit_release, +}; + static void nfs4_proc_commit_setup(struct nfs_write_data *data, int how) { @@ -2255,7 +2486,7 @@ nfs4_proc_commit_setup(struct nfs_write_ flags = (how & FLUSH_SYNC) ? 0 : RPC_TASK_ASYNC; /* Finalize the task. */ - rpc_init_task(task, NFS_CLIENT(inode), nfs4_commit_done, flags); + rpc_init_task(task, NFS_CLIENT(inode), flags, &nfs4_commit_ops, data); rpc_call_setup(task, &msg, 0); } @@ -2263,11 +2494,10 @@ nfs4_proc_commit_setup(struct nfs_write_ * nfs4_proc_async_renew(): This is not one of the nfs_rpc_ops; it is a special * standalone procedure for queueing an asynchronous RENEW. */ -static void -renew_done(struct rpc_task *task) +static void nfs4_renew_done(struct rpc_task *task, void *data) { struct nfs4_client *clp = (struct nfs4_client *)task->tk_msg.rpc_argp; - unsigned long timestamp = (unsigned long)task->tk_calldata; + unsigned long timestamp = (unsigned long)data; if (task->tk_status < 0) { switch (task->tk_status) { @@ -2284,26 +2514,28 @@ renew_done(struct rpc_task *task) spin_unlock(&clp->cl_lock); } -int -nfs4_proc_async_renew(struct nfs4_client *clp) +static const struct rpc_call_ops nfs4_renew_ops = { + .rpc_call_done = nfs4_renew_done, +}; + +int nfs4_proc_async_renew(struct nfs4_client *clp, struct rpc_cred *cred) { struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RENEW], .rpc_argp = clp, - .rpc_cred = clp->cl_cred, + .rpc_cred = cred, }; return rpc_call_async(clp->cl_rpcclient, &msg, RPC_TASK_SOFT, - renew_done, (void *)jiffies); + &nfs4_renew_ops, (void *)jiffies); } -int -nfs4_proc_renew(struct nfs4_client *clp) +int nfs4_proc_renew(struct nfs4_client *clp, struct rpc_cred *cred) { struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RENEW], .rpc_argp = clp, - .rpc_cred = clp->cl_cred, + .rpc_cred = cred, }; unsigned long now = jiffies; int status; @@ -2519,7 +2751,7 @@ nfs4_async_handle_error(struct rpc_task case -NFS4ERR_EXPIRED: rpc_sleep_on(&clp->cl_rpcwaitq, task, NULL, NULL); nfs4_schedule_state_recovery(clp); - if (test_bit(NFS4CLNT_OK, &clp->cl_state)) + if (test_bit(NFS4CLNT_STATE_RECOVER, &clp->cl_state) == 0) rpc_wake_up_task(task); task->tk_status = 0; return -EAGAIN; @@ -2536,25 +2768,25 @@ nfs4_async_handle_error(struct rpc_task return 0; } +static int nfs4_wait_bit_interruptible(void *word) +{ + if (signal_pending(current)) + return -ERESTARTSYS; + schedule(); + return 0; +} + static int nfs4_wait_clnt_recover(struct rpc_clnt *clnt, struct nfs4_client *clp) { - DEFINE_WAIT(wait); sigset_t oldset; - int interruptible, res = 0; + int res; might_sleep(); rpc_clnt_sigmask(clnt, &oldset); - interruptible = TASK_UNINTERRUPTIBLE; - if (clnt->cl_intr) - interruptible = TASK_INTERRUPTIBLE; - prepare_to_wait(&clp->cl_waitq, &wait, interruptible); - nfs4_schedule_state_recovery(clp); - if (clnt->cl_intr && signalled()) - res = -ERESTARTSYS; - else if (!test_bit(NFS4CLNT_OK, &clp->cl_state)) - schedule(); - finish_wait(&clp->cl_waitq, &wait); + res = wait_on_bit(&clp->cl_state, NFS4CLNT_STATE_RECOVER, + nfs4_wait_bit_interruptible, + TASK_INTERRUPTIBLE); rpc_clnt_sigunmask(clnt, &oldset); return res; } @@ -2597,6 +2829,7 @@ int nfs4_handle_exception(const struct n case -NFS4ERR_STALE_CLIENTID: case -NFS4ERR_STALE_STATEID: case -NFS4ERR_EXPIRED: + nfs4_schedule_state_recovery(clp); ret = nfs4_wait_clnt_recover(server->client, clp); if (ret == 0) exception->retry = 1; @@ -2613,7 +2846,7 @@ int nfs4_handle_exception(const struct n return nfs4_map_errors(ret); } -int nfs4_proc_setclientid(struct nfs4_client *clp, u32 program, unsigned short port) +int nfs4_proc_setclientid(struct nfs4_client *clp, u32 program, unsigned short port, struct rpc_cred *cred) { nfs4_verifier sc_verifier; struct nfs4_setclientid setclientid = { @@ -2624,7 +2857,7 @@ int nfs4_proc_setclientid(struct nfs4_cl .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_SETCLIENTID], .rpc_argp = &setclientid, .rpc_resp = clp, - .rpc_cred = clp->cl_cred, + .rpc_cred = cred, }; u32 *p; int loop = 0; @@ -2638,7 +2871,7 @@ int nfs4_proc_setclientid(struct nfs4_cl setclientid.sc_name_len = scnprintf(setclientid.sc_name, sizeof(setclientid.sc_name), "%s/%u.%u.%u.%u %s %u", clp->cl_ipaddr, NIPQUAD(clp->cl_addr.s_addr), - clp->cl_cred->cr_ops->cr_name, + cred->cr_ops->cr_name, clp->cl_id_uniquifier); setclientid.sc_netid_len = scnprintf(setclientid.sc_netid, sizeof(setclientid.sc_netid), "tcp"); @@ -2661,14 +2894,14 @@ int nfs4_proc_setclientid(struct nfs4_cl } int -nfs4_proc_setclientid_confirm(struct nfs4_client *clp) +nfs4_proc_setclientid_confirm(struct nfs4_client *clp, struct rpc_cred *cred) { struct nfs_fsinfo fsinfo; struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_SETCLIENTID_CONFIRM], .rpc_argp = clp, .rpc_resp = &fsinfo, - .rpc_cred = clp->cl_cred, + .rpc_cred = cred, }; unsigned long now; int status; @@ -2679,24 +2912,92 @@ nfs4_proc_setclientid_confirm(struct nfs spin_lock(&clp->cl_lock); clp->cl_lease_time = fsinfo.lease_time * HZ; clp->cl_last_renewal = now; + clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state); spin_unlock(&clp->cl_lock); } return status; } -static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4_stateid *stateid) +struct nfs4_delegreturndata { + struct nfs4_delegreturnargs args; + struct nfs4_delegreturnres res; + struct nfs_fh fh; + nfs4_stateid stateid; + struct rpc_cred *cred; + unsigned long timestamp; + struct nfs_fattr fattr; + int rpc_status; +}; + +static void nfs4_delegreturn_prepare(struct rpc_task *task, void *calldata) { - struct nfs4_delegreturnargs args = { - .fhandle = NFS_FH(inode), - .stateid = stateid, - }; + struct nfs4_delegreturndata *data = calldata; struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_DELEGRETURN], - .rpc_argp = &args, - .rpc_cred = cred, + .rpc_argp = &data->args, + .rpc_resp = &data->res, + .rpc_cred = data->cred, }; + nfs_fattr_init(data->res.fattr); + rpc_call_setup(task, &msg, 0); +} - return rpc_call_sync(NFS_CLIENT(inode), &msg, 0); +static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata) +{ + struct nfs4_delegreturndata *data = calldata; + data->rpc_status = task->tk_status; + if (data->rpc_status == 0) + renew_lease(data->res.server, data->timestamp); +} + +static void nfs4_delegreturn_release(void *calldata) +{ + struct nfs4_delegreturndata *data = calldata; + + put_rpccred(data->cred); + kfree(calldata); +} + +const static struct rpc_call_ops nfs4_delegreturn_ops = { + .rpc_call_prepare = nfs4_delegreturn_prepare, + .rpc_call_done = nfs4_delegreturn_done, + .rpc_release = nfs4_delegreturn_release, +}; + +static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4_stateid *stateid) +{ + struct nfs4_delegreturndata *data; + struct nfs_server *server = NFS_SERVER(inode); + struct rpc_task *task; + int status; + + data = kmalloc(sizeof(*data), GFP_KERNEL); + if (data == NULL) + return -ENOMEM; + data->args.fhandle = &data->fh; + data->args.stateid = &data->stateid; + data->args.bitmask = server->attr_bitmask; + nfs_copy_fh(&data->fh, NFS_FH(inode)); + memcpy(&data->stateid, stateid, sizeof(data->stateid)); + data->res.fattr = &data->fattr; + data->res.server = server; + data->cred = get_rpccred(cred); + data->timestamp = jiffies; + data->rpc_status = 0; + + task = rpc_run_task(NFS_CLIENT(inode), RPC_TASK_ASYNC, &nfs4_delegreturn_ops, data); + if (IS_ERR(task)) { + nfs4_delegreturn_release(data); + return PTR_ERR(task); + } + status = nfs4_wait_for_completion_rpc_task(task); + if (status == 0) { + status = data->rpc_status; + if (status == 0) + nfs_post_op_update_inode(inode, &data->fattr); + } + rpc_release_task(task); + return status; } int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4_stateid *stateid) @@ -2734,43 +3035,17 @@ nfs4_set_lock_task_retry(unsigned long t return timeout; } -static inline int -nfs4_lck_type(int cmd, struct file_lock *request) -{ - /* set lock type */ - switch (request->fl_type) { - case F_RDLCK: - return IS_SETLKW(cmd) ? NFS4_READW_LT : NFS4_READ_LT; - case F_WRLCK: - return IS_SETLKW(cmd) ? NFS4_WRITEW_LT : NFS4_WRITE_LT; - case F_UNLCK: - return NFS4_WRITE_LT; - } - BUG(); - return 0; -} - -static inline uint64_t -nfs4_lck_length(struct file_lock *request) -{ - if (request->fl_end == OFFSET_MAX) - return ~(uint64_t)0; - return request->fl_end - request->fl_start + 1; -} - static int _nfs4_proc_getlk(struct nfs4_state *state, int cmd, struct file_lock *request) { struct inode *inode = state->inode; struct nfs_server *server = NFS_SERVER(inode); struct nfs4_client *clp = server->nfs4_state; - struct nfs_lockargs arg = { + struct nfs_lockt_args arg = { .fh = NFS_FH(inode), - .type = nfs4_lck_type(cmd, request), - .offset = request->fl_start, - .length = nfs4_lck_length(request), + .fl = request, }; - struct nfs_lockres res = { - .server = server, + struct nfs_lockt_res res = { + .denied = request, }; struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_LOCKT], @@ -2778,36 +3053,23 @@ static int _nfs4_proc_getlk(struct nfs4_ .rpc_resp = &res, .rpc_cred = state->owner->so_cred, }; - struct nfs_lowner nlo; struct nfs4_lock_state *lsp; int status; down_read(&clp->cl_sem); - nlo.clientid = clp->cl_clientid; + arg.lock_owner.clientid = clp->cl_clientid; status = nfs4_set_lock_state(state, request); if (status != 0) goto out; lsp = request->fl_u.nfs4_fl.owner; - nlo.id = lsp->ls_id; - arg.u.lockt = &nlo; + arg.lock_owner.id = lsp->ls_id; status = rpc_call_sync(server->client, &msg, 0); - if (!status) { - request->fl_type = F_UNLCK; - } else if (status == -NFS4ERR_DENIED) { - int64_t len, start, end; - start = res.u.denied.offset; - len = res.u.denied.length; - end = start + len - 1; - if (end < 0 || len == 0) - request->fl_end = OFFSET_MAX; - else - request->fl_end = (loff_t)end; - request->fl_start = (loff_t)start; - request->fl_type = F_WRLCK; - if (res.u.denied.type & 1) - request->fl_type = F_RDLCK; - request->fl_pid = 0; - status = 0; + switch (status) { + case 0: + request->fl_type = F_UNLCK; + break; + case -NFS4ERR_DENIED: + status = 0; } out: up_read(&clp->cl_sem); @@ -2847,196 +3109,314 @@ static int do_vfs_lock(struct file *file } struct nfs4_unlockdata { - struct nfs_lockargs arg; - struct nfs_locku_opargs luargs; - struct nfs_lockres res; + struct nfs_locku_args arg; + struct nfs_locku_res res; struct nfs4_lock_state *lsp; struct nfs_open_context *ctx; - atomic_t refcount; - struct completion completion; + struct file_lock fl; + const struct nfs_server *server; + unsigned long timestamp; }; -static void nfs4_locku_release_calldata(struct nfs4_unlockdata *calldata) -{ - if (atomic_dec_and_test(&calldata->refcount)) { - nfs_free_seqid(calldata->luargs.seqid); - nfs4_put_lock_state(calldata->lsp); - put_nfs_open_context(calldata->ctx); - kfree(calldata); - } +static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl, + struct nfs_open_context *ctx, + struct nfs4_lock_state *lsp, + struct nfs_seqid *seqid) +{ + struct nfs4_unlockdata *p; + struct inode *inode = lsp->ls_state->inode; + + p = kmalloc(sizeof(*p), GFP_KERNEL); + if (p == NULL) + return NULL; + p->arg.fh = NFS_FH(inode); + p->arg.fl = &p->fl; + p->arg.seqid = seqid; + p->arg.stateid = &lsp->ls_stateid; + p->lsp = lsp; + atomic_inc(&lsp->ls_count); + /* Ensure we don't close file until we're done freeing locks! */ + p->ctx = get_nfs_open_context(ctx); + memcpy(&p->fl, fl, sizeof(p->fl)); + p->server = NFS_SERVER(inode); + return p; } -static void nfs4_locku_complete(struct nfs4_unlockdata *calldata) +static void nfs4_locku_release_calldata(void *data) { - complete(&calldata->completion); - nfs4_locku_release_calldata(calldata); + struct nfs4_unlockdata *calldata = data; + nfs_free_seqid(calldata->arg.seqid); + nfs4_put_lock_state(calldata->lsp); + put_nfs_open_context(calldata->ctx); + kfree(calldata); } -static void nfs4_locku_done(struct rpc_task *task) +static void nfs4_locku_done(struct rpc_task *task, void *data) { - struct nfs4_unlockdata *calldata = (struct nfs4_unlockdata *)task->tk_calldata; + struct nfs4_unlockdata *calldata = data; - nfs_increment_lock_seqid(task->tk_status, calldata->luargs.seqid); + if (RPC_ASSASSINATED(task)) + return; + nfs_increment_lock_seqid(task->tk_status, calldata->arg.seqid); switch (task->tk_status) { case 0: memcpy(calldata->lsp->ls_stateid.data, - calldata->res.u.stateid.data, + calldata->res.stateid.data, sizeof(calldata->lsp->ls_stateid.data)); + renew_lease(calldata->server, calldata->timestamp); break; case -NFS4ERR_STALE_STATEID: case -NFS4ERR_EXPIRED: - nfs4_schedule_state_recovery(calldata->res.server->nfs4_state); + nfs4_schedule_state_recovery(calldata->server->nfs4_state); break; default: - if (nfs4_async_handle_error(task, calldata->res.server) == -EAGAIN) { + if (nfs4_async_handle_error(task, calldata->server) == -EAGAIN) { rpc_restart_call(task); - return; } } - nfs4_locku_complete(calldata); } -static void nfs4_locku_begin(struct rpc_task *task) +static void nfs4_locku_prepare(struct rpc_task *task, void *data) { - struct nfs4_unlockdata *calldata = (struct nfs4_unlockdata *)task->tk_calldata; + struct nfs4_unlockdata *calldata = data; struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_LOCKU], .rpc_argp = &calldata->arg, .rpc_resp = &calldata->res, .rpc_cred = calldata->lsp->ls_state->owner->so_cred, }; - int status; - status = nfs_wait_on_sequence(calldata->luargs.seqid, task); - if (status != 0) + if (nfs_wait_on_sequence(calldata->arg.seqid, task) != 0) return; if ((calldata->lsp->ls_flags & NFS_LOCK_INITIALIZED) == 0) { - nfs4_locku_complete(calldata); - task->tk_exit = NULL; - rpc_exit(task, 0); + /* Note: exit _without_ running nfs4_locku_done */ + task->tk_action = NULL; return; } + calldata->timestamp = jiffies; rpc_call_setup(task, &msg, 0); } +static const struct rpc_call_ops nfs4_locku_ops = { + .rpc_call_prepare = nfs4_locku_prepare, + .rpc_call_done = nfs4_locku_done, + .rpc_release = nfs4_locku_release_calldata, +}; + +static struct rpc_task *nfs4_do_unlck(struct file_lock *fl, + struct nfs_open_context *ctx, + struct nfs4_lock_state *lsp, + struct nfs_seqid *seqid) +{ + struct nfs4_unlockdata *data; + struct rpc_task *task; + + data = nfs4_alloc_unlockdata(fl, ctx, lsp, seqid); + if (data == NULL) { + nfs_free_seqid(seqid); + return ERR_PTR(-ENOMEM); + } + + /* Unlock _before_ we do the RPC call */ + do_vfs_lock(fl->fl_file, fl); + task = rpc_run_task(NFS_CLIENT(lsp->ls_state->inode), RPC_TASK_ASYNC, &nfs4_locku_ops, data); + if (IS_ERR(task)) + nfs4_locku_release_calldata(data); + return task; +} + static int nfs4_proc_unlck(struct nfs4_state *state, int cmd, struct file_lock *request) { - struct nfs4_unlockdata *calldata; - struct inode *inode = state->inode; - struct nfs_server *server = NFS_SERVER(inode); + struct nfs_seqid *seqid; struct nfs4_lock_state *lsp; - int status; + struct rpc_task *task; + int status = 0; /* Is this a delegated lock? */ if (test_bit(NFS_DELEGATED_STATE, &state->flags)) - return do_vfs_lock(request->fl_file, request); + goto out_unlock; + /* Is this open_owner holding any locks on the server? */ + if (test_bit(LK_STATE_IN_USE, &state->flags) == 0) + goto out_unlock; status = nfs4_set_lock_state(state, request); if (status != 0) - return status; + goto out_unlock; lsp = request->fl_u.nfs4_fl.owner; - /* We might have lost the locks! */ - if ((lsp->ls_flags & NFS_LOCK_INITIALIZED) == 0) - return 0; - calldata = kmalloc(sizeof(*calldata), GFP_KERNEL); - if (calldata == NULL) - return -ENOMEM; - calldata->luargs.seqid = nfs_alloc_seqid(&lsp->ls_seqid); - if (calldata->luargs.seqid == NULL) { - kfree(calldata); - return -ENOMEM; - } - calldata->luargs.stateid = &lsp->ls_stateid; - calldata->arg.fh = NFS_FH(inode); - calldata->arg.type = nfs4_lck_type(cmd, request); - calldata->arg.offset = request->fl_start; - calldata->arg.length = nfs4_lck_length(request); - calldata->arg.u.locku = &calldata->luargs; - calldata->res.server = server; - calldata->lsp = lsp; - atomic_inc(&lsp->ls_count); - - /* Ensure we don't close file until we're done freeing locks! */ - calldata->ctx = get_nfs_open_context((struct nfs_open_context*)request->fl_file->private_data); - - atomic_set(&calldata->refcount, 2); - init_completion(&calldata->completion); - - status = nfs4_call_async(NFS_SERVER(inode)->client, nfs4_locku_begin, - nfs4_locku_done, calldata); - if (status == 0) - wait_for_completion_interruptible(&calldata->completion); + status = -ENOMEM; + seqid = nfs_alloc_seqid(&lsp->ls_seqid); + if (seqid == NULL) + goto out_unlock; + task = nfs4_do_unlck(request, request->fl_file->private_data, lsp, seqid); + status = PTR_ERR(task); + if (IS_ERR(task)) + goto out_unlock; + status = nfs4_wait_for_completion_rpc_task(task); + rpc_release_task(task); + return status; +out_unlock: do_vfs_lock(request->fl_file, request); - nfs4_locku_release_calldata(calldata); return status; } -static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *request, int reclaim) +struct nfs4_lockdata { + struct nfs_lock_args arg; + struct nfs_lock_res res; + struct nfs4_lock_state *lsp; + struct nfs_open_context *ctx; + struct file_lock fl; + unsigned long timestamp; + int rpc_status; + int cancelled; +}; + +static struct nfs4_lockdata *nfs4_alloc_lockdata(struct file_lock *fl, + struct nfs_open_context *ctx, struct nfs4_lock_state *lsp) { - struct inode *inode = state->inode; + struct nfs4_lockdata *p; + struct inode *inode = lsp->ls_state->inode; struct nfs_server *server = NFS_SERVER(inode); - struct nfs4_lock_state *lsp = request->fl_u.nfs4_fl.owner; - struct nfs_lock_opargs largs = { - .lock_stateid = &lsp->ls_stateid, - .open_stateid = &state->stateid, - .lock_owner = { - .clientid = server->nfs4_state->cl_clientid, - .id = lsp->ls_id, - }, - .reclaim = reclaim, - }; - struct nfs_lockargs arg = { - .fh = NFS_FH(inode), - .type = nfs4_lck_type(cmd, request), - .offset = request->fl_start, - .length = nfs4_lck_length(request), - .u = { - .lock = &largs, - }, - }; - struct nfs_lockres res = { - .server = server, - }; + + p = kzalloc(sizeof(*p), GFP_KERNEL); + if (p == NULL) + return NULL; + + p->arg.fh = NFS_FH(inode); + p->arg.fl = &p->fl; + p->arg.lock_seqid = nfs_alloc_seqid(&lsp->ls_seqid); + if (p->arg.lock_seqid == NULL) + goto out_free; + p->arg.lock_stateid = &lsp->ls_stateid; + p->arg.lock_owner.clientid = server->nfs4_state->cl_clientid; + p->arg.lock_owner.id = lsp->ls_id; + p->lsp = lsp; + atomic_inc(&lsp->ls_count); + p->ctx = get_nfs_open_context(ctx); + memcpy(&p->fl, fl, sizeof(p->fl)); + return p; +out_free: + kfree(p); + return NULL; +} + +static void nfs4_lock_prepare(struct rpc_task *task, void *calldata) +{ + struct nfs4_lockdata *data = calldata; + struct nfs4_state *state = data->lsp->ls_state; + struct nfs4_state_owner *sp = state->owner; struct rpc_message msg = { - .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_LOCK], - .rpc_argp = &arg, - .rpc_resp = &res, - .rpc_cred = state->owner->so_cred, + .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_LOCK], + .rpc_argp = &data->arg, + .rpc_resp = &data->res, + .rpc_cred = sp->so_cred, }; - int status = -ENOMEM; - largs.lock_seqid = nfs_alloc_seqid(&lsp->ls_seqid); - if (largs.lock_seqid == NULL) - return -ENOMEM; - if (!(lsp->ls_seqid.flags & NFS_SEQID_CONFIRMED)) { - struct nfs4_state_owner *owner = state->owner; - - largs.open_seqid = nfs_alloc_seqid(&owner->so_seqid); - if (largs.open_seqid == NULL) + if (nfs_wait_on_sequence(data->arg.lock_seqid, task) != 0) + return; + dprintk("%s: begin!\n", __FUNCTION__); + /* Do we need to do an open_to_lock_owner? */ + if (!(data->arg.lock_seqid->sequence->flags & NFS_SEQID_CONFIRMED)) { + data->arg.open_seqid = nfs_alloc_seqid(&sp->so_seqid); + if (data->arg.open_seqid == NULL) { + data->rpc_status = -ENOMEM; + task->tk_action = NULL; goto out; - largs.new_lock_owner = 1; - status = rpc_call_sync(server->client, &msg, RPC_TASK_NOINTR); - /* increment open seqid on success, and seqid mutating errors */ - if (largs.new_lock_owner != 0) { - nfs_increment_open_seqid(status, largs.open_seqid); - if (status == 0) - nfs_confirm_seqid(&lsp->ls_seqid, 0); } - nfs_free_seqid(largs.open_seqid); - } else - status = rpc_call_sync(server->client, &msg, RPC_TASK_NOINTR); - /* increment lock seqid on success, and seqid mutating errors*/ - nfs_increment_lock_seqid(status, largs.lock_seqid); - /* save the returned stateid. */ - if (status == 0) { - memcpy(lsp->ls_stateid.data, res.u.stateid.data, - sizeof(lsp->ls_stateid.data)); - lsp->ls_flags |= NFS_LOCK_INITIALIZED; - } else if (status == -NFS4ERR_DENIED) - status = -EAGAIN; + data->arg.open_stateid = &state->stateid; + data->arg.new_lock_owner = 1; + } + data->timestamp = jiffies; + rpc_call_setup(task, &msg, 0); out: - nfs_free_seqid(largs.lock_seqid); - return status; + dprintk("%s: done!, ret = %d\n", __FUNCTION__, data->rpc_status); +} + +static void nfs4_lock_done(struct rpc_task *task, void *calldata) +{ + struct nfs4_lockdata *data = calldata; + + dprintk("%s: begin!\n", __FUNCTION__); + + data->rpc_status = task->tk_status; + if (RPC_ASSASSINATED(task)) + goto out; + if (data->arg.new_lock_owner != 0) { + nfs_increment_open_seqid(data->rpc_status, data->arg.open_seqid); + if (data->rpc_status == 0) + nfs_confirm_seqid(&data->lsp->ls_seqid, 0); + else + goto out; + } + if (data->rpc_status == 0) { + memcpy(data->lsp->ls_stateid.data, data->res.stateid.data, + sizeof(data->lsp->ls_stateid.data)); + data->lsp->ls_flags |= NFS_LOCK_INITIALIZED; + renew_lease(NFS_SERVER(data->ctx->dentry->d_inode), data->timestamp); + } + nfs_increment_lock_seqid(data->rpc_status, data->arg.lock_seqid); +out: + dprintk("%s: done, ret = %d!\n", __FUNCTION__, data->rpc_status); +} + +static void nfs4_lock_release(void *calldata) +{ + struct nfs4_lockdata *data = calldata; + + dprintk("%s: begin!\n", __FUNCTION__); + if (data->arg.open_seqid != NULL) + nfs_free_seqid(data->arg.open_seqid); + if (data->cancelled != 0) { + struct rpc_task *task; + task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp, + data->arg.lock_seqid); + if (!IS_ERR(task)) + rpc_release_task(task); + dprintk("%s: cancelling lock!\n", __FUNCTION__); + } else + nfs_free_seqid(data->arg.lock_seqid); + nfs4_put_lock_state(data->lsp); + put_nfs_open_context(data->ctx); + kfree(data); + dprintk("%s: done!\n", __FUNCTION__); +} + +static const struct rpc_call_ops nfs4_lock_ops = { + .rpc_call_prepare = nfs4_lock_prepare, + .rpc_call_done = nfs4_lock_done, + .rpc_release = nfs4_lock_release, +}; + +static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *fl, int reclaim) +{ + struct nfs4_lockdata *data; + struct rpc_task *task; + int ret; + + dprintk("%s: begin!\n", __FUNCTION__); + data = nfs4_alloc_lockdata(fl, fl->fl_file->private_data, + fl->fl_u.nfs4_fl.owner); + if (data == NULL) + return -ENOMEM; + if (IS_SETLKW(cmd)) + data->arg.block = 1; + if (reclaim != 0) + data->arg.reclaim = 1; + task = rpc_run_task(NFS_CLIENT(state->inode), RPC_TASK_ASYNC, + &nfs4_lock_ops, data); + if (IS_ERR(task)) { + nfs4_lock_release(data); + return PTR_ERR(task); + } + ret = nfs4_wait_for_completion_rpc_task(task); + if (ret == 0) { + ret = data->rpc_status; + if (ret == -NFS4ERR_DENIED) + ret = -EAGAIN; + } else + data->cancelled = 1; + rpc_release_task(task); + dprintk("%s: done, ret = %d!\n", __FUNCTION__, ret); + return ret; } static int nfs4_lock_reclaim(struct nfs4_state *state, struct file_lock *request) diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c index a300162..5d764d8 100644 --- a/fs/nfs/nfs4renewd.c +++ b/fs/nfs/nfs4renewd.c @@ -54,6 +54,7 @@ #include #include #include "nfs4_fs.h" +#include "delegation.h" #define NFSDBG_FACILITY NFSDBG_PROC @@ -61,6 +62,7 @@ void nfs4_renew_state(void *data) { struct nfs4_client *clp = (struct nfs4_client *)data; + struct rpc_cred *cred; long lease, timeout; unsigned long last, now; @@ -68,7 +70,7 @@ nfs4_renew_state(void *data) dprintk("%s: start\n", __FUNCTION__); /* Are there any active superblocks? */ if (list_empty(&clp->cl_superblocks)) - goto out; + goto out; spin_lock(&clp->cl_lock); lease = clp->cl_lease_time; last = clp->cl_last_renewal; @@ -76,9 +78,17 @@ nfs4_renew_state(void *data) timeout = (2 * lease) / 3 + (long)last - (long)now; /* Are we close to a lease timeout? */ if (time_after(now, last + lease/3)) { + cred = nfs4_get_renew_cred(clp); + if (cred == NULL) { + set_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state); + spin_unlock(&clp->cl_lock); + nfs_expire_all_delegations(clp); + goto out; + } spin_unlock(&clp->cl_lock); /* Queue an asynchronous RENEW. */ - nfs4_proc_async_renew(clp); + nfs4_proc_async_renew(clp, cred); + put_rpccred(cred); timeout = (2 * lease) / 3; spin_lock(&clp->cl_lock); } else diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 5ef4c57..afad025 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -43,6 +43,8 @@ #include #include #include +#include +#include #include #include @@ -57,8 +59,6 @@ const nfs4_stateid zero_stateid; static DEFINE_SPINLOCK(state_spinlock); static LIST_HEAD(nfs4_clientid_list); -static void nfs4_recover_state(void *); - void init_nfsv4_state(struct nfs_server *server) { @@ -91,11 +91,10 @@ nfs4_alloc_client(struct in_addr *addr) if (nfs_callback_up() < 0) return NULL; - if ((clp = kmalloc(sizeof(*clp), GFP_KERNEL)) == NULL) { + if ((clp = kzalloc(sizeof(*clp), GFP_KERNEL)) == NULL) { nfs_callback_down(); return NULL; } - memset(clp, 0, sizeof(*clp)); memcpy(&clp->cl_addr, addr, sizeof(clp->cl_addr)); init_rwsem(&clp->cl_sem); INIT_LIST_HEAD(&clp->cl_delegations); @@ -103,14 +102,12 @@ nfs4_alloc_client(struct in_addr *addr) INIT_LIST_HEAD(&clp->cl_unused); spin_lock_init(&clp->cl_lock); atomic_set(&clp->cl_count, 1); - INIT_WORK(&clp->cl_recoverd, nfs4_recover_state, clp); INIT_WORK(&clp->cl_renewd, nfs4_renew_state, clp); INIT_LIST_HEAD(&clp->cl_superblocks); - init_waitqueue_head(&clp->cl_waitq); rpc_init_wait_queue(&clp->cl_rpcwaitq, "NFS4 client"); clp->cl_rpcclient = ERR_PTR(-EINVAL); clp->cl_boot_time = CURRENT_TIME; - clp->cl_state = 1 << NFS4CLNT_OK; + clp->cl_state = 1 << NFS4CLNT_LEASE_EXPIRED; return clp; } @@ -127,8 +124,6 @@ nfs4_free_client(struct nfs4_client *clp kfree(sp); } BUG_ON(!list_empty(&clp->cl_state_owners)); - if (clp->cl_cred) - put_rpccred(clp->cl_cred); nfs_idmap_delete(clp); if (!IS_ERR(clp->cl_rpcclient)) rpc_shutdown_client(clp->cl_rpcclient); @@ -193,27 +188,22 @@ nfs4_put_client(struct nfs4_client *clp) list_del(&clp->cl_servers); spin_unlock(&state_spinlock); BUG_ON(!list_empty(&clp->cl_superblocks)); - wake_up_all(&clp->cl_waitq); rpc_wake_up(&clp->cl_rpcwaitq); nfs4_kill_renewd(clp); nfs4_free_client(clp); } -static int __nfs4_init_client(struct nfs4_client *clp) +static int nfs4_init_client(struct nfs4_client *clp, struct rpc_cred *cred) { - int status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, nfs_callback_tcpport); + int status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, + nfs_callback_tcpport, cred); if (status == 0) - status = nfs4_proc_setclientid_confirm(clp); + status = nfs4_proc_setclientid_confirm(clp, cred); if (status == 0) nfs4_schedule_state_renewal(clp); return status; } -int nfs4_init_client(struct nfs4_client *clp) -{ - return nfs4_map_errors(__nfs4_init_client(clp)); -} - u32 nfs4_alloc_lockowner_id(struct nfs4_client *clp) { @@ -235,6 +225,32 @@ nfs4_client_grab_unused(struct nfs4_clie return sp; } +struct rpc_cred *nfs4_get_renew_cred(struct nfs4_client *clp) +{ + struct nfs4_state_owner *sp; + struct rpc_cred *cred = NULL; + + list_for_each_entry(sp, &clp->cl_state_owners, so_list) { + if (list_empty(&sp->so_states)) + continue; + cred = get_rpccred(sp->so_cred); + break; + } + return cred; +} + +struct rpc_cred *nfs4_get_setclientid_cred(struct nfs4_client *clp) +{ + struct nfs4_state_owner *sp; + + if (!list_empty(&clp->cl_state_owners)) { + sp = list_entry(clp->cl_state_owners.next, + struct nfs4_state_owner, so_list); + return get_rpccred(sp->so_cred); + } + return NULL; +} + static struct nfs4_state_owner * nfs4_find_state_owner(struct nfs4_client *clp, struct rpc_cred *cred) { @@ -349,14 +365,9 @@ nfs4_alloc_open_state(void) { struct nfs4_state *state; - state = kmalloc(sizeof(*state), GFP_KERNEL); + state = kzalloc(sizeof(*state), GFP_KERNEL); if (!state) return NULL; - state->state = 0; - state->nreaders = 0; - state->nwriters = 0; - state->flags = 0; - memset(state->stateid.data, 0, sizeof(state->stateid.data)); atomic_set(&state->count, 1); INIT_LIST_HEAD(&state->lock_states); spin_lock_init(&state->state_lock); @@ -475,15 +486,23 @@ void nfs4_close_state(struct nfs4_state /* Protect against nfs4_find_state() */ spin_lock(&owner->so_lock); spin_lock(&inode->i_lock); - if (mode & FMODE_READ) - state->nreaders--; - if (mode & FMODE_WRITE) - state->nwriters--; + switch (mode & (FMODE_READ | FMODE_WRITE)) { + case FMODE_READ: + state->n_rdonly--; + break; + case FMODE_WRITE: + state->n_wronly--; + break; + case FMODE_READ|FMODE_WRITE: + state->n_rdwr--; + } oldstate = newstate = state->state; - if (state->nreaders == 0) - newstate &= ~FMODE_READ; - if (state->nwriters == 0) - newstate &= ~FMODE_WRITE; + if (state->n_rdwr == 0) { + if (state->n_rdonly == 0) + newstate &= ~FMODE_READ; + if (state->n_wronly == 0) + newstate &= ~FMODE_WRITE; + } if (test_bit(NFS_DELEGATED_STATE, &state->flags)) { nfs4_state_set_mode_locked(state, newstate); oldstate = newstate; @@ -733,45 +752,43 @@ out: } static int reclaimer(void *); -struct reclaimer_args { - struct nfs4_client *clp; - struct completion complete; -}; + +static inline void nfs4_clear_recover_bit(struct nfs4_client *clp) +{ + smp_mb__before_clear_bit(); + clear_bit(NFS4CLNT_STATE_RECOVER, &clp->cl_state); + smp_mb__after_clear_bit(); + wake_up_bit(&clp->cl_state, NFS4CLNT_STATE_RECOVER); + rpc_wake_up(&clp->cl_rpcwaitq); +} /* * State recovery routine */ -void -nfs4_recover_state(void *data) +static void nfs4_recover_state(struct nfs4_client *clp) { - struct nfs4_client *clp = (struct nfs4_client *)data; - struct reclaimer_args args = { - .clp = clp, - }; - might_sleep(); - - init_completion(&args.complete); + struct task_struct *task; - if (kernel_thread(reclaimer, &args, CLONE_KERNEL) < 0) - goto out_failed_clear; - wait_for_completion(&args.complete); - return; -out_failed_clear: - set_bit(NFS4CLNT_OK, &clp->cl_state); - wake_up_all(&clp->cl_waitq); - rpc_wake_up(&clp->cl_rpcwaitq); + __module_get(THIS_MODULE); + atomic_inc(&clp->cl_count); + task = kthread_run(reclaimer, clp, "%u.%u.%u.%u-reclaim", + NIPQUAD(clp->cl_addr)); + if (!IS_ERR(task)) + return; + nfs4_clear_recover_bit(clp); + nfs4_put_client(clp); + module_put(THIS_MODULE); } /* * Schedule a state recovery attempt */ -void -nfs4_schedule_state_recovery(struct nfs4_client *clp) +void nfs4_schedule_state_recovery(struct nfs4_client *clp) { if (!clp) return; - if (test_and_clear_bit(NFS4CLNT_OK, &clp->cl_state)) - schedule_work(&clp->cl_recoverd); + if (test_and_set_bit(NFS4CLNT_STATE_RECOVER, &clp->cl_state) == 0) + nfs4_recover_state(clp); } static int nfs4_reclaim_locks(struct nfs4_state_recovery_ops *ops, struct nfs4_state *state) @@ -887,18 +904,14 @@ static void nfs4_state_mark_reclaim(stru static int reclaimer(void *ptr) { - struct reclaimer_args *args = (struct reclaimer_args *)ptr; - struct nfs4_client *clp = args->clp; + struct nfs4_client *clp = ptr; struct nfs4_state_owner *sp; struct nfs4_state_recovery_ops *ops; + struct rpc_cred *cred; int status = 0; - daemonize("%u.%u.%u.%u-reclaim", NIPQUAD(clp->cl_addr)); allow_signal(SIGKILL); - atomic_inc(&clp->cl_count); - complete(&args->complete); - /* Ensure exclusive access to NFSv4 state */ lock_kernel(); down_write(&clp->cl_sem); @@ -906,20 +919,33 @@ static int reclaimer(void *ptr) if (list_empty(&clp->cl_superblocks)) goto out; restart_loop: - status = nfs4_proc_renew(clp); - switch (status) { - case 0: - case -NFS4ERR_CB_PATH_DOWN: - goto out; - case -NFS4ERR_STALE_CLIENTID: - case -NFS4ERR_LEASE_MOVED: - ops = &nfs4_reboot_recovery_ops; - break; - default: - ops = &nfs4_network_partition_recovery_ops; - }; + ops = &nfs4_network_partition_recovery_ops; + /* Are there any open files on this volume? */ + cred = nfs4_get_renew_cred(clp); + if (cred != NULL) { + /* Yes there are: try to renew the old lease */ + status = nfs4_proc_renew(clp, cred); + switch (status) { + case 0: + case -NFS4ERR_CB_PATH_DOWN: + put_rpccred(cred); + goto out; + case -NFS4ERR_STALE_CLIENTID: + case -NFS4ERR_LEASE_MOVED: + ops = &nfs4_reboot_recovery_ops; + } + } else { + /* "reboot" to ensure we clear all state on the server */ + clp->cl_boot_time = CURRENT_TIME; + cred = nfs4_get_setclientid_cred(clp); + } + /* We're going to have to re-establish a clientid */ nfs4_state_mark_reclaim(clp); - status = __nfs4_init_client(clp); + status = -ENOENT; + if (cred != NULL) { + status = nfs4_init_client(clp, cred); + put_rpccred(cred); + } if (status) goto out_error; /* Mark all delegations for reclaim */ @@ -940,14 +966,13 @@ restart_loop: } nfs_delegation_reap_unclaimed(clp); out: - set_bit(NFS4CLNT_OK, &clp->cl_state); up_write(&clp->cl_sem); unlock_kernel(); - wake_up_all(&clp->cl_waitq); - rpc_wake_up(&clp->cl_rpcwaitq); if (status == -NFS4ERR_CB_PATH_DOWN) nfs_handle_cb_pathdown(clp); + nfs4_clear_recover_bit(clp); nfs4_put_client(clp); + module_put_and_exit(0); return 0; out_error: printk(KERN_WARNING "Error: state recovery failed on NFSv4 server %u.%u.%u.%u with error %d\n", diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index fbbace8..4bbf5ef 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -392,9 +392,11 @@ static int nfs_stat_to_errno(int); decode_getattr_maxsz) #define NFS4_enc_delegreturn_sz (compound_encode_hdr_maxsz + \ encode_putfh_maxsz + \ - encode_delegreturn_maxsz) + encode_delegreturn_maxsz + \ + encode_getattr_maxsz) #define NFS4_dec_delegreturn_sz (compound_decode_hdr_maxsz + \ - decode_delegreturn_maxsz) + decode_delegreturn_maxsz + \ + decode_getattr_maxsz) #define NFS4_enc_getacl_sz (compound_encode_hdr_maxsz + \ encode_putfh_maxsz + \ encode_getattr_maxsz) @@ -564,7 +566,7 @@ static int encode_attrs(struct xdr_strea } if (iap->ia_valid & ATTR_MODE) { bmval1 |= FATTR4_WORD1_MODE; - WRITE32(iap->ia_mode); + WRITE32(iap->ia_mode & S_IALLUGO); } if (iap->ia_valid & ATTR_UID) { bmval1 |= FATTR4_WORD1_OWNER; @@ -742,69 +744,80 @@ static int encode_link(struct xdr_stream return 0; } +static inline int nfs4_lock_type(struct file_lock *fl, int block) +{ + if ((fl->fl_type & (F_RDLCK|F_WRLCK|F_UNLCK)) == F_RDLCK) + return block ? NFS4_READW_LT : NFS4_READ_LT; + return block ? NFS4_WRITEW_LT : NFS4_WRITE_LT; +} + +static inline uint64_t nfs4_lock_length(struct file_lock *fl) +{ + if (fl->fl_end == OFFSET_MAX) + return ~(uint64_t)0; + return fl->fl_end - fl->fl_start + 1; +} + /* * opcode,type,reclaim,offset,length,new_lock_owner = 32 * open_seqid,open_stateid,lock_seqid,lock_owner.clientid, lock_owner.id = 40 */ -static int encode_lock(struct xdr_stream *xdr, const struct nfs_lockargs *arg) +static int encode_lock(struct xdr_stream *xdr, const struct nfs_lock_args *args) { uint32_t *p; - struct nfs_lock_opargs *opargs = arg->u.lock; RESERVE_SPACE(32); WRITE32(OP_LOCK); - WRITE32(arg->type); - WRITE32(opargs->reclaim); - WRITE64(arg->offset); - WRITE64(arg->length); - WRITE32(opargs->new_lock_owner); - if (opargs->new_lock_owner){ + WRITE32(nfs4_lock_type(args->fl, args->block)); + WRITE32(args->reclaim); + WRITE64(args->fl->fl_start); + WRITE64(nfs4_lock_length(args->fl)); + WRITE32(args->new_lock_owner); + if (args->new_lock_owner){ RESERVE_SPACE(40); - WRITE32(opargs->open_seqid->sequence->counter); - WRITEMEM(opargs->open_stateid->data, sizeof(opargs->open_stateid->data)); - WRITE32(opargs->lock_seqid->sequence->counter); - WRITE64(opargs->lock_owner.clientid); + WRITE32(args->open_seqid->sequence->counter); + WRITEMEM(args->open_stateid->data, sizeof(args->open_stateid->data)); + WRITE32(args->lock_seqid->sequence->counter); + WRITE64(args->lock_owner.clientid); WRITE32(4); - WRITE32(opargs->lock_owner.id); + WRITE32(args->lock_owner.id); } else { RESERVE_SPACE(20); - WRITEMEM(opargs->lock_stateid->data, sizeof(opargs->lock_stateid->data)); - WRITE32(opargs->lock_seqid->sequence->counter); + WRITEMEM(args->lock_stateid->data, sizeof(args->lock_stateid->data)); + WRITE32(args->lock_seqid->sequence->counter); } return 0; } -static int encode_lockt(struct xdr_stream *xdr, const struct nfs_lockargs *arg) +static int encode_lockt(struct xdr_stream *xdr, const struct nfs_lockt_args *args) { uint32_t *p; - struct nfs_lowner *opargs = arg->u.lockt; RESERVE_SPACE(40); WRITE32(OP_LOCKT); - WRITE32(arg->type); - WRITE64(arg->offset); - WRITE64(arg->length); - WRITE64(opargs->clientid); + WRITE32(nfs4_lock_type(args->fl, 0)); + WRITE64(args->fl->fl_start); + WRITE64(nfs4_lock_length(args->fl)); + WRITE64(args->lock_owner.clientid); WRITE32(4); - WRITE32(opargs->id); + WRITE32(args->lock_owner.id); return 0; } -static int encode_locku(struct xdr_stream *xdr, const struct nfs_lockargs *arg) +static int encode_locku(struct xdr_stream *xdr, const struct nfs_locku_args *args) { uint32_t *p; - struct nfs_locku_opargs *opargs = arg->u.locku; RESERVE_SPACE(44); WRITE32(OP_LOCKU); - WRITE32(arg->type); - WRITE32(opargs->seqid->sequence->counter); - WRITEMEM(opargs->stateid->data, sizeof(opargs->stateid->data)); - WRITE64(arg->offset); - WRITE64(arg->length); + WRITE32(nfs4_lock_type(args->fl, 0)); + WRITE32(args->seqid->sequence->counter); + WRITEMEM(args->stateid->data, sizeof(args->stateid->data)); + WRITE64(args->fl->fl_start); + WRITE64(nfs4_lock_length(args->fl)); return 0; } @@ -964,9 +977,9 @@ static int encode_open_confirm(struct xd { uint32_t *p; - RESERVE_SPACE(8+sizeof(arg->stateid.data)); + RESERVE_SPACE(8+sizeof(arg->stateid->data)); WRITE32(OP_OPEN_CONFIRM); - WRITEMEM(arg->stateid.data, sizeof(arg->stateid.data)); + WRITEMEM(arg->stateid->data, sizeof(arg->stateid->data)); WRITE32(arg->seqid->sequence->counter); return 0; @@ -1499,9 +1512,6 @@ static int nfs4_xdr_enc_open(struct rpc_ }; int status; - status = nfs_wait_on_sequence(args->seqid, req->rq_task); - if (status != 0) - goto out; xdr_init_encode(&xdr, &req->rq_snd_buf, p); encode_compound_hdr(&xdr, &hdr); status = encode_putfh(&xdr, args->fh); @@ -1538,9 +1548,6 @@ static int nfs4_xdr_enc_open_confirm(str }; int status; - status = nfs_wait_on_sequence(args->seqid, req->rq_task); - if (status != 0) - goto out; xdr_init_encode(&xdr, &req->rq_snd_buf, p); encode_compound_hdr(&xdr, &hdr); status = encode_putfh(&xdr, args->fh); @@ -1558,19 +1565,19 @@ static int nfs4_xdr_enc_open_noattr(stru { struct xdr_stream xdr; struct compound_hdr hdr = { - .nops = 2, + .nops = 3, }; int status; - status = nfs_wait_on_sequence(args->seqid, req->rq_task); - if (status != 0) - goto out; xdr_init_encode(&xdr, &req->rq_snd_buf, p); encode_compound_hdr(&xdr, &hdr); status = encode_putfh(&xdr, args->fh); if (status) goto out; status = encode_open(&xdr, args); + if (status) + goto out; + status = encode_getfattr(&xdr, args->bitmask); out: return status; } @@ -1602,21 +1609,14 @@ out: /* * Encode a LOCK request */ -static int nfs4_xdr_enc_lock(struct rpc_rqst *req, uint32_t *p, struct nfs_lockargs *args) +static int nfs4_xdr_enc_lock(struct rpc_rqst *req, uint32_t *p, struct nfs_lock_args *args) { struct xdr_stream xdr; struct compound_hdr hdr = { .nops = 2, }; - struct nfs_lock_opargs *opargs = args->u.lock; int status; - status = nfs_wait_on_sequence(opargs->lock_seqid, req->rq_task); - if (status != 0) - goto out; - /* Do we need to do an open_to_lock_owner? */ - if (opargs->lock_seqid->sequence->flags & NFS_SEQID_CONFIRMED) - opargs->new_lock_owner = 0; xdr_init_encode(&xdr, &req->rq_snd_buf, p); encode_compound_hdr(&xdr, &hdr); status = encode_putfh(&xdr, args->fh); @@ -1630,7 +1630,7 @@ out: /* * Encode a LOCKT request */ -static int nfs4_xdr_enc_lockt(struct rpc_rqst *req, uint32_t *p, struct nfs_lockargs *args) +static int nfs4_xdr_enc_lockt(struct rpc_rqst *req, uint32_t *p, struct nfs_lockt_args *args) { struct xdr_stream xdr; struct compound_hdr hdr = { @@ -1651,7 +1651,7 @@ out: /* * Encode a LOCKU request */ -static int nfs4_xdr_enc_locku(struct rpc_rqst *req, uint32_t *p, struct nfs_lockargs *args) +static int nfs4_xdr_enc_locku(struct rpc_rqst *req, uint32_t *p, struct nfs_locku_args *args) { struct xdr_stream xdr; struct compound_hdr hdr = { @@ -1985,14 +1985,20 @@ static int nfs4_xdr_enc_delegreturn(stru { struct xdr_stream xdr; struct compound_hdr hdr = { - .nops = 2, + .nops = 3, }; int status; xdr_init_encode(&xdr, &req->rq_snd_buf, p); encode_compound_hdr(&xdr, &hdr); - if ((status = encode_putfh(&xdr, args->fhandle)) == 0) - status = encode_delegreturn(&xdr, args->stateid); + status = encode_putfh(&xdr, args->fhandle); + if (status != 0) + goto out; + status = encode_delegreturn(&xdr, args->stateid); + if (status != 0) + goto out; + status = encode_getfattr(&xdr, args->bitmask); +out: return status; } @@ -2955,55 +2961,64 @@ static int decode_link(struct xdr_stream /* * We create the owner, so we know a proper owner.id length is 4. */ -static int decode_lock_denied (struct xdr_stream *xdr, struct nfs_lock_denied *denied) +static int decode_lock_denied (struct xdr_stream *xdr, struct file_lock *fl) { + uint64_t offset, length, clientid; uint32_t *p; - uint32_t namelen; + uint32_t namelen, type; READ_BUF(32); - READ64(denied->offset); - READ64(denied->length); - READ32(denied->type); - READ64(denied->owner.clientid); + READ64(offset); + READ64(length); + READ32(type); + if (fl != NULL) { + fl->fl_start = (loff_t)offset; + fl->fl_end = fl->fl_start + (loff_t)length - 1; + if (length == ~(uint64_t)0) + fl->fl_end = OFFSET_MAX; + fl->fl_type = F_WRLCK; + if (type & 1) + fl->fl_type = F_RDLCK; + fl->fl_pid = 0; + } + READ64(clientid); READ32(namelen); READ_BUF(namelen); - if (namelen == 4) - READ32(denied->owner.id); return -NFS4ERR_DENIED; } -static int decode_lock(struct xdr_stream *xdr, struct nfs_lockres *res) +static int decode_lock(struct xdr_stream *xdr, struct nfs_lock_res *res) { uint32_t *p; int status; status = decode_op_hdr(xdr, OP_LOCK); if (status == 0) { - READ_BUF(sizeof(res->u.stateid.data)); - COPYMEM(res->u.stateid.data, sizeof(res->u.stateid.data)); + READ_BUF(sizeof(res->stateid.data)); + COPYMEM(res->stateid.data, sizeof(res->stateid.data)); } else if (status == -NFS4ERR_DENIED) - return decode_lock_denied(xdr, &res->u.denied); + return decode_lock_denied(xdr, NULL); return status; } -static int decode_lockt(struct xdr_stream *xdr, struct nfs_lockres *res) +static int decode_lockt(struct xdr_stream *xdr, struct nfs_lockt_res *res) { int status; status = decode_op_hdr(xdr, OP_LOCKT); if (status == -NFS4ERR_DENIED) - return decode_lock_denied(xdr, &res->u.denied); + return decode_lock_denied(xdr, res->denied); return status; } -static int decode_locku(struct xdr_stream *xdr, struct nfs_lockres *res) +static int decode_locku(struct xdr_stream *xdr, struct nfs_locku_res *res) { uint32_t *p; int status; status = decode_op_hdr(xdr, OP_LOCKU); if (status == 0) { - READ_BUF(sizeof(res->u.stateid.data)); - COPYMEM(res->u.stateid.data, sizeof(res->u.stateid.data)); + READ_BUF(sizeof(res->stateid.data)); + COPYMEM(res->stateid.data, sizeof(res->stateid.data)); } return status; } @@ -3831,6 +3846,9 @@ static int nfs4_xdr_dec_open_noattr(stru if (status) goto out; status = decode_open(&xdr, res); + if (status) + goto out; + decode_getfattr(&xdr, res->f_attr, res->server); out: return status; } @@ -3864,7 +3882,7 @@ out: /* * Decode LOCK response */ -static int nfs4_xdr_dec_lock(struct rpc_rqst *rqstp, uint32_t *p, struct nfs_lockres *res) +static int nfs4_xdr_dec_lock(struct rpc_rqst *rqstp, uint32_t *p, struct nfs_lock_res *res) { struct xdr_stream xdr; struct compound_hdr hdr; @@ -3885,7 +3903,7 @@ out: /* * Decode LOCKT response */ -static int nfs4_xdr_dec_lockt(struct rpc_rqst *rqstp, uint32_t *p, struct nfs_lockres *res) +static int nfs4_xdr_dec_lockt(struct rpc_rqst *rqstp, uint32_t *p, struct nfs_lockt_res *res) { struct xdr_stream xdr; struct compound_hdr hdr; @@ -3906,7 +3924,7 @@ out: /* * Decode LOCKU response */ -static int nfs4_xdr_dec_locku(struct rpc_rqst *rqstp, uint32_t *p, struct nfs_lockres *res) +static int nfs4_xdr_dec_locku(struct rpc_rqst *rqstp, uint32_t *p, struct nfs_locku_res *res) { struct xdr_stream xdr; struct compound_hdr hdr; @@ -4174,7 +4192,7 @@ static int nfs4_xdr_dec_setclientid_conf /* * DELEGRETURN request */ -static int nfs4_xdr_dec_delegreturn(struct rpc_rqst *rqstp, uint32_t *p, void *dummy) +static int nfs4_xdr_dec_delegreturn(struct rpc_rqst *rqstp, uint32_t *p, struct nfs4_delegreturnres *res) { struct xdr_stream xdr; struct compound_hdr hdr; @@ -4182,11 +4200,14 @@ static int nfs4_xdr_dec_delegreturn(stru xdr_init_decode(&xdr, &rqstp->rq_rcv_buf, p); status = decode_compound_hdr(&xdr, &hdr); - if (status == 0) { - status = decode_putfh(&xdr); - if (status == 0) - status = decode_delegreturn(&xdr); - } + if (status != 0) + goto out; + status = decode_putfh(&xdr); + if (status != 0) + goto out; + status = decode_delegreturn(&xdr); + decode_getfattr(&xdr, res->fattr, res->server); +out: return status; } diff --git a/fs/nfs/nfsroot.c b/fs/nfs/nfsroot.c index 1b272a1..985cc53 100644 --- a/fs/nfs/nfsroot.c +++ b/fs/nfs/nfsroot.c @@ -296,8 +296,8 @@ static int __init root_nfs_name(char *na nfs_port = -1; nfs_data.version = NFS_MOUNT_VERSION; nfs_data.flags = NFS_MOUNT_NONLM; /* No lockd in nfs root yet */ - nfs_data.rsize = NFS_DEF_FILE_IO_BUFFER_SIZE; - nfs_data.wsize = NFS_DEF_FILE_IO_BUFFER_SIZE; + nfs_data.rsize = NFS_DEF_FILE_IO_SIZE; + nfs_data.wsize = NFS_DEF_FILE_IO_SIZE; nfs_data.acregmin = 3; nfs_data.acregmax = 60; nfs_data.acdirmin = 30; diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c index e1e3ca5..f5150d7 100644 --- a/fs/nfs/proc.c +++ b/fs/nfs/proc.c @@ -111,6 +111,9 @@ nfs_proc_setattr(struct dentry *dentry, }; int status; + /* Mask out the non-modebit related stuff from attr->ia_mode */ + sattr->ia_mode &= S_IALLUGO; + dprintk("NFS call setattr\n"); nfs_fattr_init(fattr); status = rpc_call(NFS_CLIENT(inode), NFSPROC_SETATTR, &arg, fattr, 0); @@ -547,10 +550,9 @@ nfs_proc_pathconf(struct nfs_server *ser extern u32 * nfs_decode_dirent(u32 *, struct nfs_entry *, int); -static void -nfs_read_done(struct rpc_task *task) +static void nfs_read_done(struct rpc_task *task, void *calldata) { - struct nfs_read_data *data = (struct nfs_read_data *) task->tk_calldata; + struct nfs_read_data *data = calldata; if (task->tk_status >= 0) { nfs_refresh_inode(data->inode, data->res.fattr); @@ -560,9 +562,14 @@ nfs_read_done(struct rpc_task *task) if (data->args.offset + data->args.count >= data->res.fattr->size) data->res.eof = 1; } - nfs_readpage_result(task); + nfs_readpage_result(task, calldata); } +static const struct rpc_call_ops nfs_read_ops = { + .rpc_call_done = nfs_read_done, + .rpc_release = nfs_readdata_release, +}; + static void nfs_proc_read_setup(struct nfs_read_data *data) { @@ -580,20 +587,24 @@ nfs_proc_read_setup(struct nfs_read_data flags = RPC_TASK_ASYNC | (IS_SWAPFILE(inode)? NFS_RPC_SWAPFLAGS : 0); /* Finalize the task. */ - rpc_init_task(task, NFS_CLIENT(inode), nfs_read_done, flags); + rpc_init_task(task, NFS_CLIENT(inode), flags, &nfs_read_ops, data); rpc_call_setup(task, &msg, 0); } -static void -nfs_write_done(struct rpc_task *task) +static void nfs_write_done(struct rpc_task *task, void *calldata) { - struct nfs_write_data *data = (struct nfs_write_data *) task->tk_calldata; + struct nfs_write_data *data = calldata; if (task->tk_status >= 0) nfs_post_op_update_inode(data->inode, data->res.fattr); - nfs_writeback_done(task); + nfs_writeback_done(task, calldata); } +static const struct rpc_call_ops nfs_write_ops = { + .rpc_call_done = nfs_write_done, + .rpc_release = nfs_writedata_release, +}; + static void nfs_proc_write_setup(struct nfs_write_data *data, int how) { @@ -614,7 +625,7 @@ nfs_proc_write_setup(struct nfs_write_da flags = (how & FLUSH_SYNC) ? 0 : RPC_TASK_ASYNC; /* Finalize the task. */ - rpc_init_task(task, NFS_CLIENT(inode), nfs_write_done, flags); + rpc_init_task(task, NFS_CLIENT(inode), flags, &nfs_write_ops, data); rpc_call_setup(task, &msg, 0); } diff --git a/fs/nfs/read.c b/fs/nfs/read.c index 5f20eaf..05eb43f 100644 --- a/fs/nfs/read.c +++ b/fs/nfs/read.c @@ -42,9 +42,8 @@ mempool_t *nfs_rdata_mempool; #define MIN_POOL_READ (32) -void nfs_readdata_release(struct rpc_task *task) +void nfs_readdata_release(void *data) { - struct nfs_read_data *data = (struct nfs_read_data *)task->tk_calldata; nfs_readdata_free(data); } @@ -84,7 +83,7 @@ static int nfs_readpage_sync(struct nfs_ int result; struct nfs_read_data *rdata; - rdata = nfs_readdata_alloc(); + rdata = nfs_readdata_alloc(1); if (!rdata) return -ENOMEM; @@ -220,9 +219,6 @@ static void nfs_read_rpcsetup(struct nfs NFS_PROTO(inode)->read_setup(data); data->task.tk_cookie = (unsigned long)inode; - data->task.tk_calldata = data; - /* Release requests */ - data->task.tk_release = nfs_readdata_release; dprintk("NFS: %4d initiated read call (req %s/%Ld, %u bytes @ offset %Lu)\n", data->task.tk_pid, @@ -287,7 +283,7 @@ static int nfs_pagein_multi(struct list_ nbytes = req->wb_bytes; for(;;) { - data = nfs_readdata_alloc(); + data = nfs_readdata_alloc(1); if (!data) goto out_bad; INIT_LIST_HEAD(&data->pages); @@ -343,7 +339,7 @@ static int nfs_pagein_one(struct list_he if (NFS_SERVER(inode)->rsize < PAGE_CACHE_SIZE) return nfs_pagein_multi(head, inode); - data = nfs_readdata_alloc(); + data = nfs_readdata_alloc(NFS_SERVER(inode)->rpages); if (!data) goto out_bad; @@ -452,9 +448,9 @@ static void nfs_readpage_result_full(str * This is the callback from RPC telling us whether a reply was * received or some error occurred (timeout or socket shutdown). */ -void nfs_readpage_result(struct rpc_task *task) +void nfs_readpage_result(struct rpc_task *task, void *calldata) { - struct nfs_read_data *data = (struct nfs_read_data *)task->tk_calldata; + struct nfs_read_data *data = calldata; struct nfs_readargs *argp = &data->args; struct nfs_readres *resp = &data->res; int status = task->tk_status; diff --git a/fs/nfs/sysctl.c b/fs/nfs/sysctl.c new file mode 100644 index 0000000..4c486eb --- /dev/null +++ b/fs/nfs/sysctl.c @@ -0,0 +1,84 @@ +/* + * linux/fs/nfs/sysctl.c + * + * Sysctl interface to NFS parameters + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "callback.h" + +static const int nfs_set_port_min = 0; +static const int nfs_set_port_max = 65535; +static struct ctl_table_header *nfs_callback_sysctl_table; +/* + * Something that isn't CTL_ANY, CTL_NONE or a value that may clash. + * Use the same values as fs/lockd/svc.c + */ +#define CTL_UNNUMBERED -2 + +static ctl_table nfs_cb_sysctls[] = { +#ifdef CONFIG_NFS_V4 + { + .ctl_name = CTL_UNNUMBERED, + .procname = "nfs_callback_tcpport", + .data = &nfs_callback_set_tcpport, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec_minmax, + .extra1 = (int *)&nfs_set_port_min, + .extra2 = (int *)&nfs_set_port_max, + }, + { + .ctl_name = CTL_UNNUMBERED, + .procname = "idmap_cache_timeout", + .data = &nfs_idmap_cache_timeout, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec_jiffies, + .strategy = &sysctl_jiffies, + }, +#endif + { .ctl_name = 0 } +}; + +static ctl_table nfs_cb_sysctl_dir[] = { + { + .ctl_name = CTL_UNNUMBERED, + .procname = "nfs", + .mode = 0555, + .child = nfs_cb_sysctls, + }, + { .ctl_name = 0 } +}; + +static ctl_table nfs_cb_sysctl_root[] = { + { + .ctl_name = CTL_FS, + .procname = "fs", + .mode = 0555, + .child = nfs_cb_sysctl_dir, + }, + { .ctl_name = 0 } +}; + +int nfs_register_sysctl(void) +{ + nfs_callback_sysctl_table = register_sysctl_table(nfs_cb_sysctl_root, 0); + if (nfs_callback_sysctl_table == NULL) + return -ENOMEM; + return 0; +} + +void nfs_unregister_sysctl(void) +{ + unregister_sysctl_table(nfs_callback_sysctl_table); + nfs_callback_sysctl_table = NULL; +} diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c index d639d17..a65c7b5 100644 --- a/fs/nfs/unlink.c +++ b/fs/nfs/unlink.c @@ -87,10 +87,9 @@ nfs_copy_dname(struct dentry *dentry, st * We delay initializing RPC info until after the call to dentry_iput() * in order to minimize races against rename(). */ -static void -nfs_async_unlink_init(struct rpc_task *task) +static void nfs_async_unlink_init(struct rpc_task *task, void *calldata) { - struct nfs_unlinkdata *data = (struct nfs_unlinkdata *)task->tk_calldata; + struct nfs_unlinkdata *data = calldata; struct dentry *dir = data->dir; struct rpc_message msg = { .rpc_cred = data->cred, @@ -116,10 +115,9 @@ nfs_async_unlink_init(struct rpc_task *t * * Do the directory attribute update. */ -static void -nfs_async_unlink_done(struct rpc_task *task) +static void nfs_async_unlink_done(struct rpc_task *task, void *calldata) { - struct nfs_unlinkdata *data = (struct nfs_unlinkdata *)task->tk_calldata; + struct nfs_unlinkdata *data = calldata; struct dentry *dir = data->dir; struct inode *dir_i; @@ -141,13 +139,18 @@ nfs_async_unlink_done(struct rpc_task *t * We need to call nfs_put_unlinkdata as a 'tk_release' task since the * rpc_task would be freed too. */ -static void -nfs_async_unlink_release(struct rpc_task *task) +static void nfs_async_unlink_release(void *calldata) { - struct nfs_unlinkdata *data = (struct nfs_unlinkdata *)task->tk_calldata; + struct nfs_unlinkdata *data = calldata; nfs_put_unlinkdata(data); } +static const struct rpc_call_ops nfs_unlink_ops = { + .rpc_call_prepare = nfs_async_unlink_init, + .rpc_call_done = nfs_async_unlink_done, + .rpc_release = nfs_async_unlink_release, +}; + /** * nfs_async_unlink - asynchronous unlinking of a file * @dentry: dentry to unlink @@ -157,7 +160,6 @@ nfs_async_unlink(struct dentry *dentry) { struct dentry *dir = dentry->d_parent; struct nfs_unlinkdata *data; - struct rpc_task *task; struct rpc_clnt *clnt = NFS_CLIENT(dir->d_inode); int status = -ENOMEM; @@ -178,17 +180,13 @@ nfs_async_unlink(struct dentry *dentry) nfs_deletes = data; data->count = 1; - task = &data->task; - rpc_init_task(task, clnt, nfs_async_unlink_done , RPC_TASK_ASYNC); - task->tk_calldata = data; - task->tk_action = nfs_async_unlink_init; - task->tk_release = nfs_async_unlink_release; + rpc_init_task(&data->task, clnt, RPC_TASK_ASYNC, &nfs_unlink_ops, data); spin_lock(&dentry->d_lock); dentry->d_flags |= DCACHE_NFSFS_RENAMED; spin_unlock(&dentry->d_lock); - rpc_sleep_on(&nfs_delete_queue, task, NULL, NULL); + rpc_sleep_on(&nfs_delete_queue, &data->task, NULL, NULL); status = 0; out: return status; diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 3107908..9449b68 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -89,24 +89,38 @@ static mempool_t *nfs_commit_mempool; static DECLARE_WAIT_QUEUE_HEAD(nfs_write_congestion); -static inline struct nfs_write_data *nfs_commit_alloc(void) +static inline struct nfs_write_data *nfs_commit_alloc(unsigned int pagecount) { struct nfs_write_data *p = mempool_alloc(nfs_commit_mempool, SLAB_NOFS); + if (p) { memset(p, 0, sizeof(*p)); INIT_LIST_HEAD(&p->pages); + if (pagecount < NFS_PAGEVEC_SIZE) + p->pagevec = &p->page_array[0]; + else { + size_t size = ++pagecount * sizeof(struct page *); + p->pagevec = kmalloc(size, GFP_NOFS); + if (p->pagevec) { + memset(p->pagevec, 0, size); + } else { + mempool_free(p, nfs_commit_mempool); + p = NULL; + } + } } return p; } static inline void nfs_commit_free(struct nfs_write_data *p) { + if (p && (p->pagevec != &p->page_array[0])) + kfree(p->pagevec); mempool_free(p, nfs_commit_mempool); } -static void nfs_writedata_release(struct rpc_task *task) +void nfs_writedata_release(void *wdata) { - struct nfs_write_data *wdata = (struct nfs_write_data *)task->tk_calldata; nfs_writedata_free(wdata); } @@ -168,7 +182,7 @@ static int nfs_writepage_sync(struct nfs int result, written = 0; struct nfs_write_data *wdata; - wdata = nfs_writedata_alloc(); + wdata = nfs_writedata_alloc(1); if (!wdata) return -ENOMEM; @@ -232,19 +246,16 @@ static int nfs_writepage_async(struct nf unsigned int offset, unsigned int count) { struct nfs_page *req; - int status; req = nfs_update_request(ctx, inode, page, offset, count); - status = (IS_ERR(req)) ? PTR_ERR(req) : 0; - if (status < 0) - goto out; + if (IS_ERR(req)) + return PTR_ERR(req); /* Update file length */ nfs_grow_file(page, offset, count); /* Set the PG_uptodate flag? */ nfs_mark_uptodate(page, offset, count); nfs_unlock_request(req); - out: - return status; + return 0; } static int wb_priority(struct writeback_control *wbc) @@ -304,11 +315,8 @@ do_it: lock_kernel(); if (!IS_SYNC(inode) && inode_referenced) { err = nfs_writepage_async(ctx, inode, page, 0, offset); - if (err >= 0) { - err = 0; - if (wbc->for_reclaim) - nfs_flush_inode(inode, 0, 0, FLUSH_STABLE); - } + if (!wbc->for_writepages) + nfs_flush_inode(inode, 0, 0, wb_priority(wbc)); } else { err = nfs_writepage_sync(ctx, inode, page, 0, offset, priority); @@ -877,9 +885,6 @@ static void nfs_write_rpcsetup(struct nf data->task.tk_priority = flush_task_priority(how); data->task.tk_cookie = (unsigned long)inode; - data->task.tk_calldata = data; - /* Release requests */ - data->task.tk_release = nfs_writedata_release; dprintk("NFS: %4d initiated write call (req %s/%Ld, %u bytes @ offset %Lu)\n", data->task.tk_pid, @@ -919,7 +924,7 @@ static int nfs_flush_multi(struct list_h nbytes = req->wb_bytes; for (;;) { - data = nfs_writedata_alloc(); + data = nfs_writedata_alloc(1); if (!data) goto out_bad; list_add(&data->pages, &list); @@ -983,7 +988,7 @@ static int nfs_flush_one(struct list_hea if (NFS_SERVER(inode)->wsize < PAGE_CACHE_SIZE) return nfs_flush_multi(head, inode, how); - data = nfs_writedata_alloc(); + data = nfs_writedata_alloc(NFS_SERVER(inode)->wpages); if (!data) goto out_bad; @@ -1137,9 +1142,9 @@ static void nfs_writeback_done_full(stru /* * This function is called when the WRITE call is complete. */ -void nfs_writeback_done(struct rpc_task *task) +void nfs_writeback_done(struct rpc_task *task, void *calldata) { - struct nfs_write_data *data = (struct nfs_write_data *) task->tk_calldata; + struct nfs_write_data *data = calldata; struct nfs_writeargs *argp = &data->args; struct nfs_writeres *resp = &data->res; @@ -1206,9 +1211,8 @@ void nfs_writeback_done(struct rpc_task #if defined(CONFIG_NFS_V3) || defined(CONFIG_NFS_V4) -static void nfs_commit_release(struct rpc_task *task) +void nfs_commit_release(void *wdata) { - struct nfs_write_data *wdata = (struct nfs_write_data *)task->tk_calldata; nfs_commit_free(wdata); } @@ -1244,9 +1248,6 @@ static void nfs_commit_rpcsetup(struct l data->task.tk_priority = flush_task_priority(how); data->task.tk_cookie = (unsigned long)inode; - data->task.tk_calldata = data; - /* Release requests */ - data->task.tk_release = nfs_commit_release; dprintk("NFS: %4d initiated commit call\n", data->task.tk_pid); } @@ -1255,12 +1256,12 @@ static void nfs_commit_rpcsetup(struct l * Commit dirty pages */ static int -nfs_commit_list(struct list_head *head, int how) +nfs_commit_list(struct inode *inode, struct list_head *head, int how) { struct nfs_write_data *data; struct nfs_page *req; - data = nfs_commit_alloc(); + data = nfs_commit_alloc(NFS_SERVER(inode)->wpages); if (!data) goto out_bad; @@ -1283,10 +1284,9 @@ nfs_commit_list(struct list_head *head, /* * COMMIT call returned */ -void -nfs_commit_done(struct rpc_task *task) +void nfs_commit_done(struct rpc_task *task, void *calldata) { - struct nfs_write_data *data = (struct nfs_write_data *)task->tk_calldata; + struct nfs_write_data *data = calldata; struct nfs_page *req; int res = 0; @@ -1366,7 +1366,7 @@ int nfs_commit_inode(struct inode *inode res = nfs_scan_commit(inode, &head, 0, 0); spin_unlock(&nfsi->req_lock); if (res) { - error = nfs_commit_list(&head, how); + error = nfs_commit_list(inode, &head, how); if (error < 0) return error; } @@ -1377,22 +1377,23 @@ int nfs_commit_inode(struct inode *inode int nfs_sync_inode(struct inode *inode, unsigned long idx_start, unsigned int npages, int how) { - int error, - wait; + int nocommit = how & FLUSH_NOCOMMIT; + int wait = how & FLUSH_WAIT; + int error; - wait = how & FLUSH_WAIT; - how &= ~FLUSH_WAIT; + how &= ~(FLUSH_WAIT|FLUSH_NOCOMMIT); do { - error = 0; - if (wait) + if (wait) { error = nfs_wait_on_requests(inode, idx_start, npages); - if (error == 0) - error = nfs_flush_inode(inode, idx_start, npages, how); -#if defined(CONFIG_NFS_V3) || defined(CONFIG_NFS_V4) - if (error == 0) + if (error != 0) + continue; + } + error = nfs_flush_inode(inode, idx_start, npages, how); + if (error != 0) + continue; + if (!nocommit) error = nfs_commit_inode(inode, how); -#endif } while (error > 0); return error; } diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c index 583c071..d828662 100644 --- a/fs/nfsd/nfs4callback.c +++ b/fs/nfsd/nfs4callback.c @@ -53,7 +53,7 @@ #define NFSPROC4_CB_COMPOUND 1 /* declarations */ -static void nfs4_cb_null(struct rpc_task *task); +static const struct rpc_call_ops nfs4_cb_null_ops; /* Index of predefined Linux callback client operations */ @@ -431,7 +431,6 @@ nfsd4_probe_callback(struct nfs4_client } clnt->cl_intr = 0; clnt->cl_softrtry = 1; - clnt->cl_chatty = 1; /* Kick rpciod, put the call on the wire. */ @@ -447,7 +446,7 @@ nfsd4_probe_callback(struct nfs4_client msg.rpc_cred = nfsd4_lookupcred(clp,0); if (IS_ERR(msg.rpc_cred)) goto out_rpciod; - status = rpc_call_async(clnt, &msg, RPC_TASK_ASYNC, nfs4_cb_null, NULL); + status = rpc_call_async(clnt, &msg, RPC_TASK_ASYNC, &nfs4_cb_null_ops, NULL); put_rpccred(msg.rpc_cred); if (status != 0) { @@ -469,7 +468,7 @@ out_err: } static void -nfs4_cb_null(struct rpc_task *task) +nfs4_cb_null(struct rpc_task *task, void *dummy) { struct nfs4_client *clp = (struct nfs4_client *)task->tk_msg.rpc_argp; struct nfs4_callback *cb = &clp->cl_callback; @@ -488,6 +487,10 @@ out: put_nfs4_client(clp); } +static const struct rpc_call_ops nfs4_cb_null_ops = { + .rpc_call_done = nfs4_cb_null, +}; + /* * called with dp->dl_count inc'ed. * nfs4_lock_state() may or may not have been called. diff --git a/include/linux/fs.h b/include/linux/fs.h index cc35b6a..a947734 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -729,7 +729,7 @@ extern struct file_lock *posix_test_lock extern int posix_lock_file(struct file *, struct file_lock *); extern int posix_lock_file_wait(struct file *, struct file_lock *); extern void posix_block_lock(struct file_lock *, struct file_lock *); -extern void posix_unblock_lock(struct file *, struct file_lock *); +extern int posix_unblock_lock(struct file *, struct file_lock *); extern int posix_locks_deadlock(struct file_lock *, struct file_lock *); extern int flock_lock_file_wait(struct file *filp, struct file_lock *fl); extern int __break_lease(struct inode *inode, unsigned int flags); diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h index 16d4e5a..95c8fea 100644 --- a/include/linux/lockd/lockd.h +++ b/include/linux/lockd/lockd.h @@ -172,7 +172,7 @@ extern struct nlm_host *nlm_find_client( /* * Server-side lock handling */ -int nlmsvc_async_call(struct nlm_rqst *, u32, rpc_action); +int nlmsvc_async_call(struct nlm_rqst *, u32, const struct rpc_call_ops *); u32 nlmsvc_lock(struct svc_rqst *, struct nlm_file *, struct nlm_lock *, int, struct nlm_cookie *); u32 nlmsvc_unlock(struct nlm_file *, struct nlm_lock *); diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 2516ade..547d649 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -38,9 +38,6 @@ # define NFS_DEBUG #endif -#define NFS_MAX_FILE_IO_BUFFER_SIZE 32768 -#define NFS_DEF_FILE_IO_BUFFER_SIZE 4096 - /* Default timeout values */ #define NFS_MAX_UDP_TIMEOUT (60*HZ) #define NFS_MAX_TCP_TIMEOUT (600*HZ) @@ -65,6 +62,7 @@ #define FLUSH_STABLE 4 /* commit to stable storage */ #define FLUSH_LOWPRI 8 /* low priority background flush */ #define FLUSH_HIGHPRI 16 /* high priority memory reclaim flush */ +#define FLUSH_NOCOMMIT 32 /* Don't send the NFSv3/v4 COMMIT */ #ifdef __KERNEL__ @@ -394,6 +392,17 @@ extern int nfs_instantiate(struct dentry extern struct inode_operations nfs_symlink_inode_operations; /* + * linux/fs/nfs/sysctl.c + */ +#ifdef CONFIG_SYSCTL +extern int nfs_register_sysctl(void); +extern void nfs_unregister_sysctl(void); +#else +#define nfs_register_sysctl() do { } while(0) +#define nfs_unregister_sysctl() do { } while(0) +#endif + +/* * linux/fs/nfs/unlink.c */ extern int nfs_async_unlink(struct dentry *); @@ -406,10 +415,12 @@ extern int nfs_writepage(struct page *p extern int nfs_writepages(struct address_space *, struct writeback_control *); extern int nfs_flush_incompatible(struct file *file, struct page *page); extern int nfs_updatepage(struct file *, struct page *, unsigned int, unsigned int); -extern void nfs_writeback_done(struct rpc_task *task); +extern void nfs_writeback_done(struct rpc_task *task, void *data); +extern void nfs_writedata_release(void *data); #if defined(CONFIG_NFS_V3) || defined(CONFIG_NFS_V4) -extern void nfs_commit_done(struct rpc_task *); +extern void nfs_commit_done(struct rpc_task *, void *data); +extern void nfs_commit_release(void *data); #endif /* @@ -460,18 +471,33 @@ static inline int nfs_wb_page(struct ino */ extern mempool_t *nfs_wdata_mempool; -static inline struct nfs_write_data *nfs_writedata_alloc(void) +static inline struct nfs_write_data *nfs_writedata_alloc(unsigned int pagecount) { struct nfs_write_data *p = mempool_alloc(nfs_wdata_mempool, SLAB_NOFS); + if (p) { memset(p, 0, sizeof(*p)); INIT_LIST_HEAD(&p->pages); + if (pagecount < NFS_PAGEVEC_SIZE) + p->pagevec = &p->page_array[0]; + else { + size_t size = ++pagecount * sizeof(struct page *); + p->pagevec = kmalloc(size, GFP_NOFS); + if (p->pagevec) { + memset(p->pagevec, 0, size); + } else { + mempool_free(p, nfs_wdata_mempool); + p = NULL; + } + } } return p; } static inline void nfs_writedata_free(struct nfs_write_data *p) { + if (p && (p->pagevec != &p->page_array[0])) + kfree(p->pagevec); mempool_free(p, nfs_wdata_mempool); } @@ -481,28 +507,45 @@ static inline void nfs_writedata_free(st extern int nfs_readpage(struct file *, struct page *); extern int nfs_readpages(struct file *, struct address_space *, struct list_head *, unsigned); -extern void nfs_readpage_result(struct rpc_task *); +extern void nfs_readpage_result(struct rpc_task *, void *); +extern void nfs_readdata_release(void *data); + /* * Allocate and free nfs_read_data structures */ extern mempool_t *nfs_rdata_mempool; -static inline struct nfs_read_data *nfs_readdata_alloc(void) +static inline struct nfs_read_data *nfs_readdata_alloc(unsigned int pagecount) { struct nfs_read_data *p = mempool_alloc(nfs_rdata_mempool, SLAB_NOFS); - if (p) + + if (p) { memset(p, 0, sizeof(*p)); + INIT_LIST_HEAD(&p->pages); + if (pagecount < NFS_PAGEVEC_SIZE) + p->pagevec = &p->page_array[0]; + else { + size_t size = ++pagecount * sizeof(struct page *); + p->pagevec = kmalloc(size, GFP_NOFS); + if (p->pagevec) { + memset(p->pagevec, 0, size); + } else { + mempool_free(p, nfs_rdata_mempool); + p = NULL; + } + } + } return p; } static inline void nfs_readdata_free(struct nfs_read_data *p) { + if (p && (p->pagevec != &p->page_array[0])) + kfree(p->pagevec); mempool_free(p, nfs_rdata_mempool); } -extern void nfs_readdata_release(struct rpc_task *task); - /* * linux/fs/nfs3proc.c */ diff --git a/include/linux/nfs_idmap.h b/include/linux/nfs_idmap.h index a0f1f25..102e560 100644 --- a/include/linux/nfs_idmap.h +++ b/include/linux/nfs_idmap.h @@ -71,6 +71,8 @@ int nfs_map_name_to_uid(struct nfs4_clie int nfs_map_group_to_gid(struct nfs4_client *, const char *, size_t, __u32 *); int nfs_map_uid_to_name(struct nfs4_client *, __u32, char *); int nfs_map_gid_to_group(struct nfs4_client *, __u32, char *); + +extern unsigned int nfs_idmap_cache_timeout; #endif /* __KERNEL__ */ #endif /* NFS_IDMAP_H */ diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h index da2e077..66e2ed6 100644 --- a/include/linux/nfs_page.h +++ b/include/linux/nfs_page.h @@ -79,9 +79,7 @@ extern void nfs_clear_page_writeback(st static inline int nfs_lock_request_dontget(struct nfs_page *req) { - if (test_and_set_bit(PG_BUSY, &req->wb_flags)) - return 0; - return 1; + return !test_and_set_bit(PG_BUSY, &req->wb_flags); } /* @@ -125,9 +123,7 @@ nfs_list_remove_request(struct nfs_page static inline int nfs_defer_commit(struct nfs_page *req) { - if (test_and_set_bit(PG_NEED_COMMIT, &req->wb_flags)) - return 0; - return 1; + return !test_and_set_bit(PG_NEED_COMMIT, &req->wb_flags); } static inline void @@ -141,9 +137,7 @@ nfs_clear_commit(struct nfs_page *req) static inline int nfs_defer_reschedule(struct nfs_page *req) { - if (test_and_set_bit(PG_NEED_RESCHED, &req->wb_flags)) - return 0; - return 1; + return !test_and_set_bit(PG_NEED_RESCHED, &req->wb_flags); } static inline void diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index 4071866..6d6f69e 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -4,6 +4,16 @@ #include #include +/* + * To change the maximum rsize and wsize supported by the NFS client, adjust + * NFS_MAX_FILE_IO_SIZE. 64KB is a typical maximum, but some servers can + * support a megabyte or more. The default is left at 4096 bytes, which is + * reasonable for NFS over UDP. + */ +#define NFS_MAX_FILE_IO_SIZE (1048576U) +#define NFS_DEF_FILE_IO_SIZE (4096U) +#define NFS_MIN_FILE_IO_SIZE (1024U) + struct nfs4_fsid { __u64 major; __u64 minor; @@ -137,7 +147,7 @@ struct nfs_openres { */ struct nfs_open_confirmargs { const struct nfs_fh * fh; - nfs4_stateid stateid; + nfs4_stateid * stateid; struct nfs_seqid * seqid; }; @@ -165,66 +175,62 @@ struct nfs_closeres { * * Arguments to the lock,lockt, and locku call. * */ struct nfs_lowner { - __u64 clientid; - u32 id; + __u64 clientid; + u32 id; }; -struct nfs_lock_opargs { +struct nfs_lock_args { + struct nfs_fh * fh; + struct file_lock * fl; struct nfs_seqid * lock_seqid; nfs4_stateid * lock_stateid; struct nfs_seqid * open_seqid; nfs4_stateid * open_stateid; - struct nfs_lowner lock_owner; - __u32 reclaim; - __u32 new_lock_owner; + struct nfs_lowner lock_owner; + unsigned char block : 1; + unsigned char reclaim : 1; + unsigned char new_lock_owner : 1; +}; + +struct nfs_lock_res { + nfs4_stateid stateid; }; -struct nfs_locku_opargs { +struct nfs_locku_args { + struct nfs_fh * fh; + struct file_lock * fl; struct nfs_seqid * seqid; nfs4_stateid * stateid; }; -struct nfs_lockargs { - struct nfs_fh * fh; - __u32 type; - __u64 offset; - __u64 length; - union { - struct nfs_lock_opargs *lock; /* LOCK */ - struct nfs_lowner *lockt; /* LOCKT */ - struct nfs_locku_opargs *locku; /* LOCKU */ - } u; +struct nfs_locku_res { + nfs4_stateid stateid; }; -struct nfs_lock_denied { - __u64 offset; - __u64 length; - __u32 type; - struct nfs_lowner owner; +struct nfs_lockt_args { + struct nfs_fh * fh; + struct file_lock * fl; + struct nfs_lowner lock_owner; }; -struct nfs_lockres { - union { - nfs4_stateid stateid;/* LOCK success, LOCKU */ - struct nfs_lock_denied denied; /* LOCK failed, LOCKT success */ - } u; - const struct nfs_server * server; +struct nfs_lockt_res { + struct file_lock * denied; /* LOCK, LOCKT failed */ }; struct nfs4_delegreturnargs { const struct nfs_fh *fhandle; const nfs4_stateid *stateid; + const u32 * bitmask; +}; + +struct nfs4_delegreturnres { + struct nfs_fattr * fattr; + const struct nfs_server *server; }; /* * Arguments to the read call. */ - -#define NFS_READ_MAXIOV (9U) -#if (NFS_READ_MAXIOV > (MAX_IOVEC -2)) -#error "NFS_READ_MAXIOV is too large" -#endif - struct nfs_readargs { struct nfs_fh * fh; struct nfs_open_context *context; @@ -243,11 +249,6 @@ struct nfs_readres { /* * Arguments to the write call. */ -#define NFS_WRITE_MAXIOV (9U) -#if (NFS_WRITE_MAXIOV > (MAX_IOVEC -2)) -#error "NFS_WRITE_MAXIOV is too large" -#endif - struct nfs_writeargs { struct nfs_fh * fh; struct nfs_open_context *context; @@ -678,6 +679,8 @@ struct nfs4_server_caps_res { struct nfs_page; +#define NFS_PAGEVEC_SIZE (8U) + struct nfs_read_data { int flags; struct rpc_task task; @@ -686,13 +689,14 @@ struct nfs_read_data { struct nfs_fattr fattr; /* fattr storage */ struct list_head pages; /* Coalesced read requests */ struct nfs_page *req; /* multi ops per nfs_page */ - struct page *pagevec[NFS_READ_MAXIOV]; + struct page **pagevec; struct nfs_readargs args; struct nfs_readres res; #ifdef CONFIG_NFS_V4 unsigned long timestamp; /* For lease renewal */ #endif void (*complete) (struct nfs_read_data *, int); + struct page *page_array[NFS_PAGEVEC_SIZE + 1]; }; struct nfs_write_data { @@ -704,13 +708,14 @@ struct nfs_write_data { struct nfs_writeverf verf; struct list_head pages; /* Coalesced requests we wish to flush */ struct nfs_page *req; /* multi ops per nfs_page */ - struct page *pagevec[NFS_WRITE_MAXIOV]; + struct page **pagevec; struct nfs_writeargs args; /* argument struct */ struct nfs_writeres res; /* result struct */ #ifdef CONFIG_NFS_V4 unsigned long timestamp; /* For lease renewal */ #endif void (*complete) (struct nfs_write_data *, int); + struct page *page_array[NFS_PAGEVEC_SIZE + 1]; }; struct nfs_access_entry; diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index ab151bb..f147e6b 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -49,7 +49,6 @@ struct rpc_clnt { unsigned int cl_softrtry : 1,/* soft timeouts */ cl_intr : 1,/* interruptible */ - cl_chatty : 1,/* be verbose */ cl_autobind : 1,/* use getport() */ cl_oneshot : 1,/* dispose after use */ cl_dead : 1;/* abandoned */ @@ -126,7 +125,8 @@ int rpc_register(u32, u32, int, unsigne void rpc_call_setup(struct rpc_task *, struct rpc_message *, int); int rpc_call_async(struct rpc_clnt *clnt, struct rpc_message *msg, - int flags, rpc_action callback, void *clntdata); + int flags, const struct rpc_call_ops *tk_ops, + void *calldata); int rpc_call_sync(struct rpc_clnt *clnt, struct rpc_message *msg, int flags); void rpc_restart_call(struct rpc_task *); @@ -134,6 +134,7 @@ void rpc_clnt_sigmask(struct rpc_clnt * void rpc_clnt_sigunmask(struct rpc_clnt *clnt, sigset_t *oldset); void rpc_setbufsize(struct rpc_clnt *, unsigned int, unsigned int); size_t rpc_max_payload(struct rpc_clnt *); +void rpc_force_rebind(struct rpc_clnt *); int rpc_ping(struct rpc_clnt *clnt, int flags); static __inline__ diff --git a/include/linux/sunrpc/gss_spkm3.h b/include/linux/sunrpc/gss_spkm3.h index 0beb2cf..336e218 100644 --- a/include/linux/sunrpc/gss_spkm3.h +++ b/include/linux/sunrpc/gss_spkm3.h @@ -48,7 +48,7 @@ u32 spkm3_read_token(struct spkm3_ctx *c #define CKSUMTYPE_RSA_MD5 0x0007 s32 make_checksum(s32 cksumtype, char *header, int hdrlen, struct xdr_buf *body, - struct xdr_netobj *cksum); + int body_offset, struct xdr_netobj *cksum); void asn1_bitstring_len(struct xdr_netobj *in, int *enclen, int *zerobits); int decode_asn1_bitstring(struct xdr_netobj *out, char *in, int enclen, int explen); diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h index 4d77e90..8b25629 100644 --- a/include/linux/sunrpc/sched.h +++ b/include/linux/sunrpc/sched.h @@ -27,6 +27,7 @@ struct rpc_message { struct rpc_cred * rpc_cred; /* Credentials */ }; +struct rpc_call_ops; struct rpc_wait_queue; struct rpc_wait { struct list_head list; /* wait queue links */ @@ -41,6 +42,7 @@ struct rpc_task { #ifdef RPC_DEBUG unsigned long tk_magic; /* 0xf00baa */ #endif + atomic_t tk_count; /* Reference count */ struct list_head tk_task; /* global list of tasks */ struct rpc_clnt * tk_client; /* RPC client */ struct rpc_rqst * tk_rqstp; /* RPC request */ @@ -50,8 +52,6 @@ struct rpc_task { * RPC call state */ struct rpc_message tk_msg; /* RPC call info */ - __u32 * tk_buffer; /* XDR buffer */ - size_t tk_bufsize; __u8 tk_garb_retry; __u8 tk_cred_retry; @@ -61,13 +61,12 @@ struct rpc_task { * timeout_fn to be executed by timer bottom half * callback to be executed after waking up * action next procedure for async tasks - * exit exit async task and report to caller + * tk_ops caller callbacks */ void (*tk_timeout_fn)(struct rpc_task *); void (*tk_callback)(struct rpc_task *); void (*tk_action)(struct rpc_task *); - void (*tk_exit)(struct rpc_task *); - void (*tk_release)(struct rpc_task *); + const struct rpc_call_ops *tk_ops; void * tk_calldata; /* @@ -78,7 +77,6 @@ struct rpc_task { struct timer_list tk_timer; /* kernel timer */ unsigned long tk_timeout; /* timeout for rpc_sleep() */ unsigned short tk_flags; /* misc flags */ - unsigned char tk_active : 1;/* Task has been activated */ unsigned char tk_priority : 2;/* Task priority */ unsigned long tk_runstate; /* Task run status */ struct workqueue_struct *tk_workqueue; /* Normally rpciod, but could @@ -111,6 +109,13 @@ struct rpc_task { typedef void (*rpc_action)(struct rpc_task *); +struct rpc_call_ops { + void (*rpc_call_prepare)(struct rpc_task *, void *); + void (*rpc_call_done)(struct rpc_task *, void *); + void (*rpc_release)(void *); +}; + + /* * RPC task flags */ @@ -129,7 +134,6 @@ typedef void (*rpc_action)(struct rpc_ #define RPC_IS_SWAPPER(t) ((t)->tk_flags & RPC_TASK_SWAPPER) #define RPC_DO_ROOTOVERRIDE(t) ((t)->tk_flags & RPC_TASK_ROOTCREDS) #define RPC_ASSASSINATED(t) ((t)->tk_flags & RPC_TASK_KILLED) -#define RPC_IS_ACTIVATED(t) ((t)->tk_active) #define RPC_DO_CALLBACK(t) ((t)->tk_callback != NULL) #define RPC_IS_SOFT(t) ((t)->tk_flags & RPC_TASK_SOFT) #define RPC_TASK_UNINTERRUPTIBLE(t) ((t)->tk_flags & RPC_TASK_NOINTR) @@ -138,6 +142,7 @@ typedef void (*rpc_action)(struct rpc_ #define RPC_TASK_QUEUED 1 #define RPC_TASK_WAKEUP 2 #define RPC_TASK_HAS_TIMER 3 +#define RPC_TASK_ACTIVE 4 #define RPC_IS_RUNNING(t) (test_bit(RPC_TASK_RUNNING, &(t)->tk_runstate)) #define rpc_set_running(t) (set_bit(RPC_TASK_RUNNING, &(t)->tk_runstate)) @@ -168,6 +173,15 @@ typedef void (*rpc_action)(struct rpc_ smp_mb__after_clear_bit(); \ } while (0) +#define RPC_IS_ACTIVATED(t) (test_bit(RPC_TASK_ACTIVE, &(t)->tk_runstate)) +#define rpc_set_active(t) (set_bit(RPC_TASK_ACTIVE, &(t)->tk_runstate)) +#define rpc_clear_active(t) \ + do { \ + smp_mb__before_clear_bit(); \ + clear_bit(RPC_TASK_ACTIVE, &(t)->tk_runstate); \ + smp_mb__after_clear_bit(); \ + } while(0) + /* * Task priorities. * Note: if you change these, you must also change @@ -228,11 +242,16 @@ struct rpc_wait_queue { /* * Function prototypes */ -struct rpc_task *rpc_new_task(struct rpc_clnt *, rpc_action, int flags); +struct rpc_task *rpc_new_task(struct rpc_clnt *, int flags, + const struct rpc_call_ops *ops, void *data); +struct rpc_task *rpc_run_task(struct rpc_clnt *clnt, int flags, + const struct rpc_call_ops *ops, void *data); struct rpc_task *rpc_new_child(struct rpc_clnt *, struct rpc_task *parent); -void rpc_init_task(struct rpc_task *, struct rpc_clnt *, - rpc_action exitfunc, int flags); +void rpc_init_task(struct rpc_task *task, struct rpc_clnt *clnt, + int flags, const struct rpc_call_ops *ops, + void *data); void rpc_release_task(struct rpc_task *); +void rpc_exit_task(struct rpc_task *); void rpc_killall_tasks(struct rpc_clnt *); int rpc_execute(struct rpc_task *); void rpc_run_child(struct rpc_task *parent, struct rpc_task *child, @@ -247,9 +266,11 @@ struct rpc_task *rpc_wake_up_next(struct void rpc_wake_up_status(struct rpc_wait_queue *, int); void rpc_delay(struct rpc_task *, unsigned long); void * rpc_malloc(struct rpc_task *, size_t); +void rpc_free(struct rpc_task *); int rpciod_up(void); void rpciod_down(void); void rpciod_wake_up(void); +int __rpc_wait_for_completion_task(struct rpc_task *task, int (*)(void *)); #ifdef RPC_DEBUG void rpc_show_tasks(void); #endif @@ -259,7 +280,12 @@ void rpc_destroy_mempool(void); static inline void rpc_exit(struct rpc_task *task, int status) { task->tk_status = status; - task->tk_action = NULL; + task->tk_action = rpc_exit_task; +} + +static inline int rpc_wait_for_completion_task(struct rpc_task *task) +{ + return __rpc_wait_for_completion_task(task, NULL); } #ifdef RPC_DEBUG diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h index 5da9687..84c35d4 100644 --- a/include/linux/sunrpc/xdr.h +++ b/include/linux/sunrpc/xdr.h @@ -91,7 +91,6 @@ struct xdr_buf { u32 * xdr_encode_opaque_fixed(u32 *p, const void *ptr, unsigned int len); u32 * xdr_encode_opaque(u32 *p, const void *ptr, unsigned int len); u32 * xdr_encode_string(u32 *p, const char *s); -u32 * xdr_decode_string(u32 *p, char **sp, int *lenp, int maxlen); u32 * xdr_decode_string_inplace(u32 *p, char **sp, int *lenp, int maxlen); u32 * xdr_encode_netobj(u32 *p, const struct xdr_netobj *); u32 * xdr_decode_netobj(u32 *p, struct xdr_netobj *); @@ -135,11 +134,6 @@ xdr_adjust_iovec(struct kvec *iov, u32 * } /* - * Maximum number of iov's we use. - */ -#define MAX_IOVEC (12) - -/* * XDR buffer helper functions */ extern void xdr_shift_buf(struct xdr_buf *, size_t); diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index 3b8b6e8..6ef99b1 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -79,21 +79,19 @@ struct rpc_rqst { void (*rq_release_snd_buf)(struct rpc_rqst *); /* release rq_enc_pages */ struct list_head rq_list; + __u32 * rq_buffer; /* XDR encode buffer */ + size_t rq_bufsize; + struct xdr_buf rq_private_buf; /* The receive buffer * used in the softirq. */ unsigned long rq_majortimeo; /* major timeout alarm */ unsigned long rq_timeout; /* Current timeout value */ unsigned int rq_retries; /* # of retries */ - /* - * For authentication (e.g. auth_des) - */ - u32 rq_creddata[2]; /* * Partial send handling */ - u32 rq_bytes_sent; /* Bytes we have sent */ unsigned long rq_xtime; /* when transmitted */ @@ -106,7 +104,10 @@ struct rpc_xprt_ops { void (*set_buffer_size)(struct rpc_xprt *xprt, size_t sndsize, size_t rcvsize); int (*reserve_xprt)(struct rpc_task *task); void (*release_xprt)(struct rpc_xprt *xprt, struct rpc_task *task); + void (*set_port)(struct rpc_xprt *xprt, unsigned short port); void (*connect)(struct rpc_task *task); + void * (*buf_alloc)(struct rpc_task *task, size_t size); + void (*buf_free)(struct rpc_task *task); int (*send_request)(struct rpc_task *task); void (*set_retrans_timeout)(struct rpc_task *task); void (*timer)(struct rpc_task *task); @@ -253,6 +254,7 @@ int xs_setup_tcp(struct rpc_xprt *xprt #define XPRT_LOCKED (0) #define XPRT_CONNECTED (1) #define XPRT_CONNECTING (2) +#define XPRT_CLOSE_WAIT (3) static inline void xprt_set_connected(struct rpc_xprt *xprt) { diff --git a/include/linux/writeback.h b/include/linux/writeback.h index 343d883..2e89fea 100644 --- a/include/linux/writeback.h +++ b/include/linux/writeback.h @@ -53,10 +53,11 @@ struct writeback_control { loff_t start; loff_t end; - unsigned nonblocking:1; /* Don't get stuck on request queues */ - unsigned encountered_congestion:1; /* An output: a queue is full */ - unsigned for_kupdate:1; /* A kupdate writeback */ - unsigned for_reclaim:1; /* Invoked from the page allocator */ + unsigned nonblocking:1; /* Don't get stuck on request queues */ + unsigned encountered_congestion:1; /* An output: a queue is full */ + unsigned for_kupdate:1; /* A kupdate writeback */ + unsigned for_reclaim:1; /* Invoked from the page allocator */ + unsigned for_writepages:1; /* This is a writepages() call */ }; /* diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 0166ea1..5240e42 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -550,11 +550,17 @@ void __init page_writeback_init(void) int do_writepages(struct address_space *mapping, struct writeback_control *wbc) { + int ret; + if (wbc->nr_to_write <= 0) return 0; + wbc->for_writepages = 1; if (mapping->a_ops->writepages) - return mapping->a_ops->writepages(mapping, wbc); - return generic_writepages(mapping, wbc); + ret = mapping->a_ops->writepages(mapping, wbc); + else + ret = generic_writepages(mapping, wbc); + wbc->for_writepages = 0; + return ret; } /** diff --git a/net/sunrpc/auth_gss/gss_krb5_mech.c b/net/sunrpc/auth_gss/gss_krb5_mech.c index 5f1f806..129e2bd 100644 --- a/net/sunrpc/auth_gss/gss_krb5_mech.c +++ b/net/sunrpc/auth_gss/gss_krb5_mech.c @@ -97,13 +97,17 @@ get_key(const void *p, const void *end, alg_mode = CRYPTO_TFM_MODE_CBC; break; default: - dprintk("RPC: get_key: unsupported algorithm %d\n", alg); + printk("gss_kerberos_mech: unsupported algorithm %d\n", alg); goto out_err_free_key; } - if (!(*res = crypto_alloc_tfm(alg_name, alg_mode))) + if (!(*res = crypto_alloc_tfm(alg_name, alg_mode))) { + printk("gss_kerberos_mech: unable to initialize crypto algorithm %s\n", alg_name); goto out_err_free_key; - if (crypto_cipher_setkey(*res, key.data, key.len)) + } + if (crypto_cipher_setkey(*res, key.data, key.len)) { + printk("gss_kerberos_mech: error setting key for crypto algorithm %s\n", alg_name); goto out_err_free_tfm; + } kfree(key.data); return p; diff --git a/net/sunrpc/auth_gss/gss_spkm3_mech.c b/net/sunrpc/auth_gss/gss_spkm3_mech.c index 39b3edc..5840080 100644 --- a/net/sunrpc/auth_gss/gss_spkm3_mech.c +++ b/net/sunrpc/auth_gss/gss_spkm3_mech.c @@ -111,14 +111,18 @@ get_key(const void *p, const void *end, setkey = 0; break; default: - dprintk("RPC: SPKM3 get_key: unsupported algorithm %d", *resalg); + dprintk("gss_spkm3_mech: unsupported algorithm %d\n", *resalg); goto out_err_free_key; } - if (!(*res = crypto_alloc_tfm(alg_name, alg_mode))) + if (!(*res = crypto_alloc_tfm(alg_name, alg_mode))) { + printk("gss_spkm3_mech: unable to initialize crypto algorthm %s\n", alg_name); goto out_err_free_key; + } if (setkey) { - if (crypto_cipher_setkey(*res, key.data, key.len)) + if (crypto_cipher_setkey(*res, key.data, key.len)) { + printk("gss_spkm3_mech: error setting key for crypto algorthm %s\n", alg_name); goto out_err_free_tfm; + } } if(key.len > 0) diff --git a/net/sunrpc/auth_gss/gss_spkm3_seal.c b/net/sunrpc/auth_gss/gss_spkm3_seal.c index d1e12b2..86fbf7c 100644 --- a/net/sunrpc/auth_gss/gss_spkm3_seal.c +++ b/net/sunrpc/auth_gss/gss_spkm3_seal.c @@ -59,7 +59,7 @@ spkm3_make_token(struct spkm3_ctx *ctx, char tokhdrbuf[25]; struct xdr_netobj md5cksum = {.len = 0, .data = NULL}; struct xdr_netobj mic_hdr = {.len = 0, .data = tokhdrbuf}; - int tmsglen, tokenlen = 0; + int tokenlen = 0; unsigned char *ptr; s32 now; int ctxelen = 0, ctxzbit = 0; @@ -92,24 +92,23 @@ spkm3_make_token(struct spkm3_ctx *ctx, } if (toktype == SPKM_MIC_TOK) { - tmsglen = 0; /* Calculate checksum over the mic-header */ asn1_bitstring_len(&ctx->ctx_id, &ctxelen, &ctxzbit); spkm3_mic_header(&mic_hdr.data, &mic_hdr.len, ctx->ctx_id.data, ctxelen, ctxzbit); if (make_checksum(checksum_type, mic_hdr.data, mic_hdr.len, - text, &md5cksum)) + text, 0, &md5cksum)) goto out_err; asn1_bitstring_len(&md5cksum, &md5elen, &md5zbit); - tokenlen = 10 + ctxelen + 1 + 2 + md5elen + 1; + tokenlen = 10 + ctxelen + 1 + md5elen + 1; /* Create token header using generic routines */ - token->len = g_token_size(&ctx->mech_used, tokenlen + tmsglen); + token->len = g_token_size(&ctx->mech_used, tokenlen); ptr = token->data; - g_make_token_header(&ctx->mech_used, tokenlen + tmsglen, &ptr); + g_make_token_header(&ctx->mech_used, tokenlen, &ptr); spkm3_make_mic_token(&ptr, tokenlen, &mic_hdr, &md5cksum, md5elen, md5zbit); } else if (toktype == SPKM_WRAP_TOK) { /* Not Supported */ diff --git a/net/sunrpc/auth_gss/gss_spkm3_token.c b/net/sunrpc/auth_gss/gss_spkm3_token.c index 1f82457..af0d7ce 100644 --- a/net/sunrpc/auth_gss/gss_spkm3_token.c +++ b/net/sunrpc/auth_gss/gss_spkm3_token.c @@ -182,6 +182,7 @@ spkm3_mic_header(unsigned char **hdrbuf, * *tokp points to the beginning of the SPKM_MIC token described * in rfc 2025, section 3.2.1: * + * toklen is the inner token length */ void spkm3_make_mic_token(unsigned char **tokp, int toklen, struct xdr_netobj *mic_hdr, struct xdr_netobj *md5cksum, int md5elen, int md5zbit) @@ -189,7 +190,7 @@ spkm3_make_mic_token(unsigned char **tok unsigned char *ict = *tokp; *(u8 *)ict++ = 0xa4; - *(u8 *)ict++ = toklen - 2; + *(u8 *)ict++ = toklen; memcpy(ict, mic_hdr->data, mic_hdr->len); ict += mic_hdr->len; diff --git a/net/sunrpc/auth_gss/gss_spkm3_unseal.c b/net/sunrpc/auth_gss/gss_spkm3_unseal.c index 241d5b3..96851b0 100644 --- a/net/sunrpc/auth_gss/gss_spkm3_unseal.c +++ b/net/sunrpc/auth_gss/gss_spkm3_unseal.c @@ -95,7 +95,7 @@ spkm3_read_token(struct spkm3_ctx *ctx, ret = GSS_S_DEFECTIVE_TOKEN; code = make_checksum(CKSUMTYPE_RSA_MD5, ptr + 2, mic_hdrlen + 2, - message_buffer, &md5cksum); + message_buffer, 0, &md5cksum); if (code) goto out; diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 61c3abe..5530ac8 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -374,19 +374,23 @@ out: * Default callback for async RPC calls */ static void -rpc_default_callback(struct rpc_task *task) +rpc_default_callback(struct rpc_task *task, void *data) { } +static const struct rpc_call_ops rpc_default_ops = { + .rpc_call_done = rpc_default_callback, +}; + /* * Export the signal mask handling for synchronous code that * sleeps on RPC calls */ -#define RPC_INTR_SIGNALS (sigmask(SIGINT) | sigmask(SIGQUIT) | sigmask(SIGKILL)) +#define RPC_INTR_SIGNALS (sigmask(SIGHUP) | sigmask(SIGINT) | sigmask(SIGQUIT) | sigmask(SIGTERM)) static void rpc_save_sigmask(sigset_t *oldset, int intr) { - unsigned long sigallow = 0; + unsigned long sigallow = sigmask(SIGKILL); sigset_t sigmask; /* Block all signals except those listed in sigallow */ @@ -432,7 +436,7 @@ int rpc_call_sync(struct rpc_clnt *clnt, BUG_ON(flags & RPC_TASK_ASYNC); status = -ENOMEM; - task = rpc_new_task(clnt, NULL, flags); + task = rpc_new_task(clnt, flags, &rpc_default_ops, NULL); if (task == NULL) goto out; @@ -442,14 +446,15 @@ int rpc_call_sync(struct rpc_clnt *clnt, rpc_call_setup(task, msg, 0); /* Set up the call info struct and execute the task */ - if (task->tk_status == 0) { + status = task->tk_status; + if (status == 0) { + atomic_inc(&task->tk_count); status = rpc_execute(task); - } else { - status = task->tk_status; - rpc_release_task(task); + if (status == 0) + status = task->tk_status; } - rpc_restore_sigmask(&oldset); + rpc_release_task(task); out: return status; } @@ -459,7 +464,7 @@ out: */ int rpc_call_async(struct rpc_clnt *clnt, struct rpc_message *msg, int flags, - rpc_action callback, void *data) + const struct rpc_call_ops *tk_ops, void *data) { struct rpc_task *task; sigset_t oldset; @@ -472,12 +477,9 @@ rpc_call_async(struct rpc_clnt *clnt, st flags |= RPC_TASK_ASYNC; /* Create/initialize a new RPC task */ - if (!callback) - callback = rpc_default_callback; status = -ENOMEM; - if (!(task = rpc_new_task(clnt, callback, flags))) + if (!(task = rpc_new_task(clnt, flags, tk_ops, data))) goto out; - task->tk_calldata = data; /* Mask signals on GSS_AUTH upcalls */ rpc_task_sigmask(task, &oldset); @@ -511,7 +513,7 @@ rpc_call_setup(struct rpc_task *task, st if (task->tk_status == 0) task->tk_action = call_start; else - task->tk_action = NULL; + task->tk_action = rpc_exit_task; } void @@ -536,6 +538,18 @@ size_t rpc_max_payload(struct rpc_clnt * } EXPORT_SYMBOL(rpc_max_payload); +/** + * rpc_force_rebind - force transport to check that remote port is unchanged + * @clnt: client to rebind + * + */ +void rpc_force_rebind(struct rpc_clnt *clnt) +{ + if (clnt->cl_autobind) + clnt->cl_port = 0; +} +EXPORT_SYMBOL(rpc_force_rebind); + /* * Restart an (async) RPC call. Usually called from within the * exit handler. @@ -642,24 +656,26 @@ call_reserveresult(struct rpc_task *task /* * 2. Allocate the buffer. For details, see sched.c:rpc_malloc. - * (Note: buffer memory is freed in rpc_task_release). + * (Note: buffer memory is freed in xprt_release). */ static void call_allocate(struct rpc_task *task) { + struct rpc_rqst *req = task->tk_rqstp; + struct rpc_xprt *xprt = task->tk_xprt; unsigned int bufsiz; dprintk("RPC: %4d call_allocate (status %d)\n", task->tk_pid, task->tk_status); task->tk_action = call_bind; - if (task->tk_buffer) + if (req->rq_buffer) return; /* FIXME: compute buffer requirements more exactly using * auth->au_wslack */ bufsiz = task->tk_msg.rpc_proc->p_bufsiz + RPC_SLACK_SPACE; - if (rpc_malloc(task, bufsiz << 1) != NULL) + if (xprt->ops->buf_alloc(task, bufsiz << 1) != NULL) return; printk(KERN_INFO "RPC: buffer allocation failed for task %p\n", task); @@ -702,14 +718,14 @@ call_encode(struct rpc_task *task) task->tk_pid, task->tk_status); /* Default buffer setup */ - bufsiz = task->tk_bufsize >> 1; - sndbuf->head[0].iov_base = (void *)task->tk_buffer; + bufsiz = req->rq_bufsize >> 1; + sndbuf->head[0].iov_base = (void *)req->rq_buffer; sndbuf->head[0].iov_len = bufsiz; sndbuf->tail[0].iov_len = 0; sndbuf->page_len = 0; sndbuf->len = 0; sndbuf->buflen = bufsiz; - rcvbuf->head[0].iov_base = (void *)((char *)task->tk_buffer + bufsiz); + rcvbuf->head[0].iov_base = (void *)((char *)req->rq_buffer + bufsiz); rcvbuf->head[0].iov_len = bufsiz; rcvbuf->tail[0].iov_len = 0; rcvbuf->page_len = 0; @@ -849,8 +865,7 @@ call_connect_status(struct rpc_task *tas } /* Something failed: remote service port may have changed */ - if (clnt->cl_autobind) - clnt->cl_port = 0; + rpc_force_rebind(clnt); switch (status) { case -ENOTCONN: @@ -892,7 +907,7 @@ call_transmit(struct rpc_task *task) if (task->tk_status < 0) return; if (!task->tk_msg.rpc_proc->p_decode) { - task->tk_action = NULL; + task->tk_action = rpc_exit_task; rpc_wake_up_task(task); } return; @@ -931,8 +946,7 @@ call_status(struct rpc_task *task) break; case -ECONNREFUSED: case -ENOTCONN: - if (clnt->cl_autobind) - clnt->cl_port = 0; + rpc_force_rebind(clnt); task->tk_action = call_bind; break; case -EAGAIN: @@ -943,8 +957,7 @@ call_status(struct rpc_task *task) rpc_exit(task, status); break; default: - if (clnt->cl_chatty) - printk("%s: RPC call returned error %d\n", + printk("%s: RPC call returned error %d\n", clnt->cl_protname, -status); rpc_exit(task, status); break; @@ -979,20 +992,18 @@ call_timeout(struct rpc_task *task) dprintk("RPC: %4d call_timeout (major)\n", task->tk_pid); if (RPC_IS_SOFT(task)) { - if (clnt->cl_chatty) - printk(KERN_NOTICE "%s: server %s not responding, timed out\n", + printk(KERN_NOTICE "%s: server %s not responding, timed out\n", clnt->cl_protname, clnt->cl_server); rpc_exit(task, -EIO); return; } - if (clnt->cl_chatty && !(task->tk_flags & RPC_CALL_MAJORSEEN)) { + if (!(task->tk_flags & RPC_CALL_MAJORSEEN)) { task->tk_flags |= RPC_CALL_MAJORSEEN; printk(KERN_NOTICE "%s: server %s not responding, still trying\n", clnt->cl_protname, clnt->cl_server); } - if (clnt->cl_autobind) - clnt->cl_port = 0; + rpc_force_rebind(clnt); retry: clnt->cl_stats->rpcretrans++; @@ -1014,7 +1025,7 @@ call_decode(struct rpc_task *task) dprintk("RPC: %4d call_decode (status %d)\n", task->tk_pid, task->tk_status); - if (clnt->cl_chatty && (task->tk_flags & RPC_CALL_MAJORSEEN)) { + if (task->tk_flags & RPC_CALL_MAJORSEEN) { printk(KERN_NOTICE "%s: server %s OK\n", clnt->cl_protname, clnt->cl_server); task->tk_flags &= ~RPC_CALL_MAJORSEEN; @@ -1039,13 +1050,14 @@ call_decode(struct rpc_task *task) sizeof(req->rq_rcv_buf)) != 0); /* Verify the RPC header */ - if (!(p = call_verify(task))) { - if (task->tk_action == NULL) - return; - goto out_retry; + p = call_verify(task); + if (IS_ERR(p)) { + if (p == ERR_PTR(-EAGAIN)) + goto out_retry; + return; } - task->tk_action = NULL; + task->tk_action = rpc_exit_task; if (decode) task->tk_status = rpcauth_unwrap_resp(task, decode, req, p, @@ -1138,7 +1150,7 @@ call_verify(struct rpc_task *task) if ((n = ntohl(*p++)) != RPC_REPLY) { printk(KERN_WARNING "call_verify: not an RPC reply: %x\n", n); - goto out_retry; + goto out_garbage; } if ((n = ntohl(*p++)) != RPC_MSG_ACCEPTED) { if (--len < 0) @@ -1168,7 +1180,7 @@ call_verify(struct rpc_task *task) task->tk_pid); rpcauth_invalcred(task); task->tk_action = call_refresh; - return NULL; + goto out_retry; case RPC_AUTH_BADCRED: case RPC_AUTH_BADVERF: /* possibly garbled cred/verf? */ @@ -1178,7 +1190,7 @@ call_verify(struct rpc_task *task) dprintk("RPC: %4d call_verify: retry garbled creds\n", task->tk_pid); task->tk_action = call_bind; - return NULL; + goto out_retry; case RPC_AUTH_TOOWEAK: printk(KERN_NOTICE "call_verify: server requires stronger " "authentication.\n"); @@ -1193,7 +1205,7 @@ call_verify(struct rpc_task *task) } if (!(p = rpcauth_checkverf(task, p))) { printk(KERN_WARNING "call_verify: auth check failed\n"); - goto out_retry; /* bad verifier, retry */ + goto out_garbage; /* bad verifier, retry */ } len = p - (u32 *)iov->iov_base - 1; if (len < 0) @@ -1230,23 +1242,24 @@ call_verify(struct rpc_task *task) /* Also retry */ } -out_retry: +out_garbage: task->tk_client->cl_stats->rpcgarbage++; if (task->tk_garb_retry) { task->tk_garb_retry--; dprintk("RPC %s: retrying %4d\n", __FUNCTION__, task->tk_pid); task->tk_action = call_bind; - return NULL; +out_retry: + return ERR_PTR(-EAGAIN); } printk(KERN_WARNING "RPC %s: retry failed, exit EIO\n", __FUNCTION__); out_eio: error = -EIO; out_err: rpc_exit(task, error); - return NULL; + return ERR_PTR(error); out_overflow: printk(KERN_WARNING "RPC %s: server reply was truncated.\n", __FUNCTION__); - goto out_retry; + goto out_garbage; } static int rpcproc_encode_null(void *rqstp, u32 *data, void *obj) diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index a398575..8139ce6 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -90,8 +90,7 @@ bailout: map->pm_binding = 0; rpc_wake_up(&map->pm_bindwait); spin_unlock(&pmap_lock); - task->tk_status = -EIO; - task->tk_action = NULL; + rpc_exit(task, -EIO); } #ifdef CONFIG_ROOT_NFS @@ -132,21 +131,22 @@ static void pmap_getport_done(struct rpc_task *task) { struct rpc_clnt *clnt = task->tk_client; + struct rpc_xprt *xprt = task->tk_xprt; struct rpc_portmap *map = clnt->cl_pmap; dprintk("RPC: %4d pmap_getport_done(status %d, port %d)\n", task->tk_pid, task->tk_status, clnt->cl_port); + + xprt->ops->set_port(xprt, 0); if (task->tk_status < 0) { /* Make the calling task exit with an error */ - task->tk_action = NULL; + task->tk_action = rpc_exit_task; } else if (clnt->cl_port == 0) { /* Program not registered */ - task->tk_status = -EACCES; - task->tk_action = NULL; + rpc_exit(task, -EACCES); } else { - /* byte-swap port number first */ + xprt->ops->set_port(xprt, clnt->cl_port); clnt->cl_port = htons(clnt->cl_port); - clnt->cl_xprt->addr.sin_port = clnt->cl_port; } spin_lock(&pmap_lock); map->pm_binding = 0; @@ -207,7 +207,7 @@ pmap_create(char *hostname, struct socka xprt = xprt_create_proto(proto, srvaddr, NULL); if (IS_ERR(xprt)) return (struct rpc_clnt *)xprt; - xprt->addr.sin_port = htons(RPC_PMAP_PORT); + xprt->ops->set_port(xprt, RPC_PMAP_PORT); if (!privileged) xprt->resvport = 0; @@ -217,7 +217,6 @@ pmap_create(char *hostname, struct socka RPC_AUTH_UNIX); if (!IS_ERR(clnt)) { clnt->cl_softrtry = 1; - clnt->cl_chatty = 1; clnt->cl_oneshot = 1; } return clnt; diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c index 16a2458..24cc23a 100644 --- a/net/sunrpc/rpc_pipe.c +++ b/net/sunrpc/rpc_pipe.c @@ -70,8 +70,11 @@ rpc_timeout_upcall_queue(void *data) struct inode *inode = &rpci->vfs_inode; down(&inode->i_sem); + if (rpci->ops == NULL) + goto out; if (rpci->nreaders == 0 && !list_empty(&rpci->pipe)) __rpc_purge_upcall(inode, -ETIMEDOUT); +out: up(&inode->i_sem); } @@ -113,8 +116,6 @@ rpc_close_pipes(struct inode *inode) { struct rpc_inode *rpci = RPC_I(inode); - cancel_delayed_work(&rpci->queue_timeout); - flush_scheduled_work(); down(&inode->i_sem); if (rpci->ops != NULL) { rpci->nreaders = 0; @@ -127,6 +128,8 @@ rpc_close_pipes(struct inode *inode) } rpc_inode_setowner(inode, NULL); up(&inode->i_sem); + cancel_delayed_work(&rpci->queue_timeout); + flush_scheduled_work(); } static struct inode * @@ -166,7 +169,7 @@ rpc_pipe_open(struct inode *inode, struc static int rpc_pipe_release(struct inode *inode, struct file *filp) { - struct rpc_inode *rpci = RPC_I(filp->f_dentry->d_inode); + struct rpc_inode *rpci = RPC_I(inode); struct rpc_pipe_msg *msg; down(&inode->i_sem); diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c index 54e60a6..7415406 100644 --- a/net/sunrpc/sched.c +++ b/net/sunrpc/sched.c @@ -41,8 +41,6 @@ static mempool_t *rpc_buffer_mempool __r static void __rpc_default_timer(struct rpc_task *task); static void rpciod_killall(void); -static void rpc_free(struct rpc_task *task); - static void rpc_async_schedule(void *); /* @@ -264,6 +262,35 @@ void rpc_init_wait_queue(struct rpc_wait } EXPORT_SYMBOL(rpc_init_wait_queue); +static int rpc_wait_bit_interruptible(void *word) +{ + if (signal_pending(current)) + return -ERESTARTSYS; + schedule(); + return 0; +} + +/* + * Mark an RPC call as having completed by clearing the 'active' bit + */ +static inline void rpc_mark_complete_task(struct rpc_task *task) +{ + rpc_clear_active(task); + wake_up_bit(&task->tk_runstate, RPC_TASK_ACTIVE); +} + +/* + * Allow callers to wait for completion of an RPC call + */ +int __rpc_wait_for_completion_task(struct rpc_task *task, int (*action)(void *)) +{ + if (action == NULL) + action = rpc_wait_bit_interruptible; + return wait_on_bit(&task->tk_runstate, RPC_TASK_ACTIVE, + action, TASK_INTERRUPTIBLE); +} +EXPORT_SYMBOL(__rpc_wait_for_completion_task); + /* * Make an RPC task runnable. * @@ -299,10 +326,7 @@ static void rpc_make_runnable(struct rpc static inline void rpc_schedule_run(struct rpc_task *task) { - /* Don't run a child twice! */ - if (RPC_IS_ACTIVATED(task)) - return; - task->tk_active = 1; + rpc_set_active(task); rpc_make_runnable(task); } @@ -324,8 +348,7 @@ static void __rpc_sleep_on(struct rpc_wa } /* Mark the task as being activated if so needed */ - if (!RPC_IS_ACTIVATED(task)) - task->tk_active = 1; + rpc_set_active(task); __rpc_add_wait_queue(q, task); @@ -555,36 +578,29 @@ __rpc_atrun(struct rpc_task *task) } /* - * Helper that calls task->tk_exit if it exists and then returns - * true if we should exit __rpc_execute. + * Helper to call task->tk_ops->rpc_call_prepare */ -static inline int __rpc_do_exit(struct rpc_task *task) +static void rpc_prepare_task(struct rpc_task *task) { - if (task->tk_exit != NULL) { - lock_kernel(); - task->tk_exit(task); - unlock_kernel(); - /* If tk_action is non-null, we should restart the call */ - if (task->tk_action != NULL) { - if (!RPC_ASSASSINATED(task)) { - /* Release RPC slot and buffer memory */ - xprt_release(task); - rpc_free(task); - return 0; - } - printk(KERN_ERR "RPC: dead task tried to walk away.\n"); - } - } - return 1; + task->tk_ops->rpc_call_prepare(task, task->tk_calldata); } -static int rpc_wait_bit_interruptible(void *word) +/* + * Helper that calls task->tk_ops->rpc_call_done if it exists + */ +void rpc_exit_task(struct rpc_task *task) { - if (signal_pending(current)) - return -ERESTARTSYS; - schedule(); - return 0; + task->tk_action = NULL; + if (task->tk_ops->rpc_call_done != NULL) { + task->tk_ops->rpc_call_done(task, task->tk_calldata); + if (task->tk_action != NULL) { + WARN_ON(RPC_ASSASSINATED(task)); + /* Always release the RPC slot and buffer memory */ + xprt_release(task); + } + } } +EXPORT_SYMBOL(rpc_exit_task); /* * This is the RPC `scheduler' (or rather, the finite state machine). @@ -631,12 +647,11 @@ static int __rpc_execute(struct rpc_task * by someone else. */ if (!RPC_IS_QUEUED(task)) { - if (task->tk_action != NULL) { - lock_kernel(); - task->tk_action(task); - unlock_kernel(); - } else if (__rpc_do_exit(task)) + if (task->tk_action == NULL) break; + lock_kernel(); + task->tk_action(task); + unlock_kernel(); } /* @@ -676,9 +691,9 @@ static int __rpc_execute(struct rpc_task dprintk("RPC: %4d sync task resuming\n", task->tk_pid); } - dprintk("RPC: %4d exit() = %d\n", task->tk_pid, task->tk_status); - status = task->tk_status; - + dprintk("RPC: %4d, return %d, status %d\n", task->tk_pid, status, task->tk_status); + /* Wake up anyone who is waiting for task completion */ + rpc_mark_complete_task(task); /* Release all resources associated with the task */ rpc_release_task(task); return status; @@ -696,9 +711,7 @@ static int __rpc_execute(struct rpc_task int rpc_execute(struct rpc_task *task) { - BUG_ON(task->tk_active); - - task->tk_active = 1; + rpc_set_active(task); rpc_set_running(task); return __rpc_execute(task); } @@ -708,17 +721,19 @@ static void rpc_async_schedule(void *arg __rpc_execute((struct rpc_task *)arg); } -/* - * Allocate memory for RPC purposes. +/** + * rpc_malloc - allocate an RPC buffer + * @task: RPC task that will use this buffer + * @size: requested byte size * * We try to ensure that some NFS reads and writes can always proceed * by using a mempool when allocating 'small' buffers. * In order to avoid memory starvation triggering more writebacks of * NFS requests, we use GFP_NOFS rather than GFP_KERNEL. */ -void * -rpc_malloc(struct rpc_task *task, size_t size) +void * rpc_malloc(struct rpc_task *task, size_t size) { + struct rpc_rqst *req = task->tk_rqstp; gfp_t gfp; if (task->tk_flags & RPC_TASK_SWAPPER) @@ -727,42 +742,52 @@ rpc_malloc(struct rpc_task *task, size_t gfp = GFP_NOFS; if (size > RPC_BUFFER_MAXSIZE) { - task->tk_buffer = kmalloc(size, gfp); - if (task->tk_buffer) - task->tk_bufsize = size; + req->rq_buffer = kmalloc(size, gfp); + if (req->rq_buffer) + req->rq_bufsize = size; } else { - task->tk_buffer = mempool_alloc(rpc_buffer_mempool, gfp); - if (task->tk_buffer) - task->tk_bufsize = RPC_BUFFER_MAXSIZE; + req->rq_buffer = mempool_alloc(rpc_buffer_mempool, gfp); + if (req->rq_buffer) + req->rq_bufsize = RPC_BUFFER_MAXSIZE; } - return task->tk_buffer; + return req->rq_buffer; } -static void -rpc_free(struct rpc_task *task) +/** + * rpc_free - free buffer allocated via rpc_malloc + * @task: RPC task with a buffer to be freed + * + */ +void rpc_free(struct rpc_task *task) { - if (task->tk_buffer) { - if (task->tk_bufsize == RPC_BUFFER_MAXSIZE) - mempool_free(task->tk_buffer, rpc_buffer_mempool); + struct rpc_rqst *req = task->tk_rqstp; + + if (req->rq_buffer) { + if (req->rq_bufsize == RPC_BUFFER_MAXSIZE) + mempool_free(req->rq_buffer, rpc_buffer_mempool); else - kfree(task->tk_buffer); - task->tk_buffer = NULL; - task->tk_bufsize = 0; + kfree(req->rq_buffer); + req->rq_buffer = NULL; + req->rq_bufsize = 0; } } /* * Creation and deletion of RPC task structures */ -void rpc_init_task(struct rpc_task *task, struct rpc_clnt *clnt, rpc_action callback, int flags) +void rpc_init_task(struct rpc_task *task, struct rpc_clnt *clnt, int flags, const struct rpc_call_ops *tk_ops, void *calldata) { memset(task, 0, sizeof(*task)); init_timer(&task->tk_timer); task->tk_timer.data = (unsigned long) task; task->tk_timer.function = (void (*)(unsigned long)) rpc_run_timer; + atomic_set(&task->tk_count, 1); task->tk_client = clnt; task->tk_flags = flags; - task->tk_exit = callback; + task->tk_ops = tk_ops; + if (tk_ops->rpc_call_prepare != NULL) + task->tk_action = rpc_prepare_task; + task->tk_calldata = calldata; /* Initialize retry counters */ task->tk_garb_retry = 2; @@ -791,6 +816,8 @@ void rpc_init_task(struct rpc_task *task list_add_tail(&task->tk_task, &all_tasks); spin_unlock(&rpc_sched_lock); + BUG_ON(task->tk_ops == NULL); + dprintk("RPC: %4d new task procpid %d\n", task->tk_pid, current->pid); } @@ -801,8 +828,7 @@ rpc_alloc_task(void) return (struct rpc_task *)mempool_alloc(rpc_task_mempool, GFP_NOFS); } -static void -rpc_default_free_task(struct rpc_task *task) +static void rpc_free_task(struct rpc_task *task) { dprintk("RPC: %4d freeing task\n", task->tk_pid); mempool_free(task, rpc_task_mempool); @@ -813,8 +839,7 @@ rpc_default_free_task(struct rpc_task *t * clean up after an allocation failure, as the client may * have specified "oneshot". */ -struct rpc_task * -rpc_new_task(struct rpc_clnt *clnt, rpc_action callback, int flags) +struct rpc_task *rpc_new_task(struct rpc_clnt *clnt, int flags, const struct rpc_call_ops *tk_ops, void *calldata) { struct rpc_task *task; @@ -822,10 +847,7 @@ rpc_new_task(struct rpc_clnt *clnt, rpc_ if (!task) goto cleanup; - rpc_init_task(task, clnt, callback, flags); - - /* Replace tk_release */ - task->tk_release = rpc_default_free_task; + rpc_init_task(task, clnt, flags, tk_ops, calldata); dprintk("RPC: %4d allocated task\n", task->tk_pid); task->tk_flags |= RPC_TASK_DYNAMIC; @@ -845,11 +867,15 @@ cleanup: void rpc_release_task(struct rpc_task *task) { - dprintk("RPC: %4d release task\n", task->tk_pid); + const struct rpc_call_ops *tk_ops = task->tk_ops; + void *calldata = task->tk_calldata; #ifdef RPC_DEBUG BUG_ON(task->tk_magic != RPC_TASK_MAGIC_ID); #endif + if (!atomic_dec_and_test(&task->tk_count)) + return; + dprintk("RPC: %4d release task\n", task->tk_pid); /* Remove from global task list */ spin_lock(&rpc_sched_lock); @@ -857,7 +883,6 @@ void rpc_release_task(struct rpc_task *t spin_unlock(&rpc_sched_lock); BUG_ON (RPC_IS_QUEUED(task)); - task->tk_active = 0; /* Synchronously delete any running timer */ rpc_delete_timer(task); @@ -867,7 +892,6 @@ void rpc_release_task(struct rpc_task *t xprt_release(task); if (task->tk_msg.rpc_cred) rpcauth_unbindcred(task); - rpc_free(task); if (task->tk_client) { rpc_release_client(task->tk_client); task->tk_client = NULL; @@ -876,11 +900,34 @@ void rpc_release_task(struct rpc_task *t #ifdef RPC_DEBUG task->tk_magic = 0; #endif - if (task->tk_release) - task->tk_release(task); + if (task->tk_flags & RPC_TASK_DYNAMIC) + rpc_free_task(task); + if (tk_ops->rpc_release) + tk_ops->rpc_release(calldata); } /** + * rpc_run_task - Allocate a new RPC task, then run rpc_execute against it + * @clnt - pointer to RPC client + * @flags - RPC flags + * @ops - RPC call ops + * @data - user call data + */ +struct rpc_task *rpc_run_task(struct rpc_clnt *clnt, int flags, + const struct rpc_call_ops *ops, + void *data) +{ + struct rpc_task *task; + task = rpc_new_task(clnt, flags, ops, data); + if (task == NULL) + return ERR_PTR(-ENOMEM); + atomic_inc(&task->tk_count); + rpc_execute(task); + return task; +} +EXPORT_SYMBOL(rpc_run_task); + +/** * rpc_find_parent - find the parent of a child task. * @child: child task * @@ -890,12 +937,11 @@ void rpc_release_task(struct rpc_task *t * * Caller must hold childq.lock */ -static inline struct rpc_task *rpc_find_parent(struct rpc_task *child) +static inline struct rpc_task *rpc_find_parent(struct rpc_task *child, struct rpc_task *parent) { - struct rpc_task *task, *parent; + struct rpc_task *task; struct list_head *le; - parent = (struct rpc_task *) child->tk_calldata; task_for_each(task, le, &childq.tasks[0]) if (task == parent) return parent; @@ -903,18 +949,22 @@ static inline struct rpc_task *rpc_find_ return NULL; } -static void rpc_child_exit(struct rpc_task *child) +static void rpc_child_exit(struct rpc_task *child, void *calldata) { struct rpc_task *parent; spin_lock_bh(&childq.lock); - if ((parent = rpc_find_parent(child)) != NULL) { + if ((parent = rpc_find_parent(child, calldata)) != NULL) { parent->tk_status = child->tk_status; __rpc_wake_up_task(parent); } spin_unlock_bh(&childq.lock); } +static const struct rpc_call_ops rpc_child_ops = { + .rpc_call_done = rpc_child_exit, +}; + /* * Note: rpc_new_task releases the client after a failure. */ @@ -923,11 +973,9 @@ rpc_new_child(struct rpc_clnt *clnt, str { struct rpc_task *task; - task = rpc_new_task(clnt, NULL, RPC_TASK_ASYNC | RPC_TASK_CHILD); + task = rpc_new_task(clnt, RPC_TASK_ASYNC | RPC_TASK_CHILD, &rpc_child_ops, parent); if (!task) goto fail; - task->tk_exit = rpc_child_exit; - task->tk_calldata = parent; return task; fail: @@ -1063,7 +1111,7 @@ void rpc_show_tasks(void) return; } printk("-pid- proc flgs status -client- -prog- --rqstp- -timeout " - "-rpcwait -action- --exit--\n"); + "-rpcwait -action- ---ops--\n"); alltask_for_each(t, le, &all_tasks) { const char *rpc_waitq = "none"; @@ -1078,7 +1126,7 @@ void rpc_show_tasks(void) (t->tk_client ? t->tk_client->cl_prog : 0), t->tk_rqstp, t->tk_timeout, rpc_waitq, - t->tk_action, t->tk_exit); + t->tk_action, t->tk_ops); } spin_unlock(&rpc_sched_lock); } diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c index a03d4b6..9f73732 100644 --- a/net/sunrpc/sunrpc_syms.c +++ b/net/sunrpc/sunrpc_syms.c @@ -30,8 +30,6 @@ EXPORT_SYMBOL(rpc_init_task); EXPORT_SYMBOL(rpc_sleep_on); EXPORT_SYMBOL(rpc_wake_up_next); EXPORT_SYMBOL(rpc_wake_up_task); -EXPORT_SYMBOL(rpc_new_child); -EXPORT_SYMBOL(rpc_run_child); EXPORT_SYMBOL(rpciod_down); EXPORT_SYMBOL(rpciod_up); EXPORT_SYMBOL(rpc_new_task); @@ -45,7 +43,6 @@ EXPORT_SYMBOL(rpc_clone_client); EXPORT_SYMBOL(rpc_bind_new_program); EXPORT_SYMBOL(rpc_destroy_client); EXPORT_SYMBOL(rpc_shutdown_client); -EXPORT_SYMBOL(rpc_release_client); EXPORT_SYMBOL(rpc_killall_tasks); EXPORT_SYMBOL(rpc_call_sync); EXPORT_SYMBOL(rpc_call_async); @@ -120,7 +117,6 @@ EXPORT_SYMBOL(unix_domain_find); /* Generic XDR */ EXPORT_SYMBOL(xdr_encode_string); -EXPORT_SYMBOL(xdr_decode_string); EXPORT_SYMBOL(xdr_decode_string_inplace); EXPORT_SYMBOL(xdr_decode_netobj); EXPORT_SYMBOL(xdr_encode_netobj); diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c index aaf08cd..ca4bfa5 100644 --- a/net/sunrpc/xdr.c +++ b/net/sunrpc/xdr.c @@ -93,27 +93,6 @@ xdr_encode_string(u32 *p, const char *st } u32 * -xdr_decode_string(u32 *p, char **sp, int *lenp, int maxlen) -{ - unsigned int len; - char *string; - - if ((len = ntohl(*p++)) > maxlen) - return NULL; - if (lenp) - *lenp = len; - if ((len % 4) != 0) { - string = (char *) p; - } else { - string = (char *) (p - 1); - memmove(string, p, len); - } - string[len] = '\0'; - *sp = string; - return p + XDR_QUADLEN(len); -} - -u32 * xdr_decode_string_inplace(u32 *p, char **sp, int *lenp, int maxlen) { unsigned int len; diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index 6dda386..8ff2c8a 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -119,6 +119,17 @@ out_sleep: return 0; } +static void xprt_clear_locked(struct rpc_xprt *xprt) +{ + xprt->snd_task = NULL; + if (!test_bit(XPRT_CLOSE_WAIT, &xprt->state) || xprt->shutdown) { + smp_mb__before_clear_bit(); + clear_bit(XPRT_LOCKED, &xprt->state); + smp_mb__after_clear_bit(); + } else + schedule_work(&xprt->task_cleanup); +} + /* * xprt_reserve_xprt_cong - serialize write access to transports * @task: task that is requesting access to the transport @@ -145,9 +156,7 @@ int xprt_reserve_xprt_cong(struct rpc_ta } return 1; } - smp_mb__before_clear_bit(); - clear_bit(XPRT_LOCKED, &xprt->state); - smp_mb__after_clear_bit(); + xprt_clear_locked(xprt); out_sleep: dprintk("RPC: %4d failed to lock transport %p\n", task->tk_pid, xprt); task->tk_timeout = 0; @@ -193,9 +202,7 @@ static void __xprt_lock_write_next(struc return; out_unlock: - smp_mb__before_clear_bit(); - clear_bit(XPRT_LOCKED, &xprt->state); - smp_mb__after_clear_bit(); + xprt_clear_locked(xprt); } static void __xprt_lock_write_next_cong(struct rpc_xprt *xprt) @@ -222,9 +229,7 @@ static void __xprt_lock_write_next_cong( return; } out_unlock: - smp_mb__before_clear_bit(); - clear_bit(XPRT_LOCKED, &xprt->state); - smp_mb__after_clear_bit(); + xprt_clear_locked(xprt); } /** @@ -237,10 +242,7 @@ out_unlock: void xprt_release_xprt(struct rpc_xprt *xprt, struct rpc_task *task) { if (xprt->snd_task == task) { - xprt->snd_task = NULL; - smp_mb__before_clear_bit(); - clear_bit(XPRT_LOCKED, &xprt->state); - smp_mb__after_clear_bit(); + xprt_clear_locked(xprt); __xprt_lock_write_next(xprt); } } @@ -256,10 +258,7 @@ void xprt_release_xprt(struct rpc_xprt * void xprt_release_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task) { if (xprt->snd_task == task) { - xprt->snd_task = NULL; - smp_mb__before_clear_bit(); - clear_bit(XPRT_LOCKED, &xprt->state); - smp_mb__after_clear_bit(); + xprt_clear_locked(xprt); __xprt_lock_write_next_cong(xprt); } } @@ -535,10 +534,6 @@ void xprt_connect(struct rpc_task *task) dprintk("RPC: %4d xprt_connect xprt %p %s connected\n", task->tk_pid, xprt, (xprt_connected(xprt) ? "is" : "is not")); - if (xprt->shutdown) { - task->tk_status = -EIO; - return; - } if (!xprt->addr.sin_port) { task->tk_status = -EIO; return; @@ -687,9 +682,6 @@ int xprt_prepare_transmit(struct rpc_tas dprintk("RPC: %4d xprt_prepare_transmit\n", task->tk_pid); - if (xprt->shutdown) - return -EIO; - spin_lock_bh(&xprt->transport_lock); if (req->rq_received && !req->rq_bytes_sent) { err = req->rq_received; @@ -814,11 +806,9 @@ void xprt_reserve(struct rpc_task *task) struct rpc_xprt *xprt = task->tk_xprt; task->tk_status = -EIO; - if (!xprt->shutdown) { - spin_lock(&xprt->reserve_lock); - do_xprt_reserve(task); - spin_unlock(&xprt->reserve_lock); - } + spin_lock(&xprt->reserve_lock); + do_xprt_reserve(task); + spin_unlock(&xprt->reserve_lock); } static inline u32 xprt_alloc_xid(struct rpc_xprt *xprt) @@ -838,6 +828,8 @@ static void xprt_request_init(struct rpc req->rq_timeout = xprt->timeout.to_initval; req->rq_task = task; req->rq_xprt = xprt; + req->rq_buffer = NULL; + req->rq_bufsize = 0; req->rq_xid = xprt_alloc_xid(xprt); req->rq_release_snd_buf = NULL; dprintk("RPC: %4d reserved req %p xid %08x\n", task->tk_pid, @@ -863,10 +855,11 @@ void xprt_release(struct rpc_task *task) if (!list_empty(&req->rq_list)) list_del(&req->rq_list); xprt->last_used = jiffies; - if (list_empty(&xprt->recv) && !xprt->shutdown) + if (list_empty(&xprt->recv)) mod_timer(&xprt->timer, xprt->last_used + xprt->idle_timeout); spin_unlock_bh(&xprt->transport_lock); + xprt->ops->buf_free(task); task->tk_rqstp = NULL; if (req->rq_release_snd_buf) req->rq_release_snd_buf(req); @@ -974,16 +967,6 @@ struct rpc_xprt *xprt_create_proto(int p return xprt; } -static void xprt_shutdown(struct rpc_xprt *xprt) -{ - xprt->shutdown = 1; - rpc_wake_up(&xprt->sending); - rpc_wake_up(&xprt->resend); - xprt_wake_pending_tasks(xprt, -EIO); - rpc_wake_up(&xprt->backlog); - del_timer_sync(&xprt->timer); -} - /** * xprt_destroy - destroy an RPC transport, killing off all requests. * @xprt: transport to destroy @@ -992,7 +975,8 @@ static void xprt_shutdown(struct rpc_xpr int xprt_destroy(struct rpc_xprt *xprt) { dprintk("RPC: destroying transport %p\n", xprt); - xprt_shutdown(xprt); + xprt->shutdown = 1; + del_timer_sync(&xprt->timer); xprt->ops->destroy(xprt); kfree(xprt); diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 77e8800..c458f8d 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include @@ -424,7 +425,7 @@ static void xs_close(struct rpc_xprt *xp struct sock *sk = xprt->inet; if (!sk) - return; + goto clear_close_wait; dprintk("RPC: xs_close xprt %p\n", xprt); @@ -441,6 +442,10 @@ static void xs_close(struct rpc_xprt *xp sk->sk_no_check = 0; sock_release(sock); +clear_close_wait: + smp_mb__before_clear_bit(); + clear_bit(XPRT_CLOSE_WAIT, &xprt->state); + smp_mb__after_clear_bit(); } /** @@ -800,9 +805,13 @@ static void xs_tcp_state_change(struct s case TCP_SYN_SENT: case TCP_SYN_RECV: break; + case TCP_CLOSE_WAIT: + /* Try to schedule an autoclose RPC calls */ + set_bit(XPRT_CLOSE_WAIT, &xprt->state); + if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0) + schedule_work(&xprt->task_cleanup); default: xprt_disconnect(xprt); - break; } out: read_unlock(&sk->sk_callback_lock); @@ -920,6 +929,18 @@ static void xs_udp_timer(struct rpc_task xprt_adjust_cwnd(task, -ETIMEDOUT); } +/** + * xs_set_port - reset the port number in the remote endpoint address + * @xprt: generic transport + * @port: new port number + * + */ +static void xs_set_port(struct rpc_xprt *xprt, unsigned short port) +{ + dprintk("RPC: setting port for xprt %p to %u\n", xprt, port); + xprt->addr.sin_port = htons(port); +} + static int xs_bindresvport(struct rpc_xprt *xprt, struct socket *sock) { struct sockaddr_in myaddr = { @@ -1160,7 +1181,10 @@ static struct rpc_xprt_ops xs_udp_ops = .set_buffer_size = xs_udp_set_buffer_size, .reserve_xprt = xprt_reserve_xprt_cong, .release_xprt = xprt_release_xprt_cong, + .set_port = xs_set_port, .connect = xs_connect, + .buf_alloc = rpc_malloc, + .buf_free = rpc_free, .send_request = xs_udp_send_request, .set_retrans_timeout = xprt_set_retrans_timeout_rtt, .timer = xs_udp_timer, @@ -1172,7 +1196,10 @@ static struct rpc_xprt_ops xs_udp_ops = static struct rpc_xprt_ops xs_tcp_ops = { .reserve_xprt = xprt_reserve_xprt, .release_xprt = xprt_release_xprt, + .set_port = xs_set_port, .connect = xs_connect, + .buf_alloc = rpc_malloc, + .buf_free = rpc_free, .send_request = xs_tcp_send_request, .set_retrans_timeout = xprt_set_retrans_timeout_def, .close = xs_close,