GIT 7a27d53793bcabf576974eee449c5e7d0d82c34f git://git.linux-nfs.org/pub/linux/nfs-2.6.git commit 7a27d53793bcabf576974eee449c5e7d0d82c34f Author: Adrian Bunk Date: Thu Apr 27 23:57:53 2006 -0400 NFS: make nfs_follow_referral() static This patch makes the needlessly global nfs_follow_referral() static. Signed-off-by: Adrian Bunk Signed-off-by: Trond Myklebust commit 615c91c8d7fd3723c314cee7c439c32fb31dd8b7 Author: Trond Myklebust Date: Wed Apr 26 23:11:59 2006 -0400 NFS: Fix typo in nfs_do_clone_mount() Doh! Signed-off-by: Trond Myklebust commit 6640e276e0e003383739e6d86ef1a2bbfa877583 Author: Trond Myklebust Date: Sat Apr 22 17:06:31 2006 -0400 NFS: Fix compile errors introduced by referrals patches Signed-off-by: Trond Myklebust commit bde182754692ef20acb3c55354da92230155633c Author: Trond Myklebust Date: Fri Apr 21 11:40:23 2006 -0400 NFSv4: Ensure that referral mounts bind to a reserved port Signed-off-by: Trond Myklebust commit 35389290221999e4b7acb5790aa426426a1c505c Author: Andy Adamson Date: Fri Apr 21 11:40:18 2006 -0400 NFSv4: A root pathname is sent as a zero component4 Signed-off-by: Trond Myklebust commit f55625e95464102c1c4a5202cb7ea2c8b9c83427 Author: Manoj Naik Date: Fri Apr 21 11:40:15 2006 -0400 NFSv4: Follow a referral Respond to a moved error on NFS lookup by setting up the referral. Note: We don't actually follow the referral during lookup/getattr, but later when we detect fsid mismatch in inode revalidation (similar to the processing done for cloning submounts). Referrals will have fake attributes until they are actually followed or traversed. Signed-off-by: Manoj Naik Signed-off-by: Trond Myklebust commit 64f458a4bd205146bf423554499929fd55187880 Author: Manoj Naik Date: Fri Apr 21 11:37:59 2006 -0400 NFSv4: Ensure client submounts when following a referral Set up mountpoint when hitting a referral on moved error by getting fs_locations. Signed-off-by: Manoj Naik Signed-off-by: Trond Myklebust commit aa6ebcdbc46cc9f1e6942ba048fbec6f7cffd691 Author: Manoj Naik Date: Fri Apr 21 11:37:56 2006 -0400 NFS: Expand clone mounts to include other servers Signed-off-by: Manoj Naik Signed-off-by: Trond Myklebust commit 7977bbdfc9799528174d1e55e5e92019a73cfeb9 Author: Manoj Naik Date: Fri Apr 21 11:37:50 2006 -0400 NFSv4: Create NFSv4 transport and client Move existing code into a separate function so that it can be also used by referral code. Signed-off-by: Manoj Naik Signed-off-by: Trond Myklebust commit ab7e49d44f9cd2a85b8fcf99323f46c35792af72 Author: Manoj Naik Date: Fri Apr 21 11:37:46 2006 -0400 NFSv4: Define an fs_locations bitmap This is (similar to getattr bitmap) but includes fs_locations and mounted_on_fileid attributes. Use this bitmap for encoding in fs_locations requests. Note: We can probably do better by requesting locations as part of fsinfo itself. Signed-off-by: Manoj Naik Signed-off-by: Trond Myklebust commit 02740485ba70b8ad463b7802eaa32f03cb01b9dd Author: Manoj Naik Date: Fri Apr 21 11:37:42 2006 -0400 NFSv4: GETATTR attributes on referral Per referral draft, only fs_locations, fsid, and mounted_on_fileid can be requested in a GETATTR on referrals. Signed-off-by: Manoj Naik Signed-off-by: Trond Myklebust commit dc53090e9533216212a2631ea3507c06c3048d32 Author: Manoj Naik Date: Fri Apr 21 11:37:38 2006 -0400 NFSv4: Decode mounted_on_fileid attribute in getattr. It is ignored if fileid is also requested. This will be used on referrals (fs_locations). Signed-off-by: Manoj Naik Signed-off-by: Trond Myklebust commit d8e4895c516f5140710461e486084f6ba02b6888 Author: Manoj Naik Date: Fri Apr 21 11:37:35 2006 -0400 NFSv4: convert fs-locations-components to conform to RFC3530 Use component4-style formats for decoding list of servers and pathnames in fs_locations. Signed-off-by: Manoj Naik Signed-off-by: Trond Myklebust commit c1c08102d85fb18f9d1d44d5e6755d514348e930 Author: Trond Myklebust Date: Fri Apr 21 11:37:24 2006 -0400 NFSv4: Implement the fs_locations function call NFSv4 allows for the fact that filesystems may be replicated across several servers or that they may be migrated to a backup server in case of failure of the primary server. fs_locations is an NFSv4 operation for retrieving information about the location of migrated and/or replicated filesystems. Based on an initial implementation by Jiaying Zhang Signed-off-by: Trond Myklebust commit 974bdc1d08961e38ea09b7c2e3c428be2ae306c8 Author: Trond Myklebust Date: Fri Apr 21 11:37:21 2006 -0400 RPC: Allow struc xdr_stream to read the page section of an xdr_buf Signed-off-by: Trond Myklebust commit ca11972e59b9eb8ac8175a7fc410073d51027adf Author: Trond Myklebust Date: Fri Apr 21 11:37:17 2006 -0400 NFS: Add timeout to submounts Make automounted partitions expire using the mark_mounts_for_expiry() function. The timeout is controlled via a sysctl. Signed-off-by: Trond Myklebust commit cfedee1e7ff6158091f338de941d153066be594b Author: Trond Myklebust Date: Fri Apr 21 11:37:13 2006 -0400 NFS: Ensure the client submounts, when it crosses a server mountpoint. Signed-off-by: Trond Myklebust commit 0bc27dcec6c2d099cf3a44b1b5abeb128c3107bf Author: Trond Myklebust Date: Fri Apr 21 11:37:09 2006 -0400 NFS: Store the file system "fsid" value in the NFS super block. This should enable us to detect if we are crossing a mountpoint in the case where the server is exporting "nohide" mounts. Signed-off-by: Trond Myklebust commit 24c7b8f8e67a7483bb4a5709c2c1d28b61cea586 Author: Trond Myklebust Date: Fri Apr 21 11:37:05 2006 -0400 VFS: Remove dependency of ->umount_begin() call on MNT_FORCE Allow filesystems to decide to perform pre-umount processing whether or not MNT_FORCE is set. Signed-off-by: Trond Myklebust commit e237986258ac5765b1c044ba488f9abdb5e6916c Author: Trond Myklebust Date: Fri Apr 21 11:37:00 2006 -0400 VFS: Add shrink_submounts() Allow a submount to be marked as being 'shrinkable' by means of the vfsmount->mnt_flags, and then add a function 'shrink_submounts()' which attempts to recursively unmount these submounts. Signed-off-by: Trond Myklebust commit 25e122d925ec203563923b1dba70be16418586fc Author: Trond Myklebust Date: Fri Apr 21 11:36:55 2006 -0400 VFS: Unexport do_kern_mount() and clean up simple_pin_fs() Replace all module uses with the new vfs_kern_mount() interface, and fix up simple_pin_fs(). Signed-off-by: Trond Myklebust commit 3968118865e78a9f4899dd9f98d1554e7f9e057b Author: Trond Myklebust Date: Fri Apr 21 11:36:49 2006 -0400 VFS: Add GPL_EXPORTED function vfs_kern_mount() do_kern_mount() does not allow the kernel to use private mount interfaces without exposing the same interfaces to userland. The problem is that the filesystem is referenced by name, thus meaning that it and its mount interface must be registered in the global filesystem list. vfs_kern_mount() passes the struct file_system_type as an explicit parameter in order to overcome this limitation. Signed-off-by: Trond Myklebust commit ae0e6c8ffc4a80af20dfa04eef9f8fe3a38a7b58 Author: Chuck Lever Date: Fri Apr 21 11:43:16 2006 -0400 SUNRPC: NFS_ROOT always uses the same XIDs The XID generator uses get_random_bytes to generate an initial XID. NFS_ROOT starts up before the random driver, though, so get_random_bytes doesn't set a random XID for NFS_ROOT. This causes NFS_ROOT mount points to reuse XIDs every time the client is booted. If the client boots often enough, the server will start serving old replies out of its DRC. Use net_random() instead. Test plan: I/O intensive workloads should perform well and generate no errors. Traces taken during client reboots should show that NFS_ROOT mounts use unique XIDs after every reboot. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 56b83cce99d496d54c69a4a42e7b2ccf45391aa8 Author: Chuck Lever Date: Fri Apr 21 11:43:12 2006 -0400 SUNRPC: select privileged port numbers at random Make the RPC client select privileged ephemeral source ports at random. This improves DRC behavior on the server by using the same port when reconnecting for the same mount point, but using a different port for fresh mounts. The Linux TCP implementation already does this for nonprivileged ports. Note that TCP sockets in TIME_WAIT will prevent quick reuse of a random ephemeral port number by leaving the port INUSE until the connection transitions out of TIME_WAIT. Test plan: Connectathon against every known server implementation using multiple mount points. Locking especially. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 6677466178e392c047cfc6dcb554c5e14e2552fe Author: Chuck Lever Date: Fri Apr 21 11:42:55 2006 -0400 NFS: Optimize allocation of nfs_read/write_data structures Clean up use of page_array, and fix an off-by-one error noticed by Tom Talpey which causes kmalloc calls in cases where using the page_array is sufficient. Test plan: Normal client functional testing with r/wsize=32768. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust commit 9026b4fff14d52badb08a16b5f0eeac28b66775a Author: Trond Myklebust Date: Fri Apr 21 11:39:40 2006 -0400 NFS: Clean up inode metadata updates Signed-off-by: Trond Myklebust commit 62b579fb12f45be4e28ab3b358959f0d10a01629 Author: Trond Myklebust Date: Fri Apr 21 11:39:33 2006 -0400 NFSv4: Some NFSv4 servers have broken behaviour for the change attribute The Linux NFSv4 server violates RFC3530 in that the change attribute is not guaranteed to be updated for every change to the inode. Our optimisation for checking whether or not the inode metadata has changed or not is broken too. Grr.... Signed-off-by: Trond Myklebust --- Signed-off-by: Andrew Morton --- Documentation/filesystems/automount-support.txt | 2 drivers/usb/core/inode.c | 2 fs/9p/vfs_super.c | 7 fs/afs/mntpt.c | 2 fs/afs/super.c | 2 fs/afs/super.h | 2 fs/binfmt_misc.c | 3 fs/cifs/cifsfs.c | 6 fs/configfs/mount.c | 2 fs/debugfs/inode.c | 2 fs/fuse/inode.c | 5 fs/libfs.c | 4 fs/namespace.c | 128 +- fs/nfs/Makefile | 3 fs/nfs/dir.c | 16 fs/nfs/idmap.c | 1 fs/nfs/inode.c | 795 ++++++++++++-- fs/nfs/namespace.c | 119 ++ fs/nfs/nfs2xdr.c | 3 fs/nfs/nfs3xdr.c | 3 fs/nfs/nfs4_fs.h | 4 fs/nfs/nfs4proc.c | 99 + fs/nfs/nfs4xdr.c | 215 +++ fs/nfs/read.c | 11 fs/nfs/sysctl.c | 10 fs/nfs/write.c | 18 fs/super.c | 24 include/linux/fs.h | 4 include/linux/mount.h | 8 include/linux/nfs4.h | 1 include/linux/nfs_fs.h | 20 include/linux/nfs_fs_sb.h | 1 include/linux/nfs_page.h | 1 include/linux/nfs_xdr.h | 61 - include/linux/sunrpc/xdr.h | 1 mm/shmem.c | 2 net/sunrpc/rpc_pipe.c | 2 net/sunrpc/xdr.c | 28 net/sunrpc/xprt.c | 4 net/sunrpc/xprtsock.c | 11 security/inode.c | 2 41 files changed, 1438 insertions(+), 196 deletions(-) diff -puN Documentation/filesystems/automount-support.txt~git-nfs Documentation/filesystems/automount-support.txt --- devel/Documentation/filesystems/automount-support.txt~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/Documentation/filesystems/automount-support.txt 2006-04-30 00:30:17.000000000 -0700 @@ -19,7 +19,7 @@ following procedure: (2) Have the follow_link() op do the following steps: - (a) Call do_kern_mount() to call the appropriate filesystem to set up a + (a) Call vfs_kern_mount() to call the appropriate filesystem to set up a superblock and gain a vfsmount structure representing it. (b) Copy the nameidata provided as an argument and substitute the dentry diff -puN drivers/usb/core/inode.c~git-nfs drivers/usb/core/inode.c --- devel/drivers/usb/core/inode.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/drivers/usb/core/inode.c 2006-04-30 00:30:17.000000000 -0700 @@ -569,7 +569,7 @@ static int create_special_files (void) ignore_mount = 1; /* create the devices special file */ - retval = simple_pin_fs("usbfs", &usbfs_mount, &usbfs_mount_count); + retval = simple_pin_fs(&usb_fs_type, &usbfs_mount, &usbfs_mount_count); if (retval) { err ("Unable to get usbfs mount"); goto exit; diff -puN fs/9p/vfs_super.c~git-nfs fs/9p/vfs_super.c --- devel/fs/9p/vfs_super.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/9p/vfs_super.c 2006-04-30 00:30:17.000000000 -0700 @@ -253,11 +253,12 @@ static int v9fs_show_options(struct seq_ } static void -v9fs_umount_begin(struct super_block *sb) +v9fs_umount_begin(struct vfsmount *vfsmnt, int flags) { - struct v9fs_session_info *v9ses = sb->s_fs_info; + struct v9fs_session_info *v9ses = vfsmnt->mnt_sb->s_fs_info; - v9fs_session_cancel(v9ses); + if (flags & MNT_FORCE) + v9fs_session_cancel(v9ses); } static struct super_operations v9fs_super_ops = { diff -puN fs/afs/mntpt.c~git-nfs fs/afs/mntpt.c --- devel/fs/afs/mntpt.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/afs/mntpt.c 2006-04-30 00:30:17.000000000 -0700 @@ -210,7 +210,7 @@ static struct vfsmount *afs_mntpt_do_aut /* try and do the mount */ kdebug("--- attempting mount %s -o %s ---", devname, options); - mnt = do_kern_mount("afs", 0, devname, options); + mnt = vfs_kern_mount(&afs_fs_type, 0, devname, options); kdebug("--- mount result %p ---", mnt); free_page((unsigned long) devname); diff -puN fs/afs/super.c~git-nfs fs/afs/super.c --- devel/fs/afs/super.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/afs/super.c 2006-04-30 00:30:17.000000000 -0700 @@ -48,7 +48,7 @@ static void afs_put_super(struct super_b static void afs_destroy_inode(struct inode *inode); -static struct file_system_type afs_fs_type = { +struct file_system_type afs_fs_type = { .owner = THIS_MODULE, .name = "afs", .get_sb = afs_get_sb, diff -puN fs/afs/super.h~git-nfs fs/afs/super.h --- devel/fs/afs/super.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/afs/super.h 2006-04-30 00:30:17.000000000 -0700 @@ -38,6 +38,8 @@ static inline struct afs_super_info *AFS return sb->s_fs_info; } +extern struct file_system_type afs_fs_type; + #endif /* __KERNEL__ */ #endif /* _LINUX_AFS_SUPER_H */ diff -puN fs/binfmt_misc.c~git-nfs fs/binfmt_misc.c --- devel/fs/binfmt_misc.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/binfmt_misc.c 2006-04-30 00:30:17.000000000 -0700 @@ -55,6 +55,7 @@ typedef struct { } Node; static DEFINE_RWLOCK(entries_lock); +static struct file_system_type bm_fs_type; static struct vfsmount *bm_mnt; static int entry_count; @@ -638,7 +639,7 @@ static ssize_t bm_register_write(struct if (!inode) goto out2; - err = simple_pin_fs("binfmt_misc", &bm_mnt, &entry_count); + err = simple_pin_fs(&bm_fs_type, &bm_mnt, &entry_count); if (err) { iput(inode); inode = NULL; diff -puN fs/cifs/cifsfs.c~git-nfs fs/cifs/cifsfs.c --- devel/fs/cifs/cifsfs.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/cifs/cifsfs.c 2006-04-30 00:30:17.000000000 -0700 @@ -402,12 +402,14 @@ static struct quotactl_ops cifs_quotactl #endif #ifdef CONFIG_CIFS_EXPERIMENTAL -static void cifs_umount_begin(struct super_block * sblock) +static void cifs_umount_begin(struct vfsmount * vfsmnt, int flags) { struct cifs_sb_info *cifs_sb; struct cifsTconInfo * tcon; - cifs_sb = CIFS_SB(sblock); + if (!(flags & MNT_FORCE)) + return; + cifs_sb = CIFS_SB(vfsmnt->mnt_sb); if(cifs_sb == NULL) return; diff -puN fs/configfs/mount.c~git-nfs fs/configfs/mount.c --- devel/fs/configfs/mount.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/configfs/mount.c 2006-04-30 00:30:17.000000000 -0700 @@ -118,7 +118,7 @@ static struct file_system_type configfs_ int configfs_pin_fs(void) { - return simple_pin_fs("configfs", &configfs_mount, + return simple_pin_fs(&configfs_fs_type, &configfs_mount, &configfs_mnt_count); } diff -puN fs/debugfs/inode.c~git-nfs fs/debugfs/inode.c --- devel/fs/debugfs/inode.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/debugfs/inode.c 2006-04-30 00:30:17.000000000 -0700 @@ -198,7 +198,7 @@ struct dentry *debugfs_create_file(const pr_debug("debugfs: creating file '%s'\n",name); - error = simple_pin_fs("debugfs", &debugfs_mount, &debugfs_mount_count); + error = simple_pin_fs(&debug_fs_type, &debugfs_mount, &debugfs_mount_count); if (error) goto exit; diff -puN fs/fuse/inode.c~git-nfs fs/fuse/inode.c --- devel/fs/fuse/inode.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/fuse/inode.c 2006-04-30 00:30:17.000000000 -0700 @@ -195,9 +195,10 @@ struct inode *fuse_iget(struct super_blo return inode; } -static void fuse_umount_begin(struct super_block *sb) +static void fuse_umount_begin(struct vfsmount *vfsmnt, int flags) { - fuse_abort_conn(get_fuse_conn_super(sb)); + if (flags & MNT_FORCE) + fuse_abort_conn(get_fuse_conn_super(vfsmnt->mnt_sb)); } static void fuse_put_super(struct super_block *sb) diff -puN fs/libfs.c~git-nfs fs/libfs.c --- devel/fs/libfs.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/libfs.c 2006-04-30 00:30:17.000000000 -0700 @@ -424,13 +424,13 @@ out: static DEFINE_SPINLOCK(pin_fs_lock); -int simple_pin_fs(char *name, struct vfsmount **mount, int *count) +int simple_pin_fs(struct file_system_type *type, struct vfsmount **mount, int *count) { struct vfsmount *mnt = NULL; spin_lock(&pin_fs_lock); if (unlikely(!*mount)) { spin_unlock(&pin_fs_lock); - mnt = do_kern_mount(name, 0, name, NULL); + mnt = vfs_kern_mount(type, 0, type->name, NULL); if (IS_ERR(mnt)) return PTR_ERR(mnt); spin_lock(&pin_fs_lock); diff -puN fs/namespace.c~git-nfs fs/namespace.c --- devel/fs/namespace.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/namespace.c 2006-04-30 00:30:17.000000000 -0700 @@ -576,8 +576,8 @@ static int do_umount(struct vfsmount *mn */ lock_kernel(); - if ((flags & MNT_FORCE) && sb->s_op->umount_begin) - sb->s_op->umount_begin(sb); + if (sb->s_op->umount_begin) + sb->s_op->umount_begin(mnt, flags); unlock_kernel(); /* @@ -1166,13 +1166,46 @@ static void expire_mount(struct vfsmount } /* + * go through the vfsmounts we've just consigned to the graveyard to + * - check that they're still dead + * - delete the vfsmount from the appropriate namespace under lock + * - dispose of the corpse + */ +static void expire_mount_list(struct list_head *graveyard, struct list_head *mounts) +{ + struct namespace *namespace; + struct vfsmount *mnt; + + while (!list_empty(graveyard)) { + LIST_HEAD(umounts); + mnt = list_entry(graveyard->next, struct vfsmount, mnt_expire); + list_del_init(&mnt->mnt_expire); + + /* don't do anything if the namespace is dead - all the + * vfsmounts from it are going away anyway */ + namespace = mnt->mnt_namespace; + if (!namespace || !namespace->root) + continue; + get_namespace(namespace); + + spin_unlock(&vfsmount_lock); + down_write(&namespace_sem); + expire_mount(mnt, mounts, &umounts); + up_write(&namespace_sem); + release_mounts(&umounts); + mntput(mnt); + put_namespace(namespace); + spin_lock(&vfsmount_lock); + } +} + +/* * process a list of expirable mountpoints with the intent of discarding any * mountpoints that aren't in use and haven't been touched since last we came * here */ void mark_mounts_for_expiry(struct list_head *mounts) { - struct namespace *namespace; struct vfsmount *mnt, *next; LIST_HEAD(graveyard); @@ -1196,38 +1229,79 @@ void mark_mounts_for_expiry(struct list_ list_move(&mnt->mnt_expire, &graveyard); } - /* - * go through the vfsmounts we've just consigned to the graveyard to - * - check that they're still dead - * - delete the vfsmount from the appropriate namespace under lock - * - dispose of the corpse - */ - while (!list_empty(&graveyard)) { - LIST_HEAD(umounts); - mnt = list_entry(graveyard.next, struct vfsmount, mnt_expire); - list_del_init(&mnt->mnt_expire); + expire_mount_list(&graveyard, mounts); - /* don't do anything if the namespace is dead - all the - * vfsmounts from it are going away anyway */ - namespace = mnt->mnt_namespace; - if (!namespace || !namespace->root) + spin_unlock(&vfsmount_lock); +} + +EXPORT_SYMBOL_GPL(mark_mounts_for_expiry); + +/* + * Ripoff of 'select_parent()' + * + * search the list of submounts for a given mountpoint, and move any + * shrinkable submounts to the 'graveyard' list. + */ +static int select_submounts(struct vfsmount *parent, struct list_head *graveyard) +{ + struct vfsmount *this_parent = parent; + struct list_head *next; + int found = 0; + +repeat: + next = this_parent->mnt_mounts.next; +resume: + while (next != &this_parent->mnt_mounts) { + struct list_head *tmp = next; + struct vfsmount *mnt = list_entry(tmp, struct vfsmount, mnt_child); + + next = tmp->next; + if (!(mnt->mnt_flags & MNT_SHRINKABLE)) continue; - get_namespace(namespace); + /* + * Descend a level if the d_mounts list is non-empty. + */ + if (!list_empty(&mnt->mnt_mounts)) { + this_parent = mnt; + goto repeat; + } - spin_unlock(&vfsmount_lock); - down_write(&namespace_sem); - expire_mount(mnt, mounts, &umounts); - up_write(&namespace_sem); - release_mounts(&umounts); - mntput(mnt); - put_namespace(namespace); - spin_lock(&vfsmount_lock); + if (!propagate_mount_busy(mnt, 1)) { + mntget(mnt); + list_move_tail(&mnt->mnt_expire, graveyard); + found++; + } } + /* + * All done at this level ... ascend and resume the search + */ + if (this_parent != parent) { + next = this_parent->mnt_child.next; + this_parent = this_parent->mnt_parent; + goto resume; + } + return found; +} + +/* + * process a list of expirable mountpoints with the intent of discarding any + * submounts of a specific parent mountpoint + */ +void shrink_submounts(struct vfsmount *mountpoint, struct list_head *mounts) +{ + LIST_HEAD(graveyard); + int found; + + spin_lock(&vfsmount_lock); + + /* extract submounts of 'mountpoint' from the expiration list */ + while ((found = select_submounts(mountpoint, &graveyard)) != 0) + expire_mount_list(&graveyard, mounts); spin_unlock(&vfsmount_lock); } -EXPORT_SYMBOL_GPL(mark_mounts_for_expiry); +EXPORT_SYMBOL_GPL(shrink_submounts); /* * Some copy_from_user() implementations do not return the exact number of diff -puN fs/nfs/dir.c~git-nfs fs/nfs/dir.c --- devel/fs/nfs/dir.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/dir.c 2006-04-30 00:30:17.000000000 -0700 @@ -868,6 +868,17 @@ int nfs_is_exclusive_create(struct inode return (nd->intent.open.flags & O_EXCL) != 0; } +static inline int nfs_reval_fsid(struct inode *dir, + struct nfs_fh *fh, struct nfs_fattr *fattr) +{ + struct nfs_server *server = NFS_SERVER(dir); + + if (!nfs_fsid_equal(&server->fsid, &fattr->fsid)) + /* Revalidate fsid on root dir */ + return __nfs_revalidate_inode(server, dir->i_sb->s_root->d_inode); + return 0; +} + static struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, struct nameidata *nd) { struct dentry *res; @@ -900,6 +911,11 @@ static struct dentry *nfs_lookup(struct res = ERR_PTR(error); goto out_unlock; } + error = nfs_reval_fsid(dir, &fhandle, &fattr); + if (error < 0) { + res = ERR_PTR(error); + goto out_unlock; + } inode = nfs_fhget(dentry->d_sb, &fhandle, &fattr); res = (struct dentry *)inode; if (IS_ERR(res)) diff -puN fs/nfs/idmap.c~git-nfs fs/nfs/idmap.c --- devel/fs/nfs/idmap.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/idmap.c 2006-04-30 00:30:17.000000000 -0700 @@ -47,7 +47,6 @@ #include #include -#include #include #include diff -puN fs/nfs/inode.c~git-nfs fs/nfs/inode.c --- devel/fs/nfs/inode.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/inode.c 2006-04-30 00:30:17.000000000 -0700 @@ -36,6 +36,8 @@ #include #include #include +#include +#include #include #include @@ -64,7 +66,7 @@ static void nfs_destroy_inode(struct ino static int nfs_write_inode(struct inode *,int); static void nfs_delete_inode(struct inode *); static void nfs_clear_inode(struct inode *); -static void nfs_umount_begin(struct super_block *); +static void nfs_umount_begin(struct vfsmount *, int); static int nfs_statfs(struct super_block *, struct kstatfs *); static int nfs_show_options(struct seq_file *, struct vfsmount *); static int nfs_show_stats(struct seq_file *, struct vfsmount *); @@ -179,15 +181,20 @@ nfs_clear_inode(struct inode *inode) BUG_ON(atomic_read(&nfsi->data_updates) != 0); } -void -nfs_umount_begin(struct super_block *sb) +static void nfs_umount_begin(struct vfsmount *vfsmnt, int flags) { - struct rpc_clnt *rpc = NFS_SB(sb)->client; + struct nfs_server *server; + struct rpc_clnt *rpc; + shrink_submounts(vfsmnt, &nfs_automount_list); + if (!(flags & MNT_FORCE)) + return; /* -EIO all pending I/O */ + server = NFS_SB(vfsmnt->mnt_sb); + rpc = server->client; if (!IS_ERR(rpc)) rpc_killall_tasks(rpc); - rpc = NFS_SB(sb)->client_acl; + rpc = server->client_acl; if (!IS_ERR(rpc)) rpc_killall_tasks(rpc); } @@ -234,6 +241,14 @@ nfs_block_size(unsigned long bsize, unsi return nfs_block_bits(bsize, nrbitsp); } +static inline void +nfs_super_set_maxbytes(struct super_block *sb, __u64 maxfilesize) +{ + sb->s_maxbytes = (loff_t)maxfilesize; + if (sb->s_maxbytes > MAX_LFS_FILESIZE || sb->s_maxbytes <= 0) + sb->s_maxbytes = MAX_LFS_FILESIZE; +} + /* * Obtain the root inode of the file system. */ @@ -249,6 +264,7 @@ nfs_get_root(struct super_block *sb, str return ERR_PTR(error); } + server->fsid = fsinfo->fattr->fsid; return nfs_fhget(sb, rootfh, fsinfo->fattr); } @@ -343,9 +359,7 @@ nfs_sb_init(struct super_block *sb, rpc_ } server->backing_dev_info.ra_pages = server->rpages * NFS_MAX_READAHEAD; - sb->s_maxbytes = fsinfo.maxfilesize; - if (sb->s_maxbytes > MAX_LFS_FILESIZE) - sb->s_maxbytes = MAX_LFS_FILESIZE; + nfs_super_set_maxbytes(sb, fsinfo.maxfilesize); server->client->cl_intr = (server->flags & NFS_MOUNT_INTR) ? 1 : 0; server->client->cl_softrtry = (server->flags & NFS_MOUNT_SOFT) ? 1 : 0; @@ -889,6 +903,14 @@ nfs_fhget(struct super_block *sb, struct if (nfs_server_capable(inode, NFS_CAP_READDIRPLUS) && fattr->size <= NFS_LIMIT_READDIRPLUS) set_bit(NFS_INO_ADVISE_RDPLUS, &NFS_FLAGS(inode)); + /* Deal with crossing mountpoints */ + if (!nfs_fsid_equal(&NFS_SB(sb)->fsid, &fattr->fsid)) { + if (fattr->valid & NFS_ATTR_FATTR_V4_REFERRAL) + inode->i_op = &nfs_referral_inode_operations; + else + inode->i_op = &nfs_mountpoint_inode_operations; + inode->i_fop = NULL; + } } else if (S_ISLNK(inode->i_mode)) inode->i_op = &nfs_symlink_inode_operations; else @@ -1360,12 +1382,6 @@ static void nfs_wcc_update_inode(struct { struct nfs_inode *nfsi = NFS_I(inode); - if ((fattr->valid & NFS_ATTR_PRE_CHANGE) != 0 - && nfsi->change_attr == fattr->pre_change_attr) { - nfsi->change_attr = fattr->change_attr; - nfsi->cache_change_attribute = jiffies; - } - /* If we have atomic WCC data, we may update some attributes */ if ((fattr->valid & NFS_ATTR_WCC) != 0) { if (timespec_equal(&inode->i_ctime, &fattr->pre_ctime)) { @@ -1399,9 +1415,6 @@ static int nfs_check_inode_attributes(st int data_unstable; - if ((fattr->valid & NFS_ATTR_FATTR) == 0) - return 0; - /* Has the inode gone and changed behind our back? */ if (nfsi->fileid != fattr->fileid || (inode->i_mode & S_IFMT) != (fattr->mode & S_IFMT)) { @@ -1414,9 +1427,8 @@ static int nfs_check_inode_attributes(st /* Do atomic weak cache consistency updates */ nfs_wcc_update_inode(inode, fattr); - if ((fattr->valid & NFS_ATTR_FATTR_V4) != 0) { - if (nfsi->change_attr == fattr->change_attr) - goto out; + if ((fattr->valid & NFS_ATTR_FATTR_V4) != 0 && + nfsi->change_attr != fattr->change_attr) { nfsi->cache_validity |= NFS_INO_INVALID_ATTR; if (!data_unstable) nfsi->cache_validity |= NFS_INO_REVAL_PAGECACHE; @@ -1444,7 +1456,6 @@ static int nfs_check_inode_attributes(st if (inode->i_nlink != fattr->nlink) nfsi->cache_validity |= NFS_INO_INVALID_ATTR; -out: if (!timespec_equal(&inode->i_atime, &fattr->atime)) nfsi->cache_validity |= NFS_INO_INVALID_ATIME; @@ -1518,6 +1529,7 @@ out: */ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr) { + struct nfs_server *server; struct nfs_inode *nfsi = NFS_I(inode); loff_t cur_isize, new_isize; unsigned int invalid = 0; @@ -1527,9 +1539,6 @@ static int nfs_update_inode(struct inode __FUNCTION__, inode->i_sb->s_id, inode->i_ino, atomic_read(&inode->i_count), fattr->valid); - if ((fattr->valid & NFS_ATTR_FATTR) == 0) - return 0; - if (nfsi->fileid != fattr->fileid) goto out_fileid; @@ -1539,6 +1548,12 @@ static int nfs_update_inode(struct inode if ((inode->i_mode & S_IFMT) != (fattr->mode & S_IFMT)) goto out_changed; + server = NFS_SERVER(inode); + /* Update the fsid if and only if this is the root directory */ + if (inode == inode->i_sb->s_root->d_inode + && !nfs_fsid_equal(&server->fsid, &fattr->fsid)) + server->fsid = fattr->fsid; + /* * Update the read time so we don't revalidate too often. */ @@ -1612,15 +1627,13 @@ static int nfs_update_inode(struct inode inode->i_blksize = fattr->du.nfs2.blocksize; } - if ((fattr->valid & NFS_ATTR_FATTR_V4)) { - if (nfsi->change_attr != fattr->change_attr) { - dprintk("NFS: change_attr change on server for file %s/%ld\n", - inode->i_sb->s_id, inode->i_ino); - nfsi->change_attr = fattr->change_attr; - invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA|NFS_INO_INVALID_ACCESS|NFS_INO_INVALID_ACL; - nfsi->cache_change_attribute = jiffies; - } else - invalid &= ~(NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA); + if ((fattr->valid & NFS_ATTR_FATTR_V4) != 0 && + nfsi->change_attr != fattr->change_attr) { + dprintk("NFS: change_attr change on server for file %s/%ld\n", + inode->i_sb->s_id, inode->i_ino); + nfsi->change_attr = fattr->change_attr; + invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA|NFS_INO_INVALID_ACCESS|NFS_INO_INVALID_ACL; + nfsi->cache_change_attribute = jiffies; } /* Update attrtimeo value if we're out of the unstable period */ @@ -1672,6 +1685,110 @@ static int nfs_update_inode(struct inode * File system information */ +/* + * nfs_path - reconstruct the path given an arbitrary dentry + * @base - arbitrary string to prepend to the path + * @dentry - pointer to dentry + * @buffer - result buffer + * @buflen - length of buffer + * + * Helper function for constructing the path from the + * root dentry to an arbitrary hashed dentry. + * + * This is mainly for use in figuring out the path on the + * server side when automounting on top of an existing partition. + */ +static char *nfs_path(const char *base, const struct dentry *dentry, + char *buffer, ssize_t buflen) +{ + char *end = buffer+buflen; + int namelen; + + *--end = '\0'; + buflen--; + spin_lock(&dcache_lock); + while (!IS_ROOT(dentry)) { + namelen = dentry->d_name.len; + buflen -= namelen + 1; + if (buflen < 0) + goto Elong; + end -= namelen; + memcpy(end, dentry->d_name.name, namelen); + *--end = '/'; + dentry = dentry->d_parent; + } + spin_unlock(&dcache_lock); + namelen = strlen(base); + /* Strip off excess slashes in base string */ + while (namelen > 0 && base[namelen - 1] == '/') + namelen--; + buflen -= namelen; + if (buflen < 0) + goto Elong; + end -= namelen; + memcpy(end, base, namelen); + return end; +Elong: + return ERR_PTR(-ENAMETOOLONG); +} + +struct nfs_clone_mount { + const struct super_block *sb; + const struct dentry *dentry; + struct nfs_fh *fh; + struct nfs_fattr *fattr; + char *hostname; + char *mnt_path; + struct sockaddr_in *addr; + rpc_authflavor_t authflavor; +}; + +static struct super_block *nfs_clone_generic_sb(struct nfs_clone_mount *data, + struct super_block *(*fill_sb)(struct nfs_server *, struct nfs_clone_mount *), + struct nfs_server *(*fill_server)(struct super_block *, struct nfs_clone_mount *)) +{ + struct nfs_server *server; + struct nfs_server *parent = NFS_SB(data->sb); + struct super_block *sb = ERR_PTR(-EINVAL); + void *err = ERR_PTR(-ENOMEM); + char *hostname; + int len; + + server = kmalloc(sizeof(struct nfs_server), GFP_KERNEL); + if (server == NULL) + goto out_err; + memcpy(server, parent, sizeof(*server)); + hostname = (data->hostname != NULL) ? data->hostname : parent->hostname; + len = strlen(hostname) + 1; + server->hostname = kmalloc(len, GFP_KERNEL); + if (server->hostname == NULL) + goto free_server; + memcpy(server->hostname, hostname, len); + if (rpciod_up() != 0) + goto free_hostname; + + sb = fill_sb(server, data); + if (IS_ERR((err = sb)) || sb->s_root) + goto kill_rpciod; + + server = fill_server(sb, data); + if (IS_ERR((err = server))) + goto out_deactivate; + return sb; +out_deactivate: + up_write(&sb->s_umount); + deactivate_super(sb); + return (struct super_block *)err; +kill_rpciod: + rpciod_down(); +free_hostname: + kfree(server->hostname); +free_server: + kfree(server); +out_err: + return (struct super_block *)err; +} + static int nfs_set_super(struct super_block *s, void *data) { s->s_fs_info = data; @@ -1819,6 +1936,7 @@ static void nfs_kill_super(struct super_ nfs_free_iostats(server->io_stats); kfree(server->hostname); kfree(server); + nfs_release_automount_timer(); } static struct file_system_type nfs_fs_type = { @@ -1829,6 +1947,83 @@ static struct file_system_type nfs_fs_ty .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, }; +static struct super_block *nfs_clone_sb(struct nfs_server *server, struct nfs_clone_mount *data) +{ + struct super_block *sb; + + server->fsid = data->fattr->fsid; + nfs_copy_fh(&server->fh, data->fh); + sb = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server); + if (!IS_ERR(sb) && sb->s_root == NULL && !(server->flags & NFS_MOUNT_NONLM)) + lockd_up(); + return sb; +} + +static struct nfs_server *nfs_clone_server(struct super_block *sb, struct nfs_clone_mount *data) +{ + struct nfs_server *server = NFS_SB(sb); + struct nfs_server *parent = NFS_SB(data->sb); + struct inode *root_inode; + struct nfs_fsinfo fsinfo; + void *err = ERR_PTR(-ENOMEM); + + sb->s_op = data->sb->s_op; + sb->s_blocksize = data->sb->s_blocksize; + sb->s_blocksize_bits = data->sb->s_blocksize_bits; + sb->s_maxbytes = data->sb->s_maxbytes; + + server->client_sys = server->client_acl = ERR_PTR(-EINVAL); + server->io_stats = nfs_alloc_iostats(); + if (server->io_stats == NULL) + goto out; + + server->client = rpc_clone_client(parent->client); + if (IS_ERR((err = server->client))) + goto out; + + if (!IS_ERR(parent->client_sys)) { + server->client_sys = rpc_clone_client(parent->client_sys); + if (IS_ERR((err = server->client_sys))) + goto out; + } + if (!IS_ERR(parent->client_acl)) { + server->client_acl = rpc_clone_client(parent->client_acl); + if (IS_ERR((err = server->client_acl))) + goto out; + } + root_inode = nfs_fhget(sb, data->fh, data->fattr); + if (!root_inode) + goto out; + sb->s_root = d_alloc_root(root_inode); + if (!sb->s_root) + goto out_put_root; + fsinfo.fattr = data->fattr; + if (NFS_PROTO(root_inode)->fsinfo(server, data->fh, &fsinfo) == 0) + nfs_super_set_maxbytes(sb, fsinfo.maxfilesize); + sb->s_root->d_op = server->rpc_ops->dentry_ops; + sb->s_flags |= MS_ACTIVE; + return server; +out_put_root: + iput(root_inode); +out: + return err; +} + +static struct super_block *nfs_clone_nfs_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *raw_data) +{ + struct nfs_clone_mount *data = raw_data; + return nfs_clone_generic_sb(data, nfs_clone_sb, nfs_clone_server); +} + +static struct file_system_type clone_nfs_fs_type = { + .owner = THIS_MODULE, + .name = "nfs", + .get_sb = nfs_clone_nfs_sb, + .kill_sb = nfs_kill_super, + .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, +}; + #ifdef CONFIG_NFS_V4 static void nfs4_clear_inode(struct inode *); @@ -1877,62 +2072,24 @@ static void nfs4_clear_inode(struct inod } -static int nfs4_fill_super(struct super_block *sb, struct nfs4_mount_data *data, int silent) +static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, + struct rpc_timeout *timeparms, int proto, rpc_authflavor_t flavor) { - struct nfs_server *server; - struct nfs4_client *clp = NULL; + struct nfs4_client *clp; struct rpc_xprt *xprt = NULL; struct rpc_clnt *clnt = NULL; - struct rpc_timeout timeparms; - rpc_authflavor_t authflavour; int err = -EIO; - sb->s_blocksize_bits = 0; - sb->s_blocksize = 0; - server = NFS_SB(sb); - if (data->rsize != 0) - server->rsize = nfs_block_size(data->rsize, NULL); - if (data->wsize != 0) - server->wsize = nfs_block_size(data->wsize, NULL); - server->flags = data->flags & NFS_MOUNT_FLAGMASK; - server->caps = NFS_CAP_ATOMIC_OPEN; - - server->acregmin = data->acregmin*HZ; - server->acregmax = data->acregmax*HZ; - server->acdirmin = data->acdirmin*HZ; - server->acdirmax = data->acdirmax*HZ; - - server->rpc_ops = &nfs_v4_clientops; - - nfs_init_timeout_values(&timeparms, data->proto, data->timeo, data->retrans); - - server->retrans_timeo = timeparms.to_initval; - server->retrans_count = timeparms.to_retries; - clp = nfs4_get_client(&server->addr.sin_addr); if (!clp) { dprintk("%s: failed to create NFS4 client.\n", __FUNCTION__); - return -EIO; + return ERR_PTR(err); } /* Now create transport and client */ - authflavour = RPC_AUTH_UNIX; - if (data->auth_flavourlen != 0) { - if (data->auth_flavourlen != 1) { - dprintk("%s: Invalid number of RPC auth flavours %d.\n", - __FUNCTION__, data->auth_flavourlen); - err = -EINVAL; - goto out_fail; - } - if (copy_from_user(&authflavour, data->auth_flavours, sizeof(authflavour))) { - err = -EFAULT; - goto out_fail; - } - } - down_write(&clp->cl_sem); if (IS_ERR(clp->cl_rpcclient)) { - xprt = xprt_create_proto(data->proto, &server->addr, &timeparms); + xprt = xprt_create_proto(proto, &server->addr, timeparms); if (IS_ERR(xprt)) { up_write(&clp->cl_sem); err = PTR_ERR(xprt); @@ -1940,8 +2097,10 @@ static int nfs4_fill_super(struct super_ __FUNCTION__, err); goto out_fail; } + /* Bind to a reserved port! */ + xprt->resvport = 1; clnt = rpc_create_client(xprt, server->hostname, &nfs_program, - server->rpc_ops->version, authflavour); + server->rpc_ops->version, flavor); if (IS_ERR(clnt)) { up_write(&clp->cl_sem); err = PTR_ERR(clnt); @@ -1958,43 +2117,96 @@ static int nfs4_fill_super(struct super_ list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); clnt = rpc_clone_client(clp->cl_rpcclient); if (!IS_ERR(clnt)) - server->nfs4_state = clp; + server->nfs4_state = clp; up_write(&clp->cl_sem); clp = NULL; if (IS_ERR(clnt)) { - err = PTR_ERR(clnt); dprintk("%s: cannot create RPC client. Error = %d\n", __FUNCTION__, err); - return err; + return clnt; } - server->client = clnt; - if (server->nfs4_state->cl_idmap == NULL) { dprintk("%s: failed to create idmapper.\n", __FUNCTION__); - return -ENOMEM; + return ERR_PTR(-ENOMEM); } - if (clnt->cl_auth->au_flavor != authflavour) { + if (clnt->cl_auth->au_flavor != flavor) { struct rpc_auth *auth; - auth = rpcauth_create(authflavour, clnt); + auth = rpcauth_create(flavor, clnt); if (IS_ERR(auth)) { dprintk("%s: couldn't create credcache!\n", __FUNCTION__); - return PTR_ERR(auth); + return (struct rpc_clnt *)auth; + } + } + return clnt; + + out_fail: + if (clp) + nfs4_put_client(clp); + return ERR_PTR(err); +} + +static int nfs4_fill_super(struct super_block *sb, struct nfs4_mount_data *data, int silent) +{ + struct nfs_server *server; + struct rpc_timeout timeparms; + rpc_authflavor_t authflavour; + int err = -EIO; + + sb->s_blocksize_bits = 0; + sb->s_blocksize = 0; + server = NFS_SB(sb); + if (data->rsize != 0) + server->rsize = nfs_block_size(data->rsize, NULL); + if (data->wsize != 0) + server->wsize = nfs_block_size(data->wsize, NULL); + server->flags = data->flags & NFS_MOUNT_FLAGMASK; + server->caps = NFS_CAP_ATOMIC_OPEN; + + server->acregmin = data->acregmin*HZ; + server->acregmax = data->acregmax*HZ; + server->acdirmin = data->acdirmin*HZ; + server->acdirmax = data->acdirmax*HZ; + + server->rpc_ops = &nfs_v4_clientops; + + nfs_init_timeout_values(&timeparms, data->proto, data->timeo, data->retrans); + + server->retrans_timeo = timeparms.to_initval; + server->retrans_count = timeparms.to_retries; + + /* Now create transport and client */ + authflavour = RPC_AUTH_UNIX; + if (data->auth_flavourlen != 0) { + if (data->auth_flavourlen != 1) { + dprintk("%s: Invalid number of RPC auth flavours %d.\n", + __FUNCTION__, data->auth_flavourlen); + err = -EINVAL; + goto out_fail; + } + if (copy_from_user(&authflavour, data->auth_flavours, sizeof(authflavour))) { + err = -EFAULT; + goto out_fail; } } + server->client = nfs4_create_client(server, &timeparms, data->proto, authflavour); + if (IS_ERR(server->client)) { + err = PTR_ERR(server->client); + dprintk("%s: cannot create RPC client. Error = %d\n", + __FUNCTION__, err); + goto out_fail; + } + sb->s_time_gran = 1; sb->s_op = &nfs4_sops; err = nfs_sb_init(sb, authflavour); - if (err == 0) - return 0; -out_fail: - if (clp) - nfs4_put_client(clp); + + out_fail: return err; } @@ -2140,6 +2352,7 @@ static void nfs4_kill_super(struct super nfs_free_iostats(server->io_stats); kfree(server->hostname); kfree(server); + nfs_release_automount_timer(); } static struct file_system_type nfs4_fs_type = { @@ -2179,6 +2392,91 @@ static int param_set_idmap_timeout(const module_param_call(idmap_cache_timeout, param_set_idmap_timeout, param_get_int, &nfs_idmap_cache_timeout, 0644); +/* Constructs the SERVER-side path */ +static inline char *nfs4_path(const struct dentry *dentry, char *buffer, ssize_t buflen) +{ + return nfs_path(NFS_SB(dentry->d_sb)->mnt_path, dentry, buffer, buflen); +} + +static inline char *nfs4_dup_path(const struct dentry *dentry) +{ + char *page = (char *) __get_free_page(GFP_USER); + char *path; + + path = nfs4_path(dentry, page, PAGE_SIZE); + if (!IS_ERR(path)) { + int len = PAGE_SIZE + page - path; + char *tmp = path; + + path = kmalloc(len, GFP_KERNEL); + if (path) + memcpy(path, tmp, len); + else + path = ERR_PTR(-ENOMEM); + } + free_page((unsigned long)page); + return path; +} + +static struct super_block *nfs4_clone_sb(struct nfs_server *server, struct nfs_clone_mount *data) +{ + const struct dentry *dentry = data->dentry; + struct nfs4_client *clp = server->nfs4_state; + struct super_block *sb; + + server->fsid = data->fattr->fsid; + nfs_copy_fh(&server->fh, data->fh); + server->mnt_path = nfs4_dup_path(dentry); + if (IS_ERR(server->mnt_path)) { + sb = (struct super_block *)server->mnt_path; + goto err; + } + sb = sget(&nfs4_fs_type, nfs4_compare_super, nfs_set_super, server); + if (IS_ERR(sb) || sb->s_root) + goto free_path; + nfs4_server_capabilities(server, &server->fh); + + down_write(&clp->cl_sem); + atomic_inc(&clp->cl_count); + list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); + up_write(&clp->cl_sem); + return sb; +free_path: + kfree(server->mnt_path); +err: + server->mnt_path = NULL; + return sb; +} + +static struct super_block *nfs_clone_nfs4_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *raw_data) +{ + struct nfs_clone_mount *data = raw_data; + return nfs_clone_generic_sb(data, nfs4_clone_sb, nfs_clone_server); +} + +static struct file_system_type clone_nfs4_fs_type = { + .owner = THIS_MODULE, + .name = "nfs4", + .get_sb = nfs_clone_nfs4_sb, + .kill_sb = nfs4_kill_super, + .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, +}; + +static struct vfsmount *nfs_do_clone_mount(struct nfs_server *server, char *devname, struct nfs_clone_mount *mountdata) +{ + struct vfsmount *mnt = NULL; + switch (server->rpc_ops->version) { + case 2: + case 3: + mnt = vfs_kern_mount(&clone_nfs_fs_type, 0, devname, mountdata); + break; + case 4: + mnt = vfs_kern_mount(&clone_nfs4_fs_type, 0, devname, mountdata); + } + return mnt; +} + #define nfs4_init_once(nfsi) \ do { \ INIT_LIST_HEAD(&(nfsi)->open_states); \ @@ -2206,10 +2504,325 @@ static inline void unregister_nfs4fs(voi nfs_unregister_sysctl(); } #else +#define nfs4_fill_sb(a,b) ERR_PTR(-EINVAL) +#define nfs4_fill_super(a,b) ERR_PTR(-EINVAL) #define nfs4_init_once(nfsi) \ do { } while (0) #define register_nfs4fs() (0) #define unregister_nfs4fs() +static struct vfsmount *nfs_do_clone_mount(struct nfs_server *server, char *devname, struct nfs_clone_mount *mountdata) +{ + return vfs_kern_mount(&clone_nfs_fs_type, 0, devname, mountdata); +} +#endif + +static inline char *nfs_devname(const struct vfsmount *mnt_parent, + const struct dentry *dentry, + char *buffer, ssize_t buflen) +{ + return nfs_path(mnt_parent->mnt_devname, dentry, buffer, buflen); +} + +/** + * nfs_do_submount - set up mountpoint when crossing a filesystem boundary + * @mnt_parent - mountpoint of parent directory + * @dentry - parent directory + * @fh - filehandle for new root dentry + * @fattr - attributes for new root inode + * + */ +struct vfsmount *nfs_do_submount(const struct vfsmount *mnt_parent, + const struct dentry *dentry, struct nfs_fh *fh, + struct nfs_fattr *fattr) +{ + struct nfs_clone_mount mountdata = { + .sb = mnt_parent->mnt_sb, + .dentry = dentry, + .fh = fh, + .fattr = fattr, + }; + struct vfsmount *mnt = ERR_PTR(-ENOMEM); + char *page = (char *) __get_free_page(GFP_USER); + char *devname; + + dprintk("%s: submounting on %s/%s\n", __FUNCTION__, + dentry->d_parent->d_name.name, + dentry->d_name.name); + if (page == NULL) + goto out; + devname = nfs_devname(mnt_parent, dentry, page, PAGE_SIZE); + mnt = (struct vfsmount *)devname; + if (IS_ERR(devname)) + goto free_page; + mnt = nfs_do_clone_mount(NFS_SB(mnt_parent->mnt_sb), devname, &mountdata); +free_page: + free_page((unsigned long)page); +out: + dprintk("%s: done\n", __FUNCTION__); + return mnt; +} + +#ifdef CONFIG_NFS_V4 +/* Check if fs_root is valid */ +static inline char *nfs4_pathname_string(struct nfs4_pathname *pathname, char *buffer, ssize_t buflen) +{ + char *end = buffer + buflen; + int n; + + *--end = '\0'; + buflen--; + + n = pathname->ncomponents; + while (--n >= 0) { + struct nfs4_string *component = &pathname->components[n]; + buflen -= component->len + 1; + if (buflen < 0) + goto Elong; + end -= component->len; + memcpy(end, component->data, component->len); + *--end = '/'; + } + return end; +Elong: + return ERR_PTR(-ENAMETOOLONG); +} + +/* Check if the string represents a "valid" IPv4 address */ +static inline int valid_ipaddr4(const char *buf) +{ + int rc, count, in[4]; + + rc = sscanf(buf, "%d.%d.%d.%d", &in[0], &in[1], &in[2], &in[3]); + if (rc != 4) + return -EINVAL; + for (count = 0; count < 4; count++) { + if (in[count] > 255) + return -EINVAL; + } + return 0; +} + +static struct super_block *nfs4_referral_sb(struct nfs_server *server, struct nfs_clone_mount *data) +{ + struct super_block *sb = ERR_PTR(-ENOMEM); + int len; + + len = strlen(data->mnt_path) + 1; + server->mnt_path = kmalloc(len, GFP_KERNEL); + if (server->mnt_path == NULL) + goto err; + memcpy(server->mnt_path, data->mnt_path, len); + memcpy(&server->addr, data->addr, sizeof(struct sockaddr_in)); + + sb = sget(&nfs4_fs_type, nfs4_compare_super, nfs_set_super, server); + if (IS_ERR(sb) || sb->s_root) + goto free_path; + return sb; +free_path: + kfree(server->mnt_path); +err: + server->mnt_path = NULL; + return sb; +} + +static struct nfs_server *nfs4_referral_server(struct super_block *sb, struct nfs_clone_mount *data) +{ + struct nfs_server *server = NFS_SB(sb); + struct rpc_timeout timeparms; + int proto, timeo, retrans; + void *err; + + proto = IPPROTO_TCP; + /* Since we are following a referral and there may be alternatives, + set the timeouts and retries to low values */ + timeo = 2; + retrans = 1; + nfs_init_timeout_values(&timeparms, proto, timeo, retrans); + + server->client = nfs4_create_client(server, &timeparms, proto, data->authflavor); + if (IS_ERR((err = server->client))) + goto out_err; + + sb->s_time_gran = 1; + sb->s_op = &nfs4_sops; + err = ERR_PTR(nfs_sb_init(sb, data->authflavor)); + if (!IS_ERR(err)) + return server; +out_err: + return (struct nfs_server *)err; +} + +static struct super_block *nfs_referral_nfs4_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *raw_data) +{ + struct nfs_clone_mount *data = raw_data; + return nfs_clone_generic_sb(data, nfs4_referral_sb, nfs4_referral_server); +} + +static struct file_system_type nfs_referral_nfs4_fs_type = { + .owner = THIS_MODULE, + .name = "nfs4", + .get_sb = nfs_referral_nfs4_sb, + .kill_sb = nfs4_kill_super, + .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, +}; + +/** + * nfs_follow_referral - set up mountpoint when hitting a referral on moved error + * @mnt_parent - mountpoint of parent directory + * @dentry - parent directory + * @fspath - fs path returned in fs_locations + * @mntpath - mount path to new server + * @hostname - hostname of new server + * @addr - host addr of new server + * + */ +static struct vfsmount *nfs_follow_referral(const struct vfsmount *mnt_parent, + const struct dentry *dentry, + struct nfs4_fs_locations *locations) +{ + struct vfsmount *mnt = ERR_PTR(-ENOENT); + struct nfs_clone_mount mountdata = { + .sb = mnt_parent->mnt_sb, + .dentry = dentry, + .authflavor = NFS_SB(mnt_parent->mnt_sb)->client->cl_auth->au_flavor, + }; + char *page, *page2; + char *path, *fs_path; + char *devname; + int loc, s; + + if (locations == NULL || locations->nlocations <= 0) + goto out; + + dprintk("%s: referral at %s/%s\n", __FUNCTION__, + dentry->d_parent->d_name.name, dentry->d_name.name); + + /* Ensure fs path is a prefix of current dentry path */ + page = (char *) __get_free_page(GFP_USER); + if (page == NULL) + goto out; + page2 = (char *) __get_free_page(GFP_USER); + if (page2 == NULL) + goto out; + + path = nfs4_path(dentry, page, PAGE_SIZE); + if (IS_ERR(path)) + goto out_free; + + fs_path = nfs4_pathname_string(&locations->fs_path, page2, PAGE_SIZE); + if (IS_ERR(fs_path)) + goto out_free; + + if (strncmp(path, fs_path, strlen(fs_path)) != 0) { + dprintk("%s: path %s does not begin with fsroot %s\n", __FUNCTION__, path, fs_path); + goto out_free; + } + + devname = nfs_devname(mnt_parent, dentry, page, PAGE_SIZE); + if (IS_ERR(devname)) { + mnt = (struct vfsmount *)devname; + goto out_free; + } + + loc = 0; + while (loc < locations->nlocations && IS_ERR(mnt)) { + struct nfs4_fs_location *location = &locations->locations[loc]; + char *mnt_path; + + if (location == NULL || location->nservers <= 0 || + location->rootpath.ncomponents == 0) { + loc++; + continue; + } + + mnt_path = nfs4_pathname_string(&location->rootpath, page2, PAGE_SIZE); + if (IS_ERR(mnt_path)) { + loc++; + continue; + } + mountdata.mnt_path = mnt_path; + + s = 0; + while (s < location->nservers) { + struct sockaddr_in addr = {}; + + if (location->servers[s].len <= 0 || + valid_ipaddr4(location->servers[s].data) < 0) { + s++; + continue; + } + + mountdata.hostname = location->servers[s].data; + addr.sin_addr.s_addr = in_aton(mountdata.hostname); + addr.sin_family = AF_INET; + addr.sin_port = htons(NFS_PORT); + mountdata.addr = &addr; + + mnt = vfs_kern_mount(&nfs_referral_nfs4_fs_type, 0, devname, &mountdata); + if (!IS_ERR(mnt)) { + break; + } + s++; + } + loc++; + } + +out_free: + free_page((unsigned long)page); + free_page((unsigned long)page2); +out: + dprintk("%s: done\n", __FUNCTION__); + return mnt; +} + +/* + * nfs_do_refmount - handle crossing a referral on server + * @dentry - dentry of referral + * @nd - nameidata info + * + */ +struct vfsmount *nfs_do_refmount(const struct vfsmount *mnt_parent, struct dentry *dentry) +{ + struct vfsmount *mnt = ERR_PTR(-ENOENT); + struct dentry *parent; + struct nfs4_fs_locations *fs_locations = NULL; + struct page *page; + int err; + + /* BUG_ON(IS_ROOT(dentry)); */ + dprintk("%s: enter\n", __FUNCTION__); + + page = alloc_page(GFP_KERNEL); + if (page == NULL) + goto out; + + fs_locations = kmalloc(sizeof(struct nfs4_fs_locations), GFP_KERNEL); + if (fs_locations == NULL) + goto out_free; + + /* Get locations */ + parent = dget_parent(dentry); + dprintk("%s: getting locations for %s/%s\n", __FUNCTION__, parent->d_name.name, dentry->d_name.name); + err = nfs4_proc_fs_locations(parent->d_inode, dentry, fs_locations, page); + dput(parent); + if (err != 0 || fs_locations->nlocations <= 0 || + fs_locations->fs_path.ncomponents <= 0) + goto out_free; + + mnt = nfs_follow_referral(mnt_parent, dentry, fs_locations); +out_free: + __free_page(page); + kfree(fs_locations); +out: + dprintk("%s: done\n", __FUNCTION__); + return mnt; +} +#else +struct vfsmount *nfs_do_refmount(const struct vfsmount *mnt_parent, struct dentry *dentry) +{ + return ERR_PTR(-ENOENT); +} #endif extern int nfs_init_nfspagecache(void); diff -puN fs/nfs/Makefile~git-nfs fs/nfs/Makefile --- devel/fs/nfs/Makefile~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/Makefile 2006-04-30 00:30:17.000000000 -0700 @@ -5,7 +5,8 @@ obj-$(CONFIG_NFS_FS) += nfs.o nfs-y := dir.o file.o inode.o nfs2xdr.o pagelist.o \ - proc.o read.o symlink.o unlink.o write.o + proc.o read.o symlink.o unlink.o write.o \ + namespace.o nfs-$(CONFIG_ROOT_NFS) += nfsroot.o mount_clnt.o nfs-$(CONFIG_NFS_V3) += nfs3proc.o nfs3xdr.o nfs-$(CONFIG_NFS_V3_ACL) += nfs3acl.o diff -puN /dev/null fs/nfs/namespace.c --- /dev/null 2003-09-15 06:40:47.000000000 -0700 +++ devel-akpm/fs/nfs/namespace.c 2006-04-30 00:30:17.000000000 -0700 @@ -0,0 +1,119 @@ +/* + * linux/fs/nfs/namespace.c + * + * Copyright (C) 2005 Trond Myklebust + * + * NFS namespace + */ + +#include + +#include +#include +#include +#include +#include +#include +#include + +#define NFSDBG_FACILITY NFSDBG_VFS + +LIST_HEAD(nfs_automount_list); +static void nfs_expire_automounts(void *list); +static DECLARE_WORK(nfs_automount_task, nfs_expire_automounts, &nfs_automount_list); +int nfs_mountpoint_expiry_timeout = 500 * HZ; + +/* + * nfs_follow_mountpoint - handle crossing a mountpoint on the server + * @dentry - dentry of mountpoint + * @nd - nameidata info + * + * When we encounter a mountpoint on the server, we want to set up + * a mountpoint on the client too, to prevent inode numbers from + * colliding, and to allow "df" to work properly. + * On NFSv4, we also want to allow for the fact that different + * filesystems may be migrated to different servers in a failover + * situation, and that different filesystems may want to use + * different security flavours. + */ +static void * nfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) +{ + struct vfsmount *mnt; + struct nfs_server *server = NFS_SERVER(dentry->d_inode); + struct dentry *parent; + struct nfs_fh fh; + struct nfs_fattr fattr; + int err; + + BUG_ON(IS_ROOT(dentry)); + dprintk("%s: enter\n", __FUNCTION__); + dput(nd->dentry); + nd->dentry = dget(dentry); + if (d_mountpoint(nd->dentry)) + goto out_follow; + /* Look it up again */ + parent = dget_parent(nd->dentry); + err = server->rpc_ops->lookup(parent->d_inode, &nd->dentry->d_name, &fh, &fattr); + dput(parent); + if (err != 0) + goto out_err; + + if (fattr.valid & NFS_ATTR_FATTR_V4_REFERRAL) + mnt = nfs_do_refmount(nd->mnt, nd->dentry); + else + mnt = nfs_do_submount(nd->mnt, nd->dentry, &fh, &fattr); + err = PTR_ERR(mnt); + if (IS_ERR(mnt)) + goto out_err; + + mntget(mnt); + err = do_add_mount(mnt, nd, nd->mnt->mnt_flags|MNT_SHRINKABLE, &nfs_automount_list); + if (err < 0) { + mntput(mnt); + if (err == -EBUSY) + goto out_follow; + goto out_err; + } + mntput(nd->mnt); + dput(nd->dentry); + nd->mnt = mnt; + nd->dentry = dget(mnt->mnt_root); + schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout); +out: + dprintk("%s: done, returned %d\n", __FUNCTION__, err); + return ERR_PTR(err); +out_err: + path_release(nd); + goto out; +out_follow: + while(d_mountpoint(nd->dentry) && follow_down(&nd->mnt, &nd->dentry)) + ; + err = 0; + goto out; +} + +struct inode_operations nfs_mountpoint_inode_operations = { + .follow_link = nfs_follow_mountpoint, + .getattr = nfs_getattr, +}; + +struct inode_operations nfs_referral_inode_operations = { + .follow_link = nfs_follow_mountpoint, +}; + +static void nfs_expire_automounts(void *data) +{ + struct list_head *list = (struct list_head *)data; + + mark_mounts_for_expiry(list); + if (!list_empty(list)) + schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout); +} + +void nfs_release_automount_timer(void) +{ + if (list_empty(&nfs_automount_list)) { + cancel_delayed_work(&nfs_automount_task); + flush_scheduled_work(); + } +} diff -puN fs/nfs/nfs2xdr.c~git-nfs fs/nfs/nfs2xdr.c --- devel/fs/nfs/nfs2xdr.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/nfs2xdr.c 2006-04-30 00:30:17.000000000 -0700 @@ -131,7 +131,8 @@ xdr_decode_fattr(u32 *p, struct nfs_fatt fattr->du.nfs2.blocksize = ntohl(*p++); rdev = ntohl(*p++); fattr->du.nfs2.blocks = ntohl(*p++); - fattr->fsid_u.nfs3 = ntohl(*p++); + fattr->fsid.major = ntohl(*p++); + fattr->fsid.minor = 0; fattr->fileid = ntohl(*p++); p = xdr_decode_time(p, &fattr->atime); p = xdr_decode_time(p, &fattr->mtime); diff -puN fs/nfs/nfs3xdr.c~git-nfs fs/nfs/nfs3xdr.c --- devel/fs/nfs/nfs3xdr.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/nfs3xdr.c 2006-04-30 00:30:17.000000000 -0700 @@ -166,7 +166,8 @@ xdr_decode_fattr(u32 *p, struct nfs_fatt if (MAJOR(fattr->rdev) != major || MINOR(fattr->rdev) != minor) fattr->rdev = 0; - p = xdr_decode_hyper(p, &fattr->fsid_u.nfs3); + p = xdr_decode_hyper(p, &fattr->fsid.major); + fattr->fsid.minor = 0; p = xdr_decode_hyper(p, &fattr->fileid); p = xdr_decode_time3(p, &fattr->atime); p = xdr_decode_time3(p, &fattr->mtime); diff -puN fs/nfs/nfs4_fs.h~git-nfs fs/nfs/nfs4_fs.h --- devel/fs/nfs/nfs4_fs.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/nfs4_fs.h 2006-04-30 00:30:17.000000000 -0700 @@ -217,6 +217,9 @@ extern int nfs4_proc_renew(struct nfs4_c extern int nfs4_do_close(struct inode *inode, struct nfs4_state *state); extern struct dentry *nfs4_atomic_open(struct inode *, struct dentry *, struct nameidata *); extern int nfs4_open_revalidate(struct inode *, struct dentry *, int, struct nameidata *); +extern int nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *fhandle); +extern int nfs4_proc_fs_locations(struct inode *dir, struct dentry *dentry, + struct nfs4_fs_locations *fs_locations, struct page *page); extern struct nfs4_state_recovery_ops nfs4_reboot_recovery_ops; extern struct nfs4_state_recovery_ops nfs4_network_partition_recovery_ops; @@ -225,6 +228,7 @@ extern const u32 nfs4_fattr_bitmap[2]; extern const u32 nfs4_statfs_bitmap[2]; extern const u32 nfs4_pathconf_bitmap[2]; extern const u32 nfs4_fsinfo_bitmap[2]; +extern const u32 nfs4_fs_locations_bitmap[2]; /* nfs4renewd.c */ extern void nfs4_schedule_state_renewal(struct nfs4_client *); diff -puN fs/nfs/nfs4proc.c~git-nfs fs/nfs/nfs4proc.c --- devel/fs/nfs/nfs4proc.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/nfs4proc.c 2006-04-30 00:30:17.000000000 -0700 @@ -121,6 +121,25 @@ const u32 nfs4_fsinfo_bitmap[2] = { FATT 0 }; +const u32 nfs4_fs_locations_bitmap[2] = { + FATTR4_WORD0_TYPE + | FATTR4_WORD0_CHANGE + | FATTR4_WORD0_SIZE + | FATTR4_WORD0_FSID + | FATTR4_WORD0_FILEID + | FATTR4_WORD0_FS_LOCATIONS, + FATTR4_WORD1_MODE + | FATTR4_WORD1_NUMLINKS + | FATTR4_WORD1_OWNER + | FATTR4_WORD1_OWNER_GROUP + | FATTR4_WORD1_RAWDEV + | FATTR4_WORD1_SPACE_USED + | FATTR4_WORD1_TIME_ACCESS + | FATTR4_WORD1_TIME_METADATA + | FATTR4_WORD1_TIME_MODIFY + | FATTR4_WORD1_MOUNTED_ON_FILEID +}; + static void nfs4_setup_readdir(u64 cookie, u32 *verifier, struct dentry *dentry, struct nfs4_readdir_arg *readdir) { @@ -1331,7 +1350,7 @@ static int _nfs4_server_capabilities(str return status; } -static int nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *fhandle) +int nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *fhandle) { struct nfs4_exception exception = { }; int err; @@ -1443,6 +1462,50 @@ out: return nfs4_map_errors(status); } +/* + * Get locations and (maybe) other attributes of a referral. + * Note that we'll actually follow the referral later when + * we detect fsid mismatch in inode revalidation + */ +static int nfs4_get_referral(struct inode *dir, struct qstr *name, struct nfs_fattr *fattr, struct nfs_fh *fhandle) +{ + int status = -ENOMEM; + struct page *page = NULL; + struct nfs4_fs_locations *locations = NULL; + struct dentry dentry = {}; + + page = alloc_page(GFP_KERNEL); + if (page == NULL) + goto out; + locations = kmalloc(sizeof(struct nfs4_fs_locations), GFP_KERNEL); + if (locations == NULL) + goto out; + + dentry.d_name.name = name->name; + dentry.d_name.len = name->len; + status = nfs4_proc_fs_locations(dir, &dentry, locations, page); + if (status != 0) + goto out; + /* Make sure server returned a different fsid for the referral */ + if (nfs_fsid_equal(&NFS_SERVER(dir)->fsid, &locations->fattr.fsid)) { + dprintk("%s: server did not return a different fsid for a referral at %s\n", __FUNCTION__, name->name); + status = -EIO; + goto out; + } + + memcpy(fattr, &locations->fattr, sizeof(struct nfs_fattr)); + fattr->valid |= NFS_ATTR_FATTR_V4_REFERRAL; + if (!fattr->mode) + fattr->mode = S_IFDIR; + memset(fhandle, 0, sizeof(struct nfs_fh)); +out: + if (page) + __free_page(page); + if (locations) + kfree(locations); + return status; +} + static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle, struct nfs_fattr *fattr) { struct nfs4_getattr_arg args = { @@ -1547,6 +1610,8 @@ static int _nfs4_proc_lookup(struct inod dprintk("NFS call lookup %s\n", name->name); status = rpc_call_sync(NFS_CLIENT(dir), &msg, 0); + if (status == -NFS4ERR_MOVED) + status = nfs4_get_referral(dir, name, fattr, fhandle); dprintk("NFS reply lookup: %d\n", status); return status; } @@ -2008,7 +2073,7 @@ static int _nfs4_proc_link(struct inode if (!status) { update_changeattr(dir, &res.cinfo); nfs_post_op_update_inode(dir, res.dir_attr); - nfs_refresh_inode(inode, res.fattr); + nfs_post_op_update_inode(inode, res.fattr); } return status; @@ -3570,6 +3635,36 @@ ssize_t nfs4_listxattr(struct dentry *de return len; } +int nfs4_proc_fs_locations(struct inode *dir, struct dentry *dentry, + struct nfs4_fs_locations *fs_locations, struct page *page) +{ + struct nfs_server *server = NFS_SERVER(dir); + u32 bitmask[2] = { + [0] = FATTR4_WORD0_FSID | FATTR4_WORD0_FS_LOCATIONS, + [1] = FATTR4_WORD1_MOUNTED_ON_FILEID, + }; + struct nfs4_fs_locations_arg args = { + .dir_fh = NFS_FH(dir), + .name = &dentry->d_name, + .page = page, + .bitmask = bitmask, + }; + struct rpc_message msg = { + .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_FS_LOCATIONS], + .rpc_argp = &args, + .rpc_resp = fs_locations, + }; + int status; + + dprintk("%s: start\n", __FUNCTION__); + fs_locations->fattr.valid = 0; + fs_locations->server = server; + fs_locations->nlocations = 0; + status = rpc_call_sync(server->client, &msg, 0); + dprintk("%s: returned status = %d\n", __FUNCTION__, status); + return status; +} + struct nfs4_state_recovery_ops nfs4_reboot_recovery_ops = { .recover_open = nfs4_open_reclaim, .recover_lock = nfs4_lock_reclaim, diff -puN fs/nfs/nfs4xdr.c~git-nfs fs/nfs/nfs4xdr.c --- devel/fs/nfs/nfs4xdr.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/nfs4xdr.c 2006-04-30 00:30:17.000000000 -0700 @@ -411,6 +411,15 @@ static int nfs_stat_to_errno(int); #define NFS4_dec_setacl_sz (compound_decode_hdr_maxsz + \ decode_putfh_maxsz + \ op_decode_hdr_maxsz + nfs4_fattr_bitmap_maxsz) +#define NFS4_enc_fs_locations_sz \ + (compound_encode_hdr_maxsz + \ + encode_putfh_maxsz + \ + encode_getattr_maxsz) +#define NFS4_dec_fs_locations_sz \ + (compound_decode_hdr_maxsz + \ + decode_putfh_maxsz + \ + op_decode_hdr_maxsz + \ + nfs4_fattr_bitmap_maxsz) static struct { unsigned int mode; @@ -722,6 +731,13 @@ static int encode_fsinfo(struct xdr_stre bitmask[1] & nfs4_fsinfo_bitmap[1]); } +static int encode_fs_locations(struct xdr_stream *xdr, const u32* bitmask) +{ + return encode_getattr_two(xdr, + bitmask[0] & nfs4_fs_locations_bitmap[0], + bitmask[1] & nfs4_fs_locations_bitmap[1]); +} + static int encode_getfh(struct xdr_stream *xdr) { uint32_t *p; @@ -2003,6 +2019,38 @@ out: } /* + * Encode FS_LOCATIONS request + */ +static int nfs4_xdr_enc_fs_locations(struct rpc_rqst *req, uint32_t *p, struct nfs4_fs_locations_arg *args) +{ + struct xdr_stream xdr; + struct compound_hdr hdr = { + .nops = 3, + }; + struct rpc_auth *auth = req->rq_task->tk_auth; + int replen; + int status; + + xdr_init_encode(&xdr, &req->rq_snd_buf, p); + encode_compound_hdr(&xdr, &hdr); + if ((status = encode_putfh(&xdr, args->dir_fh)) != 0) + goto out; + if ((status = encode_lookup(&xdr, args->name)) != 0) + goto out; + if ((status = encode_fs_locations(&xdr, args->bitmask)) != 0) + goto out; + /* set up reply + * toplevel_status + OP_PUTFH + status + * + OP_LOOKUP + status + OP_GETATTR + status = 7 + */ + replen = (RPC_REPHDRSIZE + auth->au_rslack + 7) << 2; + xdr_inline_pages(&req->rq_rcv_buf, replen, &args->page, + 0, PAGE_SIZE); +out: + return status; +} + +/* * START OF "GENERIC" DECODE ROUTINES. * These may look a little ugly since they are imported from a "generic" * set of XDR encode/decode routines which are intended to be shared by @@ -2036,7 +2084,7 @@ out: } \ } while (0) -static int decode_opaque_inline(struct xdr_stream *xdr, uint32_t *len, char **string) +static int decode_opaque_inline(struct xdr_stream *xdr, unsigned int *len, char **string) { uint32_t *p; @@ -2087,7 +2135,7 @@ static int decode_op_hdr(struct xdr_stre static int decode_ace(struct xdr_stream *xdr, void *ace, struct nfs4_client *clp) { uint32_t *p; - uint32_t strlen; + unsigned int strlen; char *str; READ_BUF(12); @@ -2217,7 +2265,7 @@ static int decode_attr_symlink_support(s return 0; } -static int decode_attr_fsid(struct xdr_stream *xdr, uint32_t *bitmap, struct nfs4_fsid *fsid) +static int decode_attr_fsid(struct xdr_stream *xdr, uint32_t *bitmap, struct nfs_fsid *fsid) { uint32_t *p; @@ -2285,6 +2333,22 @@ static int decode_attr_fileid(struct xdr return 0; } +static int decode_attr_mounted_on_fileid(struct xdr_stream *xdr, uint32_t *bitmap, uint64_t *fileid) +{ + uint32_t *p; + + *fileid = 0; + if (unlikely(bitmap[1] & (FATTR4_WORD1_MOUNTED_ON_FILEID - 1U))) + return -EIO; + if (likely(bitmap[1] & FATTR4_WORD1_MOUNTED_ON_FILEID)) { + READ_BUF(8); + READ64(*fileid); + bitmap[1] &= ~FATTR4_WORD1_MOUNTED_ON_FILEID; + } + dprintk("%s: fileid=%Lu\n", __FUNCTION__, (unsigned long long)*fileid); + return 0; +} + static int decode_attr_files_avail(struct xdr_stream *xdr, uint32_t *bitmap, uint64_t *res) { uint32_t *p; @@ -2336,6 +2400,116 @@ static int decode_attr_files_total(struc return status; } +static int decode_pathname(struct xdr_stream *xdr, struct nfs4_pathname *path) +{ + int n; + uint32_t *p; + int status = 0; + + READ_BUF(4); + READ32(n); + if (n < 0) + goto out_eio; + if (n == 0) + goto root_path; + dprintk("path "); + path->ncomponents = 0; + while (path->ncomponents < n) { + struct nfs4_string *component = &path->components[path->ncomponents]; + status = decode_opaque_inline(xdr, &component->len, &component->data); + if (unlikely(status != 0)) + goto out_eio; + if (path->ncomponents != n) + dprintk("/"); + dprintk("%s", component->data); + if (path->ncomponents < NFS4_PATHNAME_MAXCOMPONENTS) + path->ncomponents++; + else { + dprintk("cannot parse %d components in path\n", n); + goto out_eio; + } + } +out: + dprintk("\n"); + return status; +root_path: +/* a root pathname is sent as a zero component4 */ + path->ncomponents = 1; + path->components[0].len=0; + path->components[0].data=NULL; + dprintk("path /\n"); + goto out; +out_eio: + dprintk(" status %d", status); + status = -EIO; + goto out; +} + +static int decode_attr_fs_locations(struct xdr_stream *xdr, uint32_t *bitmap, struct nfs4_fs_locations *res) +{ + int n; + uint32_t *p; + int status = -EIO; + + if (unlikely(bitmap[0] & (FATTR4_WORD0_FS_LOCATIONS -1U))) + goto out; + status = 0; + if (unlikely(!(bitmap[0] & FATTR4_WORD0_FS_LOCATIONS))) + goto out; + dprintk("%s: fsroot ", __FUNCTION__); + status = decode_pathname(xdr, &res->fs_path); + if (unlikely(status != 0)) + goto out; + READ_BUF(4); + READ32(n); + if (n <= 0) + goto out_eio; + res->nlocations = 0; + while (res->nlocations < n) { + int m; + struct nfs4_fs_location *loc = &res->locations[res->nlocations]; + + READ_BUF(4); + READ32(m); + if (m <= 0) + goto out_eio; + + loc->nservers = 0; + dprintk("%s: servers ", __FUNCTION__); + while (loc->nservers < m) { + struct nfs4_string *server = &loc->servers[loc->nservers]; + status = decode_opaque_inline(xdr, &server->len, &server->data); + if (unlikely(status != 0)) + goto out_eio; + dprintk("%s ", server->data); + if (loc->nservers < NFS4_FS_LOCATION_MAXSERVERS) + loc->nservers++; + else { + int i; + dprintk("%s: using first %d of %d servers returned for location %d\n", __FUNCTION__, NFS4_FS_LOCATION_MAXSERVERS, m, res->nlocations); + for (i = loc->nservers; i < m; i++) { + int len; + char *data; + status = decode_opaque_inline(xdr, &len, &data); + if (unlikely(status != 0)) + goto out_eio; + } + } + } + status = decode_pathname(xdr, &loc->rootpath); + if (unlikely(status != 0)) + goto out_eio; + if (res->nlocations < NFS4_FS_LOCATIONS_MAXENTRIES) + res->nlocations++; + } +out: + dprintk("%s: fs_locations done, error = %d\n", __FUNCTION__, status); + return status; +out_eio: + status = -EIO; + goto out; +} + static int decode_attr_maxfilesize(struct xdr_stream *xdr, uint32_t *bitmap, uint64_t *res) { uint32_t *p; @@ -2841,6 +3015,7 @@ static int decode_getfattr(struct xdr_st bitmap[2] = {0}, type; int status, fmode = 0; + uint64_t fileid; if ((status = decode_op_hdr(xdr, OP_GETATTR)) != 0) goto xdr_error; @@ -2863,10 +3038,14 @@ static int decode_getfattr(struct xdr_st goto xdr_error; if ((status = decode_attr_size(xdr, bitmap, &fattr->size)) != 0) goto xdr_error; - if ((status = decode_attr_fsid(xdr, bitmap, &fattr->fsid_u.nfs4)) != 0) + if ((status = decode_attr_fsid(xdr, bitmap, &fattr->fsid)) != 0) goto xdr_error; if ((status = decode_attr_fileid(xdr, bitmap, &fattr->fileid)) != 0) goto xdr_error; + if ((status = decode_attr_fs_locations(xdr, bitmap, container_of(fattr, + struct nfs4_fs_locations, + fattr))) != 0) + goto xdr_error; if ((status = decode_attr_mode(xdr, bitmap, &fattr->mode)) != 0) goto xdr_error; fattr->mode |= fmode; @@ -2886,6 +3065,10 @@ static int decode_getfattr(struct xdr_st goto xdr_error; if ((status = decode_attr_time_modify(xdr, bitmap, &fattr->mtime)) != 0) goto xdr_error; + if ((status = decode_attr_mounted_on_fileid(xdr, bitmap, &fileid)) != 0) + goto xdr_error; + if (fattr->fileid == 0 && fileid != 0) + fattr->fileid = fileid; if ((status = verify_attr_len(xdr, savep, attrlen)) == 0) fattr->valid = NFS_ATTR_FATTR | NFS_ATTR_FATTR_V3 | NFS_ATTR_FATTR_V4; xdr_error: @@ -4211,6 +4394,29 @@ out: return status; } +/* + * FS_LOCATIONS request + */ +static int nfs4_xdr_dec_fs_locations(struct rpc_rqst *req, uint32_t *p, struct nfs4_fs_locations *res) +{ + struct xdr_stream xdr; + struct compound_hdr hdr; + int status; + + xdr_init_decode(&xdr, &req->rq_rcv_buf, p); + status = decode_compound_hdr(&xdr, &hdr); + if (status != 0) + goto out; + if ((status = decode_putfh(&xdr)) != 0) + goto out; + if ((status = decode_lookup(&xdr)) != 0) + goto out; + xdr_enter_page(&xdr, PAGE_SIZE); + status = decode_getfattr(&xdr, &res->fattr, res->server); +out: + return status; +} + uint32_t *nfs4_decode_dirent(uint32_t *p, struct nfs_entry *entry, int plus) { uint32_t bitmap[2] = {0}; @@ -4382,6 +4588,7 @@ struct rpc_procinfo nfs4_procedures[] = PROC(DELEGRETURN, enc_delegreturn, dec_delegreturn), PROC(GETACL, enc_getacl, dec_getacl), PROC(SETACL, enc_setacl, dec_setacl), + PROC(FS_LOCATIONS, enc_fs_locations, dec_fs_locations), }; struct rpc_version nfs_version4 = { diff -puN fs/nfs/read.c~git-nfs fs/nfs/read.c --- devel/fs/nfs/read.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/read.c 2006-04-30 00:30:17.000000000 -0700 @@ -51,14 +51,11 @@ struct nfs_read_data *nfs_readdata_alloc if (p) { memset(p, 0, sizeof(*p)); INIT_LIST_HEAD(&p->pages); - if (pagecount < NFS_PAGEVEC_SIZE) - p->pagevec = &p->page_array[0]; + if (pagecount <= ARRAY_SIZE(p->page_array)) + p->pagevec = p->page_array; else { - size_t size = ++pagecount * sizeof(struct page *); - p->pagevec = kmalloc(size, GFP_NOFS); - if (p->pagevec) { - memset(p->pagevec, 0, size); - } else { + p->pagevec = kcalloc(pagecount, sizeof(struct page *), GFP_NOFS); + if (!p->pagevec) { mempool_free(p, nfs_rdata_mempool); p = NULL; } diff -puN fs/nfs/sysctl.c~git-nfs fs/nfs/sysctl.c --- devel/fs/nfs/sysctl.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/sysctl.c 2006-04-30 00:30:17.000000000 -0700 @@ -12,6 +12,7 @@ #include #include #include +#include #include "callback.h" @@ -46,6 +47,15 @@ static ctl_table nfs_cb_sysctls[] = { .strategy = &sysctl_jiffies, }, #endif + { + .ctl_name = CTL_UNNUMBERED, + .procname = "nfs_mountpoint_timeout", + .data = &nfs_mountpoint_expiry_timeout, + .maxlen = sizeof(nfs_mountpoint_expiry_timeout), + .mode = 0644, + .proc_handler = &proc_dointvec_jiffies, + .strategy = &sysctl_jiffies, + }, { .ctl_name = 0 } }; diff -puN fs/nfs/write.c~git-nfs fs/nfs/write.c --- devel/fs/nfs/write.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/nfs/write.c 2006-04-30 00:30:17.000000000 -0700 @@ -98,11 +98,10 @@ struct nfs_write_data *nfs_commit_alloc( if (p) { memset(p, 0, sizeof(*p)); INIT_LIST_HEAD(&p->pages); - if (pagecount < NFS_PAGEVEC_SIZE) - p->pagevec = &p->page_array[0]; + if (pagecount <= ARRAY_SIZE(p->page_array)) + p->pagevec = p->page_array; else { - size_t size = ++pagecount * sizeof(struct page *); - p->pagevec = kzalloc(size, GFP_NOFS); + p->pagevec = kcalloc(pagecount, sizeof(struct page *), GFP_NOFS); if (!p->pagevec) { mempool_free(p, nfs_commit_mempool); p = NULL; @@ -126,14 +125,11 @@ struct nfs_write_data *nfs_writedata_all if (p) { memset(p, 0, sizeof(*p)); INIT_LIST_HEAD(&p->pages); - if (pagecount < NFS_PAGEVEC_SIZE) - p->pagevec = &p->page_array[0]; + if (pagecount <= ARRAY_SIZE(p->page_array)) + p->pagevec = p->page_array; else { - size_t size = ++pagecount * sizeof(struct page *); - p->pagevec = kmalloc(size, GFP_NOFS); - if (p->pagevec) { - memset(p->pagevec, 0, size); - } else { + p->pagevec = kcalloc(pagecount, sizeof(struct page *), GFP_NOFS); + if (!p->pagevec) { mempool_free(p, nfs_wdata_mempool); p = NULL; } diff -puN fs/super.c~git-nfs fs/super.c --- devel/fs/super.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/fs/super.c 2006-04-30 00:30:17.000000000 -0700 @@ -800,17 +800,13 @@ struct super_block *get_sb_single(struct EXPORT_SYMBOL(get_sb_single); struct vfsmount * -do_kern_mount(const char *fstype, int flags, const char *name, void *data) +vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void *data) { - struct file_system_type *type = get_fs_type(fstype); struct super_block *sb = ERR_PTR(-ENOMEM); struct vfsmount *mnt; int error; char *secdata = NULL; - if (!type) - return ERR_PTR(-ENODEV); - mnt = alloc_vfsmnt(name); if (!mnt) goto out; @@ -841,7 +837,6 @@ do_kern_mount(const char *fstype, int fl mnt->mnt_parent = mnt; up_write(&sb->s_umount); free_secdata(secdata); - put_filesystem(type); return mnt; out_sb: up_write(&sb->s_umount); @@ -852,15 +847,26 @@ out_free_secdata: out_mnt: free_vfsmnt(mnt); out: - put_filesystem(type); return (struct vfsmount *)sb; } -EXPORT_SYMBOL_GPL(do_kern_mount); +EXPORT_SYMBOL_GPL(vfs_kern_mount); + +struct vfsmount * +do_kern_mount(const char *fstype, int flags, const char *name, void *data) +{ + struct file_system_type *type = get_fs_type(fstype); + struct vfsmount *mnt; + if (!type) + return ERR_PTR(-ENODEV); + mnt = vfs_kern_mount(type, flags, name, data); + put_filesystem(type); + return mnt; +} struct vfsmount *kern_mount(struct file_system_type *type) { - return do_kern_mount(type->name, 0, type->name, NULL); + return vfs_kern_mount(type, 0, type->name, NULL); } EXPORT_SYMBOL(kern_mount); diff -puN include/linux/fs.h~git-nfs include/linux/fs.h --- devel/include/linux/fs.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/include/linux/fs.h 2006-04-30 00:30:17.000000000 -0700 @@ -1100,7 +1100,7 @@ struct super_operations { int (*statfs) (struct super_block *, struct kstatfs *); int (*remount_fs) (struct super_block *, int *, char *); void (*clear_inode) (struct inode *); - void (*umount_begin) (struct super_block *); + void (*umount_begin) (struct vfsmount *, int); int (*show_options)(struct seq_file *, struct vfsmount *); int (*show_stats)(struct seq_file *, struct vfsmount *); @@ -1765,7 +1765,7 @@ extern struct inode_operations simple_di struct tree_descr { char *name; const struct file_operations *ops; int mode; }; struct dentry *d_alloc_name(struct dentry *, const char *); extern int simple_fill_super(struct super_block *, int, struct tree_descr *); -extern int simple_pin_fs(char *name, struct vfsmount **mount, int *count); +extern int simple_pin_fs(struct file_system_type *, struct vfsmount **mount, int *count); extern void simple_release_fs(struct vfsmount **mount, int *count); extern ssize_t simple_read_from_buffer(void __user *, size_t, loff_t *, const void *, size_t); diff -puN include/linux/mount.h~git-nfs include/linux/mount.h --- devel/include/linux/mount.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/include/linux/mount.h 2006-04-30 00:30:17.000000000 -0700 @@ -23,6 +23,8 @@ #define MNT_NOATIME 0x08 #define MNT_NODIRATIME 0x10 +#define MNT_SHRINKABLE 0x100 + #define MNT_SHARED 0x1000 /* if the vfsmount is a shared mount */ #define MNT_UNBINDABLE 0x2000 /* if the vfsmount is a unbindable mount */ #define MNT_PNODE_MASK 0x3000 /* propogation flag mask */ @@ -73,12 +75,18 @@ extern struct vfsmount *alloc_vfsmnt(con extern struct vfsmount *do_kern_mount(const char *fstype, int flags, const char *name, void *data); +struct file_system_type; +extern struct vfsmount *vfs_kern_mount(struct file_system_type *type, + int flags, const char *name, + void *data); + struct nameidata; extern int do_add_mount(struct vfsmount *newmnt, struct nameidata *nd, int mnt_flags, struct list_head *fslist); extern void mark_mounts_for_expiry(struct list_head *mounts); +extern void shrink_submounts(struct vfsmount *mountpoint, struct list_head *mounts); extern spinlock_t vfsmount_lock; extern dev_t name_to_dev_t(char *name); diff -puN include/linux/nfs4.h~git-nfs include/linux/nfs4.h --- devel/include/linux/nfs4.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/include/linux/nfs4.h 2006-04-30 00:30:17.000000000 -0700 @@ -384,6 +384,7 @@ enum { NFSPROC4_CLNT_DELEGRETURN, NFSPROC4_CLNT_GETACL, NFSPROC4_CLNT_SETACL, + NFSPROC4_CLNT_FS_LOCATIONS, }; #endif diff -puN include/linux/nfs_fs.h~git-nfs include/linux/nfs_fs.h --- devel/include/linux/nfs_fs.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/include/linux/nfs_fs.h 2006-04-30 00:30:17.000000000 -0700 @@ -16,8 +16,6 @@ #include #include -#include - #include #include #include @@ -27,6 +25,9 @@ #include #include #include + +#include + #include #include @@ -307,6 +308,12 @@ extern void nfs_end_data_update(struct i extern struct nfs_open_context *get_nfs_open_context(struct nfs_open_context *ctx); extern void put_nfs_open_context(struct nfs_open_context *ctx); extern struct nfs_open_context *nfs_find_open_context(struct inode *inode, struct rpc_cred *cred, int mode); +extern struct vfsmount *nfs_do_submount(const struct vfsmount *mnt_parent, + const struct dentry *dentry, + struct nfs_fh *fh, + struct nfs_fattr *fattr); +extern struct vfsmount *nfs_do_refmount(const struct vfsmount *mnt_parent, + struct dentry *dentry); /* linux/net/ipv4/ipconfig.c: trims ip addr off front of name, too. */ extern u32 root_nfs_parse_addr(char *name); /*__init*/ @@ -393,6 +400,15 @@ extern void nfs_unregister_sysctl(void); #endif /* + * linux/fs/nfs/namespace.c + */ +extern struct list_head nfs_automount_list; +extern struct inode_operations nfs_mountpoint_inode_operations; +extern struct inode_operations nfs_referral_inode_operations; +extern int nfs_mountpoint_expiry_timeout; +extern void nfs_release_automount_timer(void); + +/* * linux/fs/nfs/unlink.c */ extern int nfs_async_unlink(struct dentry *); diff -puN include/linux/nfs_fs_sb.h~git-nfs include/linux/nfs_fs_sb.h --- devel/include/linux/nfs_fs_sb.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/include/linux/nfs_fs_sb.h 2006-04-30 00:30:17.000000000 -0700 @@ -35,6 +35,7 @@ struct nfs_server { char * hostname; /* remote hostname */ struct nfs_fh fh; struct sockaddr_in addr; + struct nfs_fsid fsid; unsigned long mount_time; /* when this fs was mounted */ #ifdef CONFIG_NFS_V4 /* Our own IP address, as a null-terminated string. diff -puN include/linux/nfs_page.h~git-nfs include/linux/nfs_page.h --- devel/include/linux/nfs_page.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/include/linux/nfs_page.h 2006-04-30 00:30:17.000000000 -0700 @@ -13,7 +13,6 @@ #include #include #include -#include #include #include diff -puN include/linux/nfs_xdr.h~git-nfs include/linux/nfs_xdr.h --- devel/include/linux/nfs_xdr.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/include/linux/nfs_xdr.h 2006-04-30 00:30:17.000000000 -0700 @@ -14,11 +14,19 @@ #define NFS_DEF_FILE_IO_SIZE (4096U) #define NFS_MIN_FILE_IO_SIZE (1024U) -struct nfs4_fsid { - __u64 major; - __u64 minor; +struct nfs_fsid { + uint64_t major; + uint64_t minor; }; +/* + * Helper for checking equality between 2 fsids. + */ +static inline int nfs_fsid_equal(const struct nfs_fsid *a, const struct nfs_fsid *b) +{ + return a->major == b->major && a->minor == b->minor; +} + struct nfs_fattr { unsigned short valid; /* which fields are valid */ __u64 pre_size; /* pre_op_attr.size */ @@ -40,10 +48,7 @@ struct nfs_fattr { } nfs3; } du; dev_t rdev; - union { - __u64 nfs3; /* also nfs2 */ - struct nfs4_fsid nfs4; - } fsid_u; + struct nfs_fsid fsid; __u64 fileid; struct timespec atime; struct timespec mtime; @@ -57,8 +62,8 @@ struct nfs_fattr { #define NFS_ATTR_WCC 0x0001 /* pre-op WCC data */ #define NFS_ATTR_FATTR 0x0002 /* post-op attributes */ #define NFS_ATTR_FATTR_V3 0x0004 /* NFSv3 attributes */ -#define NFS_ATTR_FATTR_V4 0x0008 -#define NFS_ATTR_PRE_CHANGE 0x0010 +#define NFS_ATTR_FATTR_V4 0x0008 /* NFSv4 change attribute */ +#define NFS_ATTR_FATTR_V4_REFERRAL 0x0010 /* NFSv4 referral */ /* * Info on the file system @@ -675,6 +680,40 @@ struct nfs4_server_caps_res { u32 has_symlinks; }; +struct nfs4_string { + unsigned int len; + char *data; +}; + +#define NFS4_PATHNAME_MAXCOMPONENTS 512 +struct nfs4_pathname { + unsigned int ncomponents; + struct nfs4_string components[NFS4_PATHNAME_MAXCOMPONENTS]; +}; + +#define NFS4_FS_LOCATION_MAXSERVERS 10 +struct nfs4_fs_location { + unsigned int nservers; + struct nfs4_string servers[NFS4_FS_LOCATION_MAXSERVERS]; + struct nfs4_pathname rootpath; +}; + +#define NFS4_FS_LOCATIONS_MAXENTRIES 10 +struct nfs4_fs_locations { + struct nfs_fattr fattr; + const struct nfs_server *server; + struct nfs4_pathname fs_path; + int nlocations; + struct nfs4_fs_location locations[NFS4_FS_LOCATIONS_MAXENTRIES]; +}; + +struct nfs4_fs_locations_arg { + const struct nfs_fh *dir_fh; + const struct qstr *name; + struct page *page; + const u32 *bitmask; +}; + #endif /* CONFIG_NFS_V4 */ struct nfs_page; @@ -695,7 +734,7 @@ struct nfs_read_data { #ifdef CONFIG_NFS_V4 unsigned long timestamp; /* For lease renewal */ #endif - struct page *page_array[NFS_PAGEVEC_SIZE + 1]; + struct page *page_array[NFS_PAGEVEC_SIZE]; }; struct nfs_write_data { @@ -713,7 +752,7 @@ struct nfs_write_data { #ifdef CONFIG_NFS_V4 unsigned long timestamp; /* For lease renewal */ #endif - struct page *page_array[NFS_PAGEVEC_SIZE + 1]; + struct page *page_array[NFS_PAGEVEC_SIZE]; }; struct nfs_access_entry; diff -puN include/linux/sunrpc/xdr.h~git-nfs include/linux/sunrpc/xdr.h --- devel/include/linux/sunrpc/xdr.h~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/include/linux/sunrpc/xdr.h 2006-04-30 00:30:17.000000000 -0700 @@ -194,6 +194,7 @@ extern void xdr_write_pages(struct xdr_s extern void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf, uint32_t *p); extern uint32_t *xdr_inline_decode(struct xdr_stream *xdr, size_t nbytes); extern void xdr_read_pages(struct xdr_stream *xdr, unsigned int len); +extern void xdr_enter_page(struct xdr_stream *xdr, unsigned int len); #endif /* __KERNEL__ */ diff -puN mm/shmem.c~git-nfs mm/shmem.c --- devel/mm/shmem.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/mm/shmem.c 2006-04-30 00:30:18.000000000 -0700 @@ -2261,7 +2261,7 @@ static int __init init_tmpfs(void) #ifdef CONFIG_TMPFS devfs_mk_dir("shm"); #endif - shm_mnt = do_kern_mount(tmpfs_fs_type.name, MS_NOUSER, + shm_mnt = vfs_kern_mount(&tmpfs_fs_type, MS_NOUSER, tmpfs_fs_type.name, NULL); if (IS_ERR(shm_mnt)) { error = PTR_ERR(shm_mnt); diff -puN net/sunrpc/rpc_pipe.c~git-nfs net/sunrpc/rpc_pipe.c --- devel/net/sunrpc/rpc_pipe.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/net/sunrpc/rpc_pipe.c 2006-04-30 00:30:18.000000000 -0700 @@ -439,7 +439,7 @@ struct vfsmount *rpc_get_mount(void) { int err; - err = simple_pin_fs("rpc_pipefs", &rpc_mount, &rpc_mount_count); + err = simple_pin_fs(&rpc_pipe_fs_type, &rpc_mount, &rpc_mount_count); if (err != 0) return ERR_PTR(err); return rpc_mount; diff -puN net/sunrpc/xdr.c~git-nfs net/sunrpc/xdr.c --- devel/net/sunrpc/xdr.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/net/sunrpc/xdr.c 2006-04-30 00:30:18.000000000 -0700 @@ -568,8 +568,7 @@ EXPORT_SYMBOL(xdr_inline_decode); * * Moves data beyond the current pointer position from the XDR head[] buffer * into the page list. Any data that lies beyond current position + "len" - * bytes is moved into the XDR tail[]. The current pointer is then - * repositioned at the beginning of the XDR tail. + * bytes is moved into the XDR tail[]. */ void xdr_read_pages(struct xdr_stream *xdr, unsigned int len) { @@ -606,6 +605,31 @@ void xdr_read_pages(struct xdr_stream *x } EXPORT_SYMBOL(xdr_read_pages); +/** + * xdr_enter_page - decode data from the XDR page + * @xdr: pointer to xdr_stream struct + * @len: number of bytes of page data + * + * Moves data beyond the current pointer position from the XDR head[] buffer + * into the page list. Any data that lies beyond current position + "len" + * bytes is moved into the XDR tail[]. The current pointer is then + * repositioned at the beginning of the first XDR page. + */ +void xdr_enter_page(struct xdr_stream *xdr, unsigned int len) +{ + char * kaddr = page_address(xdr->buf->pages[0]); + xdr_read_pages(xdr, len); + /* + * Position current pointer at beginning of tail, and + * set remaining message length. + */ + if (len > PAGE_CACHE_SIZE - xdr->buf->page_base) + len = PAGE_CACHE_SIZE - xdr->buf->page_base; + xdr->p = (uint32_t *)(kaddr + xdr->buf->page_base); + xdr->end = (uint32_t *)((char *)xdr->p + len); +} +EXPORT_SYMBOL(xdr_enter_page); + static struct kvec empty_iov = {.iov_base = NULL, .iov_len = 0}; void diff -puN net/sunrpc/xprt.c~git-nfs net/sunrpc/xprt.c --- devel/net/sunrpc/xprt.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/net/sunrpc/xprt.c 2006-04-30 00:30:18.000000000 -0700 @@ -41,7 +41,7 @@ #include #include #include -#include +#include #include #include @@ -830,7 +830,7 @@ static inline u32 xprt_alloc_xid(struct static inline void xprt_init_xid(struct rpc_xprt *xprt) { - get_random_bytes(&xprt->xid, sizeof(xprt->xid)); + xprt->xid = net_random(); } static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt) diff -puN net/sunrpc/xprtsock.c~git-nfs net/sunrpc/xprtsock.c --- devel/net/sunrpc/xprtsock.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/net/sunrpc/xprtsock.c 2006-04-30 00:30:18.000000000 -0700 @@ -930,6 +930,13 @@ static void xs_udp_timer(struct rpc_task xprt_adjust_cwnd(task, -ETIMEDOUT); } +static unsigned short xs_get_random_port(void) +{ + unsigned short range = xprt_max_resvport - xprt_min_resvport; + unsigned short rand = (unsigned short) net_random() % range; + return rand + xprt_min_resvport; +} + /** * xs_set_port - reset the port number in the remote endpoint address * @xprt: generic transport @@ -1275,7 +1282,7 @@ int xs_setup_udp(struct rpc_xprt *xprt, memset(xprt->slot, 0, slot_table_size); xprt->prot = IPPROTO_UDP; - xprt->port = xprt_max_resvport; + xprt->port = xs_get_random_port(); xprt->tsh_size = 0; xprt->resvport = capable(CAP_NET_BIND_SERVICE) ? 1 : 0; /* XXX: header size can vary due to auth type, IPv6, etc. */ @@ -1317,7 +1324,7 @@ int xs_setup_tcp(struct rpc_xprt *xprt, memset(xprt->slot, 0, slot_table_size); xprt->prot = IPPROTO_TCP; - xprt->port = xprt_max_resvport; + xprt->port = xs_get_random_port(); xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32); xprt->resvport = capable(CAP_NET_BIND_SERVICE) ? 1 : 0; xprt->max_payload = RPC_MAX_FRAGMENT_SIZE; diff -puN security/inode.c~git-nfs security/inode.c --- devel/security/inode.c~git-nfs 2006-04-30 00:30:17.000000000 -0700 +++ devel-akpm/security/inode.c 2006-04-30 00:30:18.000000000 -0700 @@ -224,7 +224,7 @@ struct dentry *securityfs_create_file(co pr_debug("securityfs: creating file '%s'\n",name); - error = simple_pin_fs("securityfs", &mount, &mount_count); + error = simple_pin_fs(&fs_type, &mount, &mount_count); if (error) { dentry = ERR_PTR(error); goto exit; _