commit 7aee47b0bb9f93baecdbea205e878fe0f155f7da Author: Sunil Mushran Date: Fri Nov 6 14:50:22 2009 -0800 ocfs2: Trivial cleanup of jbd compatibility layer removal Mainline commit 53ef99cad9878f02f27bb30bc304fc42af8bdd6e removed the JBD compatibility layer from OCFS2. This patch removes the last remaining remnants of that. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker commit 837711f862bb71ac263837a0f0714dd8cc4ef7ea Author: Coly Li Date: Fri Jan 16 16:33:05 2009 +0800 ocfs2: return f_fsid info in ocfs2_statfs() Currently the f_fsid of struct kstatfs returned from ocfs2_statfs() is undefined (vfs layer fills in 0 as default). Since in some conditions, the f_fsid value might be used in a (f_fsid, ino) pair to uniquely identify a file, ocfs2 should return a unique defined f_fsid value from ocfs2_statfs(). Because uuid_str is the same on big- and little-endian machines, it's endian-consistent to use osb->uuid_str to generate the f_fsid value. Signed-off-by: Coly Li Cc: Sunil Mushran Cc: Mark Fasheh Signed-off-by: Joel Becker commit 2f48d593b6ceb7bb63d34124ceba77d33be298cf Author: Tao Ma Date: Thu Oct 15 11:10:49 2009 +0800 ocfs2: duplicate inline data properly during reflink. The old reflink fails to handle inodes with inline data and will oops if it encounters them. This patch copies inline data to the new inode. Extended attributes may still be refcounted. Signed-off-by: Tao Ma Signed-off-by: Joel Becker Tested-by: Tristan Ye commit 87f4b1bb98696e6cf84f57df7de41f28c2a7dbeb Author: Tao Ma Date: Thu Oct 15 11:10:48 2009 +0800 ocfs2: Move ocfs2_complete_reflink to the right place. As its name ocfs2_complete_reflink indicates, it should be called after all the work for reflink is done, so it really should be called after we reflink xattr successfully. Signed-off-by: Tao Ma Signed-off-by: Joel Becker Tested-by: Tristan Ye commit fb5cbe9efd741b16e72133613747f76490bbecd3 Author: Joel Becker Date: Wed Oct 28 22:28:24 2009 -0700 ocfs2: Return -EINVAL when a device is not ocfs2. 
In case of non-modular kernels the root filesystem is mounted by trying several filesystems. If ocfs2 was tried before the actual filesystem type, the mount would fail because ocfs2_sb_probe() returns -EAGAIN instead of -EINVAL. ocfs2 will now return -EINVAL properly. Signed-off-by: Joel Becker Reported-by: Laszlo Attila Toth commit 828c09509b9695271bcbdc53e9fc9a6a737148d2 Author: Alexey Dobriyan Date: Thu Oct 1 15:43:56 2009 -0700 const: constify remaining file_operations [akpm@linux-foundation.org: fix KVM] Signed-off-by: Alexey Dobriyan Acked-by: Mike Frysinger Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit f0f37e2f77731b3473fa6bd5ee53255d9a9cdb40 Author: Alexey Dobriyan Date: Sun Sep 27 22:29:37 2009 +0400 const: mark struct vm_struct_operations * mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan Signed-off-by: Linus Torvalds commit db16826367fefcb0ddb93d76b66adc52eb4e6339 Merge: cd60451 465fdd9 Author: Linus Torvalds Date: Thu Sep 24 07:53:22 2009 -0700 Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 * 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits) HWPOISON: Enable error_remove_page on btrfs HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs HWPOISON: Add madvise() based injector for hardware poisoned pages v4 HWPOISON: Enable error_remove_page for NFS HWPOISON: Enable .remove_error_page for migration aware file systems HWPOISON: The high level memory error handler in the VM v7 HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process HWPOISON: shmem: call set_page_dirty() with locked page HWPOISON: Define a new error_remove_page address space op for async truncation HWPOISON: Add invalidate_inode_page HWPOISON: Refactor truncate to allow direct truncating of page v2 HWPOISON: check 
and isolate corrupted free pages v2 HWPOISON: Handle hardware poisoned pages in try_to_unmap HWPOISON: Use bitmask/action code for try_to_unmap behaviour HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2 HWPOISON: Add poison check to page fault handling HWPOISON: Add basic support for poisoned pages in fault handler v3 HWPOISON: Add new SIGBUS error codes for hardware poison signals HWPOISON: Add support for poison swap entries v2 HWPOISON: Export some rmap vma locking to outside world ... commit 2bcd57ab61e7cabed626226a3771617981c11ce1 Author: Alexey Dobriyan Date: Thu Sep 24 04:22:25 2009 +0400 headers: utsname.h redux * remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan Signed-off-by: Linus Torvalds commit b64ada6b23d4a305fb3ca59b79dd38707fc53b69 Merge: be90a49 b80474b Author: Linus Torvalds Date: Wed Sep 23 09:29:20 2009 -0700 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (85 commits) ocfs2: Use buffer IO if we are appending a file. ocfs2: add spinlock protection when dealing with lockres->purge. dlmglue.c: add missed mlog lines ocfs2: __ocfs2_abort() should not enable panic for local mounts ocfs2: Add ioctl for reflink. ocfs2: Enable refcount tree support. ocfs2: Implement ocfs2_reflink. ocfs2: Add preserve to reflink. ocfs2: Create reflinked file in orphan dir. ocfs2: Use proper parameter for some inode operation. ocfs2: Make transaction extend more efficient. ocfs2: Don't merge in 1st refcount ops of reflink. ocfs2: Modify removing xattr process for refcount. ocfs2: Add reflink support for xattr. 
ocfs2: Create an xattr indexed block if needed. ocfs2: Call refcount tree remove process properly. ocfs2: Attach xattr clusters to refcount tree. ocfs2: Abstract ocfs2 xattr tree extend rec iteration process. ocfs2: Abstract the creation of xattr block. ocfs2: Remove inode from ocfs2_xattr_bucket_get_name_value. ... commit 88e9d34c727883d7d6f02cf1475b3ec98b8480c7 Author: James Morris Date: Tue Sep 22 16:43:43 2009 -0700 seq_file: constify seq_operations Make all seq_operations structs const, to help mitigate against revectoring user-triggerable function pointers. This is derived from the grsecurity patch, although generated from scratch because it's simpler than extracting the changes from there. Signed-off-by: James Morris Acked-by: Serge Hallyn Acked-by: Casey Schaufler Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit b80474b432913f73cce8db001e9fa3104f9b79ee Author: Tao Ma Date: Thu Sep 10 15:28:47 2009 +0800 ocfs2: Use buffer IO if we are appending a file. In ocfs2_file_aio_write, we will prevent direct io if we find that we are appending (changing i_size) and call generic_file_aio_write_nolock. But actually the O_DIRECT flag is there and this function will call generic_file_direct_write eventually, which will update i_size and leave di->i_size alone. The bug is http://oss.oracle.com/bugzilla/show_bug.cgi?id=1173. So this patch lets ocfs2_direct_IO return 0 directly if we are appending so that buffered write will be called and di->i_size gets updated successfully. And this is also what we want in ocfs2_file_aio_write. Signed-off-by: Tao Ma Signed-off-by: Joel Becker commit 83e32d9044a4510fffdf65c2691a25c0ba84e259 Author: Wengang Wang Date: Thu Sep 3 15:56:33 2009 +0800 ocfs2: add spinlock protection when dealing with lockres->purge. When we check or modify lockres->purge, we should hold lockres->spinlock. In dlm_purge_lockres(), the check and the modification are done without that protection. This patch fixes it. 
Signed-off-by: Wengang Wang Signed-off-by: Joel Becker commit d92bc5127b27f315ef0ef2c1e1829fd6a5cba54a Author: Coly Li Date: Fri Aug 28 19:03:18 2009 +0800 dlmglue.c: add missed mlog lines This patch adds the missed mlog_exit() and mlog_exit_void() lines when routines return. Signed-off-by: Coly Li Acked-by: Mark Fasheh Signed-off-by: Joel Becker commit a2f2ddbf2bafdbc7e4f3bbf09439b42c8fee2747 Author: Sunil Mushran Date: Wed Aug 19 15:16:01 2009 -0700 ocfs2: __ocfs2_abort() should not enable panic for local mounts In a clustered setup, we have to panic the box on journal abort. This is because we don't have the facility to go hard readonly. With hard ro, another node would detect node failure and initiate recovery. Having said that, we shouldn't force panic if the volume is mounted locally. This patch defers the handling to the mount option, errors. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker commit bd50873dc725a9fa72592ecc986c58805e823051 Author: Tao Ma Date: Mon Sep 21 11:25:14 2009 +0800 ocfs2: Add ioctl for reflink. The ioctl will take 3 parameters: old_path, new_path and preserve and call vfs_reflink. It is useful when we backport reflink features to old kernels. Signed-off-by: Tao Ma commit 64871b8d62570fabec3b0959d494f8e0b87f5c4b Author: Tao Ma Date: Tue Aug 18 11:48:02 2009 +0800 ocfs2: Enable refcount tree support. Signed-off-by: Tao Ma commit 09bf27a000209e9e8c9c048b4c50f6bb0dd857bb Author: Tao Ma Date: Mon Sep 21 10:38:17 2009 +0800 ocfs2: Implement ocfs2_reflink. Implement ocfs2_reflink. Signed-off-by: Tao Ma commit 0fe9b66c65f3ff227da45381afe7612f91e32740 Author: Tao Ma Date: Tue Aug 18 11:47:56 2009 +0800 ocfs2: Add preserve to reflink. reflink has 2 options for the destination file: 1. snapshot: reflink will attempt to preserve ownership, permissions, and all other security state in order to create a full snapshot. 2. 
new file: it will acquire the data extent sharing but will see the file's security state and attributes initialized as a new file. So add the option to ocfs2. Signed-off-by: Tao Ma commit bc13d347574fc0a8a666bc0f4cc2b635d202e372 Author: Tao Ma Date: Tue Aug 18 11:44:14 2009 +0800 ocfs2: Create reflinked file in orphan dir. reflink is a very complicated process, so it can't be integrated into one transaction. So if the system panics during the operation, we may leave an unfinished inode in the destination directory. So we will try to create an inode in orphan_dir first, reflink it to the src file and then move it to the destination file in the end. In that way we won't be afraid of any corruption during the reflink. This patch adds 2 functions for orphan_dir operation: 1. Create a new inode in orphan dir. 2. Move an inode to a target dir. Note: fsck.ocfs2 should work for us to remove the unfinished file in the orphan_dir. Signed-off-by: Tao Ma commit 19bd341f6a6c6b314bcac55bbd906bfd3603fe9e Author: Tao Ma Date: Tue Aug 18 11:44:10 2009 +0800 ocfs2: Use proper parameter for some inode operation. In order to make the original function more suitable for reflink, we modify the following inode operations. Both are tiny. 1. ocfs2_mknod_locked only uses dentry for mlog, so move it to the caller so that reflink can use it without dentry. 2. ocfs2_prepare_orphan_dir only wants inode to get its ip_blkno. So use ip_blkno instead. Signed-off-by: Tao Ma commit c18b812d127a971901180188b918a7cd98ccd4d6 Author: Tao Ma Date: Tue Aug 18 11:44:07 2009 +0800 ocfs2: Make transaction extend more efficient. In ocfs2_extend_rotate_transaction, op_credits is the original credits in the handle and we only want to extend the credits for the rotation, but the old solution always doubles it. It is harmless for some minor operations, but for actions like reflink we may rotate the tree many times and cause the credits to increase dramatically. So this patch tries to only increase the desired credits. 
Signed-off-by: Tao Ma commit 7540c1a77b26bc2f9d86a0bfbe6597b05ec5f93d Author: Tao Ma Date: Tue Aug 18 11:44:03 2009 +0800 ocfs2: Don't merge in 1st refcount ops of reflink. Actually the whole reflink will touch refcount tree 2 times: 1. It will add the clusters in the extent record to the tree if it isn't refcounted before. 2. It will add 1 refcount to these clusters when it add these extent records to the tree. So actually we shouldn't do merge in the 1st operation since the 2nd one will soon be called and we may have to split it again. Do a merge first and split soon is a waste of time. So we only merge in the 2nd round. This is done by adding a new internal __ocfs2_increase_refcount and call it with "not-merge" for 1st refcount operation in reflink. This also has a side-effect that we don't need to worry too much about the metadata allocation in the 2nd round since it will only merge and no split will happen for those records. Signed-off-by: Tao Ma commit ce9c5a54c0f06b0efb4db8720a0616cc6aa0e5b2 Author: Tao Ma Date: Tue Aug 18 11:43:59 2009 +0800 ocfs2: Modify removing xattr process for refcount. The old xattr value remove is quite simple, it just erase the tree and free the clusters. But as we have added refcount support, The process is a little complicated. We have to lock the refcount tree at the beginning, what's more, we may split the refcount tree in some cases, so meta/credits are needed. Signed-off-by: Tao Ma commit 2999d12f4d5529b282ce201b21444590c3f9f723 Author: Tao Ma Date: Tue Aug 18 11:43:55 2009 +0800 ocfs2: Add reflink support for xattr. Signed-off-by: Tao Ma commit a7fe7a3a1ab5dac8d81e531c060f51e12010133b Author: Tao Ma Date: Tue Aug 18 11:43:52 2009 +0800 ocfs2: Create an xattr indexed block if needed. With reflink, there is a need that we create a new xattr indexed block from the very beginning. So add a new parameter for ocfs2_create_xattr_block. 
Signed-off-by: Tao Ma commit 8b2c0dba5159570af5721d40490f6c529d721500 Author: Tao Ma Date: Tue Aug 18 11:43:49 2009 +0800 ocfs2: Call refcount tree remove process properly. Now with xattr refcount support, we need to check whether we have xattr refcounted before we remove the refcount tree. Now the mechanism is: 1) Check whether i_clusters == 0, if no, exit. 2) Check whether we have i_xattr_loc in dinode, if yes, exit. 3) Check whether we have inline xattr stored outside, if yes, exit. 4) Remove the tree. Signed-off-by: Tao Ma commit 0129241e2b3b90ff83a8c774353e5612d84bd493 Author: Tao Ma Date: Mon Sep 21 13:04:19 2009 +0800 ocfs2: Attach xattr clusters to refcount tree. In ocfs2, when an xattr's value is larger than OCFS2_XATTR_INLINE_SIZE, it will be kept outside of the blocks where we store the xattr entry. And they are stored in a b-tree also. So this patch tries to attach all these clusters to the refcount tree also. Signed-off-by: Tao Ma commit 47bca4950bc40fb54e9d41cbbc8b06cd653d2ae2 Author: Tao Ma Date: Tue Aug 18 11:43:42 2009 +0800 ocfs2: Abstract ocfs2 xattr tree extend rec iteration process. Currently we have ocfs2_iterate_xattr_buckets which can receive a para and a callback to iterate a series of buckets. It is good. But actually the 2 callers ocfs2_xattr_tree_list_index_block and ocfs2_delete_xattr_index_block are almost the same. The only difference is that the latter needs to handle the extent record also. So add a new function named ocfs2_iterate_xattr_index_block. It can be given a func callback which is used for the extent record. So now we only have one iteration function for the xattr index block. And what's more, it is useful for our future reflink operations. Signed-off-by: Tao Ma commit 5aea1f0ef4024ba28213c10181e1b16ec678c82d Author: Tao Ma Date: Tue Aug 18 11:43:24 2009 +0800 ocfs2: Abstract the creation of xattr block. In xattr reflink, we also need to create xattr block, so abstract the process out. 
Signed-off-by: Tao Ma commit fd68a894fc9641f816d9cffa58e853ba91cbc1a1 Author: Tao Ma Date: Tue Aug 18 11:43:21 2009 +0800 ocfs2: Remove inode from ocfs2_xattr_bucket_get_name_value. In ocfs2_xattr_bucket_get_name_value, actually we only use super_block. So use it. Signed-off-by: Tao Ma commit 492a8a33e1cb966fa0b5756c5fc11d30c8f8848e Author: Tao Ma Date: Tue Aug 18 11:43:17 2009 +0800 ocfs2: Add CoW support for xattr. In order to make the 2 transactions (xattr and cow) independent of each other, we CoW the whole xattr out in case we are setting them. Signed-off-by: Tao Ma commit 913580b4cd445c4fb25d7cf167911a8cf6bdb1eb Author: Tao Ma Date: Mon Aug 24 14:31:03 2009 +0800 ocfs2: Abstract duplicate clusters process in CoW. We currently use pagecache to duplicate clusters in CoW, but it isn't suitable for the xattr case. So abstract it out so that the caller can decide which method it uses. Signed-off-by: Tao Ma commit 1061f9c1c9f81ed88b5d268a95d8e3ace80da63a Author: Tao Ma Date: Tue Aug 18 11:41:57 2009 +0800 ocfs2: Return extent flags for xattr value tree. With the new refcount tree, xattr value can also be refcounted among multiple files. So return the appropriate extent flags so that CoW can use it later. Signed-off-by: Tao Ma commit a9063ab9a3827483007124bdb6f9877f0ab4c3f5 Author: Tao Ma Date: Tue Aug 18 11:40:59 2009 +0800 ocfs2: handle file attributes issue for reflink. A reflink creates a snapshot of a file, that means the attributes must be identical except for three exceptions - nlink, ino, and ctime. As for time changes, here is a brief description: 1. Source file: 1) atime: Ignore. Let the lazy atime code handle that. 2) mtime: don't touch. 3) ctime: If we change the tree (adding REFCOUNTED to at least one extent), update it. 2. Destination file: 1) atime: ignore. 2) mtime: we want it to appear identical to the source. 3) ctime: update. The idea here is that an ls -l will show the same time for the src and target - it shows mtime. 
Backup software like rsync and tar will treat the new file correctly too. Signed-off-by: Tao Ma commit 110a045aca62f6f564e3b68f89af2a3a5a6ecff2 Author: Tao Ma Date: Sat Aug 22 23:54:27 2009 +0800 ocfs2: Add normal functions for reflink a normal file's extents. 2 major functions are added in this patch. ocfs2_attach_refcount_tree will create a new refcount tree to the old file if it doesn't have one and insert all the extent records to the tree if they are not refcounted. ocfs2_create_reflink_node will: 1. set the refcount tree to the new file. 2. call ocfs2_duplicate_extent_list which will iterate all the extents for the old file, insert them into the new file and increase the corresponding reference count. Signed-off-by: Tao Ma commit 37f8a2bfaa8364dd3644cccee8824bb8f5e409a5 Author: Tao Ma Date: Wed Aug 26 09:47:28 2009 +0800 ocfs2: CoW a reflinked cluster when it is truncated. When we truncate a file to a specific size which resides in a reflinked cluster, we need to CoW it since ocfs2_zero_range_for_truncate will zero the space after the size (just another type of write). So we add a "max_cpos" in ocfs2_refcount_cow so that it will stop when it hits the max cluster offset. Signed-off-by: Tao Ma commit 293b2f70b4a16a1ca91efd28ef3d6634262c6887 Author: Tao Ma Date: Tue Aug 25 08:02:48 2009 +0800 ocfs2: Integrate CoW in file write. When we use mmap, we CoW the refcounted clusters in ocfs2_write_begin_nolock. While for normal file io (including directio), we do CoW in ocfs2_prepare_inode_for_write. Signed-off-by: Tao Ma commit 6ae23c5555176c5b23480c9c578ff27437085ba5 Author: Tao Ma Date: Tue Aug 18 11:30:55 2009 +0800 ocfs2: CoW refcount tree improvement. During CoW, if the old extent record is refcounted, we allocate some new clusters and do CoW. Actually we can have some improvement here. If the old extent has refcount=1, that means now it is only used by this file. So we don't need to allocate new clusters, just remove the refcounted flag and it is OK. 
We also have to remove it from the refcount tree while not deleting it. Signed-off-by: Tao Ma commit 6f70fa519976a379d72781d927cf8e5f5b05ec86 Author: Tao Ma Date: Tue Aug 25 08:05:12 2009 +0800 ocfs2: Add CoW support. This patch try CoW support for a refcounted record. the whole process will be: 1. Calculate how many clusters we need to CoW and where we start. Extents that are not completely encompassed by the write will be broken on 1MB boundaries. 2. Do CoW for the clusters with the help of page cache. 3. Change the b-tree structure with the new allocated clusters. Signed-off-by: Tao Ma commit bcbbb24a6a5c5b3e7b8e5284e0bfa23f45c32377 Author: Tao Ma Date: Tue Aug 18 11:29:12 2009 +0800 ocfs2: Decrement refcount when truncating refcounted extents. Add 'Decrement refcount for delete' in to the normal truncate process. So for a refcounted extent record, call refcount rec decrementation instead of cluster free. Signed-off-by: Tao Ma commit 1aa75fea64bc26bda9be9b1b20ae253d7a481877 Author: Tao Ma Date: Tue Aug 18 11:28:39 2009 +0800 ocfs2: Add functions for extents refcounted. Add function ocfs2_mark_extent_refcounted which can mark an extent refcounted. Signed-off-by: Tao Ma commit 1823cb0b9fe5e6d48017ee3f92428f69c0235d87 Author: Tao Ma Date: Tue Aug 18 11:24:49 2009 +0800 ocfs2: Add support of decrementing refcount for delete. Given a physical cpos and length, decrement the refcount in the tree. If the refcount for any portion of the extent goes to zero, that portion is queued for freeing. Signed-off-by: Tao Ma commit e73a819db9c2d6c4065b7cab7374709b6939e8f1 Author: Tao Ma Date: Tue Aug 11 14:33:14 2009 +0800 ocfs2: Add support for incrementing refcount in the tree. Given a physical cpos and length, increment the refcount in the tree. If the extent has not been seen before, a refcount record is created for it. Refcount records may be merged or split by this operation. 
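The merge/split behaviour described for refcount increments can be pictured with a toy model. The sketch below is not the ocfs2 implementation: it assumes a tiny 16-cluster volume, keeps a plain per-cluster counter array instead of an on-disk b-tree, and derives "records" by run-length encoding those counters. The names toy_rec, toy_increase_refcount and toy_list_recs are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

#define NCLUSTERS 16            /* toy volume size (assumption) */

/* Hypothetical stand-in for a refcount record: one run of clusters
 * that all share the same reference count. */
struct toy_rec {
	unsigned cpos;          /* first physical cluster of the run */
	unsigned len;           /* number of clusters in the run */
	unsigned refcount;      /* count shared by the whole run */
};

static unsigned counts[NCLUSTERS];  /* per-cluster reference counts */

/* Add one reference to every cluster in [cpos, cpos + len).
 * Returns 0 on success, -1 if the range leaves the toy volume. */
int toy_increase_refcount(unsigned cpos, unsigned len)
{
	if (cpos > NCLUSTERS || len > NCLUSTERS - cpos)
		return -1;
	for (unsigned i = cpos; i < cpos + len; i++)
		counts[i]++;
	return 0;
}

/* Run-length encode the counters into records, skipping clusters with
 * a count of zero. Returns the number of records written to out. */
size_t toy_list_recs(struct toy_rec *out, size_t max)
{
	size_t n = 0;

	for (unsigned i = 0; i < NCLUSTERS; ) {
		unsigned start = i, rc = counts[i];

		while (i < NCLUSTERS && counts[i] == rc)
			i++;
		if (rc == 0)
			continue;       /* never-seen clusters have no record */
		if (n == max)
			break;
		out[n++] = (struct toy_rec){ start, i - start, rc };
	}
	return n;
}
```

In this model, incrementing clusters 4..11 once yields a single record, and a second increment over clusters 6..7 splits it into three records, which is the kind of split the commit message refers to. Two neighbouring runs that end up with the same count would be reported as one record, which corresponds to a merge.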
Signed-off-by: Tao Ma commit e2e9f6082b5ff099978774d5c0148e062344c2f9 Author: Tao Ma Date: Tue Aug 18 11:22:34 2009 +0800 ocfs2: move tree path functions to alloc.h. Now fs/ocfs2/alloc.c has more than 7000 lines. It contains our basic b-tree operation. Although we have already made our b-tree operation generic, the basic structure ocfs2_path which is used to iterate one b-tree branch is still static and limited to use in alloc.c. As the refcount tree needs them and I don't want to add any more b-tree unrelated code to alloc.c, export them out. Signed-off-by: Tao Ma commit fe924415957e60471536762172d127e85519ef78 Author: Tao Ma Date: Tue Aug 18 11:22:25 2009 +0800 ocfs2: Add refcount b-tree as a new extent tree. Add refcount b-tree as a new extent tree so that it can use the b-tree to store and manipulate ocfs2_refcount_rec. Signed-off-by: Tao Ma commit 555936bfcb1af26c6919d6cedb83710bb03d4322 Author: Tao Ma Date: Tue Aug 18 11:22:21 2009 +0800 ocfs2: Abstract extent split process. ocfs2_mark_extent_written actually does the following things: 1. check the parameters. 2. initialize the left_path and split_rec. 3. call __ocfs2_mark_extent_written. it will do: 1) check the flags of unwritten 2) do the real split work. The whole process is packed tightly somehow. So this patch will abstract 2 different functions so that future b-tree operation can work with it. 1. __ocfs2_split_extent will accept path and split_rec and do the real split work. 2. ocfs2_change_extent_flag will accept a new flag and initialize path and split_rec. So now ocfs2_mark_extent_written will do: 1. check the parameters. 2. call ocfs2_change_extent_flag. 1) initialize the left_path and split_rec. 2) check whether the new flags conflict with the old one. 3) call __ocfs2_split_extent to do the split. Signed-off-by: Tao Ma commit 853a3a1439b18d5a70ada2cb3fcd468e70b7d095 Author: Tao Ma Date: Tue Aug 18 11:22:18 2009 +0800 ocfs2: Wrap ocfs2_extent_contig in ocfs2_extent_tree. 
Add a new operation eo_ocfs2_extent_contig in the extent tree's operations vector. We want this so that refcount trees can always return CONTIG_NONE and prevent extent merging. Signed-off-by: Tao Ma commit 8bf396de984e68491569b49770e4fd7aca40ba65 Author: Tao Ma Date: Mon Aug 24 11:12:02 2009 +0800 ocfs2: Basic tree root operation. Add basic refcount tree root operation. Signed-off-by: Tao Ma commit 374a263e790c4de85844283c098810a92985f623 Author: Tao Ma Date: Mon Aug 24 11:13:37 2009 +0800 ocfs2: Add refcount tree lock mechanism. Implement locking around struct ocfs2_refcount_tree. This protects all read/write operations on refcount trees. ocfs2_refcount_tree has its own lock and its own caching_info, protecting buffers among multiple nodes. User must call ocfs2_lock_refcount_tree before his operation on the tree and unlock it after that. ocfs2_refcount_trees are referenced by the block number of the refcount tree root block, so we create an rb-tree on the ocfs2_super to look them up. Signed-off-by: Tao Ma commit c732eb16bf07f9bfb7fa72b6868462471273bdbd Author: Tao Ma Date: Tue Aug 18 11:21:00 2009 +0800 ocfs2: Add caching info for refcount tree. refcount tree should use its own caching info so that when we downconvert the refcount tree lock, we can drop all the cached buffer heads. Signed-off-by: Tao Ma commit 8dec98edfe9684ce00b580a09dde3dcd21ee785b Author: Tao Ma Date: Tue Aug 18 11:19:58 2009 +0800 ocfs2: Add new refcount tree lock resource in dlmglue. refcount tree lock resource is used to protect refcount tree read/write among multiple nodes. Signed-off-by: Tao Ma commit a433848132d8cdfb8173745b922ddb919de11527 Author: Tao Ma Date: Tue Aug 18 11:19:29 2009 +0800 ocfs2: Abstract caching info checkpoint. In meta downconvert, we need to checkpoint the metadata in an inode. For refcount tree, we also need it. So abstract the process out. 
Signed-off-by: Tao Ma commit f2c870e3b12e38da6d9b5b17c4c8ae56a0ed68e4 Author: Tao Ma Date: Tue Aug 18 11:19:26 2009 +0800 ocfs2: Add ocfs2_read_refcount_block. Signed-off-by: Tao Ma commit 93c97087a646429f4dc0d73298d64674ddd5cde8 Author: Tao Ma Date: Tue Aug 18 11:19:20 2009 +0800 ocfs2: Add metaecc for ocfs2_refcount_block. Add metaecc and journal trigger for ocfs2_refcount_block. Signed-off-by: Tao Ma commit 721f69c404c51a5d1dc93fddb48ee936e8e23770 Author: Tao Ma Date: Tue Aug 18 11:17:49 2009 +0800 ocfs2: Define refcount tree structure. Signed-off-by: Tao Ma commit 342ff1a1b558ebbdb8cbd55ab6a63eca8b2473ca Merge: 50223e4 24ed7a9 Author: Linus Torvalds Date: Tue Sep 22 07:51:45 2009 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) trivial: fix typo in aic7xxx comment trivial: fix comment typo in drivers/ata/pata_hpt37x.c trivial: typo in kernel-parameters.txt trivial: fix typo in tracing documentation trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c trivial: remove unnecessary semicolons trivial: Fix duplicated word "options" in comment trivial: kbuild: remove extraneous blank line after declaration of usage() trivial: improve help text for mm debug config options trivial: doc: hpfall: accept disk device to unload as argument trivial: doc: hpfall: reduce risk that hpfall can do harm trivial: SubmittingPatches: Fix reference to renumbered step trivial: fix typos "man[ae]g?ment" -> "management" trivial: media/video/cx88: add __init/__exit macros to cx88 drivers trivial: fix typo in CONFIG_DEBUG_FS in gcov doc trivial: fix missing printk space in amd_k7_smp_check trivial: fix typo s/ketymap/keymap/ in comment trivial: fix typo "to to" in multiple files trivial: fix typos in comments s/DGBU/DBGU/ ... 
commit 0d54b217a247f39605361f867fefbb9e099a5432 Author: Alexey Dobriyan Date: Mon Sep 21 17:01:09 2009 -0700 const: make struct super_block::s_qcop const Signed-off-by: Alexey Dobriyan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 61e225dc341107be304fd1088146c2a5e88ff9e0 Author: Alexey Dobriyan Date: Mon Sep 21 17:01:08 2009 -0700 const: make struct super_block::dq_op const Signed-off-by: Alexey Dobriyan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit a419aef8b858a2bdb98df60336063d28df4b272f Author: Joe Perches Date: Tue Aug 18 11:18:35 2009 -0700 trivial: remove unnecessary semicolons Signed-off-by: Joe Perches Signed-off-by: Jiri Kosina commit aa261f549d7652258331ebb12795f3bc4395d213 Author: Andi Kleen Date: Wed Sep 16 11:50:16 2009 +0200 HWPOISON: Enable .remove_error_page for migration aware file systems Enable removing of corrupted pages through truncation for a bunch of file systems: ext*, xfs, gfs2, ocfs2, ntfs These should cover most server needs. I chose the set of migration aware file systems for this for now, assuming they have been especially audited. But in general it should be safe for all file systems on the data area that support read/write and truncate. Caveat: the hardware error handler does not take i_mutex for now before calling the truncate function. Is that ok? Cc: tytso@mit.edu Cc: hch@infradead.org Cc: mfasheh@suse.com Cc: aia21@cantab.net Cc: hugh.dickins@tiscali.co.uk Cc: swhiteho@redhat.com Signed-off-by: Andi Kleen commit d23c937b0f740888765676f6f82f509dbbb2bbad Author: Jan Kara Date: Tue Aug 18 18:24:31 2009 +0200 ocfs2: Update syncing after splicing to match generic version Update ocfs2 specific splicing code to use generic syncing helper. The sync now does not happen under rw_lock because generic_write_sync() acquires i_mutex which ranks above rw_lock. That should not matter because standard fsync path does not hold it either. 
Acked-by: Joel Becker Acked-by: Mark Fasheh CC: ocfs2-devel@oss.oracle.com Signed-off-by: Jan Kara commit 918941a3f3d46c2a69971b4718aaf13b1be2f1a7 Author: Jan Kara Date: Mon Aug 17 18:50:08 2009 +0200 ocfs2: Use __generic_file_aio_write instead of generic_file_aio_write_nolock Use the new helper. We have to submit data pages ourselves in case of O_SYNC write because __generic_file_aio_write does not do it for us. OCFS2 developers might think about moving the sync out of i_mutex which seems to be easily possible but that's out of scope of this patch. CC: ocfs2-devel@oss.oracle.com Acked-by: Joel Becker Signed-off-by: Jan Kara commit d993831fa7ffeb89e994f046f93eeb09ec91df08 Author: Jens Axboe Date: Fri Jun 12 14:45:52 2009 +0200 writeback: add name to backing_dev_info This enables us to track who does what and print info. Its main use is catching dirty inodes on the default_backing_dev_info, so we can fix that up. Signed-off-by: Jens Axboe commit 5e404e9ed1b05cafb044bd46792e50197df805ed Author: Joel Becker Date: Fri Feb 13 03:54:22 2009 -0800 ocfs2: Pass ocfs2_caching_info into ocfs_init_*_extent_tree(). With this commit, extent tree operations are divorced from inodes and rely on ocfs2_caching_info. Phew! Signed-off-by: Joel Becker commit a1cf076ba93f9fdf3eb4195f9f43d1e7cb7550f2 Author: Joel Becker Date: Fri Feb 13 03:45:49 2009 -0800 ocfs2: __ocfs2_mark_extent_written() doesn't need struct inode. We only allow unwritten extents on data, so the toplevel ocfs2_mark_extent_written() can use an inode all it wants. But the subfunction isn't even using the inode argument. Signed-off-by: Joel Becker commit f3868d0fa2e20d923087a8296fda47b0afe7f9ba Author: Joel Becker Date: Tue Feb 17 19:46:04 2009 -0800 ocfs2: Teach ocfs2_replace_extent_rec() to use an extent_tree. Don't use a struct inode anymore. 
Signed-off-by: Joel Becker commit d231129f44e7ead14f5f496e664ff1e3883a7b25 Author: Joel Becker Date: Fri Feb 13 03:43:22 2009 -0800 ocfs2: ocfs2_split_and_insert() no longer needs struct inode. It already has an extent_tree. Signed-off-by: Joel Becker commit dbdcf6a48a40e6c9d7081393d793c4f1c5bb4fcf Author: Joel Becker Date: Fri Feb 13 03:41:26 2009 -0800 ocfs2: ocfs2_remove_extent() no longer needs struct inode. One more generic btree function that is isolated from struct inode. Signed-off-by: Joel Becker commit cbee7e1a6a1a2a3d6eda1f76ffc38a3ed3eeb6cc Author: Joel Becker Date: Fri Feb 13 03:34:15 2009 -0800 ocfs2: ocfs2_add_clusters_in_btree() no longer needs struct inode. One more function that doesn't need a struct inode to pass to its children. Signed-off-by: Joel Becker commit cc79d8c19e9d39446525a1026f1a21761f5d3cd2 Author: Joel Becker Date: Fri Feb 13 03:24:43 2009 -0800 ocfs2: ocfs2_insert_extent() no longer needs struct inode. One more function down, no inode in the entire insert-extent chain. Signed-off-by: Joel Becker commit 92ba470c44c1404ff18ca0f4ecce1e5b116bb933 Author: Joel Becker Date: Fri Feb 13 03:18:34 2009 -0800 ocfs2: Make extent map insertion an extent_tree_operation. ocfs2_insert_extent() wants to insert a record into the extent map if it's an inode data extent. But since many btrees can call that function, let's make it an op on ocfs2_extent_tree. Other tree types can leave it empty. Signed-off-by: Joel Becker commit 627961b77e68b725851cb227db10084bf15f6920 Author: Joel Becker Date: Fri Feb 13 03:14:38 2009 -0800 ocfs2: ocfs2_figure_insert_type() no longer needs struct inode. It's not using it, so remove it from the parameter list. Signed-off-by: Joel Becker commit 1ef61b33148a6b32b6d28383cd72ceeddfc7054d Author: Joel Becker Date: Fri Feb 13 03:12:33 2009 -0800 ocfs2: Remove inode from ocfs2_figure_extent_contig(). It already has an ocfs2_extent_tree and doesn't need the inode. 
Signed-off-by: Joel Becker commit a29702914ad36443d83b5250b3bfa1bf91e6b239 Author: Joel Becker Date: Fri Feb 13 03:09:54 2009 -0800 ocfs2: Swap inode for extent_tree in ocfs2_figure_merge_contig_type(). We don't want struct inode in generic btree operations. Signed-off-by: Joel Becker commit b4a176515c715f0c6db1759a39cd9c4175e5a23a Author: Joel Becker Date: Fri Feb 13 03:07:09 2009 -0800 ocfs2: ocfs2_extent_contig() only requires the superblock. Don't pass the inode in. We don't want it around for generic btree operations. Signed-off-by: Joel Becker commit 3505bec01829a8f690259517add55c7941a4d3d5 Author: Joel Becker Date: Fri Feb 13 02:57:58 2009 -0800 ocfs2: ocfs2_do_insert_extent() and ocfs2_insert_path() no longer need an inode. They aren't using it, so remove it from their parameter lists. Signed-off-by: Joel Becker commit c38e52bb1c0187186bd3c4a2b318ffe69cd2fdf8 Author: Joel Becker Date: Fri Feb 13 02:56:23 2009 -0800 ocfs2: Give ocfs2_split_record() an extent_tree instead of an inode. Another on the way to generic btree functions. Signed-off-by: Joel Becker commit d562862314a7b131a630f7b912490312387542fb Author: Joel Becker Date: Fri Feb 13 02:54:36 2009 -0800 ocfs2: ocfs2_insert_at_leaf() doesn't need struct inode. Give it an ocfs2_extent_tree and it is happy. Signed-off-by: Joel Becker commit 4c911eefca316f580f174940cd67d561b4b7e6e8 Author: Joel Becker Date: Fri Feb 13 02:50:12 2009 -0800 ocfs2: Make truncating the extent map an extent_tree_operation. ocfs2_remove_extent() wants to truncate the extent map if it's truncating an inode data extent. But since many btrees can call that function, let's make it an op on ocfs2_extent_tree. Other tree types can leave it empty. Signed-off-by: Joel Becker commit 043beebb6c467a07ccd7aa666095f87fade1c28e Author: Joel Becker Date: Fri Feb 13 02:42:30 2009 -0800 ocfs2: ocfs2_truncate_rec() doesn't need struct inode. It's not using it anymore. Remove it from the parameter list. 
Signed-off-by: Joel Becker commit d401dc12fcced123909eba10334fb5d78866d1a9 Author: Joel Becker Date: Fri Feb 13 02:24:10 2009 -0800 ocfs2: ocfs2_grow_branch() and ocfs2_append_rec_to_path() lose struct inode. ocfs2_grow_branch() is not really using it other than to pass it to the subfunctions ocfs2_shift_tree_depth(), ocfs2_find_branch_target(), and ocfs2_add_branch(). The first two weren't using it either, so they drop the argument. ocfs2_add_branch() only passed it to ocfs2_adjust_rightmost_branch(), which drops the inode argument and uses the ocfs2_extent_tree as well. ocfs2_append_rec_to_path() can take an ocfs2_extent_tree instead of the inode. The function ocfs2_adjust_rightmost_records() goes along for the ride. Signed-off-by: Joel Becker commit c495dd24ac00654f99540f533185e1fcc9534009 Author: Joel Becker Date: Fri Feb 13 02:19:11 2009 -0800 ocfs2: ocfs2_try_to_merge_extent() doesn't need struct inode. It's not using it, so remove it from the parameter list. Signed-off-by: Joel Becker commit 4fe82c312a7d975a9d0f591dc9180c1197ee4270 Author: Joel Becker Date: Fri Feb 13 02:16:08 2009 -0800 ocfs2: ocfs2_merge_rec_left/right() no longer need struct inode. Drop it from the parameters - they already have ocfs2_extent_list. Signed-off-by: Joel Becker commit 70f18c08b476e315c8ee17ea34b55ea1957e7e7d Author: Joel Becker Date: Fri Feb 13 02:09:31 2009 -0800 ocfs2: ocfs2_rotate_tree_left() no longer needs struct inode. It already gets ocfs2_extent_tree, so we can just use that. This chains to the same modification for ocfs2_remove_rightmost_path() and ocfs2_rotate_rightmost_leaf_left(). Signed-off-by: Joel Becker commit e46f74dc357947e2aed9bdd63cf335c5fd23810b Author: Joel Becker Date: Thu Feb 12 19:47:43 2009 -0800 ocfs2: __ocfs2_rotate_tree_left() doesn't need struct inode. It already has struct ocfs2_extent_tree, which has the caching info. So we don't need to pass it struct inode. 
Signed-off-by: Joel Becker commit 1e2dd63fe0b6e99b81904a61090db801978b9520 Author: Joel Becker Date: Thu Feb 12 19:45:28 2009 -0800 ocfs2: ocfs2_rotate_subtree_left() doesn't need struct inode. It already has struct ocfs2_extent_tree, which has the caching info. So we don't need to pass it struct inode. Signed-off-by: Joel Becker commit 09106bae05c3350e8d0ef0ede90b1c3da4bda2f8 Author: Joel Becker Date: Thu Feb 12 19:43:57 2009 -0800 ocfs2: ocfs2_update_edge_lengths() doesn't need struct inode. Pass in the extent tree, which is all we need. Signed-off-by: Joel Becker commit 1bbf0b8d606645c7596ee641acfbf042765c9719 Author: Joel Becker Date: Thu Feb 12 19:42:08 2009 -0800 ocfs2: ocfs2_rotate_tree_right() doesn't need struct inode. We don't need struct inode in ocfs2_rotate_tree_right() anymore. Signed-off-by: Joel Becker commit 6136ca5f5f9fd38da399e9ff9380f537c1b3b901 Author: Joel Becker Date: Thu Feb 12 19:32:43 2009 -0800 ocfs2: Drop struct inode from ocfs2_extent_tree_operations. We can get to the inode from the caching information. Other parent types don't need it. Signed-off-by: Joel Becker commit 7dc028056750328e74ca807041c822068384fe16 Author: Joel Becker Date: Thu Feb 12 19:20:13 2009 -0800 ocfs2: Pass ocfs2_extent_tree to ocfs2_get_subtree_root() Get rid of the inode argument. Use extent_tree instead. This means a few more functions have to pass an extent_tree around. Signed-off-by: Joel Becker commit 5c601aba8c5d9d5f944cf02b59e3288dd72ae6cf Author: Joel Becker Date: Thu Feb 12 19:10:13 2009 -0800 ocfs2: Get inode out of ocfs2_rotate_subtree_root_right(). Pass the ocfs2_extent_list down through ocfs2_rotate_tree_right() and get rid of struct inode in ocfs2_rotate_subtree_root_right(). Signed-off-by: Joel Becker commit 4619c73e7c9bd10bac6b60925fa28d5a2eeaf6ed Author: Joel Becker Date: Thu Feb 12 19:02:36 2009 -0800 ocfs2: ocfs2_complete_edge_insert() doesn't need struct inode at all. Completely unused argument. Get rid of it. 
Signed-off-by: Joel Becker commit 6641b0ce3274d979338cb67b2f562189dcbc1c28 Author: Joel Becker Date: Thu Feb 12 18:57:52 2009 -0800 ocfs2: Pass ocfs2_extent_tree to ocfs2_unlink_path() ocfs2_unlink_path() doesn't need struct inode, so let's pass it struct ocfs2_extent_tree. Signed-off-by: Joel Becker commit 42a5a7a9a5abf9a566b91c51137921957b9a14e4 Author: Joel Becker Date: Thu Feb 12 18:49:19 2009 -0800 ocfs2: ocfs2_create_new_meta_bhs() doesn't need struct inode. Pass struct ocfs2_extent_tree into ocfs2_create_new_meta_bhs(). It no longer needs struct inode or ocfs2_super. Signed-off-by: Joel Becker commit facdb77f54f09a33baf6b649496f5dd1d7922a7e Author: Joel Becker Date: Thu Feb 12 18:08:48 2009 -0800 ocfs2: ocfs2_find_path() only needs the caching info ocfs2_find_path and ocfs2_find_leaf() walk our btrees, reading extent blocks. They need struct ocfs2_caching_info for that, but not struct inode. Signed-off-by: Joel Becker commit 3d03a305ded8057155bd3c801e64ffef9f534827 Author: Joel Becker Date: Thu Feb 12 17:49:26 2009 -0800 ocfs2: Pass ocfs2_caching_info to ocfs2_read_extent_block(). extent blocks belong to btrees on more than just inodes, so we want to pass the ocfs2_caching_info structure directly to ocfs2_read_extent_block(). A number of places in alloc.c can now drop struct inode from their argument list. Signed-off-by: Joel Becker commit d9a0a1f83bf083b55b3c1f16efddecc31abace61 Author: Joel Becker Date: Thu Feb 12 17:32:34 2009 -0800 ocfs2: Store the ocfs2_caching_info on ocfs2_extent_tree. What do we cache? Metadata blocks. What are most of our non-inode metadata blocks? Extent blocks for our btrees. struct ocfs2_extent_tree is the main structure for managing those. So let's store the associated ocfs2_caching_info there. This means that ocfs2_et_root_journal_access() doesn't need struct inode anymore, and any place that has an et can refer to et->et_ci instead of INODE_CACHE(inode). 
Signed-off-by: Joel Becker commit 0cf2f7632b1789b811ab20b611c4156e6de2b055 Author: Joel Becker Date: Thu Feb 12 16:41:25 2009 -0800 ocfs2: Pass struct ocfs2_caching_info to the journal functions. The next step in divorcing metadata I/O management from struct inode is to pass struct ocfs2_caching_info to the journal functions. Thus the journal locks a metadata cache with the cache io_lock function. It also can compare ci_last_trans and ci_created_trans directly. This is a large patch because of all the places we change ocfs2_journal_access..(handle, inode, ...) to ocfs2_journal_access..(handle, INODE_CACHE(inode), ...). Signed-off-by: Joel Becker commit 292dd27ec76b96cebcef576f330ab121f59ccf05 Author: Joel Becker Date: Thu Feb 12 15:41:59 2009 -0800 ocfs2: move ip_created_trans to struct ocfs2_caching_info Similar to ip_last_trans, ip_created_trans tracks the creation of a journal managed inode. This specifically tracks what transaction created the inode. This is so the code can know if the inode has ever been written to disk. This behavior is desirable for any journal managed object. We move it to struct ocfs2_caching_info as ci_created_trans so that any object using ocfs2_caching_info can rely on this behavior. Signed-off-by: Joel Becker commit 66fb345ddd2d343e36692da0ff66126d7a99dc1b Author: Joel Becker Date: Thu Feb 12 15:24:40 2009 -0800 ocfs2: move ip_last_trans to struct ocfs2_caching_info We have the read side of metadata caching isolated to struct ocfs2_caching_info, now we need the write side. This means the journal functions. The journal only does a couple of things with struct inode. This change moves the ip_last_trans field onto struct ocfs2_caching_info as ci_last_trans. This field tells the journal whether a pending journal flush is required. Signed-off-by: Joel Becker commit 8cb471e8f82506937fe5e2e9fb0bf90f6b1f1170 Author: Joel Becker Date: Tue Feb 10 20:00:41 2009 -0800 ocfs2: Take the inode out of the metadata read/write paths. 
We are really passing the inode into the ocfs2_read/write_blocks() functions to get at the metadata cache. This commit passes the cache directly into the metadata block functions, divorcing them from the inode. Signed-off-by: Joel Becker commit 6e5a3d7538ad4e46a976862f593faf65750e37cc Author: Joel Becker Date: Tue Feb 10 19:00:37 2009 -0800 ocfs2: Change metadata caching locks to an operations structure. We don't really want to cart around too many new fields on the ocfs2_caching_info structure. So let's wrap all our access of the parent object in a set of operations. One pointer on caching_info, and more flexibility to boot. Signed-off-by: Joel Becker commit 47460d65a483529b3bc2bf6ccf461ad45f94df83 Author: Joel Becker Date: Tue Feb 10 16:05:07 2009 -0800 ocfs2: Make the ocfs2_caching_info structure self-contained. We want to use the ocfs2_caching_info structure in places that are not inodes. To do that, it can no longer rely on referencing the inode directly. This patch moves the flags to ocfs2_caching_info->ci_flags, stores pointers to the parent's locks on the ocfs2_caching_info, and renames the constants and flags to reflect its independent state. Signed-off-by: Joel Becker