commit e988cf1cfed4ed80bf40528e655fe18bed6a38b6 Author: Mark Fasheh Date: Thu Jul 10 09:25:39 2008 -0700 ocfs2: Fix flags in ocfs2_file_lock The stack-glue merge changed the way we use flags in dlmglue in that we now use the fs/dlm equivalents. Unfortunately, a merge error left the new flock code only partially updated. This took a while to show up though, because the lock level constants are actually identical between o2dlm and fs/dlm. The *_CONVERT and *_NOQUEUE flags have different values though, which is eventually causing a crash in flags_to_o2dlm(). Signed-off-by: Mark Fasheh commit 18c6ac383f3e46cfce08d0bf972705852a4e1268 Author: Sunil Mushran Date: Mon Jul 7 10:06:29 2008 -0700 [PATCH] ocfs2/dlm: Fixes oops in dlm_new_lockres() Patch fixes a race that can result in an oops while adding a lockres to the dlm lockres tracking list. Bug introduced by mainline commit 29576f8bb54045be944ba809d4fca1ad77c94165. Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit 2c39450b39880e162b3eb339672314101f58ee1a Author: Joel Becker Date: Fri May 30 15:58:26 2008 -0700 ocfs2: Remove ->hangup() from stack glue operations. The ->hangup() call was only used to execute ocfs2_hb_ctl. Now that the generic stack glue code handles this, the underlying stack drivers don't need to know about it. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 9f9a99f4eccc64650e932090cff0ebd07b81e334 Author: Joel Becker Date: Fri May 30 15:43:58 2008 -0700 ocfs2: Move the call of ocfs2_hb_ctl into the stack glue. Take o2hb_stop() out of the o2cb code and make it part of the generic stack glue as ocfs2_leave_group(). This also allows us to remove the ocfs2_get_hb_ctl_path() function - everything to do with hb_ctl is now part of stackglue.c. o2cb no longer needs a ->hangup() function. 
Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 3878f110f71a0971ff7acc15dd6db711b6ef37c6 Author: Joel Becker Date: Fri May 30 15:30:49 2008 -0700 ocfs2: Move the hb_ctl_path sysctl into the stack glue. ocfs2 needs to call out to the hb_ctl program at unmount for all cluster stacks. The first step is to move the hb_ctl_path sysctl out of the o2cb code and into the generic stack glue. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 0f475b2abed6cbccee1da20a0bef2895eb2a0edd Author: Sunil Mushran Date: Mon May 12 18:31:37 2008 -0700 [PATCH 3/3] ocfs2/net: Silence build warnings This patch silences the build warnings concerning o2net_init_nst() and friends when building without CONFIG_DEBUG_FS enabled. Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit 959040c37a8cae8117907d4aed87f1b01ff1ea19 Author: Sunil Mushran Date: Mon May 12 18:31:36 2008 -0700 [PATCH 2/3] ocfs2/dlm: Silence build warnings This patch silences the build warnings concerning dlm_debug_init() and friends when building without CONFIG_DEBUG_FS enabled. Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit 271d772d02507c7541d5e6b4938ed2380e59a39a Author: Sunil Mushran Date: Mon May 12 18:31:35 2008 -0700 [PATCH 1/3] ocfs2/net: Silence build warnings This patch silences the build warnings concerning o2net_debugfs_init() and friends when building without CONFIG_DEBUG_FS enabled. Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit a12630b186d56a77d17c9b34c82b88dda4337ed7 Author: Joel Becker Date: Fri May 9 18:49:29 2008 -0700 ocfs2: Rename 'user_stack' plugin structure to 'ocfs2_user_plugin' The static structure describing the userspace cluster plugin for ocfs2 was named 'user_stack', which is a real pain when people are grep(1)ing the tree for the program stack object 'user_stack'. Change the name to something distinct and namespaced. 
Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 9d8df6aa9b1ca74127b11537d91de492dbea666a Author: Al Viro Date: Wed May 21 06:32:11 2008 +0100 ocfs2 endianness fixes Signed-off-by: Al Viro Signed-off-by: Linus Torvalds commit 4ba1c5bfd2e5a6c9528eb7777b66c297e70f61ca Author: Sunil Mushran Date: Fri Apr 18 15:03:59 2008 -0700 ocfs2: Use GFP_NOFS in kmalloc during localalloc window move kmalloc() during a localalloc window move can trigger the mm to prune the dcache, which in turn can trigger the fs to delete an inode, causing it to start a recursive transaction. The fix also makes the change in kmalloc during localalloc shutdown just to be safe. Fixes oss bugzilla#901 http://oss.oracle.com/bugzilla/show_bug.cgi?id=901 Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit bc535809c06ada210d89f5a43b335c68ecbb8e1b Author: Sunil Mushran Date: Fri Apr 18 10:23:53 2008 -0700 ocfs2: Allow uid/gid/perm changes of symlinks This patch adds the ability to change attributes of a symlink. Fixes oss bugzilla#963 http://oss.oracle.com/bugzilla/show_bug.cgi?id=963 Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit 95642e56647d84963428a1168baa8a73cb782ac3 Author: Adrian Bunk Date: Mon Apr 21 11:49:37 2008 +0300 ocfs2/dlm: dlmdebug.c: make 2 functions static This patch makes the following needlessly global functions static: - stringify_lockname() - dlm_debug_put() Signed-off-by: Adrian Bunk Acked-by: Sunil Mushran Signed-off-by: Mark Fasheh commit 4af694e672aaa85940d6e29d27b7eeea5f6eb258 Author: Adrian Bunk Date: Mon Apr 21 11:49:31 2008 +0300 ocfs2: make struct o2cb_stack_ops static This patch makes the needlessly global struct o2cb_stack_ops static. Signed-off-by: Adrian Bunk Acked-by: Joel Becker Signed-off-by: Mark Fasheh commit 4d8755b5e667df8f01647773ba744a5ac97e68e6 Author: Adrian Bunk Date: Mon Apr 21 11:49:26 2008 +0300 ocfs2: make struct ocfs2_control_device static This patch makes the needlessly global struct ocfs2_control_device static. 
Signed-off-by: Adrian Bunk Acked-by: Joel Becker Signed-off-by: Mark Fasheh commit 9d80f7539a91c0154e40fc9e4ae5e818dd8f102e Author: Joel Becker Date: Tue Apr 22 11:46:44 2008 -0700 ocfs2: Correct merge of 52f7c21 (Move /sys/o2cb to /sys/fs/o2cb) Commit 52f7c21b613f80cb425d115c9e5b4ed958a133c0 was intended to move /sys/o2cb to /sys/fs/o2cb, providing /sys/o2cb as a symlink for backwards compatibility. However, the merge apparently added the symlink but failed to move the directory, resulting in a duplicate filename error. It's a one-line change that was missing. Signed-off-by: Joel Becker Acked-by: Randy Dunlap Signed-off-by: Mark Fasheh commit e4ad08fe64afca4ef79ecc4c624e6e871688da0d Author: Miklos Szeredi Date: Wed Apr 30 00:54:37 2008 -0700 mm: bdi: add separate writeback accounting capability Add a new BDI capability flag: BDI_CAP_NO_ACCT_WB. If this flag is set, then don't update the per-bdi writeback stats from test_set_page_writeback() and test_clear_page_writeback(). Misc cleanups: - convert bdi_cap_writeback_dirty() and friends to static inline functions - create a flag that includes all three dirty/writeback related flags, since almost all users will want to have them together Signed-off-by: Miklos Szeredi Cc: Peter Zijlstra Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds commit 42a74f206b914db13ee1f5ae932dcd91a77c8579 Author: Dave Hansen Date: Fri Feb 15 14:37:46 2008 -0800 [PATCH] r/o bind mounts: elevate write count for ioctls() Some ioctl()s can cause writes to the filesystem. Take these, and make them use mnt_want/drop_write() instead. [AV: updated] Acked-by: Al Viro Signed-off-by: Christoph Hellwig Signed-off-by: Dave Hansen Signed-off-by: Andrew Morton Signed-off-by: Al Viro commit 2309e9e040fe29469fb85a384636c455b62fe525 Author: Sunil Mushran Date: Mon Apr 14 10:46:19 2008 -0700 ocfs2/net: Add debug interface to o2net This patch exposes o2net information via debugfs. 
The information includes the list of sockets (sock_containers) as well as the list of outstanding messages (send_tracking). Useful for o2dlm debugging. (This patch is derived from an earlier one written by Zach Brown that exposed the same information via /proc.) [Mark: checkpatch fixes] Signed-off-by: Sunil Mushran Reviewed-by: Joel Becker Signed-off-by: Mark Fasheh commit 93b06edb5127315473d87e075b2b1d1acf74659c Author: Mark Fasheh Date: Fri Apr 4 12:45:55 2008 -0700 ocfs2: Only build ocfs2/dlm with the o2cb stack module fs/ocfs2/dlm/ocfs2_dlm.ko and fs/ocfs2/dlm/ocfs2_dlmfs.ko get built if CONFIG_FS_OCFS2 is specified. This isn't quite how it should happen any more - the "o2cb" dlm modules should only be built if CONFIG_FS_OCFS2_O2CB is set, so update the dlm Makefile accordingly. Signed-off-by: Mark Fasheh Acked-by: Randy Dunlap Acked-by: Joel Becker commit 409753bf6da4a2db038027471abaf324e063db2f Author: Jeff Mahoney Date: Fri Mar 28 16:44:13 2008 -0700 ocfs2/cluster: Get rid of arguments to the timeout routines We keep seeing bug reports related to NULL pointer derefs in o2net_set_nn_state(). When I originally wrote up the configurable timeout patch, I had tried to plan for multiple clusters. This was silly. The timeout routines all use o2nm_single_cluster so there's no point in passing an argument at all. This patch removes the arguments and kills those bugs dead. Signed-off-by: Jeff Mahoney Signed-off-by: Mark Fasheh commit b1f3550fa1471b691ad6c2f35b5b22e93eaa5855 Author: Julia Lawall Date: Tue Mar 4 15:21:05 2008 -0800 ocfs2: Use BUG_ON if (...) BUG(); should be replaced with BUG_ON(...) when the test has no side-effects to allow a definition of BUG_ON that drops the code completely. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // @ disable unlikely @ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (unlikely(E)) { BUG(); } + BUG_ON(E); ) @@ expression E,f; @@ ( if (<... f(...) 
...>) { BUG(); } | - if (E) { BUG(); } + BUG_ON(E); ) // Signed-off-by: Julia Lawall Signed-off-by: Andrew Morton Signed-off-by: Mark Fasheh commit c9ec14884d69a303eef4faae42bd3c4e25b19941 Author: Andi Kleen Date: Sun Jan 27 03:17:17 2008 +0100 ocfs2: Convert ocfs2 over to unlocked_ioctl As far as I can see there is nothing in ocfs2_ioctl that requires the BKL, so use unlocked_ioctl Signed-off-by: Andi Kleen Signed-off-by: Mark Fasheh commit 5dabd69515765156605b09261abf969236a77803 Author: Jan Kara Date: Thu Feb 21 18:00:00 2008 +0100 ocfs2: Improve rename locking ocfs2_rename() was being too aggressive with the rename lock - we only need it for certain forms of directory rename. Signed-off-by: Jan Kara Signed-off-by: Mark Fasheh commit 58dadcdbc2584db050969f9781727fc5a3f618db Author: Julia Lawall Date: Fri Mar 28 14:43:10 2008 -0700 fs/ocfs2/aops.c: test for IS_ERR rather than 0 The function ocfs2_start_trans always returns either a valid pointer or a value made with ERR_PTR, so its result should be tested with IS_ERR, not with a test for 0. Signed-off-by: Julia Lawall Signed-off-by: Andrew Morton Signed-off-by: Mark Fasheh commit 4d0ddb2ce25db2254d468233d942276ecf40bff8 Author: Tao Ma Date: Wed Mar 5 16:11:46 2008 +0800 ocfs2: Add inode stealing for ocfs2_reserve_new_inode Inode allocation is modified to look in other nodes' allocators during extreme out of space situations. We retry our own slot when space is freed back to the global bitmap, or whenever we've allocated more than 1024 inodes from another slot. Signed-off-by: Tao Ma Signed-off-by: Mark Fasheh commit a4a4891164d4f6f383cc17e7c90828a7ca6a1146 Author: Tao Ma Date: Mon Mar 3 17:12:30 2008 +0800 ocfs2: Add ac_alloc_slot in ocfs2_alloc_context In inode stealing, we no longer restrict the allocation to happen in the local node. So it is necessary for us to add a new member in ocfs2_alloc_context to indicate which slot we are using for allocation. 
We also modify the process of local alloc so that this member can be used there also. Signed-off-by: Tao Ma Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit ffda89a3bf3b968bdc268584c6bc1da5c173cf12 Author: Tao Ma Date: Mon Mar 3 17:12:09 2008 +0800 ocfs2: Add a new parameter for ocfs2_reserve_suballoc_bits In some cases (inode stealing from other nodes), we may not want ocfs2_reserve_suballoc_bits to allocate new groups from the global_bitmap since it may already be full. So add a new parameter for this. Signed-off-by: Tao Ma Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit ad5a4d7093a76fa245e277e6f0f0e168a08aeff7 Author: Tao Ma Date: Wed Jan 30 14:21:32 2008 +0800 ocfs2: Enable cross extent block merge. In ocfs2_figure_merge_contig_type, we judge whether there exists a cross extent block merge and enable it by setting CONTIG_LEFT and CONTIG_RIGHT accordingly. Signed-off-by: Tao Ma Signed-off-by: Mark Fasheh commit 677b975282e48d1818df4181336307377d56b04e Author: Tao Ma Date: Wed Jan 30 14:21:05 2008 +0800 ocfs2: Add support for cross extent block In ocfs2_merge_rec_left, when we find the merge extent is "CONTIG_RIGHT" with the first extent record of the next extent block, we will merge it to the next extent block and change all the related extent blocks accordingly. In ocfs2_merge_rec_right, when we find the merge extent is "CONTIG_LEFT" with the last extent record of the previous extent block, we will merge it to the previous extent block and change all the related extent blocks accordingly. As for CONTIG_LEFTRIGHT, we will handle CONTIG_RIGHT first so that when the index is zero, the merge process will be more efficient and easier. Signed-off-by: Tao Ma Signed-off-by: Mark Fasheh commit 52f7c21b613f80cb425d115c9e5b4ed958a133c0 Author: Mark Fasheh Date: Tue Jan 29 17:08:26 2008 -0800 ocfs2: Move /sys/o2cb to /sys/fs/o2cb /sys/fs is where we really want file system specific sysfs objects. 
Ocfs2-tools has been updated to look in /sys/fs/o2cb. We can maintain backwards compatibility with old ocfs2-tools by using a sysfs symlink. After some time (2 years), the symlink can be safely removed. This patch also adds documentation to make it easier for people to figure out what /sys/fs/o2cb is used for. Signed-off-by: Mark Fasheh commit 5cc3bf2786f63cceb191c3c02ddd83c6f38a7d64 Author: Tao Ma Date: Wed Mar 5 15:50:12 2008 +0800 ocfs2: Reconnect after idle time out. Currently, o2net connects to a node on hb_up and disconnects on hb_down and net timeout. Disconnecting on net timeout is ok, but it should then attempt to reconnect. This is because sometimes nodes get overloaded enough that the network connection breaks but the disk hb does not. And if we get into that situation, we either fence (unnecessarily) or wait for its disk hb to die (and sometimes hang in the process). So in this updated scheme, when the network disconnects, we keep attempting to reconnect till we succeed or we get a disk hb down event. If the other node is really dead, then we will eventually get a node down event. If not, we should be able to connect again and continue. Signed-off-by: Tao Ma Signed-off-by: Mark Fasheh commit 8f50eb978935431ccbf89b0344efd4ce6a924875 Author: Sunil Mushran Date: Fri Mar 14 11:18:24 2008 -0700 ocfs2/dlm: Cleanup lockres print A previous patch added KERN_NOTICE to printks printing the lockres that cluttered the output. This patch removes the log level. For people concerned with syslog clutter, please note we now use this facility to print lockres only during an error. Signed-off-by: Sunil Mushran Signed-off-by: Mark Fasheh commit c834cdb15702dd0147875b352cc7d4df93d7d900 Author: Sunil Mushran Date: Mon Mar 10 15:16:29 2008 -0700 ocfs2/dlm: Fix lockname in lockres print function __dlm_print_one_lock_resource was printing lockname incorrectly. 
Also, we now use printk directly instead of mlog as the latter prints the line context which is not useful for this print. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit e5a0334cbd65e27f8dfd9985aa805874fe59e879 Author: Sunil Mushran Date: Mon Mar 10 15:16:28 2008 -0700 ocfs2/dlm: Move dlm_print_one_mle() from dlmmaster.c to dlmdebug.c This patch helps in consolidating debugging related functions in dlmdebug.c. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 7209300a9b987e017cae2ef9d7ef55b0fdd71869 Author: Sunil Mushran Date: Mon Mar 10 15:16:27 2008 -0700 ocfs2/dlm: Dumps the purgelist into a debugfs file This patch dumps all the lockres' on the purgelist it can fit in one page into a debugfs file. Useful for debugging. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit d0129aceaecc2b1f5171b8e8036eb469b6e0fe81 Author: Sunil Mushran Date: Mon Mar 10 15:16:26 2008 -0700 ocfs2/dlm: Dumps the mles into a debugfs file This patch dumps all mles it can fit in one page into a debugfs file. Useful for debugging. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 751155a953e1fe558d3d3c3db7087712ffc15c3e Author: Sunil Mushran Date: Mon Mar 10 15:16:25 2008 -0700 ocfs2/dlm: Move struct dlm_master_list_entry to dlmcommon.h This patch moves some mle related definitions from dlmmaster.c to dlmcommon.h. Future patches need these definitions to dump mle debugging information. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 4e3d24ed1a1285fe3289653aacc965642706bacb Author: Sunil Mushran Date: Mon Mar 10 15:16:24 2008 -0700 ocfs2/dlm: Dumps the lockres' into a debugfs file This patch dumps all the lockres' along with all the locks into a debugfs file. Useful for debugging. 
Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 007dce53a29ccffc000ab5373d188f73881390fd Author: Sunil Mushran Date: Mon Mar 10 15:16:23 2008 -0700 ocfs2/dlm: Dump the dlm state in a debugfs file This patch dumps the dlm state (dlm_ctxt) into a debugfs file. Useful for debugging. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 6325b4a22b8f5e40ea9353288b3d6a32181f9718 Author: Sunil Mushran Date: Mon Mar 10 15:16:22 2008 -0700 ocfs2/dlm: Create debugfs dirs This patch creates the debugfs directories that will hold the files to be used to dump the dlm state. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 29576f8bb54045be944ba809d4fca1ad77c94165 Author: Sunil Mushran Date: Mon Mar 10 15:16:21 2008 -0700 ocfs2/dlm: Link all lockres' to a tracking list This patch links all the lockres' to a tracking list in dlm_ctxt. We will use this in an upcoming patch that will walk the entire list and dump the lockres states to a debugfs file. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 724bdca9b8449d9ee5f779dc27ee3d906a04508c Author: Sunil Mushran Date: Mon Mar 10 15:16:20 2008 -0700 ocfs2/dlm: Create slabcaches for lock and lockres This patch makes the o2dlm allocate memory for lockres, lockname and lock structures from slabcaches rather than kmalloc. This allows us to not only make these allocs more efficient but also allows us to track the memory being consumed by these structures. Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 12eb0035d6f0466038ef2c6e5f6f9296b9b74d91 Author: Sunil Mushran Date: Mon Mar 10 15:16:19 2008 -0700 ocfs2/dlm: Rename slabcache dlm_mle_cache to o2dlm_mle This patch renames dlm_mle_slabcache to prevent namespace clashes with fs/dlm. 
Signed-off-by: Sunil Mushran Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 9341d22942d63d6a1e4cc90f246980dbb7e1ca94 Author: Joel Becker Date: Tue Mar 4 17:58:56 2008 -0800 ocfs2: Allow selection of cluster plug-ins. ocfs2 now supports plug-ins for the classic O2CB stack as well as userspace cluster stacks in conjunction with fs/dlm. This allows zero, one, or both of the plug-ins to be selected in Kconfig. For local mounts (non-clustered), neither plug-in is needed. Both plug-ins can be loaded at one time; the runtime will select the one needed for the cluster system in use. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit b92eccdd28e1e3870a5b2aa625282c9ae8e35cec Author: Joel Becker Date: Wed Nov 28 14:53:30 2007 -0800 ocfs2: Add kbuild for ocfs2_stack_user.ko Add ocfs2_stack_user.ko to the Makefile so that it builds. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 8f318311faf57481452895448e6ffaec7c38a146 Author: Joel Becker Date: Tue Mar 4 16:09:39 2008 -0800 ocfs2: Change mlog_bug_on to BUG_ON in ocfs2_lockid.h The masklog code is in the o2cb stack, but ocfs2_lockid.h now needs to be included by the user stack. The BUG() in ocfs2_lock_type_string() does not need masklog support, so change it to a regular BUG_ON(). Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit cf4d8d75d8aba537a19b313a9364fd08ddbd5622 Author: David Teigland Date: Wed Feb 20 14:29:27 2008 -0800 ocfs2: add fsdlm to stackglue Add code to use fs/dlm. [ Modified to be part of the stack_user module -- Joel ] Signed-off-by: David Teigland Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit d4b95eef4dc4a59bcd42bdf783638a2eaa57b4c8 Author: Joel Becker Date: Wed Feb 20 15:39:44 2008 -0800 ocfs2: Add the 'set version' message to the ocfs2_control device. The "SETV" message sets the filesystem locking protocol version as negotiated by the client. The client negotiates based on the maximum version advertised in /sys/fs/ocfs2/max_locking_protocol. 
Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 3cfd4ab6b6b4bee2035b62e1c293801c3d257502 Author: Joel Becker Date: Wed Feb 20 14:44:34 2008 -0800 ocfs2: Add the local node id to the handshake. This is the second part of the ocfs2_control handshake. After negotiating the ocfs2_control protocol, the daemon tells the filesystem what the local node id is via the SETN message. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit de870ef02295c9f5601dbf2efdc1be6df44b187b Author: Joel Becker Date: Mon Feb 18 17:07:09 2008 -0800 ocfs2: Introduce the DOWN message to ocfs2_control When the control daemon sees a node go down, it sends a DOWN message through the ocfs2_control device. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 462c7e6a257e547eebe1648396cf7c45e684091b Author: Joel Becker Date: Mon Feb 18 19:40:12 2008 -0800 ocfs2: Start the ocfs2_control handshake. When a control daemon opens the ocfs2_control device, it must perform a handshake to tell the filesystem it is something capable of monitoring cluster status. Only after the handshake is complete will the filesystem allow mounts. This is the first part of the handshake. The daemon reads all supported ocfs2_control protocols, then writes in the protocol it will use. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 6427a727557d9c964b7b162ae11bb156e2c501d5 Author: Joel Becker Date: Mon Feb 18 19:23:28 2008 -0800 ocfs2: Add the ocfs2_control misc device. The ocfs2_control misc device is how a userspace control daemon (controld) talks to the filesystem. Introduce the bare-bones filesystem ops. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 8adf0536c9fb578a8542dcf81104d3438a5287e4 Author: Joel Becker Date: Wed Nov 28 14:38:40 2007 -0800 ocfs2: Add the user stack module. Add a skeleton for the stack_user module. It's just the barebones module code. 
Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 9c6c877c04ce17d76a35d2173d3a3840d6b796a2 Author: Joel Becker Date: Fri Feb 1 15:17:30 2008 -0800 ocfs2: Add the 'cluster_stack' sysfs file. Userspace can now query and specify the cluster stack in use via the /sys/fs/ocfs2/cluster_stack file. By default, it is 'o2cb', which is the classic stack. Thus, old tools that do not know how to modify this file will work just fine. The stack cannot be modified if there is a live filesystem. ocfs2_cluster_connect() now takes the expected cluster stack as an argument. This way, the filesystem and the stack glue ensure they are speaking to the same backend. If the stack is 'o2cb', the o2cb stack plugin is used. For any other value, the fsdlm stack plugin is selected. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit b61817e1166c5e19c08baf05196477cc345e1b1a Author: Joel Becker Date: Fri Feb 1 15:08:23 2008 -0800 ocfs2: Add the USERSPACE_STACK incompat bit. The filesystem gains the USERSPACE_STACK incompat bit and the s_cluster_info field on the superblock. When a userspace stack is in use, the name of the stack is stored on-disk for mount-time verification. The "cluster_stack" option is added to mount(2) processing. The mount process needs to pass the matching stack name. If the passed name and the on-disk name do not match, the mount is failed. When using the classic o2cb stack, the incompat bit is *not* set and no mount option is used other than the usual heartbeat=local. Thus, the filesystem is compatible with older tools. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 74ae4e104dfc57017783fc07d5f2f9129062207f Author: Joel Becker Date: Thu Jan 31 23:56:17 2008 -0800 ocfs2: Create stack glue sysfs files. Introduce a set of sysfs files that describe the current stack glue state. The files live under /sys/fs/ocfs2. The locking_protocol file displays the version of ocfs2's locking code. 
The loaded_cluster_plugins file displays all of the currently loaded stack plugins. When filesystems are mounted, the active_cluster_plugin file will display the plugin in use. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 286eaa95c5c5915a6b72cc3f0a2534161fd7928b Author: Joel Becker Date: Fri Feb 1 15:03:57 2008 -0800 ocfs2: Break out stackglue into modules. We define the ocfs2_stack_plugin structure to represent a stack driver. The o2cb stack code is split into stack_o2cb.c. This becomes the ocfs2_stack_o2cb.ko module. The stackglue generic functions are similarly split into the ocfs2_stackglue.ko module. This module now provides an interface to register drivers. The ocfs2_stack_o2cb driver registers itself. As part of this interface, ocfs2_stackglue can load drivers on demand. This is accomplished in ocfs2_cluster_connect(). ocfs2_cluster_disconnect() is now notified when a _hangup() is pending. If a hangup is pending, it will not release the driver module and will let _hangup() do that. Signed-off-by: Joel Becker commit e3dad42bf993a0f24eb6e46152356c9b119c15e8 Author: Joel Becker Date: Fri Feb 1 15:02:36 2008 -0800 ocfs2: Create ocfs2_stack_operations and split out the o2cb stack. Define the ocfs2_stack_operations structure. Build o2cb_stack_ops from all of the o2cb-specific stack functions. Change the generic stack glue functions to call the stack_ops instead of the o2cb functions directly. The o2cb functions are moved to stack_o2cb.c. The headers are cleaned up to where only needed headers are included. In this code, stackglue.c and stack_o2cb.c refer to some shared extern variables. When they become modules, that will change. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 553aa7e408eac402c00b67ddfa7aec13fe1f3a33 Author: Joel Becker Date: Fri Feb 1 14:51:03 2008 -0800 ocfs2: Split o2cb code from generic stack functions. Split off the o2cb-specific functionality from the generic stack glue calls. 
This is a precursor to wrapping the o2cb functionality in an operations vector. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 63e0c48ae6986a5bbb8e8dd9210c0e6ca79f2e50 Author: Joel Becker Date: Wed Jan 30 16:58:36 2008 -0800 ocfs2: Clean up stackglue initialization The stack glue initialization function needs a better name so that it can be used cleanly when stackglue becomes a module. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit cf0acdcd640e9466059e69951c557e90b4bee45a Author: Joel Becker Date: Tue Jan 29 16:59:55 2008 -0800 ocfs2: Abstract out a debugging function for underlying dlms. dlmglue.c was still referencing a raw o2dlm lksb in one instance. Let's create a generic ocfs2_dlm_dump_lksb() function. This allows underlying DLMs to print whatever they want about their lock. We then move the o2dlm dump into stackglue.c where it belongs. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 1693a5c0117f8ccd010a666f97aaf0f14fb0a0e4 Author: David Teigland Date: Wed Jan 30 16:52:53 2008 -0800 ocfs2: handle async EAGAIN from NOQUEUE request When using fsdlm, -EAGAIN is returned in the async callback for NOQUEUE requests. Fix up dlmglue to expect this. Signed-off-by: David Teigland Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit de551246e7bc5558371c3427889a8db1b8cc60f4 Author: Joel Becker Date: Fri Feb 1 14:45:08 2008 -0800 ocfs2: Remove CANCELGRANT from the view of dlmglue. o2dlm has the non-standard behavior of providing a cancel callback (unlock_ast) even when the cancel has failed (the locking operation succeeded without canceling). This is called CANCELGRANT after the status code sent to the callback. fs/dlm does not provide this callback, so dlmglue must be changed to live without it. o2dlm_unlock_ast_wrapper() in stackglue now ignores CANCELGRANT calls. Because dlmglue no longer sees CANCELGRANT, ocfs2_unlock_ast() no longer needs to check for it. 
ocfs2_locking_ast() must catch that a cancel was tried and clear the cancel state. Making these changes opens up a locking race. dlmglue uses the OCFS2_LOCK_BUSY flag to ensure only one thread is calling the dlm at any one time. But dlmglue must unlock the lockres before calling into the dlm. In the small window of time between unlocking the lockres and calling the dlm, the downconvert thread can try to cancel the lock. The downconvert thread is checking the OCFS2_LOCK_BUSY flag - it doesn't know that ocfs2_dlm_lock() has not yet been called. Because ocfs2_dlm_lock() has not yet been called, the cancel operation will just be a no-op. There's nothing to cancel. With CANCELGRANT, dlmglue uses the CANCELGRANT callback to clear up the cancel state. When it comes around again, it will retry the cancel. Eventually, the first thread will have called into ocfs2_dlm_lock(), and either the lock or the cancel will succeed. The downconvert thread can then do its downconvert. Without CANCELGRANT, there is nothing to clean up the cancellation state. The downconvert thread does not know to retry its operations. More importantly, the original lock may be blocking on the other node that is trying to cancel us. With neither able to make progress, the ast is never called and the cancellation state is never cleaned up that way. dlmglue is deadlocked. The OCFS2_LOCK_PENDING flag is introduced to remedy this window. It is set at the same time OCFS2_LOCK_BUSY is. Thus, the downconvert thread can check whether the lock is cancelable. If not, it just loops around to try again. Once ocfs2_dlm_lock() is called, the thread then clears OCFS2_LOCK_PENDING and wakes the downconvert thread. Now, if the downconvert thread finds the lock BUSY, it can safely try to cancel it. Whether the cancel works or not, the state will be properly set and the lock processing can continue. 
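[ Illustrative note, not part of the original commit: the BUSY/PENDING window described above can be modelled with a toy state machine. This is a minimal sketch in Python rather than kernel C. The flag values, the LockRes class and downconvert_action() are hypothetical names invented for illustration; only the ordering of events follows the message above. ]

```python
# Hypothetical flag values; the real kernel constants are not given in this log.
LOCK_BUSY = 0x1
LOCK_PENDING = 0x2


class LockRes:
    """Toy stand-in for a dlmglue lock resource (illustrative only)."""

    def __init__(self):
        self.flags = 0

    def begin_request(self):
        # BUSY and PENDING are set together, before the lockres is
        # unlocked and before the dlm has been called.
        self.flags |= LOCK_BUSY | LOCK_PENDING

    def dlm_called(self):
        # The request has reached the dlm, so a cancel can now take effect.
        # PENDING is cleared; the downconvert thread would be woken here.
        self.flags &= ~LOCK_PENDING

    def request_done(self):
        # The ast has run and the lock is no longer busy.
        self.flags &= ~(LOCK_BUSY | LOCK_PENDING)


def downconvert_action(res):
    """Decide what the downconvert thread does when it inspects the lock."""
    if not res.flags & LOCK_BUSY:
        return "downconvert"
    if res.flags & LOCK_PENDING:
        return "retry"    # nothing to cancel yet, loop around and try again
    return "cancel"       # the request is in the dlm, cancelling is safe
```

[ In this model the downconvert thread only attempts a cancel once PENDING has been cleared, which is the window the commit closes. ]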
Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 0abd6d1803b01c741430af270026d1d95a103d9c Author: Mark Fasheh Date: Tue Jan 29 16:59:56 2008 -0800 ocfs2: Fill node number during cluster stack init It doesn't make sense to query for a node number before connecting to the cluster stack. This should be safe to do because node_num is only just printed, and we're actually only moving the setting of node num a small amount further in the mount process. [ Disconnect when node query fails -- Joel ] Reviewed-by: Joel Becker Signed-off-by: Mark Fasheh commit 6953b4c008628b945bfe0cee97f6e78a98773859 Author: Joel Becker Date: Tue Jan 29 16:59:56 2008 -0800 ocfs2: Move o2hb functionality into the stack glue. The last bit of classic stack used directly in ocfs2 code is o2hb. Specifically, the check for heartbeat during mount and the call to ocfs2_hb_ctl during unmount. We create an extra API, ocfs2_cluster_hangup(), to encapsulate the call to ocfs2_hb_ctl. Other stacks will just leave hangup() empty. The check for heartbeat is moved into ocfs2_cluster_connect(). It will be matched by a similar check for other stacks. With this change, only stackglue.c includes cluster/ headers. Signed-off-by: Joel Becker Signed-off-by: Mark Fasheh commit 19fdb624dc8ccb663f6e48b3a3a3fa4e4e567fc1 Author: Joel Becker Date: Wed Jan 30 15:38:24 2008 -0800 ocfs2: Abstract out node number queries. ocfs2 asks the cluster stack for the local node's node number for two reasons; to fill the slot map and to print it. While the slot map isn't necessary for userspace cluster stacks, the printing is very nice for debugging. Thus we add ocfs2_cluster_this_node() as a generic API to get this value. It is anticipated that the slot map will not be used under a userspace cluster stack, so validity checks of the node num only need to exist in the slot map code. Otherwise, it just gets used and printed as an opaque value. 
[ Fixed up some "int" versus "unsigned int" issues and made osb->node_num truly opaque. --Mark ]

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit 4670c46ded9a18268d1265417ff4ac72145a7917
Author: Joel Becker
Date: Fri Feb 1 14:39:35 2008 -0800

ocfs2: Introduce the new ocfs2_cluster_connect/disconnect() API.

This step introduces a cluster stack agnostic API for initializing and exiting. fs/ocfs2/dlmglue.c no longer uses o2cb/o2dlm knowledge to connect to the stack. It is all handled in stackglue.c.

heartbeat.c no longer needs to know how it gets called. ocfs2_do_node_down() is now a clean recovery trigger.

The big gotcha is the ordering of initializations and de-initializations done underneath ocfs2_cluster_connect(). ocfs2_dlm_init() used to do all o2dlm initialization in one block. Thus, the o2dlm functionality of ocfs2_cluster_connect() is very straightforward. ocfs2_dlm_shutdown(), however, did a few things between de-registration of the eviction callback and actually shutting down the domain. Now de-registration and shutdown of the domain are wrapped within the single ocfs2_cluster_disconnect() call. I've checked the code paths to make sure we can safely tear down things in ocfs2_dlm_shutdown() before calling ocfs2_cluster_disconnect(). The filesystem has already set itself to ignore the callback.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit 8f2c9c1b16bf6ed0903b29c49d56fa0109a390e4
Author: Joel Becker
Date: Fri Feb 1 12:16:57 2008 -0800

ocfs2: Create the lock status block union.

Wrap the lock status block (lksb) in a union. Later we will add a union element for the fs/dlm lksb. Create accessors for the status and lvb fields. Other than a debugging function, dlmglue.c does not directly reference the o2dlm locking path anymore.
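A rough userspace sketch of the idea of wrapping the lksb in a union and reaching it only through accessors. All type and function names here, and the LVB size, are hypothetical stand-ins rather than the kernel's actual definitions.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_LVB_LEN 64 /* assumed size, for illustration only */

/* Stand-in for the o2dlm lock status block. */
struct sketch_o2dlm_lksb {
	int status;
	char lvb[SKETCH_LVB_LEN];
};

/* The union has a single member for now; a member for the fs/dlm lksb
 * could be added later without touching the callers. */
union sketch_lksb {
	struct sketch_o2dlm_lksb lksb_o2dlm;
};

/* Accessor for the status field, so callers never name the o2dlm member. */
static int sketch_lksb_status(const union sketch_lksb *lksb)
{
	assert(lksb != NULL);
	return lksb->lksb_o2dlm.status;
}

/* Accessor for the lock value block. */
static void *sketch_lksb_lvb(union sketch_lksb *lksb)
{
	assert(lksb != NULL);
	return lksb->lksb_o2dlm.lvb;
}
```

Because callers go through the two accessors only, choosing which union member is live becomes a decision local to the glue layer.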
Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit 7431cd7e8dd0e46e9b12bd6a1ac1286f4b420371
Author: Joel Becker
Date: Fri Feb 1 12:15:37 2008 -0800

ocfs2: Use -errno instead of dlm_status for ocfs2_dlm_lock/unlock() API.

Change the ocfs2_dlm_lock/unlock() functions to return -errno values. This is the first step towards eliminating dlm_status in fs/ocfs2/dlmglue.c. The change also passes -errno values to ->unlock_ast().

[ Fix a return code in dlmglue.c and change the error translation table into an array of ints. --Mark ]

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit bd3e76105d4478ab89951a52d1a35250d24a9f16
Author: Joel Becker
Date: Fri Feb 1 12:14:57 2008 -0800

ocfs2: Use global DLM_ constants in generic code.

The ocfs2 generic code should use the values in . stackglue.c will convert them to o2dlm values.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit 24ef1815e5e13e50196eb1ab8ddc0d783443bdf8
Author: Joel Becker
Date: Tue Jan 29 17:37:32 2008 -0800

ocfs2: Separate out dlm lock functions.

This is the first in a series of patches to isolate ocfs2 from the underlying cluster stack. Here we wrap the dlm locking functions with ocfs2-specific calls.

Because ocfs2 always uses the same dlm lock status callbacks, we can eliminate the callbacks from the filesystem visible functions.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit 386a2ef8576e966076c293f6496b9e3d7e3d9035
Author: Joel Becker
Date: Fri Feb 1 12:06:54 2008 -0800

ocfs2: New slot map format

The old slot map had a few limitations:

- It was limited to one block, so the maximum slot count was 255.
- Each slot was signed 16bits, limiting node numbers to INT16_MAX.
- An empty slot was marked by the magic 0xFFFF (-1).

The new slot map format provides 32bit node numbers (UINT32_MAX), a separate space to mark a slot in use, and extra room to grow. The slot map is now bounded by i_size, not a block.
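To illustrate the difference between the two formats, here is a small userspace sketch. The struct layout and all `sketch_` names are assumptions made for the example, and on-disk endianness is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Old format: one 16-bit value per slot, with a magic value meaning "empty". */
#define SKETCH_OLD_EMPTY_SLOT 0xFFFFu

/* New format (hypothetical layout): validity is kept apart from the node
 * number, with reserved bytes left as room to grow. */
struct sketch_extended_slot {
	uint8_t  valid;
	uint8_t  reserved[3];
	uint32_t node_num;
};

/* Old format: a slot is in use unless it holds the magic value. */
static bool sketch_old_slot_in_use(uint16_t raw)
{
	return raw != SKETCH_OLD_EMPTY_SLOT;
}

/* New format: the node number plays no part in the validity check. */
static bool sketch_new_slot_in_use(const struct sketch_extended_slot *slot)
{
	assert(slot != NULL);
	return slot->valid != 0;
}

/* The map is bounded by the file size rather than by a single block. */
static uint64_t sketch_slots_in_map(uint64_t i_size)
{
	return i_size / sizeof(struct sketch_extended_slot);
}
```

With the separate validity byte, a node number such as 0xFFFF (or anything up to UINT32_MAX) no longer collides with the empty marker.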
Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit fb86b1f07120b66769a39c445da5c4300069dd44
Author: Joel Becker
Date: Fri Feb 1 11:59:05 2008 -0800

ocfs2: Define the contents of the slot_map file.

The slot map file is merely an array of __le16. Wrap it in a structure for cleaner reference.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit fc881fa0d59596c02f8707b5572567c369d4789a
Author: Joel Becker
Date: Fri Feb 1 12:04:48 2008 -0800

ocfs2: De-magic the in-memory slot map.

The in-memory slot map uses the same magic as the on-disk one. There is a special value to mark a slot as invalid. It relies on the size of certain types and so on.

Write a new in-memory map that keeps validity as a separate field. Outside of the I/O functions, OCFS2_INVALID_SLOT now means what it is supposed to. It also is no longer tied to the type size.

This also means that only the I/O functions refer to 16bit quantities.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit 1c8d9a6a330f46b3a6ddd204a2580131d5f0d6b7
Author: Joel Becker
Date: Fri Feb 1 11:59:07 2008 -0800

ocfs2: slot_map I/O based on max_slots.

The slot map code assumed a slot_map file has one block allocated. This changes the code to I/O as many blocks as will cover max_slots.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit 553abd046af609191a91af7289d87d477adc659f
Author: Joel Becker
Date: Fri Feb 1 12:03:57 2008 -0800

ocfs2: Change the recovery map to an array of node numbers.

The old recovery map was a bitmap of node numbers. This was sufficient for the maximum node number of 254. Going forward, we want node numbers to be UINT32. Thus, we need a new recovery map.

Note that we can't keep track of slots here. We must write down the node number to recover *before* we get the locks needed to convert a node number into a slot number.

The recovery map is now an array of unsigned ints, max_slots in size. It moves to journal.c with the rest of recovery.
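As an illustration of a recovery map kept as an array of node numbers, max_slots in size, the following userspace sketch may help. The struct, the function names, and the use of calloc() are assumptions for the example and do not mirror the code in journal.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical recovery map: node numbers waiting for recovery. */
struct sketch_recovery_map {
	unsigned int used;      /* entries currently recorded */
	unsigned int max_slots; /* capacity of the array */
	unsigned int *entries;  /* node numbers, in insertion order */
};

/* Allocate the array; returns 0 on success, -1 on allocation failure. */
static int sketch_rm_init(struct sketch_recovery_map *rm, unsigned int max_slots)
{
	rm->entries = calloc(max_slots, sizeof(*rm->entries));
	if (!rm->entries)
		return -1;
	rm->used = 0;
	rm->max_slots = max_slots;
	return 0;
}

static bool sketch_rm_contains(const struct sketch_recovery_map *rm,
			       unsigned int node_num)
{
	for (unsigned int i = 0; i < rm->used; i++)
		if (rm->entries[i] == node_num)
			return true;
	return false;
}

/* Record a node; returns false if it was already recorded. */
static bool sketch_rm_set(struct sketch_recovery_map *rm, unsigned int node_num)
{
	if (sketch_rm_contains(rm, node_num))
		return false;
	assert(rm->used < rm->max_slots);
	rm->entries[rm->used++] = node_num;
	return true;
}

/* Drop a node once it has been recovered, keeping the rest in order. */
static void sketch_rm_clear(struct sketch_recovery_map *rm, unsigned int node_num)
{
	unsigned int i;

	for (i = 0; i < rm->used; i++)
		if (rm->entries[i] == node_num)
			break;
	if (i == rm->used)
		return;
	for (; i + 1 < rm->used; i++)
		rm->entries[i] = rm->entries[i + 1];
	rm->used--;
}

static void sketch_rm_exit(struct sketch_recovery_map *rm)
{
	free(rm->entries);
	rm->entries = NULL;
	rm->used = 0;
}
```

Unlike a bitmap, the array stores the node number itself, so a value far above 254 can be recorded without growing the structure.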
Because it needs to be initialized, we move all of recovery initialization into a new function, ocfs2_recovery_init(). This actually cleans up ocfs2_initialize_super() a little as well. Following on, recovery cleanup becomes part of ocfs2_recovery_exit().

A number of node map functions are rendered obsolete and are removed.

Finally, waiting on recovery is wrapped in a function rather than naked checks on the recovery_event. This is a cleanup from Mark.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit d85b20e4b300edfd290f21fc2d790ba16d2f225b
Author: Joel Becker
Date: Fri Feb 1 12:01:05 2008 -0800

ocfs2: Make ocfs2_slot_info private.

Just use osb_lock around the ocfs2_slot_info data. This allows us to take the ocfs2_slot_info structure private in slot_map.c. All access is now via accessors.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

commit 8e8a4603b5422c9145880e73b23bc4c2c4de0098
Author: Mark Fasheh
Date: Fri Feb 1 11:59:09 2008 -0800

ocfs2: Move slot map access into slot_map.c

journal.c and dlmglue.c would refresh the slot map by hand. Instead, have the update and clear functions do the work inside slot_map.c. The eventual result is to make ocfs2_slot_info defined privately in slot_map.c

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh