commit 553e6a1aecf77a9655f02c6dd62dcf08e8c8cb78
Author: Greg Kroah-Hartman
Date:   Fri Nov 16 08:19:12 2007 -0800

    Linux 2.6.23.2

commit 0520fb16465a12dda986d51fc7be3eca6e82603b
Author: Jens Axboe
Date:   Tue Oct 30 11:18:15 2007 +0100

    BLOCK: Fix bad sharing of tag busy list on queues with shared tag maps

    patch 6eca9004dfcb274a502438a591df5b197690afb1 in mainline.

    For the locking to work, only the tag map and tag bit map may be shared
    (incidentally, I was just explaining this to Nick yesterday, but I
    apparently didn't review the code well enough myself). But we also share
    the busy list! The busy_list must be queue private, or we need a
    block_queue_tag covering lock as well.

    So we have to move the busy_list to the queue. This'll work fine, and
    it'll actually also fix a problem with blk_queue_invalidate_tags() which
    will invalidate tags across all shared queues. This is a bit confusing,
    the low level driver should call it for each queue separately since
    otherwise you cannot kill tags on just a single queue for, e.g., a hard
    drive that stops responding. Since the function has no callers
    currently, it's not an issue.

    This is fixed with commit 6eca9004dfcb274a502438a591df5b197690afb1 in
    Linus' tree.

    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

commit bba9d994eb41060c8a6e09207f659cf4e26e9384
Author: Hugh Dickins
Date:   Mon Oct 29 14:37:20 2007 -0700

    fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE

    patch 487e9bf25cbae11b131d6a14bdbb3a6a77380837 in mainline.

    It's possible to provoke unionfs (not yet in mainline, though in mm and
    some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)). I
    expect it's possible to provoke the 2.6.23 ecryptfs in the same way (but
    the 2.6.24 ecryptfs no longer calls lower level's ->writepage).

    This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE could
    leak from tmpfs via write_cache_pages and unionfs to userspace.
    There's already a fix (e423003028183df54f039dfda8b58c49e78c89d7 -
    writeback: don't propagate AOP_WRITEPAGE_ACTIVATE) in the tree for that,
    and it's okay so far as it goes; but insufficient because it doesn't
    address the underlying issue, that shmem_writepage expects to be called
    only by vmscan (relying on backing_dev_info capabilities to prevent the
    normal writeback path from ever approaching it).

    That's an increasingly fragile assumption, and ramdisk_writepage (the
    other source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
    wbc->for_reclaim before returning it. Make the same check in
    shmem_writepage, thereby sidestepping the page_mapped BUG also.

    Signed-off-by: Hugh Dickins
    Cc: Erez Zadok
    Reviewed-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 59ddd4607313e7ee20b9ad36cd3b00f83068189a
Author: David Miller
Date:   Mon Nov 12 23:59:05 2007 -0800

    Fix compat futex hangs.

    [FUTEX]: Fix address computation in compat code.

    [ Upstream commit: 3c5fd9c77d609b51c0bab682c9d40cbb496ec6f1 ]

    compat_exit_robust_list() computes a pointer to the futex entry in
    userspace as follows:

        (void __user *)entry + futex_offset

    'entry' is a 'struct robust_list __user *', and 'futex_offset' is a
    'compat_long_t' (typically a 's32').

    Things explode if the 32-bit sign bit is set in futex_offset.

    Type promotion sign extends futex_offset to a 64-bit value before adding
    it to 'entry'.

    This triggered a problem on sparc64 running 32-bit applications which
    would lock up a cpu looping forever in the fault handling for the
    userspace load in handle_futex_death().

    Compat userspace runs with address masking (wherein the cpu zeros out
    the top 32-bits of every effective address given to a memory operation
    instruction) so the sparc64 fault handler accounts for this by zero'ing
    out the top 32-bits of the fault address too.
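The sign extension described in this entry can be reproduced in plain user-space C. The following is only a minimal sketch, not the kernel code: the helper names add_sign_extended() and add_compat_32bit() and the sample values are hypothetical, and the pointer is modelled as a 64-bit integer. The second helper shows one way to avoid the problem, by doing the addition in 32 bits and zero-extending the result.

```c
#include <stdint.h>

/*
 * Hypothetical model of the buggy computation: the 32-bit signed offset is
 * promoted (sign extended) to 64 bits before it is added to the pointer.
 * A negative offset therefore contributes 0xffffffff in the upper half.
 */
static uint64_t add_sign_extended(uint64_t entry, int32_t futex_offset)
{
    return entry + (uint64_t)(int64_t)futex_offset;
}

/*
 * Hypothetical model of a safe computation: add in 32-bit arithmetic, the
 * way the 32-bit process itself would, then zero-extend to 64 bits.
 */
static uint64_t add_compat_32bit(uint64_t entry, int32_t futex_offset)
{
    uint32_t sum = (uint32_t)entry + (uint32_t)futex_offset;

    return (uint64_t)sum;
}
```

With the assumed inputs entry = 0x00010000 and futex_offset = (int32_t)0xf7f06bd8u, the first helper yields an address whose upper 32 bits are all ones, while the second yields 0xf7f16bd8. For small positive offsets both helpers agree, which is why the problem only shows up when the sign bit of the offset is set.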
    Since the kernel properly uses the compat_uptr interfaces, kernel side
    accesses to compat userspace work too since they will only use addresses
    with the top 32-bit clear.

    Because of this compat futex layer bug we get into the following loop
    when executing the get_user() load near the top of handle_futex_death():

    1) load from address '0xfffffffff7f16bd8', FAULT
    2) fault handler clears upper 32-bits, processes fault for address
       '0xf7f16bd8' which succeeds
    3) goto #1

    I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
    for their tireless efforts helping me track down this bug.

    Signed-off-by: David S. Miller
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit e823c33c6f670beba3c14f4a451fd2b34c3eb40c
Author: Frans Pop
Date:   Wed Nov 14 01:18:19 2007 +0100

    sched: keep utime/stime monotonic

    sched: keep utime/stime monotonic

    cpustats use utime/stime as a ratio against sum_exec_runtime, as a
    consequence it can happen - when the ratio changes faster than time
    accumulates - that either can appear to go backwards.

    Combined backport for 2.6.23 of the following patches from mainline:

    commit 73a2bcb0edb9ffb0b007b3546b430e2c6e415eee
    Author: Peter Zijlstra
        sched: keep utime/stime monotonic

    commit 9301899be75b464ef097f0b5af7af6d9bd8f68a7
    Author: Balbir Singh
        sched: fix /proc/<PID>/stat stime/utime monotonicity, part 2

    Signed-off-by: Frans Pop
    CC: Peter Zijlstra
    CC: Balbir Singh
    Signed-off-by: Greg Kroah-Hartman

commit 436e61d93605a3a36902c9ee510b0ecba0d7d361
Author: Ingo Molnar
Date:   Tue Oct 16 23:18:38 2007 -0700

    fix the softlockup watchdog to actually work

    patch a115d5caca1a2905ba7a32b408a6042b20179aaa in mainline.

    this Xen related commit:

       commit 966812dc98e6a7fcdf759cbfa0efab77500a8868
       Author: Jeremy Fitzhardinge
       Date:   Tue May 8 00:28:02 2007 -0700

           Ignore stolen time in the softlockup watchdog

    broke the softlockup watchdog to never report any lockups. (!)
    print_timestamp defaults to 0, this makes the following condition
    always true:

        if (print_timestamp < (touch_timestamp + 1) ||

    and we'll in essence never report soft lockups.

    apparently the functionality of the soft lockup watchdog was never
    actually tested with that patch applied ...

    Signed-off-by: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 4d03fda881a2cdd5826ed5eb58587d3668ea787a
Author: Jens Axboe
Date:   Tue Oct 16 10:01:29 2007 +0200

    splice: fix double kunmap() in vmsplice copy path

    patch 6866bef40d06f7c2baac3a855b1917a8ca75456c in mainline.

    The out label should not include the unmap, the only way to jump
    there already has unmapped the source.

    00002000
           f7c21a00 00000000 00000000 c0489036 00018e32 00000002 00000000
    00001000
    Call Trace:
     [] pipe_to_user+0xca/0xd3
     [] __splice_from_pipe+0x53/0x1bd
     [] ------------[ cut here ]------------
    filemap_fault+0x221/0x380
     [] pipe_to_user+0x0/0xd3
     [] sys_vmsplice+0x3b7/0x422
     [] kernel BUG at mm/highmem.c:206!
    handle_mm_fault+0x4d5/0x8eb
     [] kmap_atomic+0x1c/0x20
     [] unmap_vmas+0x3d1/0x584
     [] free_pgtables+0x90/0xa0
     [] pgd_dtor+0x0/0x1
     [] audit_syscall_exit+0x2aa/0x2c6
     [] do_syscall_trace+0x124/0x169
     [] syscall_call+0x7/0xb
     =======================
    Code: 2d 00 d0 5b 00 25 00 00 e0 ff 29 invalid opcode: 0000 [#1]
    c2 89 d0 c1 e8 0c 8b 14 85 a0 6c 7c c0 4a 85 d2 89 14 85 a0 6c 7c c0 74
    07 31 c9 4a 75 15 eb 04 <0f> 0b eb fe 31 c9 81 3d 78 38 6d c0 78 38 6d
    c0 0f 95 c1 b0 01
    EIP: [] kunmap_high+0x51/0x8e SS:ESP 0068:f5960df0
    SMP
    Modules linked in: netconsole autofs4 hidp nfs lockd nfs_acl rfcomm
    l2cap bluetooth sunrpc ipv6 ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad
    ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
    dm_multipath dm_mod video output sbs battery ac parport_pc lp parport sg
    i2c_piix4 i2c_core floppy cfi_probe gen_probe scb2_flash mtd chipreg tg3
    e1000 button ide_cd serio_raw cdrom aic7xxx scsi_transport_spi sd_mod
    scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
    CPU:    3
    EIP:    0060:[]    Not tainted VLI
    EFLAGS: 00010246   (2.6.23 #1)
    EIP is at kunmap_high+0x51/0x8e

    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

commit 2e25e4331914ab9dc1cfaf2f162ef69e67a19c44
Author: Andrew Morton
Date:   Tue Oct 16 23:18:32 2007 -0700

    writeback: don't propagate AOP_WRITEPAGE_ACTIVATE

    patch e423003028183df54f039dfda8b58c49e78c89d7 in mainline.

    This is a writeback-internal marker but we're propagating it all the way
    back to userspace!

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit f8b98ff93bba351932c1acfc75435fe7bfe48294
Author: Christoph Lameter
Date:   Mon Nov 5 11:15:43 2007 -0800

    SLUB: Fix memory leak by not reusing cpu_slab

    patch 05aa345034de6ae9c77fb93f6a796013641d57d5 in mainline.

    SLUB: Fix memory leak by not reusing cpu_slab

    Fix the memory leak that may occur when we attempt to reuse a cpu_slab
    that was allocated while we reenabled interrupts in order to be able to
    grow a slab cache.
    The per cpu freelist may contain objects and in that situation we may
    overwrite the per cpu freelist pointer, losing objects. This only occurs
    if we find that the concurrently allocated slab fits our allocation
    needs.

    If we simply always deactivate the slab then the freelist will be
    properly reintegrated and the memory leak will go away.

    Signed-off-by: Christoph Lameter
    Cc: Hugh Dickins
    Signed-off-by: Greg Kroah-Hartman

commit 0ebc8ca802af6a8b6c5d311a4a7250aa5d7d3625
Author: Tsugikazu Shibata
Date:   Fri Oct 12 15:16:06 2007 -0700

    HOWTO: update ja_JP/HOWTO with latest changes

    patch 3b6662f192fc521b9657f63e68d20ec99979dae6 upstream.

    Here is another sync patch of Documentation/ja_JP/HOWTO.

    A Japanese developer sent me some cosmetic changes, and the patch also
    follows these changes of HOWTO:
     - Cross reference URL (sosdg.org/qiyong/lxr)
     - known_regression
     - explanations on kernel dev. process

    Signed-off-by: Tsugikazu Shibata
    Signed-off-by: Greg Kroah-Hartman

commit e8af293bb2f8cc44868a67e2af8010feaa7c309b
Author: Jan Kiszka
Date:   Wed Nov 14 17:00:08 2007 -0800

    fix param_sysfs_builtin name length check

    patch 22800a2830ec07e7cc5c837999890ac47cc7f5de in mainline.

    Commit faf8c714f4508207a9c81cc94dafc76ed6680b44 caused a regression:
    parameter names longer than MAX_KBUILD_MODNAME will now be rejected,
    although we just need to keep the module name part that short. This
    patch restores the old behaviour while still avoiding that memchr is
    called with its length parameter larger than the total string length.

    Signed-off-by: Jan Kiszka
    Cc: Dave Young
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Cc: Chuck Ebbert
    Signed-off-by: Greg Kroah-Hartman

commit 2b5ee2866a4a4781158986f21fdaa3395bc27d13
Author: Dave Young
Date:   Thu Oct 18 03:05:07 2007 -0700

    param_sysfs_builtin memchr argument fix

    patch faf8c714f4508207a9c81cc94dafc76ed6680b44 in mainline.

    If memchr argument is longer than strlen(kp->name), there will be some
    weird result.

    It will cause duplicate filenames in sysfs for the "nousb".
    kernel warning messages are as below:

    sysfs: duplicate filename 'usbcore' can not be created
    WARNING: at fs/sysfs/dir.c:416 sysfs_add_one()
     [] sysfs_add_one+0xa0/0xe0
     [] create_dir+0x48/0xb0
     [] sysfs_create_dir+0x29/0x50
     [] create_dir+0x1b/0x50
     [] kobject_add+0x46/0x150
     [] kobject_init+0x3a/0x80
     [] kernel_param_sysfs_setup+0x50/0xb0
     [] param_sysfs_builtin+0xee/0x130
     [] param_sysfs_init+0x23/0x60
     [] __next_cpu+0x12/0x20
     [] kernel_init+0x0/0xb0
     [] kernel_init+0x0/0xb0
     [] do_initcalls+0x46/0x1e0
     [] create_proc_entry+0x52/0x90
     [] register_irq_proc+0x9c/0xc0
     [] proc_mkdir_mode+0x34/0x50
     [] kernel_init+0x0/0xb0
     [] kernel_init+0x62/0xb0
     [] kernel_thread_helper+0x7/0x14
     =======================
    kobject_add failed for usbcore with -EEXIST, don't try to register
    things with the same name in the same directory.
     [] kobject_add+0xf6/0x150
     [] kernel_param_sysfs_setup+0x50/0xb0
     [] param_sysfs_builtin+0xee/0x130
     [] param_sysfs_init+0x23/0x60
     [] __next_cpu+0x12/0x20
     [] kernel_init+0x0/0xb0
     [] kernel_init+0x0/0xb0
     [] do_initcalls+0x46/0x1e0
     [] create_proc_entry+0x52/0x90
     [] register_irq_proc+0x9c/0xc0
     [] proc_mkdir_mode+0x34/0x50
     [] kernel_init+0x0/0xb0
     [] kernel_init+0x62/0xb0
     [] kernel_thread_helper+0x7/0x14
     =======================
    Module 'usbcore' failed to be added to sysfs, error number -17
    The system will be unstable now.

    Signed-off-by: Dave Young
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Cc: Chuck Ebbert
    Signed-off-by: Greg Kroah-Hartman

commit aead196be316ca4026558ac1139098080e0751c5
Author: Linus Torvalds
Date:   Wed Oct 31 09:19:46 2007 -0700

    Remove broken ptrace() special-case code from file mapping

    The kernel has for random historical reasons allowed ptrace() accesses
    to access (and insert) pages into the page cache above the size of the
    file.

    However, Nick broke that by mistake when doing the new fault handling in
    commit 54cb8821de07f2ffcd28c380ce9b93d5784b40d7 ("mm: merge populate and
    nopage into fault (fixes nonlinear)").
    The breakage caused a hang with gdb when trying to access the invalid
    page.

    The ptrace "feature" really isn't worth resurrecting, since it really is
    wrong both from a portability _and_ from an internal page cache validity
    standpoint. So this removes those old broken remnants, and fixes the
    ptrace() hang in the process.

    Noticed and bisected by Duane Griffin, who also supplied a test-case
    (quoth Nick: "Well that's probably the best bug report I've ever had,
    thanks Duane!").

    Cc: Duane Griffin
    Acked-by: Nick Piggin
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit f153577e808532933e2cbe935e68c51be4c9a4b8
Author: J. Bruce Fields
Date:   Tue Oct 30 11:20:02 2007 -0400

    locks: fix possible infinite loop in posix deadlock detection

    patch 97855b49b6bac0bd25f16b017883634d13591d00 in mainline.

    It's currently possible to send posix_locks_deadlock() into an infinite
    loop (under the BKL).

    For now, fix this just by bailing out after a few iterations. We may
    want to fix this in a way that better clarifies the semantics of
    deadlock detection. But that will take more time, and this minimal fix
    is probably adequate for any realistic scenario, and is simple enough to
    be appropriate for applying to stable kernels now.

    Thanks to George Davis for reporting the problem.

    Cc: "George G. Davis"
    Signed-off-by: J. Bruce Fields
    Acked-by: Alan Cox
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit e354b801daa5649ae32e04b6e83d7f35fbde3490
Author: Gregory Haskins
Date:   Thu Oct 11 22:11:11 2007 +0200

    lockdep: fix mismatched lockdep_depth/curr_chain_hash

    patch 3aa416b07f0adf01c090baab26fb70c35ec17623 in mainline.

    It is possible for the current->curr_chain_key to become inconsistent
    with the current index if the chain fails to validate. The end result is
    that future lock_acquire() operations may inadvertently fail to find a
    hit in the cache resulting in a new node being added to the graph for
    every acquire.

    Signed-off-by: Gregory Haskins
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar
    Cc: Chuck Ebbert
    Signed-off-by: Greg Kroah-Hartman
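
The "bail out after a few iterations" approach mentioned in the posix deadlock detection entry above can be illustrated with a small user-space sketch. This is not the kernel's posix_locks_deadlock(): the struct waiter type, the would_deadlock() helper and the MAX_WALK_ITERATIONS bound are all hypothetical, chosen only to show how an iteration cap keeps a walk over a cyclic wait chain from looping forever.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bound on how far the wait chain is followed. */
#define MAX_WALK_ITERATIONS 10

/* Hypothetical model of a lock owner that may be blocked on another owner. */
struct waiter {
    int owner;                  /* identity of this lock owner */
    struct waiter *blocked_on;  /* owner we are waiting for, or NULL */
};

/*
 * Follow the blocked_on chain from 'start' and report whether it reaches
 * 'me'. If the chain contains a cycle that does not include 'me', an
 * unbounded walk would never terminate, so give up after a fixed number
 * of steps and report "no deadlock found".
 */
static bool would_deadlock(const struct waiter *start, int me)
{
    int i = 0;

    for (const struct waiter *w = start; w != NULL; w = w->blocked_on) {
        if (i++ > MAX_WALK_ITERATIONS)
            return false;       /* bail out rather than spin */
        if (w->owner == me)
            return true;
    }
    return false;
}
```

The trade-off in this sketch is that a real deadlock lying beyond the iteration bound goes undetected, which matches the entry's description of the fix as minimal rather than a clarification of the detection semantics.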