commit 6b99a1744ab187073bca84a9fd3ccbf091865ca6 Author: Greg Kroah-Hartman Date: Tue May 1 17:34:12 2007 -0700 Linux 2.6.20.11 commit 481576e9872a6366227bce3838f566a3fa338d81 Author: Bartlomiej Zolnierkiewicz Date: Wed Apr 25 16:18:52 2007 -0400 Revert "adjust legacy IDE resource setting (v2)" Revert "adjust legacy IDE resource setting (v2)" This reverts commit ed8ccee0918ad063a4741c0656fda783e02df627. It causes hang on boot for some users and we don't yet know why: http://bugzilla.kernel.org/show_bug.cgi?id=7562 http://lkml.org/lkml/2007/4/20/404 http://lkml.org/lkml/2007/3/25/113 Just reverse it for 2.6.21-final, having broken X server is somehow better than unbootable system. Signed-off-by: Bartlomiej Zolnierkiewicz Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 3720eda544ece8fa5ade8c87e5f2cf2fb05b0ff5 Author: Jens Axboe Date: Wed Apr 25 13:42:27 2007 +0200 cfq-iosched: fix alias + front merge bug There's a really rare and obscure bug in CFQ, that causes a crash in cfq_dispatch_insert() due to rq == NULL. One example of that is seen here: http://lkml.org/lkml/2007/4/15/41 Neil correctly diagnosed the situation for how this can happen, read that analysis here: http://lkml.org/lkml/2007/4/25/57 This looks like it requires md to trigger, even though it should potentially be possible to due with O_DIRECT (at least if you edit the kernel and doctor some of the unplug calls). The fix is to move the ->next_rq update to when we add a request to the rbtree. Then we remove the possibility for a request to exist in the rbtree code, but not have ->next_rq correctly updated. Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit ab02a65798ea8b940ea5645655238c778d4e4765 Author: Wang Zhenyu Date: Wed Apr 25 15:07:38 2007 -0400 AGPGART: intel_agp: fix G965 GTT size detect [AGPGART] intel_agp: fix G965 GTT size detect On G965, I810_PGETBL_CTL is a mmio offset, but we wrongly take it as pci config space offset in detecting GTT size. This one line patch fixs this. Signed-off-by: Wang Zhenyu Signed-off-by: Dave Jones Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 27797cc764b52d417f5daf5703d4fa9f98867185 Author: Tommi Kyntola Date: Wed Apr 25 15:05:50 2007 -0400 ALSA: intel8x0 - Fix speaker output after S2RAM [ALSA] intel8x0 - Fix speaker output after S2RAM Fixed the mute speaker problem after S2RAM on some laptops: http://bugme.osdl.org/show_bug.cgi?id=6181 Signed-off-by: Tommi Kyntola Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 87b814c9a7e7316ae4ea64767dbd0ae2fd6c179d Author: Jean Delvare Date: Wed Apr 25 09:51:01 2007 +0200 hwmon/w83627ehf: Fix the fan5 clock divider write Users have been complaining about the w83627ehf driver flooding their logs with debug messages like: w83627ehf 9191-0a10: Increasing fan 4 clock divider from 64 to 128 or: w83627ehf 9191-0290: Increasing fan 4 clock divider from 4 to 8 The reason is that we failed to actually write the LSB of the encoded clock divider value for that fan, causing the next read to report the same old value again and again. Additionally, the fan number was improperly reported, making the bug harder to find. Signed-off-by: Jean Delvare Signed-off-by: Greg Kroah-Hartman commit 1b3bac165e24b4e5711b8bcf3ca7e7a990f5a22c Author: Jeff Mahoney Date: Mon Apr 23 14:41:17 2007 -0700 reiserfs: fix xattr root locking/refcount bug The listxattr() and getxattr() operations are only protected by a read lock. As a result, if either of these operations run in parallel, a race condition exists where the xattr_root will end up being cached twice, which results in the leaking of a reference and a BUG() on umount. This patch refactors get_xa_root(), __get_xa_root(), and create_xa_root(), into one get_xa_root() function that takes the appropriate locking around the entire critical section. Reported, diagnosed and tested by Andrea Righi Signed-off-by: Jeff Mahoney Cc: Andrea Righi Cc: "Vladimir V. Saveliev" Cc: Edward Shishkin Cc: Alex Zarochentsev Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 2b2c594d5faa4c85db286ed015df78ee02138eec Author: Balbir Singh Date: Mon Apr 23 14:41:05 2007 -0700 Taskstats fix the structure members alignment issue We broke the the alignment of members of taskstats to the 8 byte boundary with the CSA patches. In the current kernel, the taskstats structure is not suitable for use by 32 bit applications in a 64 bit kernel. On x86_64 Offsets of taskstats' members (64 bit kernel, 64 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 116, # ac_uid 120, # ac_gid 124, # ac_pid 128, # ac_ppid 132, # ac_btime 136, # ac_etime 144, # ac_utime 152, # ac_stime 160, # ac_minflt 168, # ac_majflt 176, # coremem 184, # virtmem 192, # hiwater_rss 200, # hiwater_vm 208, # read_char 216, # write_char 224, # read_syscalls 232, # write_syscalls 240, # read_bytes 248, # write_bytes 256, # cancelled_write_bytes ); Offsets of taskstats' members (64 bit kernel, 32 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 12, # cpu_count 20, # cpu_delay_total 28, # blkio_count 36, # blkio_delay_total 44, # swapin_count 52, # swapin_delay_total 60, # cpu_run_real_total 68, # cpu_run_virtual_total 76, # ac_comm 108, # ac_sched 109, # ac_pad 112, # ac_uid 116, # ac_gid 120, # ac_pid 124, # ac_ppid 128, # ac_btime 132, # ac_etime 140, # ac_utime 148, # ac_stime 156, # ac_minflt 164, # ac_majflt 172, # coremem 180, # virtmem 188, # hiwater_rss 196, # hiwater_vm 204, # read_char 212, # write_char 220, # read_syscalls 228, # write_syscalls 236, # read_bytes 244, # write_bytes 252, # cancelled_write_bytes ); This is one way to solve the problem without re-arranging structure members is to pack the structure. The patch adds an __attribute__((aligned(8))) to the taskstats structure members so that 32 bit applications using taskstats can work with a 64 bit kernel. Using __attribute__((packed)) would break the 64 bit alignment of members. The fix was tested on x86_64. After the fix, we got Offsets of taskstats' members (64 bit kernel, 64 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 120, # ac_uid 124, # ac_gid 128, # ac_pid 132, # ac_ppid 136, # ac_btime 144, # ac_etime 152, # ac_utime 160, # ac_stime 168, # ac_minflt 176, # ac_majflt 184, # coremem 192, # virtmem 200, # hiwater_rss 208, # hiwater_vm 216, # read_char 224, # write_char 232, # read_syscalls 240, # write_syscalls 248, # read_bytes 256, # write_bytes 264, # cancelled_write_bytes ); Offsets of taskstats' members (64 bit kernel, 32 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 120, # ac_uid 124, # ac_gid 128, # ac_pid 132, # ac_ppid 136, # ac_btime 144, # ac_etime 152, # ac_utime 160, # ac_stime 168, # ac_minflt 176, # ac_majflt 184, # coremem 192, # virtmem 200, # hiwater_rss 208, # hiwater_vm 216, # read_char 224, # write_char 232, # read_syscalls 240, # write_syscalls 248, # read_bytes 256, # write_bytes 264, # cancelled_write_bytes ); Signed-off-by: Balbir Singh Cc: Jay Lan Cc: Shailabh Nagar Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 637fc7e479a60ae55357c0e3ffeca5a475ef5733 Author: Christoph Lameter Date: Mon Apr 23 14:41:09 2007 -0700 page migration: fix NR_FILE_PAGES accounting NR_FILE_PAGES must be accounted for depending on the zone that the page belongs to. If we replace the page in the radix tree then we may have to shift the count to another zone. Suggested-by: Ethan Solomita Cc: Martin Bligh Signed-off-by: Christoph Lameter Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit aad6bcca0b1ca356ede42583c96dc7ada541cffe Author: Taku Izumi Date: Mon Apr 23 14:41:00 2007 -0700 Fix possible NULL pointer access in 8250 serial driver I encountered the following kernel panic. The cause of this problem was NULL pointer access in check_modem_status() in 8250.c. I confirmed this problem is fixed by the attached patch, but I don't know this is the correct fix. sadc[4378]: NaT consumption 2216203124768 [1] Modules linked in: binfmt_misc dm_mirror dm_mod thermal processor fan container button sg e100 eepro100 mii ehci_hcd ohci_hcd Pid: 4378, CPU 0, comm: sadc psr : 00001210085a2010 ifs : 8000000000000289 ip : [] Not tainted ip is at check_modem_status+0xf1/0x360 unat: 0000000000000000 pfs : 0000000000000289 rsc : 0000000000000003 rnat: 800000000000cc18 bsps: 0000000000000000 pr : 0000000000aa6a99 ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f csd : 0000000000000000 ssd : 0000000000000000 b0 : a000000100481fb0 b6 : a0000001004822e0 b7 : a000000100477f20 f6 : 1003e2222222222222222 f7 : 0ffdba200000000000000 f8 : 100018000000000000000 f9 : 10002a000000000000000 f10 : 0fffdccccccccc8c00000 f11 : 1003e0000000000000000 r1 : a000000100b9af40 r2 : 0000000000000008 r3 : a000000100ad4e21 r8 : 00000000000000bb r9 : 0000000000000001 r10 : 0000000000000000 r11 : a000000100ad4d58 r12 : e0000000037b7df0 r13 : e0000000037b0000 r14 : 0000000000000001 r15 : 0000000000000018 r16 : a000000100ad4d6c r17 : 0000000000000000 r18 : 0000000000000000 r19 : 0000000000000000 r20 : a00000010099bc88 r21 : 00000000000000bb r22 : 00000000000000bb r23 : c003fffffc0ff3fe r24 : c003fffffc000000 r25 : 00000000000ff3fe r26 : a0000001009b7ad0 r27 : 0000000000000001 r28 : a0000001009b7ad8 r29 : 0000000000000000 r30 : a0000001009b7ad0 r31 : a0000001009b7ad0 Call Trace: [] show_stack+0x40/0xa0 sp=e0000000037b7810 bsp=e0000000037b1118 [] show_regs+0x840/0x880 sp=e0000000037b79e0 bsp=e0000000037b10c0 [] die+0x1c0/0x2c0 sp=e0000000037b79e0 bsp=e0000000037b1078 [] die_if_kernel+0x50/0x80 sp=e0000000037b7a00 bsp=e0000000037b1048 [] ia64_fault+0x11e0/0x1300 sp=e0000000037b7a00 bsp=e0000000037b0fe8 [] ia64_leave_kernel+0x0/0x280 sp=e0000000037b7c20 bsp=e0000000037b0fe8 [] check_modem_status+0xf0/0x360 sp=e0000000037b7df0 bsp=e0000000037b0fa0 [] serial8250_get_mctrl+0x20/0xa0 sp=e0000000037b7df0 bsp=e0000000037b0f80 [] uart_read_proc+0x250/0x860 sp=e0000000037b7df0 bsp=e0000000037b0ee0 [] proc_file_read+0x1d0/0x4c0 sp=e0000000037b7e10 bsp=e0000000037b0e80 [] vfs_read+0x1b0/0x300 sp=e0000000037b7e20 bsp=e0000000037b0e30 [] sys_read+0x70/0xe0 sp=e0000000037b7e20 bsp=e0000000037b0db0 [] ia64_ret_from_syscall+0x0/0x20 sp=e0000000037b7e30 bsp=e0000000037b0db0 [] __kernel_syscall_via_break+0x0/0x20 sp=e0000000037b8000 bsp=e0000000037b0db0 Fix the possible NULL pointer access in check_modem_status() in 8250.c. The check_modem_status() would access 'info' member of uart_port structure, but it is not initialized before uart_open() is called. The check_modem_status() can be called through /proc/tty/driver/serial before uart_open() is called. Signed-off-by: Kenji Kaneshige Signed-off-by: Taku Izumi Cc: Russell King Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 8505260d098b10600044268568ef851419946c33 Author: Hugh Dickins Date: Mon Apr 23 14:41:02 2007 -0700 fix OOM killing processes wrongly thought MPOL_BIND I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog to see lots of other processes killed with "No available memory (MPOL_BIND)". memhog is killed correctly once we initialize nodemask in constrained_alloc(). Signed-off-by: Hugh Dickins Acked-by: Christoph Lameter Acked-by: William Irwin Acked-by: KAMEZAWA Hiroyuki Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit b438299c93753620877d3a848c0f82757d211b31 Author: Benjamin Herrenschmidt Date: Mon Apr 16 22:53:16 2007 -0700 fix bogon in /dev/mem mmap'ing on nommu While digging through my MAP_FIXED changes, I found that rather obvious bug in /dev/mem mmap implementation for nommu archs. get_unmapped_area() is expected to return an address, not a pfn. Signed-off-by: Benjamin Herrenschmidt Acked-By: David Howells Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 86f757f0125d688bdd3d211df3859e013d8544b2 Author: James Bottomley Date: Fri Apr 6 11:14:56 2007 -0500 3w-xxxx: fix oops caused by incorrect REQUEST_SENSE handling 3w-xxxx emulates a REQUEST_SENSE response by simply returning nothing. Unfortunately, it's assuming that the REQUEST_SENSE command is implemented with use_sg == 0, which is no longer the case. The oops occurs because it's clearing the scatterlist in request_buffer instead of the memory region. This is fixed by using tw_transfer_internal() to transfer correctly to the scatterlist. Acked-by: adam radford Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit 93c27c733bcab7cc19cc77dc5fb8b605921adf59 Author: Michal Januszewski Date: Thu Apr 19 16:34:50 2007 -0400 vt: fix potential race in VT_WAITACTIVE handler [PATCH] vt: fix potential race in VT_WAITACTIVE handler On a multiprocessor machine the VT_WAITACTIVE ioctl call may return 0 if fg_console has already been updated in redraw_screen() but the console switch itself hasn't been completed. Fix this by checking fg_console in vt_waitactive() with the console sem held. Signed-off-by: Michal Januszewski Acked-by: Antonino Daplas Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 21deacabe2e2b0b823dc9593f88af3fec103a974 Author: Zwane Mwaikambo Date: Thu Apr 19 16:33:13 2007 -0400 x86: Don't probe for DDC on VBE1.2 [PATCH] x86: Don't probe for DDC on VBE1.2 VBE1.2 doesn't support function 15h (DDC) resulting in a 'hang' whilst uncompressing kernel with some video cards. Make sure we check VBE version before fiddling around with DDC. http://bugzilla.kernel.org/show_bug.cgi?id=1458 Opened: 2003-10-30 09:12 Last update: 2007-02-13 22:03 Much thanks to Tobias Hain for help in testing and investigating the bug. Tested on; i386, Chips & Technologies 65548 VESA VBE 1.2 CONFIG_VIDEO_SELECT=Y CONFIG_FIRMWARE_EDID=Y Untested on x86_64. Signed-off-by: Zwane Mwaikambo Signed-off-by: Andi Kleen Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit d0769f552d8389cb6fe23b65b71bec7d9ba6b446 Author: Trond Myklebust Date: Thu Apr 19 16:31:23 2007 -0400 NFS: Fix an Oops in nfs_setattr() NFS: Fix an Oops in nfs_setattr() It looks like nfs_setattr() and nfs_rename() also need to test whether the target is a regular file before calling nfs_wb_all()... Signed-off-by: Trond Myklebust Signed-off-by: Linus Torvalds Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 12b1ca6601c0ff4bc4fe44f8d631cd3eeaf18c88 Author: Alan Cox Date: Tue Apr 17 23:59:01 2007 +0000 exec.c: fix coredump to pipe problem and obscure "security hole" exec.c: fix coredump to pipe problem and obscure "security hole" The patch checks for "|" in the pattern not the output and doesn't nail a pid on to a piped name (as it is a program name not a file) Also fixes a very very obscure security corner case. If you happen to have decided on a core pattern that starts with the program name then the user can run a program called "|myevilhack" as it stands. I doubt anyone does this. Signed-off-by: Alan Cox Confirmed-by: Christopher S. Aker Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit a9c01941701641d505c07e7364a03447c694f6e8 Author: Badari Pulavarty Date: Tue Apr 17 15:53:09 2007 -0400 cache_k8_northbridges() overflows beyond allocation cache_k8_northbridges() overflows beyond allocation cache_k8_northbridges() is storing config values to incorrect locations (in flush_words) and also its overflowing beyond the allocation, causing slab verification failures. Signed-off-by: Badari Pulavarty Signed-off-by: Linus Torvalds Cc: Andi Kleen Cc: Chuck Ebbert Signed-off-by: Greg Kroah-Hartman commit 1ee88fb696591eedd34ee9bbcc0c001609ec7901 Author: Olaf Kirch Date: Wed Apr 18 15:14:14 2007 -0700 Fix IRDA oops'er This fixes and OOPS due to incorrect socket orpahning in the IRDA stack. [IrDA]: Correctly handling socket error This patch fixes an oops first reported in mid 2006 - see http://lkml.org/lkml/2006/8/29/358 The cause of this bug report is that when an error is signalled on the socket, irda_recvmsg_stream returns without removing a local wait_queue variable from the socket's sk_sleep queue. This causes havoc further down the road. In response to this problem, a patch was made that invoked sock_orphan on the socket when receiving a disconnect indication. This is not a good fix, as this sets sk_sleep to NULL, causing applications sleeping in recvmsg (and other places) to oops. This is against the latest net-2.6 and should be considered for -stable inclusion. Signed-off-by: Olaf Kirch Signed-off-by: Samuel Ortiz Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 100f4756d42036cca5968ef82542ecf6bccbc8a1 Author: Aubrey.Li Date: Tue Apr 17 14:46:33 2007 -0700 Fix netpoll UDP input path Netpoll UDP input handler needs to pull up the UDP headers and handle receive checksum offloading properly just like the normal UDP input path does else we get corrupted checksums. [NET]: Fix UDP checksum issue in net poll mode. In net poll mode, the current checksum function doesn't consider the kind of packet which is padded to reach a specific minimum length. I believe that's the problem causing my test case failed. The following patch fixed this issue. Signed-off-by: Aubrey.Li Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3300bb14330a902331bcdfd0f69d9b0945585c51 Author: John Heffner Date: Tue Apr 17 14:44:06 2007 -0700 Fix errors in tcp_mem[] calculations. In 2.6.18 a change was made to the tcp_mem[] calculations, but this causes regressions for some folks up to 2.6.20 The following fix to smooth out the calculation from the pending 2.6.21 tree by John Heffner fixes the problem for these folks. [TCP]: Fix tcp_mem[] initialization. Change tcp_mem initialization function. The fraction of total memory is now a continuous function of memory size, and independent of page size. Signed-off-by: John Heffner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 979a764e6449e9f9b73d7a9fa84d8e2e5bc65729 Author: Tom "spot" Callaway Date: Tue Apr 17 14:42:12 2007 -0700 Fix bogus inline directive in sparc64 PCI code [SPARC64]: Fix inline directive in pci_iommu.c While building a test kernel for the new esp driver (against git-current), I hit this bug. Trivial fix, put the inline declaration in the right place. :) Signed-off-by: Tom "spot" Callaway Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a95330df8145de41acdf84a7aa267b4ba0d0d410 Author: David Miller Date: Tue Apr 17 14:40:46 2007 -0700 Fix compat sys_ipc() on sparc64 The 32-bit syscall trampoline for sys_ipc() on sparc64 was sign extending various arguments, which is bogus when using compat_sys_ipc() since that function expects zero extended copies of all the arguments. This bug breaks the sparc64 kernel when built with gcc-4.2.x among other things. [SPARC64]: Fix arg passing to compat_sys_ipc(). Do not sign extend args using the sys32_ipc stub, that is buggy and unnecessary. Based upon an excellent report by Mikael Pettersson. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d3bf6f8a58459b7aefc10591807846b0b5697244 Author: David Miller Date: Tue Apr 17 14:35:07 2007 -0700 Fix qlogicpti DMA unmapping [SCSI] QLOGICPTI: Do not unmap DMA unless we actually mapped something. We only map DMA when cmd->request_bufflen is non-zero for non-sg buffers, we thus should make the same check when unmapping. Based upon a report from Pasi Pirhonen. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 169ed0ec5ee1f3733d5b23cd0d2ddadc697472df Author: David Miller Date: Tue Apr 17 14:37:25 2007 -0700 Fix sparc64 SBUS IOMMU allocator [SPARC64]: Fix SBUS IOMMU allocation code. There are several IOMMU allocator bugs. Instead of trying to fix this overly complicated code, just mirror the PCI IOMMU arena allocator which is very stable and well stress tested. I tried to make the code as identical as possible so we can switch sun4u PCI and SBUS over to a common piece of IOMMU code. All that will be need are two callbacks, one to do a full IOMMU flush and one to do a streaming buffer flush. This patch gets rid of a lot of hangs and mysterious crashes on SBUS sparc64 systems, at least for me. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 64f586d81ecc913a557055d4804c987f8c766888 Author: Hugh Dickins Date: Fri Apr 13 18:27:55 2007 +0100 holepunch: fix mmap_sem i_mutex deadlock sys_madvise has down_write of mmap_sem, then madvise_remove calls vmtruncate_range which takes i_mutex and i_alloc_sem: no, we can easily devise deadlocks from that ordering. madvise_remove drop mmap_sem while calling vmtruncate_range: luckily, since madvise_remove doesn't split or merge vmas, it's easy to handle this case with a NULL prev, without restructuring sys_madvise. (Though sad to retake mmap_sem when it's unlikely to be needed, and certainly down_read is sufficient for MADV_REMOVE, unlike the other madvices.) Signed-off-by: Hugh Dickins Signed-off-by: Greg Kroah-Hartman commit 42988ea6a81e888a6aa28549071e00400b68f4f7 Author: Hugh Dickins Date: Fri Apr 13 18:27:10 2007 +0100 holepunch: fix disconnected pages after second truncate shmem_truncate_range has its own truncate_inode_pages_range, to free any pages racily instantiated while it was in progress: a SHMEM_PAGEIN flag is set when this might have happened. But holepunching gets no chance to clear that flag at the start of vmtruncate_range, so it's always set (unless a truncate came just before), so holepunch almost always does this second truncate_inode_pages_range. shmem holepunch has unlikely swap<->file races hereabouts whatever we do (without a fuller rework than is fit for this release): I was going to skip the second truncate in the punch_hole case, but Miklos points out that would make holepunch correctness more vulnerable to swapoff. So keep the second truncate, but follow it by an unmap_mapping_range to eliminate the disconnected pages (freed from pagecache while still mapped in userspace) that it might have left behind. Signed-off-by: Hugh Dickins Signed-off-by: Greg Kroah-Hartman commit 32576fd4ae53e17af4ca814f7876372a96266b37 Author: Hugh Dickins Date: Fri Apr 13 18:26:13 2007 +0100 holepunch: fix shmem_truncate_range punch locking Miklos Szeredi observes that during truncation of shmem page directories, info->lock is released to improve latency (after lowering i_size and next_index to exclude races); but this is quite wrong for holepunching, which receives no such protection from i_size or next_index, and is left vulnerable to races with shmem_unuse, shmem_getpage and shmem_writepage. Hold info->lock throughout when holepunching? No, any user could prevent rescheduling for far too long. Instead take info->lock just when needed: in shmem_free_swp when removing the swap entries, and whenever removing a directory page from the level above. But so long as we remove before scanning, we can safely skip taking the lock at the lower levels, except at misaligned start and end of the hole. Signed-off-by: Hugh Dickins Signed-off-by: Greg Kroah-Hartman commit ac66c863ed2a729015b32dc0b72dc07f4abdfd81 Author: Hugh Dickins Date: Fri Apr 13 18:25:00 2007 +0100 holepunch: fix shmem_truncate_range punching too far Miklos Szeredi observes BUG_ON(!entry) in shmem_writepage() triggered in rare circumstances, because shmem_truncate_range() erroneously removes partially truncated directory pages at the end of the range: later reclaim on pages pointing to these removed directories triggers the BUG. Indeed, and it can also cause data loss beyond the hole. Fix this as in the patch proposed by Miklos, but distinguish between "limit" (how far we need to search: ignore truncation's next_index optimization in the holepunch case - if there are races it's more consistent to act on the whole range specified) and "upper_limit" (how far we can free directory pages: generally we must be careful to keep partially punched pages, but can relax at end of file - i_size being held stable by i_mutex). Signed-off-by: Hugh Dickins Signed-off-by: Greg Kroah-Hartman commit 036bb8535c517d6ea5669337ed709cd975369a2b Author: Avi Kivity Date: Sun Apr 22 12:28:49 2007 +0300 KVM: MMU: Fix host memory corruption on i386 with >= 4GB ram PAGE_MASK is an unsigned long, so using it to mask physical addresses on i386 (which are 64-bit wide) leads to truncation. This can result in page->private of unrelated memory pages being modified, with disasterous results. Fix by not using PAGE_MASK for physical addresses; instead calculate the correct value directly from PAGE_SIZE. Also fix a similar BUG_ON(). Acked-by: Ingo Molnar Signed-off-by: Avi Kivity Signed-off-by: Greg Kroah-Hartman commit 1c4b6343a17186145fdf658939c5682e916905bc Author: Avi Kivity Date: Sun Apr 22 12:28:05 2007 +0300 KVM: MMU: Fix guest writes to nonpae pde KVM shadow page tables are always in pae mode, regardless of the guest setting. This means that a guest pde (mapping 4MB of memory) is mapped to two shadow pdes (mapping 2MB each). When the guest writes to a pte or pde, we intercept the write and emulate it. We also remove any shadowed mappings corresponding to the write. Since the mmu did not account for the doubling in the number of pdes, it removed the wrong entry, resulting in a mismatch between shadow page tables and guest page tables, followed shortly by guest memory corruption. This patch fixes the problem by detecting the special case of writing to a non-pae pde and adjusting the address and number of shadow pdes zapped accordingly. Acked-by: Ingo Molnar Signed-off-by: Avi Kivity Signed-off-by: Greg Kroah-Hartman commit 68e26a3dfdc0bd9d5f547761f174072ac6ffd418 Author: Jiri Kosina Date: Sun Apr 15 22:30:15 2007 +0200 HID: zeroing of bytes in output fields is bogus HID: zeroing of bytes in output fields is bogus This patch removes bogus zeroing of unused bits in output reports, introduced in Simon's patch in commit d4ae650a. According to the specification, any sane device should not care about values of unused bits. What is worse, the zeroing is done in a way which is broken and might clear certain bits in output reports which are actually _used_ - a device that has multiple fields with one value of the size 1 bit each might serve as an example of why this is bogus - the second call of hid_output_report() would clear the first bit of report, which has already been set up previously. This patch will break LEDs on SpaceNavigator, because this device is broken and takes into account the bits which it shouldn't touch. The quirk for this particular device will be provided in a separate patch. Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit d44da5e9d31b3f386db33b50bb16ca4d0734d184 Author: Michael S. Tsirkin Date: Mon Apr 16 14:17:42 2007 -0700 IB/mthca: Fix data corruption after FMR unmap on Sinai In mthca_arbel_fmr_unmap(), the high bits of the key are masked off. This gets rid of the effect of adjust_key(), which makes sure that bits 3 and 23 of the key are equal when the Sinai throughput optimization is enabled, and so it may happen that an FMR will end up with bits 3 and 23 in the key being different. This causes data corruption, because when enabling the throughput optimization, the driver promises the HCA firmware that bits 3 and 23 of all memory keys will always be equal. Fix by re-applying adjust_key() after masking the key. Thanks to Or Gerlitz for reproducing the problem, and Ariel Shahar for help in debug. Signed-off-by: Michael S. Tsirkin Signed-off-by: Roland Dreier Signed-off-by: Greg Kroah-Hartman commit bd8622520f56b6a0d453ab895ca3272328bf66d0 Author: NeilBrown Date: Tue Apr 17 12:01:41 2007 +1000 knfsd: Use a spinlock to protect sk_info_authunix sk_info_authunix is not being protected properly so the object that it points to can be cache_put twice, leading to corruption. We borrow svsk->sk_defer_lock to provide the protection. We should probably rename that lock to have a more generic name - later. Thanks to Gabriel for reporting this. Cc: Greg Banks Cc: Gabriel Barazer Signed-off-by: Neil Brown Signed-off-by: Greg Kroah-Hartman