commit ef7cede81db4ee2c99e25321c65fdc8e1fd8f67b
Author: Greg Kroah-Hartman
Date:   Fri Nov 16 08:24:58 2007 -0800

    Linux 2.6.23.3

commit edc0636c313992570c6b10020111a5e2f0ccb6f8
Author: Linus Torvalds
Date:   Thu Nov 1 19:07:35 2007 -0400

    revert "x86_64: allocate sparsemem memmap above 4G"

    Reverted upstream by commit 6a22c57b8d2a62dea7280a6b2ac807a539ef0716

    Revert this commit:

        commit 2e1c49db4c640b35df13889b86b9d62215ade4b6
        Author: Zou Nan hai
        Date:   Fri Jun 1 00:46:28 2007 -0700

            x86_64: allocate sparsemem memmap above 4G

    This reverts commit 2e1c49db4c640b35df13889b86b9d62215ade4b6.

    First off, testing in Fedora has shown it to cause boot failures,
    bisected down by Martin Ebourne, and reported by Dave Jones.  So the
    commit will likely be reverted in the 2.6.23 stable kernels.

    Secondly, in the 2.6.24 model, x86-64 has now grown support for
    SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while
    the bug is not visible any more, it's become invisible due to the
    code just being irrelevant and no longer enabled on the only
    architecture that this ever affected.

    Reported-by: Dave Jones
    Tested-by: Martin Ebourne
    Cc: Zou Nan hai
    Cc: Suresh Siddha
    Cc: Andrew Morton
    Acked-by: Andy Whitcroft
    Signed-off-by: Linus Torvalds
    Cc: Chuck Ebbert
    Signed-off-by: Greg Kroah-Hartman

commit 963bbb3b5f669e3dcf46ba423d65db5c35a007a0
Author: Dave Johnson
Date:   Tue Oct 23 22:37:22 2007 +0200

    x86: fix TSC clock source calibration error

    patch edaf420fdc122e7a42326fe39274c8b8c9b19d41 in mainline.

    I ran into this problem on a system that was unable to obtain NTP
    sync because the clock was running very slow (over 10000ppm slow).
    ntpd had declared all of its peers 'reject' with 'peer_dist' reason.

    On investigation, the tsc_khz variable was significantly incorrect
    causing xtime to run slow.  After a reboot tsc_khz was correct so I
    did a reboot test to see how often the problem occurred:

    Test was done on a 2000 MHz Xeon system.
    Of 689 reboots, 8 of them had unacceptable tsc_khz values (>500ppm):

      range of tsc_khz   # of boots  % of boots
      -----------------  ----------  ----------
              < 1999750           0      0.000%
      1999750 - 1999800          21      3.048%
      1999800 - 1999850         166     24.128%
      1999850 - 1999900         241     35.029%
      1999900 - 1999950         211     30.669%
      1999950 - 2000000          42      6.105%
      2000000 - 2000000           0      0.000%
      2000050 - 2000100           0      0.000%
      [...]
      2000100 - 2015000           1      0.145%  << BAD
      2015000 - 2030000           6      0.872%  << BAD
      2030000 - 2045000           1      0.145%  << BAD
      2045000 <                   0      0.000%

    The worst boot was 2032.577 MHz, over 1.5% off!

    It appears that on rare occasions, mach_countup() is taking longer
    to complete than necessary.

    I suspect that this is caused by the CPU taking a periodic SMI
    interrupt right at the end of the 30ms calibration loop.  This would
    cause the loop to delay while the SMI BIOS handler runs.  The
    resulting TSC value is beyond what it actually should be, resulting
    in a higher tsc_khz.

    The below patch makes native_calculate_cpu_khz() take the best
    (shortest duration, lowest khz) run of its 3 calibration loops.  If
    a SMI goes off causing a bad result (long duration, higher khz) it
    will be discarded.

    With the patch applied, 300 boots of the same system produce good
    results:

      range of tsc_khz   # of boots  % of boots
      -----------------  ----------  ----------
              < 1999750           0      0.000%
      1999750 - 1999800          30     10.000%
      1999800 - 1999850         166     55.333%
      1999850 - 1999900          89     29.667%
      1999900 - 1999950          15      5.000%
      1999950 <                   0      0.000%

    Problem was found and tested against 2.6.18.  Patch is against
    2.6.22.

    Signed-off-by: Dave Johnson
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

commit 2d49e8885782ea938a4d1a2cee4898753e505f70
Author: H. Peter Anvin
Date:   Thu Oct 25 16:09:38 2007 -0700

    x86 setup: sizeof() is unsigned, unbreak comparisons

    patch e6e1ace9904b72478f0c5a5aa7bd174cb6f62561 in mainline.

    We use signed values for limit checking since the values can go
    negative under certain circumstances.
    However, sizeof() is unsigned and forces the comparison to be
    unsigned, so move the comparison into the heap_free() macros so we
    can ensure it is a signed comparison.

    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

commit 430bb2ee8d136eae9d5641a23e44c791b6d747f0
Author: H. Peter Anvin
Date:   Thu Oct 25 16:11:33 2007 -0700

    x86 setup: handle boot loaders which set up the stack incorrectly

    patch 6b6815c6d5d1dc209701d1661a7a0e09a295db2f in mainline.

    Apparently some specific versions of LILO enter the kernel with a
    stack pointer that doesn't match the rest of the segments.  Make our
    best attempt at untangling the resulting mess.

    Signed-off-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

commit 4b69ffe37490d0507375fbe31be833310f883a24
Author: Ingo Molnar
Date:   Fri Oct 19 12:19:26 2007 +0200

    x86: fix global_flush_tlb() bug

    patch 9a24d04a3c26c223f22493492c5c9085b8773d4a upstream

    While we were reviewing pageattr_32/64.c for unification,
    Thomas Gleixner noticed the following serious SMP bug in
    global_flush_tlb():

        down_read(&init_mm.mmap_sem);
        list_replace_init(&deferred_pages, &l);
        up_read(&init_mm.mmap_sem);

    this is SMP-unsafe because list_replace_init() done on two CPUs in
    parallel can corrupt the list.

    This bug was introduced about a year ago in the 64-bit tree:

        commit ea7322decb974a4a3e804f96a0201e893ff88ce3
        Author: Andi Kleen
        Date:   Thu Dec 7 02:14:05 2006 +0100

            [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr

                    down_read(&init_mm.mmap_sem);
            -       dpage = xchg(&deferred_pages, NULL);
            +       list_replace_init(&deferred_pages, &l);
                    up_read(&init_mm.mmap_sem);

    the xchg() based version was SMP-safe, but list_replace_init() is
    not.  So this "cleanup" introduced a nasty bug.

    why this bug never became prominent is a mystery - it can probably
    be explained with the (still) relative obscurity of the x86_64
    architecture.

    the safe fix for now is to write-lock init_mm.mmap_sem.
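
To make the race above concrete, here is a minimal userspace sketch, not the kernel code. It assumes a hand-rolled copy of the list primitives, and it splits list_replace_init() into two hypothetical halves (replace_first_half() and replace_second_half()) purely so that an unlucky interleaving of two CPUs can be replayed deterministically on one thread. The helper names, the two-entry list and the bounded consistency check are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Hypothetical first half of list_replace_init(): hook up the 'next' side. */
static void replace_first_half(struct list_head *old, struct list_head *new)
{
    new->next = old->next;
    new->next->prev = new;
}

/* Hypothetical second half: hook up the 'prev' side, then reset the old head. */
static void replace_second_half(struct list_head *old, struct list_head *new)
{
    new->prev = old->prev;
    new->prev->next = new;
    init_list_head(old);
}

/* Bounded walk: every next/prev pair must agree and the walk must return to head. */
static bool list_is_consistent(const struct list_head *head, size_t max_steps)
{
    const struct list_head *p = head;

    for (size_t i = 0; i <= max_steps; i++) {
        if (p->next->prev != p)
            return false;
        p = p->next;
        if (p == head)
            return true;
    }
    return false;
}

/* Two "CPUs" both hold the lock for reading, so their stores may interleave. */
static bool interleaved_replace_is_consistent(void)
{
    struct list_head deferred, e1, e2, cpu_a, cpu_b;

    init_list_head(&deferred);
    list_add_tail(&e1, &deferred);
    list_add_tail(&e2, &deferred);

    replace_first_half(&deferred, &cpu_a);   /* CPU A starts               */
    replace_first_half(&deferred, &cpu_b);   /* CPU B runs to completion   */
    replace_second_half(&deferred, &cpu_b);
    replace_second_half(&deferred, &cpu_a);  /* CPU A finishes             */

    return list_is_consistent(&cpu_a, 8) && list_is_consistent(&cpu_b, 8);
}

/* With a write lock only one "CPU" can be inside the replace at a time. */
static bool serialized_replace_is_consistent(void)
{
    struct list_head deferred, e1, e2, cpu_a, cpu_b;

    init_list_head(&deferred);
    list_add_tail(&e1, &deferred);
    list_add_tail(&e2, &deferred);

    replace_first_half(&deferred, &cpu_a);
    replace_second_half(&deferred, &cpu_a);  /* A takes both entries       */
    replace_first_half(&deferred, &cpu_b);
    replace_second_half(&deferred, &cpu_b);  /* B sees an empty list       */

    return list_is_consistent(&cpu_a, 8) && list_is_consistent(&cpu_b, 8);
}
```

In the interleaved replay both local heads end up pointing at the same entries and CPU A's head is no longer reachable from them, which is the kind of corruption described above. The serialized replay models what a write lock guarantees: one caller gets the whole list and the other gets an empty one.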

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: Andi Kleen
    Cc: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

commit ba4312eb977a93e2200156da102b199e587151a9
Author: Jeremy Fitzhardinge
Date:   Fri Oct 12 14:11:42 2007 -0700

    xfs: eagerly remove vmap mappings to avoid upsetting Xen

    patch ace2e92e193126711cb3a83a3752b2c5b8396950 in mainline.

    XFS leaves stray mappings around when it vmaps memory to make it
    virtually contiguous.  This upsets Xen if one of those pages is
    being recycled into a pagetable, since it finds an extra writable
    mapping of the page.

    This patch solves the problem in a brute force way, by making XFS
    always eagerly unmap its mappings.

    [ Stable: This works around a bug in 2.6.23.  We may come up with a
      better solution for mainline, but this seems like a low-impact fix
      for the stable kernel. ]

    Signed-off-by: Jeremy Fitzhardinge
    Cc: XFS masters
    Cc: Morten Bøgeskov
    Cc: Mark Williamson
    Signed-off-by: Greg Kroah-Hartman

commit 418db1544b43f855ff2335158c700e2da08cbc3f
Author: Jeremy Fitzhardinge
Date:   Fri Oct 12 14:11:40 2007 -0700

    xen: fix incorrect vcpu_register_vcpu_info hypercall argument

    patch e3d2697669abbe26c08dc9b95e2a71c634d096ed in mainline.

    The kernel's copy of struct vcpu_register_vcpu_info was out of date,
    at best causing the hypercall to fail and the guest kernel to fall
    back to the old mechanism, or worse, causing random memory
    corruption.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: Stable Kernel
    Cc: Morten Bøgeskov
    Cc: Mark Williamson
    Signed-off-by: Greg Kroah-Hartman

commit edf06ad73aa3054aec42e1c27faf3ab4b497a1d3
Author: Jeremy Fitzhardinge
Date:   Fri Oct 12 14:11:37 2007 -0700

    xen: deal with stale cr3 values when unpinning pagetables

    patch 9f79991d4186089e228274196413572cc000143b in mainline.

    When a pagetable is no longer in use, it must be unpinned so that
    its pages can be freed.  However, this is only possible if there are
    no stray uses of the pagetable.
    The code currently deals with all the usual cases, but there's a
    rare case where a vcpu is changing cr3, but is doing so lazily, and
    the change hasn't actually happened by the time the pagetable is
    unpinned, even though it appears to have been completed.

    This change adds a second per-cpu cr3 variable - xen_current_cr3 -
    which tracks the actual state of the vcpu cr3.  It is only updated
    once the actual hypercall to set cr3 has been completed.  Other
    processors wishing to unpin a pagetable can check other vcpu's
    xen_current_cr3 values to see if any cross-cpu IPIs are needed to
    clean things up.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Greg Kroah-Hartman

commit 4fc04833aaa0d7604e7fb7bbaa6854a4ff254b50
Author: Jeremy Fitzhardinge
Date:   Fri Oct 12 14:11:36 2007 -0700

    xen: add batch completion callbacks

    patch 91e0c5f3dad47838cb2ecc1865ce789a0b7182b1 in mainline.

    This adds a mechanism to register a callback function to be called
    once a batch of hypercalls has been issued.  This is typically used
    to unlock things which must remain locked until the hypercall has
    taken place.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Greg Kroah-Hartman

commit 6f8a6ffc2bffa55668aaeb3fbe910bd78e901158
Author: Lepton Wu
Date:   Thu Nov 1 15:53:27 2007 -0400

    UML - kill subprocesses on exit

    commit a24864a1d52a97e345a6bd4862a057f98364d098

    uml: definitively kill subprocesses on panic

    In a stock 2.6.22.6 kernel, powering off a user mode linux guest
    (2.6.22.6 running in skas0 mode) will halt the host linux.  I think
    the reason is that the kernel thread aborts because of a bug.  Then
    the sys_reboot in a process of the user mode linux guest is not
    trapped by the user mode linux kernel and is executed by the host.
    I think it is better to make sure all of our child processes quit
    when the user mode linux kernel aborts.

    [ jdike - the kernel process needs to ignore SIGTERM, plus the
      waitpid/kill loop is needed to make sure that all of our children
      are dead before the kernel exits ]

    Signed-off-by: Lepton Wu
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 9e6707f395a49741f22f083c166453c2e202f0ec
Author: Jeff Dike
Date:   Thu Nov 1 15:53:26 2007 -0400

    UML - stop using libc asm/user.h

    commit 189872f968def833727b6bfef83ebd7440c538e6 in mainline.

    uml: don't use glibc asm/user.h

    Stop including asm/user.h from libc - it seems to be disappearing
    from distros.  It's replaced with sys/user.h which defines
    user_fpregs_struct and user_fpxregs_struct instead of
    user_i387_struct and struct user_fxsr_struct on i386.

    As a bonus, on x86_64, I get to dump some stupid typedefs which were
    needed in order to get asm/user.h to compile.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 63140953f5b642a12764a913a47a8e6bc97ec4ae
Author: Jeff Dike
Date:   Thu Nov 1 15:53:26 2007 -0400

    UML - Fix kernel vs libc symbols clash

    commit 818f6ef407b448cef63294b9d0f6f8a2af9cb817 in mainline.

    uml: fix an IPV6 libc vs kernel symbol clash

    On some systems, with IPV6 configured, there is a clash between the
    kernel's in6addr_any and the one in libc.

    This is handled in the usual (gross) way of defining the kernel
    symbol out of the way on the gcc command line.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 90118e754066f25ff6c168e257ecae536cbb02ac
Author: Jeff Dike
Date:   Thu Nov 1 15:53:25 2007 -0400

    UML - Stop using libc asm/page.h

    commit 71f926f2ea61994470a53c9e11d3ef993197cada in mainline.

    uml: stop using libc asm/page.h

    Remove includes of asm/page.h from libc code.  This header seems to
    be disappearing, and UML doesn't make much use of it anyway.
    The one use, PAGE_SHIFT in stub.h, is handled by copying the
    constant from the kernel side of the house in common_offsets.h.

    [ jdike - added arch/um/kernel/skas/clone.c for -stable ]

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 5361fb20ccec8c0d054fb4362735699fcb3f8aa9
Author: Michael Ellerman
Date:   Mon Sep 17 16:03:45 2007 +1000

    POWERPC: Make sure to of_node_get() the result of pci_device_to_OF_node()

    patch db220b234da9f183b127b9c3077c253b94756e35 in mainline.

    pci_device_to_OF_node() returns the device node attached to a PCI
    device, but doesn't actually grab a reference - we need to do it
    ourselves.

    Signed-off-by: Michael Ellerman
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Paul Mackerras
    Signed-off-by: Greg Kroah-Hartman

commit 1d841b4fa6fa8d846925939bde04a9faf85fb18e
Author: Kumar Gala
Date:   Thu Oct 11 17:07:34 2007 -0500

    POWERPC: Fix handling of stfiwx math emulation

    patch ba02946a903015840ef672ccc9dc8620a7e83de6 in mainline

    It's legal for the stfiwx instruction to have RA = 0 as part of its
    effective address calculation.  This is illegal for all other XE
    form instructions.

    Add code to compute the proper effective address for stfiwx if
    RA = 0 rather than treating it as illegal.

    Signed-off-by: Kumar Gala
    Signed-off-by: Greg Kroah-Hartman

commit 2f51141bab7ae812f99da94d260036dbec4f7b18
Author: Ralf Baechle
Date:   Wed Oct 10 12:14:36 2007 +0100

    MIPS: R1: Fix hazard barriers to make kernels work on R2 also.

    patch 572afc248c33c902760f6f24a72c180f0e4f1719 in mainline.

    Tested with Malta; inflates malta_defconfig by 3932 bytes.  Ideally
    there should be additional configuration to allow getting rid of
    this overhead but that would be too much complexity at this stage of
    the release cycle.

    Signed-off-by: Ralf Baechle
    Signed-off-by: Greg Kroah-Hartman

commit e989c61af4b369a42566047d27cd18242e3ec996
Author: Ralf Baechle
Date:   Wed Oct 10 12:12:36 2007 +0100

    MIPS: MT: Fix bug in multithreaded kernels.

    patch a76ab5c10d99bdf458067cb495e72c0ee5f09909 in mainline.

    When GDB writes a breakpoint into the address space of the inferior
    process, the kernel needs to invalidate the modified memory in the
    inferior.  This is done by calling flush_cache_page, which in turn
    calls r4k_flush_cache_page and, for VSMP or SMTC kernels,
    local_r4k_flush_cache_page via r4k_on_each_cpu().

    As the VSMP and SMTC SMP kernels for 34K are running on a single
    shared cache it is possible to get away without interprocessor
    function calls.  This optimization is implemented in
    r4k_on_each_cpu, so local_r4k_flush_cache_page is only ever called
    on the local CPU.

    This is where the following code in local_r4k_flush_cache_page()
    strikes:

        /*
         * If ownes no valid ASID yet, cannot possibly have gotten
         * this page into the cache.
         */
        if (cpu_context(smp_processor_id(), mm) == 0)
                return;

    On VSMP and SMTC, cpu_context() has a separate value for each CPU
    (TC).  So if the CPU executing local_r4k_flush_cache_page has not
    accessed the mm but one of the other CPUs has, there may be data to
    be flushed in the cache, yet local_r4k_flush_cache_page will falsely
    return, leaving the I-cache inconsistent for the breakpoint.

    While the issue was discovered with GDB it also exists in
    local_r4k_flush_cache_range() and local_r4k_flush_cache().

    Fixed by introducing a new function has_valid_asid which on MT
    kernels returns true if a mm is active on any processor in the
    system.  This is relatively expensive since cache misses have to be
    assumed for the memory accesses in that loop, but it seems the most
    viable solution for 2.6.23 and older -stable kernels.
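
The difference between the old per-CPU check and the has_valid_asid approach described above can be sketched in userspace as follows. This is an illustrative model only, not the patch itself: struct mm_model, the fixed NR_CPUS of 2, the function form of cpu_context() and the helper local_cpu_has_asid() are assumptions standing in for the kernel's real data structures.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2  /* assumption: two TCs sharing one cache */

/* Hypothetical stand-in for the per-CPU ASID state kept in an mm. */
struct mm_model {
    unsigned long context[NR_CPUS];  /* 0 means "no valid ASID on this CPU" */
};

static unsigned long cpu_context(int cpu, const struct mm_model *mm)
{
    return mm->context[cpu];
}

/* Old check: only looks at the CPU that happens to run the flush. */
static bool local_cpu_has_asid(int this_cpu, const struct mm_model *mm)
{
    return cpu_context(this_cpu, mm) != 0;
}

/* MT-style check: the mm counts as active if any CPU holds an ASID for it. */
static bool has_valid_asid(const struct mm_model *mm)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_context(cpu, mm) != 0)
            return true;
    return false;
}
```

With an mm that only CPU 1 has touched, local_cpu_has_asid(0, &mm) is false, so a flush issued from CPU 0 would be skipped under the old check, while has_valid_asid(&mm) is true and the flush goes ahead. The loop over all CPUs is where the extra cost mentioned above comes from.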

    Signed-off-by: Ralf Baechle
    Signed-off-by: Greg Kroah-Hartman

commit efd233a45eed8688d5d5b02f56610b0abf2474cc
Author: Chris Wright
Date:   Tue Oct 23 22:44:38 2007 -0700

    Fix sparc64 MAP_FIXED handling of framebuffer mmaps

    patch d58aa8c7b1cc0add7b03e26bdb8988d98d2f4cd1 in mainline.

    From: Chris Wright
    Date: Tue, 23 Oct 2007 20:36:14 -0700
    Subject: [PATCH] [SPARC64]: pass correct addr in get_fb_unmapped_area(MAP_FIXED)

    Looks like the MAP_FIXED case is using the wrong address hint.  I'd
    expect the comment "don't mess with it" means pass the request
    straight on through, not change the address requested to -ENOMEM.

    Signed-off-by: Chris Wright
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

commit 71ec6448bb2aa34353db19fec82d9969a3e4f672
Author: David Miller
Date:   Tue Oct 23 03:12:00 2007 -0700

    Fix sparc64 niagara optimized RAID xor asm

    patch d060db63fd38a8a75f666576ef9999c28cdc31cf in mainline.

    [SPARC64]: Fix register usage in xor_raid_4().

    Some typos led to using %i6/%i7 instead of %l6/%l7 in loads which is
    really really bad because those are the frame pointer and return PC.

    Based upon a raid5 crash report by Bertrand Joel.

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman