commit 257313b2a87795e07a0bdf58d0fffbdba8b31051 Author: Linus Torvalds Date: Thu May 19 21:22:53 2011 -0700 selinux: avoid unnecessary avc cache stat hit count There is no point in counting hits - we can calculate it from the number of lookups and misses. This makes the avc statistics a bit smaller, and makes the code generation better too. Signed-off-by: Linus Torvalds commit 044aea9b83614948c98564000db07d1d32b2d29b Author: Linus Torvalds Date: Thu May 19 18:59:47 2011 -0700 selinux: de-crapify avc cache stat code generation You can turn off the avc cache stats, but distributions seem to not do that (perhaps because several performance tuning how-to's talk about the avc cache statistics). Which is sad, because the code it generates is truly horrendous, with the statistics update being sandwitched between get_cpu/put_cpu which in turn causes preemption disables etc. We're talking ten+ instructions just to increment a per-cpu variable in some pretty hot code. Fix the craziness by just using 'this_cpu_inc()' instead. Suddenly we only need a single 'inc' instruction to increment the statistics. This is quite noticeable in the incredibly hot avc_has_perm_noaudit() function (which triggers all the statistics by virtue of doing an avc_lookup() call). Signed-off-by: Linus Torvalds commit 39ab05c8e0b519ff0a04a869f065746e6e8c3d95 Merge: 1477fcc c42d223 Author: Linus Torvalds Date: Thu May 19 18:24:11 2011 -0700 Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (44 commits) debugfs: Silence DEBUG_STRICT_USER_COPY_CHECKS=y warning sysfs: remove "last sysfs file:" line from the oops messages drivers/base/memory.c: fix warning due to "memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION" memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION SYSFS: Fix erroneous comments for sysfs_update_group(). driver core: remove the driver-model structures from the documentation driver core: Add the device driver-model structures to kerneldoc Translated Documentation/email-clients.txt RAW driver: Remove call to kobject_put(). reboot: disable usermodehelper to prevent fs access efivars: prevent oops on unload when efi is not enabled Allow setting of number of raw devices as a module parameter Introduce CONFIG_GOOGLE_FIRMWARE driver: Google Memory Console driver: Google EFI SMI x86: Better comments for get_bios_ebda() x86: get_bios_ebda_length() misc: fix ti-st build issues params.c: Use new strtobool function to process boolean inputs debugfs: move to new strtobool ... Fix up trivial conflicts in fs/debugfs/file.c due to the same patch being applied twice, and an unrelated cleanup nearby. commit 1477fcc290b3d5c2614bde98bf3b1154c538860d Author: Stephen Rothwell Date: Fri May 20 11:11:53 2011 +1000 signal.h need a definition of struct task_struct This fixes these build errors on powerpc: In file included from arch/powerpc/mm/fault.c:18: include/linux/signal.h:239: error: 'struct task_struct' declared inside parameter list include/linux/signal.h:239: error: its scope is only this definition or declaration, which is probably not what you want include/linux/signal.h:240: error: 'struct task_struct' declared inside parameter list .. Exposed by commit e66eed651fd1 ("list: remove prefetching from regular list iterators"), which removed the include of from . Without that, linux/signal.h no longer accidentally got the declaration of 'struct task_struct'. Fix by properly declaring the struct, rather than introducing any new header file dependency. Signed-off-by: Stephen Rothwell Signed-off-by: Linus Torvalds commit eb04f2f04ed1227c266b3219c0aaeda525639718 Merge: 5765040 80d0208 Author: Linus Torvalds Date: Thu May 19 18:14:34 2011 -0700 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits) Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree() batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu() net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu() net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu() perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu() perf,rcu: convert call_rcu(free_ctx) to kfree_rcu() net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(net_generic_release) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu() security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu() net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu() net,rcu: convert call_rcu(xps_map_release) to kfree_rcu() net,rcu: convert call_rcu(rps_map_release) to kfree_rcu() ... commit 5765040ebfc9a28d9dcfaaaaf3d25840d922de96 Merge: 08839ff de5397a Author: Linus Torvalds Date: Thu May 19 18:10:17 2011 -0700 Merge branch 'x86-smep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-smep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cpu: Enable/disable Supervisor Mode Execution Protection x86, cpu: Add SMEP CPU feature in CR4 x86, cpufeature: Add cpufeature flag for SMEP commit 08839ff8276bd1ba0ce8b2d595f9fe62a5b07210 Merge: 08b5d06 660e34c 0c61227 Author: Linus Torvalds Date: Thu May 19 18:09:45 2011 -0700 Merge branches 'x86-reboot-for-linus' and 'x86-setup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Reorder reboot method preferences * 'x86-setup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, setup: Fix EDD3.0 data verification. commit 08b5d06ec6cff1d952f13cfcffcbf41ff0ce2c86 Merge: 1358820 5d94e81 Author: Linus Torvalds Date: Thu May 19 18:08:06 2011 -0700 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Introduce pci_map_biosrom() x86, olpc: Use device tree for platform identification commit 13588209aa90d9c8e502750fc86160314555612f Merge: ac2941f dc382fd Author: Linus Torvalds Date: Thu May 19 18:07:31 2011 -0700 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits) x86, mm: Allow ZONE_DMA to be configurable x86, NUMA: Trim numa meminfo with max_pfn in a separate loop x86, NUMA: Rename setup_node_bootmem() to setup_node_data() x86, NUMA: Enable emulation on 32bit too x86, NUMA: Enable CONFIG_AMD_NUMA on 32bit too x86, NUMA: Rename amdtopology_64.c to amdtopology.c x86, NUMA: Make numa_init_array() static x86, NUMA: Make 32bit use common NUMA init path x86, NUMA: Initialize and use remap allocator from setup_node_bootmem() x86-32, NUMA: Add @start and @end to init_alloc_remap() x86, NUMA: Remove long 64bit assumption from numa.c x86, NUMA: Enable build of generic NUMA init code on 32bit x86, NUMA: Move NUMA init logic from numa_64.c to numa.c x86-32, NUMA: Update numaq to use new NUMA init protocol x86-32, NUMA: Replace srat_32.c with srat.c x86-32, NUMA: implement temporary NUMA init shims x86, NUMA: Move numa_nodes_parsed to numa.[hc] x86-32, NUMA: Move get_memcfg_numa() into numa_32.c x86, NUMA: make srat.c 32bit safe x86, NUMA: rename srat_64.c to srat.c ... commit ac2941f59a38eeb535e1f227a8f90d7fe6b7828b Merge: 0162818 935a638 c387aa3 983bbf1 dffa4b2 Author: Linus Torvalds Date: Thu May 19 18:03:56 2011 -0700 Merge branches 'x86-efi-for-linus', 'x86-gart-for-linus', 'x86-irq-for-linus' and 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, efi: Ensure that the entirity of a region is mapped x86, efi: Pass a minimal map to SetVirtualAddressMap() x86, efi: Merge contiguous memory regions of the same type and attribute x86, efi: Consolidate EFI nx control x86, efi: Remove virtual-mode SetVirtualAddressMap call * 'x86-gart-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, gart: Don't enforce GART aperture lower-bound by alignment * 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Don't unmask disabled irqs when migrating them x86: Skip migrating IRQF_PER_CPU irqs in fixup_irqs() * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mce: Drop the default decoding notifier x86, MCE: Do not taint when handling correctable errors commit 016281880439a8665ecf37514865742da58131d4 Merge: 17b1418 865be7a Author: Linus Torvalds Date: Thu May 19 17:55:12 2011 -0700 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cpu: Fix detection of Celeron Covington stepping A1 and B0 Documentation, ABI: Update L3 cache index disable text x86, AMD, cacheinfo: Fix L3 cache index disable checks x86, AMD, cacheinfo: Fix fallout caused by max3 conversion x86, cpu: Change NOP selection for certain Intel CPUs x86, cpu: Clean up and unify the NOP selection infrastructure x86, percpu: Use ASM_NOP4 instead of hardcoding P6_NOP4 x86, cpu: Move AMD Elan Kconfig under "Processor family" Fix up trivial conflicts in alternative handling (commit dc326fca2b64 "x86, cpu: Clean up and unify the NOP selection infrastructure" removed some hacky 5-byte instruction stuff, while commit d430d3d7e646 "jump label: Introduce static_branch() interface" renamed HAVE_JUMP_LABEL to CONFIG_JUMP_LABEL in the code that went away) commit 17b141803c6c6e27fbade3f97c1c9d8d66c72866 Merge: 78c4def 2b398bd 15d6aba 8fab6af Author: Linus Torvalds Date: Thu May 19 17:49:35 2011 -0700 Merge branches 'x86-apic-for-linus', 'x86-asm-for-linus' and 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, apic: Print verbose error interrupt reason on apic=debug * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Demacro CONFIG_PARAVIRT cpu accessors * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix mrst sparse complaints x86: Fix spelling error in the memcpy() source code comment x86, mpparse: Remove unnecessary variable commit 78c4def67e8eebe602655a3dec9aa08f0e2f7c4b Merge: 7e6628e 942c3c5 Author: Linus Torvalds Date: Thu May 19 17:45:08 2011 -0700 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimer: Make lookup table const RTC: Disable CONFIG_RTC_CLASS from being built as a module timers: Fix alarmtimer build issues when CONFIG_RTC_CLASS=n timers: Remove delayed irqwork from alarmtimers implementation timers: Improve alarmtimer comments and minor fixes timers: Posix interface for alarm-timers timers: Introduce in-kernel alarm-timer interface timers: Add rb_init_node() to allow for stack allocated rb nodes time: Add timekeeping_inject_sleeptime commit 7e6628e4bcb3b3546c625ec63ca724f28ab14f0c Merge: 0f1bdc1 ab0e08f Author: Linus Torvalds Date: Thu May 19 17:44:40 2011 -0700 Merge branch 'timers-clockevents-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-clockevents-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: hpet: Cleanup the clockevents init and register code x86: Convert PIT to clockevents_config_and_register() clockevents: Provide interface to reconfigure an active clock event device clockevents: Provide combined configure and register function clockevents: Restructure clock_event_device members clocksource: Get rid of the hardcoded 5 seconds sleep time limit clocksource: Restructure clocksource struct members commit 0f1bdc1815c4cb29b3cd71a7091b478e426faa0b Merge: 80fe02b a18f22a Author: Linus Torvalds Date: Thu May 19 17:44:13 2011 -0700 Merge branch 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: clocksource: convert mips to generic i8253 clocksource clocksource: convert x86 to generic i8253 clocksource clocksource: convert footbridge to generic i8253 clocksource clocksource: add common i8253 PIT clocksource blackfin: convert to clocksource_register_hz mips: convert to clocksource_register_hz/khz sparc: convert to clocksource_register_hz/khz alpha: convert to clocksource_register_hz microblaze: convert to clocksource_register_hz/khz ia64: convert to clocksource_register_hz/khz x86: Convert remaining x86 clocksources to clocksource_register_hz/khz Make clocksource name const commit 80fe02b5daf176f99d3afc8f6c9dc9dece019836 Merge: df48d87 db670da ec514c4 Author: Linus Torvalds Date: Thu May 19 17:41:22 2011 -0700 Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits) sched: Fix and optimise calculation of the weight-inverse sched: Avoid going ahead if ->cpus_allowed is not changed sched, rt: Update rq clock when unthrottling of an otherwise idle CPU sched: Remove unused parameters from sched_fork() and wake_up_new_task() sched: Shorten the construction of the span cpu mask of sched domain sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG sched: Remove unused 'this_best_prio arg' from balance_tasks() sched: Remove noop in alloc_rt_sched_group() sched: Get rid of lock_depth sched: Remove obsolete comment from scheduler_tick() sched: Fix sched_domain iterations vs. RCU sched: Next buddy hint on sleep and preempt path sched: Make set_*_buddy() work on non-task entities sched: Remove need_migrate_task() sched: Move the second half of ttwu() to the remote cpu sched: Restructure ttwu() some more sched: Rename ttwu_post_activation() to ttwu_do_wakeup() sched: Remove rq argument from ttwu_stat() sched: Remove rq->lock from the first half of ttwu() sched: Drop rq->lock from sched_exec() ... * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix rt_rq runtime leakage bug commit df48d8716eab9608fe93924e4ae06ff110e8674f Merge: acd3025 29510ec Author: Linus Torvalds Date: Thu May 19 17:36:08 2011 -0700 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits) perf stat: Add more cache-miss percentage printouts perf stat: Add -d -d and -d -d -d options to show more CPU events ftrace/kbuild: Add recordmcount files to force full build ftrace: Add self-tests for multiple function trace users ftrace: Modify ftrace_set_filter/notrace to take ops ftrace: Allow dynamically allocated function tracers ftrace: Implement separate user function filtering ftrace: Free hash with call_rcu_sched() ftrace: Have global_ops store the functions that are to be traced ftrace: Add ops parameter to ftrace_startup/shutdown functions ftrace: Add enabled_functions file ftrace: Use counters to enable functions to trace ftrace: Separate hash allocation and assignment ftrace: Create a global_ops to hold the filter and notrace hashes ftrace: Use hash instead for FTRACE_FL_FILTER ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions perf bench, x86: Add alternatives-asm.h wrapper x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB ... commit acd30250d7d0f495685d1c7c6184636a22fcdf7f Merge: 6595b4a edf76f8 Author: Linus Torvalds Date: Thu May 19 17:30:15 2011 -0700 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq: Export functions to allow modular irq drivers genirq: Uninline and sanity check generic_handle_irq() genirq: Remove pointless ifdefs genirq: Make generic irq chip depend on CONFIG_GENERIC_IRQ_CHIP genirq: Add chip suspend and resume callbacks genirq: Implement a generic interrupt chip genirq: Support per-IRQ thread disabling. genirq: irq_desc: Document preflow_handler and affinity_hint genirq: Update DocBook comments genirq: Forgotten updates/deletions after removal of compat code commit 6595b4a940c4c447b619ab5268378ed03e632694 Merge: cbdad8d 5db1256 Author: Linus Torvalds Date: Thu May 19 17:29:29 2011 -0700 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: seqlock: Don't smp_rmb in seqlock reader spin loop watchdog, hung_task_timeout: Add Kconfig configurable default lockdep: Remove cmpxchg to update nr_chain_hlocks lockdep: Print a nicer description for simple irq lock inversions lockdep: Replace "Bad BFS generated tree" message with something less cryptic lockdep: Print a nicer description for irq inversion bugs lockdep: Print a nicer description for simple deadlocks lockdep: Print a nicer description for normal deadlocks lockdep: Print a nicer description for irq lock inversions commit cbdad8dc18b8ddd6c8b48c4ef26d46f00b5af923 Merge: 51509a2 86b9523 Author: Linus Torvalds Date: Thu May 19 17:28:58 2011 -0700 Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, gart: Rename pci-gart_64.c to amd_gart_64.c x86/amd-iommu: Use threaded interupt handler arch/x86/kernel/pci-iommu_table.c: Convert sprintf_symbol to %pS x86/amd-iommu: Add support for invalidate_all command x86/amd-iommu: Add extended feature detection x86/amd-iommu: Add ATS enable/disable code x86/amd-iommu: Add flag to indicate IOTLB support x86/amd-iommu: Flush device IOTLB if ATS is enabled x86/amd-iommu: Select PCI_IOV with AMD IOMMU driver PCI: Move ATS declarations in seperate header file dma-debug: print information about leaked entry x86/amd-iommu: Flush all internal TLBs when IOMMUs are enabled x86/amd-iommu: Rename iommu_flush_device x86/amd-iommu: Improve handling of full command buffer x86/amd-iommu: Rename iommu_flush* to domain_flush* x86/amd-iommu: Remove command buffer resetting logic x86/amd-iommu: Cleanup completion-wait handling x86/amd-iommu: Cleanup inv_pages command handling x86/amd-iommu: Move inv-dte command building to own function x86/amd-iommu: Move compl-wait command building to own function commit 51509a283a908d73b20371addc67ee3ae7189934 Merge: 75f5076 6538df8 Author: Linus Torvalds Date: Thu May 19 16:46:07 2011 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (34 commits) PM: Introduce generic prepare and complete callbacks for subsystems PM: Allow drivers to allocate memory from .prepare() callbacks safely PM: Remove CONFIG_PM_VERBOSE Revert "PM / Hibernate: Reduce autotuned default image size" PM / Hibernate: Add sysfs knob to control size of memory for drivers PM / Wakeup: Remove useless synchronize_rcu() call kmod: always provide usermodehelper_disable() PM / ACPI: Remove acpi_sleep=s4_nonvs PM / Wakeup: Fix build warning related to the "wakeup" sysfs file PM: Print a warning if firmware is requested when tasks are frozen PM / Runtime: Rework runtime PM handling during driver removal Freezer: Use SMP barriers PM / Suspend: Do not ignore error codes returned by suspend_enter() PM: Fix build issue in clock_ops.c for CONFIG_PM_RUNTIME unset PM: Revert "driver core: platform_bus: allow runtime override of dev_pm_ops" OMAP1 / PM: Use generic clock manipulation routines for runtime PM PM: Remove sysdev suspend, resume and shutdown operations PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM PM / AVR32: Use struct syscore_ops instead of sysdevs for PM ... commit 75f5076b12924f53340209d2cde73b98ed3b3095 Merge: 83d7e94 1df9fad Author: Linus Torvalds Date: Thu May 19 16:44:34 2011 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/qib: Use pci_dev->revision RDMA/iwcm: Get rid of enum iw_cm_event_status IB/ipath: Use pci_dev->revision, again IB/qib: Prevent driver hang with unprogrammed boards RDMA/cxgb4: EEH errors can hang the driver RDMA/cxgb4: Reset wait condition atomically RDMA/cxgb4: Fix missing parentheses RDMA/cxgb4: Initialization errors can cause crash RDMA/cxgb4: Don't change QP state outside EP lock RDMA/cma: Add an ID_REUSEADDR option RDMA/cma: Fix handling of IPv6 addressing in cma_use_port commit 83d7e948754cf021ed7343b122940fcc27c1bd88 Merge: fce4a1d 9b090f2 Author: Linus Torvalds Date: Thu May 19 16:44:13 2011 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm: kmemleak: Initialise kmemleak after debug_objects_mem_init() kmemleak: Select DEBUG_FS unconditionally in DEBUG_KMEMLEAK kmemleak: Do not return a pointer to an object that kmemleak did not get commit fce4a1dda2f1a9a25b3e5b7cd951070e0b42a818 Merge: e1f2084 6f6c3c3 Author: Linus Torvalds Date: Thu May 19 16:40:47 2011 -0700 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus * 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (48 commits) MIPS: Move arch_get_unmapped_area and gang to new file. MIPS: Cleanup arch_get_unmapped_area MIPS: Octeon: Don't request interrupts for unused IPI mailbox bits. Octeon: Fix interrupt irq settings for performance counters. MIPS: Fix build warnings on defconfigs MIPS: Lemote 2F, Malta: Fix build warning MIPS: Set ELF AT_PLATFORM string for Loongson2 processors MIPS: Set ELF AT_PLATFORM string for BMIPS processors MIPS: Introduce set_elf_platform() helper function MIPS: JZ4740: setup: Autodetect physical memory. MIPS: BCM47xx: Fix MAC address parsing. MIPS: BCM47xx: Extend the filling of SPROM from NVRAM MIPS: BCM47xx: Register SSB fallback sprom callback MIPS: BCM47xx: Extend bcm47xx_fill_sprom with prefix. SSB: Change fallback sprom to callback mechanism. MIPS: Alchemy: Clean up GPIO registers and accessors MIPS: Alchemy: Cleanup DMA addresses MIPS: Alchemy: Rewrite ethernet platform setup MIPS: Alchemy: Rewrite UART setup and constants. MIPS: Alchemy: Convert dbdma.c to syscore_ops ... commit e1f2084ed200eb31f2c9d1efe70569c76889c980 Merge: e33ab8f 659e6ed Author: Linus Torvalds Date: Thu May 19 16:27:21 2011 -0700 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: input/atari: Fix mouse movement and button mapping input/atari: Fix atarimouse init input/atari: Use the correct mouse interrupt hook m68k/atari: Do not use "/" in interrupt names m68k: unistd - Comment out definitions for unimplemented syscalls m68k: Really wire up sys_pselect6 and sys_ppoll m68k: Merge mmu and non-mmu versions of sys_call_table MAINTAINERS: Roman Zippel has been MIA for several years. m68k: bitops - Never step beyond the end of the bitmap m68k: bitops - offset == ((long)p - (long)vaddr) * 8 commit e33ab8f275cf6e0e0bf6c9c44149de46222b36cc Merge: 3bfccb7 7e186bd 8c59508 0f16d0d 7899891 Author: Linus Torvalds Date: Thu May 19 16:14:58 2011 -0700 Merge branches 'stable/irq', 'stable/p2m.bugfixes', 'stable/e820.bugfixes' and 'stable/mmu.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/irq' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: do not clear and mask evtchns in __xen_evtchn_do_upcall * 'stable/p2m.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/p2m: Create entries in the P2M_MFN trees's to track 1-1 mappings * 'stable/e820.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/setup: Fix for incorrect xen_extra_mem_start initialization under 32-bit xen/setup: Ignore E820_UNUSABLE when setting 1-1 mappings. * 'stable/mmu.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen mmu: fix a race window causing leave_mm BUG() commit 3bfccb74973db10a03d3d4c1d10fc00e77145699 Merge: 5318991 09ca132 887cb45 Author: Linus Torvalds Date: Thu May 19 16:14:35 2011 -0700 Merge branches 'stable/balloon.cleanup' and 'stable/general.cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/balloon.cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/balloon: Move dec_totalhigh_pages() from __balloon_append() to balloon_append() xen/balloon: Clarify credit calculation xen/balloon: Simplify HVM integration xen/balloon: Use PageHighMem() for high memory page detection * 'stable/general.cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: drivers/xen/sys-hypervisor: Cleanup code/data sections definitions arch/x86/xen/smp: Cleanup code/data sections definitions arch/x86/xen/time: Cleanup code/data sections definitions arch/x86/xen/xen-ops: Cleanup code/data sections definitions arch/x86/xen/mmu: Cleanup code/data sections definitions arch/x86/xen/setup: Cleanup code/data sections definitions arch/x86/xen/enlighten: Cleanup code/data sections definitions arch/x86/xen/irq: Cleanup code/data sections definitions xen: tidy up whitespace in drivers/xen/Makefile commit 5318991645d78c83dde7a7bb1cba24695cc152c4 Merge: dc93275 7c1bfd6 d79647a Author: Linus Torvalds Date: Thu May 19 16:14:25 2011 -0700 Merge branches 'stable/backend.base.v3' and 'stable/gntalloc.v7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/backend.base.v3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pci: Fix compiler error when CONFIG_XEN_PRIVILEGED_GUEST is not set. xen/p2m: Add EXPORT_SYMBOL_GPL to the M2P override functions. xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override. xen/irq: The Xen hypervisor cleans up the PIRQs if the other domain forgot. xen/irq: Export 'xen_pirq_from_irq' function. xen/irq: Add support to check if IRQ line is shared with other domains. xen/irq: Check if the PCI device is owned by a domain different than DOMID_SELF. xen/pci: Add xen_[find|register|unregister]_device_domain_owner functions. * 'stable/gntalloc.v7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/gntdev,gntalloc: Remove unneeded VM flags commit dc93275150da9542f500fbd3d0515eecfefba7f6 Merge: 6b55b90 1403635 Author: Linus Torvalds Date: Thu May 19 16:14:02 2011 -0700 Merge branch 'stable/broadcom.ibft' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft-2.6 * 'stable/broadcom.ibft' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft-2.6: iscsi_ibft: search for broadcom specific ibft sign (v2) commit 6b55b908458120a5f469aa87d11821be67fbd8e2 Merge: f8223b1 bb0a56e Author: Linus Torvalds Date: Thu May 19 15:59:24 2011 -0700 Merge branch 'move-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'move-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Move x86 drivers to drivers/cpufreq/ commit f8223b17550c427ac249124ff7f25722221f4591 Merge: 98a38a5 1a8e146 Author: Linus Torvalds Date: Thu May 19 15:57:29 2011 -0700 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] remove redundant sprintf from request_module call. [CPUFREQ] cpufreq_stats.c: Fixed brace coding style issue [CPUFREQ] Fix memory leak in cpufreq_stat [CPUFREQ] cpufreq.h: Fix some checkpatch.pl coding style issues. [CPUFREQ] use dynamic debug instead of custom infrastructure [CPUFREQ] CPU hotplug, re-create sysfs directory and symlinks [CPUFREQ] Fix _OSC UUID in pcc-cpufreq commit bb0a56ecc4ba2a3db1b6ea6949c309886e3447d3 Author: Dave Jones Date: Thu May 19 18:51:07 2011 -0400 [CPUFREQ] Move x86 drivers to drivers/cpufreq/ Signed-off-by: Dave Jones commit 98a38a5d60a6e79eaad7f4a9b68cc1bd306ac5c0 Merge: 7663164 f721a46 Author: Linus Torvalds Date: Thu May 19 14:56:13 2011 -0700 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: params.c: Use new strtobool function to process boolean inputs debugfs: move to new strtobool Add a strtobool function matching semantics of existing in kernel equivalents modpost: Update 64k section support for binutils 2.18.50 module: Use binary search in lookup_symbol() module: Use the binary search for symbols resolution lib: Add generic binary search function to the kernel. module: Sort exported symbols module: each_symbol_section instead of each_symbol module: split unset_section_ro_nx function. module: undo module RONX protection correctly. module: zero mod->init_ro_size after init is freed. minor ANSI prototype sparse fix module: reorder kparam_array to remove alignment padding on 64 bit builds module: remove 64 bit alignment padding from struct module with CONFIG_TRACE* module: do not hide __modver_version_show declaration behind ifdef module: deal with alignment issues in built-in module versions commit 7663164f2619e37a1dcad59383e2fcf8c5194107 Merge: e66eed6 6151658 Author: Linus Torvalds Date: Thu May 19 14:55:34 2011 -0700 Merge branch 'docs-move' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs * 'docs-move' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs: Correct occurrences of - Documentation/kvm/ to Documentation/virtual/kvm - Documentation/uml/ to Documentation/virtual/uml - Documentation/lguest/ to Documentation/virtual/lguest throughout the kernel source tree. Add a 00-INDEX file to Documentation/virtual Remove uml from the top level 00-INDEX file. Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E: commit 80d02085d99039b3b7f3a73c8896226b0cb1ba07 Author: Paul E. McKenney Date: Thu May 12 01:08:07 2011 -0700 Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" This reverts commit e59fb3120becfb36b22ddb8bd27d065d3cdca499. This reversion was due to (extreme) boot-time slowdowns on SPARC seen by Yinghai Lu and on x86 by Ingo . This is a non-trivial reversion due to intervening commits. Conflicts: Documentation/RCU/trace.txt kernel/rcutree.c Signed-off-by: Ingo Molnar commit e66eed651fd18a961f11cda62f3b5286c8cc4f9f Author: Linus Torvalds Date: Thu May 19 14:15:29 2011 -0700 list: remove prefetching from regular list iterators This is removes the use of software prefetching from the regular list iterators. We don't want it. If you do want to prefetch in some iterator of yours, go right ahead. Just don't expect the iterator to do it, since normally the downsides are bigger than the upsides. It also replaces with , because the use of LIST_POISON ends up needing it. is sadly not self-contained, and including prefetch.h just happened to hide that. Suggested by David Miller (networking has a lot of regular lists that are often empty or a single entry, and prefetching is not going to do anything but add useless instructions). Acked-by: Ingo Molnar Acked-by: David S. Miller Cc: linux-arch@vger.kernel.org Signed-off-by: Linus Torvalds commit 75d65a425c0163d3ec476ddc12b51087217a070c Author: Linus Torvalds Date: Thu May 19 13:50:07 2011 -0700 hlist: remove software prefetching in hlist iterators They not only increase the code footprint, they actually make things slower rather than faster. On internationally acclaimed benchmarks ("make -j16" on an already fully built kernel source tree) the hlist prefetching slows down the build by up to 1%. (Almost all of it comes from hlist_for_each_entry_rcu() as used by avc_has_perm_noaudit(), which is very hot due to all the pathname lookups to see if there is anything to do). The cause seems to be two-fold: - on at least some Intel cores, prefetch(NULL) ends up with some microarchitectural stall due to the TLB miss that it incurs. The hlist case triggers this very commonly, since the NULL pointer is the last entry in the list. - the prefetch appears to cause more D$ activity, probably because it prefetches hash list entries that are never actually used (because we ended the search early due to a hit). Regardless, the numbers clearly say that the implicit prefetching is simply a bad idea. If some _particular_ user of the hlist iterators wants to prefetch the next list entry, they can do so themselves explicitly, rather than depend on all list iterators doing so implicitly. Acked-by: Ingo Molnar Acked-by: David S. Miller Cc: linux-arch@vger.kernel.org Signed-off-by: Linus Torvalds commit 29510ec3b626c86de9707bb8904ff940d430289b Merge: 398995c 95950c2 Author: Ingo Molnar Date: Thu May 19 19:48:03 2011 +0200 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core commit 398995ce7980b03b5803f8f31073b45d87746bc1 Merge: c330525 d697182 Author: Ingo Molnar Date: Thu May 19 19:25:55 2011 +0200 Merge branch 'tip/perf/core-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core commit 9b090f2da85bd0df5e1a1ecfe4120b7b50358f48 Author: Catalin Marinas Date: Thu May 19 16:25:30 2011 +0100 kmemleak: Initialise kmemleak after debug_objects_mem_init() Kmemleak frees objects via RCU and when CONFIG_DEBUG_OBJECTS_RCU_HEAD is enabled, the RCU callback triggers a call to free_object() in lib/debugobjects.c. Since kmemleak is initialised before debug objects initialisation, it may result in a kernel panic during booting. This patch moves the kmemleak_init() call after debug_objects_mem_init(). Reported-by: Marcin Slusarz Tested-by: Tejun Heo Signed-off-by: Catalin Marinas Cc: commit 79e0d9bd262bdd36009e8092e57e34dc5e22a1c7 Author: Catalin Marinas Date: Wed Apr 27 17:06:19 2011 +0100 kmemleak: Select DEBUG_FS unconditionally in DEBUG_KMEMLEAK In the past DEBUG_FS used to depend on SYSFS and DEBUG_KMEMLEAK selected it conditionally. This is no longer the case, so always select DEBUG_FS via DEBUG_KMEMLEAK. Signed-off-by: Catalin Marinas commit 52c3ce4ec5601ee383a14f1485f6bac7b278896e Author: Catalin Marinas Date: Wed Apr 27 16:44:26 2011 +0100 kmemleak: Do not return a pointer to an object that kmemleak did not get The kmemleak_seq_next() function tries to get an object (and increment its use count) before returning it. If it could not get the last object during list traversal (because it may have been freed), the function should return NULL rather than a pointer to such object that it did not get. Signed-off-by: Catalin Marinas Reported-by: Phil Carmody Acked-by: Phil Carmody Cc: commit 659e6ed55ff8d617c895c10288644e3e6107834e Author: Geert Uytterhoeven Date: Mon Jan 31 20:15:04 2011 +0100 input/atari: Fix mouse movement and button mapping Up and down movements were reversed, left and right buttons were swapped. Signed-off-by: Geert Uytterhoeven commit 186f200a95cbd13c291cdd3ddeb07aad0a3782cc Author: Michael Schmitz Date: Sun Dec 28 23:00:45 2008 +0100 input/atari: Fix atarimouse init Atarimouse fails to load as a module (with ENODEV), due to a brown paper bag bug, misinterpreting the semantics of atari_keyb_init(). [geert] Propagate the return value of atari_keyb_init() everywhere Signed-off-by: Michael Schmitz Signed-off-by: Geert Uytterhoeven commit 7786908c3c1bb38dcc5cd2c037251c05507eef16 Author: Michael Schmitz Date: Tue Dec 16 21:26:03 2008 +0100 input/atari: Use the correct mouse interrupt hook The Atari keyboard driver calls atari_mouse_interrupt_hook if it's set, not atari_input_mouse_interrupt_hook. Fix below. [geert] Killed off atari_mouse_interrupt_hook completely, after fixing another incorrect assignment in atarimouse.c. Signed-off-by: Michael Schmitz Signed-off-by: Geert Uytterhoeven commit 79abeed6ee93231d494c191a9251c0845bd71fdd Author: Geert Uytterhoeven Date: Wed May 4 14:55:41 2011 +0200 m68k/atari: Do not use "/" in interrupt names It may trigger a warning in fs/proc/generic.c:__xlate_proc_name() when trying to add an entry for the interrupt handler to sysfs. Signed-off-by: Geert Uytterhoeven commit 1fc74ac61229edfe053fb87e8939ae9ca3794389 Author: Geert Uytterhoeven Date: Thu May 5 20:33:02 2011 +0200 m68k: unistd - Comment out definitions for unimplemented syscalls Suggested-by: Arnd Bergmann Signed-off-by: Geert Uytterhoeven commit d6d42bb2f85d875dc0c421699de5a1401b2af6a6 Author: Geert Uytterhoeven Date: Fri May 6 20:57:11 2011 +0200 m68k: Really wire up sys_pselect6 and sys_ppoll We reserved the numbers a long time ago, but never wired them up in the syscall table as they need TIF_RESTORE_SIGMASK, which we only got last year in commit cb6831d5d3099e772a510eb3e1ed0760ccffb45e ("m68k: Switch to saner sigsuspend()") Signed-off-by: Geert Uytterhoeven Acked-by: Greg Ungerer Cc: stable@kernel.org commit c4245c9d6535f3d02fda7f6eb9adcec9f09e8fe3 Author: Geert Uytterhoeven Date: Wed Apr 6 22:12:53 2011 +0200 m68k: Merge mmu and non-mmu versions of sys_call_table Impact for nommu: - Store table in .rodata instead of .text, - Let kernel/sys_ni.c handle the stubbing of MMU-only syscalls, - Implement sys_mremap and sys_nfsservct, - Remove unused padding at the end of the table. Impact for mmu: - Store table in .rodata instead of .data. Signed-off-by: Geert Uytterhoeven Acked-by: Greg Ungerer commit 6cf515e113fc1938b3cc9812bd8519a4c6155ef9 Author: Geert Uytterhoeven Date: Sun Apr 24 10:32:49 2011 +0200 MAINTAINERS: Roman Zippel has been MIA for several years. Hence make AFFS and HFS orphans, and remove him as an m68k maintainer. Signed-off-by: Geert Uytterhoeven commit f82a519f1262963d6ab30fa238721463fad2e0c8 Author: Geert Uytterhoeven Date: Sun Apr 3 14:00:10 2011 +0200 m68k: bitops - Never step beyond the end of the bitmap find_next bitops on m68k (find_next_zero_bit, find_next_bit, and find_next_bit_le) may cause out of bounds memory access when the bitmap size in bits % 32 != 0 and offset (the bitnumber to start searching at) is very close to the bitmap size. For example, unsigned long bitmap[2] = { 0, 0 }; find_next_bit(bitmap, 63, 62); 1. find_next_bit() tries to find any set bits in bitmap[1], but no bits set. 2. Then find_first_bit(bimap + 2, -1) 3. Unfortunately find_first_bit() takes unsigned int as the size argument. 4. find_first_bit will access bitmap[2~] until it find any set bits. Add missing tests for stepping beyond the end of the bitmap to all find_{first,next}_*() functions, and make sure they never return a value larger than the bitmap size. Reported-by: Akinobu Mita Cc: Andreas Schwab Signed-off-by: Geert Uytterhoeven commit 359c47ea71be21c1105ae474e8c90ec3e988bbdf Author: Geert Uytterhoeven Date: Sun Apr 3 13:32:00 2011 +0200 m68k: bitops - offset == ((long)p - (long)vaddr) * 8 Hence use "offset" in find_next_{,zero_}bit(), like is already done for find_next_{,zero_}bit_le() Signed-off-by: Geert Uytterhoeven Cc: Akinobu Mita Cc: Andreas Schwab Signed-off-by: Geert Uytterhoeven commit 887cb45694f77d59de19674cb73146fec72fadbb Author: Daniel Kiper Date: Wed May 4 20:19:49 2011 +0200 drivers/xen/sys-hypervisor: Cleanup code/data sections definitions Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit b53cedebd74918237176520f9157deb7ae066b71 Author: Daniel Kiper Date: Wed May 4 20:18:05 2011 +0200 arch/x86/xen/smp: Cleanup code/data sections definitions Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit fb6ce5dea4bb704bfcd9fda7e6b1354da66f4d2f Author: Daniel Kiper Date: Wed May 4 20:18:45 2011 +0200 arch/x86/xen/time: Cleanup code/data sections definitions Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit ad7ba09e658f8e890797ed168dfb3f84be934b09 Author: Daniel Kiper Date: Wed May 4 20:19:15 2011 +0200 arch/x86/xen/xen-ops: Cleanup code/data sections definitions Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit 3f508953dd2ebcbcd32d28d0b6aad4d76980e722 Author: Daniel Kiper Date: Thu May 12 17:19:53 2011 -0400 arch/x86/xen/mmu: Cleanup code/data sections definitions Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper [v1: Rebased on top of latest linus's to include fixes in mmu.c] Signed-off-by: Konrad Rzeszutek Wilk commit 983bbf1af0664b78689612b247acb514300f62c7 Author: Tian, Kevin Date: Fri May 6 14:43:56 2011 +0800 x86: Don't unmask disabled irqs when migrating them It doesn't make sense to unconditionally unmask a disabled irq when migrating it from offlined cpu to another. If the irq triggers then it will be disabled in the interrupt handler anyway. So we can just avoid unmasking it. [ tglx: Made masking unconditional again and fixed the changelog ] Signed-off-by: Fengzhe Zhang Signed-off-by: Kevin Tian Cc: Ian Campbell Cc: Jan Beulich Cc: "xen-devel@lists.xensource.com" Link: http://lkml.kernel.org/r/%3C625BA99ED14B2D499DC4E29D8138F1505C8ED7F7E3%40shsmsx502.ccr.corp.intel.com%3E Signed-off-by: Thomas Gleixner commit b87ba87ca26e226b2277a2d5613ed596f408e96d Author: Tian, Kevin Date: Fri May 6 14:43:36 2011 +0800 x86: Skip migrating IRQF_PER_CPU irqs in fixup_irqs() IRQF_PER_CPU means that the irq cannot be moved away from a given cpu. So it must not be migrated when the cpu goes offline. [ tglx: massaged changelog ] Signed-off-by: Fengzhe Zhang Signed-off-by: Kevin Tian Cc: Ian Campbell Cc: Jan Beulich Cc: "xen-devel@lists.xensource.com" Link: http://lkml.kernel.org/r/%3C625BA99ED14B2D499DC4E29D8138F1505C8ED7F7E2%40shsmsx502.ccr.corp.intel.com%3E Signed-off-by: Thomas Gleixner commit c3305257cd4df63e03e21e331a0140ae9c0faccc Author: Ingo Molnar Date: Thu May 19 14:01:42 2011 +0200 perf stat: Add more cache-miss percentage printouts Print out the cache-miss percentage as well if the cache refs were collected, for all the generic cache event types. Before: 11,103,723,230 dTLB-loads # 622.471 M/sec ( +- 0.30% ) 87,065,337 dTLB-load-misses # 4.881 M/sec ( +- 0.90% ) After: 11,353,713,242 dTLB-loads # 626.020 M/sec ( +- 0.35% ) 113,393,472 dTLB-load-misses # 1.00% of all dTLB cache hits ( +- 0.49% ) Also ASCII color highlight too high percentages, them when it's executed on the console. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Steven Rostedt Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.org Signed-off-by: Ingo Molnar commit 2cba3ffb9a9db3874304a1739002d053d53c738b Author: Ingo Molnar Date: Thu May 19 13:30:56 2011 +0200 perf stat: Add -d -d and -d -d -d options to show more CPU events Print even more detailed statistics if requested via perf stat -d: -d: detailed events, L1 and LLC data cache -d -d: more detailed events, dTLB and iTLB events -d -d -d: very detailed events, adding prefetch events Full output looks like this now: Performance counter stats for '/home/mingo/hackbench 10' (5 runs): 1703.674707 task-clock # 8.709 CPUs utilized ( +- 4.19% ) 49,068 context-switches # 0.029 M/sec ( +- 16.66% ) 8,303 CPU-migrations # 0.005 M/sec ( +- 24.90% ) 17,397 page-faults # 0.010 M/sec ( +- 0.46% ) 2,345,389,239 cycles # 1.377 GHz ( +- 4.61% ) [55.90%] 1,884,503,527 stalled-cycles-frontend # 80.35% frontend cycles idle ( +- 5.67% ) [50.39%] 743,919,737 stalled-cycles-backend # 31.72% backend cycles idle ( +- 8.75% ) [49.91%] 1,314,416,379 instructions # 0.56 insns per cycle # 1.43 stalled cycles per insn ( +- 2.53% ) [60.87%] 272,592,567 branches # 160.003 M/sec ( +- 1.74% ) [56.56%] 3,794,846 branch-misses # 1.39% of all branches ( +- 6.59% ) [58.50%] 449,982,778 L1-dcache-loads # 264.125 M/sec ( +- 2.47% ) [49.88%] 22,404,961 L1-dcache-load-misses # 4.98% of all L1-dcache hits ( +- 6.08% ) [55.05%] 6,204,750 LLC-loads # 3.642 M/sec ( +- 8.91% ) [43.75%] 1,837,411 LLC-load-misses # 1.078 M/sec ( +- 7.27% ) [12.07%] 411,440,421 L1-icache-loads # 241.502 M/sec ( +- 5.60% ) [36.52%] 27,556,832 L1-icache-load-misses # 16.175 M/sec ( +- 7.46% ) [46.72%] 464,067,627 dTLB-loads # 272.392 M/sec ( +- 4.46% ) [54.17%] 10,765,648 dTLB-load-misses # 6.319 M/sec ( +- 3.18% ) [48.68%] 1,273,080,386 iTLB-loads # 747.256 M/sec ( +- 3.38% ) [47.53%] 117,481 iTLB-load-misses # 0.069 M/sec ( +- 14.99% ) [47.01%] 4,590,653 L1-dcache-prefetches # 2.695 M/sec ( +- 4.49% ) [46.19%] 1,712,660 L1-dcache-prefetch-misses # 1.005 M/sec ( +- 3.75% ) [44.82%] 0.195622057 seconds time elapsed ( +- 6.84% ) Also clean up the attribute construction code to be appending, and factor it out into add_default_attributes(). Tweak the coverage percentage printout a bit, so that it's easier to view it alongside the +- sttddev colum. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Steven Rostedt Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.org Signed-off-by: Ingo Molnar commit ab0e08f15d23628dd8d50bf6ce1a935a8840c7dc Author: Thomas Gleixner Date: Wed May 18 21:33:43 2011 +0000 x86: hpet: Cleanup the clockevents init and register code No need to recalculate the frequency and the conversion factors over and over. Calculate the frequency once and use the new config/register interface and let the core code do the math. Signed-off-by: Thomas Gleixner Cc: John Stultz Reviewed-by: Ingo Molnar Link: http://lkml.kernel.org/r/%3C20110518210136.646482357%40linutronix.de%3E commit 61ee9a4ba05f0a4163d43a33dee7a0651e080b98 Author: Thomas Gleixner Date: Wed May 18 21:33:42 2011 +0000 x86: Convert PIT to clockevents_config_and_register() Let the core do the work. Signed-off-by: Thomas Gleixner Cc: John Stultz Reviewed-by: Ingo Molnar Link: http://lkml.kernel.org/r/%3C20110518210136.545615675%40linutronix.de%3E commit 80b816b736cfa5b9582279127099b20a479ab7d9 Author: Thomas Gleixner Date: Wed May 18 21:33:42 2011 +0000 clockevents: Provide interface to reconfigure an active clock event device Some ARM SoCs have clock event devices which have their frequency modified due to frequency scaling. Provide an interface which allows to reconfigure an active device. After reconfiguration reprogram the current pending event. Signed-off-by: Thomas Gleixner Cc: LAK Cc: John Stultz Acked-by: Linus Walleij Reviewed-by: Ingo Molnar Link: http://lkml.kernel.org/r/%3C20110518210136.437459958%40linutronix.de%3E commit 57f0fcbe1dea8a36c9d1673086326059991c5f81 Author: Thomas Gleixner Date: Wed May 18 21:33:41 2011 +0000 clockevents: Provide combined configure and register function All clockevent devices have the same open coded initialization functions. Provide an interface which does all necessary initialization in the core code. Signed-off-by: Thomas Gleixner Cc: John Stultz Reviewed-by: Ingo Molnar Link: http://lkml.kernel.org/r/%3C20110518210136.331975870%40linutronix.de%3E commit 847b2f42be203f3cff7f243fdd3ee50c1e06c882 Author: Thomas Gleixner Date: Wed May 18 21:33:41 2011 +0000 clockevents: Restructure clock_event_device members Group the hot path members of struct clock_event_device together so we have a better cache line footprint. Make it cacheline aligned. Signed-off-by: Thomas Gleixner Cc: John Stultz Reviewed-by: Ingo Molnar Link: http://lkml.kernel.org/r/%3C20110518210136.223607682%40linutronix.de%3E commit 724ed53e8ac2c5278af8955673049714c1073464 Author: Thomas Gleixner Date: Wed May 18 21:33:40 2011 +0000 clocksource: Get rid of the hardcoded 5 seconds sleep time limit Slow clocksources can have a way longer sleep time than 5 seconds and even fast ones can easily cope with 600 seconds and still maintain proper accuracy. Signed-off-by: Thomas Gleixner Cc: John Stultz Reviewed-by: Ingo Molnar Link: http://lkml.kernel.org/r/%3C20110518210136.109811585%40linutronix.de%3E commit 369db4c9524b7487faf1ff89646eee396c1363e1 Author: Thomas Gleixner Date: Wed May 18 21:33:40 2011 +0000 clocksource: Restructure clocksource struct members Group the hot path members of struct clocksource together so we have a better cache line footprint. Make it cacheline aligned. Signed-off-by: Thomas Gleixner Cc: John Stultz Cc: Eric Dumazet Reviewed-by: Ingo Molnar Link: http://lkml.kernel.org/r/%3C20110518210136.003081882%40linutronix.de%3E commit d6971822c288ce5547190c6f812cc978d6520777 Author: Michal Marek Date: Tue May 17 15:36:46 2011 +0200 ftrace/kbuild: Add recordmcount files to force full build Modifications to recordmcount must be performed on all object files to stay consistent with what the kernel code may expect. Add the recordmcount files to the main dependencies to make sure any change to them causes a full recompile. Signed-off-by: Michal Marek Link: http://lkml.kernel.org/r/20110517133646.GP13293@sepie.suse.cz Signed-off-by: Steven Rostedt commit 6f6c3c33c027f2c83d53e8562cd9daa73fe8108b Author: Ralf Baechle Date: Thu May 19 09:21:33 2011 +0100 MIPS: Move arch_get_unmapped_area and gang to new file. It never really belonged into syscall.c and it's about to become well more complex. Signed-off-by: Ralf Baechle commit 9c1e8a9138ff92a4ff816ea8a1884ad2461a993a Author: Ralf Baechle Date: Tue May 17 16:18:09 2011 +0100 MIPS: Cleanup arch_get_unmapped_area As noticed by Kevin Cernekee in http://www.linux-mips.org/cgi-bin/extract-mesg.cgi?a=linux-mips&m=2011-05&i=BANLkTikq04wuK%3Dbz%2BLieavmm3oDtoYWKxg%40mail.gmail.com Patchwork: https://patchwork.linux-mips.org/patch/2387/ Signed-off-by: Ralf Baechle commit e650ce0f083ff9354a10ad66e6bf8c193e8a2755 Author: David Daney Date: Thu Feb 17 14:47:52 2011 -0800 MIPS: Octeon: Don't request interrupts for unused IPI mailbox bits. We only use the three low-order mailbox bits. Leave the upper bits alone for possible use by drivers and other software. Signed-off-by: David Daney To: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2090/ Signed-off-by: Ralf Baechle commit 7716e6548abed1582a7759666e79d5c612a906c7 Author: Chandrakala Chavva Date: Thu Feb 17 13:57:52 2011 -0800 Octeon: Fix interrupt irq settings for performance counters. Octeon uses different interrupt irq for timer and performance counters. Set CvmCtl[IPPCI] to correct irq value very early. Signed-off-by: Chandrakala Chavva Signed-off-by: David Daney To: linux-mips@linux-mips.org Cc: Chandrakala Chavva Patchwork: https://patchwork.linux-mips.org/patch/2085/ Signed-off-by: Ralf Baechle commit b32ee693eb106172f89639acff88dc8fee8ba3e2 Author: Wanlong Gao Date: Sun Apr 10 03:04:18 2011 +0800 MIPS: Fix build warnings on defconfigs Since d45dcef77019012fc6769e657fc2f1a5d681bbbb ["Bluetooth: Fix BT_L2CAP and BT_SCO in Kconfig"] BT_L2CAP=m and BT_SCO=m are no longer valid so change the settings from m to y. [ralf@linux-mips.org: Merging only the MIPS parts of this patch.] Signed-off-by: Wanlong Gao To: akpm@linux-foundation.org To: manuel.lauss@googlemail.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linuxppc-dev@lists.ozlabs.org Patchwork: https://patchwork.linux-mips.org/patch/2277/ Signed-off-by: Ralf Baechle commit 1c8da7a1107a46c94b21cc176aaf95c819aab3db Author: Wanlong Gao Date: Sun Apr 10 01:42:17 2011 +0800 MIPS: Lemote 2F, Malta: Fix build warning Since 5ada28bf76752e33dce3d807bf0dfbe6d1b943ad ["led-class: always implement blinking"] LEDS_CLASS=m is no longer valid so change the setting from m to y. Signed-off-by: Wanlong Gao To: david.woodhouse@intel.com To: akpm@linux-foundation.org To: mingo@elte.hu Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/2276/ Signed-off-by: Ralf Baechle commit 5aac1e8a381d52a977b5050369a82a547c446ee2 Author: Robert Millan Date: Sat Apr 16 11:29:29 2011 -0700 MIPS: Set ELF AT_PLATFORM string for Loongson2 processors Signed-off-by: Robert Millan Acked-by: David Daney Signed-off-by: Kevin Cernekee Cc: David Daney Cc: wu zhangjin Cc: Aurelien Jarno Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/2302/ Signed-off-by: Ralf Baechle commit 06785df09b18e9127d16893039b64ae118c53cb4 Author: Kevin Cernekee Date: Sat Apr 16 11:29:28 2011 -0700 MIPS: Set ELF AT_PLATFORM string for BMIPS processors Signed-off-by: Kevin Cernekee Cc: Robert Millan Cc: David Daney Cc: wu zhangjin Cc: Aurelien Jarno Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/2300/ Signed-off-by: Ralf Baechle commit c094c99e659efedcbb05a0f75b8f77145d8ec539 Author: Robert Millan Date: Mon Apr 18 11:37:55 2011 -0700 MIPS: Introduce set_elf_platform() helper function Replace these sequences: if (cpu == 0) __elf_platform = "foo"; with a trivial inline function. Signed-off-by: Robert Millan Signed-off-by: Kevin Cernekee Signed-off-by: David Daney Cc: wu zhangjin Cc: Aurelien Jarno Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/2304/ Patchwork: https://patchwork.linux-mips.org/patch/2374/ Signed-off-by: Ralf Baechle commit 6edde0247644db475f68f25dcb1bf72260600081 Author: Maarten ter Huurne Date: Mon May 2 11:47:00 2011 +0200 MIPS: JZ4740: setup: Autodetect physical memory. Assume that the boot loader knows the physical memory of the system and deduce that information from the contents of the SDRAM control register. It is still possible to override with with the "mem=" parameter, but we have a sensible default now. Signed-off-by: Maarten ter Huurne Acked-by: Lars-Peter Clausen Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2319/ Signed-off-by: Ralf Baechle commit 9cbda726bb283d60cd4f34a3a9da8b5b48a46b0f Author: Hauke Mehrtens Date: Tue May 10 23:31:34 2011 +0200 MIPS: BCM47xx: Fix MAC address parsing. Some devices like the Netgear WGT634u are using minuses between the blocks of the MAC address and other devices are using colons to separate them. Signed-off-by: Hauke Mehrtens Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2366/ Signed-off-by: Ralf Baechle commit 41790fd51f71f3744a5d142cc5369eebab8817a0 Author: Hauke Mehrtens Date: Tue May 10 23:31:33 2011 +0200 MIPS: BCM47xx: Extend the filling of SPROM from NVRAM Some members of the struct ssb_sprom where not filled with data available in the NVRAM. Some attribute names in the NVRAM changed from SPROM version 3 to version 4. This patch was done by analyzing the the pci sprom parser in the ssb code and some open source parts of the braodcom wireless driver used on embedded devices. Signed-off-by: Hauke Mehrtens Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2365/ Signed-off-by: Ralf Baechle commit fe6f3642ac70d21004ddbe7242bd4548c35f1c10 Author: Hauke Mehrtens Date: Tue May 10 23:31:32 2011 +0200 MIPS: BCM47xx: Register SSB fallback sprom callback We are generating the prefix based on the PCI bus address the device is on. This is done like Broadcom does it in their code expect that the the bus number is increased by one. In the SB bus implementation used by Broadcom the SB bus emulates a PCI bus so the kernel sees one PCI bus more then in our implementation. We do not handle prefixes like sb/1/ yet as they are only used on the new bus which is not implemented yet. Signed-off-by: Hauke Mehrtens Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2364/ Signed-off-by: Ralf Baechle commit a7c62f8564357532872e106f0fa383728cf886cc Author: Hauke Mehrtens Date: Tue May 10 23:31:31 2011 +0200 MIPS: BCM47xx: Extend bcm47xx_fill_sprom with prefix. When an other SSB based device without an own SPROM is attached, using the PCI bus to the main SSB based device, the data normally found in the SPROM will be stored in the NVRAM on modern devices. The keys, to load the data from the NVRAM, are all using some sort of prefix like pci/1/1/, pci/1/3/ or sb/1/ before the actual key. This patch extends bcm47xx_fill_sprom() to make it possible to read out these values when some prefix was used. The keys for the SPROM data used on the main chip does not have a prefix. Signed-off-by: Hauke Mehrtens Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2363/ Signed-off-by: Ralf Baechle commit b3ae52b6b0335eba547221aad2cb3c50902e3d2d Author: Hauke Mehrtens Date: Tue May 10 23:31:30 2011 +0200 SSB: Change fallback sprom to callback mechanism. Some embedded devices like the Netgear WNDR3300 have two SSB based cards without an own sprom on the pci bus. We have to provide two different fallback sproms for these and this was not possible with the old solution. In the bcm47xx architecture the sprom data is stored in the nvram in the main flash storage. The architecture code will be able to fill the sprom with the stored data based on the bus where the device was found. The bcm63xx code should do the same thing as before, just using the new API. Acked-by: Michael Buesch Cc: netdev@vger.kernel.org Cc: linux-wireless@vger.kernel.org Cc: Florian Fainelli Signed-off-by: Hauke Mehrtens Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2362/ Signed-off-by: Ralf Baechle commit b7f720d68c0042cc8ce496e31a61df79a77f1b48 Author: Manuel Lauss Date: Sun May 8 10:42:20 2011 +0200 MIPS: Alchemy: Clean up GPIO registers and accessors remove au_readl/au_writel, remove the predefined GPIO1/2 KSEG1 register addresses and fix the fallout in all boards and drivers. This also fixes a bug in the mtx-1_wdt driver which was introduced by commit 6ea8115bb6f359df4f45152f2b40e1d4d1891392 ("Convert mtx1 wdt to be a platform device and use generic GPIO API") before this patch mtx-1_wdt only modified GPIO215, the patch then used the gpio resource information as bit index into the GPIO2 register but the conversion to the GPIO API didn't realize that. With this patch the drivers original behaviour is restored and GPIO15 is left alone. Signed-off-by: Manuel Lauss Cc: Florian Fainelli To: Linux-MIPS Cc: linux-watchdog@vger.kernel.org Cc: Wim Van Sebroeck Patchwork: https://patchwork.linux-mips.org/patch/2381/ Signed-off-by: Ralf Baechle Date: Sun May 8 10:42:19 2011 +0200 MIPS: Alchemy: Cleanup DMA addresses According to the databooks, the Au1000 DMA engine must be programmed with the physical FIFO addresses. This patch does that; furthermore this opened the possibility to get rid of a lot of now unnecessary address defines. Signed-off-by: Manuel Lauss To: Linux-MIPS Cc: Florian Fainelli Cc: Wolfgang Grandegger Patchwork: https://patchwork.linux-mips.org/patch/2348/ Signed-off-by: Ralf Baechle Date: Sun May 8 10:42:18 2011 +0200 MIPS: Alchemy: Rewrite ethernet platform setup Rewrite ethernet setup to use runtime cpu detection, and also clean up the ethernet base address mess as far as possible. Signed-off-by: Manuel Lauss To: Linux-MIPS Cc: Florian Fainelli Cc: Wolfgang Grandegger Patchwork: https://patchwork.linux-mips.org/patch/2353/ Signed-off-by: Ralf Baechle Date: Sun May 8 10:42:17 2011 +0200 MIPS: Alchemy: Rewrite UART setup and constants. Detect CPU type at runtime and setup uarts accordingly; also clean up the uart base address mess in the process as far as possible. Signed-off-by: Manuel Lauss To: Linux-MIPS Cc: Florian Fainelli Cc: Wolfgang Grandegger Patchwork: https://patchwork.linux-mips.org/patch/2352/ Signed-off-by: Ralf Baechle Date: Sun May 8 10:42:16 2011 +0200 MIPS: Alchemy: Convert dbdma.c to syscore_ops Convert the PM sysdev to syscore_ops and clean up the ddma addresses a bit. Signed-off-by: Manuel Lauss To: Linux-MIPS Cc: Florian Fainelli Cc: Wolfgang Grandegger Patchwork: https://patchwork.linux-mips.org/patch/2351/ Signed-off-by: Ralf Baechle commit 4b5c82b5e57ac6cb919e7e74984e28b312bdf10c Author: Manuel Lauss Date: Sun May 8 10:42:15 2011 +0200 MIPS: Alchemy: Convert irq.c to syscore_ops. Convert the PM sysdev to use syscore_ops instead. Signed-off-by: Manuel Lauss To: Linux-MIPS Cc: Florian Fainelli Cc: Wolfgang Grandegger Patchwork: https://patchwork.linux-mips.org/patch/2350/ Signed-off-by: Ralf Baechle commit dca7587185b3a499a09a9e2755316eee31c49c7f Author: Manuel Lauss Date: Sun May 8 10:42:14 2011 +0200 MIPS: Alchemy: irq code and constant cleanup replace au_readl/au_writel with __raw_readl/__raw_writel, and clean up IC-related stuff from the headers. Signed-off-by: Manuel Lauss To: Linux-MIPS Cc: Florian Fainelli Cc: Wolfgang Grandegger Patchwork: https://patchwork.linux-mips.org/patch/2354/ Signed-off-by: Ralf Baechle commit c1e58a3129bc327f7e0eb06fd4fe5ebf2af5d8ef Author: Manuel Lauss Date: Sun May 8 10:42:13 2011 +0200 MIPS: Alchemy: update inlinable GPIO API This fixes a build failure with gpio_keys and CONFIG_GPIOLIB=n (mtx1): CC drivers/input/keyboard/gpio_keys.o gpio_keys.c: In function 'gpio_keys_report_event': gpio_keys.c:325:2: error: implicit declaration of function 'gpio_get_value_cansleep' gpio_keys.c: In function 'gpio_keys_setup_key': gpio_keys.c:390:3: error: implicit declaration of function 'gpio_set_debounce' Also add stubs for the other new functions. Signed-off-by: Manuel Lauss To: Linux-MIPS Cc: Florian Fainelli Cc: Wolfgang Grandegger Patchwork: https://patchwork.linux-mips.org/patch/2346/ Signed-off-by: Ralf Baechle commit 0591128066bdfe07e0ef0ab7f877f794d8ba071d Author: Manuel Lauss Date: Sun May 8 10:42:12 2011 +0200 MIPS: DB1200: Set Config[OD] for improved stability. Setting Config[OD] gets rid of a _LOT_ of spurious CPLD interrupts, but also decreases overall performance a bit. Signed-off-by: Manuel Lauss To: Linux-MIPS Cc: Florian Fainelli Cc: Wolfgang Grandegger Patchwork: https://patchwork.linux-mips.org/patch/2347/ Signed-off-by: Ralf Baechle commit 8b659a393171aed3dafa1d7455ac9eec1f3ed315 Author: Ralf Baechle Date: Thu May 19 09:21:29 2011 +0100 MIPS: Split do_syscall_trace into two functions. Signed-off-by: Ralf Baechle commit c19c20ac6338435469a2c222ef5dc55e0469a6dc Author: Ralf Baechle Date: Thu May 19 09:21:28 2011 +0100 MIPS: Use single define for pending work on syscall exit Signed-off-by: Ralf Baechle commit 4f0ad950880a33df792b1e63649e29f8784b0163 Author: Ralf Baechle Date: Thu May 19 09:21:28 2011 +0100 MIPS: IP27: Remove pointless switch statement. Signed-off-by: Ralf Baechle commit 2f58b8d04e680ec13157ba6eee44455438c56d5f Author: John Crispin Date: Thu May 5 23:00:23 2011 +0200 MIPS: Lantiq: Add watchdog support This patch adds the driver for the watchdog found inside the Lantiq SoC family. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: Wim Van Sebroeck Cc: linux-mips@linux-mips.org Cc: linux-watchdog@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/2327/ Signed-off-by: Ralf Baechle commit f1f0ceaada9d040a41023017c87abb1d651b44af Author: John Crispin Date: Fri May 6 00:10:02 2011 +0200 MIPS: Lantiq: Add etop board support Register the etop platform device inside the machtype specific init code. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Signed-off-by: David Daney Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2356/ Patchwork: https://patchwork.linux-mips.org/patch/2370/ Signed-off-by: Ralf Baechle commit 504d4721ee8e432af4b5f196a08af38bc4dac5fe Author: John Crispin Date: Fri May 6 00:10:01 2011 +0200 MIPS: Lantiq: Add ethernet driver This patch adds the driver for the ETOP Packet Processing Engine (PPE32) found inside the XWAY family of Lantiq MIPS SoCs. This driver makes 100MBit ethernet work. Support for all 8 dma channels, gbit and the embedded switch found on the ar9/vr9 still needs to be implemented. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: linux-mips@linux-mips.org Cc: netdev@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/2357/ Acked-by: David S. Miller Signed-off-by: Ralf Baechle commit dfec1a827d2bdc35d0990afd100f79a685ec0985 Author: John Crispin Date: Fri May 6 00:10:00 2011 +0200 MIPS: Lantiq: Add DMA support This patch adds support for the DMA engine found inside the XWAY family of SoCs. The engine has 5 ports and 20 channels. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2355/ Signed-off-by: Ralf Baechle commit 2f0fc4159a6abc20b13569522c545150b99485cf Author: John Crispin Date: Tue Apr 5 14:10:57 2011 +0200 SERIAL: Lantiq: Add driver for MIPS Lantiq SOCs. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Signed-off-by: Felix Fietkau Cc: alan@lxorguk.ukuu.org.uk Cc: linux-mips@linux-mips.org Cc: linux-serial@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/2269/ Acked-by: Alan Cox Signed-off-by: Ralf Baechle commit 935c500c377d8e414bbe08e0e169f6c85d2a4273 Author: John Crispin Date: Wed Mar 30 09:27:56 2011 +0200 MIPS: Lantiq: Add more gpio drivers The XWAY family allows to extend the number of gpios by using shift registers or latches. This patch adds the 2 drivers needed for this. The extended gpios are output only. [ralf@linux-mips.org: Fixed ltq_stp_probe section() attributes.] Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2258/ Signed-off-by: Ralf Baechle commit 973c32eb7f2d5c45d0e68b0083ead9ee763d9a6f Author: John Crispin Date: Wed Mar 30 09:27:55 2011 +0200 MIPS: Lantiq: Add machtypes for lantiq eval kits This patch adds mach specific code for the Lantiq EASY50712/50601 evaluation boards Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2255/ Patchwork: https://patchwork.linux-mips.org/patch/2361/ Signed-off-by: Ralf Baechle commit a053ac17024561f3a2fd02424b5f92823282b5ad Author: John Crispin Date: Wed Mar 30 09:27:54 2011 +0200 MIPS: Lantiq: Add mips_machine support This patch adds support for Gabor's mips_machine patch. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: Gabor Juhos Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2251/ Patchwork: https://patchwork.linux-mips.org/patch/2358/ Signed-off-by: Ralf Baechle commit 24aff71fa8df0d6a73dab17f3f2285a24b8f658f Author: John Crispin Date: Wed Mar 30 09:27:53 2011 +0200 MIPS: Lantiq: Add platform device support This patch adds the wrappers for registering our platform devices. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2254/ Patchwork: https://patchwork.linux-mips.org/patch/2360/ Patchwork: https://patchwork.linux-mips.org/patch/2359/ Signed-off-by: Ralf Baechle commit 3c5447390c3e1a462913e327a016a55cae501580 Author: John Crispin Date: Tue Apr 12 18:10:01 2011 +0200 MIPS: Lantiq: Add NOR flash support This patch adds the driver/map for NOR devices attached to the SoC via the External Bus Unit (EBU). Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: David Woodhouse Cc: Daniel Schwierzeck Cc: linux-mips@linux-mips.org Cc: linux-mtd@lists.infradead.org Acked-by: Artem Bityutskiy Patchwork: https://patchwork.linux-mips.org/patch/2285/ Signed-off-by: Ralf Baechle commit e47d488935ed0b2dd3d59d3ba4e13956ff6849c0 Author: John Crispin Date: Wed Mar 30 09:27:49 2011 +0200 MIPS: Lantiq: Add PCI controller support. The Lantiq family of SoCs have a EBU (External Bus Unit). This patch adds the driver that allows us to use the EBU as a PCI controller. In order for PCI to work the EBU is set to endianess swap all the data. In addition we need to make use of SWAP_IO_SPACE for device->host DMA to work. The clock of the PCI works in several modes (internal/external). If this is not configured correctly the SoC will hang. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2250/ Signed-off-by: Ralf Baechle commit 8ec6d93508f705dacafd5fcd058c69ef405002f9 Author: John Crispin Date: Wed Mar 30 09:27:48 2011 +0200 MIPS: Lantiq: add SoC specific code for XWAY family Add support for the Lantiq XWAY family of Mips24KEc SoCs. * Danube (PSB50702) * Twinpass (PSB4000) * AR9 (PSB50802) * Amazon SE (PSB5061) The Amazon SE is a lightweight SoC and has no PCI as well as a different clock. We split the code out into seperate files to handle this. The GPIO pins on the SoCs are multi function and there are several bits we can use to configure the pins. To be as compatible as possible to GPIOLIB we add a function int lq_gpio_request(unsigned int pin, unsigned int alt0, unsigned int alt1, unsigned int dir, const char *name); which lets you configure the 2 "alternate function" bits. This way drivers like PCI can make use of GPIOLIB without a cubersome wrapper. The PLL code inside arch/mips/lantiq/xway/clk-xway.c is voodoo to me. It was taken from a 2.4.20 source tree and was never really changed by me since then. Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2249/ Signed-off-by: Ralf Baechle commit 171bb2f19ed6f3627f4f783f658f2f475b2fbd50 Author: John Crispin Date: Wed Mar 30 09:27:47 2011 +0200 MIPS: Lantiq: Add initial support for Lantiq SoCs Add initial support for Mips based SoCs made by Lantiq. This series will add support for the XWAY family. The series allows booting a minimal system using a initramfs or NOR. Missing drivers and support for Amazon and GPON family will be provided in a later series. [Ralf: Remove some cargo cult programming and fixed formatting.] Signed-off-by: John Crispin Signed-off-by: Ralph Hempel Signed-off-by: David Daney Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2252/ Patchwork: https://patchwork.linux-mips.org/patch/2371/ Signed-off-by: Ralf Baechle commit c0a5afb9bcf6b5aa5685e4fcf1282cad5fab3d91 Author: Maxin John Date: Tue Mar 29 00:15:55 2011 +0300 MIPS: Enable kmemleak for MIPS Signed-off-by: Maxin B. John To: Catalin Marinas Cc: Daniel Baluta Cc: naveen yadav Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Patchwork: https://patchwork.linux-mips.org/patch/2244/ Signed-off-by: Ralf Baechle commit 9b130f8004e51c65b20b0f0e17cdee073a719047 Author: Jayachandran C Date: Sat May 7 01:37:31 2011 +0530 MIPS: XLR, XLS: Add PCI support. Adds pci/pci-xlr.c to support for XLR PCI/PCI-X interface and XLS PCIe interface. Update irq.c to ack PCI interrupts, use irq handler data to do the PCI/PCIe bus ack. Signed-off-by: Jayachandran C Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2337/ Signed-off-by: Ralf Baechle commit f9cab74fd9b0cf19f52a989694e7a1d8213af3a1 Author: Jayachandran C Date: Sat May 7 01:37:14 2011 +0530 MIPS: Add default configuration for XLR/XLS processors Enable XLR CPU support, SMP, initramfs based root filesystem etc. [ralf@linux-mips.org: shrink the defconfig file through make savedefconfig.] Signed-off-by: Jayachandran C To: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2338/ Signed-off-by: Ralf Baechle commit 7f058e852b229ec77b37676b2b78baf2e78ffee8 Author: Jayachandran C Date: Sat May 7 01:36:57 2011 +0530 MIPS: Kconfig and Makefile update for Netlogic XLR/XLS Add NLM_XLR_BOARD, CPU_XLR and other config options Makefile updates, mostly based on r4k Signed-off-by: Jayachandran C To: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2334/ Signed-off-by: Ralf Baechle commit 5c642506740ecbf20fb7a9e482287e4e5c639e5c Author: Jayachandran C Date: Sat May 7 01:36:40 2011 +0530 MIPS: Platform files for XLR/XLS processor support * include/asm/netlogic added with files common for all Netlogic processors (common with XLP which will be added later) * include/asm/netlogic/xlr for XLR/XLS chip specific files * netlogic/xlr for XLR/XLS platform files Signed-off-by: Jayachandran C To: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2334/ Signed-off-by: Ralf Baechle commit efa0f81c11021c95b1e72c65868115b6fb4ecc6a Author: Jayachandran C Date: Sat May 7 01:36:21 2011 +0530 MIPS: Netlogic: Cache, TLB support and feature overrides for XLR CPU_XLR case added to mm/tlbex.c CPU_XLR case added to mm/c-r4k.c for PINDEX attribute Feature overrides for XLR cpu. Signed-off-by: Jayachandran C To: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2333/ Signed-off-by: Ralf Baechle commit 3c595a515dbb61ae96e8f5607d895820aa06e870 Author: Jayachandran C Date: Sat May 7 01:36:05 2011 +0530 MIPS: Netlogic: mach-netlogic include files Add war.h and irq.h with XLR/XLS definitions. Signed-off-by: Jayachandran C To: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2331/ Signed-off-by: Ralf Baechle commit a7117c6bddcbfff2fa237a14a853b32cb94bf59a Author: Jayachandran C Date: Wed May 11 12:04:58 2011 +0530 MIPS: Netlogic XLR/XLS processor IDs. Add Netlogic Microsystems company ID and processor IDs for XLR and XLS processors for CPU probe. Add CPU_XLR to cpu_type_enum. Signed-off-by: Jayachandran C Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/2367/ Signed-off-by: Ralf Baechle commit f721a465cddbe7f03e6cd2272008da558cf93818 Author: Jonathan Cameron Date: Tue Apr 19 12:43:47 2011 +0100 params.c: Use new strtobool function to process boolean inputs Signed-off-by: Jonathan Cameron Signed-off-by: Rusty Russell commit a0374396375d06398c419ebb6857fb5809cff81f Author: Jonathan Cameron Date: Tue Apr 19 12:43:46 2011 +0100 debugfs: move to new strtobool No functional changes requires that we eat errors from strtobool. If people want to not do this, then it should be fixed at a later date. V2: Simplification suggested by Rusty Russell removes the need for additional variable ret. Signed-off-by: Jonathan Cameron Signed-off-by: Rusty Russell commit d0f1fed29e6e73d9d17f4c91a5896a4ce3938d45 Author: Jonathan Cameron Date: Tue Apr 19 12:43:45 2011 +0100 Add a strtobool function matching semantics of existing in kernel equivalents This is a rename of the usr_strtobool proposal, which was a renamed, relocated and fixed version of previous kstrtobool RFC Signed-off-by: Jonathan Cameron Signed-off-by: Rusty Russell commit 6845756b29e4c4e7db41e2d75cafa9d091bc1c07 Author: Anders Kaseorg Date: Thu May 19 16:55:27 2011 -0600 modpost: Update 64k section support for binutils 2.18.50 Binutils 2.18.50 made a backwards-incompatible change in the way it writes ELF objects with over 65280 sections, to improve conformance with the ELF specification and interoperability with other ELF tools. Specifically, it no longer adds 256 to section indices SHN_LORESERVE and higher to skip over the reserved range SHN_LORESERVE through SHN_HIRESERVE; those values are only considered special in the st_shndx field, and not in other places where section indices are stored. See: http://sourceware.org/bugzilla/show_bug.cgi?id=5900 http://groups.google.com/group/generic-abi/browse_thread/thread/e8bb63714b072e67/6c63738f12cc8a17 Signed-off-by: Anders Kaseorg Signed-off-by: Rusty Russell commit 9d63487f86115b1d3ef69670043bcf2b83c4d227 Author: Alessio Igor Bogani Date: Wed May 18 22:35:59 2011 +0200 module: Use binary search in lookup_symbol() The function is_exported() with its helper function lookup_symbol() are used to verify if a provided symbol is effectively exported by the kernel or by the modules. Now that both have their symbols sorted we can replace a linear search with a binary search which provide a considerably speed-up. This work was supported by a hardware donation from the CE Linux Forum. Signed-off-by: Alessio Igor Bogani Acked-by: Greg Kroah-Hartman Signed-off-by: Rusty Russell commit 403ed27846aa126ecf0b842b5b179c506b9d989c Author: Alessio Igor Bogani Date: Wed Apr 20 11:10:52 2011 +0200 module: Use the binary search for symbols resolution Takes advantage of the order and locates symbols using binary search. This work was supported by a hardware donation from the CE Linux Forum. Signed-off-by: Alessio Igor Bogani Signed-off-by: Rusty Russell Tested-by: Dirk Behme commit 1a94dc35bc5c166d89913dc01a49d27a3c21a455 Author: Tim Abbott Date: Thu Apr 14 20:00:19 2011 +0200 lib: Add generic binary search function to the kernel. There a large number hand-coded binary searches in the kernel (run "git grep search | grep binary" to find many of them). Since in my experience, hand-coding binary searches can be error-prone, it seems worth cleaning this up by providing a generic binary search function. This generic binary search implementation comes from Ksplice. It has the same basic API as the C library bsearch() function. Ksplice uses it in half a dozen places with 4 different comparison functions, and I think our code is substantially cleaner because of this. Signed-off-by: Tim Abbott Extra-bikeshedding-by: Alan Jenkins Extra-bikeshedding-by: André Goddard Rosa Extra-bikeshedding-by: Rusty Russell Signed-off-by: Rusty Russell Signed-off-by: Alessio Igor Bogani Signed-off-by: Rusty Russell commit f02e8a6596b7dc9b2171f7ff5654039ef0950cdc Author: Alessio Igor Bogani Date: Thu Apr 14 14:59:39 2011 +0200 module: Sort exported symbols This patch places every exported symbol in its own section (i.e. "___ksymtab+printk"). Thus the linker will use its SORT() directive to sort and finally merge all symbol in the right and final section (i.e. "__ksymtab"). The symbol prefixed archs use an underscore as prefix for symbols. To avoid collision we use a different character to create the temporary section names. This work was supported by a hardware donation from the CE Linux Forum. Signed-off-by: Alessio Igor Bogani Signed-off-by: Rusty Russell (folded in '+' fixup) Tested-by: Dirk Behme commit de4d8d53465483168d6a627d409ee2d09d8e3308 Author: Rusty Russell Date: Tue Apr 19 21:49:58 2011 +0200 module: each_symbol_section instead of each_symbol Instead of having a callback function for each symbol in the kernel, have a callback for each array of symbols. This eases the logic when we move to sorted symbols and binary search. Signed-off-by: Rusty Russell Signed-off-by: Alessio Igor Bogani commit 01526ed0830643bd53a8434c3068e4c077e1b09d Author: Jan Glauber Date: Thu May 19 16:55:26 2011 -0600 module: split unset_section_ro_nx function. Split the unprotect function into a function per section to make the code more readable and add the missing static declaration. Signed-off-by: Jan Glauber Signed-off-by: Rusty Russell commit 448694a1d50432be63aafccb42d6f54d8cf3d02c Author: Jan Glauber Date: Thu May 19 16:55:26 2011 -0600 module: undo module RONX protection correctly. While debugging I stumbled over two problems in the code that protects module pages. First issue is that disabling the protection before freeing init or unload of a module is not symmetric with the enablement. For instance, if pages are set to RO the page range from module_core to module_core + core_ro_size is protected. If a module is unloaded the page range from module_core to module_core + core_size is set back to RW. So pages that were not set to RO are also changed to RW. This is not critical but IMHO it should be symmetric. Second issue is that while set_memory_rw & set_memory_ro are used for RO/RW changes only set_memory_nx is involved for NX/X. One would await that the inverse function is called when the NX protection should be removed, which is not the case here, unless I'm missing something. Signed-off-by: Jan Glauber Signed-off-by: Rusty Russell commit 4d10380e720a3ce19dbe88d0133f66ded07b6a8f Author: Jan Glauber Date: Thu May 19 16:55:25 2011 -0600 module: zero mod->init_ro_size after init is freed. Reset mod->init_ro_size to zero after the init part of a module is unloaded. Otherwise we need to check if module->init is NULL in the unprotect functions in the next patch. Signed-off-by: Jan Glauber Signed-off-by: Rusty Russell commit 5d05c70849f760ac8f4ed3ebfeefb92689858834 Author: Daniel J Blueman Date: Tue Mar 8 22:01:47 2011 +0800 minor ANSI prototype sparse fix Fix function prototype to be ANSI-C compliant, consistent with other function prototypes, addressing a sparse warning. Signed-off-by: Daniel J Blueman Signed-off-by: Rusty Russell commit c5be0b2eb1ca05e0cd747f9c0ba552c6ee8827a0 Author: Richard Kennedy Date: Thu May 19 16:55:25 2011 -0600 module: reorder kparam_array to remove alignment padding on 64 bit builds Reorder structure kparam_array to remove 8 bytes of alignment padding on 64 bit builds, dropping its size from 40 to 32 bytes. Also update the macro module_param_array_named to initialise the structure using its member names to allow it to be changed without touching all its call sites. 'git grep' finds module_param_array in 1037 places so this patch will save a small amount of data space across many modules. Signed-off-by: Richard Kennedy Signed-off-by: Rusty Russell commit a288bd651f4180c224cfddf837a0416157a36661 Author: Richard Kennedy Date: Thu May 19 16:55:25 2011 -0600 module: remove 64 bit alignment padding from struct module with CONFIG_TRACE* Reorder struct module to remove 24 bytes of alignment padding on 64 bit builds when the CONFIG_TRACE options are selected. This allows the structure to fit into one fewer cache lines, and its size drops from 592 to 568 on x86_64. Signed-off-by: Richard Kennedy Signed-off-by: Rusty Russell commit 9b73a5840c7d5f77e5766626716df13787cb258c Author: Dmitry Torokhov Date: Mon Feb 7 16:02:27 2011 -0800 module: do not hide __modver_version_show declaration behind ifdef Doing so prevents the following warning from sparse: CHECK kernel/params.c kernel/params.c:817:9: warning: symbol '__modver_version_show' was not declared. Should it be static? since kernel/params.c is never compiled with MODULE being set. Signed-off-by: Dmitry Torokhov Signed-off-by: Rusty Russell commit b4bc842802db3314f9a657094da0450a903ea619 Author: Dmitry Torokhov Date: Mon Feb 7 16:02:25 2011 -0800 module: deal with alignment issues in built-in module versions On m68k natural alignment is 2-byte boundary but we are trying to align structures in __modver section on sizeof(void *) boundary. This causes trouble when we try to access elements in this section in array-like fashion when create "version" attributes for built-in modules. Moreover, as DaveM said, we can't reliably put structures into independent objects, put them into a special section, and then expect array access over them (via the section boundaries) after linking the objects together to just "work" due to variable alignment choices in different situations. The only solution that seems to work reliably is to make an array of plain pointers to the objects in question and put those pointers in the special section. Reported-by: Geert Uytterhoeven Signed-off-by: Dmitry Torokhov Signed-off-by: Rusty Russell commit 95950c2ecb31314ef827428e43ff771cf3b037e5 Author: Steven Rostedt Date: Fri May 6 00:08:51 2011 -0400 ftrace: Add self-tests for multiple function trace users Add some basic sanity tests for multiple users of the function tracer at startup. Signed-off-by: Steven Rostedt commit 936e074b286ae779f134312178dbab139ee7ea52 Author: Steven Rostedt Date: Thu May 5 22:54:01 2011 -0400 ftrace: Modify ftrace_set_filter/notrace to take ops Since users of the function tracer can now pick and choose which functions they want to trace agnostically from other users of the function tracer, we need to pass the ops struct to the ftrace_set_filter() functions. The functions ftrace_set_global_filter() and ftrace_set_global_notrace() is added to keep the old filter functions which are used to modify the generic function tracers. Signed-off-by: Steven Rostedt commit cdbe61bfe70440939e457fb4a8d0995eaaed17de Author: Steven Rostedt Date: Thu May 5 21:14:55 2011 -0400 ftrace: Allow dynamically allocated function tracers Now that functions may be selected individually, it only makes sense that we should allow dynamically allocated trace structures to be traced. This will allow perf to allocate a ftrace_ops structure at runtime and use it to pick and choose which functions that structure will trace. Note, a dynamically allocated ftrace_ops will always be called indirectly instead of being called directly from the mcount in entry.S. This is because there's no safe way to prevent mcount from being preempted before calling the function, unless we modify every entry.S to do so (not likely). Thus, dynamically allocated functions will now be called by the ftrace_ops_list_func() that loops through the ops that are allocated if there are more than one op allocated at a time. This loop is protected with a preempt_disable. To determine if an ftrace_ops structure is allocated or not, a new util function was added to the kernel/extable.c called core_kernel_data(), which returns 1 if the address is between _sdata and _edata. Cc: Paul E. McKenney Signed-off-by: Steven Rostedt commit b848914ce39589d89ee0078a6d1ef452b464729e Author: Steven Rostedt Date: Wed May 4 09:27:52 2011 -0400 ftrace: Implement separate user function filtering ftrace_ops that are registered to trace functions can now be agnostic to each other in respect to what functions they trace. Each ops has their own hash of the functions they want to trace and a hash to what they do not want to trace. A empty hash for the functions they want to trace denotes all functions should be traced that are not in the notrace hash. Cc: Paul E. McKenney Signed-off-by: Steven Rostedt commit 07fd5515f3b5c20704707f63e7f4485b534508a8 Author: Steven Rostedt Date: Thu May 5 18:03:47 2011 -0400 ftrace: Free hash with call_rcu_sched() When a hash is modified and might be in use, we need to perform a schedule RCU operation on it, as the hashes will soon be used directly in the function tracer callback. Cc: Paul E. McKenney Signed-off-by: Steven Rostedt commit 2b499381bc50ede01b3d8eab164ca2fad00655f0 Author: Steven Rostedt Date: Tue May 3 22:49:52 2011 -0400 ftrace: Have global_ops store the functions that are to be traced This is a step towards each ops structure defining its own set of functions to trace. As the current code with pid's and such are specific to the global_ops, it is restructured to be used with the global ops. Signed-off-by: Steven Rostedt commit bd69c30b1d08032d97ab0dabd7a1eb7fb73ca2b2 Author: Steven Rostedt Date: Tue May 3 21:55:54 2011 -0400 ftrace: Add ops parameter to ftrace_startup/shutdown functions In order to allow different ops to enable different functions, the ftrace_startup() and ftrace_shutdown() functions need the ops parameter passed to them. Signed-off-by: Steven Rostedt commit 647bcd03d5b2fb44fd9c9ef1a4f50c2eee8f779a Author: Steven Rostedt Date: Tue May 3 14:39:21 2011 -0400 ftrace: Add enabled_functions file Add the enabled_functions file that is used to show all the functions that have been enabled for tracing as well as their ref counts. This helps seeing if any function has been registered and what functions are being traced. Signed-off-by: Steven Rostedt commit ed926f9b35cda0988234c356e16a7cb30f4e5338 Author: Steven Rostedt Date: Tue May 3 13:25:24 2011 -0400 ftrace: Use counters to enable functions to trace Every function has its own record that stores the instruction pointer and flags for the function to be traced. There are only two flags: enabled and free. The enabled flag states that tracing for the function has been enabled (actively traced), and the free flag states that the record no longer points to a function and can be used by new functions (loaded modules). These flags are now moved to the MSB of the flags (actually just the top 32bits). The rest of the bits (30 bits) are now used as a ref counter. Everytime a tracer register functions to trace, those functions will have its counter incremented. When tracing is enabled, to determine if a function should be traced, the counter is examined, and if it is non-zero it is set to trace. When a ftrace_ops is registered to trace functions, its hashes are examined. If the ftrace_ops filter_hash count is zero, then all functions are set to be traced, otherwise only the functions in the hash are to be traced. The exception to this is if a function is also in the ftrace_ops notrace_hash. Then that function's counter is not incremented for this ftrace_ops. Signed-off-by: Steven Rostedt commit 33dc9b1267d59cef46ff0bd6bc043190845dc919 Author: Steven Rostedt Date: Mon May 2 17:34:47 2011 -0400 ftrace: Separate hash allocation and assignment When filtering, allocate a hash to insert the function records. After the filtering is complete, assign it to the ftrace_ops structure. This allows the ftrace_ops structure to have a much smaller array of hash buckets instead of wasting a lot of memory. A read only empty_hash is created to be the minimum size that any ftrace_ops can point to. When a new hash is created, it has the following steps: o Allocate a default hash. o Walk the function records assigning the filtered records to the hash o Allocate a new hash with the appropriate size buckets o Move the entries from the default hash to the new hash. Signed-off-by: Steven Rostedt commit f45948e898e7bc76a73a468796d2ce80dd040058 Author: Steven Rostedt Date: Mon May 2 12:29:25 2011 -0400 ftrace: Create a global_ops to hold the filter and notrace hashes Combine the filter and notrace hashes to be accessed by a single entity, the global_ops. The global_ops is a ftrace_ops structure that is passed to different functions that can read or modify the filtering of the function tracer. The ftrace_ops structure was modified to hold a filter and notrace hashes so that later patches may allow each ftrace_ops to have its own set of rules to what functions may be filtered. Signed-off-by: Steven Rostedt commit 1cf41dd79993389b012e4542ab502ce36ae7343f Author: Steven Rostedt Date: Fri Apr 29 20:59:51 2011 -0400 ftrace: Use hash instead for FTRACE_FL_FILTER When multiple users are allowed to have their own set of functions to trace, having the FTRACE_FL_FILTER flag will not be enough to handle the accounting of those users. Each user will need their own set of functions. Replace the FTRACE_FL_FILTER with a filter_hash instead. This is temporary until the rest of the function filtering accounting gets in. Signed-off-by: Steven Rostedt commit b448c4e3ae6d20108dba1d7833f2c0d3dbad87ce Author: Steven Rostedt Date: Fri Apr 29 15:12:32 2011 -0400 ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions To prepare for the accounting system that will allow multiple users of the function tracer, having the FTRACE_FL_NOTRACE as a flag in the dyn_trace record does not make sense. All ftrace_ops will soon have a hash of functions they should trace and not trace. By making a global hash of functions not to trace makes this easier for the transition. Signed-off-by: Steven Rostedt commit b313207286a78abac19f1dd2721292eae598b0f5 Author: Ingo Molnar Date: Wed May 18 21:00:44 2011 +0200 perf bench, x86: Add alternatives-asm.h wrapper perf bench needs this to build the kernel's memcpy routine: In file included from bench/mem-memcpy-x86-64-asm.S:2:0: bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Steven Rostedt Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.org Signed-off-by: Ingo Molnar commit 01ed58abec07633791f03684b937a7e22e00c9bb Merge: af2d03d 26afb7c Author: Ingo Molnar Date: Wed May 18 20:59:27 2011 +0200 Merge branch 'x86/mem' into perf/core Merge reason: memcpy_64.S changes an assumption perf bench has, so merge this here so we can fix it. Signed-off-by: Ingo Molnar commit af2d03d4aaa847ef41a229dfee098a47908437c6 Merge: 9469234 f296388 Author: Ingo Molnar Date: Wed May 18 19:46:10 2011 +0200 Merge branch 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core commit edf76f8307c350bcb81f0c760118a991b3e62956 Author: Jonathan Cameron Date: Wed May 18 10:39:04 2011 +0100 irq: Export functions to allow modular irq drivers Export handle_simple_irq, irq_modify_status, irq_alloc_descs, irq_free_descs and generic_handle_irq to allow their usage in modules. First user is IIO, which wants to be built modular, but needs to be able to create irq chips, allocate and configure interrupt descriptors and handle demultiplexing interrupts. [ tglx: Moved the uninlinig of generic_handle_irq to a separate patch ] Signed-off-by: Jonathan Cameron Link: http://lkml.kernel.org/r/%3C1305711544-505-1-git-send-email-jic23%40cam.ac.uk%3E Signed-off-by: Thomas Gleixner commit fe12bc2c996d3e492b2920e32ac79f7bbae3e15d Author: Thomas Gleixner Date: Wed May 18 12:48:00 2011 +0200 genirq: Uninline and sanity check generic_handle_irq() generic_handle_irq() is missing a NULL pointer check for the result of irq_to_desc. This was a not a big problem, but we want to expose it to drivers, so we better have sanity checks in place. Add a return value as well, which indicates that the irq number was valid and the handler was invoked. Based on the pure code move from Jonathan Cameron. Signed-off-by: Thomas Gleixner Cc: Jonathan Cameron commit fe0514348452f5b0ad7e842b0d71b8322b1297de Author: Thomas Gleixner Date: Wed May 18 12:53:03 2011 +0200 genirq: Remove pointless ifdefs kernel/irq/ is only built when CONFIG_GENERIC_HARDIRQS=y. So making code inside of kernel/irq/ conditional on CONFIG_GENERIC_HARDIRQS is pointless. Signed-off-by: Thomas Gleixner commit 26afb7c661080ae3f1f13ddf7f0c58c4f931c22b Author: Jiri Olsa Date: Thu May 12 16:30:30 2011 +0200 x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit As reported in BZ #30352: https://bugzilla.kernel.org/show_bug.cgi?id=30352 there's a kernel bug related to reading the last allowed page on x86_64. The _copy_to_user() and _copy_from_user() functions use the following check for address limit: if (buf + size >= limit) fail(); while it should be more permissive: if (buf + size > limit) fail(); That's because the size represents the number of bytes being read/write from/to buf address AND including the buf address. So the copy function will actually never touch the limit address even if "buf + size == limit". Following program fails to use the last page as buffer due to the wrong limit check: #include #include #include #define PAGE_SIZE (4096) #define LAST_PAGE ((void*)(0x7fffffffe000)) int main() { int fds[2], err; void * ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0); assert(ptr == LAST_PAGE); err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds); assert(err == 0); err = send(fds[0], ptr, PAGE_SIZE, 0); perror("send"); assert(err == PAGE_SIZE); err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL); perror("recv"); assert(err == PAGE_SIZE); return 0; } The other place checking the addr limit is the access_ok() function, which is working properly. There's just a misleading comment for the __range_not_ok() macro - which this patch fixes as well. The last page of the user-space address range is a guard page and Brian Gerst observed that the guard page itself due to an erratum on K8 cpus (#121 Sequential Execution Across Non-Canonical Boundary Causes Processor Hang). However, the test code is using the last valid page before the guard page. The bug is that the last byte before the guard page can't be read because of the off-by-one error. The guard page is left in place. This bug would normally not show up because the last page is part of the process stack and never accessed via syscalls. Signed-off-by: Jiri Olsa Acked-by: Brian Gerst Acked-by: Linus Torvalds Cc: Link: http://lkml.kernel.org/r/1305210630-7136-1-git-send-email-jolsa@redhat.com Signed-off-by: Ingo Molnar commit de5397ad5b9ad22e2401c4dacdf1bb3b19c05679 Author: Fenghua Yu Date: Wed May 11 16:51:05 2011 -0700 x86, cpu: Enable/disable Supervisor Mode Execution Protection Enable/disable newly documented SMEP (Supervisor Mode Execution Protection) CPU feature in kernel. CR4.SMEP (bit 20) is 0 at power-on. If the feature is supported by CPU (X86_FEATURE_SMEP), enable SMEP by setting CR4.SMEP. New kernel option nosmep disables the feature even if the feature is supported by CPU. [ hpa: moved the call to setup_smep() until after the vendor-specific initialization; that ensures that CPUID features are unmasked. We will still run it before we have userspace (never mind uncontrolled userspace). ] Signed-off-by: Fenghua Yu LKML-Reference: <1305157865-31727-1-git-send-email-fenghua.yu@intel.com> Signed-off-by: H. Peter Anvin commit dc23c0bccf5eea171c87b3db285d032b9a5f06c4 Author: Fenghua Yu Date: Tue May 17 18:44:27 2011 -0700 x86, cpu: Add SMEP CPU feature in CR4 Add support for newly documented SMEP (Supervisor Mode Execution Protection) CPU feature in CR4. Signed-off-by: Fenghua Yu LKML-Reference: <1305683069-25394-3-git-send-email-fenghua.yu@intel.com> Signed-off-by: H. Peter Anvin commit d0281a257f370b09c410e466571858b4e12869c9 Author: Fenghua Yu Date: Tue May 17 18:44:26 2011 -0700 x86, cpufeature: Add cpufeature flag for SMEP Add support for newly documented SMEP (Supervisor Mode Execution Protection) CPU feature flag. SMEP prevents the CPU in kernel-mode to jump to an executable page that has the user flag set in the PTE. This prevents the kernel from executing user-space code accidentally or maliciously, so it for example prevents kernel exploits from jumping to specially prepared user-mode shell code. [ hpa: added better description by Ingo Molnar ] Signed-off-by: Fenghua Yu LKML-Reference: <1305683069-25394-2-git-send-email-fenghua.yu@intel.com> Signed-off-by: H. Peter Anvin commit 2f19e06ac30771c7cb96fd61d8aeacfa74dac21c Author: Fenghua Yu Date: Tue May 17 15:29:18 2011 -0700 x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB Support memset() with enhanced rep stosb. On processors supporting enhanced REP MOVSB/STOSB, the alternative memset_c_e function using enhanced rep stosb overrides the fast string alternative memset_c and the original function. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-10-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit 057e05c1d6440117875f455e59da8691e08f65d5 Author: Fenghua Yu Date: Tue May 17 15:29:17 2011 -0700 x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB Support memmove() by enhanced rep movsb. On processors supporting enhanced REP MOVSB/STOSB, the alternative memmove() function using enhanced rep movsb overrides the original function. The patch doesn't change the backward memmove case to use enhanced rep movsb. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-9-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit 101068c1f4a947ffa08f2782c78e40097300754d Author: Fenghua Yu Date: Tue May 17 15:29:16 2011 -0700 x86, mem: memcpy_64.S: Optimize memcpy by enhanced REP MOVSB/STOSB Support memcpy() with enhanced rep movsb. On processors supporting enhanced rep movsb, the alternative memcpy() function using enhanced rep movsb overrides the original function and the fast string function. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-8-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit 4307bec9344aed83f8107c3eb4285bd9d218fc10 Author: Fenghua Yu Date: Tue May 17 15:29:15 2011 -0700 x86, mem: copy_user_64.S: Support copy_to/from_user by enhanced REP MOVSB/STOSB Support copy_to_user/copy_from_user() by enhanced REP MOVSB/STOSB. On processors supporting enhanced REP MOVSB/STOSB, the alternative copy_user_enhanced_fast_string function using enhanced rep movsb overrides the original function and the fast string function. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-7-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit e365c9df2f2f001450decf9512412d2d5bd1cdef Author: Fenghua Yu Date: Tue May 17 15:29:14 2011 -0700 x86, mem: clear_page_64.S: Support clear_page() with enhanced REP MOVSB/STOSB Intel processors are adding enhancements to REP MOVSB/STOSB and the use of REP MOVSB/STOSB for optimal memcpy/memset or similar functions is recommended. Enhancement availability is indicated by CPUID.7.0.EBX[9] (Enhanced REP MOVSB/ STOSB). Support clear_page() with rep stosb for processor supporting enhanced REP MOVSB /STOSB. On processors supporting enhanced REP MOVSB/STOSB, the alternative clear_page_c_e function using enhanced REP STOSB overrides the original function and the fast string function. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-6-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit 9072d11da15a71e086eab3b5085184f2c1d06913 Author: Fenghua Yu Date: Tue May 17 15:29:13 2011 -0700 x86, alternative: Add altinstruction_entry macro Add altinstruction_entry macro to generate .altinstructions section entries from assembly code. This should be less failure-prone than open-coding. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-5-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit 509731336313b3799cf03071d72c64fa6383895e Author: Fenghua Yu Date: Tue May 17 15:29:12 2011 -0700 x86, alternative, doc: Add comment for applying alternatives order Some string operation functions may be patched twice, e.g. on enhanced REP MOVSB /STOSB processors, memcpy is patched first by fast string alternative function, then it is patched by enhanced REP MOVSB/STOSB alternative function. Add comment for applying alternatives order to warn people who may change the applying alternatives order for any reason. [ Documentation-only patch ] Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-4-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit 161ec53c702ce9df2f439804dfb9331807066daa Author: Fenghua Yu Date: Tue May 17 15:29:11 2011 -0700 x86, mem, intel: Initialize Enhanced REP MOVSB/STOSB If kernel intends to use enhanced REP MOVSB/STOSB, it must ensure IA32_MISC_ENABLE.Fast_String_Enable (bit 0) is set and CPUID.(EAX=07H, ECX=0H): EBX[bit 9] also reports 1. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-3-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit 724a92ee45c04cb9d82884a856b03b1e594d9de1 Author: Fenghua Yu Date: Tue May 17 15:29:10 2011 -0700 x86, cpufeature: Add CPU feature bit for enhanced REP MOVSB/STOSB Intel processors are adding enhancements to REP MOVSB/STOSB and the use of REP MOVSB/STOSB for optimal memcpy/memset or similar functions is recommended. Enhancement availability is indicated by CPUID.7.0.EBX[9] (Enhanced REP MOVSB/ STOSB). Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305671358-14478-2-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin commit 6538df80194e305f1b78cafb556f4bb442f808b3 Author: Rafael J. Wysocki Date: Tue May 17 23:26:21 2011 +0200 PM: Introduce generic prepare and complete callbacks for subsystems Introduce generic .prepare() and .complete() power management callbacks, currently missing, that can be used by subsystems and power domains and export them. Provide NULL definitions of all the generic system sleep callbacks for CONFIG_PM_SLEEP unset. Signed-off-by: Rafael J. Wysocki commit 91e7c75ba93c48a82670d630b9daac92ff70095d Author: Rafael J. Wysocki Date: Tue May 17 23:26:00 2011 +0200 PM: Allow drivers to allocate memory from .prepare() callbacks safely If device drivers allocate substantial amounts of memory (above 1 MB) in their hibernate .freeze() callbacks (or in their legacy suspend callbcks during hibernation), the subsequent creation of hibernate image may fail due to the lack of memory. This is the case, because the drivers' .freeze() callbacks are executed after the hibernate memory preallocation has been carried out and the preallocated amount of memory may be too small to cover the new driver allocations. Unfortunately, the drivers' .prepare() callbacks also are executed after the hibernate memory preallocation has completed, so they are not suitable for allocating additional memory either. Thus the only way a driver can safely allocate memory during hibernation is to use a hibernate/suspend notifier. However, the notifiers are called before the freezing of user space and the drivers wanting to use them for allocating additional memory may not know how much memory needs to be allocated at that point. To let device drivers overcome this difficulty rework the hibernation sequence so that the memory preallocation is carried out after the drivers' .prepare() callbacks have been executed, so that the .prepare() callbacks can be used for allocating additional memory to be used by the drivers' .freeze() callbacks. Update documentation to match the new behavior of the code. Signed-off-by: Rafael J. Wysocki commit c650da23d59d2c82307380414606774c6d49b8bd Author: Rafael J. Wysocki Date: Tue May 17 23:25:10 2011 +0200 PM: Remove CONFIG_PM_VERBOSE Now that we have CONFIG_DYNAMIC_DEBUG there is no need for yet another flag causing dev_dbg() and pr_debug() statements in the core PM code to produce output. Moreover, CONFIG_PM_VERBOSE causes so much output to be generated that it's not really useful and almost no one sets it. References: https://bugzilla.kernel.org/show_bug.cgi?id=23182 Signed-off-by: Rafael J. Wysocki commit 290c748725c170ed9a02522959ae67f528eefe98 Merge: 2d2a916 72874da Author: Rafael J. Wysocki Date: Tue May 17 23:23:46 2011 +0200 Merge branch 'power-domains' into for-linus * power-domains: PM: Fix build issue in clock_ops.c for CONFIG_PM_RUNTIME unset PM: Revert "driver core: platform_bus: allow runtime override of dev_pm_ops" OMAP1 / PM: Use generic clock manipulation routines for runtime PM PM / Runtime: Generic clock manipulation rountines for runtime PM (v6) PM / Runtime: Add subsystem data field to struct dev_pm_info OMAP2+ / PM: move runtime PM implementation to use device power domains PM / Platform: Use generic runtime PM callbacks directly shmobile: Use power domains for platform runtime PM PM: Export platform bus type's default PM callbacks PM: Make power domain callbacks take precedence over subsystem ones commit 2d2a9163bd4f3ba301f8138c32e4790edc30156c Merge: 1c1be3a 2e711c0 Author: Rafael J. Wysocki Date: Tue May 17 23:23:40 2011 +0200 Merge branch 'syscore' into for-linus * syscore: PM: Remove sysdev suspend, resume and shutdown operations PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM PM / AVR32: Use struct syscore_ops instead of sysdevs for PM PM / Blackfin: Use struct syscore_ops instead of sysdevs for PM ARM / Samsung: Use struct syscore_ops for "core" power management ARM / PXA: Use struct syscore_ops for "core" power management ARM / SA1100: Use struct syscore_ops for "core" power management ARM / Integrator: Use struct syscore_ops for core PM ARM / OMAP: Use struct syscore_ops for "core" power management ARM: Use struct syscore_ops instead of sysdevs for PM in common code commit 1c1be3a949a61427a962771c85a347c822aeb991 Author: Rafael J. Wysocki Date: Sun May 15 11:39:48 2011 +0200 Revert "PM / Hibernate: Reduce autotuned default image size" This reverts commit bea3864fb627d110933cfb8babe048b63c4fc76e (PM / Hibernate: Reduce autotuned default image size), because users are now able to resolve the issue this commit was supposed to address in a different way (i.e. by using the new /sys/power/reserved_size interface). Signed-off-by: Rafael J. Wysocki commit ddeb648708108091a641adad0a438ec4fd8bf190 Author: Rafael J. Wysocki Date: Sun May 15 11:38:48 2011 +0200 PM / Hibernate: Add sysfs knob to control size of memory for drivers Martin reports that on his system hibernation occasionally fails due to the lack of memory, because the radeon driver apparently allocates too much of it during the device freeze stage. It turns out that the amount of memory allocated by radeon during hibernation (and presumably during system suspend too) depends on the utilization of the GPU (e.g. hibernating while there are two KDE 4 sessions with compositing enabled causes radeon to allocate more memory than for one KDE 4 session). In principle it should be possible to use image_size to make the memory preallocation mechanism free enough memory for the radeon driver, but in practice it is not easy to guess the right value because of the way the preallocation code uses image_size. For this reason, it seems reasonable to allow users to control the amount of memory reserved for driver allocations made after the hibernate preallocation, which currently is constant and amounts to 1 MB. Introduce a new sysfs file, /sys/power/reserved_size, whose value will be used as the amount of memory to reserve for the post-preallocation reservations made by device drivers, in bytes. For backwards compatibility, set its default (and initial) value to the currently used number (1 MB). References: https://bugzilla.kernel.org/show_bug.cgi?id=34102 Reported-and-tested-by: Martin Steigerwald Signed-off-by: Rafael J. Wysocki commit 13e381365614855bf14c8ad68f9b65e3afd3dd2c Author: Eric Dumazet Date: Wed May 11 22:40:45 2011 +0200 PM / Wakeup: Remove useless synchronize_rcu() call wakeup_source_add() adds an item into wakeup_sources list. There is no need to call synchronize_rcu() at this point. Its only needed in wakeup_source_remove() Signed-off-by: Eric Dumazet Signed-off-by: Rafael J. Wysocki commit 13d53f8775c6a00b070a3eef6833795412eb7fcd Author: Kay Sievers Date: Tue May 10 21:27:34 2011 +0200 kmod: always provide usermodehelper_disable() We need to prevent kernel-forked processes during system poweroff. Such processes try to access the filesystem whose disks we are trying to shutdown at the same time. This causes delays and exceptions in the storage drivers. A follow-up patch will add these calls and need usermodehelper_disable() also on systems without suspend support. Signed-off-by: Kay Sievers Signed-off-by: Rafael J. Wysocki commit c3b0795c98c08351567464150db66d11e05d7611 Author: Amerigo Wang Date: Tue May 10 21:09:38 2011 +0200 PM / ACPI: Remove acpi_sleep=s4_nonvs acpi_sleep=s4_nonvs is superseded by acpi_sleep=nonvs, so remove it. Signed-off-by: WANG Cong Acked-by: Pavel Machek Acked-by: Len Brown Signed-off-by: Rafael J. Wysocki commit e762318baae3002944d68220910aef7caffcd065 Author: Rafael J. Wysocki Date: Sat May 7 18:11:52 2011 +0200 PM / Wakeup: Fix build warning related to the "wakeup" sysfs file The "wakeup" device sysfs file is only created if CONFIG_PM_SLEEP is set, so put it under CONFIG_PM_SLEEP and make a build warning related to it go away. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman commit a144c6a6c924aa1da04dd77fb84b89927354fdff Author: Rafael J. Wysocki Date: Fri May 6 20:09:42 2011 +0200 PM: Print a warning if firmware is requested when tasks are frozen Some drivers erroneously use request_firmware() from their ->resume() (or ->thaw(), or ->restore()) callbacks, which is not going to work unless the firmware has been built in. This causes system resume to stall until the firmware-loading timeout expires, which makes users think that the resume has failed and reboot their machines unnecessarily. For this reason, make _request_firmware() print a warning and return immediately with error code if it has been called when tasks are frozen and it's impossible to start any new usermode helpers. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman Reviewed-by: Valdis Kletnieks commit e1866b33b1e89f077b7132daae3dfd9a594e9a1a Author: Rafael J. Wysocki Date: Fri Apr 29 00:33:45 2011 +0200 PM / Runtime: Rework runtime PM handling during driver removal The driver core tries to prevent race conditions between runtime PM and driver removal from happening by incrementing the runtime PM usage counter of the device and executing pm_runtime_barrier() before running the bus notifier and the ->remove() callbacks provided by the device's subsystem or driver. This guarantees that, if a future runtime suspend of the device has been scheduled or a runtime resume or idle request has been queued up right before the driver removal, it will be canceled or waited for to complete and no other asynchronous runtime suspend or idle requests for the device will be put into the PM workqueue until the ->remove() callback returns. However, it doesn't prevent resume requests from being queued up after pm_runtime_barrier() has been called and it doesn't prevent pm_runtime_resume() from executing the device subsystem's runtime resume callback. Morever, it prevents the device's subsystem or driver from putting the device into the suspended state by calling pm_runtime_suspend() from its ->remove() routine. This turns out to be a major inconvenience for some subsystems and drivers that want to leave the devices they handle in the suspended state. To really prevent runtime PM callbacks from racing with the bus notifier callback in __device_release_driver(), which is necessary, because the notifier is used by some subsystems to carry out operations affecting the runtime PM functionality, use pm_runtime_get_sync() instead of the combination of pm_runtime_get_noresume() and pm_runtime_barrier(). This will resume the device if it's in the suspended state and will prevent it from being suspended again until pm_runtime_put_*() is called. To allow subsystems and drivers to put devices into the suspended state by calling pm_runtime_suspend() from their ->remove() routines, execute pm_runtime_put_sync() after running the bus notifier in __device_release_driver(). This will require subsystems and drivers to make their ->remove() callbacks avoid races with runtime PM directly, but it will allow of more flexibility in the handling of devices during the removal of their drivers. Signed-off-by: Rafael J. Wysocki commit ee940d8dccd899aa1777ea84da3d9cd04b1d2e8e Author: Mike Frysinger Date: Mon Apr 25 12:33:15 2011 +0200 Freezer: Use SMP barriers The freezer processes are dealing with multiple threads running simultaneously, and on a UP system, the memory reads/writes do not need barriers to keep things in sync. These are only needed on SMP systems, so use SMP barriers instead. Signed-off-by: Mike Frysinger Acked-by: Pavel Machek Signed-off-by: Rafael J. Wysocki commit 3c431936087e93d2219a184a8e19eaa68077e379 Author: MyungJoo Ham Date: Fri Apr 22 22:00:54 2011 +0200 PM / Suspend: Do not ignore error codes returned by suspend_enter() The current implementation of suspend-to-RAM returns 0 if there is an error from suspend_enter(), because suspend_devices_and_enter() ignores the return value from suspend_enter(). This patch addresses this issue and properly keep the error return from suspend_enter() and let suspend_devices_and_enter relay the error return. Signed-off-by: MyungJoo Ham Signed-off-by: Kyungmin Park Signed-off-by: Rafael J. Wysocki commit 2494b030ba9334c7dd7df9b9f7abe4eacc950ec5 Author: Fenghua Yu Date: Tue May 17 12:33:26 2011 -0700 x86, cpufeature: Fix cpuid leaf 7 feature detection CPUID leaf 7, subleaf 0 returns the maximum subleaf in EAX, not the number of subleaves. Since so far only subleaf 0 is defined (and only the EBX bitfield) we do not need to qualify the test. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1305660806-17519-1-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin Cc: 2.6.36..39 commit 94692349c4fc1bc74c19a28f9379509361a06a3b Author: Stephane Eranian Date: Tue May 17 15:36:19 2011 +0200 perf: Fix multi-event parsing bug This patch fixes an issue with event parsing. The following commit appears to have broken the ability to specify a comma separated list of events: commit ceb53fbf6dbb1df26d38379a262c6981fe73dd36 Author: Ingo Molnar Date: Wed Apr 27 04:06:33 2011 +0200 perf stat: Fail more clearly when an invalid modifier is specified This patch fixes this while preserving the desired effect: $ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null Performance counter stats for 'ls /dev/null': 365956 instructions:u # 0.00 insns per cycle 731806 instructions:k # 0.00 insns per cycle 0.001108862 seconds time elapsed $ perf stat -e task-clock-msecs true invalid event modifier: '-msecs' Run 'perf list' for a list of valid events and modifiers Signed-off-by: Stephane Eranian Cc: acme@redhat.com Cc: peterz@infradead.org Cc: fweisbec@gmail.com Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad Signed-off-by: Ingo Molnar commit 86b9523ab1517f6edeb87295329c901930d3732d Merge: 2c14ddc fffcda1 Author: Ingo Molnar Date: Tue May 17 14:39:00 2011 +0200 Merge branch 'gart/rename' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into core/iommu commit dc382fd5bcca7098a984705ed6ac880f539d068e Author: David Rientjes Date: Mon May 16 13:54:10 2011 -0700 x86, mm: Allow ZONE_DMA to be configurable ZONE_DMA is unnecessary for a large number of machines that do not require less than 32-bit DMA addressing, e.g. ISA legacy DMA or PCI cards with a restricted DMA address mask. This patch allows users to disable ZONE_DMA for x86 if they know they will not be using such devices with their kernel. This prevents the VM from unnecessarily reserving a ratio of memory (defaulting to 1/256th of system capacity) with lowmem_reserve_ratio for such allocations when it will never be used. Signed-off-by: David Rientjes Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1105161353560.4353@chino.kir.corp.google.com Signed-off-by: H. Peter Anvin commit 865be7a81071a77014c83cd01536c989eed362b4 Author: Ondrej Zary Date: Mon May 16 21:38:08 2011 +0200 x86, cpu: Fix detection of Celeron Covington stepping A1 and B0 Steppings A1 and B0 of Celeron Covington are currently misdetected as Pentium II (Dixon). Fix it by removing the stepping check. [ hpa: this fixes this specific bug... the CPUID documentation specifies that the L2 cache size can disambiguate additional CPUs; this patch does not fix that. ] Signed-off-by: Ondrej Zary Link: http://lkml.kernel.org/r/201105162138.15416.linux@rainbow-software.org Signed-off-by: H. Peter Anvin commit f29638868280534ed7e2fdd93b31557232597940 Author: Martin Schwidefsky Date: Tue May 10 10:10:43 2011 +0200 ftrace/s390: mcount offset calculation Do the mcount offset adjustment in the recordmcount.pl/recordmcount.[ch] at compile time and not in ftrace_call_adjust at run time. Signed-off-by: Martin Schwidefsky Signed-off-by: Steven Rostedt commit 521ccb5c4aece609311bfa7157910a8f0c942af5 Author: Martin Schwidefsky Date: Tue May 10 10:10:41 2011 +0200 ftrace/x86: mcount offset calculation Do the mcount offset adjustment in the recordmcount.pl/recordmcount.[ch] at compile time and not in ftrace_call_adjust at run time. Signed-off-by: Martin Schwidefsky Signed-off-by: Steven Rostedt commit 07d8b595f367f4604e6027ad4cba33cbe3f55e10 Author: Martin Schwidefsky Date: Tue May 10 10:10:40 2011 +0200 ftrace/recordmcount: mcount address adjustment Introduce mcount_adjust{,_32,_64} to the C implementation of recordmcount analog to $mcount_adjust in the perl script. The adjustment is added to the address of the relocations against the mcount symbol. If this adjustment is done by recordmcount at compile time the ftrace_call_adjust function can be turned into a nop. Cc: John Reiser Signed-off-by: Martin Schwidefsky Signed-off-by: Steven Rostedt commit 41b402a201a12efdff4acc990e023a89a409cd41 Author: Steven Rostedt Date: Wed Apr 20 21:13:06 2011 -0400 ftrace/recordmcount: Add helper function get_sym_str_and_relp() The code to get the symbol, string, and relp pointers in the two functions sift_rel_mcount() and nop_mcount() are identical and also non-trivial. Moving this duplicate code into a single helper function makes the code easier to read and more maintainable. Cc: John Reiser Link: http://lkml.kernel.org/r/20110421023739.723658553@goodmis.org Signed-off-by: Steven Rostedt commit 37762cb9977626343b3cd1aab9146313c94748c2 Author: Steven Rostedt Date: Wed Apr 20 20:47:34 2011 -0400 ftrace/recordmcount: Remove duplicate code to find mcount symbol The code in sift_rel_mcount() and nop_mcount() to get the mcount symbol number is identical. Replace the two locations with a call to a function that does the work. Cc: John Reiser Link: http://lkml.kernel.org/r/20110421023739.488093407@goodmis.org Signed-off-by: Steven Rostedt commit 2895cd2ab81dfb7bc22637bc110857db44a30b4a Author: Steven Rostedt Date: Wed Apr 13 16:43:29 2011 -0400 ftrace/x86: Do not trace .discard.text section The section called .discard.text has tracing attached to it and is currently ignored by ftrace. But it does include a call to the mcount stub. Adding a notrace to the code keeps gcc from adding the useless mcount caller to it. Link: http://lkml.kernel.org/r/20110421023739.243651696@goodmis.org Signed-off-by: Steven Rostedt commit f0201738b61b1adcf6b2c4719c5c415745014c1c Author: Steven Rostedt Date: Tue Apr 12 19:06:39 2011 -0400 ftrace: Avoid recording mcount on .init sections directly The init and exit sections should not be traced and adding a call to mcount to them is a waste of text and instruction cache. Have the macro section attributes include notrace to ignore these functions for tracing from the build. Link: http://lkml.kernel.org/r/20110421023738.953028219@goodmis.org Signed-off-by: Steven Rostedt commit 85356f802225fedeee8c3e65bdd93b263ace0a8b Author: Steven Rostedt Date: Tue Apr 12 18:59:10 2011 -0400 kbuild/recordmcount: Add RECORDMCOUNT_WARN to warn about mcount callers When mcount is called in a section that ftrace will not modify it into a nop, we want to warn about this. But not warn about this always. Now if the user builds the kernel with the option RECORDMCOUNT_WARN=1 then the build will warn about mcount callers that are ignored and will just waste execution time. Acked-by: Michal Marek Cc: linux-kbuild@vger.kernel.org Link: http://lkml.kernel.org/r/20110421023738.714956282@goodmis.org Signed-off-by: Steven Rostedt commit dfad3d598c4bbbaf137588e22bac1ce624529f7e Author: Steven Rostedt Date: Tue Apr 12 18:53:25 2011 -0400 ftrace/recordmcount: Add warning logic to warn on mcount not recorded There's some sections that should not have mcount recorded and should not have modifications to the that code. But currently they waste some time by calling mcount anyway (which simply returns). As the real answer should be to either whitelist the section or have gcc ignore it fully. This change adds a option to recordmcount to warn when it finds a section that is ignored by ftrace but still contains mcount callers. This is not on by default as developers may not know if the section should be completely ignored or added to the whitelist. Cc: John Reiser Link: http://lkml.kernel.org/r/20110421023738.476989377@goodmis.org Signed-off-by: Steven Rostedt commit ffd618fa39284f8cc343894b566dd42ec6e74e77 Author: Steven Rostedt Date: Fri Apr 8 03:58:48 2011 -0400 ftrace/recordmcount: Make ignored mcount calls into nops at compile time There are sections that are ignored by ftrace for the function tracing because the text is in a section that can be removed without notice. The mcount calls in these sections are ignored and ftrace never sees them. The downside of this is that the functions in these sections still call mcount. Although the mcount function is defined in assembly simply as a return, this added overhead is unnecessary. The solution is to convert these callers into nops at compile time. A better solution is to add 'notrace' to the section markers, but as new sections come up all the time, it would be nice that they are delt with when they are created. Later patches will deal with finding these sections and doing the proper solution. Thanks to H. Peter Anvin for giving me the right nops to use for x86. Cc: "H. Peter Anvin" Cc: John Reiser Link: http://lkml.kernel.org/r/20110421023738.237101176@goodmis.org Signed-off-by: Steven Rostedt commit 8abd5724a7f1631ab2276954156c629d4d17149a Author: Steven Rostedt Date: Wed Apr 13 13:31:08 2011 -0400 ftrace/recordmcount: Modify only executable sections PROGBITS is not enough to determine if the section should be modified or not. Only process sections that are marked as executable. Cc: John Reiser Link: http://lkml.kernel.org/r/20110421023737.991485123@goodmis.org Signed-off-by: Steven Rostedt commit 9f087e7612115b7a201d4f3392a95ac7408948ab Author: Steven Rostedt Date: Wed Apr 6 14:10:22 2011 -0400 ftrace: Add .kprobe.text section to whitelist for recordmcount.c The .kprobe.text section is safe to modify mcount to nop and tracing. Add it to the whitelist in recordmcount.c and recordmcount.pl. Cc: John Reiser Cc: Masami Hiramatsu Link: http://lkml.kernel.org/r/20110421023737.743350547@goodmis.org Signed-off-by: Steven Rostedt commit e90b0c8bf211958a296d60369fecd51b35864407 Author: Steven Rostedt Date: Wed Apr 6 13:32:24 2011 -0400 ftrace/trivial: Clean up record mcount to use Linux switch style The Linux style for switch statements is: switch (var) { case x: [...] break; } Not: switch (var) { case x: { [...] } break; Cc: John Reiser Link: http://lkml.kernel.org/r/20110421023737.523968644@goodmis.org Signed-off-by: Steven Rostedt commit dd5477ff3ba978892014ea5f988cb1bf04aa505e Author: Steven Rostedt Date: Wed Apr 6 13:21:17 2011 -0400 ftrace/trivial: Clean up recordmcount.c to use Linux style comparisons The Linux ftrace subsystem style for comparing is: var == 1 var > 0 and not: 1 == var 0 < var It is considered that Linux developers are smart enough not to do the if (var = 1) mistake. Cc: John Reiser Link: http://lkml.kernel.org/r/20110421023737.290712238@goodmis.org Signed-off-by: Steven Rostedt commit eecaaba5b2e4ae762b4726fae2e3b22630e137ec Author: Borislav Petkov Date: Mon May 16 15:39:48 2011 +0200 Documentation, ABI: Update L3 cache index disable text Change contact person to AMD kernel mailing list, update text and external references, drop "Users:" tag. Cc: Randy Dunlap Signed-off-by: Borislav Petkov Link: http://lkml.kernel.org/r/1305553188-21061-4-git-send-email-bp@amd64.org Acked-by: Greg Kroah-Hartman Signed-off-by: H. Peter Anvin commit 42be450565b0fc4607fae3e3a7da038d367a23ed Author: Frank Arnold Date: Mon May 16 15:39:47 2011 +0200 x86, AMD, cacheinfo: Fix L3 cache index disable checks We provide two slots to disable cache indices, and have a check to prevent both slots to be used for the same index. If the user disables the same index on different subcaches, both slots will hold the same index, e.g. $ echo 2047 > /sys/devices/system/cpu/cpu0/cache/index3/cache_disable_0 $ cat /sys/devices/system/cpu/cpu0/cache/index3/cache_disable_0 2047 $ echo 1050623 > /sys/devices/system/cpu/cpu0/cache/index3/cache_disable_1 $ cat /sys/devices/system/cpu/cpu0/cache/index3/cache_disable_1 2047 due to the fact that the check was looking only at index bits [11:0] and was ignoring writes to bits outside that range. The more correct fix is to simply check whether the index is within the bounds of [0..l3->indices]. While at it, cleanup comments and drop now-unused local macros. Signed-off-by: Frank Arnold Link: http://lkml.kernel.org/r/1305553188-21061-3-git-send-email-bp@amd64.org Signed-off-by: Borislav Petkov Signed-off-by: H. Peter Anvin commit 50e7534427283afd997d58481778c07bea79eb63 Author: Borislav Petkov Date: Mon May 16 15:39:46 2011 +0200 x86, AMD, cacheinfo: Fix fallout caused by max3 conversion 732eacc0542d0aa48797f675888b85d6065af837 converted code around the kernel using nested max() macros to use the new max3 macro but forgot to remove the old line in intel_cacheinfo.c. Fix it. Cc: Hagen Paul Pfeifer Cc: Frank Arnold Signed-off-by: Borislav Petkov Link: http://lkml.kernel.org/r/1305553188-21061-2-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin commit 72874daa5e9064c4e8d689e6a04b1e96f687f872 Author: Rafael J. Wysocki Date: Tue May 3 19:45:32 2011 +0200 PM: Fix build issue in clock_ops.c for CONFIG_PM_RUNTIME unset Fix a build issue in drivers/base/power/clock_ops.c occuring when CONFIG_PM_RUNTIME is not set. Signed-off-by: Rafael J. Wysocki commit 2064af917b3ba7589070064ca4ed12cecd99a63c Author: Kevin Hilman Date: Fri Apr 29 00:37:26 2011 +0200 PM: Revert "driver core: platform_bus: allow runtime override of dev_pm_ops" The platform_bus_set_pm_ops() operation is deprecated in favor of the new device power domain infrastructre implemented in commit 7538e3db6e015e890825fbd9f8659952896ddd5b (PM: add support for device power domains) Signed-off-by: Kevin Hilman Signed-off-by: Rafael J. Wysocki commit 600b776eb39a13a28b090ba9efceb0c69d4508aa Author: Rafael J. Wysocki Date: Mon May 16 20:15:36 2011 +0200 OMAP1 / PM: Use generic clock manipulation routines for runtime PM Convert OMAP1 to using the new generic clock manipulation routines and a device power domain for runtime PM instead of overriding the platform bus type's runtime PM callbacks. This allows us to simplify OMAP1-specific code and to share some code with other platforms (shmobile in particular). Signed-off-by: Rafael J. Wysocki Acked-by: Kevin Hilman commit 7c1bfd685bcdc822ab1d7411ea05c82bd2a7b260 Author: Konrad Rzeszutek Wilk Date: Mon May 16 13:47:30 2011 -0400 xen/pci: Fix compiler error when CONFIG_XEN_PRIVILEGED_GUEST is not set. If we have CONFIG_XEN and the other parameters to build an Linux kernel that is non-privileged, the xen_[find|register|unregister]_ device_domain_owner functions should not be compiled. They should use the nops defined in arch/x86/include/asm/xen/pci.h instead. This fixes: arch/x86/pci/xen.c:496: error: redefinition of ‘xen_find_device_domain_owner’ arch/x86/include/asm/xen/pci.h:25: note: previous definition of ‘xen_find_device_domain_owner’ was here arch/x86/pci/xen.c:510: error: redefinition of ‘xen_register_device_domain_owner’ arch/x86/include/asm/xen/pci.h:29: note: previous definition of ‘xen_register_device_domain_owner’ was here arch/x86/pci/xen.c:532: error: redefinition of ‘xen_unregister_device_domain_owner’ arch/x86/include/asm/xen/pci.h:34: note: previous definition of ‘xen_unregister_device_domain_owner’ was here Signed-off-by: Konrad Rzeszutek Wilk Reported-by: Randy Dunlap commit db670dac49b5423b39b5e523d28fe32045d71b10 Author: Stephan Baerwolf Date: Wed May 11 18:03:29 2011 +0200 sched: Fix and optimise calculation of the weight-inverse If the inverse loadweight should be zero, function "calc_delta_mine" calculates the inverse of "lw->weight" (in 32bit integer ops). This calculation is actually a little bit impure (because it is inverting something around "lw-weight"+1), especially when "lw->weight" becomes smaller. The correct inverse would be 1/lw->weight multiplied by "WMULT_CONST" for fixcomma-scaling it into integers. (So WMULT_CONST/lw->weight ...) The old, impure algorithm took two divisions for inverting lw->weight, the new, more exact one only takes one and an additional unlikely-if. Signed-off-by: Stephan Baerwolf Signed-off-by: Peter Zijlstra Cc: Linus Torvalds Link: http://lkml.kernel.org/n/tip-0pz0wnyalr4tk4ln11xwumdx@git.kernel.org [ This could explain some aritmetical issues for small shares but nothing concrete has been reported yet so we are not confident enough to queue this up in sched/urgent and for -stable backport. But if anyone finds this commit and sees it to fix some badness then we can certainly change our mind! ] Signed-off-by: Ingo Molnar commit db44fc017d5989302713ab4e7f9e922b648f4b59 Author: Yong Zhang Date: Mon May 9 22:07:05 2011 +0800 sched: Avoid going ahead if ->cpus_allowed is not changed If cpumask_equal(&p->cpus_allowed, new_mask) is true, seems there is no reason to prevent set_cpus_allowed_ptr() return directly. Signed-off-by: Yong Zhang Acked-by: Hillf Danton Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110509140705.GA2219@zhy Signed-off-by: Ingo Molnar commit 61eadef6a9bde9ea62fda724a9cb501ce9bc925a Author: Mike Galbraith Date: Fri Apr 29 08:36:50 2011 +0200 sched, rt: Update rq clock when unthrottling of an otherwise idle CPU If an RT task is awakened while it's rt_rq is throttled, the time between wakeup/enqueue and unthrottle/selection may be accounted as rt_time if the CPU is idle. Set rq->skip_clock_update negative upon throttle release to tell put_prev_task() that we need a clock update. Reported-by: Thomas Giesel Signed-off-by: Mike Galbraith Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1304059010.7472.1.camel@marge.simson.net Signed-off-by: Ingo Molnar commit ec514c487c3d4b652943da7b0afbc094eee08cfa Author: Cheng Xu Date: Sat May 14 14:20:02 2011 +0800 sched: Fix rt_rq runtime leakage bug This patch is to fix the real-time scheduler bug reported at: https://lkml.org/lkml/2011/4/26/13 That is, when running multiple real-time threads on every logical CPUs and then turning off one CPU, the kernel will bug at function __disable_runtime(). Function __disable_runtime() bugs and reports leakage of rt_rq runtime. The root cause is __disable_runtime() assumes it iterates through all the existing rt_rq's while walking rq->leaf_rt_rq_list, which actually contains only runnable rt_rq's. This problem also applies to __enable_runtime() and print_rt_stats(). The patch is based on above analysis, appears to fix the problem, but is only lightly tested. Reported-by: Paul E. McKenney Tested-by: Paul E. McKenney Signed-off-by: Cheng Xu Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/4DCE1F12.6040609@linux.vnet.ibm.com Signed-off-by: Ingo Molnar commit a18f22a968de17b29f2310cdb7ba69163e65ec15 Merge: a1c57e0 798778b Author: Thomas Gleixner Date: Sat May 14 12:06:36 2011 +0200 Merge branch 'consolidate-clksrc-i8253' of master.kernel.org:~rmk/linux-2.6-arm into timers/clocksource Conflicts: arch/ia64/kernel/cyclone.c arch/mips/kernel/i8253.c arch/x86/kernel/i8253.c Reason: Resolve conflicts so further cleanups do not conflict further Signed-off-by: Thomas Gleixner commit 798778b8653f64b7b2162ac70eca10367cff6ce8 Author: Russell King Date: Sun May 8 19:03:03 2011 +0100 clocksource: convert mips to generic i8253 clocksource Convert MIPS i8253 clocksource code to use generic i8253 clocksource. Acked-by: John Stultz Acked-by: Thomas Gleixner Cc: Ralf Baechle Signed-off-by: Russell King commit 82491451dd25a3abe8496ddbd04ddb3f77d285c2 Author: Russell King Date: Sun May 8 18:55:19 2011 +0100 clocksource: convert x86 to generic i8253 clocksource Convert x86 i8253 clocksource code to use generic i8253 clocksource. Acked-by: John Stultz Acked-by: Thomas Gleixner Cc: "H. Peter Anvin" Cc: Ingo Molnar Signed-off-by: Russell King commit 8c414ff3f4dcdde228c6a668385218290d73a265 Author: Russell King Date: Sun May 8 18:50:20 2011 +0100 clocksource: convert footbridge to generic i8253 clocksource Convert the footbridge isa-timer code to use generic i8253 clocksource. Acked-by: John Stultz Acked-by: Thomas Gleixner Signed-off-by: Russell King commit 89c0b8e2520e12d69dafc663dfbd39f8180438ea Author: Russell King Date: Sun May 8 18:47:58 2011 +0100 clocksource: add common i8253 PIT clocksource This is based upon both arch/arm/mach-footbridge/isa-timer.c and arch/x86/kernel/i8253.c. Acked-by: John Stultz Acked-by: Thomas Gleixner Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Ralf Baechle Signed-off-by: Russell King commit c42d2237143fcf35cff642cefe2bcf7786aae312 Author: Stephen Boyd Date: Thu May 12 16:50:07 2011 -0700 debugfs: Silence DEBUG_STRICT_USER_COPY_CHECKS=y warning Enabling DEBUG_STRICT_USER_COPY_CHECKS causes the following warning: In file included from arch/x86/include/asm/uaccess.h:573, from include/linux/uaccess.h:5, from include/linux/highmem.h:7, from include/linux/pagemap.h:10, from fs/debugfs/file.c:18: In function 'copy_from_user', inlined from 'write_file_bool' at fs/debugfs/file.c:435: arch/x86/include/asm/uaccess_64.h:65: warning: call to 'copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct presumably due to buf_size being signed causing GCC to fail to see that buf_size can't become negative. Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit 82a3242e11d9e63c8195be46c954efaefee35e22 Author: Greg Kroah-Hartman Date: Thu May 12 16:01:02 2011 -0700 sysfs: remove "last sysfs file:" line from the oops messages On some arches (x86, sh, arm, unicore, powerpc) the oops message would print out the last sysfs file accessed. This was very useful in finding a number of sysfs and driver core bugs in the 2.5 and early 2.6 development days, but it has been a number of years since this file has actually helped in debugging anything that couldn't also be trivially determined from the stack traceback. So it's time to delete the line. This is good as we need all the space we can get for oops messages at times on consoles. Acked-by: Phil Carmody Acked-by: Ingo Molnar Cc: Andrew Morton Cc: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit ae15a3b4d1374b733016ce4b4148b2ba42bbeb0f Author: Daniel Kiper Date: Wed May 4 20:17:21 2011 +0200 arch/x86/xen/setup: Cleanup code/data sections definitions Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit ad3062a0f438a5f436dae267f795c0a9686f11d2 Author: Daniel Kiper Date: Wed May 4 20:15:18 2011 +0200 arch/x86/xen/enlighten: Cleanup code/data sections definitions Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit 251511a18dcfe1868e3edfdc490777dcfaa1f851 Author: Daniel Kiper Date: Wed May 4 20:16:07 2011 +0200 arch/x86/xen/irq: Cleanup code/data sections definitions Cleanup code/data sections definitions accordingly to include/linux/init.h. Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit bdf516748ed73f6219f5e4065a8fe2f333520687 Author: Ian Campbell Date: Mon Apr 4 13:39:53 2011 +0100 xen: tidy up whitespace in drivers/xen/Makefile Various merges over time have led to rather a mish-mash of indentation. Signed-off-by: Ian Campbell Cc: Jeremy Fitzhardinge Cc: xen-devel@lists.xensource.com Signed-off-by: Konrad Rzeszutek Wilk commit a236c71766a5f69edf189e2eaeb0aa587c8c5684 Author: Andrew Morton Date: Thu May 12 12:45:31 2011 -0700 drivers/base/memory.c: fix warning due to "memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION" drivers/base/memory.c: In function 'memory_block_change_state': drivers/base/memory.c:281: warning: unused variable 'i' less beer, more testing Cc: Anton Blanchard Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 8c5950881c3b5e6e350e4b0438a8ccc513d90df9 Author: Konrad Rzeszutek Wilk Date: Fri Apr 1 15:18:48 2011 -0400 xen/p2m: Create entries in the P2M_MFN trees's to track 1-1 mappings .. when applicable. We need to track in the p2m_mfn and p2m_mfn_p the MFNs and pointers, respectivly, for the P2M entries that are allocated for the identity mappings. Without this, a PV domain with an E820 that triggers the 1-1 mapping to kick in, won't be able to be restored as the P2M won't have the identity mappings. Signed-off-by: Konrad Rzeszutek Wilk commit 0f16d0dfcdb5aab97d9e368f008b070b5b3ec6d3 Author: Daniel Kiper Date: Wed May 11 22:34:38 2011 +0200 xen/setup: Fix for incorrect xen_extra_mem_start initialization under 32-bit git commit 24bdb0b62cc82120924762ae6bc85afc8c3f2b26 (xen: do not create the extra e820 region at an addr lower than 4G) does not take into account that ifdef CONFIG_X86_32 instead of e820_end_of_low_ram_pfn() find_low_pfn_range() is called (both calls are from arch/x86/kernel/setup.c). find_low_pfn_range() behaves correctly and does not require change in xen_extra_mem_start initialization. Additionally, if xen_extra_mem_start is initialized in the same way as ifdef CONFIG_X86_64 then memory hotplug support for Xen balloon driver (under development) is broken. Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit 15bfc094517db2ddf38ca7ed47f3a1c0ad24f7c4 Author: Konrad Rzeszutek Wilk Date: Tue Apr 12 07:57:15 2011 -0400 xen/setup: Ignore E820_UNUSABLE when setting 1-1 mappings. When we parse the raw E820, the Xen hypervisor can set "E820_RAM" to "E820_UNUSABLE" if the mem=X argument is used. As such we should _not_ consider the E820_UNUSABLE as an 1-1 identity mapping, but instead use the same case as for E820_RAM. Signed-off-by: Konrad Rzeszutek Wilk commit 7899891c7d161752f29abcc9bc0a9c6c3a3af26c Author: Tian, Kevin Date: Thu May 12 10:56:08 2011 +0800 xen mmu: fix a race window causing leave_mm BUG() There's a race window in xen_drop_mm_ref, where remote cpu may exit dirty bitmap between the check on this cpu and the point where remote cpu handles drop request. So in drop_other_mm_ref we need check whether TLB state is still lazy before calling into leave_mm. This bug is rarely observed in earlier kernel, but exaggerated by the commit 831d52bc153971b70e64eccfbed2b232394f22f8 ("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm") which clears bitmap after changing the TLB state. the call trace is as below: --------------------------------- kernel BUG at arch/x86/mm/tlb.c:61! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb CPU 1 Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcs pkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table] Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285 RIP: e030:[] [] leave_mm+0x15/0x46 RSP: e02b:ffff88002805be48 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0 RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001 RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200 R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880 R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000 FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000 00 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40) Stack: ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88 <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78 <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108 Call Trace: [] drop_other_mm_ref+0x2a/0x53 [] generic_smp_call_function_single_interrupt+0xd8/0xfc [] xen_call_function_single_interrupt+0x13/0x28 [] handle_IRQ_event+0x66/0x120 [] handle_percpu_irq+0x41/0x6e [] __xen_evtchn_do_upcall+0x1ab/0x27d [] xen_evtchn_do_upcall+0x33/0x46 [] xen_do_hyper visor_callback+0x1e/0x30 [] ? _spin_unlock_irqrestore+0x15/0x17 [] ? xen_restore_fl_direct_end+0x0/0x1 [] ? flush_old_exec+0x3ac/0x500 [] ? load_elf_binary+0x0/0x17ef [] ? load_elf_binary+0x0/0x17ef [] ? load_elf_binary+0x398/0x17ef [] ? need_resched+0x23/0x2d [] ? process_measurement+0xc0/0xd7 [] ? load_elf_binary+0x0/0x17ef [] ? search_binary_handler+0xc8/0x255 [] ? do_execve+0x1c3/0x29e [] ? sys_execve+0x43/0x5d [] ? __call_usermodehelper+0x0/0x6f [] ? kernel_execve+0x68/0xd0 [] ? __call_usermodehelper+0x0/0x6f [] ? xen_restore_fl_direct_end+0x0/0x1 [] ? ____call_usermodehelper+0x113/0x11e [] ? child_rip+0xa/0x20 [] ? __call_usermodehelper+0x0/0x6f [] ? int_ret_from_sys_call+0x7/0x1b [] ? retint_restore_args+0x5/0x6 [] ? child_rip+0x0/0x20 Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8 RIP [] leave_mm+0x15/0x46 RSP ---[ end trace ce9cee6832a9c503 ]--- Tested-by: Maoxiaoyun Signed-off-by: Kevin Tian [v1: Fleshed out the git description a bit] Signed-off-by: Konrad Rzeszutek Wilk commit 1df9fad1220beb21b4b4230373cbf4924436dd88 Merge: d0c49bf 2f25e9a 1c65335 Author: Roland Dreier Date: Thu May 12 08:57:20 2011 -0700 Merge branches 'cma', 'cxgb4' and 'qib' into for-next commit 1c65335714e99864a438b0d757c8736d3c6e7d79 Author: Sergei Shtylyov Date: Mon May 9 22:07:31 2011 -0700 IB/qib: Use pci_dev->revision The driver reads PCI revision ID from the PCI configuration register while it's already stored by PCI subsystem in the revision field of struct pci_dev. Signed-off-by: Sergei Shtylyov Acked-by: Mike Marciniszyn Signed-off-by: Roland Dreier commit 449a66fd1fa75d36dca917704827c40c8f416bca Author: Richard Weinberger Date: Thu May 12 15:11:12 2011 +0200 x86: Remove warning and warning_symbol from struct stacktrace_ops Both warning and warning_symbol are nowhere used. Let's get rid of them. Signed-off-by: Richard Weinberger Cc: Oleg Nesterov Cc: Andrew Morton Cc: Huang Ying Cc: Soeren Sandmann Pedersen Cc: Namhyung Kim Cc: x86 Cc: H. Peter Anvin Cc: Thomas Gleixner Cc: Robert Richter Cc: Paul Mundt Link: http://lkml.kernel.org/r/1305205872-10321-2-git-send-email-richard@nod.at Signed-off-by: Frederic Weisbecker commit 5db1256a5131d3b133946fa02ac9770a784e6eb2 Author: Milton Miller Date: Thu May 12 04:13:54 2011 -0500 seqlock: Don't smp_rmb in seqlock reader spin loop Move the smp_rmb after cpu_relax loop in read_seqlock and add ACCESS_ONCE to make sure the test and return are consistent. A multi-threaded core in the lab didn't like the update from 2.6.35 to 2.6.36, to the point it would hang during boot when multiple threads were active. Bisection showed af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867 (clockevents: Remove the per cpu tick skew) as the culprit and it is supported with stack traces showing xtime_lock waits including tick_do_update_jiffies64 and/or update_vsyscall. Experimentation showed the combination of cpu_relax and smp_rmb was significantly slowing the progress of other threads sharing the core, and this patch is effective in avoiding the hang. A theory is the rmb is affecting the whole core while the cpu_relax is causing a resource rebalance flush, together they cause an interfernce cadance that is unbroken when the seqlock reader has interrupts disabled. At first I was confused why the refactor in 3c22cd5709e8143444a6d08682a87f4c57902df3 (kernel: optimise seqlock) didn't affect this patch application, but after some study that affected seqcount not seqlock. The new seqcount was not factored back into the seqlock. I defer that the future. While the removal of the timer interrupt offset created contention for the xtime lock while a cpu does the additonal work to update the system clock, the seqlock implementation with the tight rmb spin loop goes back much further, and is just waiting for the right trigger. Cc: Signed-off-by: Milton Miller Cc: Cc: Linus Torvalds Cc: Andi Kleen Cc: Nick Piggin Cc: Benjamin Herrenschmidt Cc: Anton Blanchard Cc: Paul McKenney Acked-by: Eric Dumazet Link: http://lkml.kernel.org/r/%3Cseqlock-rmb%40mdm.bga.com%3E Signed-off-by: Thomas Gleixner commit 3e51e3edfd81bfd9853ad7de91167e4ce33d0fe7 Author: Samir Bellabes Date: Wed May 11 18:18:05 2011 +0200 sched: Remove unused parameters from sched_fork() and wake_up_new_task() sched_fork() and wake_up_new_task() are defined with a parameter 'unsigned long clone_flags', which is unused. This patch removes the parameters. Signed-off-by: Samir Bellabes Acked-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1305130685-1047-1-git-send-email-sam@synack.fr Signed-off-by: Ingo Molnar commit 9cb5baba5e3acba0994ad899ee908799104c9965 Merge: 7142d17 693d92a Author: Ingo Molnar Date: Thu May 12 09:36:18 2011 +0200 Merge commit 'v2.6.39-rc7' into sched/core commit 5409d2cd841cf2c76396470e566500f6505f8d2a Author: Anton Blanchard Date: Wed May 11 17:25:14 2011 +1000 memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION On ppc64 the minimum memory section for hotplug is 16MB but most recent machines have a memory block size of 256MB. This means memory_block_change_state does 16 separate calls to memory_section_action. This also means we call the notifiers 16 times and the hook in the ehea network driver is quite costly. To offline one 256MB region takes: # time echo offline > /sys/devices/system/memory/memory32/state 7.9s This patch removes the loop and calls online_pages or remove_memory once for the entire region and in doing so makes the logic simpler since we don't have to back out if things fail part way through. The same test to offline one region now takes: # time echo online > /sys/devices/system/memory/memory32/state 0.67s Over 11 times faster. Signed-off-by: Anton Blanchard Signed-off-by: Greg Kroah-Hartman commit 2e711c04dbbf7a7732a3f7073b1fc285d12b369d Author: Rafael J. Wysocki Date: Tue Apr 26 19:15:07 2011 +0200 PM: Remove sysdev suspend, resume and shutdown operations Since suspend, resume and shutdown operations in struct sysdev_class and struct sysdev_driver are not used any more, remove them. Also drop sysdev_suspend(), sysdev_resume() and sysdev_shutdown() used for executing those operations and modify all of their users accordingly. This reduces kernel code size quite a bit and reduces its complexity. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman commit f5a592f7d74e38c5007876c731e6bf5580072e63 Author: Rafael J. Wysocki Date: Tue Apr 26 19:14:57 2011 +0200 PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM Make some PowerPC architecture's code use struct syscore_ops objects for power management instead of sysdev classes and sysdevs. This simplifies the code and reduces the kernel's memory footprint. It also is necessary for removing sysdevs from the kernel entirely in the future. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman commit f98bf4aa39ecce96020631cba910be094c87ac3c Author: Rafael J. Wysocki Date: Tue Apr 26 19:14:47 2011 +0200 PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM Make some UNICORE32 architecture's code use struct syscore_ops objects for power management instead of sysdev classes and sysdevs. This simplifies the code and reduces the kernel's memory footprint. It also is necessary for removing sysdevs from the kernel entirely in the future. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman Acked-by: Guan Xuetao commit f25f4f522a9d2e18595da9df0bf1b6f282040e08 Author: Rafael J. Wysocki Date: Tue Apr 26 19:14:35 2011 +0200 PM / AVR32: Use struct syscore_ops instead of sysdevs for PM Convert some AVR32 architecture's code to using struct syscore_ops objects for power management instead of sysdev classes and sysdevs. This simplifies the code and reduces the kernel's memory footprint. It also is necessary for removing sysdevs from the kernel entirely in the future. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman Acked-by: Hans-Christian Egtvedt commit 1f8e1cdac616e510eeb2dc2a9226bf597bc6cfd6 Author: Robert P. J. Day Date: Sat May 7 17:18:20 2011 -0400 SYSFS: Fix erroneous comments for sysfs_update_group(). Fix what is clearly a simple copy-and-paste error in commenting the sysfs_update_group() routine. Signed-off-by: Robert P. J. Day Signed-off-by: Greg Kroah-Hartman commit 2c14ddc135d93cb4b33ee4bd4a2b05e21a4a762c Merge: 54b3335 72fe00f Author: Ingo Molnar Date: Tue May 10 22:15:57 2011 +0200 Merge branch 'iommu/2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into core/iommu commit 7e186bdd0098b34c69fb8067c67340ae610ea499 Author: Stefano Stabellini Date: Fri May 6 12:27:50 2011 +0100 xen: do not clear and mask evtchns in __xen_evtchn_do_upcall Change the irq handler of evtchns and pirqs that don't need EOI (pirqs that correspond to physical edge interrupts) to handle_edge_irq. Use handle_fasteoi_irq for pirqs that need eoi (they generally correspond to level triggered irqs), no risk in loosing interrupts because we have to EOI the irq anyway. This change has the following benefits: - it uses the very same handlers that Linux would use on native for the same irqs (handle_edge_irq for edge irqs and msis, and handle_fasteoi_irq for everything else); - it uses these handlers in the same way native code would use them: it let Linux mask\unmask and ack the irq when Linux want to mask\unmask and ack the irq; - it fixes a problem occurring when a driver calls disable_irq() in its handler: the old code was unconditionally unmasking the evtchn even if the irq is disabled when irq_eoi was called. See Documentation/DocBook/genericirq.tmpl for more informations. Signed-off-by: Stefano Stabellini [v1: Fixed space/tab issues] Signed-off-by: Konrad Rzeszutek Wilk commit fffcda1183e93df84ad73ba7eb7782a5c354e2b3 Author: Joerg Roedel Date: Tue May 10 17:22:06 2011 +0200 x86, gart: Rename pci-gart_64.c to amd_gart_64.c This file only contains code relevant for the northbridge gart in AMD processors. This patch renames the file to represent this fact in the filename. Signed-off-by: Joerg Roedel commit 2b348a77981227c6b64fb9cf19f7c711a6806bc9 Author: Lin Ming Date: Fri Apr 29 08:41:57 2011 +0000 perf probe: Fix the missed parameter initialization pubname_callback_param::found should be initialized to 0 in fastpath lookup, the structure is on the stack and uninitialized otherwise. Signed-off-by: Lin Ming Cc: Arnaldo Carvalho de Melo Link: http://lkml.kernel.org/r/1304066518-30420-2-git-send-email-ming.m.lin@intel.com Signed-off-by: Ingo Molnar commit 932fed4e2e42c3d730c01bb63b1c4f812c533d5b Merge: 57d5241 693d92a Author: Ingo Molnar Date: Tue May 10 17:05:24 2011 +0200 Merge commit 'v2.6.39-rc7' into perf/core Merge reason: pull in the latest fixes. Signed-off-by: Ingo Molnar commit 72fe00f01f9a3240a1073be27aeaf4fc476cc662 Author: Joerg Roedel Date: Tue May 10 10:50:42 2011 +0200 x86/amd-iommu: Use threaded interupt handler Move the interupt handling for the iommu into the interupt thread to reduce latencies and prepare interupt handling for pri handling. Signed-off-by: Joerg Roedel commit 604c307bf47350c74bb36507b86a08726c7c2075 Merge: e969687 ba4b87a fd7b553 58fc7f1 Author: Joerg Roedel Date: Tue May 10 10:25:23 2011 +0200 Merge branches 'dma-debug/next', 'amd-iommu/command-cleanups', 'amd-iommu/ats' and 'amd-iommu/extended-features' into iommu/2.6.40 Conflicts: arch/x86/include/asm/amd_iommu_types.h arch/x86/kernel/amd_iommu.c arch/x86/kernel/amd_iommu_init.c commit e969687595c27e02e02be0c9363261826123ba77 Author: Joe Perches Date: Fri Nov 5 16:12:35 2010 -0700 arch/x86/kernel/pci-iommu_table.c: Convert sprintf_symbol to %pS Coalesce format as well. Signed-off-by: Joe Perches Signed-off-by: Joerg Roedel commit d0c49bf391b2e230a8f3ae4486da7df440f1216d Author: Roland Dreier Date: Mon May 9 22:23:57 2011 -0700 RDMA/iwcm: Get rid of enum iw_cm_event_status The IW_CM_EVENT_STATUS_xxx values were used in only a couple of places; cma.c uses -Exxx values instead, and so do the amso1100, cxgb3 and cxgb4 drivers -- only nes was using the enum values (with the mild consequence that all nes connection failures were treated as generic errors rather than reported as timeouts or rejections). We can fix this confusion by getting rid of enum iw_cm_event_status and using a plain int for struct iw_cm_event.status, and converting nes to use -Exxx as the other iWARP drivers do. This also gets rid of the warning drivers/infiniband/core/cma.c: In function 'cma_iw_handler': drivers/infiniband/core/cma.c:1333:3: warning: case value '4294967185' not in enumerated type 'enum iw_cm_event_status' drivers/infiniband/core/cma.c:1336:3: warning: case value '4294967186' not in enumerated type 'enum iw_cm_event_status' drivers/infiniband/core/cma.c:1332:3: warning: case value '4294967192' not in enumerated type 'enum iw_cm_event_status' Signed-off-by: Roland Dreier Reviewed-by: Steve Wise Reviewed-by: Sean Hefty Reviewed-by: Faisal Latif commit ec03d6777a9e2b76917ef6d3fc8a01e174bcb9b1 Author: Sergei Shtylyov Date: Mon May 9 22:07:31 2011 -0700 IB/ipath: Use pci_dev->revision, again Commit 44c10138fd4b ("PCI: Change all drivers to use pci_device->revision") already converted this driver to using the revision field of struct pci_dev but commit bb9171448deb ("IB/ipath: Misc changes to prepare for IB7220 introduction") later reverted that change for some strange reason. Restore the change. Signed-off-by: Sergei Shtylyov Acked-by: Mike Marciniszyn Signed-off-by: Roland Dreier commit 9f5754e34bf964a41e821b2e0b358bc7c314ca4e Author: Mitko Haralanov Date: Mon May 9 22:07:31 2011 -0700 IB/qib: Prevent driver hang with unprogrammed boards The time limit test now correctly checks against current jiffies to avoid the hang. Signed-off-by: Mitko Haralanov Signed-off-by: Mike Marciniszyn Signed-off-by: Roland Dreier commit 2f25e9a540951ebd533b9b98d2259deb44b0b476 Author: Steve Wise Date: Mon May 9 22:06:23 2011 -0700 RDMA/cxgb4: EEH errors can hang the driver A few more EEH fixes: c4iw_wait_for_reply(): detect fatal EEH condition on timeout and return an error. The iw_cxgb4 driver was only calling ib_deregister_device() on an EEH event followed by a ib_register_device() when the device was reinitialized. However, the RDMA core doesn't allow multiple iterations of register/deregister by the provider. See drivers/infiniband/core/sysfs.c: ib_device_unregister_sysfs() where the kobject ref is held until the device is deallocated in ib_deallocate_device(). Calling deregister adds this kobj reference, and then a subsequent register call will generate a WARN_ON() from the kobject subsystem because the kobject is being initialized but is already initialized with the ref held. So the provider must deregister and dealloc when resetting for an EEH event, then alloc/register to re-initialize. To do this, we cannot use the device ptr as our ULD handle since it will change with each reallocation. This commit adds a ULD context struct which is used as the ULD handle, and then contains the device pointer and other state needed. Signed-off-by: Steve Wise Signed-off-by: Roland Dreier commit d9594d990a528d4c444777d0f360bb50c6114825 Author: Steve Wise Date: Mon May 9 22:06:22 2011 -0700 RDMA/cxgb4: Reset wait condition atomically The driver was never really waiting for RDMA_WR/FINI completions because the condition variable used to determine if the completion happened was never reset, and this condition variable is reused for both connection setup and teardown. This causes various driver crashes under heavy loads due to releasing resources too early. The fix is to use atomic bits to correctly reset the condition immediately after the completion is detected. Signed-off-by: Steve Wise Signed-off-by: Roland Dreier commit 85d215b0f316bee0a6936bd1a5f21abf03333eaa Author: Roel Kluin Date: Mon May 9 22:06:22 2011 -0700 RDMA/cxgb4: Fix missing parentheses Parens are missing: '|' has a higher presedence than '?'. Signed-off-by: Roel Kluin Acked-by: Steve Wise Signed-off-by: Roland Dreier commit bbe9a0a2bc07cf30c5b89b51154f2c87200a5dfd Author: Steve Wise Date: Mon May 9 22:06:22 2011 -0700 RDMA/cxgb4: Initialization errors can cause crash c4iw_uld_add() must return ERR_PTR() values instead of NULL on failure. Signed-off-by: Steve Wise Signed-off-by: Roland Dreier commit 30c95c2d495c1c8d4d6a97bb9f4e4eacb91ba1d2 Author: Steve Wise Date: Mon May 9 22:06:22 2011 -0700 RDMA/cxgb4: Don't change QP state outside EP lock Concurrent ingress CLOSE and ULP ABORT operations causes a crash due to a race condition where the close path releases the EP lock and then tries to move the QP state to CLOSED. This must be done inside the EP lock to avoid the race. Signed-off-by: Steve Wise Signed-off-by: Roland Dreier commit a9bb79128aa659f97b774b97c9bb1bdc74444595 Author: Hefty, Sean Date: Mon May 9 22:06:10 2011 -0700 RDMA/cma: Add an ID_REUSEADDR option Lustre requires that clients bind to a privileged port number before connecting to a remote server. On larger clusters (typically more than about 1000 nodes), the number of privileged ports is exhausted, resulting in lustre being unusable. To handle this, we add support for reusable addresses to the rdma_cm. This mimics the behavior of the socket option SO_REUSEADDR. A user may set an rdma_cm_id to reuse an address before calling rdma_bind_addr() (explicitly or implicitly). If set, other rdma_cm_id's may be bound to the same address, provided that they all have reuse enabled, and there are no active listens. If rdma_listen() is called on an rdma_cm_id that has reuse enabled, it will only succeed if there are no other id's bound to that same address. The reuse option is exported to user space. The behavior of the kernel reuse implementation was verified against that given by sockets. This patch is derived from a path by Ira Weiny Signed-off-by: Sean Hefty Signed-off-by: Roland Dreier commit 43b752daae9445a3b2b075a236840d801fce1593 Author: Hefty, Sean Date: Mon May 9 22:06:10 2011 -0700 RDMA/cma: Fix handling of IPv6 addressing in cma_use_port cma_use_port() assumes that the sockaddr is an IPv4 address. Since IPv6 addressing is supported (and also to support other address families) make the code more generic in its address handling. Signed-off-by: Sean Hefty Signed-off-by: Roland Dreier commit 935a638241b0658b9749edd060f972575f9d4a78 Author: Matthew Garrett Date: Thu May 5 15:19:46 2011 -0400 x86, efi: Ensure that the entirity of a region is mapped It's possible for init_memory_mapping() to fail to map the entire region if it crosses a boundary, so ensure that we complete the mapping. Signed-off-by: Matthew Garrett Link: http://lkml.kernel.org/r/1304623186-18261-5-git-send-email-mjg@redhat.com Signed-off-by: H. Peter Anvin commit 7cb00b72876ea2451eb79d468da0e8fb9134aa8a Author: Matthew Garrett Date: Thu May 5 15:19:45 2011 -0400 x86, efi: Pass a minimal map to SetVirtualAddressMap() Experimentation with various EFI implementations has shown that functions outside runtime services will still update their pointers if SetVirtualAddressMap() is called with memory descriptors outside the runtime area. This is obviously insane, and therefore is unsurprising. Evidence from instrumenting another EFI implementation suggests that it only passes the set of descriptors covering runtime regions, so let's avoid any problems by doing the same. Runtime descriptors are copied to a separate memory map, and only that map is passed back to the firmware. Signed-off-by: Matthew Garrett Link: http://lkml.kernel.org/r/1304623186-18261-4-git-send-email-mjg@redhat.com Signed-off-by: H. Peter Anvin commit 202f9d0a41809e3424af5f61489b48b622824aed Author: Matthew Garrett Date: Thu May 5 15:19:44 2011 -0400 x86, efi: Merge contiguous memory regions of the same type and attribute Some firmware implementations assume that physically contiguous regions will be contiguous in virtual address space. This assumption is, obviously, entirely unjustifiable. Said firmware implementations lack the good grace to handle their failings in a measured and reasonable manner, instead tending to shit all over address space and oopsing the kernel. In an ideal universe these firmware implementations would simultaneously catch fire and cease to be a problem, but since some of them are present in attractively thin and shiny metal devices vanity wins out and some poor developer spends an extended period of time surrounded by a growing array of empty bottles until the underlying reason becomes apparent. Said developer presents this patch, which simply merges adjacent regions if they happen to be contiguous and have the same EFI memory type and caching attributes. Signed-off-by: Matthew Garrett Link: http://lkml.kernel.org/r/1304623186-18261-3-git-send-email-mjg@redhat.com Signed-off-by: H. Peter Anvin commit 9cd2b07c197e3ff594fc04f5fb3d86efbeab6ad8 Author: Matthew Garrett Date: Thu May 5 15:19:43 2011 -0400 x86, efi: Consolidate EFI nx control The core EFI code and 64-bit EFI code currently have independent implementations of code for setting memory regions as executable or not. Let's consolidate them. Signed-off-by: Matthew Garrett Link: http://lkml.kernel.org/r/1304623186-18261-2-git-send-email-mjg@redhat.com Signed-off-by: H. Peter Anvin commit 2b5e8ef35bc89eee944c0627edacbb1fea5a1b84 Author: Matthew Garrett Date: Thu May 5 15:19:42 2011 -0400 x86, efi: Remove virtual-mode SetVirtualAddressMap call The spec says that SetVirtualAddressMap doesn't work once you're in virtual mode, so there's no point in having infrastructure for calling it from there. Signed-off-by: Matthew Garrett Link: http://lkml.kernel.org/r/1304623186-18261-1-git-send-email-mjg@redhat.com Signed-off-by: H. Peter Anvin commit 11c476f31a0fabc6e604da5b09a6590b57c3fb20 Author: Paul E. McKenney Date: Mon May 2 00:56:57 2011 -0700 net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree The RCU callback prl_entry_destroy_rcu() just calls kfree(), so we can use kfree_rcu() instead of call_rcu(). Signed-off-by: Paul E. McKenney Cc: Alexey Kuznetsov Cc: "Pekka Savola (ipv6)" Cc: James Morris Cc: Hideaki YOSHIFUJI Cc: Patrick McHardy Acked-by: David S. Miller Reviewed-by: Josh Triplett commit 8e3572cff70ee19a0a1f2e2dde0bca0b7c8b54dc Author: Paul E. McKenney Date: Mon May 2 00:52:23 2011 -0700 batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu The RCU callback softif_neigh_free_rcu() just calls kfree(), so we can use kfree_rcu() instead of call_rcu(). Signed-off-by: Paul E. McKenney Cc: Marek Lindner Cc: Simon Wunderlich Acked-by: David S. Miller Reviewed-by: Josh Triplett Acked-by: Sven Eckelmann commit ae179ae433bb4ef6b6179c5c1c7b6cc7dc01c670 Author: Paul E. McKenney Date: Sun May 1 23:27:50 2011 -0700 batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree() The RCU callback neigh_node_free_rcu() just calls kfree(), so we can use kfree_rcu() instead of call_rcu(). Signed-off-by: Paul E. McKenney Cc: Marek Lindner Cc: Simon Wunderlich Acked-by: David S. Miller Reviewed-by: Josh Triplett Acked-by: Sven Eckelmann commit eb340b2f804860a51a0b92e35fd36742b6c2d6b7 Author: Paul E. McKenney Date: Sun May 1 23:25:02 2011 -0700 batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu The RCU callback gw_node_free_rcu() just calls kfree(), so we can use kfree_rcu() instead of call_rcu(). Signed-off-by: Paul E. McKenney Cc: Marek Lindner Cc: Simon Wunderlich Acked-by: David S. Miller Reviewed-by: Josh Triplett Acked-by: Sven Eckelmann commit 0744371aeba7a5004006c2309971ee026c0b2000 Author: Lai Jiangshan Date: Tue Mar 15 18:02:42 2011 +0800 net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu() The rcu callback kfree_tid_tx() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(kfree_tid_tx). Signed-off-by: Lai Jiangshan Signed-off-by: Paul E. McKenney Cc: "John W. Linville" Cc: Johannes Berg Reviewed-by: Josh Triplett Acked-by: "David S. Miller" commit 88b4a0347a81539884df5ad535e10cf9fa87d8d4 Author: Lai Jiangshan Date: Fri Mar 18 12:15:02 2011 +0800 net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu() The rcu callback xt_osf_finger_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(xt_osf_finger_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit a74ce1425e4f6075de443274a5e917c84357eac4 Author: Lai Jiangshan Date: Fri Mar 18 12:14:15 2011 +0800 net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu() The rcu callback work_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(work_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: "John W. Linville" Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 61845220248c2368095158420b029683fad5570a Author: Lai Jiangshan Date: Fri Mar 18 12:10:25 2011 +0800 net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() The rcu callback wq_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(wq_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 7e113a9c759d1918fcbe456c5eb8890a4da62eaa Author: Lai Jiangshan Date: Fri Mar 18 12:09:03 2011 +0800 net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu() The rcu callback phonet_device_rcu_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(phonet_device_rcu_free). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit fa4bbc4ca5795fbf1303b98c924e51a21a920f14 Author: Lai Jiangshan Date: Fri Mar 18 12:08:29 2011 +0800 perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu() The rcu callback swevent_hlist_release_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(swevent_hlist_release_rcu). Signed-off-by: Lai Jiangshan Acked-by: Peter Zijlstra Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit cb796ff338db9f064f4ecb7d41898e8bedcad4d9 Author: Lai Jiangshan Date: Fri Mar 18 12:07:41 2011 +0800 perf,rcu: convert call_rcu(free_ctx) to kfree_rcu() The rcu callback free_ctx() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(free_ctx). Signed-off-by: Lai Jiangshan Acked-by: Peter Zijlstra Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 1f8d36a1869f5efae4fadf6baf01f02211040b97 Author: Lai Jiangshan Date: Fri Mar 18 12:07:09 2011 +0800 net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu() The rcu callback __nf_ct_ext_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(__nf_ct_ext_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 04d4dfed8e64f65d672502a614a4bb9093d1affd Author: Lai Jiangshan Date: Fri Mar 18 12:06:32 2011 +0800 net,rcu: convert call_rcu(net_generic_release) to kfree_rcu() The rcu callback net_generic_release() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(net_generic_release). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 6b2622328110b9c1094a4f644a6cdc9b6d5450ac Author: Lai Jiangshan Date: Fri Mar 18 12:04:50 2011 +0800 net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu() The rcu callback netlbl_unlhsh_free_addr6() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(netlbl_unlhsh_free_addr6). Signed-off-by: Lai Jiangshan Acked-by: Paul Moore Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit c3b494209cc6084bd76d823d185a60ee5bd40b0d Author: Lai Jiangshan Date: Fri Mar 18 12:03:56 2011 +0800 net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu() The rcu callback netlbl_unlhsh_free_addr4() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(netlbl_unlhsh_free_addr4). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Acked-by: Paul Moore Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 690273fc70e94a07d70044881e5e52926301bcd3 Author: Lai Jiangshan Date: Fri Mar 18 12:03:19 2011 +0800 security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu() The rcu callback sel_netif_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(sel_netif_free). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit b55071eb6011413af3b9c434ae77dea8832069c8 Author: Lai Jiangshan Date: Fri Mar 18 12:02:47 2011 +0800 net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu() The rcu callback xps_dev_maps_release() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(xps_dev_maps_release). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit edc86d8a1c824cca1df676f47fc713232885561f Author: Lai Jiangshan Date: Fri Mar 18 12:02:20 2011 +0800 net,rcu: convert call_rcu(xps_map_release) to kfree_rcu() The rcu callback xps_map_release() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(xps_map_release). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit f6f80238fab1ad19c44b4a12501528d50fc7fcd6 Author: Lai Jiangshan Date: Fri Mar 18 12:01:31 2011 +0800 net,rcu: convert call_rcu(rps_map_release) to kfree_rcu() The rcu callback rps_map_release() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(rps_map_release). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit e3cbf28fa61456bd2f0ba39846ffd905257eb4b0 Author: Lai Jiangshan Date: Fri Mar 18 12:00:50 2011 +0800 net,rcu: convert call_rcu(ipv6_mc_socklist_reclaim) to kfree_rcu() The rcu callback ipv6_mc_socklist_reclaim() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(ipv6_mc_socklist_reclaim). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 6e070aecd9e304264a6b8655f49aa7e6db0e55f2 Author: Lai Jiangshan Date: Fri Mar 18 12:00:07 2011 +0800 macvlan,rcu: convert call_rcu(macvlan_port_rcu_free) to kfree_rcu() The rcu callback macvlan_port_rcu_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(macvlan_port_rcu_free). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit bcec8b6531d481ced35506517af69adb2399f2a4 Author: Lai Jiangshan Date: Fri Mar 18 11:57:21 2011 +0800 ixgbe,rcu: convert call_rcu(ring_free_rcu) to kfree_rcu() The rcu callback ring_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(ring_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit fa81c0e1d2d2176f1136c72c080c9ea4a98be347 Author: Lai Jiangshan Date: Fri Mar 18 11:39:43 2011 +0800 net,rcu: convert call_rcu(free_dm_hw_stat) to kfree_rcu() The rcu callback free_dm_hw_stat() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(free_dm_hw_stat). Signed-off-by: Lai Jiangshan Acked-by: Neil Horman Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 10d50e748d983ff1003e0cf556ea17fa8f32c382 Author: Lai Jiangshan Date: Fri Mar 18 11:45:08 2011 +0800 net,rcu: convert call_rcu(ip_mc_socklist_reclaim) to kfree_rcu() The rcu callback ip_mc_socklist_reclaim() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(ip_mc_socklist_reclaim). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 7519cce48fb0a314dac473bdfba203b787435857 Author: Lai Jiangshan Date: Fri Mar 18 11:44:46 2011 +0800 net,rcu: convert call_rcu(ip_sf_socklist_reclaim) to kfree_rcu() The rcu callback ip_sf_socklist_reclaim() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(ip_sf_socklist_reclaim). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 42ea299d3f1941f8c3dbc1fe031d682f8ad45005 Author: Lai Jiangshan Date: Fri Mar 18 11:44:08 2011 +0800 net,rcu: convert call_rcu(ip_mc_list_reclaim) to kfree_rcu() The rcu callback ip_mc_list_reclaim() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(ip_mc_list_reclaim). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit dad178fcd5b295f5f350a46a5eaf2f28e847bb5a Author: Lai Jiangshan Date: Fri Mar 18 11:43:26 2011 +0800 net,rcu: convert call_rcu(__gen_kill_estimator) to kfree_rcu() The rcu callback __gen_kill_estimator() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(__gen_kill_estimator). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit bceb0f4512d448763fe98c9f37504c98bbebbed6 Author: Lai Jiangshan Date: Fri Mar 18 11:42:34 2011 +0800 net,rcu: convert call_rcu(__leaf_info_free_rcu) to kfree_rcu() The rcu callback __leaf_info_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(__leaf_info_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 4670994d150a86ebd53ab353a2af517c5465bfaf Author: Lai Jiangshan Date: Fri Mar 18 11:42:11 2011 +0800 net,rcu: convert call_rcu(fc_rport_free_rcu) to kfree_rcu() The rcu callback fc_rport_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(fc_rport_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 3acb458c32293405cf68985b7b3ac5dc0a5e7929 Author: Lai Jiangshan Date: Fri Mar 18 12:11:07 2011 +0800 security,rcu: convert call_rcu(user_update_rcu_disposal) to kfree_rcu() The rcu callback user_update_rcu_disposal() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(user_update_rcu_disposal). Signed-off-by: Lai Jiangshan Signed-off-by: Paul E. McKenney Acked-by: David Howells Reviewed-by: Josh Triplett commit 75ef0368d182785c7c5c06ac11081e31257a313e Author: Lai Jiangshan Date: Tue Mar 15 18:11:46 2011 +0800 net,act_police,rcu: remove rcu_barrier() There is no callback of this module maybe queued since we use kfree_rcu(), we can safely remove the rcu_barrier(). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 1e547757eca3c30eeeac526716e4ae833c2a9a2f Author: Lai Jiangshan Date: Tue Mar 15 18:10:12 2011 +0800 net,rcu: convert call_rcu(dn_dev_free_ifa_rcu) to kfree_rcu() The rcu callback dn_dev_free_ifa_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(dn_dev_free_ifa_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 217f18639bc18ba4bbb67481113037344c148938 Author: Lai Jiangshan Date: Tue Mar 15 18:08:58 2011 +0800 net,rcu: convert call_rcu(ha_rcu_free) to kfree_rcu() The rcu callback ha_rcu_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(ha_rcu_free). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 1231f0baa547a541a7481119323b7f964dda4788 Author: Lai Jiangshan Date: Tue Mar 15 18:05:02 2011 +0800 net,rcu: convert call_rcu(sctp_local_addr_free) to kfree_rcu() The rcu callback sctp_local_addr_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(sctp_local_addr_free). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 37b6b935e96e837ccc60812c03e9f92e7dce2e61 Author: Lai Jiangshan Date: Tue Mar 15 18:01:42 2011 +0800 net,rcu: convert call_rcu(listeners_free_rcu) to kfree_rcu() The rcu callback listeners_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(listeners_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit e57859854aaac9b7ce0db3fe4959190a1a80b60a Author: Lai Jiangshan Date: Tue Mar 15 18:00:14 2011 +0800 net,rcu: convert call_rcu(inet6_ifa_finish_destroy_rcu) to kfree_rcu() The rcu callback inet6_ifa_finish_destroy_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(inet6_ifa_finish_destroy_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 38f57d1a4b4c13db92c7f6300a4e4ae70ff94d2b Author: Lai Jiangshan Date: Tue Mar 15 17:59:14 2011 +0800 net,rcu: convert call_rcu(in6_dev_finish_destroy_rcu) to kfree_rcu() The rcu callback in6_dev_finish_destroy_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(in6_dev_finish_destroy_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 5957b1ac52c5fde2afe5d80abe2a1cb82a9b4f20 Author: Lai Jiangshan Date: Tue Mar 15 17:58:00 2011 +0800 net,rcu: convert call_rcu(tcf_police_free_rcu) to kfree_rcu() [PATCH 05/17] net,rcu: convert call_rcu(tcf_police_free_rcu) to kfree_rcu() The rcu callback tcf_police_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(tcf_police_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit f5c8593c107500979909bd51c85e74bb2eaffbaa Author: Lai Jiangshan Date: Tue Mar 15 17:57:04 2011 +0800 net,rcu: convert call_rcu(tcf_common_free_rcu) to kfree_rcu() The rcu callback tcf_common_free_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(tcf_common_free_rcu). Signed-off-by: Lai Jiangshan Acked-by: David S. Miller Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 025cea99db3fb110ebc8ede5ff647833fab9574f Author: Lai Jiangshan Date: Tue Mar 15 17:56:10 2011 +0800 cgroup,rcu: convert call_rcu(__free_css_id_cb) to kfree_rcu() The rcu callback __free_css_id_cb() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(__free_css_id_cb). Signed-off-by: Lai Jiangshan Acked-by: Paul Menage Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit f2da1c40dc003939f616f27a103b2592f1424b07 Author: Lai Jiangshan Date: Tue Mar 15 17:55:16 2011 +0800 cgroup,rcu: convert call_rcu(free_cgroup_rcu) to kfree_rcu() The rcu callback free_cgroup_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(free_cgroup_rcu). Signed-off-by: Lai Jiangshan Acked-by: Paul Menage Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 30088ad815802f850f26114920ccf9effd4bc520 Author: Lai Jiangshan Date: Tue Mar 15 17:53:46 2011 +0800 cgroup,rcu: convert call_rcu(free_css_set_rcu) to kfree_rcu() The rcu callback free_css_set_rcu() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(free_css_set_rcu). Signed-off-by: Lai Jiangshan Acked-by: Paul Menage Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 1217ed1ba5c67393293dfb0f03c353b118dadeb4 Author: Paul E. McKenney Date: Wed May 4 21:43:49 2011 -0700 rcu: permit rcu_read_unlock() to be called while holding runqueue locks Avoid calling into the scheduler while holding core RCU locks. This allows rcu_read_unlock() to be called while holding the runqueue locks, but only as long as there was no chance of the RCU read-side critical section having been preempted. (Otherwise, if RCU priority boosting is enabled, rcu_read_unlock() might call into the scheduler in order to unboost itself, which might allows self-deadlock on the runqueue locks within the scheduler.) Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney commit 8fab6af2156c0100f953fd61f4e0b2f82c9776dc Author: Luis R. Rodriguez Date: Fri May 6 15:00:09 2011 -0700 x86: Fix mrst sparse complaints Fix these Sparse complaints: CHECK arch/x86/platform/mrst/mrst.c arch/x86/platform/mrst/mrst.c:197:13: warning: symbol 'mrst_time_init' was not declared. Should it be static? arch/x86/platform/mrst/mrst.c:219:16: warning: symbol 'mrst_arch_setup' was not declared. Should it be static? Acked-by: Alan Cox Cc: Roman Gezikov Cc: Joonas Viskari Cc: Andrew Morton Cc: Allen Kao Signed-off-by: Luis R. Rodriguez Link: http://lkml.kernel.org/r/1304719209-26913-1-git-send-email-lrodriguez@atheros.com Signed-off-by: Ingo Molnar commit 4cb1f43ce8c72ee453c00fcb9f6ee9c4ebd03f98 Merge: 9de4966 0ee5623f Author: Ingo Molnar Date: Sat May 7 10:51:38 2011 +0200 Merge commit 'v2.6.39-rc6' into x86/cleanups Merge reason: move to a (much) newer upstream base. Signed-off-by: Ingo Molnar commit 63dc355a5a8cf296e2b1cc2e4192190dca221129 Author: Wanlong Gao Date: Thu May 5 07:55:37 2011 +0800 driver core: remove the driver-model structures from the documentation Remove the struct bus_type, class, device, device_driver from the driver-model docs. With another patch add them to device.h, since they are out of date. That will keep things up to date and provide a better way to document this stuff. Signed-off-by: Wanlong Gao Acked-by: Harry Wei Signed-off-by: Greg Kroah-Hartman commit 880ffb5c6c5c8c8c6efd9efe9355317322b4603b Author: Wanlong Gao Date: Thu May 5 07:55:36 2011 +0800 driver core: Add the device driver-model structures to kerneldoc Add the comments to the structure bus_type, device_driver, device, class to device.h for generating the driver-model kerneldoc. With another patch these all removed from the files in Documentation/driver-model/ since they are out of date. That will keep things up to date and provide a better way to document this stuff. Signed-off-by: Wanlong Gao Acked-by: Harry Wei Signed-off-by: Greg Kroah-Hartman commit 3ccff540070b5adde7eec443676cfee1dd6b89fd Author: Harry Wei Date: Thu May 5 09:47:48 2011 +0800 Translated Documentation/email-clients.txt The patch includes the translation Documentation/email-clients.txt. If anyone has other problems, please let me know. Signed-off-by: Harry Wei Signed-off-by: Greg Kroah-Hartman commit 9333744dc7dcd85531cff13cabf1d5d6baf18e7d Author: Robert P. J. Day Date: Wed May 4 05:19:34 2011 -0400 RAW driver: Remove call to kobject_put(). If cdev_add() fails, there is no justification for subsequently calling kobject_put(). Signed-off-by: Robert P. J. Day Signed-off-by: Greg Kroah-Hartman commit b50fa7c8077c625919b1e0a75fc37b825f024518 Author: Kay Sievers Date: Thu May 5 13:32:05 2011 +0200 reboot: disable usermodehelper to prevent fs access In case CONFIG_UEVENT_HELPER_PATH is not set to "", which it should be on every system, the kernel forks processes during shutdown, which try to access the rootfs, even when the binary does not exist. It causes exceptions and long delays in the disk driver, which gets read requests at the time it tries to shut down the disk. This patch disables all kernel-forked processes during reboot to allow a clean poweroff. Cc: Tejun Heo Tested-By: Anton Guda Signed-off-by: Kay Sievers Signed-off-by: Greg Kroah-Hartman commit aabb6e1531b0c78423dca6a39620892249fef7f9 Author: Randy Dunlap Date: Fri May 6 13:27:41 2011 -0700 efivars: prevent oops on unload when efi is not enabled efivars_exit() should check for efi_enabled and not undo allocations when efi is not enabled. Otherwise there is an Oops during module unload: calling efivars_init+0x0/0x1000 [efivars] @ 2810 EFI Variables Facility v0.08 2004-May-17 initcall efivars_init+0x0/0x1000 [efivars] returned 0 after 5120 usecs Oops: 0000 [#1] SMP DEBUG_PAGEALLOC last sysfs file: /sys/module/firmware_class/initstate CPU 1 Modules linked in: efivars(-) af_packet tun nfsd lockd nfs_acl auth_rpcgss sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq freq_table mperf binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep mousedev snd_seq joydev snd_seq_device mac_hid evdev snd_pcm usbkbd usbmouse usbhid snd_timer hid tg3 snd sr_mod pcspkr rtc_cmos soundcore cdrom iTCO_wdt processor sg dcdbas i2c_i801 rtc_core iTCO_vendor_support intel_agp snd_page_alloc thermal_sys rtc_lib intel_gtt 8250_pnp button hwmon unix ide_pci_generic ide_core ata_generic pata_acpi ata_piix sd_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core firmware_class ehci_hcd usbcore [last unloaded: dell_rbu] Pid: 2812, comm: rmmod Not tainted 2.6.39-rc6 #1 Dell Inc. OptiPlex 745 /0TY565 RIP: 0010:[] [] unregister_efivars+0x28/0x12c [efivars] RSP: 0018:ffff88005eedde98 EFLAGS: 00010283 RAX: ffffffffa06a23fc RBX: ffffffffa06a44c0 RCX: ffff88007c227a50 RDX: 0000000000000000 RSI: 00000055ac13db78 RDI: ffffffffa06a44c0 RBP: ffff88005eeddec8 R08: 0000000000000000 R09: ffff88005eeddd78 R10: ffffffffa06a4220 R11: ffff88005eeddd78 R12: fffffffffffff7d0 R13: 00007fff5a3aaec0 R14: 0000000000000000 R15: ffffffffa06a4508 FS: 00007fa8dcc4a6f0(0000) GS:ffff88007c200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 000000005d148000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process rmmod (pid: 2812, threadinfo ffff88005eedc000, task ffff88006754b000) Stack: ffff88005eeddec8 ffffffffa06a4220 0000000000000000 00007fff5a3aaec0 0000000000000000 0000000000000001 ffff88005eedded8 ffffffffa06a2418 ffff88005eeddf78 ffffffff810d3598 ffffffffa06a4220 0000000000000880 Call Trace: [] efivars_exit+0x1c/0xc04 [efivars] [] sys_delete_module+0x2d6/0x368 [] ? lockdep_sys_exit_thunk+0x35/0x67 [] ? audit_syscall_entry+0x172/0x1a5 [] system_call_fastpath+0x16/0x1b Code: 5c c9 c3 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 0f 1f 44 00 00 4c 8b 67 48 48 89 fb 4c 8d 7f 48 49 81 ec 30 08 00 00 <4d> 8b ac 24 30 08 00 00 49 81 ed 30 08 00 00 eb 59 48 89 df 48 RIP [] unregister_efivars+0x28/0x12c [efivars] RSP CR2: 0000000000000000 ---[ end trace aa99b99090f70baa ]--- Matt apparently removed such a check in 2004 (with no reason given): * 17 May 2004 - Matt Domsch * remove check for efi_enabled in exit but there have been several changes since then. Signed-off-by: Randy Dunlap Signed-off-by: Mike Waychison Tested-by: Randy Dunlap Cc: Matt Domsch Cc: Signed-off-by: Greg Kroah-Hartman commit 0078bff5283d1fd6417b840eda6dab912b7a5560 Author: Jan Kara Date: Fri Apr 29 00:24:29 2011 +0200 Allow setting of number of raw devices as a module parameter Allow setting of maximal number of raw devices as a module parameter. This requires changing of static array into a vmalloced one (the array is going to be too large for kmalloc). Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 57d524154ffe99d27fb55e0e30ddbad9f4c35806 Merge: e04d1b2 c63ca0c Author: Ingo Molnar Date: Fri May 6 21:07:33 2011 +0200 Merge branch 'perf/stat' into perf/core Merge reason: the perf stat improvements are tested and ready now. Signed-off-by: Ingo Molnar commit 61516587513c84ac26e68e3ab008dc6e965d0378 Author: Rob Landley Date: Fri May 6 09:27:36 2011 -0700 Correct occurrences of - Documentation/kvm/ to Documentation/virtual/kvm - Documentation/uml/ to Documentation/virtual/uml - Documentation/lguest/ to Documentation/virtual/lguest throughout the kernel source tree. Signed-off-by: Rob Landley Signed-off-by: Randy Dunlap commit 570aa13a5dbea9b905b4cd6315aa6e1b3a9fd2f8 Author: Rob Landley Date: Fri May 6 09:23:23 2011 -0700 Add a 00-INDEX file to Documentation/virtual Remove uml from the top level 00-INDEX file. Signed-off-by: Rob Landley Signed-off-by: Randy Dunlap commit ed16648eb5b86917f0b90bdcdbc857202da72f90 Author: Rob Landley Date: Fri May 6 09:22:02 2011 -0700 Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E: cd Documentation mkdir virtual git mv kvm uml lguest virtual Signed-off-by: Rob Landley Signed-off-by: Randy Dunlap commit e04d1b23f9706186187dcb0be1a752e48dcc540b Author: Lin Ming Date: Fri May 6 07:14:02 2011 +0000 perf events, x86: Add SandyBridge stalled-cycles-frontend/backend events Extend the Intel SandyBridge PMU driver with definitions for generic front-end and back-end stall events. ( As commit 3011203 "perf events, x86: Add Westmere stalled-cycles-frontend/backend events" says, these are only approximations. ) Signed-off-by: Lin Ming Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Steven Rostedt Link: http://lkml.kernel.org/r/1304666042-17577-1-git-send-email-ming.m.lin@intel.com Signed-off-by: Ingo Molnar commit 7142d17e8f935fa842e9f6eece2281b6d41625d6 Author: Hillf Danton Date: Thu May 5 20:53:20 2011 +0800 sched: Shorten the construction of the span cpu mask of sched domain For a given node, when constructing the cpumask for its sched_domain to span, if there is no best node available after searching, further efforts could be saved, based on small change in the return value of find_next_best_node(). Signed-off-by: Hillf Danton Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Yong Zhang Link: http://lkml.kernel.org/r/BANLkTi%3DqPWxRAa6%2BdT3ohEP6Z%3D0v%2Be4EXA@mail.gmail.com Signed-off-by: Ingo Molnar commit 4934a4d3d3fa775601a9f1b35cc0e2aa93f81355 Author: Rakib Mullick Date: Wed May 4 22:53:46 2011 +0600 sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG cfs_rq->nr_spread_over is only used when CONFIG_SCHED_DEBUG is set. So wrap it with CONFIG_SCHED_DEBUG. Signed-off-by: Rakib Mullick Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1304528026.15681.3.camel@localhost.localdomain Signed-off-by: Ingo Molnar commit 29ce831000081dd757d3116bf774aafffc4b6b20 Author: Gleb Natapov Date: Wed May 4 16:31:03 2011 +0300 rcu: provide rcu_virt_note_context_switch() function. Provide rcu_virt_note_context_switch() for vitalization use to note quiescent state during guest entry. Signed-off-by: Gleb Natapov Signed-off-by: Paul E. McKenney commit bad6e1393cb505fe17747344a23666464daa3fa7 Author: Paul E. McKenney Date: Mon May 2 23:40:04 2011 -0700 rcu: get rid of signed overflow in check_cpu_stall() Signed integer overflow is undefined by the C standard, so move calculations to unsigned. Signed-off-by: Paul E. McKenney commit b554d7de8d112fca4188da3bf0d7f8b56f42fb95 Author: Eric Dumazet Date: Thu Apr 28 07:23:45 2011 +0200 rcu: optimize rcutiny rcu_sched_qs() currently calls local_irq_save()/local_irq_restore() up to three times. Remove irq masking from rcu_qsctr_help() / invoke_rcu_kthread() and do it once in rcu_sched_qs() / rcu_bh_qs() This generates smaller code as well. text data bss dec hex filename 2314 156 24 2494 9be kernel/rcutiny.old.o 2250 156 24 2430 97e kernel/rcutiny.new.o Fix an outdated comment for rcu_qsctr_help() Move invoke_rcu_kthread() definition before its use. Signed-off-by: Eric Dumazet Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 2655d57ef35aa327a2e58a1c5dc7b65c65003f4e Author: Paul E. McKenney Date: Thu Apr 7 22:47:23 2011 -0700 rcu: prevent call_rcu() from diving into rcu core if irqs disabled This commit marks a first step towards making call_rcu() have real-time behavior. If irqs are disabled, don't dive into the RCU core. Later on, this new early exit will wake up the per-CPU kthread, which first must be modified to handle the cases involving callback storms. Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit baa1ae0c9f1c618bc60706efa75fef3508bcee58 Author: Paul E. McKenney Date: Sat Mar 26 22:01:35 2011 -0700 rcu: further lower priority in rcu_yield() Although rcu_yield() dropped from real-time to normal priority, there is always the possibility that the competing tasks have been niced. So nice to 19 in rcu_yield() to help ensure that other tasks have a better chance of running. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 9ab1544eb4196ca8d05c433b2eb56f74496b1ee3 Author: Lai Jiangshan Date: Fri Mar 18 11:15:47 2011 +0800 rcu: introduce kfree_rcu() Many rcu callbacks functions just call kfree() on the base structure. These functions are trivial, but their size adds up, and furthermore when they are used in a kernel module, that module must invoke the high-latency rcu_barrier() function at module-unload time. The kfree_rcu() function introduced by this commit addresses this issue. Rather than encoding a function address in the embedded rcu_head structure, kfree_rcu() instead encodes the offset of the rcu_head structure within the base structure. Because the functions are not allowed in the low-order 4096 bytes of kernel virtual memory, offsets up to 4095 bytes can be accommodated. If the offset is larger than 4095 bytes, a compile-time error will be generated in __kfree_rcu(). If this error is triggered, you can either fall back to use of call_rcu() or rearrange the structure to position the rcu_head structure into the first 4096 bytes. Note that the allowable offset might decrease in the future, for example, to allow something like kmem_cache_free_rcu(). The new kfree_rcu() function can replace code as follows: call_rcu(&p->rcu, simple_kfree_callback); where "simple_kfree_callback()" might be defined as follows: void simple_kfree_callback(struct rcu_head *p) { struct foo *q = container_of(p, struct foo, rcu); kfree(q); } with the following: kfree_rcu(&p->rcu, rcu); Note that the "rcu" is the name of a field in the structure being freed. The reason for using this rather than passing in a pointer to the base structure is that the above approach allows better type checking. This commit is based on earlier work by Lai Jiangshan and Manfred Spraul: Lai's V1 patch: http://lkml.org/lkml/2008/9/18/1 Manfred's patch: http://lkml.org/lkml/2009/1/2/115 Signed-off-by: Lai Jiangshan Signed-off-by: Manfred Spraul Signed-off-by: Paul E. McKenney Reviewed-by: David Howells Reviewed-by: Josh Triplett commit 6cc68793e380bb51f447d8d02af873b7bc01f222 Author: Paul E. McKenney Date: Wed Mar 2 13:15:15 2011 -0800 rcu: fix spelling The "preemptible" spelling is preferable. May as well fix it. Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 13491a0ee1ef862b6c842132b6eb9c5e721af5ad Author: Lai Jiangshan Date: Fri Feb 25 11:37:59 2011 -0800 rcu: call __rcu_read_unlock() in exit_rcu for tree RCU Using __rcu_read_lock() in place of rcu_read_lock() leaves any debug state as it really should be, namely with the lock still held. Signed-off-by: Lai Jiangshan Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 7e8b4c72344e0d904b0e3fa9fd2eb116f04b3d41 Author: Paul E. McKenney Date: Thu Feb 24 19:26:21 2011 -0800 rcu: Converge TINY_RCU expedited and normal boosting This applies a trick from TREE_RCU boosting to TINY_RCU, eliminating code and adding comments. The key point is that it is possible for the booster thread itself to work out whether there is a normal or expedited boost required based solely on local information. There is therefore no need for boost initiation to know or care what type of boosting is required. In addition, when boosting is complete for a given grace period, then by definition there cannot be any more boosting for that grace period. This allows eliminating yet more state and statistics. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 203373c81b83e98da82836c4b8b5dd1e6fc9011f Author: Paul E. McKenney Date: Thu Feb 24 15:25:21 2011 -0800 rcu: remove useless ->boosted_this_gp field The ->boosted_this_gp field is a holdover from an earlier design that was to carry out multiple boost operations in parallel. It is not required by the current design, which boosts one task at a time. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney commit ddeb75814f09205df795121d9e373e82de7f2aca Author: Paul E. McKenney Date: Wed Feb 23 17:03:06 2011 -0800 rcu: code cleanups in TINY_RCU priority boosting. Extraneous semicolon, bad comment, and fold INIT_LIST_HEAD() into list_del() to get list_del_init(). Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit f0a07aeaf8935b7e9ef8032ce6546169f143951c Author: Paul E. McKenney Date: Wed Feb 23 11:10:52 2011 -0800 rcu: Switch to this_cpu() primitives This removes a couple of lines from invoke_rcu_cpu_kthread(), improving readability. Reported-by: Christoph Lameter Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 108aae22339f445c134aeb48eca25df1014ab08d Author: Paul E. McKenney Date: Wed Feb 23 09:56:00 2011 -0800 rcu: Use WARN_ON_ONCE for DEBUG_OBJECTS_RCU_HEAD warnings Avoid additional multiple-warning confusion in memory-corruption scenarios. Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 561190e3b3db372403fb6a327b0121b4cae1b87e Author: Paul E. McKenney Date: Wed Mar 30 09:10:44 2011 -0700 rcu: mark rcutorture boosting callback as being on-stack The CONFIG_DEBUG_OBJECTS_RCU_HEAD facility requires that on-stack RCU callbacks be flagged explicitly to debug-objects using the init_rcu_head_on_stack() and destroy_rcu_head_on_stack() functions. This commit applies those functions to the rcutorture code that tests RCU priority boosting. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit b0c9d7ff2793502650ad987c3f237d5fe5587a1e Author: Paul E. McKenney Date: Tue Mar 29 12:56:56 2011 -0700 rcu: add DEBUG_OBJECTS_RCU_HEAD check for alignment Verify that rcu_head structures are aligned to a four-byte boundary. This check is enabled by CONFIG_DEBUG_OBJECTS_RCU_HEAD. Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit fc2ecf7ec76c5ee150b83dcefc863fa03fd365fb Author: Mathieu Desnoyers Date: Wed Feb 23 09:42:14 2011 -0800 rcu: Enable DEBUG_OBJECTS_RCU_HEAD from !PREEMPT The prohibition of DEBUG_OBJECTS_RCU_HEAD from !PREEMPT was due to the fixup actions. So just produce a warning from !PREEMPT. Signed-off-by: Mathieu Desnoyers Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 5ece5bab3ed8594ce2c85c6c6e6b82109db36ca7 Author: Paul E. McKenney Date: Fri Apr 22 18:08:51 2011 -0700 rcu: Add forward-progress diagnostic for per-CPU kthreads Increment a per-CPU counter on each pass through rcu_cpu_kthread()'s service loop, and add it to the rcudata trace output. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 15ba0ba860871cf74b48b1bb47c26c91a66126f3 Author: Paul E. McKenney Date: Wed Apr 6 16:01:16 2011 -0700 rcu: add grace-period age and more kthread state to tracing This commit adds the age in jiffies of the current grace period along with the duration in jiffies of the longest grace period since boot to the rcu/rcugp debugfs file. It also adds an additional "O" state to kthread tracing to differentiate between the kthread waiting due to having nothing to do on the one hand and waiting due to being on the wrong CPU on the other hand. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney commit a9f4793d8900dc5dc09b3951bdcd4731290e06fe Author: Paul E. McKenney Date: Mon May 2 03:46:10 2011 -0700 rcu: fix tracing bug thinko on boost-balk attribution The rcu_initiate_boost_trace() function mis-attributed refusals to initiate RCU priority boosting that were in fact due to its not yet being time to boost. This patch fixes the faulty comparison. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney commit 90e6ac3657fd3b0446d585082000af3cf46439a7 Author: Paul E. McKenney Date: Wed Apr 6 15:20:47 2011 -0700 rcu: update tracing documentation for new rcutorture and rcuboost This commit documents the new debugfs rcu/rcutorture and rcu/rcuboost trace files. The description has been updated as suggested by Josh Triplett. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney commit 4a29865689dbb87a02e3b0fff4a4ae5041273173 Author: Paul E. McKenney Date: Sun Apr 3 21:33:51 2011 -0700 rcu: make rcutorture version numbers available through debugfs It is not possible to accurately correlate rcutorture output with that of debugfs. This patch therefore adds a debugfs file that prints out the rcutorture version number, permitting easy correlation. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit d71df90eadfc35aa549ff9a850842673febca71f Author: Paul E. McKenney Date: Tue Mar 29 17:48:28 2011 -0700 rcu: add tracing for RCU's kthread run states. Add tracing to help debugging situations when RCU's kthreads are not running but are supposed to be. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 0ac3d136b2e3cdf1161178223bc5da14a06241d0 Author: Paul E. McKenney Date: Mon Mar 28 15:47:07 2011 -0700 rcu: add callback-queue information to rcudata output This commit adds an indication of the state of the callback queue using a string of four characters following the "ql=" integer queue length. The first character is "N" if there are callbacks that have been queued that are not yet ready to be handled by the next grace period, or "." otherwise. The second character is "R" if there are callbacks queued that are ready to be handled by the next grace period, or "." otherwise. The third character is "W" if there are callbacks waiting for the current grace period, or "." otherwise. Finally, the fourth character is "D" if there are callbacks that have been handled by a prior grace period and are waiting to be invoked, or ".". Note that callbacks that are in the process of being invoked are not shown. These callbacks would have been removed from the rcu_data structure's list by rcu_do_batch() prior to being executed. (These callbacks are also not reflected in the "ql=" total, FWIW.) Also, document the new callback-queue trace information. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 2fa218d8bbcff239302f9f36e19d7187077dd636 Author: Paul E. McKenney Date: Sun Mar 27 21:37:58 2011 -0700 rcu: Update RCU's trace.txt documentation for new format The trace.txt file had obsolete output for the debugfs rcu/rcudata file, so update it. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 0ea1f2ebeb217d38770aebf91c4ecaa8e01b3305 Author: Paul E. McKenney Date: Tue Feb 22 13:42:43 2011 -0800 rcu: Add boosting to TREE_PREEMPT_RCU tracing Includes total number of tasks boosted, number boosted on behalf of each of normal and expedited grace periods, and statistics on attempts to initiate boosting that failed for various reasons. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 67b98dba474f293c389fc2b7254dcf7c0492e3bd Author: Paul E. McKenney Date: Mon Feb 21 13:31:55 2011 -0800 rcu: eliminate unused boosting statistics The n_rcu_torture_boost_allocerror and n_rcu_torture_boost_afferror statistics are not actually incremented anymore, so eliminate them. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 3acf4a9a3d63f23430f940842829175b0778a1b8 Author: Paul E. McKenney Date: Sun Apr 17 23:45:23 2011 -0700 rcu: avoid hammering sched with yet another bound RT kthread The scheduler does not appear to take kindly to having multiple real-time threads bound to a CPU that is going offline. So this commit is a temporary hack-around to avoid that happening. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney commit e3995a25fa361ce987a7d0ade00b17e3151519d7 Author: Paul E. McKenney Date: Mon Apr 18 15:31:26 2011 -0700 rcu: put per-CPU kthread at non-RT priority during CPU hotplug operations If you are doing CPU hotplug operations, it is best not to have CPU-bound realtime tasks running CPU-bound on the outgoing CPU. So this commit makes per-CPU kthreads run at non-realtime priority during that time. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 0f962a5e7277c34987b77dc82fc9aefcedc95e27 Author: Paul E. McKenney Date: Thu Apr 14 12:13:53 2011 -0700 rcu: Force per-rcu_node kthreads off of the outgoing CPU The scheduler has had some heartburn in the past when too many real-time kthreads were affinitied to the outgoing CPU. So, this commit lightens the load by forcing the per-rcu_node and the boost kthreads off of the outgoing CPU. Note that RCU's per-CPU kthread remains on the outgoing CPU until the bitter end, as it must in order to preserve correctness. Also avoid disabling hardirqs across calls to set_cpus_allowed_ptr(), given that this function can block. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney commit 27f4d28057adf98750cf863c40baefb12f5b6d21 Author: Paul E. McKenney Date: Mon Feb 7 12:47:15 2011 -0800 rcu: priority boosting for TREE_PREEMPT_RCU Add priority boosting for TREE_PREEMPT_RCU, similar to that for TINY_PREEMPT_RCU. This is enabled by the default-off RCU_BOOST kernel parameter. The priority to which to boost preempted RCU readers is controlled by the RCU_BOOST_PRIO kernel parameter (defaulting to real-time priority 1) and the time to wait before boosting the readers who are blocking a given grace period is controlled by the RCU_BOOST_DELAY kernel parameter (defaulting to 500 milliseconds). Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit a26ac2455ffcf3be5c6ef92bc6df7182700f2114 Author: Paul E. McKenney Date: Wed Jan 12 14:10:23 2011 -0800 rcu: move TREE_RCU from softirq to kthread If RCU priority boosting is to be meaningful, callback invocation must be boosted in addition to preempted RCU readers. Otherwise, in presence of CPU real-time threads, the grace period ends, but the callbacks don't get invoked. If the callbacks don't get invoked, the associated memory doesn't get freed, so the system is still subject to OOM. But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit moves the callback invocations to a kthread, which can be boosted easily. Also add comments and properly synchronized all accesses to rcu_cpu_kthread_task, as suggested by Lai Jiangshan. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 12f5f524cafef3ab689929b118f2dfb8bf2be321 Author: Paul E. McKenney Date: Mon Nov 29 21:56:39 2010 -0800 rcu: merge TREE_PREEPT_RCU blocked_tasks[] lists Combine the current TREE_PREEMPT_RCU ->blocked_tasks[] lists in the rcu_node structure into a single ->blkd_tasks list with ->gp_tasks and ->exp_tasks tail pointers. This is in preparation for RCU priority boosting, which will add a third dimension to the combinatorial explosion in the ->blocked_tasks[] case, but simply a third pointer in the new ->blkd_tasks case. Also update documentation to reflect blocked_tasks[] merge Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit e59fb3120becfb36b22ddb8bd27d065d3cdca499 Author: Paul E. McKenney Date: Tue Sep 7 10:38:22 2010 -0700 rcu: Decrease memory-barrier usage based on semi-formal proof Commit d09b62d fixed grace-period synchronization, but left some smp_mb() invocations in rcu_process_callbacks() that are no longer needed, but sheer paranoia prevented them from being removed. This commit removes them and provides a proof of correctness in their absence. It also adds a memory barrier to rcu_report_qs_rsp() immediately before the update to rsp->completed in order to handle the theoretical possibility that the compiler or CPU might move massive quantities of code into a lock-based critical section. This also proves that the sheer paranoia was not entirely unjustified, at least from a theoretical point of view. In addition, the old dyntick-idle synchronization depended on the fact that grace periods were many milliseconds in duration, so that it could be assumed that no dyntick-idle CPU could reorder a memory reference across an entire grace period. Unfortunately for this design, the addition of expedited grace periods breaks this assumption, which has the unfortunate side-effect of requiring atomic operations in the functions that track dyntick-idle state for RCU. (There is some hope that the algorithms used in user-level RCU might be applied here, but some work is required to handle the NMIs that user-space applications can happily ignore. For the short term, better safe than sorry.) This proof assumes that neither compiler nor CPU will allow a lock acquisition and release to be reordered, as doing so can result in deadlock. The proof is as follows: 1. A given CPU declares a quiescent state under the protection of its leaf rcu_node's lock. 2. If there is more than one level of rcu_node hierarchy, the last CPU to declare a quiescent state will also acquire the ->lock of the next rcu_node up in the hierarchy, but only after releasing the lower level's lock. The acquisition of this lock clearly cannot occur prior to the acquisition of the leaf node's lock. 3. Step 2 repeats until we reach the root rcu_node structure. Please note again that only one lock is held at a time through this process. The acquisition of the root rcu_node's ->lock must occur after the release of that of the leaf rcu_node. 4. At this point, we set the ->completed field in the rcu_state structure in rcu_report_qs_rsp(). However, if the rcu_node hierarchy contains only one rcu_node, then in theory the code preceding the quiescent state could leak into the critical section. We therefore precede the update of ->completed with a memory barrier. All CPUs will therefore agree that any updates preceding any report of a quiescent state will have happened before the update of ->completed. 5. Regardless of whether a new grace period is needed, rcu_start_gp() will propagate the new value of ->completed to all of the leaf rcu_node structures, under the protection of each rcu_node's ->lock. If a new grace period is needed immediately, this propagation will occur in the same critical section that ->completed was set in, but courtesy of the memory barrier in #4 above, is still seen to follow any pre-quiescent-state activity. 6. When a given CPU invokes __rcu_process_gp_end(), it becomes aware of the end of the old grace period and therefore makes any RCU callbacks that were waiting on that grace period eligible for invocation. If this CPU is the same one that detected the end of the grace period, and if there is but a single rcu_node in the hierarchy, we will still be in the single critical section. In this case, the memory barrier in step #4 guarantees that all callbacks will be seen to execute after each CPU's quiescent state. On the other hand, if this is a different CPU, it will acquire the leaf rcu_node's ->lock, and will again be serialized after each CPU's quiescent state for the old grace period. On the strength of this proof, this commit therefore removes the memory barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp(). The effect is to reduce the number of memory barriers by one and to reduce the frequency of execution from about once per scheduling tick per CPU to once per grace period. Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit a00e0d714fbded07a7a2254391ce9ed5a5cb9d82 Author: Paul E. McKenney Date: Tue Feb 8 17:14:39 2011 -0800 rcu: Remove conditional compilation for RCU CPU stall warnings The RCU CPU stall warnings can now be controlled using the rcu_cpu_stall_suppress boot-time parameter or via the same parameter from sysfs. There is therefore no longer any reason to have kernel config parameters for this feature. This commit therefore removes the RCU_CPU_STALL_DETECTOR and RCU_CPU_STALL_DETECTOR_RUNNABLE kernel config parameters. The RCU_CPU_STALL_TIMEOUT parameter remains to allow the timeout to be tuned and the RCU_CPU_STALL_VERBOSE parameter remains to allow task-stall information to be suppressed if desired. Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett commit 1a8e1463a49aaa452da1cefe184a00d4df47f1ef Author: Kees Cook Date: Wed May 4 08:38:56 2011 -0700 [CPUFREQ] remove redundant sprintf from request_module call. Since format string handling is part of request_module, there is no need to construct the module name. As such, drop the redundant sprintf and heap usage. Signed-off-by: Kees Cook Signed-off-by: Dave Jones commit 469057d587a9de2cd6087d71a008b908e785a5b6 Author: Karthigan Srinivasan Date: Fri Apr 1 17:34:47 2011 -0500 [CPUFREQ] cpufreq_stats.c: Fixed brace coding style issue Fixed brace coding style issue. Signed-off-by: Karthigan Srinivasan Signed-off-by: Dave Jones commit 98586ed8b8878e10691203687e89a42fa3355300 Author: steven finney Date: Mon May 2 11:29:17 2011 -0700 [CPUFREQ] Fix memory leak in cpufreq_stat When a CPU is taken offline in an SMP system, cpufreq_remove_dev() nulls out the per-cpu policy before cpufreq_stats_free_table() can make use of it. cpufreq_stats_free_table() then skips the call to sysfs_remove_group(), leaving about 100 bytes of sysfs-related memory unclaimed each time a CPU-removal occurs. Break up cpu_stats_free_table into sysfs and table portions, and call the sysfs portion early. Signed-off-by: Steven Finney Signed-off-by: Dave Jones Cc: stable@kernel.org commit 335dc3335ff692173bc766b248f7a97f3a23b30a Author: Thiago Farina Date: Thu Apr 28 20:42:53 2011 -0300 [CPUFREQ] cpufreq.h: Fix some checkpatch.pl coding style issues. Before: $ scripts/checkpatch.pl --file --terse include/linux/cpufreq.h total: 14 errors, 11 warnings, 419 lines checked After: $ scripts/checkpatch.pl --file --terse include/linux/cpufreq.h total: 2 errors, 4 warnings, 422 lines checked Signed-off-by: Thiago Farina Signed-off-by: Dave Jones commit 2d06d8c49afdcc9bb35a85039fa50f0fe35bd40e Author: Dominik Brodowski Date: Sun Mar 27 15:04:46 2011 +0200 [CPUFREQ] use dynamic debug instead of custom infrastructure With dynamic debug having gained the capability to report debug messages also during the boot process, it offers a far superior interface for debug messages than the custom cpufreq infrastructure. As a first step, remove the old cpufreq_debug_printk() function and replace it with a call to the generic pr_debug() function. How can dynamic debug be used on cpufreq? You need a kernel which has CONFIG_DYNAMIC_DEBUG enabled. To enabled debugging during runtime, mount debugfs and $ echo -n 'module cpufreq +p' > /sys/kernel/debug/dynamic_debug/control for debugging the complete "cpufreq" module. To achieve the same goal during boot, append ddebug_query="module cpufreq +p" as a boot parameter to the kernel of your choice. For more detailled instructions, please see Documentation/dynamic-debug-howto.txt Signed-off-by: Dominik Brodowski Signed-off-by: Dave Jones commit 27ecddc2a9f99ce4ac9a59a0acd77f7100b6d034 Author: Jacob Shin Date: Wed Apr 27 13:32:11 2011 -0500 [CPUFREQ] CPU hotplug, re-create sysfs directory and symlinks When we discover CPUs that are affected by each other's frequency/voltage transitions, the first CPU gets a sysfs directory created, and rest of the siblings get symlinks. Currently, when we hotplug off only the first CPU, all of the symlinks and the sysfs directory gets removed. Even though rest of the siblings are still online and functional, they are orphaned, and no longer governed by cpufreq. This patch, given the above scenario, creates a sysfs directory for the first sibling and symlinks for the rest of the siblings. Please note the recursive call, it was rather too ugly to roll it out. And the removal of redundant NULL setting (it is already taken care of near the top of the function). Signed-off-by: Jacob Shin Acked-by: Mark Langsdorf Reviewed-by: Thomas Renninger Signed-off-by: Dave Jones Cc: stable@kernel.org commit 904cc1e637a00dba1b58e7752f485f90ebf2a568 Author: Naga Chumbalkar Date: Tue Apr 26 17:05:18 2011 +0000 [CPUFREQ] Fix _OSC UUID in pcc-cpufreq UUID needs to be written out the way it is described in Sec 18.5.124 of ACPI 4.0a Specification. Platform firmware's use of this UUID/_OSC is optional, which is why we didn't notice this bug earlier. Signed-off-by: Naga Chumbalkar Signed-off-by: Dave Jones Cc: stable@kernel.org commit 931aeeda0dca81152aec48f30be01e86a268bf89 Author: Vladimir Davydov Date: Tue May 3 22:31:07 2011 +0400 sched: Remove unused 'this_best_prio arg' from balance_tasks() It's passed across multiple functions but is never really used, so remove it. Signed-off-by: Vladimir Davydov Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1304447467-29200-1-git-send-email-vdavydov@parallels.com Signed-off-by: Ingo Molnar commit e7e7ee2eab2080248084d71fe0a115ab745eb2aa Author: Ingo Molnar Date: Wed May 4 08:42:29 2011 +0200 perf events: Clean up definitions and initializers, update copyrights Fix a few inconsistent style bits that were added over the past few months. Cc: Peter Zijlstra Link: http://lkml.kernel.org/n/tip-yv4hwf9yhnzoada8pcpb3a97@git.kernel.org Signed-off-by: Ingo Molnar commit 48dbb6dc86ca5d1b2224937d774c7ba98bc3a485 Author: Borislav Petkov Date: Tue May 3 15:26:43 2011 +0200 hw breakpoints: Move to kernel/events/ As part of the events sybsystem unification, relocate hw_breakpoint.c into its new destination. Cc: Frederic Weisbecker Signed-off-by: Borislav Petkov commit fae85b7c8bcc7de9c0a2698587e20c15beb7d5a6 Author: Borislav Petkov Date: Tue Oct 26 20:24:03 2010 +0200 perf: Start the restructuring mv kernel/perf_event.c -> kernel/events/core.c. From there, all further sensible splitting can happen. The idea is that due to perf_event.c becoming pretty sizable and with the advent of the marriage with ftrace, splitting functionality into its logical parts should help speeding up the unification and to manage the complexity of the subsystem. Signed-off-by: Borislav Petkov commit 942c3c5c329274fa6de5998cb911cf3d0a42d0b1 Author: Mike Frysinger Date: Mon May 2 15:24:27 2011 -0400 hrtimer: Make lookup table const Signed-off-by: Mike Frysinger Link: http://lkml.kernel.org/r/%3C1304364267-14489-1-git-send-email-vapier%40gentoo.org%3E Signed-off-by: Thomas Gleixner commit 3687a2c0d81b23d30db4384ca804a701fc686e16 Merge: b4d246b c7bcecb Author: Thomas Gleixner Date: Mon May 2 21:37:03 2011 +0200 Merge branch 'linus' into timers/core Reason: Pick up the hrtimer_clock_to_base_table fix from mainline Signed-off-by: Thomas Gleixner commit b4d246b12410b53506c311e5e0b6abb71ead65c6 Author: John Stultz Date: Fri Apr 29 15:03:11 2011 -0700 RTC: Disable CONFIG_RTC_CLASS from being built as a module The RTC subsystem has a number of accessors that are available via include/linux/rtc.h. However many of these interfaces are not available for use if CONFIG_RTC_CLASS=m. So in order to support wider use of the RTC in the kernel, I'm removing the tristate config option for a bool, so that code can easily be conditionalized if the RTC class is present or not. Signed-off-by: John Stultz Cc: Ingo Molnar Signed-off-by: Thomas Gleixner commit 472647dcd7e351dbeda750e5ab3e8f7b06d1199a Author: John Stultz Date: Fri Apr 29 15:03:10 2011 -0700 timers: Fix alarmtimer build issues when CONFIG_RTC_CLASS=n Ingo pointed out that the alarmtimers won't build if CONFIG_RTC_CLASS=n. This patch adds proper ifdefs to the alarmtimer code to disable the rtc usage if it is not built in. Reported-by: Ingo Molnar Signed-off-by: John Stultz Signed-off-by: Thomas Gleixner commit c42321c76b0ef472e3bae4bfcb0f46ab19e038ef Author: Thomas Gleixner Date: Mon May 2 18:16:22 2011 +0200 genirq: Make generic irq chip depend on CONFIG_GENERIC_IRQ_CHIP Only compile it in when there are users. Signed-off-by: Thomas Gleixner Cc: linux-arm-kernel@lists.infradead.org commit e5a10c1bd12a5d71bbb6406c1b0dbbc9d8958397 Author: Yinghai Lu Date: Mon May 2 17:24:49 2011 +0200 x86, NUMA: Trim numa meminfo with max_pfn in a separate loop During testing 32bit numa unifying code from tj, found one system with more than 64g fails to use numa. It turns out we do not trim numa meminfo correctly against max_pfn in case start address of a node is higher than 64GiB. Bug fix made it to tip tree. This patch moves the checking and trimming to a separate loop. So we don't need to compare low/high in following merge loops. It makes the code more readable. Also it makes the node merge printouts less strange. On a 512GiB numa system with 32bit, before: > NUMA: Node 0 [0,a0000) + [100000,80000000) -> [0,80000000) > NUMA: Node 0 [0,80000000) + [100000000,1080000000) -> [0,1000000000) after: > NUMA: Node 0 [0,a0000) + [100000,80000000) -> [0,80000000) > NUMA: Node 0 [0,80000000) + [100000000,1000000000) -> [0,1000000000) Signed-off-by: Yinghai Lu [Updated patch description and comment slightly.] Signed-off-by: Tejun Heo commit a56bca80db8903bb557b9ac38da68dc5b98ea672 Author: Yinghai Lu Date: Mon May 2 17:24:49 2011 +0200 x86, NUMA: Rename setup_node_bootmem() to setup_node_data() After using memblock to replace bootmem, that function only sets up node_data now. Change the name to reflect what it actually does. tj: Minor adjustment to the patch description. Signed-off-by: Yinghai Lu Signed-off-by: Tejun Heo commit 1b7e03ef7570568d2fb9e6640d7006a0edd728f6 Author: Tejun Heo Date: Mon May 2 17:24:48 2011 +0200 x86, NUMA: Enable emulation on 32bit too Now that NUMA init path is unified, NUMA emulation can be enabled on 32bit. Make numa_emluation.c safe on 32bit by doing the followings. * Define MAX_DMA32_PFN on 32bit too. * Include bootmem.h for max_pfn declaration. * Use u64 explicitly and always use PFN_PHYS() when converting page number to address. * Avoid __udivdi3() generation on 32bit by doing number of pages calculation instead in split_nodes_interleave(). And drop X86_64 dependency from Kconfig. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 2706a0bf7b02693ed88752df877f10c2206292ff Author: Tejun Heo Date: Mon May 2 17:24:48 2011 +0200 x86, NUMA: Enable CONFIG_AMD_NUMA on 32bit too Now that NUMA init path is unified, amdtopology can be enabled on 32bit. Make amdtopology.c safe on 32bit by explicitly using u64 and drop X86_64 dependency from Kconfig. Inclusion of bootmem.h is added for max_pfn declaration. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit c6f58878204b0414e00b43f9bcebf754284f95b4 Author: Tejun Heo Date: Mon May 2 17:24:48 2011 +0200 x86, NUMA: Rename amdtopology_64.c to amdtopology.c amdtopology is going to be used by 32bit too drop _64 suffix. This is pure rename. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 752d4f372f90a2f6eb562aaffb639957890cbcab Author: Tejun Heo Date: Mon May 2 17:24:48 2011 +0200 x86, NUMA: Make numa_init_array() static numa_init_array() no longer has users outside of numa.c. Make it static. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit bd6709a91a593d8fe35d08da542e9f93bb74a304 Author: Tejun Heo Date: Mon May 2 17:24:48 2011 +0200 x86, NUMA: Make 32bit use common NUMA init path With both _numa_init() methods converted and the rest of init code adjusted, numa_32.c now can switch from the 32bit only init code to the common one in numa.c. * Shim get_memcfg_*()'s are dropped and initmem_init() calls x86_numa_init(), which is updated to handle NUMAQ. * All boilerplate operations including node range limiting, pgdat alloc/init are handled by numa_init(). 32bit only implementation is removed. * 32bit numa_add_memblk(), numa_set_distance() and memory_add_physaddr_to_nid() removed and common versions in numa_32.c enabled for 32bit. This change causes the following behavior changes. * NODE_DATA()->node_start_pfn/node_spanned_pages properly initialized for 32bit too. * Much more sanity checks and configuration cleanups. * Proper handling of node distances. * The same NUMA init messages as 64bit. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 7888e96b264fad27f97f58c0f3a4d20326eaf181 Author: Tejun Heo Date: Mon May 2 14:18:54 2011 +0200 x86, NUMA: Initialize and use remap allocator from setup_node_bootmem() setup_node_bootmem() is taken from 64bit and doesn't use remap allocator. It's about to be shared with 32bit so add support for it. If NODE_DATA is remapped, it's noted in the debug message and node locality check is skipped as the __pa() of the remapped address doesn't reflect the actual physical address. On 64bit, remap allocator becomes noop and doesn't affect the behavior. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 99cca492ea8ced305bfd687521ed69fb9e0147aa Author: Tejun Heo Date: Mon May 2 14:18:54 2011 +0200 x86-32, NUMA: Add @start and @end to init_alloc_remap() Instead of dereferencing node_start/end_pfn[] directly, make init_alloc_remap() take @start and @end and let the caller be responsible for making sure the range is sane. This is to prepare for use from unified NUMA init code. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 38f3e1ca24cc3ec416855e02676f91c898a8a262 Author: Tejun Heo Date: Mon May 2 14:18:53 2011 +0200 x86, NUMA: Remove long 64bit assumption from numa.c Code moved from numa_64.c has assumption that long is 64bit in several places. This patch removes the assumption by using {s|u}64_t explicity, using PFN_PHYS() for page number -> addr conversions and adjusting printf formats. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 744baba0c4072b04664952a89292e4708eaf949a Author: Tejun Heo Date: Mon May 2 14:18:53 2011 +0200 x86, NUMA: Enable build of generic NUMA init code on 32bit Generic NUMA init code was moved to numa.c from numa_64.c but is still guaraded by CONFIG_X86_64. This patch removes the compile guard and enables compiling on 32bit. * numa_add_memblk() and numa_set_distance() clash with the shim implementation in numa_32.c and are left out. * memory_add_physaddr_to_nid() clashes with 32bit implementation and is left out. * MAX_DMA_PFN definition in dma.h moved out of !CONFIG_X86_32. * node_data definition in numa_32.c removed in favor of the one in numa.c. There are places where ulong is assumed to be 64bit. The next patch will fix them up. Note that although the code is compiled it isn't used yet and this patch doesn't cause any functional change. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit a4106eae650a4d5d30fcdd36d998edfa5ccb0ec4 Author: Tejun Heo Date: Mon May 2 14:18:53 2011 +0200 x86, NUMA: Move NUMA init logic from numa_64.c to numa.c Move the generic 64bit NUMA init machinery from numa_64.c to numa.c. * node_data[], numa_mem_info and numa_distance * numa_add_memblk[_to](), numa_remove_memblk[_from]() * numa_set_distance() and friends * numa_init() and all the numa_meminfo handling helpers called from it * dummy_numa_init() * memory_add_physaddr_to_nid() A new function x86_numa_init() is added and the content of numa_64.c::initmem_init() is moved into it. initmem_init() now simply calls x86_numa_init(). Constants and numa_off declaration are moved from numa_{32|64}.h to numa.h. This is code reorganization and doesn't involve any functional change. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 299a180aec6a8ee3069cf0fe90d722ac20c1f837 Author: Tejun Heo Date: Mon May 2 14:18:53 2011 +0200 x86-32, NUMA: Update numaq to use new NUMA init protocol Update numaq such that it calls numa_add_memblk() and sets numa_nodes_parsed instead of directly diddling with NUMA states. The original get_memcfg_numaq() is renamed to numaq_numa_init() and new get_memcfg_numaq() is created in numa_32.c. The shim numa_add_memblk() implementation handles node_start/end_pfn[] and node_set_online() for nodes with memory. The new get_memcfg_numaq() exactly the same with get_memcfg_from_srat() other than calling the numaq init function. Things get_memcfgs_numaq() do are not strictly necessary for numaq but added for consistency and to help unifying NUMA init handling. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 5acd91ab837c9d066af7345aea6462dc55695db7 Author: Tejun Heo Date: Mon May 2 14:18:53 2011 +0200 x86-32, NUMA: Replace srat_32.c with srat.c SRAT support implementation in srat_32.c and srat.c are generally similar; however, there are some differences. First of all, 64bit implementation supports more types of SRAT entries. 64bit supports x2apic, affinity, memory and SLIT. 32bit only supports processor and memory. Most other differences stem from different initialization protocols employed by 64bit and 32bit NUMA init paths. On 64bit, * Mappings among PXM, node and apicid are directly done in each SRAT entry callback. * Memory affinity information is passed to numa_add_memblk() which takes care of all interfacing with NUMA init. * Doesn't directly initialize NUMA configurations. All the information is recorded in numa_nodes_parsed and memblks. On 32bit, * Checks numa_off. * Things go through one more level of indirection via private tables but eventually end up initializing the same mappings. * node_start/end_pfn[] are initialized and memblock_x86_register_active_regions() is called for each memory chunk. * node_set_online() is called for each online node. * sort_node_map() is called. There are also other minor differences in sanity checking and messages but taking 64bit version should be good enough. This patch drops the 32bit specific implementation and makes the 64bit implementation common for both 32 and 64bit. The init protocol differences are dealt with in two places - the numa_add_memblk() shim added in the previous patch and new temporary numa_32.c:get_memcfg_from_srat() which wraps invocation of x86_acpi_numa_init(). The shim numa_add_memblk() handles the folowings. * node_start/end_pfn[] initialization. * node_set_online() for memory nodes. * Invocation of memblock_x86_register_active_regions(). The shim get_memcfg_from_srat() handles the followings. * numa_off check. * node_set_online() for CPU nodes. * sort_node_map() invocation. * Clearing of numa_nodes_parsed and active_ranges on failure. The shims are temporary and will be removed as the generic NUMA init path in 32bit is replaced with 64bit one. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit b0d310801a4c1f95b44357e4ebc22a9903e3bf3d Author: Tejun Heo Date: Mon May 2 14:18:53 2011 +0200 x86-32, NUMA: implement temporary NUMA init shims To help transition to common NUMA init, implement temporary 32bit shims for numa_add_memblk() and numa_set_distance(). numa_add_memblk() registers the memblk and adjusts node_start/end_pfn[]. numa_set_distance() is noop. These shims will allow using 64bit NUMA init functions on 32bit and gradual transition to common NUMA init path. For detailed description, please read description of commits which make use of the shim functions. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit e6df595b37c7c033ef7400b4fdd382a2dc4f4131 Author: Tejun Heo Date: Mon May 2 14:18:53 2011 +0200 x86, NUMA: Move numa_nodes_parsed to numa.[hc] Move numa_nodes_parsed from numa_64.[hc] to numa.[hc] to prepare for NUMA init path unification. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit daf4f480ae24270bac06db4293908d36b4834e21 Author: Tejun Heo Date: Mon May 2 14:18:53 2011 +0200 x86-32, NUMA: Move get_memcfg_numa() into numa_32.c There's no reason get_memcfg_numa() to be implemented inline in mmzone_32.h. Move it to numa_32.c and also make get_memcfg_numa_flag() static. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit eca9ad313293c41021bfcf23e985a14f6991a121 Author: Tejun Heo Date: Mon May 2 14:18:52 2011 +0200 x86, NUMA: make srat.c 32bit safe Make srat.c 32bit safe by removing the assumption that unsigned long is 64bit. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 7b2600f8ee0536bb738f3387cf2c30e8e334e149 Author: Tejun Heo Date: Mon May 2 14:18:52 2011 +0200 x86, NUMA: rename srat_64.c to srat.c Rename srat_64.c to srat.c. This is to prepare for unification of NUMA init paths between 32 and 64bit. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 1201e10a092adc9c88a6ce5f27740cc5cd0d26e5 Author: Tejun Heo Date: Mon May 2 14:18:52 2011 +0200 x86, NUMA: trivial cleanups * Kill no longer used struct bootnode. * Kill dangling declaration of pxm_to_nid() in numa_32.h. * Make setup_node_bootmem() static. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 797390d8554b1e07aabea37d0140933b0412dba0 Author: Tejun Heo Date: Mon May 2 14:18:52 2011 +0200 x86-32, NUMA: use sparse_memory_present_with_active_regions() Instead of calling memory_present() for each region from NUMA init, call sparse_memory_present_with_active_regions() from paging_init() similarly to x86-64. For flat and numaq, this results in exactly the same memory_present() calls. For srat, if there are multiple memory chunks for a node, after this change, memory_present() will be called separately for each chunk instead of being called once to encompass the whole range, which doesn't cause any harm and actually is the better behavior. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 84914ed0ec6787d38e84b510f92ad4ca3a572fd8 Author: Tejun Heo Date: Mon May 2 14:18:52 2011 +0200 x86-32, NUMA: Make apic->x86_32_numa_cpu_node() optional NUMAQ is the only meaningful user of this callback and setup_local_APIC() the only callsite. Stop torturing everyone else by making the callback optional and removing all the boilerplate implementations and assignments. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 6bd262731bf7559bab8c749786e8652e2df1fb4e Author: Tejun Heo Date: Mon May 2 14:18:52 2011 +0200 x86, NUMA: Unify 32/64bit numa_cpu_node() implementation Currently, the only meaningful user of apic->x86_32_numa_cpu_node() is NUMAQ which returns valid mapping only after CPU is initialized during SMP bringup; thus, the previous patch to set apicid -> node in setup_local_APIC() makes __apicid_to_node[] always contain the correct mapping whether custom apic->x86_32_numa_cpu_node() is used or not. So, there is no reason to keep separate 32bit implementation. We can always consult __apicid_to_node[]. Move 64bit implementation from numa_64.c to numa.c and remove 32bit implementation from numa_32.c. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit c4b90c11992e61123071977c0e5556e59a70852c Author: Tejun Heo Date: Mon May 2 14:18:52 2011 +0200 x86-32, NUMA: Automatically set apicid -> node in setup_local_APIC() Some x86-32 NUMA implementations (NUMAQ) don't initialize apicid -> node mapping using set_apicid_to_node() during NUMA init but implement custom apic->x86_32_numa_cpu_node() instead. This patch automatically initializes the default apic -> node mapping table from apic->x86_32_numa_cpu_node() from setup_local_APIC() such that the mapping table is in sync with the actual mapping. As the table isn't used by custom implementations, this doesn't make any difference at this point. This is in preparation of unifying numa_cpu_node() between x86-32 and 64. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit acd26d611e60c1a7c2a14269ab99760f779121f4 Author: Tejun Heo Date: Mon May 2 14:18:51 2011 +0200 x86-64, NUMA: simplify nodedata allocation With top-down memblock allocation, the allocation range limits in ealry_node_mem() can be simplified - try node-local first, then any node but in any case don't allocate below DMA limit. Remove early_node_mem() and implement simplified allocation directly in setup_node_bootmem(). Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit ebe685f24eeb85fbdb0f33792f1dabdbf35eff38 Author: Tejun Heo Date: Mon May 2 14:18:51 2011 +0200 x86-64, NUMA: trivial cleanups for setup_node_bootmem() Make the following trivial changes in preparation for further updates. * nodeid -> nid, nid -> tnid * use nd_ prefix for nodedata related variables * remove start/end_pfn and use start/end directly Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit 9688678a6670c7f0ae3872450a8047c0ad401efb Author: Tejun Heo Date: Mon May 2 14:18:51 2011 +0200 x86-64, NUMA: Simplify hotadd memory handling The only special handling NUMA needs to do for hotadd memory is determining the node for the hotadd memory given the address of it and there's nothing specific to specific config method used. srat_64.c does somewhat elaborate error checking on ACPI_SRAT_MEM_HOT_PLUGGABLE regions, remembers them and implements memory_add_physaddr_to_nid() which determines the node for given hotadd address. This is almost completely redundant. All the information is already available to the generic NUMA code which already performs all the sanity checking and merging. All that's necessary is not using __initdata from numa_meminfo and providing a function which uses it to map address to node. Drop the specific implementation from srat_64.c and add generic memory_add_physaddr_to_nid() in numa_64.c, which is enabled if CONFIG_MEMORY_HOTPLUG is set. Other than dropping the code, srat_64.c doesn't need any change as it already calls numa_add_memblk() for hot pluggable regions which is enough. While at it, change CONFIG_MEMORY_HOTPLUG_SPARSE in srat_64.c to CONFIG_MEMORY_HOTPLUG, for NUMA on x86-64, the two are always the same. Signed-off-by: Tejun Heo Cc: Ingo Molnar Cc: Yinghai Lu Cc: David Rientjes Cc: Thomas Gleixner Cc: "H. Peter Anvin" commit ba67cf5cf2ce10ad86a212b70f8c7c75d93a5016 Merge: aff3648 2be1910 Author: Tejun Heo Date: Mon May 2 14:16:37 2011 +0200 Merge branch 'x86/urgent' into x86-mm Merge reason: Pick up the following two fix commits. 2be19102b7: x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo() 765af22da8: x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change Scheduled NUMA init 32/64bit unification changes depend on these. Signed-off-by: Tejun Heo commit aff364860aa105b2deacc6f21ec8ef524460e3fc Merge: c7a7b814 993ba15 Author: Tejun Heo Date: Mon May 2 14:08:43 2011 +0200 Merge branch 'x86/numa' into x86-mm Merge reason: Pick up x86-32 remap allocator cleanup changes - 14 commits, 3fe14ab541^..993ba1585c. 3fe14ab541: x86-32, numa: Fix failure condition check in alloc_remap() 993ba1585c: x86-32, numa: Update remap allocator comments Scheduled NUMA init 32/64bit unification changes depend on them. Signed-off-by: Tejun Heo commit 9de4966a4d218f29c68e96e8e7b4d2840dedec79 Author: Bart Van Assche Date: Sun May 1 14:09:21 2011 +0200 x86: Fix spelling error in the memcpy() source code comment Signed-off-by: Bart Van Assche Cc: "H. Peter Anvin" Link: http://lkml.kernel.org/r/201105011409.21629.bvanassche@acm.org Signed-off-by: Ingo Molnar commit ac0a3260f37b8616da8d33488ec94b94e6ae5b31 Merge: 809435f b9df92d Author: Ingo Molnar Date: Sun May 1 19:11:42 2011 +0200 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core commit 809435ff4f43a5c0cb0201b3b89176253d5ade18 Merge: 3267382 058e297 Author: Ingo Molnar Date: Sun May 1 19:09:39 2011 +0200 Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core commit b9df92d2a94eef8811061aecb1396290df440e2e Author: Steven Rostedt Date: Thu Apr 28 20:32:08 2011 -0400 ftrace: Consolidate the function match routines for normal and mods The code used for matching functions is almost identical between normal selecting of functions and using the :mod: feature of set_ftrace_notrace. Consolidate the two users into one function. Signed-off-by: Steven Rostedt commit 491d0dcfb9707e1f83eff93ca503eb7573162ef2 Author: Steven Rostedt Date: Wed Apr 27 21:43:36 2011 -0400 ftrace: Consolidate updating of ftrace_trace_function There are three locations that perform almost identical functions in order to update the ftrace_trace_function (the ftrace function variable that gets called by mcount). Consolidate these into a single function called update_ftrace_function(). Signed-off-by: Steven Rostedt commit 996e87be7f537fb34638875dd37083166c733425 Author: Steven Rostedt Date: Tue Apr 26 16:11:03 2011 -0400 ftrace: Move record update for normal and modules into a separate function The updating of a function record is moved to a single function. This will allow us to add specific changes in one location for both modules and kernel functions. Later patches will determine if the function record itself needs to be updated (which enables the mcount caller), or just the ftrace_ops needs the update. Signed-off-by: Steven Rostedt commit d2c8c3eafbf715306ec891e7ca52d3d999acbe31 Author: Steven Rostedt Date: Mon Apr 25 14:32:42 2011 -0400 ftrace: Remove FTRACE_FL_CONVERTED flag Since we disable all function tracer processing if we detect that a modification of a instruction had failed, we do not need to track that the record has failed. No more ftrace processing is allowed, and the FTRACE_FL_CONVERTED flag is pointless. The FTRACE_FL_CONVERTED flag was used to denote records that were successfully converted from mcount calls into nops. But if a single record fails, all of ftrace is disabled. Signed-off-by: Steven Rostedt commit 45a4a2372b364107cabea79f255b333236626416 Author: Steven Rostedt Date: Thu Apr 21 23:16:46 2011 -0400 ftrace: Remove FTRACE_FL_FAILED flag Since we disable all function tracer processing if we detect that a modification of a instruction had failed, we do not need to track that the record has failed. No more ftrace processing is allowed, and the FTRACE_FL_FAILED flag is pointless. Removing this flag simplifies some of the code, but some ftrace_disabled checks needed to be added or move around a little. Signed-off-by: Steven Rostedt commit 3499e461147636bf55c41128d83b679ac6ab2d86 Author: Steven Rostedt Date: Thu Apr 21 22:59:12 2011 -0400 ftrace: Remove failures file The failures file in the debugfs tracing directory would list the functions that failed to convert when the old dead ftrace daemon tried to update code but failed. Since this code is now dead along with the daemon the failures file is useless. Remove it. Signed-off-by: Steven Rostedt commit 8ab2b7efd3e2ccf2c2dda3206b8171ecdbd0af40 Author: Steven Rostedt Date: Thu Apr 21 22:41:35 2011 -0400 ftrace: Remove unnecessary disabling of irqs The disabling of interrupts around ftrace_update_code() was used to protect against the evil ftrace daemon from years past. But that daemon has long been killed. It is safe to keep interrupts enabled while updating the initial mcount into nops. The ftrace_mutex is also held which keeps other users at bay. Signed-off-by: Steven Rostedt commit 0778d9ad33898faab7bf6316108b471790376e35 Author: Steven Rostedt Date: Fri Apr 29 10:36:31 2011 -0400 ftrace: Make FTRACE_WARN_ON() work in if condition Let FTRACE_WARN_ON() be used as a stand alone statement or inside a conditional: if (FTRACE_WARN_ON(x)) Signed-off-by: Steven Rostedt commit 058e297d34a404caaa5ed277de15698d8dc43000 Author: Steven Rostedt Date: Fri Apr 29 22:35:33 2011 -0400 ftrace: Only update the function code on write to filter files If function tracing is enabled, a read of the filter files will cause the call to stop_machine to update the function trace sites. It should only call stop_machine on write. Cc: stable@kernel.org Signed-off-by: Steven Rostedt commit a1d9a09ae8003380a7f2297ee4367947cbdf874f Author: Mike Waychison Date: Fri Apr 29 17:39:31 2011 -0700 Introduce CONFIG_GOOGLE_FIRMWARE In order to keep Google's firmware drivers organized amongst themselves, all Google firmware drivers are gated on CONFIG_GOOGLE_FIRMWARE=y, which defaults to 'n' in the kernel build. Signed-off-by: Mike Waychison Signed-off-by: Greg Kroah-Hartman commit e561bc45920aade3f8a5aad9058a00e750af1345 Author: Mike Waychison Date: Fri Apr 29 17:39:25 2011 -0700 driver: Google Memory Console This patch introduces the 'memconsole' driver. Our firmware gives us access to an in-memory log of the firmware's output. This gives us visibility in a data-center of headless machines as to what the firmware is doing. The memory console is found by the driver by finding a header block in the EBDA. The buffer is then copied out, and is exported to userland in the file /sys/firmware/log. Signed-off-by: San Mehat Signed-off-by: Mike Waychison Signed-off-by: Greg Kroah-Hartman commit 74c5b31c6618f01079212332b2e5f6c42f2d6307 Author: Mike Waychison Date: Fri Apr 29 17:39:19 2011 -0700 driver: Google EFI SMI The "gsmi" driver bridges userland with firmware specific routines for accessing hardware. Currently, this driver only supports NVRAM and eventlog information. Deprecated functions have been removed from the driver, though their op-codes are left in place so that they are not re-used. This driver works by trampolining into the firmware via the smi_command outlined in the FADT table. Three protocols are used due to various limitations over time, but all are included herein. This driver should only ever load on Google boards, identified by either a "Google, Inc." board vendor string in DMI, or "GOOGLE" in the OEM strings of the FADT ACPI table. This logic happens in gsmi_system_valid(). Signed-off-by: Duncan Laurie Signed-off-by: Aaron Durbin Signed-off-by: Mike Waychison Signed-off-by: Greg Kroah-Hartman commit 85eb8c8d0b0900c073b0e6f89979ac9c439ade1a Author: Rafael J. Wysocki Date: Sat Apr 30 00:25:44 2011 +0200 PM / Runtime: Generic clock manipulation rountines for runtime PM (v6) Many different platforms and subsystems may want to disable device clocks during suspend and enable them during resume which is going to be done in a very similar way in all those cases. For this reason, provide generic routines for the manipulation of device clocks during suspend and resume. Convert the ARM shmobile platform to using the new routines. Signed-off-by: Rafael J. Wysocki commit c63ca0c01d73563d4e2ab174bb3dd1e5efb907e6 Author: David Ahern Date: Fri Apr 29 16:04:15 2011 -0600 perf stat: Tell user about unsupported events in the list Similar to perf-record, tell user about unsupported events that will not be counted if invoked in verbose mode. e.g., $ perf stat -e dTLB-prefetch-misses -v -- sleep 1 dTLB-prefetch-misses event is not supported by the kernel. dTLB-prefetch-misses: 0 0 0 Performance counter stats for 'sleep 1': dTLB-prefetch-misses 1.001884783 seconds time elapsed Signed-off-by: David Ahern Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/r/1304114655-10600-1-git-send-email-dsahern@gmail.com Signed-off-by: Ingo Molnar commit f548ccd47d608e88d432745091e13f927ced83f7 Author: Mike Waychison Date: Mon Mar 14 23:58:50 2011 -0700 x86: Better comments for get_bios_ebda() Make the comments a bit clearer for get_bios_ebda so that it actually tells us what it is returning. Signed-off-by: Mike Waychison Signed-off-by: Greg Kroah-Hartman commit 57d5f9f808b7650a92f31e9cd3acd3f415a22530 Author: Mike Waychison Date: Mon Mar 14 23:58:45 2011 -0700 x86: get_bios_ebda_length() Add a wrapper routine that tells us the length of the EBDA if it is present. This guy also ensures that the returned length doesn't let the EBDA run past the 640KiB mark. Signed-off-by: Mike Waychison Signed-off-by: Greg Kroah-Hartman commit 773d67903ad608d3f64cc5b00e2f881473413c13 Author: Randy Dunlap Date: Tue Apr 26 09:18:51 2011 -0700 misc: fix ti-st build issues st_drv uses skb*() interfaces, so it should depend on NET. It also uses GPIO interfaces, so it should depend on GPIOLIB. st_kim.c uses syss_*() calls, so it should #include . Fixes these observed build errors: ERROR: "skb_queue_purge" [drivers/misc/ti-st/st_drv.ko] undefined! ERROR: "skb_pull" [drivers/misc/ti-st/st_drv.ko] undefined! ERROR: "skb_queue_tail" [drivers/misc/ti-st/st_drv.ko] undefined! ERROR: "__alloc_skb" [drivers/misc/ti-st/st_drv.ko] undefined! ERROR: "kfree_skb" [drivers/misc/ti-st/st_drv.ko] undefined! ERROR: "skb_dequeue" [drivers/misc/ti-st/st_drv.ko] undefined! ERROR: "skb_put" [drivers/misc/ti-st/st_drv.ko] undefined! Signed-off-by: Randy Dunlap Cc: Pavan Savoy Signed-off-by: Greg Kroah-Hartman commit 947b4ad1d198b7303ecc961f4939a331be0c48f0 Author: Ingo Molnar Date: Fri Apr 29 22:52:42 2011 +0200 perf list: Fix max event string size Recent stalled-cycles event names were larger than the 40 chars printout used by perf list. Extend that, make it robust for future extensions and also adjust alignments in face of wider event names. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n009io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 301120396b766ae4480e52ece220516a1707822b Author: Ingo Molnar Date: Sat Apr 30 09:14:54 2011 +0200 perf events, x86: Add Westmere stalled-cycles-frontend/backend events Extend the Intel Westmere PMU driver with definitions for generic front-end and back-end stall events. ( These are only approximations. ) Reported-by: David Ahern Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n008io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 370faf1dd0461ad811852c8abbbcd3d73b1e4fc4 Author: Ingo Molnar Date: Fri Apr 29 16:11:03 2011 +0200 perf stat: Fail softly on unsupported events David Ahern reported this perf stat failure: > # /tmp/build-perf/perf stat -- sleep 1 > Error: stalled-cycles-frontend event is not supported. > Fatal: Not all events could be opened. > > This is a Dell R410 with an E5620 processor. Fail in a softer fashion on unknown/unsupported events. Reported-by: David Ahern Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n006io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit fce3c786d3a49eff397583b4b62fa38df90db937 Author: Ingo Molnar Date: Sat Apr 30 09:03:15 2011 +0200 perf stat: Leave more room for percentages Triple digit percentages do not fit otherwise. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n005io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 2b427e14b77dbf3e05f1bd0785f1d07ea5fe924e Author: Ingo Molnar Date: Fri Apr 29 14:16:18 2011 +0200 perf stat: Adjust stall cycles warning percentages Adjust to color thresholds to better match the percentages seen in real workloads. Both are now a bit more sensitive. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n004io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit d3d1e86da07b4565815e3dbcd082f53017d215f8 Author: Ingo Molnar Date: Fri Apr 29 13:49:08 2011 +0200 perf stat: Analyze front-end and back-end stall counts Sample output: Performance counter stats for './loop_1b': 873.691065 task-clock # 1.000 CPUs utilized 1 context-switches # 0.000 M/sec 1 CPU-migrations # 0.000 M/sec 96 page-faults # 0.000 M/sec 2,012,637,222 cycles # 2.304 GHz (66.58%) 1,001,397,911 stalled-cycles-frontend # 49.76% frontend cycles idle (66.58%) 7,523,398 stalled-cycles-backend # 0.37% backend cycles idle (66.76%) 2,004,551,046 instructions # 1.00 insns per cycle # 0.50 stalled cycles per insn (66.80%) 1,001,304,992 branches # 1146.063 M/sec (66.76%) 39,453 branch-misses # 0.00% of all branches (66.64%) 0.874046121 seconds time elapsed Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n003io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 129c04cb8ce2e4bf3f17223f58ef16aa8a2cb3b8 Author: Ingo Molnar Date: Fri Apr 29 14:41:28 2011 +0200 perf tools: Add front-end and back-end stalled cycles support Update perf tooling to deal with front-end and back-end stalled cycles events. Add both the default 'perf stat' output. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n002io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 91fc4cc00099986bc1ba50e1f421c3548cffae42 Author: Ingo Molnar Date: Fri Apr 29 14:17:19 2011 +0200 perf, x86: Add new stalled cycles events for Intel and AMD CPUs Extend the Intel and AMD event definitions with generic front-end and back-end stall events. ( These are only approximations - suggestions are welcome for better events. ) Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n001io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 8f62242246351b5a4bc0c1f00c0c7003edea128a Author: Ingo Molnar Date: Fri Apr 29 13:19:47 2011 +0200 perf events: Add generic front-end and back-end stalled cycle event definitions Add two generic hardware events: front-end and back-end stalled cycles. These events measure conditions when the CPU is executing code but its capabilities are not fully utilized. Understanding such situations and analyzing them is an important sub-task of code optimization workflows. Both events limit performance: most front end stalls tend to be caused by branch misprediction or instruction fetch cachemisses, backend stalls can be caused by various resource shortages or inefficient instruction scheduling. Front-end stalls are the more important ones: code cannot run fast if the instruction stream is not being kept up. An over-utilized back-end can cause front-end stalls and thus has to be kept an eye on as well. The exact composition is very program logic and instruction mix dependent. We use the terms 'stall', 'front-end' and 'back-end' loosely and try to use the best available events from specific CPUs that approximate these concepts. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n000io7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit c7a7b814c9dca9ee01b38e63b4a46de87156d3b6 Author: Tim Gardner Date: Thu Apr 28 11:00:30 2011 -0600 ioremap: Delay sanity check until after a successful mapping While tracking down the reason for an ioremap() failure I was distracted by the WARN_ONCE() in __ioremap_caller(). Performing a WARN_ONCE() sanity check before the mapping is successful seems pointless if the caller sends bad values. A case in point is when the BIOS provides erroneous screen_info values causing vesafb_probe() to request an outrageuous size. The WARN_ONCE is then wasted on bogosity. Move the warning to a point where the mapping has been successfully allocated. Addresses: http://bugs.launchpad.net/bugs/772042 Reviewed-by: Suresh Siddha Signed-off-by: Tim Gardner Link: http://lkml.kernel.org/r/4DB99D2E.9080106@canonical.com Signed-off-by: Ingo Molnar commit 1d2b71f61b6a10216274e27b717becf9ae101fc7 Author: Rafael J. Wysocki Date: Fri Apr 29 00:36:53 2011 +0200 PM / Runtime: Add subsystem data field to struct dev_pm_info Some subsystems need to attach PM-related data to struct device and they need to use devres for this purpose. For their convenience and to make code more straightforward, add a new field called subsys_data to struct dev_pm_info and let subsystems use it for attaching PM-related information to devices. Convert the ARM shmobile platform to using the new field. Signed-off-by: Rafael J. Wysocki commit 638080c37ae08fd0c44cec13d7948ca5385ae851 Author: Kevin Hilman Date: Fri Apr 29 00:36:42 2011 +0200 OMAP2+ / PM: move runtime PM implementation to use device power domains In commit 7538e3db6e015e890825fbd9f8659952896ddd5b (PM: add support for device power domains) a better way for handling platform-specific power hooks was introduced. Rather than using the platform_bus dev_pm_ops overrides (platform_bus_set_pm_ops()), this patch moves the OMAP runtime PM implementation over to using device power domains. Since OMAP is the only user of platform_bus_set_pm_ops(), that interface can be removed (and will be in a forthcoming patch.) [rjw: Rebased on top of a previous change modifying the handling of power domains by the PM core so that power domain callbacks take precendence over subsystem-level PM callbacks.] Signed-off-by: Kevin Hilman Acked-by: Grant Likely Signed-off-by: Rafael J. Wysocki commit 8b313a38ecffc0ff0b4c5115f0a461f73b7dfdb6 Author: Rafael J. Wysocki Date: Fri Apr 29 00:36:32 2011 +0200 PM / Platform: Use generic runtime PM callbacks directly Once shmobile platforms have been converted to using power domains for overriding the platform bus type's PM callbacks, it isn't necessary to use the __weakly defined wrappers around the generinc runtime PM callbacks in the platform bus type any more. Signed-off-by: Rafael J. Wysocki commit 38ade3a1fa0421c12627c7b48c33e89414fc9b76 Author: Rafael J. Wysocki Date: Fri Apr 29 00:36:21 2011 +0200 shmobile: Use power domains for platform runtime PM shmobile platforms replace the runtime PM callbacks of the platform bus type with their own routines, but this means that the callbacks are replaced system-wide. This may not be the right approach if the platform devices on the system are not of the same type (e.g. some of them belong to an SoC and the others are located in separate chips), because in those cases they may require different handling. Thus it is better to use power domains to override the platform bus type's PM handling, as it generally is possible to use different power domains for devices with different PM requirements. Define a default power domain for shmobile in both the SH and ARM falvors and use it to override the platform bus type's PM callbacks. Since the suspend and hibernate callbacks of the new "default" power domains need to be the same and the platform bus type's suspend and hibernate callbacks for the time being, export those callbacks so that can be used outside of the platform bus type code. Signed-off-by: Rafael J. Wysocki commit 69c9dd1ecf446ad8a830e4afc539a2a1adc85b78 Author: Rafael J. Wysocki Date: Fri Apr 29 00:36:05 2011 +0200 PM: Export platform bus type's default PM callbacks Export the default PM callbacks defined for the platform bus type so that they can be used by power domains for suspending and resuming platform devices in the future. Signed-off-by: Rafael J. Wysocki commit 4d27e9dcff00a6425d779b065ec8892e4f391661 Author: Rafael J. Wysocki Date: Fri Apr 29 00:35:50 2011 +0200 PM: Make power domain callbacks take precedence over subsystem ones Change the PM core's behavior related to power domains in such a way that, if a power domain is defined for a given device, its callbacks will be executed instead of and not in addition to the device subsystem's PM callbacks. The idea behind the initial implementation of power domains handling by the PM core was that power domain callbacks would be executed in addition to subsystem callbacks, so that it would be possible to extend the subsystem callbacks by using power domains. It turns out, however, that this wouldn't be really convenient in some important situations. For example, there are systems in which power can only be removed from entire power domains. On those systems it is not desirable to execute device drivers' PM callbacks until it is known that power is going to be removed from the devices in question, which means that they should be executed by power domain callbacks rather then by subsystem (e.g. bus type) PM callbacks, because subsystems generally have no information about what devices belong to which power domain. Thus, for instance, if the bus type in question is the platform bus type, its PM callbacks generally should not be called in addition to power domain callbacks, because they run device drivers' callbacks unconditionally if defined. While in principle the default subsystem PM callbacks, or a subset of them, may be replaced with different functions, it doesn't seem correct to do so, because that would change the subsystem's behavior with respect to all devices in the system, regardless of whether or not they belong to any power domains. Thus, the only remaining option is to make power domain callbacks take precedence over subsystem callbacks. Signed-off-by: Rafael J. Wysocki Acked-by: Grant Likely Acked-by: Kevin Hilman commit 7068b7a16270f1e85a8893d74b0f3c58d7826883 Author: John Stultz Date: Thu Apr 28 13:29:18 2011 -0700 timers: Remove delayed irqwork from alarmtimers implementation Thomas asked about the delayed irq work in the alarmtimers code, and I realized that it was a legacy from when the alarmtimer base lock was a mutex (due to concerns that we'd be interacting with the RTC device, which is protected by mutexes). Since the alarmtimer base is now protected by a spinlock, we can simply execute alarmtimer functions directly from the hrtimer callback. Should any future alarmtimer functions sleep, they can simply manage scheduling any delayed work themselves. CC: Thomas Gleixner Signed-off-by: John Stultz commit 180bf812ceaf01eb8ac69b86f3be0bd57f697668 Author: John Stultz Date: Thu Apr 28 12:58:11 2011 -0700 timers: Improve alarmtimer comments and minor fixes This patch addresses a number of minor comment improvements and other minor issues from Thomas' review of the alarmtimers code. CC: Thomas Gleixner Signed-off-by: John Stultz commit e11feaa1192a079ba8e88a12121e9b12d55d4239 Author: Jeff Mahoney Date: Wed Apr 27 14:27:24 2011 -0400 watchdog, hung_task_timeout: Add Kconfig configurable default This patch allows the default value for sysctl_hung_task_timeout_secs to be set at build time. The feature carries virtually no overhead, so it makes sense to keep it enabled. On heavily loaded systems, though, it can end up triggering stack traces when there is no bug other than the system being underprovisioned. We use this patch to keep the hung task facility available but disabled at boot-time. The default of 120 seconds is preserved. As a note, commit e162b39a may have accidentally reverted commit fb822db4, which raised the default from 120 seconds to 480 seconds. Signed-off-by: Jeff Mahoney Acked-by: Mandeep Singh Baines Link: http://lkml.kernel.org/r/4DB8600C.8080000@suse.com Signed-off-by: Ingo Molnar commit ede70290046043b2638204cab55e26ea1d0c6cd9 Author: Ingo Molnar Date: Thu Apr 28 08:48:42 2011 +0200 perf stat: Fix compatibility behavior Instead of failing on an unknown event, when new perf stat is run on older kernels: $ ./perf stat true Error: open_counter returned with 22 (Invalid argument). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. Just ignore EINVAL and ENOSYS, we'll print the results as not counted: Performance counter stats for 'true': 0.239483 task-clock # 0.493 CPUs utilized 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 86 page-faults # 0.359 M/sec 704,766 cycles # 2.943 GHz stalled-cycles 381,961 instructions # 0.54 insns per cycle 69,626 branches # 290.735 M/sec 4,594 branch-misses # 6.60% of all branches 0.000485883 seconds time elapsed Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio5hjpn3dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit f9cef0a90c4e7637f1ec98474a1a099aec45eb65 Author: Ingo Molnar Date: Thu Apr 28 18:17:11 2011 +0200 perf stat: Add --sync/-S option --sync will tell perf stat to run sync() before starting a command. This allows IO-heavy tests to be used with --repeat, without one iteration impacting the other. Elapsed time will stabilize for example: before: 3.971525714 seconds time elapsed ( +- 8.56% ) after: 3.211098537 seconds time elapsed ( +- 1.52% ) So measurements will be more accurate. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn1dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 8a850cadca0e387c87a0911a61e99fd66aeb57ec Author: Ingo Molnar Date: Thu Apr 28 11:16:44 2011 +0200 perf event, x86: Use better stalled cycles metric Use the UOPS_EXECUTED.*,c=1,i=1 event on Intel CPUs - it is a rather good indicator of CPU execution stalls, more sensitive and more inclusive than the 0xa2 resource stalls event (which does not count nearly as many stall types). Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn2dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 0c61227094b3ddaca2f847ee287c4a2e3762b5a2 Author: Gleb Natapov Date: Tue Apr 26 11:21:32 2011 +0300 x86, setup: Fix EDD3.0 data verification. Check for nonzero path in edd_has_edd30() has no sense. First, it looks at the wrong memory. Device path starts at offset 30 of the info->params structure which is at offset 8 from the beginning of info structure, but code looks at info + 4 instead. This was correct when code was introduced, but around v2.6.4 three more fields were added to edd_info structure (commit 66b61a5c in history.git). Second, even if it will check correct memory it will always succeed since at offset 30 (params->key) there will be non-zero values otherwise previous check would fail. The patch replaces this bogus check with one that verifies checksum. Signed-off-by: Gleb Natapov Link: http://lkml.kernel.org/r/20110426082132.GG2265@redhat.com Signed-off-by: H. Peter Anvin commit 9ceb1c3d1fe15c2f9b55eaa8978019ef0e0a06ac Author: Ingo Molnar Date: Thu Apr 28 02:57:53 2011 +0200 perf stat: Fix printout vertical alignment Before: | | Performance counter stats for '/home/mingo/hackbench 20' (5 runs): | | 71,321,607 instructions:u # 0.42 insns per cycle ( +- 0.00% ) | 168,040,009 cycles:u # 0.000 GHz ( +- 0.81% ) | | 1.468002368 seconds time elapsed ( +- 1.33% ) | After: | | Performance counter stats for '/home/mingo/hackbench 20' (5 runs): | | 71,321,607 instructions:u # 0.42 insns per cycle ( +- 0.00% ) | 168,040,009 cycles:u # 0.000 GHz ( +- 0.81% ) | | 1.468002368 seconds time elapsed ( +- 1.33% ) | The last column (stddev noise) is properly aligned, vertically. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn0dsrm@git.kernel.org Signed-off-by: Ingo Molnar commit 32673822e440eb92eb334631eb0a199d0c532d13 Merge: fa7b694 5373db8 Author: Ingo Molnar Date: Wed Apr 27 10:38:30 2011 +0200 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core Conflicts: include/linux/perf_event.h Merge reason: pick up the latest jump-label enhancements, they are cooked ready. Signed-off-by: Ingo Molnar commit 9a7adcf5c6dea63d2e47e6f6d2f7a6c9f48b9337 Author: John Stultz Date: Tue Jan 11 09:54:33 2011 -0800 timers: Posix interface for alarm-timers This patch exposes alarm-timers to userland via the posix clock and timers interface, using two new clockids: CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM. Both clockids behave identically to CLOCK_REALTIME and CLOCK_BOOTTIME, respectively, but timers set against the _ALARM suffixed clockids will wake the system if it is suspended. Some background can be found here: https://lwn.net/Articles/429925/ The concept for Alarm-timers was inspired by the Android Alarm driver (by Arve Hjønnevåg) found in the Android kernel tree. See: http://android.git.kernel.org/?p=kernel/common.git;a=blob;f=drivers/rtc/alarm.c;h=1250edfbdf3302f5e4ea6194847c6ef4bb7beb1c;hb=android-2.6.36 While the in-kernel interface is pretty similar between alarm-timers and Android alarm driver, the user-space interface for the Android alarm driver is via ioctls to a new char device. As mentioned above, I've instead chosen to export this functionality via the posix interface, as it seemed a little simpler and avoids creating duplicate interfaces to things like CLOCK_REALTIME and CLOCK_MONOTONIC under alternate names (ie:ANDROID_ALARM_RTC and ANDROID_ALARM_SYSTEMTIME). The semantics of the Android alarm driver are different from what this posix interface provides. For instance, threads other then the thread waiting on the Android alarm driver are able to modify the alarm being waited on. Also this interface does not allow the same wakelock semantics that the Android driver provides (ie: kernel takes a wakelock on RTC alarm-interupt, and holds it through process wakeup, and while the process runs, until the process either closes the char device or calls back in to wait on a new alarm). One potential way to implement similar semantics may be via the timerfd infrastructure, but this needs more research. There may also need to be some sort of sysfs system level policy hooks that allow alarm timers to be disabled to keep them from firing at inappropriate times (ie: laptop in a well insulated bag, mid-flight). CC: Arve Hjønnevåg CC: Thomas Gleixner CC: Alessandro Zummo Acked-by: Arnd Bergmann Signed-off-by: John Stultz commit ff3ead96d17f47ee70c294a5cc2cce9b61e82f0f Author: John Stultz Date: Tue Jan 11 09:42:13 2011 -0800 timers: Introduce in-kernel alarm-timer interface This provides the in kernel interface and infrastructure for alarm-timers. Alarm-timers are a hybrid style timer, similar to hrtimers, but when the system is suspended, the RTC device is set to fire and wake the system for when the soonest alarm-timer expires. The concept for Alarm-timers was inspired by the Android Alarm driver (by Arve Hjønnevåg) found in the Android kernel tree. See: http://android.git.kernel.org/?p=kernel/common.git;a=blob;f=drivers/rtc/alarm.c;h=1250edfbdf3302f5e4ea6194847c6ef4bb7beb1c;hb=android-2.6.36 This in-kernel interface should be fairly compatible with the Android alarm driver in-kernel interface, but has the advantage of utilizing the new RTC timerqueue code instead of doing direct RTC manipulation. CC: Arve Hjønnevåg CC: Thomas Gleixner CC: Alessandro Zummo Acked-by: Arnd Bergmann Signed-off-by: John Stultz commit 88d19cf37952a7e1e38b2bf87a00f0e857e63180 Author: John Stultz Date: Mon Jan 3 18:59:43 2011 -0800 timers: Add rb_init_node() to allow for stack allocated rb nodes In cases where a timerqueue_node or some structure that utilizes a timerqueue_node is allocated on the stack, gcc would give warnings caused by the timerqueue_init()'s calling RB_CLEAR_NODE, which self-references the nodes uninitialized data. The solution is to create an rb_init_node() function that zeros the rb_node structure out and then calls RB_CLEAR_NODE(), and then call the new init function from timerqueue_init(). CC: Thomas Gleixner Acked-by: Arnd Bergmann Signed-off-by: John Stultz commit 304529b1b6f8612ccbb4582e997051b48b94f4a4 Author: John Stultz Date: Fri Apr 1 14:32:09 2011 -0700 time: Add timekeeping_inject_sleeptime Some platforms cannot implement read_persistent_clock, as their RTC devices are only accessible when interrupts are enabled. This keeps them from being used by the timekeeping code on resume to measure the time in suspend. The RTC layer tries to work around this, by calling do_settimeofday on resume after irqs are reenabled to set the time properly. However, this only corrects CLOCK_REALTIME, and does not properly adjust the sleep time value. This causes btime in /proc/stat to be incorrect as well as making the new CLOCK_BOTTTIME inaccurate. This patch resolves the issue by introducing a new timekeeping hook to allow the RTC layer to inject the sleep time on resume. The code also checks to make sure that read_persistent_clock is nonfunctional before setting the sleep time, so that should the RTC's HCTOSYS option be configured in on a system that does support read_persistent_clock we will not increase the total_sleep_time twice. CC: Arve Hjønnevåg CC: Thomas Gleixner Acked-by: Arnd Bergmann Signed-off-by: John Stultz commit c6264deff7ea6125492b442edad885e5429679af Author: Ingo Molnar Date: Wed Apr 27 13:50:47 2011 +0200 perf stat: Add -d/--detailed flag to run with a lot of events Add the new -d/--detailed flag, which generates a pretty detailed event list: Performance counter stats for './hackbench 10' (10 runs): 1514.287888 task-clock # 10.897 CPUs utilized ( +- 3.05% ) 39,698 context-switches # 0.026 M/sec ( +- 12.19% ) 8,147 CPU-migrations # 0.005 M/sec ( +- 16.55% ) 17,918 page-faults # 0.012 M/sec ( +- 0.37% ) 2,944,504,050 cycles # 1.944 GHz ( +- 3.89% ) (32.60%) 1,043,971,283 stalled-cycles # 35.45% of all cycles are idle ( +- 5.22% ) (44.48%) 1,655,906,768 instructions # 0.56 insns per cycle # 0.63 stalled cycles per insn ( +- 1.95% ) (55.09%) 338,832,373 branches # 223.757 M/sec ( +- 1.96% ) (64.47%) 3,892,416 branch-misses # 1.15% of all branches ( +- 5.49% ) (73.12%) 606,410,482 L1-dcache-loads # 400.459 M/sec ( +- 1.29% ) (71.21%) 31,204,395 L1-dcache-load-misses # 5.15% of all L1-dcache hits ( +- 3.04% ) (60.43%) 3,922,751 LLC-loads # 2.590 M/sec ( +- 6.80% ) (46.87%) 5,037,288 LLC-load-misses # 3.327 M/sec ( +- 3.56% ) (13.00%) 0.138966828 seconds time elapsed ( +- 4.11% ) This can be used "at a glance" for narrower analysis. -d can also be used in addition to other -e events, to further expand an event list. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-cxs98quixs3qyvdqx3goojc4@git.kernel.org Signed-off-by: Ingo Molnar commit 8bb6c79f24e66538f606076915e918242c02ec7c Author: Ingo Molnar Date: Wed Apr 27 13:25:24 2011 +0200 perf stat: Print out miss/hit ratio for L1 data-cache events Print out this kind of l1-dcache-misses percentage: Performance counter stats for './bw_tcp localhost': 29,956,262,201 cycles # 3.002 GHz (scaled from 85.14%) 8,255,209,558 stalled-cycles # 27.56% of all cycles are idle (scaled from 86.56%) 1,206,130,308 l1-dcache-misses # 40.49% of all L1-dcache hits (scaled from 86.30%) 2,978,756,779 l1-dcache-refs # 298.512 M/sec (scaled from 70.02%) 8,861,956,159 instructions # 0.30 insns per cycle # 0.93 stalled cycles per insn (scaled from 84.27%) 1,644,306,068 branches # 164.782 M/sec (scaled from 86.43%) 74,778,443 branch-misses # 4.55% of all branches (scaled from 70.69%) 9978.695711 task-clock # 0.693 CPUs utilized 14.404347983 seconds time elapsed And color the result depending on the severity of cache-trashing. Acked-by: Peter Zijlstra Acked-by Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-54gmz0zymaid84zcs7joq02p@git.kernel.org Signed-off-by: Ingo Molnar commit c78df6c1d49b5d798f1579141e3a12be7c325d1e Author: Ingo Molnar Date: Wed Apr 27 12:16:10 2011 +0200 perf stat: Print branch misses warning colors Print the missed-branches percentage with different warning level ASCII colors, as the percentage passes the 5%/10%/20% thresholds. These thresholds are set to relatively low levels, because on most CPUs even a moderate percentage of branch-misses already shows up as a slowdown. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-ybqukg7p86leiup7gl03ecgk@git.kernel.org Signed-off-by: Ingo Molnar commit a5d243d04a150acbfa79d641154f49e5d920f64f Author: Ingo Molnar Date: Wed Apr 27 05:39:24 2011 +0200 perf stat: Print stalled cycles warning colors Print the stalled-cycles percentage with different warning level ASCII colors, as the percentage passes the 25%/50%/75% thresholds. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-e25zz44rcms7mu9az4fu5zp0@git.kernel.org Signed-off-by: Ingo Molnar commit f99844cb76b7d347711c22cdcb94266b7214141f Author: Ingo Molnar Date: Wed Apr 27 05:35:39 2011 +0200 perf stat: Fix -nan% output in perf stat noise printouts Before: 0 CPU-migrations # 0.000 M/sec ( +- -nan% ) After: 0 CPU-migrations # 0.000 M/sec ( +- 0.00% ) Also factor out the noise printing function. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-z89h2v1bk1mikcbsf7e6v34q@git.kernel.org Signed-off-by: Ingo Molnar commit 1fc570ad89e55dc32dfa4dda1311948b38f26524 Author: Ingo Molnar Date: Wed Apr 27 05:20:22 2011 +0200 perf stat: Add stalled cycles to the default output The new default output looks like this: Performance counter stats for './loop_1b_instructions': 236.010686 task-clock # 0.996 CPUs utilized 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 99 page-faults # 0.000 M/sec 756,487,646 cycles # 3.205 GHz 354,938,996 stalled-cycles # 46.92% of all cycles are idle 1,001,403,797 instructions # 1.32 insns per cycle # 0.35 stalled cycles per insn 100,279,773 branches # 424.895 M/sec 12,646 branch-misses # 0.013 % of all branches 0.236902540 seconds time elapsed We dropped cache-refs and cache-misses and added stalled-cycles - this is a more generic "how well utilized is the CPU" metric. If the stalled-cycles ratio is too high then more specific measurements can be taken to figure out the source of the inefficiency. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-pbpl2l4mn797s69bclfpwkwn@git.kernel.org Signed-off-by: Ingo Molnar commit 481f988a016f7a0327a5537bde4794349fc4625c Author: Ingo Molnar Date: Wed Apr 27 04:34:16 2011 +0200 perf stat: Add stalled cycles accounting, prettify the resulting output Add stalled cycles accounting and use it to print the "cycles stalled per instruction" value. Also change the unit of the cycles output from M/sec to GHz - this is more intuitive. Prettify the output to: Performance counter stats for './loop_1b_instructions': 239.775036 task-clock # 0.997 CPUs utilized 761,903,912 cycles # 3.178 GHz 356,620,620 stalled-cycles # 46.81% of all cycles are idle 1,001,578,351 instructions # 1.31 insns per cycle # 0.36 stalled cycles per insn 14,782 cache-references # 0.062 M/sec 5,694 cache-misses # 38.520 % of all cache refs 0.240493656 seconds time elapsed Also adjust the --repeat output to make the percentages align vertically: Performance counter stats for './loop_1b_instructions' (10 runs): 236.096793 task-clock # 0.997 CPUs utilized ( +- 0.011% ) 756,553,086 cycles # 3.204 GHz ( +- 0.002% ) 354,942,692 stalled-cycles # 46.92% of all cycles are idle ( +- 0.008% ) 1,001,389,700 instructions # 1.32 insns per cycle # 0.35 stalled cycles per insn ( +- 0.000% ) 10,166 cache-references # 0.043 M/sec ( +- 0.742% ) 468 cache-misses # 4.608 % of all cache refs ( +- 13.385% ) 0.236874136 seconds time elapsed ( +- 0.01% ) Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-uapziqny39601apdmmhoz7hk@git.kernel.org Signed-off-by: Ingo Molnar commit dcd9936a5a6d89512b5323c1145647f2dbe0236f Author: Ingo Molnar Date: Wed Apr 27 04:36:37 2011 +0200 perf stat: Factor our shadow stats Create update_shadow_stats() which is then used in both read_counter_aggr() and read_counter(). This not only simplifies the code but also fixes a bug: HW_CACHE_REFERENCES was not updated in read_counter(). Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-9uc55z3g88r47exde7zxjm6p@git.kernel.org Signed-off-by: Ingo Molnar commit 749141d926faf23ef811686a8050e7cf13dc223f Author: Ingo Molnar Date: Wed Apr 27 04:24:57 2011 +0200 perf stat: Make all displayed event names parseable as well Right now we display this by default: 0.202204 task-clock-msecs # 0.282 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 85 page-faults # 0.420 M/sec The task-clock-msecs event cannot actually be passed back as an event name, the event name we recognize is 'task-clock'. So change the output of the cpu-clock and task-clock events to be idempotent. ( Units should be printed out in the right-side column, if needed. ) Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-lexrnbzy09asscgd4f7oac4i@git.kernel.org Signed-off-by: Ingo Molnar commit ceb53fbf6dbb1df26d38379a262c6981fe73dd36 Author: Ingo Molnar Date: Wed Apr 27 04:06:33 2011 +0200 perf stat: Fail more clearly when an invalid modifier is specified Currently we fail without printing any error message on "perf stat -e task-clock-msecs". The reason is that the task-clock event is matched and the "-msecs" postfix is assumed to be an event modifier - but is not recognized. This patch changes the code to be more informative: $ perf stat -e task-clock-msecs true invalid event modifier: '-msecs' Run 'perf list' for a list of valid events and modifiers And restructures the return value of parse_event_modifier() to allow the printing of all variants of invalid event modifiers. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-wlaw3dvz1ly6wple8l52cfca@git.kernel.org Signed-off-by: Ingo Molnar commit b908debd4eef91471016138569f7a9e292be682e Author: Ingo Molnar Date: Wed Apr 27 03:55:40 2011 +0200 perf tools: Accept case-insensitive symbolic event variants We currently fail on something like '-e CPU-migrations', with: invalid or unsupported event: 'CPU-migrations' While 'CPU-migrations' is how we actually print out the event in the default perf stat output: Performance counter stats for 'true': 0.202204 task-clock-msecs # 0.282 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec So change the matching to be case-insensitive. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-omcm3edjjtx83a4kh2e244se@git.kernel.org Signed-off-by: Ingo Molnar commit d58f4c82fed69fdd4a79fa54fe17fd14d98c27aa Author: Ingo Molnar Date: Wed Apr 27 03:42:18 2011 +0200 perf stat: Print cache misses as percentage Before: 113,393,041 cache-references # 83.636 M/sec 7,052,454 cache-misses # 5.202 M/sec After: 112,589,441 cache-references # 87.925 M/sec 6,556,354 cache-misses # 5.823 % misses/hits percentages are more expressive than absolute numbers or rates. (Also prettify the CPUs printout line to not have a trailing whitespace.) Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-axm28f43x439bl41zkvfzd63@git.kernel.org Signed-off-by: Ingo Molnar commit 11ba2b85f506306c8dfc9fe144aa4ddc43242855 Author: Ingo Molnar Date: Sun Apr 24 15:05:10 2011 +0200 perf stat: Print stalled cycles percentage Print: 611,527 cycles 400,553 instructions # ( 0.71 instructions per cycle ) 77,809 stalled-cycles # ( 12.71% of all cycles ) 0.000610987 seconds time elapsed Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Signed-off-by: Ingo Molnar Link: http://lkml.kernel.org/n/tip-fd6x8r1cpyb6zhlrc4ix8m45@git.kernel.org commit 5c543e3c442d6382db127152c7096ca6a55283de Author: Ingo Molnar Date: Wed Apr 27 12:02:04 2011 +0200 perf events, x86: Mark constrant tables read mostly Various constraint tables were not marked read-mostly. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-wpqwwvmhxucy5e718wnamjiv@git.kernel.org Signed-off-by: Ingo Molnar commit 94403f8863d0d1d2005291b2ef0719c2534aa303 Author: Ingo Molnar Date: Sun Apr 24 08:18:31 2011 +0200 perf events: Add stalled cycles generic event - PERF_COUNT_HW_STALLED_CYCLES The new PERF_COUNT_HW_STALLED_CYCLES event tries to approximate cycles the CPU does nothing useful, because it is stalled on a cache-miss or some other condition. Acked-by: Peter Zijlstra Acked-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Link: http://lkml.kernel.org/n/tip-fue11vymwqsoo5to72jxxjyl@git.kernel.org Signed-off-by: Ingo Molnar commit 7bd5fafeb414cf00deee32c82834f8bf1426b9ac Merge: fa7b694 ec75a71 Author: Ingo Molnar Date: Tue Apr 26 19:36:14 2011 +0200 Merge branch 'perf/urgent' into perf/stat Merge reason: We want to queue up dependent changes. Signed-off-by: Ingo Molnar commit 1437f5bca3c2d162f058cba37dfbeb20f619040d Author: Hillf Danton Date: Sat Apr 23 21:29:05 2011 +0800 sched: Remove noop in alloc_rt_sched_group() The rq varible, though computed for each possible cpu, has nothing to do in the function, so it can be removed. This also eliminates a build warning. Signed-off-by: Hillf Danton Reviewed-by: Yong Zhang Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/BANLkTin-FfQfqW5ym1iuEmrk8s777Y1LAg@mail.gmail.com Signed-off-by: Ingo Molnar commit e7e09cd667a43d8287f85d453a16fc0ec1e2c7b7 Author: Jonathan Cameron Date: Tue Apr 19 12:43:47 2011 +0100 params.c: Use new strtobool function to process boolean inputs No functional changes. Signed-off-by: Jonathan Cameron Signed-off-by: Greg Kroah-Hartman commit 8705b48e7159655c116154928fe104fd6561fa94 Author: Jonathan Cameron Date: Tue Apr 19 12:43:46 2011 +0100 debugfs: move to new strtobool No functional changes requires that we eat errors from strtobool. If people want to not do this, then it should be fixed at a later date. V2: Simplification suggested by Rusty Russell removes the need for additional variable ret. Signed-off-by: Jonathan Cameron Signed-off-by: Greg Kroah-Hartman commit ad58671cf32c74a8d6e8f51e63e9cf4e7a73bf1e Author: Jonathan Cameron Date: Tue Apr 19 12:43:45 2011 +0100 Add a strtobool function matching semantics of existing in kernel equivalents This is a rename of the usr_strtobool proposal, which was a renamed, relocated and fixed version of previous kstrtobool RFC Signed-off-by: Jonathan Cameron Signed-off-by: Greg Kroah-Hartman commit 85723943537bf6a73bdf1140b2088fbe0c17c3c2 Author: Wanlong Gao Date: Sat Apr 23 22:18:26 2011 +0800 drivers:base:fix the coding format of memory.c Fix the line longer than 80 of memory_uevent function . Signed-off-by: Wanlong Gao Signed-off-by: Greg Kroah-Hartman commit 67f9cbf9affe39f67cd3f1d2e2a2a43089d9ab3a Author: Rafael J. Wysocki Date: Fri Apr 22 22:03:31 2011 +0200 PM / Blackfin: Use struct syscore_ops instead of sysdevs for PM Convert some Blackfin architecture's code to using struct syscore_ops objects for power management instead of sysdev classes and sysdevs. This simplifies the code and reduces the kernel's memory footprint. It also is necessary for removing sysdevs from the kernel entirely in the future. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman Acked-by: Mike Frysinger commit bb072c3cf21d1c9a5a2eeb5a00679ee7bf39675b Author: Rafael J. Wysocki Date: Fri Apr 22 22:03:21 2011 +0200 ARM / Samsung: Use struct syscore_ops for "core" power management Replace sysdev classes and struct sys_device objects used for "core" power management by Samsung platforms with struct syscore_ops objects that are simpler. This generally reduces the code size and the kernel memory footprint. It also is necessary for removing sysdevs entirely from the kernel in the future. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman Acked-by: Kukjin Kim commit 2eaa03b5bebd1e80014f780d7bf27c3e66daefd6 Author: Rafael J. Wysocki Date: Fri Apr 22 22:03:11 2011 +0200 ARM / PXA: Use struct syscore_ops for "core" power management Replace sysdev classes and struct sys_device objects used for "core" power management by the PXA platform code with struct syscore_ops objects that are simpler. This reduces the code size and the kernel memory footprint. It also is necessary for removing sysdevs entirely from the kernel in the future. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman commit 905339807bde7bb726001b69fbdf69ab0cf69a9e Author: Rafael J. Wysocki Date: Fri Apr 22 22:03:03 2011 +0200 ARM / SA1100: Use struct syscore_ops for "core" power management Replace the sysdev class and struct sys_device used for power management by the SA1100 interrupt-handling code with a struct syscore_ops object which is simpler. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman commit b7808056141bc4d67213036921a5a685ebec0274 Author: Rafael J. Wysocki Date: Fri Apr 22 22:02:55 2011 +0200 ARM / Integrator: Use struct syscore_ops for core PM Replace the sysdev class and struct sys_device used for power management by the Integrator interrupt-handling code with a struct syscore_ops object which is simpler. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman commit 3c437ffd20329619672b12a97bee944bccdd4ec9 Author: Rafael J. Wysocki Date: Fri Apr 22 22:02:46 2011 +0200 ARM / OMAP: Use struct syscore_ops for "core" power management Replace the sysdev class and struct sys_device used for power management in the OMAP's GPIO code with a struct syscore_ops object which is simpler. Signed-off-by: Rafael J. Wysocki Acked-by: Kevin Hilman Acked-by: Greg Kroah-Hartman commit 328f5cc30290a92ea3ca62b2a63d2b9ebcb0d334 Author: Rafael J. Wysocki Date: Fri Apr 22 22:02:33 2011 +0200 ARM: Use struct syscore_ops instead of sysdevs for PM in common code Convert some ARM architecture's common code to using struct syscore_ops objects for power management instead of sysdev classes and sysdevs. This simplifies the code and reduces the kernel's memory footprint. It also is necessary for removing sysdevs from the kernel entirely in the future. Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman commit 15d6aba24d88231415f4e7e091c0f1e60c3e6fd5 Author: Avi Kivity Date: Sun Apr 24 11:11:51 2011 +0300 x86: Demacro CONFIG_PARAVIRT cpu accessors Recently, we had a build failure on !CONFIG_PARAVIRT due to a callback ->wbinvd() clashing with a macro wbinvd(). While we worked around the issue, avoid it in the future by changing the macro (and a few surrounding ones) to an inline function. Signed-off-by: Avi Kivity Cc: "H. Peter Anvin" Link: http://lkml.kernel.org/r/1303632711-21662-1-git-send-email-avi@redhat.com Signed-off-by: Ingo Molnar commit 625f2a378e5a10f45fdc37932fc9f8a21676de9e Author: Jonathan Corbet Date: Fri Apr 22 11:19:10 2011 -0600 sched: Get rid of lock_depth Neil Brown pointed out that lock_depth somehow escaped the BKL removal work. Let's get rid of it now. Note that the perf scripting utilities still have a bunch of code for dealing with common_lock_depth in tracepoints; I have left that in place in case anybody wants to use that code with older kernels. Suggested-by: Neil Brown Signed-off-by: Jonathan Corbet Cc: Arnd Bergmann Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110422111910.456c0e84@bike.lwn.net Signed-off-by: Ingo Molnar commit fa7b69475a6c192853949ba496dd9c37b497b548 Author: Justin P. Mattock Date: Fri Apr 22 10:08:52 2011 -0700 perf events, x86, P4: Fix typo in comment Signed-off-by: Justin P. Mattock Acked-by: Cyrill Gorcunov Cc: trivial@kernel.org Link: http://lkml.kernel.org/r/1303492132-3004-1-git-send-email-justinmattock@gmail.com Signed-off-by: Ingo Molnar commit cfefd21e693dca791bf9ecfc9dd3794facad533c Author: Thomas Gleixner Date: Fri Apr 15 22:36:08 2011 +0200 genirq: Add chip suspend and resume callbacks These callbacks are only called in the syscore suspend/resume code on interrupt chips which have been registered via the generic irq chip mechanism. Calling those callbacks per irq would be rather icky, but with the generic irq chip mechanism we can call this per registered chip. Signed-off-by: Thomas Gleixner Cc: linux-arm-kernel@lists.infradead.org commit 7d8280624797bbe2f5170bd3c85c75a8c9c74242 Author: Thomas Gleixner Date: Sun Apr 3 11:42:53 2011 +0200 genirq: Implement a generic interrupt chip Implement a generic interrupt chip, which is configurable and is able to handle the most common irq chip implementations. Signed-off-by: Thomas Gleixner Cc: linux-arm-kernel@lists.infradead.org Tested-by: H Hartley Sweeten Tested-by: Tony Lindgren Tested-by; Kevin Hilman commit 7f1b1244e159a8490d7fb13667c6cb7e1e75046b Author: Paul Mundt Date: Thu Apr 7 06:01:44 2011 +0900 genirq: Support per-IRQ thread disabling. This adds support for disabling threading on a per-IRQ basis via the IRQ status instead of the IRQ flow, which is necessary for interrupts that don't follow the natural IRQ flow channels, such as those that are virtually created. The new APIs added are simply: irq_set_thread() irq_set_nothread() which follow the rest of the IRQ status routines. Chained handlers also have IRQ_NOTHREAD set on them automatically, making the lack of threading explicit rather than implicit. Subsequently, the nothread flag can be viewed through the standard genirq debugging facilities. [ tglx: Fixed cleanup fallout ] Signed-off-by: Paul Mundt Link: http://lkml.kernel.org/r/%3C20110406210135.GF18426%40linux-sh.org%3E Signed-off-by: Thomas Gleixner commit 770767787c23040dc152e7ae230597ff55b39470 Author: Geert Uytterhoeven Date: Sun Apr 10 11:01:52 2011 +0200 genirq: irq_desc: Document preflow_handler and affinity_hint [ tglx: Filled in the FIXME place holders ] Signed-off-by: Geert Uytterhoeven Link: http://lkml.kernel.org/r/%3C1302426113-13808-2-git-send-email-geert%40linux-m68k.org%3E Signed-off-by: Thomas Gleixner commit ee430599bf63c13ee521a352f562a4281cba5e61 Author: Geert Uytterhoeven Date: Sun Apr 10 11:01:53 2011 +0200 genirq: Update DocBook comments Fix some parts to match the actual code. [ tglx: Resolved the FIXMEs Gerd put in ] Signed-off-by: Geert Uytterhoeven Link: http://lkml.kernel.org/r/%3C1302426113-13808-3-git-send-email-geert%40linux-m68k.org%3E Signed-off-by: Thomas Gleixner Cc: linux-doc@vger.kernel.org commit 0911f124bf55357803d53197cc1ae5479f5e37e2 Author: Geert Uytterhoeven Date: Sun Apr 10 11:01:51 2011 +0200 genirq: Forgotten updates/deletions after removal of compat code commit 0c6f8a8b917ad361319c8ace3e9f28e69bfdb4c1 ("genirq: Remove compat code") removed the compat code, but forgot to update some references in comments and delete some of its documentation. Signed-off-by: Geert Uytterhoeven Link: http://lkml.kernel.org/r/%3C1302426113-13808-1-git-send-email-geert%40linux-m68k.org%3E Signed-off-by: Thomas Gleixner commit c8705082404823a5bb3e02a32ba0764399b9e6f2 Author: Uwe Kleine-König Date: Wed Apr 20 09:44:46 2011 +0200 driver core: let dev_set_drvdata return int instead of void as it can fail Before commit b402843 (Driver core: move dev_get/set_drvdata to drivers/base/dd.c) calling dev_set_drvdata with dev=NULL was an unchecked error. After some discussion about what to return in this case removing the check (and so producing a null pointer exception) seems fine. Signed-off-by: Uwe Kleine-König Signed-off-by: Greg Kroah-Hartman commit 4a03d6f7c863a039b937649a93341615f531358e Author: Uwe Kleine-König Date: Wed Apr 20 09:44:45 2011 +0200 driver core/platform_device_add_resources: free resource before overwriting Reviewed-by: Viresh Kumar Signed-off-by: Uwe Kleine-König Signed-off-by: Greg Kroah-Hartman commit cea896238fbfdbce254f51fc8fd78c59df50081f Author: Uwe Kleine-König Date: Wed Apr 20 09:44:44 2011 +0200 driver core/platform_device_add_resources: set resource to NULL if !res This makes the res = NULL case more consistant to the res != NULL case as now both overwrite pdev->resource. Reviewed-by: Viresh Kumar Signed-off-by: Uwe Kleine-König Signed-off-by: Greg Kroah-Hartman commit 251e031d132ea3d03e0a32f2240c67f449979c5d Author: Uwe Kleine-König Date: Wed Apr 20 09:44:43 2011 +0200 driver core/platform_device_add_data: free platform data before overwriting Reviewed-by: Viresh Kumar Signed-off-by: Uwe Kleine-König Signed-off-by: Greg Kroah-Hartman commit 27a33f9e8fb203e71925257cf039fe6ec623c5d1 Author: Uwe Kleine-König Date: Wed Apr 20 09:44:42 2011 +0200 driver core/platform_device_add_data: set platform_data to NULL if !data This makes the data = NULL case more consistent to the data != NULL case. The functional change is that now platform_device_add_data(somepdev, NULL, somesize) sets pdev->dev.platform_data to NULL instead of not touching it. Reviewed-by: Viresh Kumar Signed-off-by: Uwe Kleine-König Signed-off-by: Greg Kroah-Hartman commit 7f100d1566d6ee353a43be92599511fc438ec281 Author: Karthigan Srinivasan Date: Mon Apr 18 16:16:52 2011 -0500 drivers/base/core.c: Fixed brace coding style issue. Fixed brace coding style issue. Signed-off-by: Karthigan Srinivasan Signed-off-by: Greg Kroah-Hartman commit 8497d6a21c4b17052e868bd53a74c82b557a6c46 Author: Sebastian Ott Date: Tue Apr 12 19:05:37 2011 +0200 driver-core: fix race between device_register and driver_register When a device is registered to a bus it will be a) added to the list of devices of the bus and b) bind to a driver (if one matches). As a result of a driver being registered at this bus between a) and b) this device could already be bound to a driver. This leads to a warning and incorrect refcounting. To fix this add a check to device_attach to identify an already bound device. Signed-off-by: Sebastian Ott Signed-off-by: Greg Kroah-Hartman commit fc2711992b8601c20b7cc078f533e55c3106fbd4 Author: Pavan Savoy Date: Fri Apr 8 04:57:43 2011 -0500 drivers:misc:ti-st: remove rfkill dependency rfkill is no longer used by Texas Instruments shared transport driver to communicate with user-space. This patch removes the dependency of rfkill to be enabled to build shared transport driver in the Kconfig. Signed-off-by: Pavan Savoy Signed-off-by: Greg Kroah-Hartman commit 764b0c4b3256ad4431cb52eaf99c0abe6df0a085 Author: Pavan Savoy Date: Fri Apr 8 04:57:42 2011 -0500 drivers:misc:ti-st: handle delayed tty receive When certain technologies shutdown their interface without waiting for the acknowledgement from the chip. The receive_buf from the TTY would be invoked a while after the relevant technology is unregistered. This patch introduces a new flag "is_registered" which maintains the state of protocols BT, FM or GPS and thereby removes the need to clear the protocol data from ST when protocols gets unregistered. This fixes corner cases when HCI RESET is sent down from bluetooth stack and the receive_buf is called from tty after 250ms before which bluetooth would have unregistered from the system. OR - when FM application decides to close down the device without sending a power-off FM command resulting in some RDS data or interrupt data coming in after the driver is unregistered. Signed-off-by: Pavan Savoy Signed-off-by: Greg Kroah-Hartman commit e0944ee63f7249802be74454cef81c97630ae1cd Author: Steven Rostedt Date: Wed Apr 20 21:42:00 2011 -0400 lockdep: Remove cmpxchg to update nr_chain_hlocks For some reason nr_chain_hlocks is updated with cmpxchg, but this is performed inside of the lockdep global "grab_lock()", which also makes simple modification of this variable atomic. Remove the cmpxchg logic for updating nr_chain_hlocks and simplify the code. Signed-off-by: Steven Rostedt Acked-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110421014300.727863282@goodmis.org Signed-off-by: Ingo Molnar commit 282b5c2f6f663c008444321fd8fcdd374596046b Author: Steven Rostedt Date: Wed Apr 20 21:41:59 2011 -0400 lockdep: Print a nicer description for simple irq lock inversions Lockdep output can be pretty cryptic, having nicer output can save a lot of head scratching. When a simple irq inversion scenario is detected by lockdep (lock A taken in interrupt context but also in thread context without disabling interrupts) we now get the following (hopefully more informative) output: other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(lockA); lock(lockA); *** DEADLOCK *** Signed-off-by: Steven Rostedt Acked-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110421014300.436140880@goodmis.org Signed-off-by: Ingo Molnar commit 6be8c3935b914dfbc24b27c91c2b6d583645e61a Author: Steven Rostedt Date: Wed Apr 20 21:41:58 2011 -0400 lockdep: Replace "Bad BFS generated tree" message with something less cryptic The message of "Bad BFS generated tree" is a bit confusing. Replace it with a more sane error message. Thanks to Peter Zijlstra for helping me come up with a better message. Signed-off-by: Steven Rostedt Acked-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110421014300.135521252@goodmis.org Signed-off-by: Ingo Molnar commit dad3d7435e1d8c254d6877dc06852dc00c5da812 Author: Steven Rostedt Date: Wed Apr 20 21:41:57 2011 -0400 lockdep: Print a nicer description for irq inversion bugs Irq inversion and irq dependency bugs are only subtly different. The diffenerence lies where the interrupt occurred. For irq dependency: irq_disable lock(A) lock(B) unlock(B) unlock(A) irq_enable lock(B) unlock(B) lock(A) The interrupt comes in after it has been established that lock A can be held when taking an irq unsafe lock. Lockdep detects the problem when taking lock A in interrupt context. With the irq_inversion the irq happens before it is established and lockdep detects the problem with the taking of lock B: lock(A) irq_disable lock(A) lock(B) unlock(B) unlock(A) irq_enable lock(B) unlock(B) Since the problem with the locking logic for both of these issues is in actuality the same, they both should report the same scenario. This patch implements that and prints this: other info that might help us debug this: Chain exists of: &rq->lock --> lockA --> lockC Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(lockC); local_irq_disable(); lock(&rq->lock); lock(lockA); lock(&rq->lock); *** DEADLOCK *** Signed-off-by: Steven Rostedt Acked-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110421014259.910720381@goodmis.org Signed-off-by: Ingo Molnar commit 48702ecf308e53f176c1f6fdc193d622ded54ac0 Author: Steven Rostedt Date: Wed Apr 20 21:41:56 2011 -0400 lockdep: Print a nicer description for simple deadlocks Lockdep output can be pretty cryptic, having nicer output can save a lot of head scratching. When a simple deadlock scenario is detected by lockdep (lock A -> lock A) we now get the following new output: other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(lock)->rlock); lock(&(lock)->rlock); *** DEADLOCK *** Signed-off-by: Steven Rostedt Acked-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110421014259.643930104@goodmis.org Signed-off-by: Ingo Molnar commit f4185812aa046ecb97e8817e10148cacdd7a6baa Author: Steven Rostedt Date: Wed Apr 20 21:41:55 2011 -0400 lockdep: Print a nicer description for normal deadlocks The lockdep output can be pretty cryptic, having nicer output can save a lot of head scratching. When a normal deadlock scenario is detected by lockdep (lock A -> lock B and there exists a place where lock B -> lock A) we now get the following new output: other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(lockB); lock(lockA); lock(lockB); lock(lockA); *** DEADLOCK *** On cases where there's a deeper chair, it shows the partial chain that can cause the issue: Chain exists of: lockC --> lockA --> lockB Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(lockB); lock(lockA); lock(lockB); lock(lockC); *** DEADLOCK *** Signed-off-by: Steven Rostedt Acked-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110421014259.380621789@goodmis.org Signed-off-by: Ingo Molnar commit 3003eba313dd0e0502dd71548c36fe7c19801ce5 Author: Steven Rostedt Date: Wed Apr 20 21:41:54 2011 -0400 lockdep: Print a nicer description for irq lock inversions Locking order inversion due to interrupts is a subtle problem. When an irq lockiinversion discovered by lockdep it currently reports something like: [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] ... and then prints out the locks that are involved, as back traces. Judging by lkml feedback developers were routinely confused by what a HARDIRQ->safe to unsafe issue is all about, and sometimes even blew it off as a bug in lockdep. It is not obvious when lockdep prints this message about a lock that is never taken in interrupt context. After explaining the problems that lockdep is reporting, I decided to add a description of the problem in visual form. Now the following is shown: --- other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(lockA); local_irq_disable(); lock(&rq->lock); lock(lockA); lock(&rq->lock); *** DEADLOCK *** --- The above is the case when the unsafe lock is taken while holding a lock taken in irq context. But when a lock is taken that also grabs a unsafe lock, the call chain is shown: --- other info that might help us debug this: Chain exists of: &rq->lock --> lockA --> lockC Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(lockC); local_irq_disable(); lock(&rq->lock); lock(lockA); lock(&rq->lock); *** DEADLOCK *** Signed-off-by: Steven Rostedt Acked-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110421014259.132728798@goodmis.org Signed-off-by: Ingo Molnar commit 103b3934817a7c42fba6e1ef76ecb390a2837d40 Author: Cyrill Gorcunov Date: Thu Apr 21 11:03:20 2011 -0400 perf, x86: P4 PMU -- Use perf_sample_data_init helper Instead of opencoded assignments better to use perf_sample_data_init helper. Tested-by: Lin Ming Signed-off-by: Cyrill Gorcunov Signed-off-by: Don Zickus Cc: Cyrill Gorcunov Link: http://lkml.kernel.org/r/1303398203-2918-2-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar commit eff430de53be6f3328c3eebe93755f1ecf499e37 Merge: 9cbdb70 91e8549 Author: Ingo Molnar Date: Fri Apr 22 10:19:26 2011 +0200 Merge branch 'linus' into perf/core Merge reason: Pick up upstream fixes. Signed-off-by: Ingo Molnar commit d3bf52e998056a6002b2aecfe1d25486376382ac Author: Rakib Mullick Date: Wed Apr 20 21:27:32 2011 +0600 sched: Remove obsolete comment from scheduler_tick() scheduler_tick() is no longer called by fork code - this got discarded a long time ago by commit bc947631d1d532 ("sched: improve efficiency of sched_fork()"). So, remove the comment which still claims otherwise. Signed-off-by: Rakib Mullick Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/BANLkTimO4iGP0QpaHO1HHF1QOnVcQpc0cw@mail.gmail.com Signed-off-by: Ingo Molnar commit 42ac9e87fdd89b77fa2ca0a5226023c1c2d83226 Merge: 057f3fa f0e615c Author: Ingo Molnar Date: Thu Apr 21 11:39:21 2011 +0200 Merge commit 'v2.6.39-rc4' into sched/core Merge reason: Pick up upstream fixes. Signed-off-by: Ingo Molnar commit dffa4b2f62ff28c982144c7033001b1ece4d3532 Author: Borislav Petkov Date: Wed Apr 20 12:23:49 2011 +0200 x86, mce: Drop the default decoding notifier The default notifier doesn't make a lot of sense to call in the correctable errors case. Drop it and emit the mcelog decoding hint only in the uncorrectable errors case and when no notifier is registered. Also, limit issuing the "mcelog --ascii" message in the rare case when we dump unreported CEs before panicking. While at it, remove unused old x86_mce_decode_callback from the header. Signed-off-by: Borislav Petkov Signed-off-by: Prarit Bhargava Cc: Tony Luck Cc: Nagananda Chumbalkar Cc: Russ Anderson Link: http://lkml.kernel.org/r/20110420102349.GB1361@aftab Signed-off-by: Ingo Molnar commit 8a91707d0a1a49193e23cb2d243632f2289feb24 Author: Konrad Rzeszutek Wilk Date: Wed Apr 20 11:54:10 2011 -0400 xen/p2m: Add EXPORT_SYMBOL_GPL to the M2P override functions. If the backends, which use these two functions, are compiled as a module we need these two functions to be exported. Signed-off-by: Konrad Rzeszutek Wilk commit 9cbdb702092a2d82f909312f4ec3eeded77bb82e Author: David Ahern Date: Wed Apr 6 21:54:20 2011 -0600 perf script: improve validation of sample attributes for output fields Check for required sample attributes using evsel rather than sample_type in the session header. If the attribute for a default field is not present for the event type (e.g., new command operating on file from older kernel) the field is removed from the output list. Expected event types must exist. For example, if a user specifies -f trace:time,trace -f sw:time,cpu,sym the perf.data file must contain both tracepoints and software events (ie., it is an error if either does not exist in the file). Attribute checking is done once at the beginning of perf-script rather than for each sample. v1 -> v2: - addressed comments from acme Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1302148460-570-1-git-send-email-daahern@cisco.com Signed-off-by: David Ahern Signed-off-by: Arnaldo Carvalho de Melo commit 70a5f52165bd04cf3b33f30d5d234be28dcf29d4 Author: Andrew Morton Date: Tue Apr 12 16:13:31 2011 -0700 kmsg: properly support writev to avoid interleaved printk lines fix make `len' size_t, avoid multiple-assignments. Cc: Kay Sievers Cc: Lennart Poettering Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 7e5b58bcbcb3d7518389c1d82fb6e926f5a9f72c Author: Kay Sievers Date: Thu Apr 7 04:29:20 2011 +0200 printk: /dev/kmsg - properly support writev() to avoid interleaved printk() lines printk: /dev/kmsg - properly support writev() to avoid interleaved printk lines We should avoid calling printk() in a loop, when we pass a single string to /dev/kmsg with writev(). Cc: Lennart Poettering Signed-off-by: Kay Sievers Cc: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 47296b1962ead8301488f0dbe8424c7db7eac635 Author: Jie Zhou Date: Wed Apr 6 14:42:40 2011 +0800 uio: clean uioinfo when uninstall uio driver The uioinfo should be cleaned up when uninstall, otherwise re-install failure of uio_pdrv_genirq.ko will happen. Signed-off-by: Jie Zhou Signed-off-by: Aisheng Dong Signed-off-by: Hans J. Koch Signed-off-by: Greg Kroah-Hartman commit c6edc42fe1b5562abae22beabbebd9e557527ae3 Author: Hillf Danton Date: Thu Mar 31 20:38:47 2011 +0800 uio: fix allocating minor id for uio device The number of uio devices that could be used should be less than UIO_MAX_DEVICES by design, and this work guards any cases in which id more than UIO_MAX_DEVICES is utilized. Signed-off-by: Hillf Danton Signed-off-by: Hans J. Koch Signed-off-by: Greg Kroah-Hartman commit f0c554fddd3be561542cd37acdb3adc9ec5483ee Author: Hillf Danton Date: Mon Mar 28 23:33:26 2011 +0200 uio: fix finding mm index for vma When finding mm index for vma it looks more flexible that the mm could be sparse, and both the size of mm and the pgoff of vma could give correct selection. Signed-off-by: Hillf Danton Signed-off-by: Hans J. Koch Signed-off-by: Greg Kroah-Hartman commit d8408aef910b5d538ae07218992b270a9e01067f Author: Daniel Trautmann Date: Mon Mar 21 15:36:35 2011 +0100 uio_netx: Add support for netPLC cards This patch adds support for Hilscher / IBHsoftec netPLC cards to uio_netx userspace IO driver. Changes from v1 -> v2: Fixed whitespace errors reported by scripts/checkpatch.pl which were caused by email client. Signed-off-by: Daniel Trautmann Signed-off-by: "Hans J. Koch" Signed-off-by: Greg Kroah-Hartman commit 5934b5f3b0116858a5f950abcac344ddee054b69 Author: Tsugikazu Shibata Date: Mon Apr 4 11:47:36 2011 +0900 HOWTO: sync up Documentaion/ja_JP/HOWTO Signed-off-by: Tsugikazu Shibata Signed-off-by: Greg Kroah-Hartman commit 088ab0b4d855d68a0f0c16b72fb8e492a533aaa1 Author: Ludwig Nussel Date: Mon Feb 28 15:57:17 2011 +0100 kernel/ksysfs.c: expose file_caps_enabled in sysfs A kernel booted with no_file_caps allows to install fscaps on a binary but doesn't actually honor the fscaps when running the binary. Userspace currently has no sane way to determine whether installing fscaps actually has any effect. Since parsing /proc/cmdline is fragile this patch exposes the current setting (1 or 0) via /sys/kernel/fscaps Signed-off-by: Ludwig Nussel Signed-off-by: Greg Kroah-Hartman commit aed65af1cc2f6fc9ded5a8158f1405a02cf6d2ff Author: Stephen Hemminger Date: Mon Mar 28 09:12:52 2011 -0700 drivers: make device_type const The device_type structure does not contain data that changes during usage and should be const. This allows devices to declare the struct const. I have patches to change all the subsystems, but need the infra structure change first. Signed-off-by: Stephen Hemminger Signed-off-by: Greg Kroah-Hartman commit 695cca8763078c743417886e80af8ccb78bec9f2 Author: Mike Waychison Date: Fri Mar 18 16:50:03 2011 -0700 firmware: Fix grammar in sysfs-firmware-dmi doc Fix the grammar in describing the position attribute of DMI entries in the dmi-sysfs module. While here, make a couple other small clarifying fixups to the docs. Reported-by: Signed-off-by: Mike Waychison Signed-off-by: Greg Kroah-Hartman commit 3116aabc81ccfeeb73f183ed8b1e3031520d1e59 Author: Dan Carpenter Date: Fri Mar 18 10:12:38 2011 +0300 efivars: handle errors from register_efivars() We should unwind and return an error if register_efivars() fails. Signed-off-by: Dan Carpenter Acked-by: Mike Waychison Signed-off-by: Greg Kroah-Hartman commit 051d51bc6a867d9466a975e4d7ca51b21a9c2c4e Author: Dan Carpenter Date: Fri Mar 18 10:12:14 2011 +0300 efivars: memory leak on error in create_efivars_bin_attributes() This is a cut and paste bug. We intended to free ->del_var and ->new_var but we only free ->new_var. Signed-off-by: Dan Carpenter Acked-by: Mike Waychison Signed-off-by: Greg Kroah-Hartman commit bcdd323b893ad3c9b7ef26b5e4a0bef974238501 Author: Felipe Balbi Date: Wed Mar 16 15:59:35 2011 +0200 device: add dev_WARN_ONCE it's quite useful to print the device name on the stack dump caused by WARN(), but there are other cases where we might want to use WARN_ONCE. Introduce a helper similar to dev_WARN() for that case too. Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 7b70bd3441437b7bc04fc9d321e17c8ed0e8f958 Author: Borislav Petkov Date: Mon Apr 18 16:00:21 2011 +0200 x86, MCE: Do not taint when handling correctable errors Correctable errors are considered something rather normal on modern hardware these days. Even more importantly, correctable errors mean exactly that - they've been corrected by the hardware - and there's no need to taint the kernel since execution hasn't been compromised so far. Also, drop tainting in the thermal throttling code for a similar reason: crossing a thermal threshold does not mean corruption. Signed-off-by: Borislav Petkov Acked-by: Tony Luck Acked-by: Nagananda Chumbalkar Cc: Prarit Bhargava Cc: Russ Anderson Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/1303135222-17118-1-git-send-email-bp@amd64.org Signed-off-by: Ingo Molnar commit 2b398bd9f8f73be706b41adcbb240ce95793049a Author: Youquan Song Date: Thu Apr 14 14:36:08 2011 +0800 x86, apic: Print verbose error interrupt reason on apic=debug End users worry about the error interrupt printout we generate currently: pr_debug("APIC error on CPU%d: %02x(%02x)\n", smp_processor_id(), v , v1); ... and would like to know the reason why error interrupts are generated. This patch prints out more detailed debug information. Another practical problem is that dynamic debug is not initialized yet when the APIC initializes, so the pr_debug() will not output the error interrupt debug information on bootup. In this patch, we use apic_printk(APIC_DEBUG, ...), so the apic=debug boot option will print verbose error interupts during bootup. Signed-off-by: Youquan Song Cc: Joe Perches Cc: hpa@linux.intel.com Cc: suresh.b.siddha@intel.com Cc: yong.y.wang@linux.intel.com Cc: jbaron@redhat.com Cc: trenn@suse.de Cc: kent.liu@intel.com Cc: chaohong.guo@intel.com Link: http://lkml.kernel.org/r/1302762968-24380-2-git-send-email-youquan.song@intel.com Signed-off-by: Ingo Molnar commit 0817a6a3a4fc7c069111083351ca33a78da2a0c9 Author: Arun Sharma Date: Thu Apr 14 10:38:18 2011 -0700 perf script: Add support for PERF_TYPE_RAW Useful for getting stack traces for hardware events not handled by PERF_TYPE_HARDWARE. Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Tom Zanussi Signed-off-by: Arun Sharma Link: http://lkml.kernel.org/n/tip-qimdcdpekjqxuyqovy4kjusx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo commit f18568aae5612ab37f20e5f383d6154ea69c9dfc Author: Michael Witten Date: Tue Apr 12 20:30:13 2011 +0000 perf tools: git mv tools/perf/{features-tests.mak,config/} Signed-off-by: Michael Witten Link: http://lkml.kernel.org/n/tip-a6zhefjayuounko1tk5sjji2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo commit 7fbd065f5a2b299172502f09fc3fbde02b48f591 Author: Michael Witten Date: Tue Apr 12 20:27:59 2011 +0000 perf tools: Move `try-cc' The `try-cc' user-defined function was in tools/perf/feature-tests.mak; this commit moves it to tools/perf/config/utilities.mak. Signed-off-by: Michael Witten Link: http://lkml.kernel.org/n/tip-bqhwcuxsrve0iodn6q4ejaoi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo commit ced465c400b23656ef2c4fbfb4add0e5b92e3d97 Author: Michael Witten Date: Sat Apr 2 21:46:09 2011 +0000 perf tools: Makefile: PYTHON{,_CONFIG} to bandage Python 3 incompatibility Currently, Python 3 is not supported by perf's code; this can cause the build to fail for systems that have Python 3 installed as the default python: python{,-config} The Correct Solution is to write compatibility code so that Python 3 works out-of-the-box. However, users often have an ancillary Python 2 installed: python2{,-config} Therefore, a quick fix is to allow the user to specify those ancillary paths as the python binaries that Makefile should use, thereby avoiding Python 3 altogether; as an added benefit, the Python binaries may be installed in non-standard locations without the need for updating any PATH variable. This commit adds the ability to set PYTHON and/or PYTHON_CONFIG either as environment variables or as make variables on the command line; the paths may be relative, and usually only PYTHON is necessary in order for PYTHON_CONFIG to be defined implicitly. Some rudimentary error checking is performed when the user explicitly specifies a value for any of these variables. In addition, this commit introduces significantly robust makefile infrastructure for working with paths and communicating with the shell; it's currently only used for handling Python, but I hope it will prove useful in refactoring the makefiles. Thanks to: Raghavendra D Prabhu for motivating this patch. Acked-by: Raghavendra D Prabhu Link: http://lkml.kernel.org/r/e987828e-87ec-4973-95e7-47f10f5d9bab-mfwitten@gmail.com Signed-off-by: Michael Witten Signed-off-by: Arnaldo Carvalho de Melo commit 3643b133f2cb8023e8cedcbef43215a99d7df561 Author: Michael Witten Date: Sat Apr 9 01:12:56 2011 +0000 perf tools: Makefile: Clean up `python/perf.so' rule There is no need for a subshell or an explicit `export'; as per the POSIX Shell Command Language specification: http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_09_01 http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_10_02 It is only necessary to include the environment variable assignment just before the command to be run. Also, it is better to use single-quotes, because GNU make might expand `$(BASIC_CFLAGS)' into something that the shell could interpret within double-quotes. Acked-by: Raghavendra D Prabhu Link: http://lkml.kernel.org/n/tip-58n38o02ocuzrm9qh096hsf5@git.kernel.org Signed-off-by: Michael Witten Signed-off-by: Arnaldo Carvalho de Melo commit aeafcbaf4fcfeb74aeed65609ea5ead48dfc09f8 Author: Arnaldo Carvalho de Melo Date: Thu Mar 31 10:56:28 2011 -0300 perf symbols: Give more useful names to 'self' parameters One more installment on an area that is mostly dormant. Suggested-by: Thomas Gleixner Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Tom Zanussi LKML-Reference: Signed-off-by: Arnaldo Carvalho de Melo commit 057f3fadb347e9c51b07e1b277bbdda79f976768 Author: Peter Zijlstra Date: Mon Apr 18 11:24:34 2011 +0200 sched: Fix sched_domain iterations vs. RCU Vladis Kletnieks reported a new RCU debug warning in the scheduler. Since commit dce840a08702b ("sched: Dynamically allocate sched_domain/ sched_group data-structures") the sched_domain trees are protected by RCU instead of RCU-sched. This means that we need to include rcu_read_lock() protection when we iterate them since disabling preemption doesn't suffice anymore. Reported-by: Valdis.Kletnieks@vt.edu Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1302882741.2388.241.camel@twins Signed-off-by: Ingo Molnar commit 2f36825b176f67e5c5228aa33d828bc39718811f Author: Venkatesh Pallipadi Date: Thu Apr 14 10:30:53 2011 -0700 sched: Next buddy hint on sleep and preempt path When a task in a taskgroup sleeps, pick_next_task starts all the way back at the root and picks the task/taskgroup with the min vruntime across all runnable tasks. But when there are many frequently sleeping tasks across different taskgroups, it makes better sense to stay with same taskgroup for its slice period (or until all tasks in the taskgroup sleeps) instead of switching cross taskgroup on each sleep after a short runtime. This helps specifically where taskgroups corresponds to a process with multiple threads. The change reduces the number of CR3 switches in this case. Example: Two taskgroups with 2 threads each which are running for 2ms and sleeping for 1ms. Looking at sched:sched_switch shows: BEFORE: taskgroup_1 threads [5004, 5005], taskgroup_2 threads [5016, 5017] cpu-soaker-5004 [003] 3683.391089 cpu-soaker-5016 [003] 3683.393106 cpu-soaker-5005 [003] 3683.395119 cpu-soaker-5017 [003] 3683.397130 cpu-soaker-5004 [003] 3683.399143 cpu-soaker-5016 [003] 3683.401155 cpu-soaker-5005 [003] 3683.403168 cpu-soaker-5017 [003] 3683.405170 AFTER: taskgroup_1 threads [21890, 21891], taskgroup_2 threads [21934, 21935] cpu-soaker-21890 [003] 865.895494 cpu-soaker-21935 [003] 865.897506 cpu-soaker-21934 [003] 865.899520 cpu-soaker-21935 [003] 865.901532 cpu-soaker-21934 [003] 865.903543 cpu-soaker-21935 [003] 865.905546 cpu-soaker-21891 [003] 865.907548 cpu-soaker-21890 [003] 865.909560 cpu-soaker-21891 [003] 865.911571 cpu-soaker-21890 [003] 865.913582 cpu-soaker-21891 [003] 865.915594 cpu-soaker-21934 [003] 865.917606 Similar problem is there when there are multiple taskgroups and say a task A preempts currently running task B of taskgroup_1. On schedule, pick_next_task can pick an unrelated task on taskgroup_2. Here it would be better to give some preference to task B on pick_next_task. A simple (may be extreme case) benchmark I tried was tbench with 2 tbench client processes with 2 threads each running on a single CPU. Avg throughput across 5 50 sec runs was: BEFORE: 105.84 MB/sec AFTER: 112.42 MB/sec Signed-off-by: Venkatesh Pallipadi Acked-by: Rik van Riel Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1302802253-25760-1-git-send-email-venki@google.com Signed-off-by: Ingo Molnar commit 69c80f3e9d3c569f8a3cee94ba1a324b5a7fa6b9 Author: Venkatesh Pallipadi Date: Wed Apr 13 18:21:09 2011 -0700 sched: Make set_*_buddy() work on non-task entities Make set_*_buddy() work on non-task sched_entity, to facilitate the use of next_buddy to cache a group entity in cases where one of the tasks within that entity sleeps or gets preempted. set_skip_buddy() was incorrectly comparing the policy of task that is yielding to be not equal to SCHED_IDLE. Yielding should happen even when task yielding is SCHED_IDLE. This change removes the policy check on the yielding task. Signed-off-by: Venkatesh Pallipadi Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1302744070-30079-2-git-send-email-venki@google.com Signed-off-by: Ingo Molnar commit c8e5910edf8bbe2e5c6c35a4ef2a578cc7893b25 Author: Robert Richter Date: Sat Apr 16 02:27:55 2011 +0200 perf, x86: Use ALTERNATIVE() to check for X86_FEATURE_PERFCTR_CORE Using ALTERNATIVE() when checking for X86_FEATURE_PERFCTR_CORE avoids an extra pointer chase and data cache hit. Signed-off-by: Robert Richter Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1302913676-14352-4-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar commit 68d2cf25d39324c54b5e42de7915c623a0917abe Merge: 176fcc5 5d2cd90 Author: Ingo Molnar Date: Tue Apr 19 07:55:58 2011 +0200 Merge branch 'perf/urgent' into perf/core Merge reason: we'll be queueing up dependent changes. Signed-off-by: Ingo Molnar commit d8d9766c8c29f71c37bc4b74cc9fcf6a192c9bfd Author: H. Peter Anvin Date: Mon Apr 18 15:31:57 2011 -0700 x86, cpu: Change NOP selection for certain Intel CPUs Due to a decoder implementation quirk, some specific Intel CPUs actually perform better with the "k8_nops" than with the SDM-recommended NOPs. For runtime-selected NOPs, if we detect those specific CPUs then use the k8_nops instead of the ones we would normally use. Signed-off-by: H. Peter Anvin Cc: Tejun Heo Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Jason Baron Link: http://lkml.kernel.org/r/1303166160-10315-4-git-send-email-hpa@linux.intel.com commit dc326fca2b640fc41aed7c015d0f456935a66255 Author: H. Peter Anvin Date: Mon Apr 18 15:19:51 2011 -0700 x86, cpu: Clean up and unify the NOP selection infrastructure Clean up and unify the NOP selection infrastructure: - Make the atomic 5-byte NOP a part of the selection system. - Pick NOPs once during early boot and then be done with it. Signed-off-by: H. Peter Anvin Cc: Tejun Heo Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Jason Baron Link: http://lkml.kernel.org/r/1303166160-10315-3-git-send-email-hpa@linux.intel.com commit b1e7734f024c9ce4393016a97c8d821e1f18d9b4 Author: H. Peter Anvin Date: Mon Apr 18 15:18:02 2011 -0700 x86, percpu: Use ASM_NOP4 instead of hardcoding P6_NOP4 For use in assembly constants, use the ASM_NOP* defines. Signed-off-by: H. Peter Anvin Cc: Christoph Lameter Cc: Tejun Heo Link: http://lkml.kernel.org/r/1303166160-10315-2-git-send-email-hpa@linux.intel.com commit c387aa3a1a910ce00b86f3a85082d24f144db256 Author: Joerg Roedel Date: Mon Apr 18 15:45:43 2011 +0200 x86, gart: Don't enforce GART aperture lower-bound by alignment This patch changes the allocation of the GART aperture to enforce only natural alignment instead of aligning it on 512MB. This big alignment was used to force the GART aperture to be over 512MB. This is enforced by using 512MB as the lower-bound address in the allocation range. [ hpa: The actual number 512 MiB needs to be revisited, too. ] Cc: Yinghai Lu Signed-off-by: Joerg Roedel Link: http://lkml.kernel.org/r/1303134346-5805-2-git-send-email-joerg.roedel@amd.com Signed-off-by: H. Peter Anvin commit cf8d91633ddef9e816ccbf3da833c79ce508988d Author: Konrad Rzeszutek Wilk Date: Mon Feb 28 17:58:48 2011 -0500 xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override. We only supported the M2P (and P2M) override only for the GNTMAP_contains_pte type mappings. Meaning that we grants operations would "contain the machine address of the PTE to update" If the flag is unset, then the grant operation is "contains a host virtual address". The latter case means that the Hypervisor takes care of updating our page table (specifically the PTE entry) with the guest's MFN. As such we should not try to do anything with the PTE. Previous to this patch we would try to clear the PTE which resulted in Xen hypervisor being upset with us: (XEN) mm.c:1066:d0 Attempt to implicitly unmap a granted PTE c0100000ccc59067 (XEN) domain_crash called from mm.c:1067 (XEN) Domain 0 (vcpu#0) crashed on cpu#3: (XEN) ----[ Xen-4.0-110228 x86_64 debug=y Not tainted ]---- and crashing us. This patch allows us to inhibit the PTE clearing in the PV guest if the GNTMAP_contains_pte is not set. On the m2p_remove_override path we provide the same parameter. Sadly in the grant-table driver we do not have a mechanism to tell m2p_remove_override whether to clear the PTE or not. Since the grant-table driver is used by user-space, we can safely assume that it operates only on PTE's. Hence the implementation for it to work on !GNTMAP_contains_pte returns -EOPNOTSUPP. In the future we can implement the support for this. It will require some extra accounting structure to keep track of the page[i], and the flag. [v1: Added documentation details, made it return -EOPNOTSUPP instead of trying to do a half-way implementation] Signed-off-by: Konrad Rzeszutek Wilk commit 6ddafdaab3f809b110ada253d2f2d4910ebd3ac5 Merge: 3905c54 bd8e7dd Author: Ingo Molnar Date: Mon Apr 18 14:53:18 2011 +0200 Merge branch 'sched/locking' into sched/core Merge reason: the rq locking changes are stable, propagate them into the .40 queue. Signed-off-by: Ingo Molnar commit 140363500ddadad0c09cb512cc0c96a4d3efa053 Author: Mike Christie Date: Thu Apr 14 23:50:57 2011 -0400 iscsi_ibft: search for broadcom specific ibft sign (v2) Broadcom iscsi offload firmware uses a non standard ibft sign of "BIFT". When we added support for boot, the anaconda team and I were using older firmware (I guess 4 years old), so boot does not work on current cards. This patch modifies the ibft search code to search for "BIFT" along with the other possible values. Broadcom has tested the patch and reported it works with their firmware. Mike has tested Chelsio and Intel cards. [v2: - Add ACPI_SIG_IBFT to ibft_signs - replace break with goto in find_ibft_in_mem innner loop.] Signed-off-by: Mike Christie Signed-off-by: Peter Jones Signed-off-by: Konrad Rzeszutek Wilk commit 1eff1ad0285038e309a81da4a004f071608309fb Author: Konrad Rzeszutek Wilk Date: Wed Feb 16 16:26:44 2011 -0500 xen/irq: The Xen hypervisor cleans up the PIRQs if the other domain forgot. And if the other domain forgot to clean up its PIRQs we don't need to fail the operation. Just take a note of it and continue on. Signed-off-by: Konrad Rzeszutek Wilk commit e6197acc726ab3baa60375a5891d58c2ee87e0f3 Author: Konrad Rzeszutek Wilk Date: Thu Feb 24 14:20:12 2011 -0500 xen/irq: Export 'xen_pirq_from_irq' function. We need this to find the real Xen PIRQ value for a device that requests an MSI or MSI-X. In the past we used 'xen_gsi_from_irq' since that function would return an Xen PIRQ or GSI depending on the provided IRQ. Now that we have seperated that we need to use the correct function. [v2: Deal with rebase on stable/irq.cleanup] Signed-off-by: Konrad Rzeszutek Wilk commit c7c2c3a28657cfdcef50c02b18ccca3761209e17 Author: Konrad Rzeszutek Wilk Date: Mon Nov 8 14:26:36 2010 -0500 xen/irq: Add support to check if IRQ line is shared with other domains. We do this via the PHYSDEVOP_irq_status_query support hypervisor call. We will get a positive value if another domain has binded its PIRQ to the specified GSI (IRQ line). [v2: Deal with v2.6.37-rc1 rebase fallout] [v3: Deal with stable/irq.cleanup fallout] [v4: xen_ignore_irq->xen_test_irq_shared] Signed-off-by: Konrad Rzeszutek Wilk commit beafbdc1df02877612dc9039c1de0639921fddec Author: Konrad Rzeszutek Wilk Date: Thu Apr 14 11:17:36 2011 -0400 xen/irq: Check if the PCI device is owned by a domain different than DOMID_SELF. We check if there is a domain owner for the PCI device. In case of failure (meaning no domain has registered for this device) we make DOMID_SELF the owner. Signed-off-by: Konrad Rzeszutek Wilk [v2: deal with rebasing on v2.6.37-1] [v3: deal with rebasing on stable/irq.cleanup] [v4: deal with rebasing on stable/irq.ween_of_nr_irqs] [v5: deal with rebasing on v2.6.39-rc3] Signed-off-by: Jeremy Fitzhardinge Acked-by: Xiantao Zhang commit c55fa78b13b32d3f19e19cd0c8b9378fdc09e521 Author: Konrad Rzeszutek Wilk Date: Mon Nov 8 14:13:35 2010 -0500 xen/pci: Add xen_[find|register|unregister]_device_domain_owner functions. When the Xen PCI backend is told to enable or disable MSI/MSI-X functions, the initial domain performs these operations. The initial domain needs to know which domain (guest) is going to use the PCI device so when it makes the appropiate hypercall to retrieve the MSI/MSI-X vector it will also assign the PCI device to the appropiate domain (guest). This boils down to us needing a mechanism to find, set and unset the domain id that will be using the device. [v2: EXPORT_SYMBOL -> EXPORT_SYMBOL_GPL.] Signed-off-by: Konrad Rzeszutek Wilk commit bd8e7dded88a3e1c085c333f19ff31387616f71a Author: Peter Zijlstra Date: Tue Apr 5 17:23:59 2011 +0200 sched: Remove need_migrate_task() Oleg noticed that need_migrate_task() doesn't need the ->on_cpu check now that ttwu() doesn't do remote enqueues for !->on_rq && ->on_cpu, so remove the helper and replace the single instance with a direct ->on_rq test. Suggested-by: Oleg Nesterov Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152729.556674812@chello.nl Signed-off-by: Ingo Molnar commit 317f394160e9beb97d19a84c39b7e5eb3d7815a8 Author: Peter Zijlstra Date: Tue Apr 5 17:23:58 2011 +0200 sched: Move the second half of ttwu() to the remote cpu Now that we've removed the rq->lock requirement from the first part of ttwu() and can compute placement without holding any rq->lock, ensure we execute the second half of ttwu() on the actual cpu we want the task to run on. This avoids having to take rq->lock and doing the task enqueue remotely, saving lots on cacheline transfers. As measured using: http://oss.oracle.com/~mason/sembench.c $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done $ echo 4096 32000 64 128 > /proc/sys/kernel/sem $ ./sembench -t 2048 -w 1900 -o 0 unpatched: run time 30 seconds 647278 worker burns per second patched: run time 30 seconds 816715 worker burns per second Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152729.515897185@chello.nl commit c05fbafba1c5482bee399b360288fa405415e126 Author: Peter Zijlstra Date: Tue Apr 5 17:23:57 2011 +0200 sched: Restructure ttwu() some more Factor our helper functions to make the inner workings of try_to_wake_up() more obvious, this also allows for adding remote queues. Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152729.475848012@chello.nl commit 23f41eeb42ce7f6f1210904e49e84718f02cb61c Author: Peter Zijlstra Date: Tue Apr 5 17:23:56 2011 +0200 sched: Rename ttwu_post_activation() to ttwu_do_wakeup() The ttwu_post_activation() code does the core wakeup, it sets TASK_RUNNING and performs wakeup-preemption, so give is a more descriptive name. Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152729.434609705@chello.nl commit b84cb5df1f9ad6da3f214c638d5fb08d0c99de1f Author: Peter Zijlstra Date: Tue Apr 5 17:23:55 2011 +0200 sched: Remove rq argument from ttwu_stat() In order to call ttwu_stat() without holding rq->lock we must remove its rq argument. Since we need to change rq stats, account to the local rq instead of the task rq, this is safe since we have IRQs disabled. Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152729.394638826@chello.nl commit e4a52bcb9a18142d79e231b6733cabdbf2e67c1f Author: Peter Zijlstra Date: Tue Apr 5 17:23:54 2011 +0200 sched: Remove rq->lock from the first half of ttwu() Currently ttwu() does two rq->lock acquisitions, once on the task's old rq, holding it over the p->state fiddling and load-balance pass. Then it drops the old rq->lock to acquire the new rq->lock. By having serialized ttwu(), p->sched_class, p->cpus_allowed with p->pi_lock, we can now drop the whole first rq->lock acquisition. The p->pi_lock serializing concurrent ttwu() calls protects p->state, which we will set to TASK_WAKING to bridge possible p->pi_lock to rq->lock gaps and serialize set_task_cpu() calls against task_rq_lock(). The p->pi_lock serialization of p->sched_class allows us to call scheduling class methods without holding the rq->lock, and the serialization of p->cpus_allowed allows us to do the load-balancing bits without races. Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152729.354401150@chello.nl commit 8f42ced974df7d5af2de4cf5ea21fe978c7e4478 Author: Peter Zijlstra Date: Tue Apr 5 17:23:53 2011 +0200 sched: Drop rq->lock from sched_exec() Since we can now call select_task_rq() and set_task_cpu() with only p->pi_lock held, and sched_exec() load-balancing has always been optimistic, drop all rq->lock usage. Oleg also noted that need_migrate_task() will always be true for current, so don't bother calling that at all. Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152729.314204889@chello.nl Signed-off-by: Ingo Molnar commit ab2515c4b98f7bc4fa11cad9fa0f811d63a72a26 Author: Peter Zijlstra Date: Tue Apr 5 17:23:52 2011 +0200 sched: Drop rq->lock from first part of wake_up_new_task() Since p->pi_lock now protects all things needed to call select_task_rq() avoid the double remote rq->lock acquisition and rely on p->pi_lock. Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152729.273362517@chello.nl Signed-off-by: Ingo Molnar commit 0122ec5b02f766c355b3168df53a6c038a24fa0d Author: Peter Zijlstra Date: Tue Apr 5 17:23:51 2011 +0200 sched: Add p->pi_lock to task_rq_lock() In order to be able to call set_task_cpu() while either holding p->pi_lock or task_rq(p)->lock we need to hold both locks in order to stabilize task_rq(). This makes task_rq_lock() acquire both locks, and have __task_rq_lock() validate that p->pi_lock is held. This increases the locking overhead for most scheduler syscalls but allows reduction of rq->lock contention for some scheduler hot paths (ttwu). Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152729.232781355@chello.nl Signed-off-by: Ingo Molnar commit 2acca55ed98ad9b9aa25e7e587ebe306c0313dc7 Author: Peter Zijlstra Date: Tue Apr 5 17:23:50 2011 +0200 sched: Also serialize ttwu_local() with p->pi_lock Since we now serialize ttwu() using p->pi_lock, we also need to serialize ttwu_local() using that, otherwise, once we drop the rq->lock from ttwu() it can race with ttwu_local(). Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152729.192366907@chello.nl commit a8e4f2eaecc9bfa4954adf79a04f4f22fddd829c Author: Peter Zijlstra Date: Tue Apr 5 17:23:49 2011 +0200 sched: Delay task_contributes_to_load() In prepratation of having to call task_contributes_to_load() without holding rq->lock, we need to store the result until we do and can update the rq accounting accordingly. Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152729.151523907@chello.nl commit 3fe1698b7fe05aeb063564e71e40d09f28d8e80c Author: Peter Zijlstra Date: Tue Apr 5 17:23:48 2011 +0200 sched: Deal with non-atomic min_vruntime reads on 32bits In order to avoid reading partial updated min_vruntime values on 32bit implement a seqcount like solution. Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152729.111378493@chello.nl Signed-off-by: Ingo Molnar commit 74f8e4b2335de45485b8d5b31a504747f13c8070 Author: Peter Zijlstra Date: Tue Apr 5 17:23:47 2011 +0200 sched: Remove rq argument to sched_class::task_waking() In preparation of calling this without rq->lock held, remove the dependency on the rq argument. Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152729.071474242@chello.nl Signed-off-by: Ingo Molnar commit 7608dec2ce2004c234339bef8c8074e5e601d0e9 Author: Peter Zijlstra Date: Tue Apr 5 17:23:46 2011 +0200 sched: Drop the rq argument to sched_class::select_task_rq() In preparation of calling select_task_rq() without rq->lock held, drop the dependency on the rq argument. Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152729.031077745@chello.nl Signed-off-by: Ingo Molnar commit 013fdb8086acaae5f8eb96f9ad48fcd98882ac46 Author: Peter Zijlstra Date: Tue Apr 5 17:23:45 2011 +0200 sched: Serialize p->cpus_allowed and ttwu() using p->pi_lock Currently p->pi_lock already serializes p->sched_class, also put p->cpus_allowed and try_to_wake_up() under it, this prepares the way to do the first part of ttwu() without holding rq->lock. By having p->sched_class and p->cpus_allowed serialized by p->pi_lock, we prepare the way to call select_task_rq() without holding rq->lock. Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152728.990364093@chello.nl Signed-off-by: Ingo Molnar commit fd2f4419b4cbe8fe90796df9617c355762afd6a4 Author: Peter Zijlstra Date: Tue Apr 5 17:23:44 2011 +0200 sched: Provide p->on_rq Provide a generic p->on_rq because the p->se.on_rq semantics are unfavourable for lockless wakeups but needed for sched_fair. In particular, p->on_rq is only cleared when we actually dequeue the task in schedule() and not on any random dequeue as done by things like __migrate_task() and __sched_setscheduler(). This also allows us to remove p->se usage from !sched_fair code. Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152728.949545047@chello.nl commit d7c01d27ab767a30d672d1fd657aa8336ebdcbca Author: Peter Zijlstra Date: Tue Apr 5 17:23:43 2011 +0200 sched: Clean up ttwu() stats Collect all ttwu() stat code into a single function and ensure its always called for an actual wakeup (changing p->state to TASK_RUNNING). Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152728.908177058@chello.nl commit 893633817f5b58f5227365d74344e0170a718213 Author: Peter Zijlstra Date: Tue Apr 5 17:23:42 2011 +0200 sched: Change the ttwu() success details try_to_wake_up() would only return a success when it would have to place a task on a rq, change that to every time we change p->state to TASK_RUNNING, because that's the real measure of wakeups. This results in that success is always true for the tracepoints. Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152728.866866929@chello.nl commit c2f7115e2e52a6c187b8c1f54f0e4970bb677be0 Author: Peter Zijlstra Date: Wed Apr 13 13:28:56 2011 +0200 sched: Move wq_worker_waking to the correct site wq_worker_waking_up() needs to match wq_worker_sleeping(), since the latter is only called on deactivate, move the former near activate. Signed-off-by: Peter Zijlstra Cc: Tejun Heo Link: http://lkml.kernel.org/n/top-t3m7n70n9frmv4pv2n5fwmov@git.kernel.org Signed-off-by: Ingo Molnar commit c6eb3dda25892f1f974f5420f63e6721aab02f6f Author: Peter Zijlstra Date: Tue Apr 5 17:23:41 2011 +0200 mutex: Use p->on_cpu for the adaptive spin Since we now have p->on_cpu unconditionally available, use it to re-implement mutex_spin_on_owner. Requested-by: Thomas Gleixner Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152728.826338173@chello.nl commit 3ca7a440da394808571dad32d33d3bc0389982e6 Author: Peter Zijlstra Date: Tue Apr 5 17:23:40 2011 +0200 sched: Always provide p->on_cpu Always provide p->on_cpu so that we can determine if its on a cpu without having to lock the rq. Reviewed-by: Frank Rowand Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110405152728.785452014@chello.nl Signed-off-by: Ingo Molnar commit 184748cc50b2dceb8287f9fb657eda48ff8fcfe7 Author: Peter Zijlstra Date: Tue Apr 5 17:23:39 2011 +0200 sched: Provide scheduler_ipi() callback in response to smp_send_reschedule() For future rework of try_to_wake_up() we'd like to push part of that function onto the CPU the task is actually going to run on. In order to do so we need a generic callback from the existing scheduler IPI. This patch introduces such a generic callback: scheduler_ipi() and implements it as a NOP. BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions! Acked-by: Russell King Acked-by: Martin Schwidefsky Acked-by: Chris Metcalf Acked-by: Jesper Nilsson Acked-by: Benjamin Herrenschmidt Signed-off-by: Ralf Baechle Reviewed-by: Frank Rowand Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Signed-off-by: Ingo Molnar Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl commit a4c98f8bbeafee12c979c90743f6fda94f7515c7 Merge: f4ad9bd 85f2e68 Author: Ingo Molnar Date: Thu Apr 14 08:50:37 2011 +0200 Merge branch 'linus' into sched/locking Merge reason: Pick up this upstream commit: 6631e635c65d: block: don't flush plugged IO on forced preemtion scheduling As it modifies the scheduler and we'll queue up dependent patches. Signed-off-by: Ingo Molnar commit 58fc7f1419560efa9c426b829c195050e0147d7f Author: Joerg Roedel Date: Mon Apr 11 11:13:24 2011 +0200 x86/amd-iommu: Add support for invalidate_all command This patch adds support for the invalidate_all command present in new versions of the AMD IOMMU. Signed-off-by: Joerg Roedel commit d99ddec3eee0be8a43b2c1ff624b9dfaaa26b959 Author: Joerg Roedel Date: Mon Apr 11 11:03:18 2011 +0200 x86/amd-iommu: Add extended feature detection This patch adds detection of the extended features of an AMD IOMMU. The available features are printed to dmesg on boot. Signed-off-by: Joerg Roedel commit 3905c54f2bd2c6f937f87307987ca072eabc3e7b Author: Stephen Rothwell Date: Tue Apr 12 14:00:40 2011 +1000 sched, sparc64: Turn cpu_coregroup_mask() into a real function This compile error triggers on Sparc64: kernel/sched.c:7140: error: 'cpu_coregroup_mask' undeclared here (not in a function) Because after the recent scheduler domain cleanups the scheduler uses this arch method as a function pointer in a scheduler topology data structure - which is not possible with a macro. Signed-off-by: Stephen Rothwell Acked-by: David S. Miller Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20110412140040.3020ef55.sfr@canb.auug.org.au Signed-off-by: Ingo Molnar commit 60495e7760d8ee364695006af37309b0755e0e17 Author: Peter Zijlstra Date: Thu Apr 7 14:10:04 2011 +0200 sched: Dynamic sched_domain::level Remove the SD_LV_ enum and use dynamic level assignments. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.969433965@chello.nl Signed-off-by: Ingo Molnar commit 54ab4ff4316eb329d2c1acc110fbc623d2966931 Author: Peter Zijlstra Date: Thu Apr 7 14:10:03 2011 +0200 sched: Move sched domain storage into the topology list In order to remove the last dependency on the statid domain levels, move the sd_data storage into the topology structure. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.924926412@chello.nl Signed-off-by: Ingo Molnar commit d069b916f7b50021d41d6ce498f86da32a7afaec Author: Peter Zijlstra Date: Thu Apr 7 14:10:02 2011 +0200 sched: Reverse the topology list In order to get rid of static sched_domain::level assignments, reverse the topology iteration. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.876506131@chello.nl Signed-off-by: Ingo Molnar commit 2c402dc3bb502e9dd74fce72c14d293fcef4719d Author: Peter Zijlstra Date: Thu Apr 7 14:10:01 2011 +0200 sched: Unify the sched_domain build functions Since all the __build_$DOM_sched_domain() functions do pretty much the same thing, unify them. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.826347257@chello.nl Signed-off-by: Ingo Molnar commit eb7a74e6cd936c00749e2921b9e058631d986648 Author: Peter Zijlstra Date: Thu Apr 7 14:10:00 2011 +0200 sched: Stuff the sched_domain creation in a data-structure In order to make the topology contruction fully dynamic, remove the still hard-coded list of possible domains and stick them in a data-structure. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.770335383@chello.nl Signed-off-by: Ingo Molnar commit d3081f52f29da1ba6c27685519a9222b39eac763 Author: Peter Zijlstra Date: Thu Apr 7 14:09:59 2011 +0200 sched: Create proper cpu_$DOM_mask() functions In order to unify the sched domain creation more, create proper cpu_$DOM_mask() functions for those domains that didn't already have one. Use the sched_domains_tmpmask for the weird NUMA domain span. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.717702108@chello.nl Signed-off-by: Ingo Molnar commit 4cb988395da6e16627a8be69729e50cd72ebb23e Author: Peter Zijlstra Date: Thu Apr 7 14:09:58 2011 +0200 sched: Avoid allocations in sched_domain_debug() Since we're all serialized by sched_domains_mutex we can use sched_domains_tmpmask and avoid having to do allocations. This means we can use sched_domains_debug() for cpu_attach_domain() again. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.664347467@chello.nl Signed-off-by: Ingo Molnar commit f96225fd51893b6650cffd5427f13f6b1b356488 Author: Peter Zijlstra Date: Thu Apr 7 14:09:57 2011 +0200 sched: Create persistent sched_domains_tmpmask Since sched domain creation is fully serialized by the sched_domains_mutex we can create a single persistent tmpmask to use during domain creation. This removes the need for s_data::send_covered. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.607287405@chello.nl Signed-off-by: Ingo Molnar commit 7dd04b730749f957c116f363524fd622b05e5141 Author: Peter Zijlstra Date: Thu Apr 7 14:09:56 2011 +0200 sched: Remove some dead code Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.553814623@chello.nl Signed-off-by: Ingo Molnar commit bf28b253266ebd73c331dde24d64606afde32ceb Author: Peter Zijlstra Date: Thu Apr 7 14:09:55 2011 +0200 sched: Remove nodemask allocation There's only one nodemask user left so remove it with a direct computation and save some memory and reduce some code-flow complexity. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.505608966@chello.nl Signed-off-by: Ingo Molnar commit 3bd65a80affb9768b91f03c56dba46ee79525f9b Author: Peter Zijlstra Date: Thu Apr 7 14:09:54 2011 +0200 sched: Simplify NODE/ALLNODES domain creation Don't treat ALLNODES/NODE different for difference's sake. Simply always create the ALLNODES domain and let the sd_degenerate() checks kill it when its redundant. This simplifies the code flow. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.455464579@chello.nl Signed-off-by: Ingo Molnar commit 3859173d43658d51a749bc0201b943922577d39c Author: Peter Zijlstra Date: Thu Apr 7 14:09:53 2011 +0200 sched: Reduce some allocation pressure Since we now allocate SD_LV_MAX * nr_cpu_ids sched_domain/sched_group structures when rebuilding the scheduler toplogy it might make sense to shrink that depending on the CONFIG_ options. This is only needed until we get rid of SD_LV_* alltogether and provide a full dynamic topology interface. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.406226449@chello.nl Signed-off-by: Ingo Molnar commit a6c75f2f8d988ecfecf971f98f1cb6fc4de522fe Author: Peter Zijlstra Date: Thu Apr 7 14:09:52 2011 +0200 sched: Avoid using sd->level Don't use sd->level for identifying properties of the domain. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.350174079@chello.nl Signed-off-by: Ingo Molnar commit 822ff793c34a5d4c8b5f3f9ce932602233d96464 Author: Peter Zijlstra Date: Thu Apr 7 14:09:51 2011 +0200 sched: Simplify the free path some If we check the root_domain reference count we can see if its been used or not, use this observation to simplify some of the return paths. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.298339503@chello.nl Signed-off-by: Ingo Molnar commit dce840a08702bd13a9a186e07e63d1ef82256b5e Author: Peter Zijlstra Date: Thu Apr 7 14:09:50 2011 +0200 sched: Dynamically allocate sched_domain/sched_group data-structures Instead of relying on static allocations for the sched_domain and sched_group trees, dynamically allocate and RCU free them. Allocating this dynamically also allows for some build_sched_groups() simplification since we can now (like with other simplifications) rely on the sched_domain tree instead of hard-coded knowledge. One tricky to note is that detach_destroy_domains() needs to hold rcu_read_lock() over the entire tear-down, per-cpu is not sufficient since that can lead to partial sched_group existance (could possibly be solved by doing the tear-down backwards but this is much more robust). A concequence of the above is that we can no longer print the sched_domain debug stuff from cpu_attach_domain() since that might now run with preemption disabled (due to classic RCU etc.) and sched_domain_debug() does some GFP_KERNEL allocations. Another thing to note is that we now fully rely on normal RCU and not RCU-sched, this is because with the new and exiting RCU flavours we grew over the years BH doesn't necessarily hold off RCU-sched grace periods (-rt is known to break this). This would in fact already cause us grief since we do sched_domain/sched_group iterations from softirq context. This patch is somewhat larger than I would like it to be, but I didn't find any means of shrinking/splitting this. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.245307941@chello.nl Signed-off-by: Ingo Molnar commit a9c9a9b6bff27ac9c746344a9c1a19bf3327002c Author: Peter Zijlstra Date: Thu Apr 7 14:09:49 2011 +0200 sched: Simplify sched_groups_power initialization Again, instead of relying on knowing the possible domains and their order, simply rely on the sched_domain tree and whatever domains are present in there to initialize the sched_group cpu_power. Note: we need to iterate the CPU mask backwards because of the cpumask_first() condition for iterating up the tree. By iterating the mask backwards we ensure all groups of a domain are set-up before starting on the parent groups that rely on its children to be completely done. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.187335414@chello.nl Signed-off-by: Ingo Molnar commit 21d42ccfd6c6c11f96c2acfd32a85cfc33514d3a Author: Peter Zijlstra Date: Thu Apr 7 14:09:48 2011 +0200 sched: Simplify finding the lowest sched_domain Instead of relying on knowing the build order and various CONFIG_ flags simply remember the bottom most sched_domain when we created the domain hierarchy. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.134511046@chello.nl Signed-off-by: Ingo Molnar commit 1cf51902546d60b8a7a6aba2dd557bd4ba8840ea Author: Peter Zijlstra Date: Thu Apr 7 14:09:47 2011 +0200 sched: Simplify sched_group creation Instead of calling build_sched_groups() for each possible sched_domain we might have created, note that we can simply iterate the sched_domain tree and call it for each sched_domain present. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.077862519@chello.nl Signed-off-by: Ingo Molnar commit 3739494e08da50c8a68d65eed5ba3012a54b40d4 Author: Peter Zijlstra Date: Thu Apr 7 14:09:46 2011 +0200 sched: Clean up some ALLNODES code Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122942.025636011@chello.nl Signed-off-by: Ingo Molnar commit cd4ea6ae3982f6861da3b510e69cbc194f331d83 Author: Peter Zijlstra Date: Thu Apr 7 14:09:45 2011 +0200 sched: Change NODE sched_domain group creation The NODE sched_domain is 'special' in that it allocates sched_groups per CPU, instead of sharing the sched_groups between all CPUs. While this might have some benefits on large NUMA and avoid remote memory accesses when iterating the sched_groups, this does break current code that assumes sched_groups are shared between all sched_domains (since the dynamic cpu_power patches). So refactor the NODE groups to behave like all other groups. (The ALLNODES domain again shared its groups across the CPUs for some reason). If someone does measure a performance decrease due to this change we need to revisit this and come up with another way to have both dynamic cpu_power and NUMA work nice together. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122941.978111700@chello.nl Signed-off-by: Ingo Molnar commit a06dadbec5c5df0bf3a35f33616f67d10ca9ba28 Author: Peter Zijlstra Date: Thu Apr 7 14:09:44 2011 +0200 sched: Simplify build_sched_groups() Notice that the mask being computed is the same as the domain span we just computed. By using the domain_span we can avoid some mask allocations and computations. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122941.925028189@chello.nl Signed-off-by: Ingo Molnar commit d274cb30f4a08045492d3f0c47cdf1a25668b1f5 Author: Peter Zijlstra Date: Thu Apr 7 14:09:43 2011 +0200 sched: Simplify ->cpu_power initialization The code in update_group_power() does what init_sched_groups_power() does and more, so remove the special init_ code and call the generic code instead. Also move the sd->span_weight initialization because update_group_power() needs it. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122941.875856012@chello.nl Signed-off-by: Ingo Molnar commit c4a8849af939082052d8117f9ea3e170a99ff232 Author: Peter Zijlstra Date: Thu Apr 7 14:09:42 2011 +0200 sched: Remove obsolete arch_ prefixes Non weak static functions clearly are not arch specific, so remove the arch_ prefix. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Cc: Nick Piggin Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20110407122941.820460566@chello.nl Signed-off-by: Ingo Molnar commit f4ad9bd208c98f32a6f9136618e0b8bebe3fb370 Author: Shaohua Li Date: Fri Apr 8 12:53:09 2011 +0800 sched: Eliminate dead code from wakeup_gran() calc_delta_fair() checks NICE_0_LOAD already, delete duplicate check. Signed-off-by: Shaohua Li Signed-off-by: Peter Zijlstra Cc: Mike Galbraith Link: http://lkml.kernel.org/r/1302238389.3981.92.camel@sli10-conroe Signed-off-by: Ingo Molnar commit fd7b5535e10ce820f030842da3f289f80ec0d4f3 Author: Joerg Roedel Date: Tue Apr 5 15:31:08 2011 +0200 x86/amd-iommu: Add ATS enable/disable code This patch adds the necessary code to the AMD IOMMU driver for enabling and disabling the ATS capability on a device and to setup the IOMMU data structures correctly. Signed-off-by: Joerg Roedel commit 60f723b4117507c05c8b0b5c8b98ecc12a76878e Author: Joerg Roedel Date: Tue Apr 5 12:50:24 2011 +0200 x86/amd-iommu: Add flag to indicate IOTLB support This patch adds a flag to the AMD IOMMU driver to indicate that all IOMMUs present in the system support device IOTLBs. Signed-off-by: Joerg Roedel commit cb41ed85efa01e633388314c03a4f3004c6b783b Author: Joerg Roedel Date: Tue Apr 5 11:00:53 2011 +0200 x86/amd-iommu: Flush device IOTLB if ATS is enabled This patch implements a function to flush the IOTLB on devices supporting ATS and makes sure that this TLB is also flushed if necessary. Signed-off-by: Joerg Roedel commit 9844b4e5dd1932e175a23d84ce09702bdf4b5689 Author: Joerg Roedel Date: Tue Apr 5 09:22:56 2011 +0200 x86/amd-iommu: Select PCI_IOV with AMD IOMMU driver In order to support ATS in the AMD IOMMU driver this patch makes sure that the generic support for ATS is compiled in. Signed-off-by: Joerg Roedel commit 5cdede2408e80f190c5595e592c24e77c1bf44b2 Author: Joerg Roedel Date: Mon Apr 4 15:55:18 2011 +0200 PCI: Move ATS declarations in seperate header file This patch moves the relevant declarations from the local header file in drivers/pci to a more accessible locations so that it can be used by the AMD IOMMU driver too. The file is named pci-ats.h because support for the PCI PRI capability will also be added there in a later patch-set. Signed-off-by: Joerg Roedel Acked-by: Jesse Barnes commit ce9c99af8d4b3b0b9463654fd252d8640d804dc3 Author: Ian Campbell Date: Fri Apr 8 07:42:29 2011 +0100 x86, cpu: Move AMD Elan Kconfig under "Processor family" Currently the option resides under X86_EXTENDED_PLATFORM due to historical nonstandard A20M# handling. However that is no longer the case and so Elan can be treated as part of the standard processor choice Kconfig option. Signed-off-by: Ian Campbell Link: http://lkml.kernel.org/r/1302245177.31620.47.camel@localhost.localdomain Cc: H. Peter Anvin Signed-off-by: H. Peter Anvin commit ba4b87ad5497cba555954885db99c99627f93748 Author: Stanislaw Gruszka Date: Thu Mar 31 08:08:09 2011 -0400 dma-debug: print information about leaked entry When driver leak dma mapping, print additional information about one of leaked entries, to to help investigate problem. Patch should be useful for debugging drivers, which maps many different class of buffers. Signed-off-by: Stanislaw Gruszka Signed-off-by: Joerg Roedel commit 7d0c5cc5be73f7ce26fdcca7b8ec2203f661eb93 Author: Joerg Roedel Date: Thu Apr 7 08:16:10 2011 +0200 x86/amd-iommu: Flush all internal TLBs when IOMMUs are enabled The old code only flushed a DTE or a domain TLB before it is actually used by the IOMMU driver. While this is efficient and works when done right it is more likely to introduce new bugs when changing code (which happened in the past). This patch adds code to flush all DTEs and all domain TLBs in each IOMMU right after it is enabled (at boot and after resume). This reduces the complexity of the driver and makes it less likely to introduce stale-TLB bugs in the future. Signed-off-by: Joerg Roedel commit d8c13085775c72e2d46edc54ed0c803c3a944ddb Author: Joerg Roedel Date: Wed Apr 6 18:51:26 2011 +0200 x86/amd-iommu: Rename iommu_flush_device This function operates on a struct device, so give it a name that represents that. As a side effect a new function is introduced which operates on am iommu and a device-id. It will be used again in a later patch. Signed-off-by: Joerg Roedel commit ac0ea6e92b2227c86fe4f7f9eb429071d617a25d Author: Joerg Roedel Date: Wed Apr 6 18:38:20 2011 +0200 x86/amd-iommu: Improve handling of full command buffer This patch improved the handling of commands when the IOMMU command buffer is nearly full. In this case it issues an completion wait command and waits until the IOMMU has processed it before continuing queuing new commands. Signed-off-by: Joerg Roedel commit 17b124bf1463582005d662d4dd95f037ad863c57 Author: Joerg Roedel Date: Wed Apr 6 18:01:35 2011 +0200 x86/amd-iommu: Rename iommu_flush* to domain_flush* These functions all operate on protection domains and not on singe IOMMUs. Represent that in their name. Signed-off-by: Joerg Roedel commit 61985a040f17c03b09a2772508ee02729571365b Author: Joerg Roedel Date: Wed Apr 6 17:46:49 2011 +0200 x86/amd-iommu: Remove command buffer resetting logic The logic to reset the command buffer caused more problems than it actually helped. The logic jumped in when the IOMMU hardware doesn't execute commands anymore but the reasons for this are usually not fixed by just resetting the command buffer. So the code can be removed to reduce complexity. Signed-off-by: Joerg Roedel commit 815b33fdc279d34ab40a8bfe1866623a4cc5669b Author: Joerg Roedel Date: Wed Apr 6 17:26:49 2011 +0200 x86/amd-iommu: Cleanup completion-wait handling This patch cleans up the implementation of completion-wait command sending. It also switches the completion indicator from the MMIO bit to a memory store which can be checked without IOMMU locking. As a side effect this patch makes the __iommu_queue_command function obsolete and so it is removed too. Signed-off-by: Joerg Roedel commit 993ba1585cbb03fab012e41d1a5d24330a283b31 Author: Tejun Heo Date: Tue Apr 5 00:24:00 2011 +0200 x86-32, numa: Update remap allocator comments Now that remap allocator is cleaned up, update comments such that they are in docbook function description format and reflect the actual implementation. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-15-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 198bd06bbfde2984027e91f64c55eb19a7034a27 Author: Tejun Heo Date: Tue Apr 5 00:23:59 2011 +0200 x86-32, numa: Remove redundant node_remap_size[] Remap area size can be determined from node_remap_start_vaddr[] and node_remap_end_vaddr[] making node_remap_size[] redundant. Remove it. While at it, make resume_map_numa_kva() use @nr_pages for number of pages instead of @size. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-14-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 1d85b61baf0334dd6bb88261bec42b808204d694 Author: Tejun Heo Date: Tue Apr 5 00:23:58 2011 +0200 x86-32, numa: Remove now useless node_remap_offset[] With lowmem address reservation moved into init_alloc_remap(), node_remap_offset[] is no longer useful. Remove it and related offset handling code. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-13-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit b2e3e4fa3eee752b893687783f2a427106c93423 Author: Tejun Heo Date: Tue Apr 5 00:23:57 2011 +0200 x86-32, numa: Make pgdat allocation use alloc_remap() pgdat allocation is handled differnetly from other remap allocations - it's reserved during initialization. There's no reason to handle this any differnetly. Remap allocator is initialized for every node and if init failed the allocation will fail and pgdat allocation can fall back to generic code like anyone else. Remove special init-time pgdat reservation and make allocate_pgdat() use alloc_remap() like everyone else. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-12-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 2a286344f06d6341740b284494379373e87648f7 Author: Tejun Heo Date: Tue Apr 5 00:23:56 2011 +0200 x86-32, numa: Move remapping for remap allocator into init_alloc_remap() There's no reason to perform the actual remapping separately. Collapse remap_numa_kva() into init_alloc_remap() and, while at it, make it less verbose. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-11-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 0e9f93c1c04c8ab10cc564df54a7ad0f83c67796 Author: Tejun Heo Date: Tue Apr 5 00:23:55 2011 +0200 x86-32, numa: Move lowmem address space reservation to init_alloc_remap() Remap alloc init is done in the following stages. 1. init_alloc_remap() calculates how much memory is necessary for each node and reserves node local memory. 2. initmem_init() collects how much each node needs and reserves a single contiguous lowmem area which can contain all. 3. init_remap_allocator() initializes allocator parameters from the determined lowmem address and per-node offsets. 4. Actual remap happens. There is no reason for the lowmem remap area to be reserved as a single contiguous area at one go. They don't interact with each other and the memblock allocator will put them side-by-side anyway. This patch breaks up the single lowmem address reservation and put per-node lowmem address reservation into init_alloc_remap() and initializes allocator parameters directly in the function as all the addresses are determined there. This merges steps 2 and 3 into 1. While at it, remove now largely irrelevant comments in init_alloc_remap(). This change causes the following behavior changes. * Remap lowmem areas are allocated in smaller per-node chunks. * Remap lowmem area reservation failure fail future remap allocations instead of panicking. * Remap allocator initialization is less verbose. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-10-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 82044c328d6f6b22882c2a936e487e6d2240817a Author: Tejun Heo Date: Tue Apr 5 00:23:54 2011 +0200 x86-32, numa: Make init_alloc_remap() less panicky Remap allocator failure isn't fatal. The callers are required to fall back to regular early memory allocation mechanisms on failure anyway, so there's no reason to panic on remap init failure. Whining and returning are enough. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-9-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 7210cf9217937e470a9acbc113a590f476b9c047 Author: Tejun Heo Date: Tue Apr 5 00:23:53 2011 +0200 x86-32, numa: Calculate remap size in common code Only pgdat and memmap use remap area and there isn't much benefit in allowing per-node override. In addition, the use of node_remap_size[] is confusing in that it contains number of bytes before remap initialization and then number of pages afterwards. Move remap size calculation for memap from specific NUMA config implementations to init_alloc_remap() and make node_remap_size[] static. The only behavior difference is that, before this patch, numaq_32 didn't consider max_pfn when calculating the memmap size but it's enforced after this patch, which is the right thing to do. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-8-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit af7c1a6e8374e05aab4a98ce4d2fb07b66506a02 Author: Tejun Heo Date: Tue Apr 5 00:23:52 2011 +0200 x86-32, numa: Make @size in init_aloc_remap() represent bytes @size variable in init_alloc_remap() is confusing in that it starts as number of bytes as its name implies and then becomes number of pages. Make it consistently represent bytes. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-7-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit c4d4f577d49c441ab4f1bb6068247dafb366e635 Author: Tejun Heo Date: Tue Apr 5 00:23:51 2011 +0200 x86-32, numa: Rename @node_kva to @node_pa in init_alloc_remap() init_alloc_remap() is about to do more and using _kva suffix for physical address becomes confusing because the function will be handling both physical and virtual addresses. Rename @node_kva to @node_pa. This is trivial rename and doesn't cause any behavior difference. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-6-git-send-email-tj@kernel.org Signed-off-by: H. Peter Anvin commit 5510db9c1be111528ce46c57f0bec1c9dce258f4 Author: Tejun Heo Date: Tue Apr 5 00:23:50 2011 +0200 x86-32, numa: Reorganize calculate_numa_remap_page() Separate the outer node walking loop and per-node logic from calculate_numa_remap_pages(). The outer loop is collapsed into initmem_init() and the per-node logic is moved into a new function - init_alloc_remap(). The new function name is confusing with the existing init_remap_allocator() and the behavior is the function isn't very clean either at this point, but this is to prepare for further cleanups and it will become prettier. This function doesn't introduce any behavior change. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-5-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 5b8443b25c0f323ec190d094e4b441957b02664e Author: Tejun Heo Date: Tue Apr 5 00:23:49 2011 +0200 x86-32, numa: Remove redundant top-down alloc code from remap initialization memblock_find_in_range() now does top-down allocation by default, so there's no reason for its callers to explicitly implement it by gradually lowering the start address. Remove redundant top-down allocation logic from init_meminit() and calculate_numa_remap_pages(). Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-4-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit a6c24f7a705d939ddd2fcaa443fa3d8e852b933d Author: Tejun Heo Date: Tue Apr 5 00:23:48 2011 +0200 x86-32, numa: Align pgdat size while initializing alloc_remap When pgdat is reserved in init_remap_allocator(), PAGE_SIZE aligned size will be used. Match the size alignment in initialization to avoid allocation failure down the road. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-3-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 3fe14ab541cd9b0d1f243afb7556046f12c8743c Author: Tejun Heo Date: Tue Apr 5 00:23:47 2011 +0200 x86-32, numa: Fix failure condition check in alloc_remap() node_remap_{start|end}_vaddr[] describe [start, end) ranges; however, alloc_remap() incorrectly failed when the current allocation + size equaled the end but it should fail only when it goes over. Fix it. Signed-off-by: Tejun Heo Link: http://lkml.kernel.org/r/1301955840-7246-2-git-send-email-tj@kernel.org Acked-by: Yinghai Lu Cc: David Rientjes Signed-off-by: H. Peter Anvin commit 11b6402c6673b530fac9920c5640c75e99fee956 Author: Joerg Roedel Date: Wed Apr 6 11:49:28 2011 +0200 x86/amd-iommu: Cleanup inv_pages command handling This patch reworks the processing of invalidate-pages commands to the IOMMU. The function building the the command is extended so we can get rid of another function. It was also renamed to match with the other function names. Signed-off-by: Joerg Roedel commit 94fe79e2f100bfcd8e7689cbf8838634779b80a2 Author: Joerg Roedel Date: Wed Apr 6 11:07:21 2011 +0200 x86/amd-iommu: Move inv-dte command building to own function This patch moves command building for the invalidate-dte command into its own function. Signed-off-by: Joerg Roedel commit ded467374a34eb80020c2213456b1d9ca946b88c Author: Joerg Roedel Date: Wed Apr 6 10:53:48 2011 +0200 x86/amd-iommu: Move compl-wait command building to own function This patch introduces a seperate function for building completion-wait commands. Signed-off-by: Joerg Roedel commit 660e34cebf0a11d54f2d5dd8838607452355f321 Author: Matthew Garrett Date: Mon Apr 4 13:55:05 2011 -0400 x86: Reorder reboot method preferences We have a never ending stream of 'reboot quirks' for new boxes that will not reboot properly under Linux (they will hang on reboot). The reason is widespread 'Windows compatible' assumption of modern x86 hardware, which expects the following reboot sequence: - hitting the ACPI reboot vector (if available) - trying the keyboard controller - hitting the ACPI reboot vector again - then giving the keyboard controller one last go This sequence expectation gets more and more embedded in modern hardware, which often lacks a keyboard controller and may even lock up if the legacy io ports are hit - and which hardware is often not tested with Linux during development. The end result is that reboot works under Windows-alike OSs but not under Linux. Rework our reboot process to meet this hardware externality a little better and match this assumption of newer x86 hardware. In addition to the ACPI,kbd,ACPI,kbd sequence we'll still fall through to attempting a legacy triple fault if nothing else works - and keep trying that and the kbd reset. Signed-off-by: Matthew Garrett [ this commit will also save special casing Oaktrail boards ] Acked-by: Alan Cox Cc: Linus Torvalds Cc: Andrew Morton Cc: Leann Ogasawara Cc: Dave Jones Cc: Len Brown LKML-Reference: <1301939705-2404-1-git-send-email-mjg@redhat.com> Signed-off-by: Ingo Molnar commit 5373db886b791b2bc7811e2c115377916c409a5d Author: Jan Glauber Date: Wed Mar 16 15:58:30 2011 -0400 jump label: Add s390 support Implement the architecture backend for jump label support on s390. For a shared kernel booted from a NSS silently disable jump labels because the NSS is read-only. Therefore jump labels will be disabled in a shared kernel and can't be activated. Signed-off-by: Jan Glauber LKML-Reference: <6935d2c41ce111e1719176ed4bbd3dbe4de80855.1300299760.git.jbaron@redhat.com> Acked-by: Peter Zijlstra Signed-off-by: Martin Schwidefsky Signed-off-by: Steven Rostedt commit ef64789413c73f32faa5e5f1bc393e5843b0aa51 Author: Jason Baron Date: Wed Mar 16 15:58:27 2011 -0400 jump label: Add _ASM_ALIGN for x86 and x86_64 The linker should not be adding holes to word size aligned pointers, but out of paranoia we are explicitly specifying that alignment. I have not seen any holes in the jump label section in practice. Signed-off-by: Jason Baron LKML-Reference: Acked-by: Peter Zijlstra Acked-by: Mathieu Desnoyers Signed-off-by: Steven Rostedt commit d430d3d7e646eb1eac2bb4aa244a644312e67c76 Author: Jason Baron Date: Wed Mar 16 17:29:47 2011 -0400 jump label: Introduce static_branch() interface Introduce: static __always_inline bool static_branch(struct jump_label_key *key); instead of the old JUMP_LABEL(key, label) macro. In this way, jump labels become really easy to use: Define: struct jump_label_key jump_key; Can be used as: if (static_branch(&jump_key)) do unlikely code enable/disale via: jump_label_inc(&jump_key); jump_label_dec(&jump_key); that's it! For the jump labels disabled case, the static_branch() becomes an atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(), atomic_dec() operations. We show testing results for this change below. Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct. Since we now require a 'struct jump_label_key *key', we can store a pointer into the jump table addresses. In this way, we can enable/disable jump labels, in basically constant time. This change allows us to completely remove the previous hashtable scheme. Thanks to Peter Zijlstra for this re-write. Testing: I ran a series of 'tbench 20' runs 5 times (with reboots) for 3 configurations, where tracepoints were disabled. jump label configured in avg: 815.6 jump label *not* configured in (using atomic reads) avg: 800.1 jump label *not* configured in (regular reads) avg: 803.4 Signed-off-by: Peter Zijlstra LKML-Reference: <20110316212947.GA8792@redhat.com> Signed-off-by: Jason Baron Suggested-by: H. Peter Anvin Tested-by: David Daney Acked-by: Ralf Baechle Acked-by: David S. Miller Acked-by: Mathieu Desnoyers Signed-off-by: Steven Rostedt commit ee5e51f51be755830f57445e268ba50e88ccbdbb Author: Jiri Olsa Date: Fri Mar 25 12:05:18 2011 +0100 tracing: Avoid soft lockup in trace_pipe running following commands: # enable the binary option echo 1 > ./options/bin # disable context info option echo 0 > ./options/context-info # tracing only events echo 1 > ./events/enable cat trace_pipe plus forcing system to generate many tracing events, is causing lockup (in NON preemptive kernels) inside tracing_read_pipe function. The issue is also easily reproduced by running ltp stress test. (ftrace_stress_test.sh) The reasons are: - bin/hex/raw output functions for events are set to trace_nop_print function, which prints nothing and returns TRACE_TYPE_HANDLED value - LOST EVENT trace do not handle trace_seq overflow These reasons force the while loop in tracing_read_pipe function never to break. The attached patch fixies handling of lost event trace, and changes trace_nop_print to print minimal info, which is needed for the correct tracing_read_pipe processing. v2 changes: - omit the cond_resched changes by trace_nop_print changes - WARN changed to WARN_ONCE and added info to be able to find out the culprit v3 changes: - make more accurate patch comment Signed-off-by: Jiri Olsa LKML-Reference: <20110325110518.GC1922@jolsa.brq.redhat.com> Signed-off-by: Steven Rostedt commit 1813dc3776c22ad4b0294a6df8434b9a02c98109 Author: Steven Rostedt Date: Mon Mar 21 23:36:31 2011 -0400 tracing: Print trace_bprintk() formats for modules too The file debugfs/tracing/printk_formats maps the addresses to the formats that are used by trace_bprintk() so that userspace tools can read the buffer and be able to decode trace_bprintk events to get the format saved when reading the ring buffer directly. This is because trace_bprintk() does not store the format into the buffer, but just the address of the format, which is hidden in the kernel memory. But currently it only exports trace_bprintk()s from the kernel core and not for modules. The modules need their formats exported as well. Signed-off-by: Steven Rostedt commit 0588fa30db44fd2d4032b36a061c87478a43fbee Author: Steven Rostedt Date: Mon Mar 21 22:59:21 2011 -0400 tracing: Convert trace_printk() formats for module to const char * The trace_printk() formats for modules do not show up in the debugfs/tracing/printk_formats file. Only the formats that are for trace_printk()s that are in the kernel core. To facilitate the change to add trace_printk() formats from modules into that file as well, we need to convert the structure that holds the formats from char fmt[], into const char *fmt, and allocate them separately. Signed-off-by: Steven Rostedt commit 64d21fc194e12bdf7347019bf10325a4b3d77e7b Author: Rakib Mullick Date: Sat Apr 2 13:17:48 2011 +0600 x86, mpparse: Put check_slot() into .init section check_slot() is only called from replace_intsrc_all() - which is in the .init section. So, put check_slot into the .init section as well, so it can be freed after system boot. Signed-off-by: Rakib Mullick LKML-Reference: Signed-off-by: Ingo Molnar commit 9402efaa96e4c047dcc63ad5db65384c328aee5f Merge: 78fca1b9 711b8c8 Author: Ingo Molnar Date: Mon Apr 4 16:39:20 2011 +0200 Merge branch 'x86-mm' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into x86/mm commit 711b8c87a5fe6de78e90411cb67b506babfef74a Author: Florian Mickler Date: Mon Apr 4 01:17:40 2011 +0200 x86-64, NUMA: Remove unused variable In case !CONFIG_ACPI_NUMA and !CONFIG_AMD_NUMA gcc emits a warning about the unused variable ret. As that variable is in fact not needed I choose to remove it. Signed-off-by: Florian Mickler LKML-Reference: <1301843624-22364-1-git-send-email-florian@mickler.org> Signed-off-by: Tejun Heo commit 3b16651f806d35b5c404f2525fbce76afa3c9297 Author: Tejun Heo Date: Fri Apr 1 11:15:12 2011 +0200 x86: Clean up memory model related configs in arch/x86/Kconfig * Remove bogus dependency on ARCH_SELECT_MEMORY_MODEL from ARCH_FLATMEM_ENABLE. ENABLE configs don't interfere with SELECT_MEMORY_MODEL. They just need to indicate whether the specific memory model is supported. * Relocate HAVE_ARCH_ALLOC_REMAP, ARCH_PROC_KCORE_TEXT and ARCH_SPARSEMEM_DEFAULT so that memory model related configs are together in consistent order. Signed-off-by: Tejun Heo Reviewed-by: Christoph Lameter Cc: Ingo Molnar Cc: Yinghai Lu Cc: "H. Peter Anvin" Cc: Thomas Gleixner commit 052936080c8fb2f791103995b21bd4018c8df886 Author: Tejun Heo Date: Fri Apr 1 11:15:12 2011 +0200 x86-64, NUMA: Remove custom phys_to_nid() implementation phys_to_nid() maps physical address to NUMA node id. This is implemented by building perfect hash in compute_hash_shift() during initialization. However, with SPARSE memory model, the nid is encoded in page flags. The perfect hash implementation was for DISCONTIG memory model which got removed years ago by b263295dbf (x86: 64-bit, make sparsemem vmemmap the only memory model). So, the perfect hash ends up being used only during initialization when the core SPARSE code already provides perfectly acceptable generic early_pfn_to_nid() implementation. Drop phys_to_nid() and use the generic ealry_pfn_to_nid() instead. Signed-off-by: Tejun Heo Reviewed-by: Christoph Lameter Acked-by: Yinghai Lu Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Thomas Gleixner commit 176fcc5c5f0131504a55e1e1d35389c49a9177c2 Author: Arnaldo Carvalho de Melo Date: Wed Mar 30 15:30:43 2011 -0300 perf script: Add more documentation about the -f/--fields parameters Using the commit log for 2c9e45f. Cc: David Ahern Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Tom Zanussi LKML-Reference: Signed-off-by: Arnaldo Carvalho de Melo commit 2c9e45f7a287384e1382932597e41a9a567811ba Author: David Ahern Date: Thu Mar 17 10:03:21 2011 -0600 perf script: If type not given fields apply to all event types Allow: perf script -f to be equivalent to: perf script -f trace: -f sw: -f hw: i.e., the specified fields apply to all event types if the type string is not given. The field (-f) arguments are processed in the order received. A later usage can reset a prior request. e.g., -f trace: -f comm,tid,time,sym The first -f suppresses trace events (field list is ""), but then the second invocation sets the fields to comm,tid,time,sym. In this case a warning is given to the user: "Overriding previous field request for all events." Alternativey, consider the order: -f comm,tid,time,sym -f trace: The first -f sets the fields for all events and the second -f suppresses trace events. The user is given a warning message about the override, and the result of the above is that only S/W and H/W events are displayed with the given fields. For the 'wildcard' option if a user selected field is invalid for an event type, a message is displayed to the user that the option is ignored for that type. For example: perf script -f comm,tid,trace 2>&1 | less 'trace' not valid for hardware events. Ignoring. 'trace' not valid for software events. Ignoring. Alternatively, if the type is given an invalid field is specified it is an error. For example: perf script -v -f sw:comm,tid,trace 2>&1 | less 'trace' not valid for software events. At this point usage is displayed, and perf-script exits. Finally, a user may not set fields to none for all event types. i.e., -f "" is not allowed. Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Paul Mackerras Cc: Peter Zijlstra Cc: linux-kernel@vger.kernel.org LPU-Reference: <1300377801-27246-1-git-send-email-daahern@cisco.com> Signed-off-by: David Ahern Signed-off-by: Arnaldo Carvalho de Melo commit 09ca132a8e469f87504899b4016c7517511887d0 Author: Daniel Kiper Date: Mon Mar 28 11:35:59 2011 +0200 xen/balloon: Move dec_totalhigh_pages() from __balloon_append() to balloon_append() git commit 9be4d4575906af9698de660e477f949a076c87e1 (xen: add extra pages to balloon) splited balloon_append() into two functions (balloon_append() and __balloon_append()) and left decrementation of totalram_pages counter in __balloon_append(). In this situation if __balloon_append() is called on i386 with highmem page referenced then totalhigh_pages is decremented, however, it should not. This patch corrects that issue and moves dec_totalhigh_pages() from __balloon_append() to balloon_append(). Now totalram_pages and totalhigh_pages are decremented simultaneously only when balloon_append() is called. Acked-by: Ian Campbell Acked-by: Daniel De Graaf Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit 83be7e52d46a5b3a9955a38a9597bf1de1851ea7 Author: Daniel Kiper Date: Mon Mar 28 11:34:10 2011 +0200 xen/balloon: Clarify credit calculation Move credit calculation to current_target() and rename it to current_credit(). Acked-by: Ian Campbell Acked-by: Daniel De Graaf Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit 4dfe22f5f24345511c378272189b7504d67767fb Author: Daniel Kiper Date: Mon Mar 28 11:33:18 2011 +0200 xen/balloon: Simplify HVM integration Simplify HVM integration proposed by Stefano Stabellini in 53d5522cad291a0e93a385e0594b6aea6b54a071. Acked-by: Ian Campbell Acked-by: Daniel De Graaf Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit a7f3c8f1da1050ad778c3a3930f07c63a5ec570b Author: Daniel Kiper Date: Mon Mar 28 11:32:31 2011 +0200 xen/balloon: Use PageHighMem() for high memory page detection Replace pfn < max_low_pfn by !PageHighMem() in increase_reservation(). It makes more clearer what is going on. Acked-by: Ian Campbell Acked-by: Daniel De Graaf Signed-off-by: Daniel Kiper Signed-off-by: Konrad Rzeszutek Wilk commit ae18cbfe9f147085426635763f1fe0c68f1071e2 Merge: 1d32188 cd25f8b Author: Ingo Molnar Date: Wed Mar 30 09:06:51 2011 +0200 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core commit cd25f8bc2696664877b21d33b7994e12fa570919 Author: Lin Ming Date: Fri Mar 25 16:27:48 2011 +0800 perf probe: Add fastpath to do lookup by function name v3 -> v2: - Make pubname_search_cb more generic - Add fastpath to find_probes also v2 -> v1: - Don't compare file names with cu_find_realpath(...), instead, compare them with the name returned by dwarf_decl_file(sp_die) The vmlinux file may have thousands of CUs. We can lookup function name from .debug_pubnames section to avoid the slow loop on CUs. 1. Improvement data for find_line_range ./perf stat -e cycles -r 10 -- ./perf probe -k /home/mlin/vmlinux \ -s /home/mlin/linux-2.6 \ --line csum_partial_copy_to_user > tmp.log before patch applied ===================== 847,988,276 cycles 0.355075856 seconds time elapsed after patch applied ===================== 206,102,622 cycles 0.086883555 seconds time elapsed 2. Improvement data for find_probes ./perf stat -e cycles -r 10 -- ./perf probe -k /home/mlin/vmlinux \ -s /home/mlin/linux-2.6 \ --vars csum_partial_copy_to_user > tmp.log before patch applied ===================== 848,490,844 cycles 0.355307901 seconds time elapsed after patch applied ===================== 205,684,469 cycles 0.086694010 seconds time elapsed Acked-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: linux-kernel LKML-Reference: <1301041668.14111.52.camel@minggr.sh.intel.com> Signed-off-by: Lin Ming Signed-off-by: Arnaldo Carvalho de Melo commit 9d42a53e0f46505b39494041d514372235817e15 Author: Christoph Lameter Date: Sat Mar 12 12:51:12 2011 +0100 acpi throttling: Use this_cpu_has and simplify code current cpu With the this_cpu_xx we no longer need to pass an acpi structure to the msr management code. Simplifies code and improves performance. NOTE: This code is x86 specific (see #ifdef CONFIG_X86) but not under arch/x86. Signed-off-by: Christoph Lameter Acked-by: Tejun Heo Signed-off-by: Tejun Heo commit fe5042138b6fc60edde3b60025078884c2eb71ac Author: Christoph Lameter Date: Sat Mar 12 12:50:46 2011 +0100 x86: Use this_cpu_has for thermal_interrupt current cpu It is more effective to use a segment prefix instead of calculating the address of the current cpu area amd then testing flags. Signed-off-by: Christoph Lameter Acked-by: Tejun Heo Signed-off-by: Tejun Heo commit 349c004e3d31fda23ad225b61861be38047fff16 Author: Christoph Lameter Date: Sat Mar 12 12:50:10 2011 +0100 x86: A fast way to check capabilities of the current cpu Add this_cpu_has() which determines if the current cpu has a certain ability using a segment prefix and a bit test operation. For that we need to add bit operations to x86s percpu.h. Many uses of cpu_has use a pointer passed to a function to determine the current flags. That is no longer necessary after this patch. However, this patch only converts the straightforward cases where cpu_has is used with this_cpu_ptr. The rest is work for later. -tj: Rolled up patch to add x86_ prefix and use percpu_read() instead of percpu_read_stable(). Signed-off-by: Christoph Lameter Acked-by: Tejun Heo Signed-off-by: Tejun Heo commit 1d321881afb955bba1e0a8f50d3a04e111fcb581 Author: Cyrill Gorcunov Date: Mon Mar 28 00:46:11 2011 +0400 perf, x86: P4 PMU - clean up the code a bit No change on the functional level, just align the table properly. Signed-off-by: Cyrill Gorcunov Cc: Lin Ming LKML-Reference: <4D8FA213.5050108@openvz.org> Signed-off-by: Ingo Molnar commit 9f1f1bfd8d7e579f07dbe56d6f93bd594da43b3d Author: Rakib Mullick Date: Wed Mar 23 21:31:40 2011 +0600 x86, mpparse: Remove unnecessary variable 'ret' isn't used by check_slot(), gets initialized but has no real use, so remove it. Signed-off-by: Rakib Mullick LKML-Reference: Signed-off-by: Ingo Molnar commit 5d94e81f69d4b1d1102d3ab557ce0a817c11fbbb Author: Dan Williams Date: Tue Mar 8 10:36:19 2011 -0800 x86: Introduce pci_map_biosrom() The isci driver needs to retrieve its preboot OROM image which contains necessary runtime parameters like platform specific sas addresses and phy configuration. There is no ROM BAR associated with this area, instead we will need to scan legacy expansion ROM space. 1/ Promote the probe_roms_32 implementation to x86-64 2/ Add a facility to find and map an adapter rom by pci device (according to PCI Firmware Specification Revision 3.0) Signed-off-by: Dave Jiang LKML-Reference: <20110308183226.6246.90354.stgit@localhost6.localdomain6> Signed-off-by: Dan Williams Signed-off-by: H. Peter Anvin commit 45bb1674b976ef81429c1e42de05844b49d45dea Author: Daniel Drake Date: Sun Mar 13 15:10:17 2011 +0000 x86, olpc: Use device tree for platform identification Make OLPC fully depend on device tree, and use it to identify the OLPC platform details. Some nodes are exposed as platform devices where we plan to use device tree for device probing. Signed-off-by: Daniel Drake Acked-by: Grant Likely LKML-Reference: <20110313151017.C255F9D401E@zog.reactivated.net> Signed-off-by: H. Peter Anvin commit d79647aea22732f39c81bbdc80931f96b46023f0 Author: Daniel De Graaf Date: Mon Mar 7 15:18:57 2011 -0500 xen/gntdev,gntalloc: Remove unneeded VM flags The only time when granted pages need to be treated specially is when using Xen's PTE modification for grant mappings owned by another domain (that is, only gntdev on PV guests). Otherwise, the area does not require VM_DONTCOPY and VM_PFNMAP, since it can be accessed just like any other page of RAM. Since the vm_operations_struct close operations decrement reference counts, a corresponding open function that increments them is required now that it is possible to have multiple references to a single area. We are careful in the gntdev to check if we can remove those flags. The reason that we need to be careful in gntdev on PV guests is because we are not changing the PFN/MFN mapping on PV; instead, we change the application's page tables to point to the other domain's memory. This means that the vma cannot be copied without using another grant mapping hypercall; it also requires special handling on unmap, which is the reason for gntdev's dependency on the MMU notifier. For gntalloc, this is not a concern - the pages are owned by the domain using the gntalloc device, and can be mapped and unmapped in the same manner as any other page of memory. Acked-by: Ian Campbell Signed-off-by: Daniel De Graaf Signed-off-by: Konrad Rzeszutek Wilk [v2: Added in git commit "We are.." from email correspondence] commit a1c57e0fec53defe745e64417eacdbd3618c3e66 Author: John Stultz Date: Mon Apr 26 20:20:07 2010 -0700 blackfin: convert to clocksource_register_hz This converts the blackfin clocksource to use clocksource_register_hz. CC: Mike Frysinger CC: Thomas Gleixner Signed-off-by: John Stultz commit 75c4fd8c7862f37eeae5c80f33bbe4dce97571d4 Author: John Stultz Date: Mon Apr 26 20:23:11 2010 -0700 mips: convert to clocksource_register_hz/khz This converts the mips clocksources to use clocksource_register_hz/khz CC: Ralf Baechle CC: Thomas Gleixner Signed-off-by: John Stultz commit 39280742efb00ab61ad62486c737fdd3e980c30f Author: John Stultz Date: Mon Apr 26 20:24:37 2010 -0700 sparc: convert to clocksource_register_hz/khz This converts the sparc clocksources to use clocksource_register_hz/khz CC: "David S. Miller" CC: Thomas Gleixner Signed-off-by: John Stultz commit 7861434fe9b5c9dad1bbb1f674cded95950e778e Author: John Stultz Date: Tue Oct 19 17:49:07 2010 -0700 alpha: convert to clocksource_register_hz Converts alpha to use clocksource_register_hz. CC: Richard Henderson CC: Ivan Kokshaysky CC: Matt Turner CC: Thomas Gleixner Signed-off-by: John Stultz commit b8f39f7dfe12d4c8402c493a24fbf1e21d086771 Author: John Stultz Date: Mon Apr 26 20:22:23 2010 -0700 microblaze: convert to clocksource_register_hz/khz This converts the microblaze clocksources to use clocksource_register_hz/khz CC: Michal Simek CC: Thomas Gleixner Tested-by: Michal Simek Signed-off-by: John Stultz commit d60c3041778c11f564969fb62b337df68232ee80 Author: John Stultz Date: Mon Apr 26 20:20:47 2010 -0700 ia64: convert to clocksource_register_hz/khz This converts the ia64 clocksources to use clocksource_register_hz/khz CC: Tony Luck CC: Thomas Gleixner Tested-by: Tony Luck [clocksource_itc path] Signed-off-by: John Stultz commit b01cc1b0eae0dea19257b29347116505fbedf679 Author: John Stultz Date: Mon Apr 26 19:03:05 2010 -0700 x86: Convert remaining x86 clocksources to clocksource_register_hz/khz This converts the remaining x86 clocksources to use clocksource_register_hz/khz. CC: jacob.jun.pan@intel.com CC: Glauber Costa CC: Dimitri Sivanich CC: Rusty Russell CC: Jeremy Fitzhardinge CC: Chris McDermott CC: Thomas Gleixner Tested-by: Konrad Rzeszutek Wilk [xen] Signed-off-by: John Stultz commit 36d8593ec74dc04d3bd7c1c897a7b7cfbd0b0dc6 Author: Russell King - ARM Linux Date: Sat Feb 19 15:34:50 2011 +0000 Make clocksource name const As nothing should be writing to the clocksource name string, make the clocksource name pointer const. Build-tested on ARM Versatile Express. Signed-off-by: Russell King Signed-off-by: John Stultz