commit 6ec22f9b037fc0c2e00ddb7023fad279c365324d Merge: 83be7d7 9b3660a Author: Linus Torvalds Date: Sat Dec 5 15:33:27 2009 -0800 Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Limit number of per cpu TSC sync messages x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes x86: Suppress stack overrun message for init_task x86: Fix cpu_devs[] initialization in early_cpu_init() x86: Remove CPU cache size output for non-Intel too x86: Minimise printk spew from per-vendor init code x86: Remove the CPU cache size printk's cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c x86: Make sure we also print a Code: line for show_regs() commit 83be7d764dc4b860712e392197ec27645f9d74a8 Merge: c2ed69c 0d0fbbd Author: Linus Torvalds Date: Sat Dec 5 15:32:35 2009 -0800 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, msr, cpumask: Use struct cpumask rather than the deprecated cpumask_t x86, cpuid: Simplify the code in cpuid_open x86, cpuid: Remove the bkl from cpuid_open() x86, msr: Remove the bkl from msr_open() x86: AMD Geode LX optimizations x86, msr: Unify rdmsr_on_cpus/wrmsr_on_cpus commit c2ed69cdc9da49a8d2d7b4212fd225abf902ceaa Merge: ef26b16 9eaa192 Author: Linus Torvalds Date: Sat Dec 5 15:32:18 2009 -0800 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix a section mismatch in arch/x86/kernel/setup.c x86: Fixup last users of irq_chip->typename x86: Remove BKL 
from apm_32 x86: Remove BKL from microcode x86: use kernel_stack_pointer() in kprobes.c x86: use kernel_stack_pointer() in kgdb.c x86: use kernel_stack_pointer() in dumpstack.c x86: use kernel_stack_pointer() in process_32.c commit ef26b1691d11e17af205a4ff9c91458d931d11db Merge: a77d2e0 7cff7ce Author: Linus Torvalds Date: Sat Dec 5 15:32:03 2009 -0800 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: include/linux/compiler-gcc4.h: Fix build bug - gcc-4.0.2 doesn't understand __builtin_object_size x86/alternatives: No need for alternatives-asm.h to re-invent stuff already in asm.h x86/alternatives: Check replacementlen <= instrlen at build time x86, 64-bit: Set data segments to null after switching to 64-bit mode x86: Clean up the loadsegment() macro x86: Optimize loadsegment() x86: Add missing might_fault() checks to copy_{to,from}_user() x86-64: __copy_from_user_inatomic() adjustments x86: Remove unused thread_return label from switch_to() x86, 64-bit: Fix bstep_iret jump x86: Don't use the strict copy checks when branch profiling is in use x86, 64-bit: Move K8 B step iret fixup to fault entry asm x86: Generate cmpxchg build failures x86: Add a Kconfig option to turn the copy_from_user warnings into errors x86: Turn the copy_from_user check into an (optional) compile time warning x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy x86: Use __builtin_object_size() to validate the buffer size for copy_from_user() commit a77d2e081bbbccb38f42da45500dd089756efdfb Merge: 897e81b 7d1849a Author: Linus Torvalds Date: Sat Dec 5 15:31:25 2009 -0800 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) x86, apic: Enable lapic nmi watchdog on AMD Family 11h x86: Remove 
unnecessary mdelay() from cpu_disable_common() x86, ioapic: Document another case when level irq is seen as an edge x86, ioapic: Fix the EOI register detection mechanism x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq x86: SGI UV: Map low MMR ranges x86: apic: Print out SRAT table APIC id in hex x86: Re-get cfg_new in case reuse/move irq_desc x86: apic: Remove not needed #ifdef x86: io-apic: IO-APIC MMIO should not fail on resource insertion x86: Remove asm/apicnum.h x86: apic: Do not use stacked physid_mask_t x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit x86, ioapic: Use snrpintf while set names for IO-APIC resourses x86, apic: Use PAGE_SIZE instead of numbers x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs() x86: Use EOI register in io-apic on intel platforms x86: Force irq complete move during cpu offline x86: Remove move_cleanup_count from irq_cfg x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping ... 
commit 897e81bea1fcfcd2c5cdb720c9efdb25da9ff374 Merge: c3fa27d 0cf55e1 Author: Linus Torvalds Date: Sat Dec 5 15:30:49 2009 -0800 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits) sched, cputime: Introduce thread_group_times() sched, cputime: Cleanups related to task_times() Revert "sched, x86: Optimize branch hint in __switch_to()" sched: Fix isolcpus boot option sched: Revert 498657a478c60be092208422fefa9c7b248729c2 sched, time: Define nsecs_to_jiffies() sched: Remove task_{u,s,g}time() sched: Introduce task_times() to replace task_{u,s}time() pair sched: Limit the number of scheduler debug messages sched.c: Call debug_show_all_locks() when dumping all tasks sched, x86: Optimize branch hint in __switch_to() sched: Optimize branch hint in context_switch() sched: Optimize branch hint in pick_next_task_fair() sched_feat_write(): Update ppos instead of file->f_pos sched: Sched_rt_periodic_timer vs cpu hotplug sched, kvm: Fix race condition involving sched_in_preempt_notifers sched: More generic WAKE_AFFINE vs select_idle_sibling() sched: Cleanup select_task_rq_fair() sched: Fix granularity of task_u/stime() sched: Fix/add missing update_rq_clock() calls ... 
commit c3fa27d1367fac63ac8533d6f20ea851d0d70a10 Merge: 96fa2b5 d103d01 Author: Linus Torvalds Date: Sat Dec 5 15:30:21 2009 -0800 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (470 commits) x86: Fix comments of register/stack access functions perf tools: Replace %m with %a in sscanf hw-breakpoints: Keep track of user disabled breakpoints tracing/syscalls: Make syscall events print callbacks static tracing: Add DEFINE_EVENT(), DEFINE_SINGLE_EVENT() support to docbook perf: Don't free perf_mmap_data until work has been done perf_event: Fix compile error perf tools: Fix _GNU_SOURCE macro related strndup() build error trace_syscalls: Remove unused syscall_name_to_nr() trace_syscalls: Simplify syscall profile trace_syscalls: Remove duplicate init_enter_##sname() trace_syscalls: Add syscall_nr field to struct syscall_metadata trace_syscalls: Remove enter_id exit_id trace_syscalls: Set event_enter_##sname->data to its metadata trace_syscalls: Remove unused event_syscall_enter and event_syscall_exit perf_event: Initialize data.period in perf_swevent_hrtimer() perf probe: Simplify event naming perf probe: Add --list option for listing current probe events perf probe: Add argv_split() from lib/argv_split.c perf probe: Move probe event utility functions to probe-event.c ... 
commit 96fa2b508d2d3fe040cf4ef2fffb955f0a537ea1 Merge: 7a797cd b8007ef Author: Linus Torvalds Date: Sat Dec 5 09:53:36 2009 -0800 Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (40 commits) tracing: Separate raw syscall from syscall tracer ring-buffer-benchmark: Add parameters to set produce/consumer priorities tracing, function tracer: Clean up strstrip() usage ring-buffer benchmark: Run producer/consumer threads at nice +19 tracing: Remove the stale include/trace/power.h tracing: Only print objcopy version warning once from recordmcount tracing: Prevent build warning: 'ftrace_graph_buf' defined but not used ring-buffer: Move access to commit_page up into function used tracing: do not disable interrupts for trace_clock_local ring-buffer: Add multiple iterations between benchmark timestamps kprobes: Sanitize struct kretprobe_instance allocations tracing: Fix to use __always_unused attribute compiler: Introduce __always_unused tracing: Exit with error if a weak function is used in recordmcount.pl tracing: Move conditional into update_funcs() in recordmcount.pl tracing: Add regex for weak functions in recordmcount.pl tracing: Move mcount section search to front of loop in recordmcount.pl tracing: Fix objcopy revision check in recordmcount.pl tracing: Check absolute path of input file in recordmcount.pl tracing: Correct the check for number of arguments in recordmcount.pl ... 
commit 7a797cdcca2b3c0031e580203f18d6c9483aaec5 Merge: bb2166c c13d2f7 Author: Linus Torvalds Date: Sat Dec 5 09:53:21 2009 -0800 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix trace_marker output tracing: Fix event format export tracing: Fix return value of tracing_stats_read() commit bb2166c898adb5fe29bc450004926802d2a16035 Merge: 0bf7969 3476994 Author: Linus Torvalds Date: Sat Dec 5 09:53:08 2009 -0800 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Fix spurious irq seqfile conversion genirq: switch /proc/irq/*/spurious to seq_file irq: Do not attempt to create subdirectories if /proc/irq/ failed irq: Remove unused debug_poll_all_shared_irqs() irq: Fix docbook comments irq: trivial: Fix typo in comment for #endif commit 0bf7969feae831ede7276f7cc73b586ce0902374 Merge: 69f061e e5af022 Author: Linus Torvalds Date: Sat Dec 5 09:52:46 2009 -0800 Merge branch 'core-softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: softlockup: Fix hung_task_check_count sysctl commit 69f061e0c2ed47304b3eeac7fb7bd5268652dc50 Merge: 6077817 f84d49b Author: Linus Torvalds Date: Sat Dec 5 09:52:33 2009 -0800 Merge branch 'core-signal-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-signal-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: signal: Print warning message when dropping signals signal: Fix alternate signal stack check commit 607781762e7aae9c976f0a9a8829d4ba3e2da4ab Merge: d0b093a 8bfb2f8 Author: Linus Torvalds Date: Sat Dec 5 09:52:14 2009 -0800 Merge branch 
'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) rcu: Make RCU's CPU-stall detector be default rcu: Add expedited grace-period support for preemptible RCU rcu: Enable fourth level of TREE_RCU hierarchy rcu: Rename "quiet" functions rcu: Re-arrange code to reduce #ifdef pain rcu: Eliminate unneeded function wrapping rcu: Fix grace-period-stall bug on large systems with CPU hotplug rcu: Eliminate __rcu_pending() false positives rcu: Further cleanups of use of lastcomp rcu: Simplify association of forced quiescent states with grace periods rcu: Accelerate callback processing on CPUs not detecting GP end rcu: Mark init-time-only rcu_bootup_announce() as __init rcu: Simplify association of quiescent states with grace periods rcu: Rename dynticks_completed to completed_fqs rcu: Enable synchronize_sched_expedited() fastpath rcu: Remove inline from forward-referenced functions rcu: Fix note_new_gpnum() uses of ->gpnum rcu: Fix synchronization for rcu_process_gp_end() uses of ->completed counter rcu: Prepare for synchronization fixes: clean up for non-NO_HZ handling of ->completed counter rcu: Cleanup: balance rcu_irq_enter()/rcu_irq_exit() calls ... 
commit d0b093a8b5ae34ee8be1f7e0dd197fe4788fa1d5 Merge: 3e72b81 5c82871 Author: Linus Torvalds Date: Sat Dec 5 09:50:22 2009 -0800 Merge branch 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: ratelimit: Make suppressed output messages more useful printk: Remove ratelimit.h from kernel.h ratelimit: Fix/allow use in atomic contexts ratelimit: Use per ratelimit context locking commit 3e72b810e30cdf4655279dd767eb798ac7a8fe5e Merge: 9b269d4 c08f782 Author: Linus Torvalds Date: Sat Dec 5 09:49:59 2009 -0800 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: mutex: Fix missing conditions to build mutex_spin_on_owner() mutex: Better control mutex adaptive spinning config locking, task_struct: Reduce size on TRACE_IRQFLAGS and 64bit locking: Use __[SPIN|RW]_LOCK_UNLOCKED in [spin|rw]_lock_init() locking: Remove unused prototype locking: Reduce ifdefs in kernel/spinlock.c locking: Make inlining decision Kconfig based commit 9b269d4034c7855ac34f0985cc55ee29bd80e80a Merge: 7b626ac 2ea6dec Author: Linus Torvalds Date: Sat Dec 5 09:49:46 2009 -0800 Merge branch 'core-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: generic-ipi: Add smp_call_function_any() generic-ipi: Fix misleading smp_call_function*() description commit 7b626acb8f983eb83b396ab96cc24b18d635d487 Merge: 1ebb275 4528752 Author: Linus Torvalds Date: Sat Dec 5 09:49:07 2009 -0800 Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits) x86, Calgary IOMMU quirk: Find 
nearest matching Calgary while walking up the PCI tree x86/amd-iommu: Remove amd_iommu_pd_table x86/amd-iommu: Move reset_iommu_command_buffer out of locked code x86/amd-iommu: Cleanup DTE flushing code x86/amd-iommu: Introduce iommu_flush_device() function x86/amd-iommu: Cleanup attach/detach_device code x86/amd-iommu: Keep devices per domain in a list x86/amd-iommu: Add device bind reference counting x86/amd-iommu: Use dev->arch->iommu to store iommu related information x86/amd-iommu: Remove support for domain sharing x86/amd-iommu: Rearrange dma_ops related functions x86/amd-iommu: Move some pte allocation functions in the right section x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc x86/amd-iommu: Use get_device_id and check_device where appropriate x86/amd-iommu: Move find_protection_domain to helper functions x86/amd-iommu: Simplify get_device_resources() x86/amd-iommu: Let domain_for_device handle aliases x86/amd-iommu: Remove iommu specific handling from dma_ops path x86/amd-iommu: Remove iommu parameter from __(un)map_single x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs ... 
commit 1ebb275afcf5a47092e995541d6c604eef96062a Merge: 83fdbfb 26bb750 Author: Linus Torvalds Date: Sat Dec 5 09:47:17 2009 -0800 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (31 commits) GFS2: Fix glock refcount issues writeback: remove unused nonblocking and congestion checks (gfs2) GFS2: drop rindex glock to refresh rindex list GFS2: Tag all metadata with jid GFS2: Locking order fix in gfs2_check_blk_state GFS2: Remove dirent_first() function GFS2: Display nobarrier option in /proc/mounts GFS2: add barrier/nobarrier mount options GFS2: remove division from new statfs code GFS2: Improve statfs and quota usability GFS2: Use dquot_send_warning() VFS: Export dquot_send_warning GFS2: Add set_xquota support GFS2: Add get_xquota support GFS2: Clean up gfs2_adjust_quota() and do_glock() GFS2: Remove constant argument from qd_get() GFS2: Remove constant argument from qdsb_get() GFS2: Add proper error reporting to quota sync via sysfs GFS2: Add get_xstate quota function GFS2: Remove obsolete code in quota.c ... commit 83fdbfbfe6e7e8906e3a3f8f6bc074d887e92109 Merge: d9b2c4d c84d6ef Author: Linus Torvalds Date: Sat Dec 5 09:44:57 2009 -0800 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (30 commits) TOMOYO: Add recursive directory matching operator support. remove CONFIG_SECURITY_FILE_CAPABILITIES compile option SELinux: print denials for buggy kernel with unknown perms Silence the existing API for capability version compatibility check. LSM: Move security_path_chmod()/security_path_chown() to after mutex_lock(). 
SELinux: header generation may hit infinite loop selinux: Fix warnings security: report the module name to security_module_request Config option to set a default LSM sysctl: require CAP_SYS_RAWIO to set mmap_min_addr tpm: autoload tpm_tis based on system PnP IDs tpm_tis: TPM_STS_DATA_EXPECT workaround define convenient securebits masks for prctl users (v2) tpm: fix header for modular build tomoyo: improve hash bucket dispersion tpm add default function definitions LSM: imbed ima calls in the security hooks SELinux: add .gitignore files for dynamic classes security: remove root_plug SELinux: fix locking issue introduced with c6d3aaa4e35c71a3 ... commit d9b2c4d0b03c721808c0d259e43a27f1e80205bc Merge: 27d16d0 5fa9167 Author: Linus Torvalds Date: Sat Dec 5 09:42:59 2009 -0800 Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6: (50 commits) pcmcia: rework the irq_req_t typedef pcmcia: remove deprecated handle_to_dev() macro pcmcia: pcmcia_request_window() doesn't need a pointer to a pointer pcmcia: remove unused "window_t" typedef pcmcia: move some window-related code to pcmcia_ioctl.c pcmcia: Change window_handle_t logic to unsigned long pcmcia: Pass struct pcmcia_socket to pcmcia_get_mem_page() pcmcia: Pass struct pcmcia_device to pcmcia_map_mem_page() pcmcia: Pass struct pcmcia_device to pcmcia_release_window() drivers/pcmcia: remove unnecessary kzalloc pcmcia: correct handling for Zoomed Video registers in topic.h pcmcia: fix printk formats pcmcia: autoload module pcmcia pcmcia/staging: update comedi drivers PCMCIA: stop duplicating pci_irq in soc_pcmcia_socket PCMCIA: ss: allow PCI IRQs > 255 PCMCIA: soc_common: remove 'dev' member from soc_pcmcia_socket PCMCIA: soc_common: constify soc_pcmcia_socket ops member PCMCIA: sa1111: remove duplicated initializers PCMCIA: sa1111: wrap soc_pcmcia_socket to contain sa1111 specific data ... 
commit 27d16d08717faeaa8afd1b736a096dbaab90f08e Author: David Daney Date: Fri Dec 4 17:44:54 2009 -0800 avr32: Convert BUG() to use unreachable() Use the new unreachable() macro instead of for(;;); Signed-off-by: David Daney Acked-by: Haavard Skinnemoen Signed-off-by: Linus Torvalds commit 5506e68975c346e55f786b554e28e368cdede444 Author: David Daney Date: Fri Dec 4 17:44:53 2009 -0800 s390: Convert BUG() to use unreachable() Use the new unreachable() macro instead of for(;;); Signed-off-by: David Daney Acked-by: Martin Schwidefsky CC: Heiko Carstens CC: linux390@de.ibm.com CC: linux-s390@vger.kernel.org Signed-off-by: Linus Torvalds commit 4ef5651e85589e3aaca704c6d2d7ae7e794e5d25 Author: David Daney Date: Fri Dec 4 17:44:52 2009 -0800 MIPS: Convert BUG() to use unreachable() Use the new unreachable() macro instead of while(1); Signed-off-by: David Daney Acked-by: Ralf Baechle CC: linux-mips@linux-mips.org Signed-off-by: Linus Torvalds commit a5fc5eba4dfcc284e6adcd7fdcd5b43182230d2b Author: David Daney Date: Fri Dec 4 17:44:51 2009 -0800 x86: Convert BUG() to use unreachable() Use the new unreachable() macro instead of for(;;);. When allyesconfig is built with a GCC-4.5 snapshot on i686 the size of the text segment is reduced by 3987 bytes (from 6827019 to 6823032). Signed-off-by: David Daney Acked-by: "H. Peter Anvin" CC: Thomas Gleixner CC: Ingo Molnar CC: x86@kernel.org Signed-off-by: Linus Torvalds commit 38938c879eb0c39edf85d5164aa0cffe2874304c Author: David Daney Date: Fri Dec 4 17:44:50 2009 -0800 Add support for GCC-4.5's __builtin_unreachable() to compiler.h (v2) Starting with version 4.5, GCC has a new built-in function __builtin_unreachable() that can be used in places like the kernel's BUG() where inline assembly is used to transfer control flow. This eliminated the need for an endless loop in these places. The patch adds a new macro 'unreachable()' that will expand to either __builtin_unreachable() or an endless loop depending on the compiler version. 
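For illustration only, the fallback described above can be sketched in plain C. This is not the kernel's actual compiler.h text: the version test is a simplified assumption, and `weight_of()` is a hypothetical helper that exists only to show a call site.

```c
/*
 * Sketch, not the kernel's definition: use __builtin_unreachable() when
 * the compiler reports itself as GCC 4.5 or later, otherwise fall back
 * to an endless loop, as the commit message describes.
 */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
#define unreachable() __builtin_unreachable()
#else
#define unreachable() do { } while (1)
#endif

/* Hypothetical helper: every value of (code & 3) is handled in the switch. */
static int weight_of(int code)
{
	switch (code & 3) {
	case 0: return 1;
	case 1: return 2;
	case 2: return 4;
	case 3: return 8;
	}
	/* Control never gets here; tell the compiler so. */
	unreachable();
}
```

In BUG() itself the unreachable point follows inline assembly that transfers control; the switch here merely stands in for that, since standalone code has no such trap.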
Change from v1: Simplify unreachable() for non-GCC 4.5 case. Signed-off-by: David Daney Acked-by: Ralf Baechle Signed-off-by: Linus Torvalds commit d103d01e4b19f185d3c85f77402b605534c32e89 Merge: 26fb20d 6f5f672 Author: Ingo Molnar Date: Thu Dec 3 20:11:37 2009 +0100 Merge branch 'perf/probes' into perf/core Merge reason: add these fixes to 'perf probe'. Signed-off-by: Ingo Molnar commit 26fb20d008d841268545c25bb183f21ed16db891 Merge: 23ba90e 767df1b Author: Ingo Molnar Date: Thu Dec 3 20:10:59 2009 +0100 Merge branch 'perf/mce' into perf/core Merge reason: It's ready for v2.6.33. Signed-off-by: Ingo Molnar commit 23ba90e328fd2326378447cafafa47defdfc83c2 Merge: e859cf8 8ea339a Author: Ingo Molnar Date: Thu Dec 3 20:10:35 2009 +0100 Merge branch 'perf/scripting' into perf/core Merge reason: it's ready for v2.6.33. Signed-off-by: Ingo Molnar commit 7d1849aff6687a135a8da3a75e32a00e3137a5e2 Author: Mikael Pettersson Date: Thu Dec 3 15:52:44 2009 +0100 x86, apic: Enable lapic nmi watchdog on AMD Family 11h The x86 lapic nmi watchdog does not recognize AMD Family 11h, resulting in: NMI watchdog: CPU not supported As far as I can see from available documentation (the BKDM), family 11h looks identical to family 10h as far as the PMU is concerned. Extending the check to accept family 11h results in: Testing NMI watchdog ... OK. I've been running with this change on a Turion X2 Ultra ZM-82 laptop for a couple of weeks now without problems. Signed-off-by: Mikael Pettersson Cc: Andreas Herrmann Cc: Joerg Roedel Cc: LKML-Reference: <19223.53436.931768.278021@pilspetsen.it.uu.se> Signed-off-by: Ingo Molnar commit 26bb7505cf7db3560286be9f6384b6d3911f78b5 Author: Steven Whitehouse Date: Fri Nov 27 10:31:11 2009 +0000 GFS2: Fix glock refcount issues This patch fixes some ref counting issues. Firstly by moving the point at which we drop the ref count after a dlm lock operation has completed we ensure that we never call gfs2_glock_hold() on a lock with a zero ref count. 
Secondly, by using atomic_dec_and_lock() in gfs2_glock_put() we ensure that at no time will a glock with zero ref count appear on the lru_list. That means that we can remove the check for this in our shrinker (which was racy). Signed-off-by: Steven Whitehouse commit c29cd9004e72acb5a6cb8caf08508f1c5edee686 Author: Wu Fengguang Date: Wed Nov 18 18:09:41 2009 +0800 writeback: remove unused nonblocking and congestion checks (gfs2) No one is calling wb_writeback and write_cache_pages with wbc.nonblocking=1 any more. And lumpy pageout will want to do nonblocking writeback without the congestion wait. Signed-off-by: Wu Fengguang Signed-off-by: Steven Whitehouse commit 9ae3c6de6981a1e8765b5d029f94555fc0f0fea0 Author: Benjamin Marzinski Date: Tue Nov 10 12:54:56 2009 -0600 GFS2: drop rindex glock to refresh rindex list When a gfs2 filesystem is grown, it needs to rebuild the rindex list to be able to use the new space. gfs2 does this when the rindex is marked not uptodate, which happens when the rindex glock is dropped. However, on a single node setup, there is never any reason to drop the rindex glock, so gfs2 never invalidates the rindex. This patch makes gfs2 automatically drop the rindex glock after the filesystem grows, so it can refresh the rindex list. Signed-off-by: Benjamin Marzinski Signed-off-by: Steven Whitehouse commit 0ab7d13fcbd7ce1658c563e345990ba453719deb Author: Steven Whitehouse Date: Fri Nov 6 16:20:51 2009 +0000 GFS2: Tag all metadata with jid There are two spare fields in the header common to all GFS2 metadata. One is just the right size to fit a journal id in it, and this patch updates the journal code so that each time a metadata block is modified, we tag it with the journal id of the node which is performing the modification. The reason for this is that it should make it much easier to debug issues which arise if we can tell which node was the last to modify a particular metadata block.
Since the field is updated before the block is written into the journal, each journal should only contain metadata which is tagged with its own journal id. The one exception to this is the journal header block, which might have a different node's id in it, if that journal was recovered by another node in the cluster. Thus each journal will contain a record of which nodes recovered it, via the journal header. The other field in the metadata header could potentially be used to hold information about what kind of operation was performed, but for the time being we just zero it on each transaction so that if we use it for that in future, we'll know that the information (where it exists) is reliable. I did consider using the other field to hold the journal sequence number, however since in GFS2's journaling we write the modified data into the journal and not the original data, this gives no information as to what action caused the modification, so I think we can probably come up with a better use for those 64 bits in the future. Signed-off-by: Steven Whitehouse commit 2c77634965ee28c8b4790ffb5e83dd5ff7ac8988 Author: Steven Whitehouse Date: Fri Nov 6 11:10:51 2009 +0000 GFS2: Locking order fix in gfs2_check_blk_state In some cases we already have the rindex lock when we enter this function. Signed-off-by: Steven Whitehouse commit 1579343a73e32b5886e186e8f3e4db85e420ed3f Author: Steven Whitehouse Date: Fri Nov 6 11:06:37 2009 +0000 GFS2: Remove dirent_first() function This function only had one caller left, and that caller only called it for leaf blocks, hence one branch of the "if" was never taken. In addition the call to get_left had already verified the metadata type, so the function can be reduced to a single line of code in its caller. 
Signed-off-by: Steven Whitehouse commit cdcfde62dac64c86ff34e483c595d568a252c433 Author: Steven Whitehouse Date: Fri Oct 30 10:48:53 2009 +0000 GFS2: Display nobarrier option in /proc/mounts Since the default is barriers on, this only displays the nobarrier option when that is active. Signed-off-by: Steven Whitehouse commit f25934c5f88655a8d5c3c40a540daed1f0e6dedc Author: Christoph Hellwig Date: Fri Oct 30 08:03:27 2009 +0100 GFS2: add barrier/nobarrier mount options Currently gfs2 issues barriers unconditionally. There are various reasons to disable them, be that just for testing or for stupid devices flushing large battery-backed caches. Add a nobarrier option that matches xfs and btrfs for this. Also add a symmetric barrier option to turn it back on at remount time. Signed-off-by: Christoph Hellwig Signed-off-by: Steven Whitehouse commit c14f5735e724cb5338ca8298d42b1658008a10d7 Author: Benjamin Marzinski Date: Mon Oct 26 13:29:47 2009 -0500 GFS2: remove division from new statfs code It's not necessary to do any 64bit division for the statfs sync code, so remove it. Signed-off-by: Benjamin Marzinski Signed-off-by: Steven Whitehouse commit 3d3c10f2ce80d2a19e5e02023c2b7ab7086c36d5 Author: Benjamin Marzinski Date: Tue Oct 20 02:39:44 2009 -0500 GFS2: Improve statfs and quota usability GFS2 now has three new mount options, statfs_quantum, quota_quantum and statfs_percent. statfs_quantum and quota_quantum simply allow you to set the tunables of the same name. Setting statfs_quantum to 0 will also turn on the statfs_slow tunable. statfs_percent accepts an integer between 0 and 100. Numbers between 1 and 100 will cause GFS2 to do an early sync when the local number of blocks free changes by at least statfs_percent from the total number of blocks free. Setting statfs_percent to 0 disables this.
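As a rough illustration of the statfs_percent rule described above, the threshold test can be written without any 64-bit division by comparing cross-multiplied values. This is a sketch under stated assumptions, not the GFS2 implementation: `needs_early_sync()` and its parameters are hypothetical names, and the block counts are assumed small enough that multiplying by 100 does not overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check, not GFS2 code: report whether the local change in
 * free blocks has reached `percent` of the total free blocks.
 * percent == 0 disables the early sync, matching the description above.
 */
static bool needs_early_sync(int64_t local_change, uint64_t total_free,
			     unsigned int percent)
{
	uint64_t magnitude;

	if (percent == 0)
		return false;

	/* Absolute value of the change, computed in unsigned arithmetic. */
	magnitude = local_change < 0 ? -(uint64_t)local_change
				     : (uint64_t)local_change;

	/* magnitude / total_free >= percent / 100, without dividing. */
	return magnitude * 100 >= total_free * (uint64_t)percent;
}
```

For example, with 100 blocks free in total and statfs_percent set to 10, a local change of 5 blocks stays under the threshold, while a change of 10 blocks in either direction reaches it.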
Signed-off-by: Benjamin Marzinski Signed-off-by: Steven Whitehouse commit 2ec4650526c5a94d96bb760001fe0685b15988de Author: Steven Whitehouse Date: Mon Sep 28 12:49:15 2009 +0100 GFS2: Use dquot_send_warning() This adds support to GFS2 to send quota warnings via netlink. Also it removes a stray \r which was left over from when the code used to print warnings on the console. Signed-off-by: Steven Whitehouse commit 86e931a35e93d94e6e91b57cc76456e16d188ea9 Author: Steven Whitehouse Date: Mon Sep 28 12:35:17 2009 +0100 VFS: Export dquot_send_warning Sending a message to userspace in a generic format to warn of events (e.g. quota exceeded) in the quota subsystem is a generically useful feature. This patch makes some minor changes to the send_message function from dquot.c renaming it quota_send_message, moving it to quota.c and exporting it for use by filesystems which do not use the dquot code. Signed-off-by: Steven Whitehouse commit e285c100362762f7440643be637dd332460fdc75 Author: Steven Whitehouse Date: Wed Sep 23 13:50:49 2009 +0100 GFS2: Add set_xquota support This patch adds the ability to set GFS2 quota limit and warning levels via the XFS quota API. Signed-off-by: Steven Whitehouse commit 113d6b3c99bf30d8083068d00e3c7304d91d4845 Author: Steven Whitehouse Date: Mon Sep 28 11:52:16 2009 +0100 GFS2: Add get_xquota support This adds support for viewing the current GFS2 quota settings via the XFS quota API. The setting of quotas will be addressed in a later patch. Fields which are not supported here are left set to zero. Signed-off-by: Steven Whitehouse Reviewed-by: Bob Peterson commit 1e72c0f7c40e665d2ed40014750fdd2fa9968bcf Author: Steven Whitehouse Date: Tue Sep 15 20:42:56 2009 +0100 GFS2: Clean up gfs2_adjust_quota() and do_glock() Both of these functions contained confusing and in one case duplicate code. This patch adds a new check in do_glock() so that we report -ENOENT if we are asked to sync a quota entry which doesn't exist. 
Due to the previous patch this is now reported correctly to userspace. Also there are a few new comments, and I hope that the code is easier to understand now. Signed-off-by: Steven Whitehouse commit 6a6ada81e4ffc222bf7e54ea7503c7cc98b4f0d8 Author: Steven Whitehouse Date: Tue Sep 15 16:30:38 2009 +0100 GFS2: Remove constant argument from qd_get() This function was only ever called with the "create" argument set to true, so we can remove it. Signed-off-by: Steven Whitehouse commit 33a82529e7007ed7beceebc6b3f3cddadb5b67f0 Author: Steven Whitehouse Date: Tue Sep 15 16:25:40 2009 +0100 GFS2: Remove constant argument from qdsb_get() The "create" argument to qdsb_get() was only ever set to true, so this patch removes that argument. Signed-off-by: Steven Whitehouse commit ea7623385930c63e2a35749cff9db72094cd06ad Author: Steven Whitehouse Date: Tue Sep 15 16:20:30 2009 +0100 GFS2: Add proper error reporting to quota sync via sysfs For some reason, the errors were not making it to userspace. Signed-off-by: Steven Whitehouse commit 1d371b5e179d943491a5fddad211cb317f38a142 Author: Steven Whitehouse Date: Fri Sep 11 15:57:27 2009 +0100 GFS2: Add get_xstate quota function This allows querying of the quota state via the XFS quota API. Signed-off-by: Steven Whitehouse commit 91094d0fb650decd8bf48b85d86c892d7ca913ee Author: Steven Whitehouse Date: Fri Sep 11 15:21:56 2009 +0100 GFS2: Remove obsolete code in quota.c There is no point in testing for GLF_DEMOTE here, we might as well always release the glock at that point. Signed-off-by: Steven Whitehouse commit cc632e7f93465597896862cf9e50baefb1999215 Author: Steven Whitehouse Date: Tue Sep 15 09:59:02 2009 +0100 GFS2: Hook gfs2_quota_sync into VFS via gfs2_quotactl_ops The plan is to add further operations to the gfs2_quotactl_ops in future patches. The sync operation is easy, so we start with that one. We plan to use the XFS quota control functions because they more closely match the GFS2 ones. 
Signed-off-by: Steven Whitehouse commit 8c42d637f6f2859e0fb28b78d5add7f0dc6d0973 Author: Steven Whitehouse Date: Fri Sep 11 14:36:44 2009 +0100 GFS2: Alter arguments of gfs2_quota/statfs_sync These two functions are altered so that gfs2_quota_sync may in future be called directly from the VFS. The GFS2 superblock changes to a VFS super block and there is an addition of an int argument which is currently ignored. Signed-off-by: Steven Whitehouse commit ab201832f75f58c8f5093436363f80ffa4a4c9a8 Author: Steven Whitehouse Date: Tue Sep 29 16:31:03 2009 +0100 VFS: Use GFP_NOFS in posix_acl_from_xattr() GFS2 needs to call this from under a glock, so we need GFP_NOFS and I suspect that other filesystems might require this too. Signed-off-by: Steven Whitehouse commit 106381bfba997b83b64f68f2210e154162fc38e6 Author: Steven Whitehouse Date: Tue Sep 29 16:26:23 2009 +0100 GFS2: Add cached ACLs support The other patches in this series have been building towards being able to support cached ACLs like other filesystems. The only real difference with GFS2 is that we have to invalidate the cache when we drop a glock, but that is dealt with in earlier patches. Signed-off-by: Steven Whitehouse commit 479c427dd60fe1aadbbf2e6cbf2f84942baeb210 Author: Steven Whitehouse Date: Fri Oct 2 12:00:00 2009 +0100 GFS2: Clean up ACLs To prepare for support for caching of ACLs, this cleans up the GFS2 ACL support by pushing the xattr code back into xattr.c and changing the acl_get function into one which only returns ACLs so that we can drop the caching function into it shortly. Signed-off-by: Steven Whitehouse commit 69dca42464962d8d0989b7e09877ba644c9cba66 Author: Steven Whitehouse Date: Tue Sep 29 12:40:19 2009 +0100 GFS2: Use gfs2_set_mode() instead of munge_mode() These two functions do the same thing, so lets only use one of them. 
Signed-off-by: Steven Whitehouse commit c65f7fb5342ecb8cb85e9b676327b3a43a5a4735 Author: Steven Whitehouse Date: Fri Oct 2 11:54:39 2009 +0100 GFS2: Use forget_all_cached_acls() Invalidate all the cached ACLs when we drop the glock. Signed-off-by: Steven Whitehouse commit 796bd9524731850967d437b7f47a86acc776ea89 Author: Steven Whitehouse Date: Tue Sep 29 12:27:23 2009 +0100 VFS: Add forget_all_cached_acls() This is required for cluster filesystems which want to use cached ACLs so that they can invalidate the cache when required. Signed-off-by: Steven Whitehouse Cc: Alexander Viro Cc: Christoph Hellwig commit 2646a1f61a3b5525914757f10fa12b5b94713648 Author: Steven Whitehouse Date: Fri Oct 2 11:50:54 2009 +0100 GFS2: Fix up system xattrs This code has been shamelessly stolen from XFS at the suggestion of Christoph Hellwig. I've not added support for cached ACLs so far... watch for that in a later patch, although this is designed in such a way that they should be easy to add. Signed-off-by: Steven Whitehouse Cc: Christoph Hellwig commit f55073ff1eaf99f6b3bc62134a456638bca043a3 Author: Steven Whitehouse Date: Mon Sep 28 10:30:49 2009 +0100 GFS2: Fix -o meta mounts for subsequent mounts (i.e. all but the first one) We have a long term plan to use the "-o meta" flag to GFS2 mounts to access the alternate root which is used to store metadata for a GFS2 filesystem. This will allow us to eventually remove support for the gfs2meta filesystem type (which is in any case just a "front end" to the gfs2 filesystem type with the meta/master root). Currently the "-o meta" option is only taken into account on the initial mount of the filesystem. Subsequent mounts of the same filesystem (i.e. on the same device) result in basically the same as bind mounting the root of the original mount. This patch changes that by using what is more or less a copy of get_sb_bdev() and extending it so that it will take into account the alternate root in all cases. 
The main difference is that we have to parse the mount options a bit earlier. We can then use them to select the appropriate root towards the end of the function. In addition this also fixes a bug where it was possible (but certainly not desirable) to set different ro/rw options for the meta root when mounted via the gfs2meta fs compared with the original mount. Signed-off-by: Steven Whitehouse Cc: Alexander Viro commit 7e71c55ee73988d0cb61045660b899eaac23bf8f Author: Steven Whitehouse Date: Tue Sep 22 10:56:16 2009 +0100 GFS2: Fix potential race in glock code We need to be careful of the ordering between clearing the GLF_LOCK bit and scheduling the workqueue. Signed-off-by: Steven Whitehouse commit c08f782985eed9959438368e84ce1d7f2ed03d95 Author: Frederic Weisbecker Date: Wed Dec 2 20:49:17 2009 +0100 mutex: Fix missing conditions to build mutex_spin_on_owner() We don't need to build mutex_spin_on_owner() if we have CONFIG_DEBUG_MUTEXES or CONFIG_HAVE_DEFAULT_NO_SPIN_MUTEXES as it won't be used under such configs. Use CONFIG_MUTEX_SPIN_ON_OWNER as it gathers all the necessary checks before building it. Signed-off-by: Frederic Weisbecker Acked-by: Peter Zijlstra LKML-Reference: <1259783357-8542-2-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar Cc: Peter Zijlstra commit c02260277e472095ffb3ad893be5eeab9dcefde3 Author: Frederic Weisbecker Date: Wed Dec 2 20:49:16 2009 +0100 mutex: Better control mutex adaptive spinning config Introduce CONFIG_MUTEX_SPIN_ON_OWNER so that we can centralize in a single place the conditions that determine its definition and use. Signed-off-by: Frederic Weisbecker Acked-by: Peter Zijlstra LKML-Reference: <1259783357-8542-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar Cc: Peter Zijlstra commit 4528752f49c1f4025473d12bc5fa9181085c3f22 Author: Darrick J. 
Wong
Date: Wed Dec 2 15:05:56 2009 -0800

x86, Calgary IOMMU quirk: Find nearest matching Calgary while walking up the PCI tree

On a multi-node x3950M2 system, there's a slight oddity in the PCI device tree for all secondary nodes:

30:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
 \-33:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
    \-34:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)

...as compared to the primary node:

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
 \-01:00.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
03:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
 \-04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)

In both nodes, the LSI RAID controller hangs off a CalIOC2 device, but on the secondary nodes, the BIOS hides the VGA device and substitutes the device tree ending with the disk controller.

It would seem that Calgary devices don't necessarily appear at the top of the PCI tree, which means that the current code to find the Calgary IOMMU that goes with a particular device is buggy.

Rather than walk all the way to the top of the PCI device tree and try to match bus number with Calgary descriptor, the code needs to examine each parent of the particular device; if it encounters a Calgary with a matching bus number, simply use that.

Otherwise, we BUG() when the bus number of the Calgary doesn't match the bus number of whatever's at the top of the device tree.

Extra note: This patch appears to work correctly for the x3950 that came before the x3950 M2.

Signed-off-by: Darrick J. Wong
Acked-by: Muli Ben-Yehuda
Cc: FUJITA Tomonori
Cc: Joerg Roedel
Cc: Yinghai Lu
Cc: Jon D. Mason
Cc: Corinna Schultz
Cc:
LKML-Reference: <20091202230556.GG10295@tux1.beaverton.ibm.com>
Signed-off-by: Ingo Molnar

commit 8bfb2f8e655b9d0c45fde679fcd5fd97e34513db
Author: Paul E.
McKenney
Date: Wed Dec 2 12:10:16 2009 -0800

rcu: Make RCU's CPU-stall detector be default

The RCU_CPU_STALL_DETECTOR costs almost nothing and has located some bugs that might otherwise have been difficult to track down.

Make it be default for the TREE RCU implementations.

The vmlinux size impact is limited (on 64-bit x86 defconfig):

   text    data     bss      dec    hex filename
8440248 1260076  995588 10695912 a334e8 vmlinux.before
8440774 1260060  995588 10696422 a336e6 vmlinux.after

+526 bytes - acceptable default cost.

For RAM starved systems, TINY_RCU does not support CPU-stall detection and is much smaller, but then again it is a uniprocessor...

Signed-off-by: Paul E. McKenney
Acked-by: Lai Jiangshan
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <12597846162906-git-send-email->
[ v2: added image size calculations to the changelog ]
Signed-off-by: Ingo Molnar

commit d9a3da0699b24a589b27a61e1a5b5bd30d9db669
Author: Paul E. McKenney
Date: Wed Dec 2 12:10:15 2009 -0800

rcu: Add expedited grace-period support for preemptible RCU

Implement a synchronize_rcu_expedited() for preemptible RCU that actually is expedited. This uses synchronize_sched_expedited() to force all threads currently running in a preemptible-RCU read-side critical section onto the appropriate ->blocked_tasks[] list, then takes a snapshot of all of these lists and waits for them to drain.

Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1259784616158-git-send-email->
Signed-off-by: Ingo Molnar

commit cf244dc01bf68e1ad338b82447f8686d24ea4435
Author: Paul E.
McKenney
Date: Wed Dec 2 12:10:14 2009 -0800

rcu: Enable fourth level of TREE_RCU hierarchy

Enable a fourth level of rcu_node hierarchy for TREE_RCU and TREE_PREEMPT_RCU. This is for stress-testing and experimental purposes only, although in theory this would enable 16,777,216 CPUs on 64-bit systems, though only 1,048,576 CPUs on 32-bit systems. Experimental use of this fourth level will normally set CONFIG_RCU_FANOUT=2, requiring a 16-CPU system, though the more adventurous (and more fortunate) experimenters may wish to choose CONFIG_RCU_FANOUT=3 for 81-CPU systems or even CONFIG_RCU_FANOUT=4 for 256-CPU systems.

Signed-off-by: Paul E. McKenney
Acked-by: Josh Triplett
Acked-by: Lai Jiangshan
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <12597846161257-git-send-email->
Signed-off-by: Ingo Molnar

commit d3f6bad3911736e44ba11f3f3f6ac4e8c837fdfc
Author: Paul E. McKenney
Date: Wed Dec 2 12:10:13 2009 -0800

rcu: Rename "quiet" functions

The number of "quiet" functions has grown recently, and the names are no longer very descriptive. The point of all of these functions is to do some portion of the task of reporting a quiescent state, so rename them accordingly:

o cpu_quiet() becomes rcu_report_qs_rdp(), which reports a quiescent state to the per-CPU rcu_data structure. If this turns out to be a new quiescent state for this grace period, then rcu_report_qs_rnp() will be invoked to propagate the quiescent state up the rcu_node hierarchy.

o cpu_quiet_msk() becomes rcu_report_qs_rnp(), which reports a quiescent state for a given CPU (or possibly a set of CPUs) up the rcu_node hierarchy.

o cpu_quiet_msk_finish() becomes rcu_report_qs_rsp(), which reports a full set of quiescent states to the global rcu_state structure.

o task_quiet() becomes rcu_report_unblock_qs_rnp(), which reports a quiescent state due to a task exiting an RCU read-side critical section that had previously blocked in that same critical section. As indicated by the new name, this type of quiescent state is reported up the rcu_node hierarchy (using rcu_report_qs_rnp() to do so). Signed-off-by: Paul E. McKenney Acked-by: Josh Triplett Acked-by: Lai Jiangshan Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12597846163698-git-send-email-> Signed-off-by: Ingo Molnar commit c84d6efd363a3948eb32ec40d46bab6338580454 Merge: 7539cf4 22763c5 Author: James Morris Date: Thu Dec 3 12:03:40 2009 +0530 Merge branch 'master' into next commit 7cff7ce94a7df2ccf5ac76b48ee0995fee2060df Author: Andrew Morton Date: Fri Oct 9 00:01:39 2009 -0700 include/linux/compiler-gcc4.h: Fix build bug - gcc-4.0.2 doesn't understand __builtin_object_size Maybe 4.1.0 doesn't too, but this fixed it for me. Caused by: 4a31276: x86: Turn the copy_from_user check into an (optional) compile time warning 9f0cf4a: x86: Use __builtin_object_size() to validate the buffer size for copy_from_user() Signed-off-by: Andrew Morton Cc: Arjan van de Ven LKML-Reference: <200910090724.n997OQl6013538@imap1.linux-foundation.org> Signed-off-by: Ingo Molnar commit 0cf55e1ec08bb5a22e068309e2d8ba1180ab4239 Author: Hidetoshi Seto Date: Wed Dec 2 17:28:07 2009 +0900 sched, cputime: Introduce thread_group_times() This is a real fix for problem of utime/stime values decreasing described in the thread: http://lkml.org/lkml/2009/11/3/522 Now cputime is accounted in the following way: - {u,s}time in task_struct are increased every time when the thread is interrupted by a tick (timer interrupt). - When a thread exits, its {u,s}time are added to signal->{u,s}time, after adjusted by task_times(). 
- When all threads in a thread group exit, the accumulated {u,s}time (and also c{u,s}time) in the signal struct are added to c{u,s}time in the signal struct of the group's parent.

So {u,s}time in the task struct are "raw" tick counts, while {u,s}time and c{u,s}time in the signal struct are "adjusted" values.

The accounted values are used by:

- task_times(), to get the cputime of a thread: this function returns adjusted values that originate from the raw {u,s}time, scaled by the sum_exec_runtime accounted by CFS.

- thread_group_cputime(), to get the cputime of a thread group: this function returns the sum of {u,s}time of all living threads in the group, plus the {u,s}time in the signal struct, which is the sum of the adjusted cputimes of all exited threads that belonged to the group.

The problem is the return value of thread_group_cputime(), because it is a mixed sum of "raw" and "adjusted" values:

  group's {u,s}time = foreach(thread){{u,s}time} + exited({u,s}time)

This misbehavior can break {u,s}time monotonicity. If a thread has raw values greater than its adjusted values (e.g. it was interrupted by 1000Hz ticks 50 times but only ran for 45ms) and it exits, the group's cputime decreases (e.g. by 5ms).

To fix this, we could do:

  group's {u,s}time = foreach(t){task_times(t)} + exited({u,s}time)

But task_times() contains hard divisions, so applying it to every thread should be avoided.

This patch fixes the above problem in the following way:

- Modify thread exit (= __exit_signal()) not to use task_times(). This means {u,s}time in the signal struct accumulates raw values instead of adjusted values. As a result, thread_group_cputime() returns a pure sum of "raw" values.

- Introduce a new function thread_group_times(*task, *utime, *stime) that converts the "raw" values of thread_group_cputime() to "adjusted" values, using the same calculation procedure as task_times().

- Modify group exit (= wait_task_zombie()) to use the newly introduced thread_group_times().
It makes c{u,s}time in the signal struct hold adjusted values, as before this patch.

- Replace some thread_group_cputime() calls with thread_group_times(). These replacements are applied only where the "adjusted" cputime is conveyed to users and where task_times() is already used nearby (i.e. sys_times(), getrusage(), and /proc//stat).

This patch has a positive side effect:

- Before this patch, if a group contained many short-lived threads (e.g. each runs for 0.9ms and is not interrupted by ticks), the group's cputime could be invisible, since each thread's cputime was accumulated after being adjusted: imagine the adjustment function as adj(ticks, runtime), then {adj(0, 0.9) + adj(0, 0.9) + ....} = {0 + 0 + ....} = 0. After this patch this no longer happens, because the adjustment is applied after accumulation.

v2:
- remove if()s, put new variables into signal_struct.

Signed-off-by: Hidetoshi Seto
Acked-by: Peter Zijlstra
Cc: Spencer Candland
Cc: Americo Wang
Cc: Oleg Nesterov
Cc: Balbir Singh
Cc: Stanislaw Gruszka
LKML-Reference: <4B162517.8040909@jp.fujitsu.com>
Signed-off-by: Ingo Molnar

commit d99ca3b977fc5a93141304f571475c2af9e6c1c5
Author: Hidetoshi Seto
Date: Wed Dec 2 17:26:47 2009 +0900

sched, cputime: Cleanups related to task_times()

- Remove if({u,s}t)s because no one calls it with NULL now.
- Use cputime_{add,sub}().
- Add ifndef-endif for prev_{u,s}time since they are used only when !VIRT_CPU_ACCOUNTING.

Signed-off-by: Hidetoshi Seto
Cc: Peter Zijlstra
Cc: Spencer Candland
Cc: Americo Wang
Cc: Oleg Nesterov
Cc: Balbir Singh
Cc: Stanislaw Gruszka
LKML-Reference: <4B1624C7.7040302@jp.fujitsu.com>
Signed-off-by: Ingo Molnar

commit be8147e68625a1adb111acfd6b98a492be4b74d0
Author: Tim Blechmann
Date: Wed Dec 2 12:32:10 2009 +0100

Revert "sched, x86: Optimize branch hint in __switch_to()"

This reverts commit a3a1de0c34de6f5f8332cd6151c46af7813c0fcb.

Commit 8ec6993d9f7d961014af970ded57542961fe9ad9 cleared the es and ds selectors, so the original branch hints are correct now.
Therefore the branch hint doesn't need to be removed. Signed-off-by: Tim Blechmann LKML-Reference: <4B16503A.8030508@klingt.org> Signed-off-by: Ingo Molnar commit 99063c0bcebcc913165a5d168050326eba3e0996 Author: Jan Beulich Date: Fri Nov 27 15:06:16 2009 +0000 x86/alternatives: No need for alternatives-asm.h to re-invent stuff already in asm.h This at once also gets the alignment specification right for x86-64. Signed-off-by: Jan Beulich LKML-Reference: <4B0FF8F80200007800022708@vpn.id2.novell.com> Signed-off-by: Ingo Molnar commit 01be50a308be466e122c3a8b3d535f1b673ecbd2 Author: Jan Beulich Date: Fri Nov 27 15:04:58 2009 +0000 x86/alternatives: Check replacementlen <= instrlen at build time Having run into the run-(boot-)time check a couple of times lately, I finally took time to find a build-time check so that one doesn't need to analyze the register/stack dump and resolve this (through manual lookup in vmlinux) to the offending construct. The assembler will emit a message like "Error: value of too large for field of 1 bytes at ", which while not pointing out the source location still makes analysis quite a bit easier. Signed-off-by: Jan Beulich LKML-Reference: <4B0FF8AA0200007800022703@vpn.id2.novell.com> Signed-off-by: Ingo Molnar commit bdddd2963c0264c56f18043f6fa829d3c1d3d1c0 Author: Rusty Russell Date: Wed Dec 2 14:09:16 2009 +1030 sched: Fix isolcpus boot option Anton Blanchard wrote: > We allocate and zero cpu_isolated_map after the isolcpus > __setup option has run. This means cpu_isolated_map always > ends up empty and if CPUMASK_OFFSTACK is enabled we write to a > cpumask that hasn't been allocated. I introduced this regression in 49557e620339cb13 (sched: Fix boot crash by zalloc()ing most of the cpu masks). Use the bootmem allocator if they set isolcpus=, otherwise allocate and zero like normal. 
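The ordering problem quoted above can be modelled with a small sketch. This is a toy model only: `BootModel`, its methods, and `isolated_after_boot()` are hypothetical names, a Python set stands in for cpu_isolated_map, and only the "always ends up empty" half of the report is covered (not the CPUMASK_OFFSTACK write to an unallocated mask).

```python
class BootModel:
    """Toy model of early boot ordering; not kernel code."""

    def __init__(self):
        # Stands in for cpu_isolated_map; None means "not allocated yet".
        self.isolated = None

    def parse_isolcpus(self, arg):
        # Early option parsing: allocate on demand, then record the CPUs.
        if self.isolated is None:
            self.isolated = set()
        self.isolated.update(int(cpu) for cpu in arg.split(","))

    def init_masks_buggy(self):
        # Regression: allocate-and-zero unconditionally, after the option
        # has already run, which throws away the parsed CPUs.
        self.isolated = set()

    def init_masks_fixed(self):
        # Fix: allocate-and-zero only if the option did not set the mask up.
        if self.isolated is None:
            self.isolated = set()


def isolated_after_boot(arg, fixed):
    """Return the isolated-CPU set after a simulated boot."""
    boot = BootModel()
    if arg is not None:
        boot.parse_isolcpus(arg)
    if fixed:
        boot.init_masks_fixed()
    else:
        boot.init_masks_buggy()
    return boot.isolated
```

With the buggy ordering, `isolated_after_boot("2,3", fixed=False)` yields an empty set; with the fixed ordering the parsed CPUs survive.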
Reported-by: Anton Blanchard Signed-off-by: Rusty Russell Cc: peterz@infradead.org Cc: Linus Torvalds Cc: LKML-Reference: <200912021409.17013.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar Tested-by: Anton Blanchard commit fa1452e808732ae10e8b1267fd75fc2d028d634b Author: Hiroshi Shimamoto Date: Mon Nov 30 14:59:44 2009 +0900 locking, task_struct: Reduce size on TRACE_IRQFLAGS and 64bit Reorder task_struct field for TRACE_IRQFLAGS to remove padding on 64-bit. Signed-off-by: Hiroshi Shimamoto Cc: Peter Zijlstra LKML-Reference: <4B135F50.8070302@ct.jp.nec.com> Signed-off-by: Ingo Molnar commit e859cf8656043f158b4004ccc8cbbf1ba4f97177 Author: Masami Hiramatsu Date: Mon Nov 30 19:02:22 2009 -0500 x86: Fix comments of register/stack access functions Fix typos and some redundant comments of register/stack access functions in asm/ptrace.h. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Frederic Weisbecker Cc: Roland McGrath Cc: Oleg Nesterov Cc: Wenji Huang Cc: Mahesh J Salgaonkar LKML-Reference: <20091201000222.7669.7477.stgit@harusame> Signed-off-by: Ingo Molnar Suggested-by: Wenji Huang commit 93aaa45a6ad3f983180223601fc663cc551ad499 Author: Liming Wang Date: Wed Dec 2 16:42:54 2009 +0800 perf tools: Replace %m with %a in sscanf Not all glibc support %m and it results in a compile error if %m not supported. Replace it with %a and (float *) casts. Signed-off-by: Liming Wang Acked-by: Frederic Weisbecker Cc: peterz@infradead.org Cc: mhiramat@redhat.com LKML-Reference: <1259743374-9950-1-git-send-email-liming.wang@windriver.com> Signed-off-by: Ingo Molnar commit 6d20792e85187b27ae3d1b76678a2dd7025e8bc2 Author: Suresh Siddha Date: Tue Dec 1 15:31:18 2009 -0800 x86: Remove unnecessary mdelay() from cpu_disable_common() fixup_irqs() already has a mdelay(). Remove the extra and unnecessary mdelay() from cpu_disable_common(). Signed-off-by: Suresh Siddha Cc: Maciej W. 
Rozycki Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.232177348@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit 1c83995b6c7c6bb795bce80f75fbffb15f78db2d Author: Suresh Siddha Date: Tue Dec 1 15:31:17 2009 -0800 x86, ioapic: Document another case when level irq is seen as an edge In the case when cpu goes offline, fixup_irqs() will forward any unhandled interrupt on the offlined cpu to the new cpu destination that is handling the corresponding interrupt. This interrupt forwarding is done via IPI's. Hence, in this case also level-triggered io-apic interrupt will be seen as an edge interrupt in the cpu's APIC IRR. Document this scenario in the code which handles this case by doing an explicit EOI to the io-apic to clear remote IRR of the io-apic RTE. Requested-by: Maciej W. Rozycki Signed-off-by: Suresh Siddha Cc: Maciej W. Rozycki Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.143970505@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit c29d9db338db606c3335a03f337e1d4b7f6bb727 Author: Suresh Siddha Date: Tue Dec 1 15:31:16 2009 -0800 x86, ioapic: Fix the EOI register detection mechanism Maciej W. Rozycki reported: > 82093AA I/O APIC has its version set to 0x11 and it > does not support the EOI register. Similarly I/O APICs > integrated into the 82379AB south bridge and the 82374EB/SB > EISA component. IO-APIC versions below 0x20 don't support EOI register. Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic version as 0x2. This is an error with documentation and these ICH chips use io-apic's of version 0x20 and indeed has a working EOI register for the io-apic. Fix the EOI register detection mechanism to check for version 0x20 and beyond. And also, a platform can potentially have io-apic's with different versions. Make the EOI register check per io-apic. Reported-by: Maciej W. 
Rozycki Signed-off-by: Suresh Siddha Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.065361533@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit ca64c47cecd0321b2e0dcbd7aaff44b68ce20654 Author: Maciej W. Rozycki Date: Tue Dec 1 15:31:15 2009 -0800 x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq When the level-triggered interrupt is seen as an edge interrupt, we try to clear the remoteIRR explicitly (using either an io-apic eoi register when present or through the idea of changing trigger mode of the io-apic RTE to edge and then back to level). But this explicit try also needs to happen before we try to migrate the irq. Otherwise irq migration attempt will fail anyhow, as it postpones the irq migration to a later attempt when it sees the remoteIRR in the io-apic RTE still set. Signed-off-by: "Maciej W. Rozycki" Reviewed-by: Suresh Siddha Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233334.975416130@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit 1cedae72904b85462082dbcfd5190309ba37f8bd Author: Frederic Weisbecker Date: Wed Dec 2 07:32:16 2009 +0100 hw-breakpoints: Keep track of user disabled breakpoints When we disable a breakpoint through dr7, we unregister it right away, making us lose track of its corresponding address register value. It means that the following sequence would be unsupported: - set address in dr0 - enable it through dr7 - disable it through dr7 - enable it through dr7 because we lost the address register value when we disabled the breakpoint. Don't unregister the disabled breakpoints but rather disable them. 
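The set/enable/disable/enable sequence above can be sketched as a toy model. The class `DebugRegs` and its `forget_on_disable` switch are hypothetical and only illustrate the bookkeeping difference between unregistering and disabling; they do not reflect the kernel's actual data structures.

```python
class DebugRegs:
    """Toy model of address registers (dr0..dr3) plus a dr7 enable mask."""

    def __init__(self, forget_on_disable):
        self.addr = {}        # slot -> address written through drN
        self.enabled = set()  # slots currently enabled through dr7
        # True models the old behaviour (unregister on disable),
        # False models the behaviour after the fix (just disable).
        self.forget_on_disable = forget_on_disable

    def set_addr(self, slot, address):
        self.addr[slot] = address

    def enable(self, slot):
        # Enabling only works if the address for the slot is still known.
        if slot not in self.addr:
            return False
        self.enabled.add(slot)
        return True

    def disable(self, slot):
        self.enabled.discard(slot)
        if self.forget_on_disable:
            # Old behaviour: the address register value is lost here.
            self.addr.pop(slot, None)


def reenable_works(forget_on_disable):
    """Run the sequence: set dr0, enable, disable, enable again."""
    regs = DebugRegs(forget_on_disable)
    regs.set_addr(0, 0x1000)
    regs.enable(0)
    regs.disable(0)
    return regs.enable(0)
```

In this model `reenable_works(True)` fails on the second enable, while `reenable_works(False)` succeeds because the address was kept.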
Reported-by: "K.Prasad" Signed-off-by: Frederic Weisbecker LKML-Reference: <1259735536-9236-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 6b62fe019e39edfd1dbe3f224ecd0a87d9365223 Author: Frederic Weisbecker Date: Wed Dec 2 07:23:10 2009 +0100 tracing/syscalls: Make syscall events print callbacks static enter_syscall_print_##sname and exit_syscall_print_##sname don't need to have a global scope. Make them static. Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Jason Baron Cc: Lai Jiangshan LKML-Reference: <1259734990-9034-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 3a9089fd78367e2c6c815129030b790a0f5c2715 Author: Jason Baron Date: Tue Dec 1 12:18:49 2009 -0500 tracing: Add DEFINE_EVENT(), DEFINE_SINGLE_EVENT() support to docbook The introduction of the new 'DECLARE_EVENT_CLASS()' obviates the need for the 'TRACE_EVENT()' macro in some cases. Thus, docbook style comments that used to live with 'TRACE_EVENT()' are now moved to 'DEFINE_EVENT()'. Thus, we need to make the docbook system understand the new 'DEFINE_EVENT()' macro. In addition I've tried to futureproof the patch, by also adding support for 'DEFINE_SINGLE_EVENT()', since there has been discussion about renaming: TRACE_EVENT() -> DEFINE_SINGLE_EVENT(). Without this patch the tracepoint docbook fails to build. I've verified that this patch correctly builds the tracepoint docbook which currently covers signals, and irqs. 
Changes in v2: - properly indent perl 'if' statements Signed-off-by: Jason Baron Acked-by: Steven Rostedt Acked-by: Randy Dunlap Cc: William Cohen Cc: Frederic Weisbecker Cc: Mathieu Desnoyers Cc: Masami Hiramatsu LKML-Reference: <200912011718.nB1HIn7t011371@int-mx04.intmail.prod.int.phx2.redhat.com> Signed-off-by: Ingo Molnar commit 8592e6486a177a02f048567cb928bc3a1f9a86c3 Author: Tejun Heo Date: Wed Dec 2 12:56:46 2009 +0900 sched: Revert 498657a478c60be092208422fefa9c7b248729c2 498657a478c60be092208422fefa9c7b248729c2 incorrectly assumed that preempt wasn't disabled around context_switch() and thus was fixing imaginary problem. It also broke KVM because it depended on ->sched_in() to be called with irq enabled so that it can do smp calls from there. Revert the incorrect commit and add comment describing different contexts under with the two callbacks are invoked. Avi: spotted transposed in/out in the added comment. Signed-off-by: Tejun Heo Acked-by: Avi Kivity Cc: peterz@infradead.org Cc: efault@gmx.de Cc: rusty@rustcorp.com.au LKML-Reference: <1259726212-30259-2-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar commit ec70ccd806111ba3caf596def91a8580138b12db Author: Kristian Høgsberg Date: Tue Dec 1 15:05:01 2009 -0500 perf: Don't free perf_mmap_data until work has been done In the CONFIG_PERF_USE_VMALLOC case, perf_mmap_data_free() only schedules the cleanup of the perf_mmap_data struct. In that case we have to wait until the work has been done before we free data. Signed-off-by: Kristian Høgsberg Cc: David S. 
Miller Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker Cc: LKML-Reference: <1259697901-1747-1-git-send-email-krh@bitplanet.net> Signed-off-by: Ingo Molnar commit bdad0db7dbdb37d0bb3c7d0f65cd3ff599ea6ecb Author: Xiao Guangrong Date: Wed Dec 2 16:08:41 2009 +0800 perf_event: Fix compile error Fix: cc1: warnings being treated as errors builtin-probe.c: In function 'cmd_probe': builtin-probe.c:163: error: unused variable 'fd' Signed-off-by: Xiao Guangrong Cc: Masami Hiramatsu Cc: Peter Zijlstra LKML-Reference: <4B162089.8000907@cn.fujitsu.com> [ v2: use NO_LIBDWARF instead of __used ] Signed-off-by: Ingo Molnar commit c19e33aa840e9202ef8d4c93056b59f3edc2208d Author: Liming Wang Date: Wed Dec 2 14:11:46 2009 +0800 perf tools: Fix _GNU_SOURCE macro related strndup() build error strndup is a GNU extension. So dont include string.h without defining _GNU_SOURCE (it results in a compile error otherwise). Remove these includes as util.h does it already. Signed-off-by: Liming Wang Acked-by: Frederic Weisbecker Acked-by: Xiao Guangrong Cc: peterz@infradead.org Cc: mhiramat@redhat.com LKML-Reference: <1259734306-26323-1-git-send-email-liming.wang@windriver.com> Signed-off-by: Ingo Molnar commit 7be077f56370cd52c48c08272b0867132f87bc48 Author: Lai Jiangshan Date: Tue Dec 1 16:24:06 2009 +0800 trace_syscalls: Remove unused syscall_name_to_nr() After duplications are removed, syscall_name_to_nr() is unused. 
Signed-off-by: Lai Jiangshan Acked-by: Jason Baron Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B14D2A6.6060803@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 3bbe84e9d385205d638035ee9dcc4db1b486ea08 Author: Lai Jiangshan Date: Tue Dec 1 16:24:01 2009 +0800 trace_syscalls: Simplify syscall profile use only one prof_sysenter_enable() instead of prof_sysenter_enable_##sname() use only one prof_sysenter_disable() instead of prof_sysenter_disable_##sname() use only one prof_sysexit_enable() instead of prof_sysexit_enable_##sname() use only one prof_sysexit_disable() instead of prof_sysexit_disable_##sname() Signed-off-by: Lai Jiangshan Acked-by: Jason Baron Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B14D2A1.8060304@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit a1301da0997bf73c44dbe584e9070a13adc89672 Author: Lai Jiangshan Date: Tue Dec 1 16:23:55 2009 +0800 trace_syscalls: Remove duplicate init_enter_##sname() use only one init_syscall_trace instead of many init_enter_##sname()/init_exit_##sname() Signed-off-by: Lai Jiangshan Acked-by: Jason Baron Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B14D29B.6090708@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit c252f65793874b56d50395ab604db465ce688665 Author: Lai Jiangshan Date: Tue Dec 1 16:23:47 2009 +0800 trace_syscalls: Add syscall_nr field to struct syscall_metadata Add syscall_nr field to struct syscall_metadata, it helps us to get syscall number easier. 
Signed-off-by: Lai Jiangshan
Acked-by: Jason Baron
Cc: Steven Rostedt
Cc: Frederic Weisbecker
LKML-Reference: <4B14D293.6090800@cn.fujitsu.com>
Signed-off-by: Ingo Molnar

commit fcc19438dda38dacc8c144e2db3ebc6b9fd4f8b8
Author: Lai Jiangshan
Date: Tue Dec 1 16:23:36 2009 +0800

trace_syscalls: Remove enter_id exit_id

use ->enter_event->id instead of ->enter_id
use ->exit_event->id instead of ->exit_id

Signed-off-by: Lai Jiangshan
Acked-by: Jason Baron
Cc: Steven Rostedt
Cc: Frederic Weisbecker
LKML-Reference: <4B14D288.7030001@cn.fujitsu.com>
Signed-off-by: Ingo Molnar

commit 31c16b13349970b2684248c7d8608d2a96ae135d
Author: Lai Jiangshan
Date: Tue Dec 1 16:23:30 2009 +0800

trace_syscalls: Set event_enter_##sname->data to its metadata

Set event_enter_##sname->data to its metadata; this makes the code simpler.

Signed-off-by: Lai Jiangshan
Acked-by: Jason Baron
Cc: Steven Rostedt
Cc: Frederic Weisbecker
LKML-Reference: <4B14D282.7050709@cn.fujitsu.com>
Signed-off-by: Ingo Molnar

commit bf56a4ea9f1683c5b223fd3a5dbea23f1fa91c34
Author: Lai Jiangshan
Date: Tue Dec 1 16:23:20 2009 +0800

trace_syscalls: Remove unused event_syscall_enter and event_syscall_exit

fix event_enter_##sname->event
fix event_exit_##sname->event
remove unused event_syscall_enter and event_syscall_exit

Signed-off-by: Lai Jiangshan
Acked-by: Jason Baron
Cc: Steven Rostedt
Cc: Frederic Weisbecker
LKML-Reference: <4B14D278.4090209@cn.fujitsu.com>
Signed-off-by: Ingo Molnar

commit 59d069eb5ae9b033ed1c124c92e1532c4a958991
Author: Xiao Guangrong
Date: Tue Dec 1 17:30:08 2009 +0800

perf_event: Initialize data.period in perf_swevent_hrtimer()

In the current code in perf_swevent_hrtimer(), data.period is not initialized. The result is obviously wrong:

# ./perf record -f -e cpu-clock make
# ./perf report
# Samples: 1740
#
# Overhead  Command  ......
# ........  ........ ..........................................
#
1025422183050275328.00%  sh    libc-2.9.90.so ...
1025422183050275328.00%  perl  libperl.so ...
1025422168240043264.00%  perl  [kernel] ...
1025422030011210752.00%  perl  [kernel] ...

Signed-off-by: Xiao Guangrong
Acked-by: Peter Zijlstra
Cc: Frederic Weisbecker
Cc:
LKML-Reference: <4B14E220.2050107@cn.fujitsu.com>
Signed-off-by: Ingo Molnar

commit b498ce1f2753b9724b2fc05d2057f7d1490cfa93
Author: Masami Hiramatsu
Date: Mon Nov 30 19:20:25 2009 -0500

perf probe: Simplify event naming

Simplify event naming as _. Each event name is globally unique (group name is not checked). So, if there is schedule_0, the next probe event on schedule() will be schedule_1.

Signed-off-by: Masami Hiramatsu
Cc: systemtap
Cc: DLE
Cc: Steven Rostedt
Cc: Jim Keniston
Cc: Ananth N Mavinakayanahalli
Cc: Christoph Hellwig
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: Jason Baron
Cc: K.Prasad
Cc: Peter Zijlstra
Cc: Srikar Dronamraju
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference: <20091201002024.10235.2353.stgit@harusame>
Signed-off-by: Ingo Molnar

commit 4de189fe6e5ad8241f6f8709d2e2ba4c3aeae33a
Author: Masami Hiramatsu
Date: Mon Nov 30 19:20:17 2009 -0500

perf probe: Add --list option for listing current probe events

Add --list option for listing currently defined probe events in the kernel. This shows events in the format below:

[group:event]

for example:

[probe:schedule_0] schedule+30 cpu

Note that source file/line information is not supported yet. So even if you added a probe by line, it will be shown in .

Signed-off-by: Masami Hiramatsu
Cc: systemtap
Cc: DLE
Cc: Steven Rostedt
Cc: Jim Keniston
Cc: Ananth N Mavinakayanahalli
Cc: Christoph Hellwig
Cc: Frank Ch.
Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091201002017.10235.76575.stgit@harusame> Signed-off-by: Ingo Molnar commit e1c01d61a98703fcc80d15b8068ec36d5a215f7e Author: Masami Hiramatsu Date: Mon Nov 30 19:20:05 2009 -0500 perf probe: Add argv_split() from lib/argv_split.c Add argv_split() ported from lib/argv_split.c to string.c and use it in util/probe-event.c. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091201002005.10235.55602.stgit@harusame> Signed-off-by: Ingo Molnar commit 50656eec82684d03add0f4f4b4875a20bd8f9755 Author: Masami Hiramatsu Date: Mon Nov 30 19:19:58 2009 -0500 perf probe: Move probe event utility functions to probe-event.c Split probe event (kprobe-events and perf probe events) utility functions from builtin-probe.c to probe-event.c. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091201001958.10235.90243.stgit@harusame> Signed-off-by: Ingo Molnar commit 934b1f5fd0c9a2ddde5a4487695c126243d9a42b Author: Masami Hiramatsu Date: Mon Nov 30 19:19:51 2009 -0500 perf probe: Fix probe array index for multiple probe points Fix the index of formatted probe array for multiple probe points, which should be probes[i] instead of probes[0]. 
Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091201001950.10235.54781.stgit@harusame> Signed-off-by: Ingo Molnar commit 74ca4c0ece52a2d19dae1bcbfc24fcfc5facfeb4 Author: Masami Hiramatsu Date: Mon Nov 30 19:19:43 2009 -0500 perf probe: Fix argv array size in probe parser Since the syntax has been changed, probe definition needs parameters less than MAX_PROBE_ARGS + 1 (probe-point + arguments). Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091201001943.10235.80367.stgit@harusame> Signed-off-by: Ingo Molnar commit 57d250df7deb3e1742fbf3cc3230119731109552 Author: Masami Hiramatsu Date: Mon Nov 30 19:19:34 2009 -0500 perf probe: Add probe-finder.h without libdwarf Add probe-finder.h as LIB_H without libdwarf, because that header is included even if no libdwarf. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. 
Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091201001934.10235.44656.stgit@harusame> Signed-off-by: Ingo Molnar commit f41b1e43c41e99c39a2222578a7806032c045c79 Author: Masami Hiramatsu Date: Mon Nov 30 19:19:27 2009 -0500 perf probe: Change a debugging message from pr_info to pr_debug Change annoying debug-info using notice from pr_info() to pr_debug(), since the message always printed when user adds a probe point which requires debug-info. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091201001927.10235.63645.stgit@harusame> Signed-off-by: Ingo Molnar commit ba8665d7dd95eb6093ee06f8f624b6acb1e73206 Author: Masami Hiramatsu Date: Mon Nov 30 19:19:20 2009 -0500 trace_kprobes: Fix a memory leak bug and check kstrdup() return value Fix a memory leak case in create_trace_probe(). When an argument is too long (> MAX_ARGSTR_LEN), it just jumps to error path. In that case tp->args[i].name is not released. This also fixes a bug to check kstrdup()'s return value. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. 
Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091201001919.10235.56455.stgit@harusame> Signed-off-by: Ingo Molnar commit 5cbd08056142dcb2aea0dca7261afcb810a63c55 Author: Li Zefan Date: Tue Dec 1 14:05:16 2009 +0800 perf timechart: Remove open-coded event parsing code Convert builtin-timechart.c to mmap_dispatch_perf_file() + perf_file_handler. Signed-off-by: Li Zefan Acked-by: Arjan van de Ven Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Peter Zijlstra LKML-Reference: <4B14B21C.2040406@cn.fujitsu.com> [ v2: cleaned up the printout, fixed a whitespace detail ] Signed-off-by: Ingo Molnar commit bab81b624e970f1138535a465ad2b26b6bb0dd6c Author: Li Zefan Date: Tue Dec 1 14:04:49 2009 +0800 perf annotate: Fix perf data parsing perf-annotate doesn't parse perf.data correctly in that it doesn't read perf header. Fix this by using mmap_dispatch_perf_file(). Before: TOTAL events: 17565 MMAP events: 3221 LOST events: 10 COMM events: 235 EXIT events: 2 THROTTLE events: 1 UNTHROTTLE events: 2 FORK events: 10 READ events: 1 SAMPLE events: 14083 After: TOTAL events: 17290 MMAP events: 3203 LOST events: 0 COMM events: 234 EXIT events: 1 THROTTLE events: 0 UNTHROTTLE events: 0 FORK events: 0 READ events: 0 SAMPLE events: 13852 Signed-off-by: Li Zefan Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Peter Zijlstra Cc: Arjan van de Ven LKML-Reference: <4B14B201.9030708@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 9eaa192d8988d621217a9e6071cd403fd6010496 Author: Helight.Xu Date: Mon Nov 30 18:33:51 2009 +0800 x86: Fix a section mismatch in arch/x86/kernel/setup.c copy_edd() should be __init. warning msg: WARNING: vmlinux.o(.text+0x7759): Section mismatch in reference from the function copy_edd() to the variable .init.data:boot_params The function copy_edd() references the variable __initdata boot_params. 
This is often because copy_edd lacks a __initdata annotation or the annotation of boot_params is wrong. Signed-off-by: ZhenwenXu LKML-Reference: <4B139F8F.4000907@gmail.com> Signed-off-by: H. Peter Anvin commit 8ea339adc0a48236008e59dd21564d71c37b331c Author: Tom Zanussi Date: Mon Nov 30 01:18:49 2009 -0600 perf trace/scripting: Add Fedora libperl install note to doc Fedora needs perl-ExtUtils-Embed for Perl scripting, which also brings along libperl-devel; note this info for the convenience of Fedora users. Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259565529-6407-5-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 61381de0504181368672a83d2e14c38dbaf3c136 Author: Tom Zanussi Date: Mon Nov 30 01:18:48 2009 -0600 perf trace/scripting: Fix Perl common_* access functions The common_* functions (e.g. common_pc(), etc) are exported as common_* but named get_common_*, resulting in unresolved subroutine errors when executing scripts. Make the internal and external names match. Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259565529-6407-4-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit e136323c5a8a7d91d17c5b7b340758bb9dd33739 Author: Tom Zanussi Date: Mon Nov 30 01:18:47 2009 -0600 perf trace/scripting: Ignore shadowed variable warning for perf-trace-perl.c The debugging versions of the ENTER and LEAVE internal perl macros, used when embedding perl, define a local block with a my_perl perl variable that shadows a global variable of the same name, which is also the name expected by the embedding API for the embedded interpreter. Since we don't have control over the code generated in this case (it's an externality) and can't get rid of the warning, ignore it. 
Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259565529-6407-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit f8be4231f82ab56a87ce74906671afbe1aa9ec75 Author: Tom Zanussi Date: Mon Nov 30 01:18:46 2009 -0600 perf trace/scripting: Silence PERL_EMBED_* backtick errors The backtick shell substitutions for PERL_EMBED_LDOPT/CCOPT make a lot of noise on stderr if Embed.pm isn't installed - this silences them. Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259565529-6407-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 5fa9167a1bf5f5a4b7282f5e7ac56a4a5a1fa044 Author: Dominik Brodowski Date: Sun Nov 8 17:24:46 2009 +0100 pcmcia: rework the irq_req_t typedef Most of the irq_req_t typedef'd struct can be re-worked quite easily: (1) IRQInfo2 was unused in any case, so drop it. (2) IRQInfo1 was used write-only, so drop it. (3) Instance (private data to be passed to the IRQ handler): Most PCMCIA drivers using pcmcia_request_irq() to actually register an IRQ handler set the "dev_id" to the same pointer as the "priv" pointer in struct pcmcia_device. Modify the two exceptions (ipwireless, ibmtr_cs) to also work this way, and set the IRQ handler's "dev_id" to p_dev->priv unconditionally. (4) Handler is to be of type irq_handler_t. (5) Handler != NULL already tells whether an IRQ handler is present. Therefore, we do not need the IRQ_HANDLER_PRESENT flag in irq_req_t.Attributes.
CC: netdev@vger.kernel.org CC: linux-bluetooth@vger.kernel.org CC: linux-ide@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-scsi@vger.kernel.org CC: alsa-devel@alsa-project.org CC: Jaroslav Kysela CC: Jiri Kosina CC: Karsten Keil for the Bluetooth parts: Acked-by: Marcel Holtmann Signed-off-by: Dominik Brodowski commit dd2e5a156525f11754d9b1e0583f6bb49c253d62 Author: Dominik Brodowski Date: Tue Nov 3 10:27:34 2009 +0100 pcmcia: remove deprecated handle_to_dev() macro Update remaining users and remove deprecated handle_to_dev() macro CC: Harald Welte CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-serial@vger.kernel.org Signed-off-by: Dominik Brodowski commit 6838b03fc6564ea07d0cd87ea6e198d90ab1fc3e Author: Dominik Brodowski Date: Tue Nov 3 01:31:52 2009 +0100 pcmcia: pcmcia_request_window() doesn't need a pointer to a pointer pcmcia_request_window() only needs a pointer to struct pcmcia_device, not a pointer to a pointer. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-scsi@vger.kernel.org CC: Jiri Kosina Acked-by: Karsten Keil (for ISDN) Signed-off-by: Dominik Brodowski commit 82f88e36004162f49a9340ffbbaebe89016e4835 Author: Dominik Brodowski Date: Tue Nov 3 01:16:12 2009 +0100 pcmcia: remove unused "window_t" typedef Signed-off-by: Dominik Brodowski commit d7b0364bfc71c4abc97dfc47f85bb32363266e4e Author: Dominik Brodowski Date: Tue Nov 3 01:05:33 2009 +0100 pcmcia: move some window-related code to pcmcia_ioctl.c pcmcia_get_window() and pcmcia_get_mem_page() were only called from pcmcia_ioctl.c. Therefore, move these functions to that file, and remove the useless EXPORTs. Signed-off-by: Dominik Brodowski commit 0bdf9b3dd3cfa5cbd5d55172c19f5dd166208e17 Author: Magnus Damm Date: Wed Dec 13 19:46:53 2006 +0900 pcmcia: Change window_handle_t logic to unsigned long Logic changes based on top of the other patches: This set of patches changed window_handle_t from being a pointer to an unsigned long. 
The unsigned long is now a simple index into socket->win[]. Going from a pointer to unsigned long should leave the user space interface unchanged unless I'm mistaken. This change results in code that is less error prone and a user space interface which is much cleaner and safer. A nice side effect is that we are also able to remove all members except one from window_t. [ linux@dominikbrodowski.net: Update to 2.6.31. Also, a plain "index" to socket->win[] does not work, as several codepaths rely on "window_handle_t" being non-zero if used. Therefore, set the window_handle_t to the socket->win[] index + 1. ] CC: netdev@vger.kernel.org Signed-off-by: Magnus Damm Signed-off-by: Dominik Brodowski commit 16456ebabfec3f8f509fc18b45f256d066a1b360 Author: Magnus Damm Date: Wed Dec 13 19:46:48 2006 +0900 pcmcia: Pass struct pcmcia_socket to pcmcia_get_mem_page() No logic changes, just pass struct pcmcia_socket to pcmcia_get_mem_page() [linux@dominikbrodowski.net: update to 2.6.31] Signed-off-by: Magnus Damm Signed-off-by: Dominik Brodowski commit 868575d1e87ff2091800aea816972ddb46de60d5 Author: Magnus Damm Date: Wed Dec 13 19:46:43 2006 +0900 pcmcia: Pass struct pcmcia_device to pcmcia_map_mem_page() No logic changes, just pass struct pcmcia_device to pcmcia_map_mem_page() [linux@dominikbrodowski.net: update to 2.6.31] CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-scsi@vger.kernel.org CC: Jiri Kosina Acked-by: Karsten Keil (for ISDN) Signed-off-by: Magnus Damm Signed-off-by: Dominik Brodowski commit f5560da549ea2e32dd41e36548c0e7dee3d4aabb Author: Magnus Damm Date: Wed Dec 13 19:46:38 2006 +0900 pcmcia: Pass struct pcmcia_device to pcmcia_release_window() No logic changes, just pass struct pcmcia_device to pcmcia_release_window().
[linux@dominikbrodowski.net: update to 2.6.31] CC: netdev@vger.kernel.org CC: Jiri Kosina Signed-off-by: Magnus Damm Signed-off-by: Dominik Brodowski commit cf72344d1ad7b33805ef8d65e758b267e6f4cb8d Author: Ingo Molnar Date: Sat Nov 28 10:11:00 2009 +0100 perf scripting: Fix build Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: Signed-off-by: Ingo Molnar commit 1ae4a971250c55e473ca53c78011fcf73809885d Author: Tom Zanussi Date: Wed Nov 25 01:15:52 2009 -0600 perf trace: Add a scripts/perl/bin for perf trace shell scripts To capture the relevant events for a given Perl script and to avoid having to continually remember and type in long command-lines, add a scripts/perl/bin directory containing two simple shell scripts for each Perl script, one for recording and one for processing/display. For example, to record perf data for the rw-by-pid.pl script, run scripts/perl/bin/rw-by-pid-record and to actually run the script and display the output run scripts/perl/bin/rw-by-pid-report. Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259133352-23685-8-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 89fbf0b8a021cbf60abeacfb6b538e97c83afada Author: Tom Zanussi Date: Wed Nov 25 01:15:51 2009 -0600 perf trace: Add Documentation for perf trace Perl support Adds perf-trace-perl Documentation and a link to it from the perf-trace page. 
Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259133352-23685-7-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit d1b93772be78486397693fc39d3ddea3fda90105 Author: Tom Zanussi Date: Wed Nov 25 01:15:50 2009 -0600 perf trace: Add interface to access perf data from Perl handlers The Perl scripting support for perf trace allows most of a trace event's data to be accessed directly as handler arguments, but not all of it e.g. the less common fields aren't passed in. To give scripts access to the other fields and/or any other data or metadata in the main perf executable that might be useful, a way to access the C data in perf from Perl is needed; this patch uses the Perl XS facility to do it for the common_xxx event fields not passed to handler functions. Context.pm exports three functions to Perl scripts that access fields for the current event by calling back into perf: common_pc(), common_flags() and common_lock_depth(). Support for common_flags() field values was added to Core.pm and a script used to sanity check these and other basic scripting features, check-perf-trace.pl, was also added. Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259133352-23685-6-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit bcefe12eff5dca6fdfa94ed85e5bee66380d5cd9 Author: Tom Zanussi Date: Wed Nov 25 01:15:49 2009 -0600 perf trace: Add perf trace scripting support modules for Perl Add Perf-Trace-Util Perl module and some scripts that use it. Core.pm contains Perl code to define and access flag and symbolic fields. Util.pm contains general-purpose utility functions. Also adds some makefile bits to install them in libexec/perf-core/scripts/perl (or wherever perfexec_instdir points). 
Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259133352-23685-5-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 16c632de64a74644a46e7636db26b2cfb530ca13 Author: Tom Zanussi Date: Wed Nov 25 01:15:48 2009 -0600 perf trace: Add Perl scripting support Implement trace_scripting_ops to make Perl a supported perf trace scripting language. Additionally adds code that allows Perl trace scripts to access the 'flag' and 'symbolic' (__print_flags(), __print_symbolic()) field information parsed from the trace format files. Also adds the Perl implementation of the generate_script() trace_scripting_op, which creates a ready-to-run perf trace Perl script based on existing trace data. Scripts generated by this implementation print out all the fields for each event mentioned in perf.data (and will detect and generate the proper scripting code for 'flag' and 'symbolic' fields), and will additionally generate handlers for the special 'trace_unhandled', 'trace_begin' and 'trace_end' handlers. Script authors can simply remove the printing code to implement their own custom event handling. Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259133352-23685-4-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit eb9a42caa7a926beb935a22bc59d981b35f0b652 Author: Tom Zanussi Date: Wed Nov 25 01:15:47 2009 -0600 perf trace: Add flag/symbolic format_flags It's useful to know whether a field is a flag or symbolic field for e.g. when generating scripts - it allows us to translate those fields specially rather than literally as plain numeric values. 
    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference: <1259133352-23685-3-git-send-email-tzanussi@gmail.com>
    Signed-off-by: Ingo Molnar

commit 956ffd027bedc4106b901eb6a50f0a6c6de4113d
Author: Tom Zanussi
Date:   Wed Nov 25 01:15:46 2009 -0600

    perf trace: Add scripting ops

    Adds an interface, scripting_ops, that when implemented for a
    particular scripting language enables built-in support for trace
    stream processing using that language. The interface is designed
    to enable full-fledged language interpreters to be embedded inside
    the perf executable and thereby make the full capabilities of the
    supported languages available for trace processing. See below for
    details on the interface.

    This patch also adds a couple of command-line options to
    'perf trace':

    The -s option is used to specify the script to be run. Script
    names that can be used with -s take the form:

        [language spec:]scriptname[.ext]

    Scripting languages register a set of 'language specs' that can be
    used to specify scripts for the registered languages. The specs
    can be used either as prefixes or extensions. If [language spec:]
    is used, the script is taken as a script of the matching language
    regardless of any extension it might have. If [language spec:] is
    not used, [.ext] is used to look up the language it corresponds
    to. Language specs are case insensitive.

    e.g. Perl scripts can be specified in the following ways:

        Perl:scriptname
        pl:scriptname.py # extension ignored
        PL:scriptname
        scriptname.pl
        scriptname.perl

    The -g [language spec] option gives users an easy starting point
    for writing scripts in the specified language. Scripting support
    for a particular language can implement a generate_script()
    scripting op that outputs an empty (or near-empty) set of handlers
    for all the events contained in a given perf.data trace file -
    this option gives users a direct way to access that.
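The lookup rule for -s arguments described above can be modelled in a few lines. This is only a sketch in Python, not perf's C implementation: the SPECS table and resolve_script() are hypothetical names, and the two Perl specs are inferred from the examples in the commit message (in perf itself the specs are registered through script_spec_register()).

```python
from typing import Optional, Tuple

# Hypothetical registry: lowercase language spec -> language name.
# The two Perl entries are an assumption based on the examples above.
SPECS = {"perl": "Perl", "pl": "Perl"}


def resolve_script(name: str) -> Tuple[Optional[str], str]:
    """Return (language or None, script name) for a -s argument."""
    prefix, sep, rest = name.partition(":")
    if sep:
        # An explicit "spec:" prefix decides the language; any
        # extension on the script name is ignored.
        return SPECS.get(prefix.lower()), rest
    # No prefix: fall back to the extension, if there is one.
    _, dot, ext = name.rpartition(".")
    if dot:
        return SPECS.get(ext.lower()), name
    return None, name
```

Lowercasing before the lookup is what makes the specs case insensitive. The sketch does not try to handle script paths that themselves contain a colon.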

    Adding support for a scripting language
    ---------------------------------------

    The main thing that needs to be done to add support for a new
    language is to implement the scripting_ops interface, which
    consists of the following four functions:

        start_script()
        stop_script()
        process_event()
        generate_script()

    start_script() is called before any events are processed, and is
    meant to give the scripting language support an opportunity to set
    things up to receive events e.g. create and initialize an instance
    of a language interpreter.

    stop_script() is called after all events are processed, and is
    meant to give the scripting language support an opportunity to
    clean up e.g. destroy the interpreter instance, etc.

    process_event() is called once for each event and takes as its
    main parameter a pointer to the binary trace event record to be
    processed. The implementation is responsible for picking out the
    binary fields from the event record and sending them to the script
    handler function associated with that event, e.g. a function
    derived from the name of the event it's meant to handle, such as
    'sched::sched_switch()'. The 'format' information for trace events
    can be used to parse the binary data and map it into a form usable
    by a given scripting language; see the Perl implementation in
    subsequent patches for one possible way to leverage the existing
    trace format parsing code in perf and map that info into specific
    scripting language types.

    generate_script() should generate a ready-to-run script for the
    current set of events in the trace, preferably with bodies that
    print out every field for each event. Again, look at the Perl
    implementation for clues as to how that can be done. This is an
    optional, but very useful op.

    Support for a given language should also add a language-specific
    setup function and call it from setup_scripting().
    The language-specific setup function associates the scripting ops
    for that language with one or more 'language specifiers' (see
    above) using script_spec_register(). When a script name is
    specified on the command line, the scripting ops associated with
    the specified language are used to instantiate and use the
    appropriate interpreter to process the trace stream.

    In general, it should be relatively easy to add support for a new
    language, especially if the language implementation supports an
    interface allowing an interpreter to be 'embedded' inside another
    program (in this case the containing program will be 'perf
    trace'). If so, it should be relatively straightforward to
    translate trace events into invocations of user-defined script
    functions where e.g. the function name corresponds to the event
    type and the function parameters correspond to the event fields.
    The event and field type information exported by the event tracing
    infrastructure (via the event 'format' files) should be enough to
    parse and send any piece of trace data to the user script. The
    easiest way to see how this can be done would be to look at the
    Perl implementation contained in perf/util/trace-event-perl.c/.h.

    There are a couple of other things that aren't covered by the
    scripting_ops or setup interface and are technically optional, but
    should be implemented if possible. One of these is support for
    'flag' and 'symbolic' fields e.g. being able to use more
    human-readable values such as 'GFP_KERNEL' or
    HI/BLOCK_IOPOLL/TASKLET in place of raw flag values. See the Perl
    implementation to see how this can be done. The other thing is
    support for 'calling back' into the perf executable to access e.g.
    uncommon fields not passed by default into handler functions, or
    any metadata the implementation might want to make available to
    users via the language interface. Again, see the Perl
    implementation for examples.
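The life cycle of the four ops described above can be sketched as a toy model. This is Python rather than the C struct of function pointers that perf uses; ToyScriptingOps, run_trace() and the already-decoded field dictionaries are all assumptions made for illustration (the real process_event() receives a pointer to a binary event record and decodes it using the 'format' information).

```python
from typing import Callable, Dict, Iterable, List, Tuple


class ToyScriptingOps:
    """Illustrative stand-in for one language's scripting_ops."""

    def __init__(self, handlers: Dict[str, Callable[..., None]]) -> None:
        self.handlers = handlers      # event name -> user handler
        self.log: List[str] = []      # records the life-cycle calls

    def start_script(self, script: str) -> None:
        # Real implementations would create an interpreter here.
        self.log.append(f"start {script}")

    def stop_script(self) -> None:
        # Real implementations would destroy the interpreter here.
        self.log.append("stop")

    def process_event(self, name: str, fields: Dict[str, object]) -> None:
        # Event fields become the handler's parameters.
        handler = self.handlers.get(name)
        if handler is not None:
            handler(**fields)

    def generate_script(self, formats: Dict[str, List[str]]) -> str:
        # Emit one stub per event that prints every field.
        stubs = []
        for name, field_names in formats.items():
            args = ", ".join(field_names)
            stubs.append(f"def {name}({args}):\n    print({args})\n")
        return "\n".join(stubs)


def run_trace(ops: ToyScriptingOps, script: str,
              events: Iterable[Tuple[str, Dict[str, object]]]) -> None:
    """Drive the ops in the order the commit message describes."""
    ops.start_script(script)
    for name, fields in events:
        ops.process_event(name, fields)
    ops.stop_script()
```

Events without a matching handler are silently skipped in this sketch; how an implementation treats them is its own choice.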
Signed-off-by: Tom Zanussi Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: anton@samba.org Cc: hch@infradead.org LKML-Reference: <1259133352-23685-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 1ed091c45ae33b2179d387573c3fe3f3b4adf60a Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:23 2009 -0200 perf tools: Consolidate symbol resolving across all tools Now we have a very high level routine for simple tools to process IP sample events: int event__preprocess_sample(const event_t *self, struct addr_location *al, symbol_filter_t filter) It receives the event itself and will insert new threads in the global threads list and resolve the map and symbol, filling all this info into the new addr_location struct, so that tools like annotate and report can further process the event by creating hist_entries in their specific way (with or without callgraphs, etc). It in turn uses the new next layer function: void thread__find_addr_location(struct thread *self, u8 cpumode, enum map_type type, u64 addr, struct addr_location *al, symbol_filter_t filter) This one, given a thread (userspace or the kernel kthread one), will find the given type (MAP__FUNCTION now, MAP__VARIABLE too in the near future) at the given cpumode, taking vdsos into account (userspace hit, but kernel symbol) and will fill all these details in the addr_location given. Tools that need a more compact API for plain function resolution, like 'kmem', can use this other one: struct symbol *thread__find_function(struct thread *self, u64 addr, symbol_filter_t filter) So, to resolve a kernel symbol, which is all the 'kmem' tool needs, it's just a matter of calling: sym = thread__find_function(kthread, addr, NULL); The 'filter' parameter is needed because we do lazy parsing/loading of ELF symtabs or /proc/kallsyms. With this we remove more code duplication all around, which is always good, huh?
:-) Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: John Kacur Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-12-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 62daacb51a2bf8480e6f6b3696b03f102fc15eb0 Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:22 2009 -0200 perf tools: Reorganize event processing routines, lotsa dups killed While implementing event__preprocess_sample, that will do all of the symbol lookup in one convenient function, I noticed that util/process_event.[ch] were not being used at all, then started looking if there were other functions that could be shared and... All those functions really don't need to receive offset + head, the only thing they did was common to all of them, so do it at one place instead. Stats about number of each type of event processed now is done in a central place. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: John Kacur Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-11-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 1de8e24520ffdcf2a90c842eed937f59079a2abd Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:21 2009 -0200 perf symbols: When not using modules, discard its symbols Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-10-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 95011c600740837288a3b34b411244a4d9157c4e Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:20 2009 -0200 perf symbols: Support multiple symtabs in struct thread Making the routines that were so far specific to the kernel maps useful for all threads. This is done by making the kernel maps be contained in a kernel "thread". 
This gets the kernel specific routines closer to the userspace counterparts, which will help in reducing the boilerplate for resolving a symbol, as will be demonstrated in the next patches. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-9-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 23ea4a3fadc6b1692dec935397ea15e2affc1cba Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:19 2009 -0200 perf symbols: Kernel_maps should be an array of MAP__NR_TYPES entries So that we can support multiple symbol table types. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-8-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 4e06255f5cf2acf6a5abfe7df8c9690463259dea Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:18 2009 -0200 perf symbols: Make the kallsyms loading routines part of the dso class So that the kallsyms loading routines are the direct counterpart of the vmlinux loading ones, i.e. dso__load_kallsyms is the counterpart of dso__load_vmlinux. In the process make them also use the symbols rb tree indexed by map->type, paving the way for supporting other types of symtabs, such as the next one to be supported: variables. This also allowed removal of yet another global variable: kernel_map__functions. 
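The idea of indexing a DSO's symbol tables by map type, so that a single lookup routine serves every kind of symtab, can be modelled as follows. This is a Python sketch, not perf's code: Dso, Map, MapType, add_symbol() and find_symbol() are illustrative names, the enum members stand in for MAP__FUNCTION and MAP__VARIABLE, and a sorted list with bisect replaces the rb tree.

```python
import bisect
from enum import IntEnum
from typing import List, Optional, Tuple


class MapType(IntEnum):
    FUNCTION = 0   # stands in for MAP__FUNCTION
    VARIABLE = 1   # stands in for MAP__VARIABLE


Symbol = Tuple[int, float, str]   # (start, end, name), end exclusive


class Dso:
    def __init__(self) -> None:
        # One sorted symbol table per map type.
        self.symbols: List[List[Symbol]] = [[] for _ in MapType]

    def add_symbol(self, type_: MapType, start: int, end: int,
                   name: str) -> None:
        bisect.insort(self.symbols[type_], (start, end, name))


class Map:
    def __init__(self, dso: Dso, type_: MapType) -> None:
        self.dso = dso
        self.type = type_

    def find_symbol(self, addr: int) -> Optional[str]:
        # The map's own type selects the table, so one lookup method
        # covers functions and variables alike.
        table = self.dso.symbols[self.type]
        i = bisect.bisect_right(table, (addr, float("inf"), "")) - 1
        if i >= 0:
            start, end, name = table[i]
            if start <= addr < end:
                return name
        return None
```

Two maps over the same Dso can then resolve the same address to different symbols, depending only on their type.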
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-7-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 6a4694a433a218c729d336b348a01bfc720da095 Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:17 2009 -0200 perf symbols: Better support for multiple symbol tables per dso By using an array of rb_roots in struct dso we can more easily get the right symbol rb_tree from a struct map instance. This way we can have just one symbol lookup method for struct map instances, map__find_symbol, instead of one per symtab type (functions, variables). Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-6-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 3610583c29563e23dd038d2870f59c88438bf7a3 Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:16 2009 -0200 perf symbols: Add a 'type' field to struct map That way we will be able to check if the right symtab is loaded in the underlying DSO. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-5-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 605ca4ba017455d39ac6991c58eb1e80fb8af48d Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:15 2009 -0200 perf symbols: Unexport kernel_map__functions perf annotate was the only user, and it doesn't really need it.
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-4-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit b0da954a4759ac19fb80a959e53b613fe376bc12 Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:14 2009 -0200 perf symbols: Split the dsos list into kernel and user parts We don't need to look at modules in dsos__findnew because the kernel events come only with user DSOs. Also we need a way to list just the module DSOs so that we can create multiple sets of maps, now that we will support maps for the variables in a symtab. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 61f37a824d6782503ff66bf653f2e07902b641a1 Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:13 2009 -0200 perf symbols: Rename kernel_mapto kernel_map[s]__functions As we'll have kernel_map[s]__variables too. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 3f5ee186f615a720fe78eb33662ae4da57a1eee3 Author: Arnaldo Carvalho de Melo Date: Fri Nov 27 16:29:12 2009 -0200 perf symbols: Avoid annoying message about loading symbols This should be properly fixed when we remove the XXX comment in 'perf report', function resolve_symbol. 
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259346563-12568-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit efd44318157009274fa5962d60167ecfb954e246 Merge: 492667d 4f65ae3 Author: Joerg Roedel Date: Fri Nov 27 14:27:30 2009 +0100 Merge branch 'gart/fixes' into amd-iommu/2.6.33 commit 492667dacc0ac9763969155482b1261b34ccf450 Author: Joerg Roedel Date: Fri Nov 27 13:25:47 2009 +0100 x86/amd-iommu: Remove amd_iommu_pd_table The data that was stored in this table is now available in dev->archdata.iommu. So this table is no longer necessary. This patch removes the remaining uses of that variable and removes it from the code. Signed-off-by: Joerg Roedel commit 8eed9833346781dd15e3bef35a91b0a40787ea3c Author: Joerg Roedel Date: Thu Nov 26 15:45:41 2009 +0100 x86/amd-iommu: Move reset_iommu_command_buffer out of locked code This patch removes the ugly construct where the iommu->lock must be released before calling the reset_iommu_command_buffer function. Signed-off-by: Joerg Roedel commit b00d3bcff4d996f65e337d404b0df5dc201a01ab Author: Joerg Roedel Date: Thu Nov 26 15:35:33 2009 +0100 x86/amd-iommu: Cleanup DTE flushing code This patch cleans up the code to flush device table entries in the IOMMU. With this change the driver can get rid of the iommu_queue_inv_dev_entry() function. Signed-off-by: Joerg Roedel commit 3fa43655d81d471d47c44b0db4e2be1f8af32207 Author: Joerg Roedel Date: Thu Nov 26 15:04:38 2009 +0100 x86/amd-iommu: Introduce iommu_flush_device() function This patch adds a function to flush a DTE entry for a given struct device and replaces iommu_queue_inv_dev_entry calls with this function where appropriate. 
Signed-off-by: Joerg Roedel commit 7f760ddd702d162d693bc79f62c3bdd7fe55bd9d Author: Joerg Roedel Date: Thu Nov 26 14:49:59 2009 +0100 x86/amd-iommu: Cleanup attach/detach_device code This patch cleans up the attach_device and detach_device paths and fixes reference counting while at it. Signed-off-by: Joerg Roedel commit 7c392cbe984d904f7c89a6a75b2ac245254e8da5 Author: Joerg Roedel Date: Thu Nov 26 11:13:32 2009 +0100 x86/amd-iommu: Keep devices per domain in a list This patch introduces a list to each protection domain which keeps all devices associated with the domain. This can be used later to optimize certain functions and to completely remove the amd_iommu_pd_table. Signed-off-by: Joerg Roedel commit 241000556f751dacd332df6ab2e903a23746e51e Author: Joerg Roedel Date: Wed Nov 25 15:59:57 2009 +0100 x86/amd-iommu: Add device bind reference counting This patch adds a reference count to each device to count how often the device was bound to that domain. This is important for single devices that act as an alias for a number of others. These devices must stay bound to their domains until all devices that alias to it are unbound from the same domain. Signed-off-by: Joerg Roedel commit 657cbb6b6cba0f9c98c5299e0c803b2c0e67ea0a Author: Joerg Roedel Date: Mon Nov 23 15:26:46 2009 +0100 x86/amd-iommu: Use dev->arch->iommu to store iommu related information This patch changes IOMMU code to use dev->archdata->iommu to store information about the alias device and the domain the device is attached to. This allows the driver to get rid of the amd_iommu_pd_table in the future. Signed-off-by: Joerg Roedel commit 8793abeb783c12cc37f92f6133fd6468152b98df Author: Joerg Roedel Date: Fri Nov 27 11:40:33 2009 +0100 x86/amd-iommu: Remove support for domain sharing This patch makes device isolation mandatory and removes support for the amd_iommu=share option. This simplifies the code in several places. 
Signed-off-by: Joerg Roedel commit 171e7b3739e175eea7b32eca9dbe189589e14a28 Author: Joerg Roedel Date: Tue Nov 24 17:47:56 2009 +0100 x86/amd-iommu: Rearrange dma_ops related functions This patch rearranges two dma_ops related functions so that their forward declarations are no longer necessary. Signed-off-by: Joerg Roedel commit 308973d3b958b9328a1051642c81ee6dbc5021a4 Author: Joerg Roedel Date: Tue Nov 24 17:43:32 2009 +0100 x86/amd-iommu: Move some pte allocation functions in the right section This patch moves alloc_pte() and fetch_pte() into the page table handling code section so that the forward declarations for them could be removed. Signed-off-by: Joerg Roedel commit 87a64d523825351a23743e69949c2a8c2077cecf Author: Joerg Roedel Date: Tue Nov 24 17:26:43 2009 +0100 x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc This function doesn't use the parameter anymore so it can be removed. Signed-off-by: Joerg Roedel commit 98fc5a693bbdda498a556654c70d1e31a186c988 Author: Joerg Roedel Date: Tue Nov 24 17:19:23 2009 +0100 x86/amd-iommu: Use get_device_id and check_device where appropriate The logic of these two functions is reimplemented (at least in parts) in places in the code. This patch removes these code duplications and uses the functions instead. As a side effect it moves check_device() to the helper function code section. Signed-off-by: Joerg Roedel commit 71c70984e5afc20d304fbb523f1c8bb42c4ceb36 Author: Joerg Roedel Date: Tue Nov 24 16:43:06 2009 +0100 x86/amd-iommu: Move find_protection_domain to helper functions This is a helper function and when it's placed in the helper function section we can remove its forward declaration. Signed-off-by: Joerg Roedel commit 94f6d190eeed91cb2bb901aa7816edd1e2405347 Author: Joerg Roedel Date: Tue Nov 24 16:40:02 2009 +0100 x86/amd-iommu: Simplify get_device_resources() With the previous changes the get_device_resources function can be simplified even more. 
The only important information for the callers is the protection domain. This patch renames the function to get_domain() and lets it only return the protection domain for a device. Signed-off-by: Joerg Roedel commit 15898bbcb48fc86c2baff156163df0941ecb6a15 Author: Joerg Roedel Date: Tue Nov 24 15:39:42 2009 +0100 x86/amd-iommu: Let domain_for_device handle aliases If there is no domain associated to a device yet and the device has an alias device which already has a domain, the original device needs to have the same domain as the alias device. This patch changes domain_for_device to handle this situation and directly assigns the alias device domain to the device in this situation. Signed-off-by: Joerg Roedel commit f3be07da531ceef1b51295e5becc9bc07670b671 Author: Joerg Roedel Date: Mon Nov 23 19:43:14 2009 +0100 x86/amd-iommu: Remove iommu specific handling from dma_ops path This patch finishes the removal of all iommu specific handling code in the dma_ops path. Signed-off-by: Joerg Roedel commit cd8c82e875c27ee0d8b59fb76bc12aa9db6a70c2 Author: Joerg Roedel Date: Mon Nov 23 19:33:56 2009 +0100 x86/amd-iommu: Remove iommu parameter from __(un)map_single With the prior changes this parameter is no longer required. This patch removes it from the function and all callers. Signed-off-by: Joerg Roedel commit 576175c2503ae9b0f930ee9a6a0abaf7ef8956ad Author: Joerg Roedel Date: Mon Nov 23 19:08:46 2009 +0100 x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs Since the assumption that a dma_ops domain is only bound to one IOMMU was given up we need to make alloc_new_range aware of it. Signed-off-by: Joerg Roedel commit 680525e06ddccda8c51bdddf532cd5b7d950c411 Author: Joerg Roedel Date: Mon Nov 23 18:44:42 2009 +0100 x86/amd-iommu: Remove iommu parameter from dma_ops_domain_(un)map The parameter is unused in these functions, so remove it from the parameter list. 
Signed-off-by: Joerg Roedel commit f99c0f1c75f75924a6f19cb40a21ccefc6e8754d Author: Joerg Roedel Date: Mon Nov 23 16:52:56 2009 +0100 x86/amd-iommu: Use check_device in get_device_resources Every call-place of get_device_resources calls check_device before it. So call it from get_device_resources directly and simplify the code. Signed-off-by: Joerg Roedel commit 420aef8a3acfc3e75427107e23d5a9bafd17c477 Author: Joerg Roedel Date: Mon Nov 23 16:14:57 2009 +0100 x86/amd-iommu: Use check_device for amd_iommu_dma_supported The check_device logic needs to include the dma_supported checks to be really sure. Merge the dma_supported logic into check_device and use it to implement dma_supported. Signed-off-by: Joerg Roedel commit 318afd41d2eca3224de3fd85a3b9a27a3010a98d Author: Joerg Roedel Date: Mon Nov 23 18:32:38 2009 +0100 x86/amd-iommu: Make np-cache a global flag The non-present cache flag was IOMMU local until now which doesn't make sense. Make this a global flag so we can remove the last user of 'struct iommu' in the map/unmap path. Signed-off-by: Joerg Roedel commit 09b4280439ef6fdc55f1353a9135034336eb5d26 Author: Joerg Roedel Date: Fri Nov 20 17:02:44 2009 +0100 x86/amd-iommu: Reimplement flush_all_domains_on_iommu() This patch reimplements the function flush_all_domains_on_iommu to use the global protection domain list. Signed-off-by: Joerg Roedel commit e3306664eb307ae4cc93211cd9f12d0dbd49de65 Author: Joerg Roedel Date: Fri Nov 20 16:48:58 2009 +0100 x86/amd-iommu: Reimplement amd_iommu_flush_all_domains() This patch reimplements the amd_iommu_flush_all_domains function to use the global protection domain list instead of flushing every domain on every IOMMU. Signed-off-by: Joerg Roedel commit aeb26f55337d4310840c8adc3ec7d6aebb714472 Author: Joerg Roedel Date: Fri Nov 20 16:44:01 2009 +0100 x86/amd-iommu: Implement protection domain list This patch adds code to keep a global list of all protection domains. This allows the resume code to be simplified. 
Signed-off-by: Joerg Roedel commit 601367d76bd19b7eea2286ae99e5b1cb5d74f38d Author: Joerg Roedel Date: Fri Nov 20 16:08:55 2009 +0100 x86/amd-iommu: Remove iommu_flush_domain function The iommu_flush_tlb_pde function does essentially the same. So the iommu_flush_domain function is redundant and can be removed. Signed-off-by: Joerg Roedel commit dcd1e92e405449ecc5e8bd8fcfebf3b2a13d3d37 Author: Joerg Roedel Date: Fri Nov 20 15:30:58 2009 +0100 x86/amd-iommu: Use __iommu_flush_pages for tlb flushes This patch re-implements iommu_flush_tlb functions to use the __iommu_flush_pages logic. Signed-off-by: Joerg Roedel commit 6de8ad9b9ee0ec5b52ec8ec41401833e5e89186f Author: Joerg Roedel Date: Mon Nov 23 18:30:32 2009 +0100 x86/amd-iommu: Make iommu_flush_pages aware of multiple IOMMUs This patch extends the iommu_flush_pages function to flush the TLB entries on all IOMMUs the domain has devices on. This basically gives up the former assumption that dma_ops domains are only bound to one IOMMU in the system. For dma_ops domains this is still true but not for IOMMU-API managed domains. Giving this assumption up for dma_ops domains too allows code simplification. Further it splits out the main logic into a generic function which can be used by iommu_flush_tlb too. Signed-off-by: Joerg Roedel commit 0518a3a4585cb3eeeaf14ca57131f11d252130c6 Author: Joerg Roedel Date: Fri Nov 20 16:00:05 2009 +0100 x86/amd-iommu: Add function to complete a tlb flush This patch adds a function to the AMD IOMMU driver which completes all queued commands on all IOMMUs a specific domain has devices attached on. This is required in a later patch when per-domain flushing is implemented. Signed-off-by: Joerg Roedel commit c459611424d8b8396060eb766e23bd0c70c993bc Author: Joerg Roedel Date: Fri Nov 20 14:57:32 2009 +0100 x86/amd-iommu: Add per IOMMU reference counting This patch adds reference counting for protection domains per IOMMU. This allows a smarter TLB flushing strategy. 
Signed-off-by: Joerg Roedel commit bb52777ec4d736c2d7c4f037b32d4eeeb172ed89 Author: Joerg Roedel Date: Fri Nov 20 14:31:51 2009 +0100 x86/amd-iommu: Add an index field to struct amd_iommu This patch adds an index field to struct amd_iommu which can be used to look it up in an array. This index will be used in struct protection_domain to keep track of which protection domain has devices behind which IOMMU. Signed-off-by: Joerg Roedel commit bf3118c1276d27fe9e84aa42382da25ee0750777 Author: Joerg Roedel Date: Fri Nov 20 13:39:19 2009 +0100 x86/amd-iommu: Update copyright headers This patch updates the copyright headers in the relevant AMD IOMMU driver files to match the date of the latest changes. Signed-off-by: Joerg Roedel commit 6a9401a7ac13e62ef2baf4d46e022d303edc3050 Author: Joerg Roedel Date: Fri Nov 20 13:22:21 2009 +0100 x86/amd-iommu: Separate internal interface definitions This patch moves all function declarations which are only used inside the driver code to a separate header file. Signed-off-by: Joerg Roedel commit 52a11f354970e7301e1d1a029b87535be45abec9 Author: Lai Jiangshan Date: Wed Nov 25 16:33:15 2009 +0800 trace_kprobes: Don't output zero offset "symbol_name+0" is not so friendly. It makes the output longer. Signed-off-by: Lai Jiangshan Acked-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0CEBCB.7080309@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 3d9b2e1ddf42dd3df38af7794fa5e39cce760f3b Author: Lai Jiangshan Date: Wed Nov 25 16:32:47 2009 +0800 trace_kprobes: Always show group name Sometimes the group name is not "kprobes", so it will be better if we can read it from tracing/kprobe_events. 
# echo 'r:laijs/vfs_read vfs_read %ax' > kprobe_events # cat kprobe_events r:laijs/vfs_read vfs_read %ax=%ax Signed-off-by: Lai Jiangshan Acked-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0CEBAF.6000104@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit abab9d37d2a826fcf588c5f30152dbe05c40111c Author: Lai Jiangshan Date: Wed Nov 25 16:32:21 2009 +0800 trace_kprobes: Fix memory leak tp->nr_args is not set before we "goto error", which causes a memory leak because free_trace_probe() uses tp->nr_args to free the memory of args. Signed-off-by: Lai Jiangshan Acked-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0CEB95.2060107@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 0f1ef51d244809f417bdf45cdb00109fb6005672 Author: Lai Jiangshan Date: Thu Nov 26 15:49:33 2009 +0800 trace_syscalls: Add syscall nr field The syscall number field is missing in syscall_enter_define_fields()/ syscall_exit_define_fields(). The syscall number is also needed for the event filter or other users. Signed-off-by: Lai Jiangshan Acked-by: Frederic Weisbecker Cc: Jason Baron Cc: Steven Rostedt LKML-Reference: <4B0E330D.1070206@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit dd1853c3f493f6d22d9e5390b192a07b73d2ac0a Author: Frederic Weisbecker Date: Fri Nov 27 04:55:54 2009 +0100 hw-breakpoints: Use struct perf_event_attr to define kernel breakpoints Kernel breakpoints are created using functions in which we pass breakpoint parameters as individual variables: address, length and type. Although it fits well for x86, this just does not scale across architectures that may support this api later as these may have more or different needs. Pass in a perf_event_attr structure instead because it is meant to evolve as much as possible into a generic hardware breakpoint parameter structure. 
Reported-by: K.Prasad Signed-off-by: Frederic Weisbecker LKML-Reference: <1259294154-5197-2-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 5fa10b28e57f94a90535cfeafe89dcee9f47d540 Author: Frederic Weisbecker Date: Fri Nov 27 04:55:53 2009 +0100 hw-breakpoints: Use struct perf_event_attr to define user breakpoints In-kernel user breakpoints are created using functions in which we pass breakpoint parameters as individual variables: address, length and type. Although it fits well for x86, this just does not scale across architectures that may support this api later as these may have more or different needs. Pass in a perf_event_attr structure instead because it is meant to evolve as much as possible into a generic hardware breakpoint parameter structure. Reported-by: K.Prasad Signed-off-by: Frederic Weisbecker LKML-Reference: <1259294154-5197-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit e5af02261668350b43eb7381648930bde8e872f7 Author: Anton Blanchard Date: Fri Nov 27 13:28:20 2009 +1100 softlockup: Fix hung_task_check_count sysctl I'm seeing spikes of up to 0.5ms in khungtaskd on a large machine. To reduce this source of jitter I tried setting hung_task_check_count to 0: # echo 0 > /proc/sys/kernel/hung_task_check_count which didn't have the intended response. Change to a post increment of max_count, so a value of 0 means check 0 tasks. Signed-off-by: Anton Blanchard Acked-by: Frederic Weisbecker Cc: msb@google.com LKML-Reference: <20091127022820.GU32182@kryten> Signed-off-by: Ingo Molnar commit b2e74a265ded1a185f762ebaab967e9e0d008dd8 Author: Stephane Eranian Date: Thu Nov 26 09:24:30 2009 -0800 perf_events: Fix read() bogus counts when in error state When a pinned group cannot be scheduled it goes into error state. Normally a group cannot go out of error state without being explicitly re-enabled or disabled. 
There was a bug in per-thread mode, whereby upon termination of the thread, the group would transition from error to off leading to bogus counts and timing information returned by read(). Fix it by clearing the error state. Signed-off-by: Stephane Eranian Acked-by: Peter Zijlstra Cc: Paul Mackerras Cc: perfmon2-devel@lists.sourceforge.net LKML-Reference: <4b0eb9ce.0508d00a.573b.ffffeab6@mx.google.com> Signed-off-by: Ingo Molnar commit 4d795fb17a02a87e35782773b88b7a63acfbeaae Author: Ingo Molnar Date: Thu Nov 26 13:11:46 2009 +0100 tracing: Fix kmem event exports Commit 53d0422 ("tracing: Convert some kmem events to DEFINE_EVENT") moved the kmem tracepoint creation from util.c to page_alloc.c, but forgot to move the exports. Move them back. Cc: Li Zefan Cc: Pekka Enberg Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Mel Gorman LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit b7b20df91d43d5e59578b8fc16e895c0c8cbd9b5 Author: Hidetoshi Seto Date: Thu Nov 26 14:49:27 2009 +0900 sched, time: Define nsecs_to_jiffies() Use of msecs_to_jiffies() for nsecs_to_cputime() has some problems: - The type of msecs_to_jiffies()'s argument is unsigned int, so it cannot convert msecs greater than UINT_MAX = about 49.7 days. - msecs_to_jiffies() returns MAX_JIFFY_OFFSET if the MSB of the argument is set, assuming that the input was a negative value. So it cannot convert msecs greater than INT_MAX = about 24.8 days either. This patch defines a new function nsecs_to_jiffies() that can deal with greater values, and that treats all incoming values as unsigned. 
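The two limits quoted above follow from simple arithmetic, sketched below in C. This is not the kernel implementation: the sketch_ names and the SKETCH_ constants are hypothetical, a tick rate of 1000 is assumed, and the conversion is only valid when the tick rate divides the number of nanoseconds per second evenly.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ		1000ULL		/* assumed tick rate */
#define SKETCH_NSEC_PER_SEC	1000000000ULL
#define SKETCH_MSEC_PER_DAY	86400000ULL	/* 24 * 60 * 60 * 1000 */

/* Whole days that fit into a millisecond count of at most max_msecs. */
static uint64_t sketch_msecs_limit_in_days(uint64_t max_msecs)
{
	return max_msecs / SKETCH_MSEC_PER_DAY;
}

/*
 * 64-bit conversion that treats the input as unsigned, so neither the
 * 32-bit range nor the sign bit limits the convertible interval.
 */
static uint64_t sketch_nsecs_to_jiffies(uint64_t nsecs)
{
	/* The simple division below needs an exact nanoseconds-per-tick. */
	assert(SKETCH_NSEC_PER_SEC % SKETCH_HZ == 0);
	return nsecs / (SKETCH_NSEC_PER_SEC / SKETCH_HZ);
}
```

Calling sketch_msecs_limit_in_days() with UINT32_MAX and INT32_MAX yields 49 and 24 whole days, which matches the "about 49.7 days" and "about 24.8 days" figures in the message, while sketch_nsecs_to_jiffies() accepts intervals far beyond either limit.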
Signed-off-by: Hidetoshi Seto Acked-by: Peter Zijlstra Cc: Stanislaw Gruszka Cc: Spencer Candland Cc: Oleg Nesterov Cc: Balbir Singh Cc: Amrico Wang Cc: Thomas Gleixner Cc: John Stultz LKML-Reference: <4B0E16E7.5070307@jp.fujitsu.com> Signed-off-by: Ingo Molnar commit d5b7c78e975302a1bab28263266c39ecb71acad4 Author: Hidetoshi Seto Date: Thu Nov 26 14:49:05 2009 +0900 sched: Remove task_{u,s,g}time() Now all task_{u,s}time() pairs are replaced by task_times(). And task_gtime() is too simple to be an inline function. Clean them all up. Signed-off-by: Hidetoshi Seto Acked-by: Peter Zijlstra Cc: Stanislaw Gruszka Cc: Spencer Candland Cc: Oleg Nesterov Cc: Balbir Singh Cc: Americo Wang LKML-Reference: <4B0E16D1.70902@jp.fujitsu.com> Signed-off-by: Ingo Molnar commit d180c5bccec02612256fd8076ff3c1fac3429553 Author: Hidetoshi Seto Date: Thu Nov 26 14:48:30 2009 +0900 sched: Introduce task_times() to replace task_{u,s}time() pair Functions task_{u,s}time() are called in pairs in almost all cases. However task_stime() is implemented to call task_utime() from its inside, so such paired calls run task_utime() twice. It means we do heavy divisions (div_u64 + do_div) twice to get utime and stime which can be obtained at the same time by one set of divisions. This patch introduces a function task_times(*tsk, *utime, *stime) to retrieve utime and stime at once in a better, optimized way. Signed-off-by: Hidetoshi Seto Acked-by: Peter Zijlstra Cc: Stanislaw Gruszka Cc: Spencer Candland Cc: Oleg Nesterov Cc: Balbir Singh Cc: Americo Wang LKML-Reference: <4B0E16AE.906@jp.fujitsu.com> Signed-off-by: Ingo Molnar commit ba005e1f417295d28cd1563ab82bc33af07fb16a Author: Masami Hiramatsu Date: Tue Nov 24 16:56:58 2009 -0500 tracepoint: Add signal loss events Add signal_overflow_fail and signal_lose_info tracepoints for signal-lost events. 
Changes in v3: - Add docbook style comments Changes in v2: - Use siginfo string macro Suggested-by: Roland McGrath Reviewed-by: Jason Baron Signed-off-by: Masami Hiramatsu Acked-by: Roland McGrath Cc: systemtap Cc: DLE Cc: Oleg Nesterov LKML-Reference: <20091124215658.30449.9934.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit f9d4257e01d266e67420cc99d456b6d4c8464f54 Author: Masami Hiramatsu Date: Tue Nov 24 16:56:51 2009 -0500 tracepoint: Add signal deliver event Add a tracepoint where a process gets a signal. This tracepoint shows signal-number, sa-handler and sa-flag. Changes in v3: - Add docbook style comments Changes in v2: - Add siginfo argument - Fix comment Signed-off-by: Masami Hiramatsu Reviewed-by: Jason Baron Acked-by: Roland McGrath Cc: systemtap Cc: DLE Cc: Oleg Nesterov LKML-Reference: <20091124215651.30449.20926.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit d1eb650ff4130972fa21462fa49cd35a2865403b Author: Masami Hiramatsu Date: Tue Nov 24 16:56:45 2009 -0500 tracepoint: Move signal sending tracepoint to events/signal.h Move signal sending event to events/signal.h. This patch also renames sched_signal_send event to signal_generate. Changes in v4: - Fix a typo of task_struct pointer. Changes in v3: - Add docbook style comments Changes in v2: - Add siginfo argument - Add siginfo storing macro Signed-off-by: Masami Hiramatsu Reviewed-by: Jason Baron Acked-by: Roland McGrath Cc: systemtap Cc: DLE Cc: Oleg Nesterov LKML-Reference: <20091124215645.30449.60208.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit 918bc960dc630b1a79c0d2991a81985812ff69f5 Author: Jack Steiner Date: Wed Nov 25 10:20:19 2009 -0600 x86: SGI UV: Map low MMR ranges Explicitly mmap the UV chipset MMR address ranges used to access blade-local registers. Although these same MMRs are also mmaped at higher addresses, the low range is more convenient when accessing blade-local registers. 
The low range addresses always alias to the local blade regardless of the blade id. Signed-off-by: Jack Steiner LKML-Reference: <20091125162018.GA25445@sgi.com> Signed-off-by: Ingo Molnar commit 16bc67edeb49b531940b2ba6c183780a1b5c472d Merge: f663011 047106a Author: Ingo Molnar Date: Thu Nov 26 10:50:39 2009 +0100 Merge branch 'sched/urgent' into sched/core Merge reason: Pick up fixes that did not make it into .32.0 Signed-off-by: Ingo Molnar commit 8ec6993d9f7d961014af970ded57542961fe9ad9 Author: Brian Gerst Date: Wed Nov 25 11:17:36 2009 -0500 x86, 64-bit: Set data segments to null after switching to 64-bit mode This prevents kernel threads from inheriting non-null segment selectors, and causing optimizations in __switch_to() to be ineffective. Signed-off-by: Brian Gerst Cc: Tim Blechmann Cc: Linus Torvalds Cc: H. Peter Anvin Cc: Jeremy Fitzhardinge Cc: Jan Beulich LKML-Reference: <1259165856-3512-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar commit 80bbf6b641c8843b9d751a1f299aa7ee073ab9d4 Author: Frederic Weisbecker Date: Wed Nov 25 21:20:53 2009 +0100 hw-breakpoints: Fix unused function in off-case bp_perf_event_destroy() is unused in its off-case version, let's remove it to fix the following warning reported by Stephen Rothwell in linux-next: kernel/perf_event.c:4306: warning: 'bp_perf_event_destroy' defined but not used Reported-by: Stephen Rothwell Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra LKML-Reference: <1259180453-5813-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 64b028b22616946a05bf9580f7f7f7ee2ac070b4 Author: Ingo Molnar Date: Thu Nov 26 10:37:55 2009 +0100 x86: Clean up the loadsegment() macro Make it readable in the source too, not just in the assembly output. No change in functionality. 
Cc: Brian Gerst LKML-Reference: <1259176706-5908-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar commit 79b0379cee09b00ef309384aff652e328e438c79 Author: Brian Gerst Date: Wed Nov 25 14:18:26 2009 -0500 x86: Optimize loadsegment() Zero the input register in the exception handler instead of using an extra register to pass in a zero value. Signed-off-by: Brian Gerst LKML-Reference: <1259176706-5908-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar commit 767df1bdd8cbff2c8c40c9ac8295bbdaa5fb24c4 Author: Hidetoshi Seto Date: Thu Nov 26 17:29:02 2009 +0900 x86, mce: Add __cpuinit to hotplug callback functions The mce_disable_cpu() and mce_reenable_cpu() are called only from mce_cpu_callback() which is marked as __cpuinit. So these functions can be __cpuinit too. Signed-off-by: Hidetoshi Seto Cc: Andi Kleen LKML-Reference: <4B0E3C4E.4090809@jp.fujitsu.com> Signed-off-by: Ingo Molnar commit 9b3660a55a9052518c91cc7c62d89e22f3f6f490 Author: Mike Travis Date: Tue Nov 17 18:22:16 2009 -0600 x86: Limit number of per cpu TSC sync messages Limit the number of per cpu TSC sync messages by only printing to the console if an error occurs, otherwise print as a DEBUG message. The info message "Skipping synchronization ..." is only printed after the last cpu has booted. Signed-off-by: Mike Travis Cc: Heiko Carstens Cc: Roland Dreier Cc: Randy Dunlap Cc: Tejun Heo Cc: Andi Kleen Cc: Greg Kroah-Hartman Cc: Yinghai Lu Cc: David Rientjes Cc: Steven Rostedt Cc: Rusty Russell Cc: Hidetoshi Seto Cc: Jack Steiner Cc: Frederic Weisbecker LKML-Reference: <20091118002222.181053000@alcatraz.americas.sgi.com> Signed-off-by: Ingo Molnar commit f6630114d9198aa959ac95c131334c020038f253 Author: Mike Travis Date: Tue Nov 17 18:22:15 2009 -0600 sched: Limit the number of scheduler debug messages Remove the verbose scheduler debug messages unless kernel parameter "sched_debug" set. /proc/sched_debug unchanged. 
Signed-off-by: Mike Travis Cc: Heiko Carstens Cc: Roland Dreier Cc: Randy Dunlap Cc: Tejun Heo Cc: Andi Kleen Cc: Greg Kroah-Hartman Cc: Yinghai Lu Cc: David Rientjes Cc: Steven Rostedt Cc: Rusty Russell Cc: Hidetoshi Seto Cc: Jack Steiner Cc: Frederic Weisbecker LKML-Reference: <20091118002221.489305000@alcatraz.americas.sgi.com> Signed-off-by: Ingo Molnar commit 11e6635763bdc0e24b39a38876574660755acffc Author: Andrew Morton Date: Wed Nov 25 23:01:50 2009 -0800 kernel/hw_breakpoint.c: Fix local/global shadowing If the new percpu tree is combined with the perf events tree the following new warning triggers: kernel/hw_breakpoint.c: In function 'toggle_bp_task_slot': kernel/hw_breakpoint.c:151: warning: 'task_bp_pinned' is used uninitialized in this function Because it's not valid anymore to define a local variable and a percpu variable (even if it's file scope local) with the same name. Rename the local variable to resolve this. Signed-off-by: Andrew Morton Cc: Frederic Weisbecker Cc: K.Prasad Cc: Tejun Heo Cc: Linus Torvalds LKML-Reference: <200911260701.nAQ71owx016356@imap1.linux-foundation.org> [ v2: added changelog ] Signed-off-by: Ingo Molnar commit 2c31b7958fd21df9fa04e5c36cda0f063ac70b27 Author: Frederic Weisbecker Date: Thu Nov 26 06:04:38 2009 +0100 x86/hw-breakpoints: Don't lose GE flag while disabling a breakpoint When we schedule out a breakpoint from the cpu, we also incidentally remove the "Global exact breakpoint" flag from the breakpoint control register. It makes us losing the fine grained precision about the origin of the instructions that may trigger breakpoint exceptions for the other breakpoints running in this cpu. 
Reported-by: Prasad Signed-off-by: Frederic Weisbecker LKML-Reference: <1259211878-6013-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 605bfaee9078cd0b01d83402315389839ee4bb5c Author: Frederic Weisbecker Date: Thu Nov 26 05:35:42 2009 +0100 hw-breakpoints: Simplify error handling in breakpoint creation requests This simplifies the error handling when we create a breakpoint. We don't need to check the NULL return value corner case anymore since we have improved perf_event_create_kernel_counter() to always return an error code in the failure case. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Steven Rostedt Cc: Prasad LKML-Reference: <1259210142-5714-3-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit c6567f642e20bcc79abed030f44be5b0d6da2ded Author: Frederic Weisbecker Date: Thu Nov 26 05:35:41 2009 +0100 hw-breakpoints: Improve in-kernel event creation error granularity In the failure case, perf_event_create_kernel_counter() returns NULL instead of an error, which doesn't help us to inform the user about the origin of the problem from the outermost callers. Often we can just return -EINVAL, which doesn't help anyone when it's eventually about a memory allocation failure. So this patch makes perf_event_create_kernel_counter() always return a detailed error code. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Prasad LKML-Reference: <1259210142-5714-2-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit d99be40aff88722ab03ee295e4f6c13a4cca9a3d Author: Frederic Weisbecker Date: Thu Nov 26 05:35:40 2009 +0100 ksym_tracer: Fix breakpoint removal after modification The error path of a breakpoint modification is broken in the ksym tracer. A modified breakpoint hlist node is immediately released after its removal. Also we leak a breakpoint in this case. Fix the path. 
Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Prasad LKML-Reference: <1259210142-5714-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 470dda7417f284b9cfc96560b2acd98df63798a2 Author: Li Zefan Date: Thu Nov 26 15:08:01 2009 +0800 tracing: Restore original format of sched events The original format for sched_stat_iowait and sched_stat_sleep: $ cat events/sched/sched_stat_iowait/format ... print fmt: "comm=%s pid=%d delay=%Lu [ns]", ... $ cat events/sched/sched_stat_sleep/format ... print fmt: "comm=%s pid=%d delay=%Lu [ns]", ... But commit 75ec29ab848a7e92a41aaafaeb33d1afbc839be4 ("tracing: Convert some sched trace events to DEFINE_EVENT and _PRINT") broke the format: $ cat events/sched/sched_stat_iowait/format print fmt: "task: %s:%d iowait: %Lu [ns]", ... $ cat events/sched/sched_stat_sleep/format print fmt: "task: %s:%d sleep: %Lu [ns]", ... No change in functionality. Signed-off-by: Li Zefan Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0E2951.9050800@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit b5eb34c3592545c756e50d882c08417eb60740a7 Author: Li Zefan Date: Thu Nov 26 15:07:36 2009 +0800 tracing: Convert some ext4 events to DEFINE_TRACE Use DECLARE_EVENT_CLASS to remove duplicate code: text data bss dec hex filename 294695 6104 340 301139 49853 fs/ext4/ext4.o.old 289983 6104 324 296411 485db fs/ext4/ext4.o 5 events are converted: ext4__write_begin: ext4_write_begin, ext4_da_write_begin ext4__write_end: ext4_{ordered, writeback, journalled}_write_end No change in functionality. 
Signed-off-by: Li Zefan Cc: Theodore Ts'o Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0E2938.2040708@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 071688f36e7eba3e37b2fc48e35bfdab99b80b4d Author: Li Zefan Date: Thu Nov 26 15:06:55 2009 +0800 tracing: Convert some jbd2 events to DEFINE_EVENT Use DECLARE_EVENT_CLASS to remove duplicate code: text data bss dec hex filename 34903 1693 448 37044 90b4 fs/jbd2/journal.o.old 31931 1693 416 34040 84f8 fs/jbd2/journal.o Four events are converted: jbd2_commit: jbd2_start_commit, jbd2_commit_{locking, flushing, logging} No change in functionality. Signed-off-by: Li Zefan Cc: Theodore Ts'o Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0E290F.7030909@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 77ca1e0294f25fc26053ba14353e703158acef26 Author: Li Zefan Date: Thu Nov 26 15:06:14 2009 +0800 tracing: Convert some block events to DEFINE_EVENT use DECLARE_EVENT_CLASS to remove duplicate code: text data bss dec hex filename 53570 3284 184 57038 dece block/blk-core.o.old 43702 3284 144 47130 b81a block/blk-core.o 12 events are converted: block_rq: block_rq_insert, block_rq_issue block_rq_with_error: block_rq_{abort, requeue, complete} block_bio: block_bio_{backmerge, frontmerge, queue} block_get_rq: block_getrq, block_sleeprq block_unplug: block_unplug_timer, block_unplug_io No change in functionality. Signed-off-by: Li Zefan Cc: Jens Axboe Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0E28E6.7060609@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 7703466b4c0a21b88d701882bef0d45bcb0a0281 Author: Li Zefan Date: Thu Nov 26 15:05:38 2009 +0800 tracing: Convert some power events to DEFINE_EVENT Use DECLARE_EVENT_CLASS to remove duplicate code: text data bss dec hex filename 4312 524 12 4848 12f0 kernel/trace/power-traces.o.old 3455 524 8 3987 f93 kernel/trace/power-traces.o Two events are converted: power: power_start, power_frequency No change in functionality. 
Signed-off-by: Li Zefan Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Arjan van de Ven LKML-Reference: <4B0E28C2.1090906@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 382ece710bf88b08440b598731361e5a47582b62 Author: Li Zefan Date: Thu Nov 26 15:05:03 2009 +0800 tracing: Convert some workqueue events to DEFINE_EVENT Use DECLARE_EVENT_CLASS to remove duplicate code: text data bss dec hex filename 13171 800 72 14043 36db kernel/workqueue.o.old 12243 800 68 13111 3337 kernel/workqueue.o Two events are converted: workqueue: workqueue_insertion, workqueue_execution No change in functionality. Signed-off-by: Li Zefan Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0E289F.5010104@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit c467307c1a812c3150b27a68c2b2d3397bb40a4f Author: Li Zefan Date: Thu Nov 26 15:04:31 2009 +0800 tracing: Convert softirq events to DEFINE_EVENT Use DECLARE_EVENT_CLASS to remove duplicate code: text data bss dec hex filename 12781 952 36 13769 35c9 kernel/softirq.o.old 11981 952 32 12965 32a5 kernel/softirq.o Two events are converted: softirq: softirq_entry, softirq_exit No change in functionality. Signed-off-by: Li Zefan Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0E287F.4030708@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 53d0422c2d10808fddb2c30859193bfea164c7e3 Author: Li Zefan Date: Thu Nov 26 15:04:10 2009 +0800 tracing: Convert some kmem events to DEFINE_EVENT Use DECLARE_EVENT_CLASS to remove duplicate code: text data bss dec hex filename 333987 69800 27228 431015 693a7 mm/built-in.o.old 330030 69800 27228 427058 68432 mm/built-in.o 8 events are converted: kmem_alloc: kmalloc, kmem_cache_alloc kmem_alloc_node: kmalloc_node, kmem_cache_alloc_node kmem_free: kfree, kmem_cache_free mm_page: mm_page_alloc_zone_locked, mm_page_pcpu_drain No change in functionality. 
Signed-off-by: Li Zefan Acked-by: Pekka Enberg Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Mel Gorman LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 925684d6d589e40e41007edf47c69e729d911263 Author: Li Zefan Date: Thu Nov 26 15:03:23 2009 +0800 tracing: Convert module refcnt events to DEFINE_EVENT Use DECLARE_EVENT_CLASS to remove duplicate code: text data bss dec hex filename 29854 1980 128 31962 7cda kernel/module.o.old 28750 1980 128 30858 788a kernel/module.o Two events are converted: module_refcnt: module_get, module_put No change in functionality. Signed-off-by: Li Zefan Cc: Rusty Russell Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0E283B.3010508@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 091ad3658e3c76c5fb05f65bfb64a0246f8f31b5 Author: Ingo Molnar Date: Thu Nov 26 09:04:55 2009 +0100 events: Rename TRACE_EVENT_TEMPLATE() to DECLARE_EVENT_CLASS() It is not quite obvious at first sight what TRACE_EVENT_TEMPLATE does: does it define an event as well beyond defining a template? To clarify this, rename it to DECLARE_EVENT_CLASS, which follows the various 'DECLARE_*()' idioms we already have in the kernel: DECLARE_EVENT_CLASS(class) DEFINE_EVENT(class, event1) DEFINE_EVENT(class, event2) DEFINE_EVENT(class, event3) To complete this logic we should also rename TRACE_EVENT() to: DEFINE_SINGLE_EVENT(single_event) ... but in a more quiet moment of the kernel cycle. 
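The DECLARE_EVENT_CLASS()/DEFINE_EVENT() split described above can be illustrated with a small userspace sketch. The DEMO_* macros and the "vec=%d" format below are hypothetical and are not the kernel's implementation; they only show how one class can hold the shared code while each event contributes little more than its name.

```c
#include <stdio.h>

/*
 * Userspace sketch only: these DEMO_* macros are hypothetical and are NOT
 * the kernel's DECLARE_EVENT_CLASS()/DEFINE_EVENT() implementation.
 */

/* The "class" emits the shared formatting code exactly once. */
#define DEMO_DECLARE_EVENT_CLASS(cls, fmt)                              \
	static int demo_class_##cls(char *buf, size_t len,              \
				    const char *event, int value)       \
	{                                                               \
		return snprintf(buf, len, "%s: " fmt, event, value);    \
	}

/* Each "event" is a thin wrapper that only contributes its own name. */
#define DEMO_DEFINE_EVENT(cls, name)                                    \
	static int demo_trace_##name(char *buf, size_t len, int value)  \
	{                                                               \
		return demo_class_##cls(buf, len, #name, value);        \
	}

/* One class, two events: the formatting body exists once, not twice. */
DEMO_DECLARE_EVENT_CLASS(softirq, "vec=%d")
DEMO_DEFINE_EVENT(softirq, softirq_entry)
DEMO_DEFINE_EVENT(softirq, softirq_exit)
```

Calling demo_trace_softirq_entry(buf, sizeof(buf), 3) formats the event name followed by the class's shared format into buf. The size savings quoted in the conversions above presumably come from the same effect: per-event code shrinks to a thin wrapper around the class body.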
Cc: Pekka Enberg Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 67f2de0bf9141dd9fe9189d0caaa28d7ad21a523 Author: Ingo Molnar Date: Thu Nov 26 08:29:10 2009 +0100 x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks This warning: [ 847.140022] rb_producer D 0000000000000000 5928 519 2 0x00000000 [ 847.203627] BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/517 [ 847.207360] caller is show_stack_log_lvl+0x2e/0x241 [ 847.210364] Pid: 517, comm: khungtaskd Not tainted 2.6.32-rc8-tip+ #13761 [ 847.213395] Call Trace: [ 847.215847] [] debug_smp_processor_id+0x1f0/0x20a [ 847.216809] [] show_stack_log_lvl+0x2e/0x241 [ 847.220027] [] show_stack+0x1c/0x1e [ 847.223365] [] sched_show_task+0xe4/0xe9 [ 847.226694] [] check_hung_task+0x140/0x199 [ 847.230261] [] check_hung_uninterruptible_tasks+0x1b7/0x20f [ 847.233371] [] ? watchdog+0x0/0x50 [ 847.236683] [] watchdog+0x4e/0x50 [ 847.240034] [] kthread+0x97/0x9f [ 847.243372] [] child_rip+0xa/0x20 [ 847.246690] [] ? restore_args+0x0/0x30 [ 847.250019] [] ? _spin_lock+0xe/0x10 [ 847.253351] [] ? kthread+0x0/0x9f [ 847.256833] [] ? child_rip+0x0/0x20 Happens because on preempt-RCU, khungd calls show_stack() with preemption enabled. Make sure we are not preemptible while walking the IRQ and exception stacks on 64-bit. (32-bit stack dumping is preemption safe.) Signed-off-by: Ingo Molnar commit b803090615ccec669681ff85ce28671e7bfefa3d Author: Ingo Molnar Date: Thu Nov 26 08:17:31 2009 +0100 x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details Make the initialization more readable, plus tidy up a few small visual details as well. No change in functionality. 
LKML-Reference: Signed-off-by: Ingo Molnar commit b8007ef7422270864eae523cb38d7522a53a94d3 Author: Lai Jiangshan Date: Tue Nov 3 13:45:32 2009 +0800 tracing: Separate raw syscall from syscall tracer The current syscall tracer mixes raw syscalls and real syscalls. echo 1 > events/syscalls/enable And we get these from the output: (XXXX stands in for " grep-20914 [001] 588211.446347" etc.) XXXX: sys_read(fd: 3, buf: 80609a8, count: 7000) XXXX: sys_enter: NR 3 (3, 80609a8, 7000, a, 1000, bfce8ef8) XXXX: sys_read -> 0x138 XXXX: sys_exit: NR 3 = 312 XXXX: sys_read(fd: 3, buf: 8060ae0, count: 7000) XXXX: sys_enter: NR 3 (3, 8060ae0, 7000, a, 1000, bfce8ef8) XXXX: sys_read -> 0x138 XXXX: sys_exit: NR 3 = 312 There are 2 drawbacks here. A) Two almost identical records are saved in the ring buffer when a syscall enters or exits (4 records for every syscall). This wastes precious space in the ring buffer. B) The lines including "sys_enter/sys_exit" produce hardly any useful information for the output (no labels). The user can use this method to avoid these drawbacks: echo 1 > events/syscalls/enable echo 0 > events/syscalls/sys_enter/enable echo 0 > events/syscalls/sys_exit/enable But this is not user friendly. So we separate the raw syscall events from the syscall tracer.
After this fix applied: syscall tracer's output (echo 1 > events/syscalls/enable): XXXX: sys_read(fd: 3, buf: bfe87d88, count: 200) XXXX: sys_read -> 0x200 XXXX: sys_fstat64(fd: 3, statbuf: bfe87c98) XXXX: sys_fstat64 -> 0x0 XXXX: sys_close(fd: 3) raw syscall tracer's output (echo 1 > events/raw_syscalls/enable): XXXX: sys_enter: NR 175 (0, bf92bf18, bf92bf98, 8, b748cff4, bf92bef8) XXXX: sys_exit: NR 175 = 0 XXXX: sys_enter: NR 175 (2, bf92bf98, 0, 8, b748cff4, bf92bef8) XXXX: sys_exit: NR 175 = 0 XXXX: sys_enter: NR 3 (9, bf927f9c, 4000, b77e2518, b77dce60, bf92bff8) Signed-off-by: Lai Jiangshan LKML-Reference: <4AEFC37C.5080609@cn.fujitsu.com> Signed-off-by: Steven Rostedt commit 7ac074340480018681a0d72b324d4487543bdc0e Author: Steven Rostedt Date: Wed Nov 25 13:22:21 2009 -0500 ring-buffer-benchmark: Add parameters to set produce/consumer priorities Running the ring-buffer-benchmark's threads at the lowest priority may work well for keeping it in the background, but it is not appropriate for the benchmarks. This patch adds 4 parameters to the module: consumer_fifo consumer_nice producer_fifo producer_nice By default the consumer and producer still run at nice +19. If the *_fifo options are set, they will override the *_nice values. modprobe ring_buffer_benchmark consumer_nice=0 producer_fifo=10 The above will set the consumer thread to a nice value of 0, and the producer thread to a RT SCHED_FIFO priority of 10. Note, this patch also fixes a bug where calling set_user_nice on the consumer thread would oops the kernel when the parameter "disable_reader" is set. Signed-off-by: Steven Rostedt commit 28b4e0d86acf59ae3bc422921138a4958458326e Author: Tejun Heo Date: Wed Nov 25 22:24:44 2009 +0900 x86: Rename global percpu symbol dr7 to cpu_dr7 Percpu symbols now occupy the same namespace as other global symbols and as such short global symbols without subsystem prefix tend to collide with local variables. dr7 percpu variable used by x86 was hit by this. 
Rename it to cpu_dr7. The rename also makes it more consistent with its fellow cpu_debugreg percpu variable. Signed-off-by: Tejun Heo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Rusty Russell Cc: Christoph Lameter Cc: Linus Torvalds , Cc: Andrew Morton LKML-Reference: <20091125115856.GA17856@elte.hu> Signed-off-by: Ingo Molnar Reported-by: Stephen Rothwell commit 93335a21557e80f6a99bc2812c634e488139043c Author: Shmulik Ladkani Date: Wed Nov 25 15:23:41 2009 +0200 sched.c: Call debug_show_all_locks() when dumping all tasks In commit v2.6.21-691-g39bc89f ("make SysRq-T show all tasks again") the interface of show_state_filter() was changed: zero valued 'state_filter' specifies "dump all tasks" (instead of -1). However, the condition for calling debug_show_all_locks() ("show locks if all tasks are dumped") was not updated accordingly. Signed-off-by: Shmulik Ladkani Cc: peterz@infradead.org LKML-Reference: <4b0d2fe4.0ab6660a.6437.3cfc@mx.google.com> Signed-off-by: Ingo Molnar commit 273bee27fa9f79d94b78c83506016f2e41e78983 Author: FUJITA Tomonori Date: Wed Nov 25 08:46:28 2009 +0900 x86: Fix iommu=soft boot option iommu=soft boot option forces the kernel to use swiotlb. ( This has the side-effect of enabling the swiotlb over the GART if this boot option is provided. This is the desired behavior of the swiotlb boot option and works like that for all other hw-IOMMU drivers. ) Signed-off-by: FUJITA Tomonori Cc: yinghai@kernel.org LKML-Reference: <20091125084611O.fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 99df5a6a215f026e62287083de2b44b22edd3623 Author: Tom Zanussi Date: Wed Nov 25 01:14:59 2009 -0600 trace/syscalls: Change ret param in struct syscall_trace_exit to long Commit ee949a86b3aef15845ea677aa60231008de62672 ("tracing/syscalls: Use long for syscall ret format and field definitions") changed the syscall exit return type to long, but forgot to change it in the struct. 
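A minimal sketch of why the width of the ret field matters, assuming a 64-bit kernel where long is 64 bits and int is 32 bits. The struct and function names are hypothetical and do not reproduce the kernel's struct syscall_trace_exit; fixed-width types stand in for int and long so the behaviour does not depend on the build host.

```c
#include <stdint.h>

/* Hypothetical record layouts for illustration; not the kernel's structs. */
struct demo_exit_narrow {
	int32_t ret;	/* models a 32-bit "int" return field */
};

struct demo_exit_wide {
	int64_t ret;	/* models a 64-bit "long" return field */
};

/* Store a syscall return value in the narrow record and read it back. */
static int64_t demo_store_narrow(int64_t ret)
{
	struct demo_exit_narrow rec;

	/* Only the low 32 bits survive; the upper half is silently dropped. */
	rec.ret = (int32_t)(uint32_t)ret;
	return rec.ret;
}

/* Store the same value in the wide record; nothing is lost. */
static int64_t demo_store_wide(int64_t ret)
{
	struct demo_exit_wide rec;

	rec.ret = ret;
	return rec.ret;
}
```

Small values and typical negative error codes round-trip through either layout, which may be why a mismatch like this can go unnoticed. A value that needs more than 32 bits, such as a large mapping address, only survives in the wide record.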
Signed-off-by: Tom Zanussi Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <1259133299-23594-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 0d0bea5ea4a0e91feff22ac5e32e14ff3a682247 Author: Tom Zanussi Date: Wed Nov 25 01:14:58 2009 -0600 perf tools: Add 'signed' flag setting back into trace-event-parse.c Commit 13999e59343b042b0807be2df6ae5895d29782a0 (perf tools: Handle the case with and without the "signed" trace field) removed code to set the FIELD_IS_SIGNED flag that was originally added by commit 26a50744b21fff65bd754874072857bee8967f4d (tracing/events: Add 'signed' field to format files). This adds it back. Signed-off-by: Tom Zanussi Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <1259133299-23594-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 9533ac6291d78cd16c4b11a15bfbb055affd76c3 Merge: fe61267 75ec29a Author: Ingo Molnar Date: Wed Nov 25 09:03:15 2009 +0100 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core commit 7539cf4b92be4aecc573ea962135f246a7a33401 Author: Tetsuo Handa Date: Tue Nov 24 22:00:05 2009 +0900 TOMOYO: Add recursive directory matching operator support. TOMOYO 1.7.1 has recursive directory matching operator support. I want to add it to TOMOYO for Linux 2.6.33 . ---------- [PATCH] TOMOYO: Add recursive directory matching operator support. This patch introduces new operator /\{dir\}/ which matches '/' + 'One or more repetitions of dir/' (e.g. /dir/ /dir/dir/ /dir/dir/dir/ ). 
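The matching rule for /\{dir\}/ can be sketched as a small standalone function. This is an illustration under stated assumptions, not TOMOYO's matcher: demo_match_repeated_dir() is a hypothetical helper, and it treats dir as a literal component, whereas a real policy pattern may contain further wildcards.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not TOMOYO code: return true when path is
 * "/" followed by one or more repetitions of "<dir>/" and nothing else.
 */
static bool demo_match_repeated_dir(const char *path, const char *dir)
{
	size_t dlen = strlen(dir);
	size_t reps = 0;

	/* An empty component or a missing leading '/' can never match. */
	if (dlen == 0 || *path++ != '/')
		return false;

	/* Consume "<dir>/" as many times as it appears back to back. */
	while (strncmp(path, dir, dlen) == 0 && path[dlen] == '/') {
		path += dlen + 1;
		reps++;
	}

	/* At least one repetition is required and nothing may follow. */
	return reps > 0 && *path == '\0';
}
```

With dir set to "dir", the paths /dir/, /dir/dir/ and /dir/dir/dir/ from the example above all match, while "/" alone (zero repetitions) and "/dir" (no trailing slash) do not.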
Signed-off-by: Tetsuo Handa Acked-by: John Johansen Signed-off-by: James Morris commit 75ec29ab848a7e92a41aaafaeb33d1afbc839be4 Author: Steven Rostedt Date: Wed Nov 18 20:48:08 2009 -0500 tracing: Convert some sched trace events to DEFINE_EVENT and _PRINT Converting some of the scheduler trace events to use the TRACE_EVENT_TEMPLATE, DEFINE_EVENT and DEFINE_EVENT_PRINT helped to save some space: $ size kernel/sched.o-* text data bss dec hex filename 79299 6776 2520 88595 15a13 kernel/sched.o-notrace 101941 11896 2584 116421 1c6c5 kernel/sched.o-templ 104779 11896 2584 119259 1d1db kernel/sched.o-trace sched.o-notrace is without any tracepoints compiled; sched.o-templ is with this patch; sched.o-trace is the tracepoints before this patch. The trace events converted to DEFINE_EVENT: sched_wakeup, sched_wakeup_new, sched_process_free, sched_process_exit, and sched_stat_wait. The trace events converted to DEFINE_EVENT_PRINT: sched_stat_sleep and sched_stat_iowait. Note, since the TRACE_EVENT_TEMPLATE always uses a print, the sched_stat_wait print format is defined in the template and this template is used by sched_stat_sleep and sched_stat_iowait. But the latter two override the print format. Signed-off-by: Steven Rostedt commit e5bc9721684e9412f3e0465222f317c362a8ab47 Author: Steven Rostedt Date: Wed Nov 18 20:36:26 2009 -0500 tracing: Create new DEFINE_EVENT_PRINT After creating the TRACE_EVENT_TEMPLATE I started to look at other trace points to see what duplication was made. I noticed that there are several trace points that are almost identical except for the name and the output format. Since TRACE_EVENT_TEMPLATE was successful in bringing down the size of trace events, I added a DEFINE_EVENT_PRINT. DEFINE_EVENT_PRINT is used just like DEFINE_EVENT is. That is, the DEFINE_EVENT_PRINT also uses a TRACE_EVENT_TEMPLATE, but it allows the developer to override the print format.
If there are two or more TRACE_EVENTS that are identical except for the name and print, then they can be converted to use a TRACE_EVENT_TEMPLATE. Since the TRACE_EVENT_TEMPLATE already does the print output, the first trace event would have its print format held in the TRACE_EVENT_TEMPLATE and be defined with a DEFINE_EVENT. The rest will use the DEFINE_EVENT_PRINT and override the print format. Converting the sched trace points to both DEFINE_EVENT and DEFINE_EVENT_PRINT. Five were converted to DEFINE_EVENT and two were converted to DEFINE_EVENT_PRINT. I was able to get the following: $ size kernel/sched.o-* text data bss dec hex filename 79299 6776 2520 88595 15a13 kernel/sched.o-notrace 101941 11896 2584 116421 1c6c5 kernel/sched.o-templ 104779 11896 2584 119259 1d1db kernel/sched.o-trace sched.o-notrace is the scheduler compiled with no trace points. sched.o-templ is with the use of DEFINE_EVENT and DEFINE_EVENT_PRINT sched.o-trace is the current trace events. Signed-off-by: Steven Rostedt commit ff038f5c37c2070829004a0678372766c2b32180 Author: Steven Rostedt Date: Wed Nov 18 20:27:27 2009 -0500 tracing: Create new TRACE_EVENT_TEMPLATE There are some places in the kernel that define several tracepoints and they are all identical besides the name. The code to enable, disable and record is created for every trace point even if most of the code is identical. This patch adds TRACE_EVENT_TEMPLATE that lets the developer create a template TRACE_EVENT and create trace points with DEFINE_EVENT, which is based off of a given template. Each trace point used by this will share most of the code, and bring down the size of the kernel when there are several duplicate events. Usage is: TRACE_EVENT_TEMPLATE(name, proto, args, tstruct, assign, print); Which would be the same as defining a normal TRACE_EVENT. To create the trace events that the trace points will use: DEFINE_EVENT(template, name, proto, args) is done. The template is the name of the TRACE_EVENT_TEMPLATE to use. 
The name is the name of the trace point. The parameters proto and args must be the same as the proto and args of the template. If they are not the same, then a compile error will result. I tried hard to remove this duplication but the C preprocessor is not powerful enough (or my CPP magic experience points are not at a high enough level) to not need them. A lot of trace events are coming in with new XFS development. Most of the trace points are identical except for the name. The following shows the advantage of having TRACE_EVENT_TEMPLATE: $ size fs/xfs/xfs.o.* text data bss dec hex filename 452114 2788 3520 458422 6feb6 fs/xfs/xfs.o.old 638482 38116 3744 680342 a6196 fs/xfs/xfs.o.template 996954 38116 4480 1039550 fdcbe fs/xfs/xfs.o.trace xfs.o.old is without any tracepoints. xfs.o.template uses the new TRACE_EVENT_TEMPLATE. xfs.o.trace uses the current TRACE_EVENT macros. Requested-by: Christoph Hellwig Signed-off-by: Steven Rostedt commit fe6126722718e51fba4879517c11ac12d9775bcc Author: Frederic Weisbecker Date: Tue Nov 24 20:38:22 2009 +0100 perf_events: Fix bad software/trace event recursion counting Commit 4ed7c92d68a5387ba5f7030dc76eab03558e27f5 (perf_events: Undo some recursion damage) has introduced a bad reference counting of the recursion context. Putting the context behaves like getting it, dropping every software/trace event after the first one in a context. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Arjan van de Ven Cc: Li Zefan Cc: Steven Rostedt LKML-Reference: <1259091502-5171-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 1261a02a0c0ab8e643125705f0d1d83e5090e4d1 Author: Stephane Eranian Date: Tue Nov 24 05:27:18 2009 -0800 perf_events, x86: Fix validate_event bug The validate_event() was failing on valid event combinations. The function was assuming that if x86_schedule_event() returned 0, it meant error.
But x86_schedule_event() returns the counter index and 0 is a perfectly valid value. An error is indicated only by a negative return value. Furthermore, validate_event() was also failing for event groups because the event->pmu was not set until after hw_perf_event_init(). Signed-off-by: Stephane Eranian Cc: peterz@infradead.org Cc: paulus@samba.org Cc: perfmon2-devel@lists.sourceforge.net Cc: eranian@gmail.com LKML-Reference: <4b0bdf36.1818d00a.07cc.25ae@mx.google.com> Signed-off-by: Ingo Molnar -- arch/x86/kernel/cpu/perf_event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) commit fcf1203a919c3a3d212c0ed01f5240fd592bf5ae Author: Arnaldo Carvalho de Melo Date: Tue Nov 24 13:01:52 2009 -0200 perf symbols: Rename find_symbol routines to find_function Paving the way for supporting variables in addition to function symbols. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259074912-5924-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 727dad10c17cbaade3cb6a56bd4863a4630f4d13 Author: Arnaldo Carvalho de Melo Date: Tue Nov 24 12:05:17 2009 -0200 perf tools: Remove unused wrapper routines And also make xrealloc and xmalloc weak symbols so that we don't have this problem: /usr/lib/gcc/x86_64-redhat-linux/4.4.1/../../../../lib64/libiberty.a(xmalloc.o): In function `xrealloc': (.text+0xc0): multiple definition of `xrealloc' libperf.a(wrapper.o):/home/acme_unencrypted/git/linux-2.6-tip/tools/perf/util/wrapper.c:67: first defined here collect2: ld returned 1 exit status Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259071517-3242-4-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 364794845cbc49e638b83d7ef739524291e1e961 Author: Arnaldo Carvalho de Melo Date: Tue Nov 24 12:05:16 2009 -0200 perf tools: Introduce zalloc() for the common
calloc(1, N) case This way we type fewer characters and it looks more like the kzalloc kernel counterpart. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259071517-3242-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit b32d133aec5dc882cf783a293f393bfb3f4379e1 Author: Arnaldo Carvalho de Melo Date: Tue Nov 24 12:05:15 2009 -0200 perf symbols: Simplify symbol machinery setup And also express its configuration toggles via a struct. Now all one has to do is to call symbol__init(NULL) if the defaults are OK, or pass a struct symbol_conf pointer with the desired configuration. If a tool uses kernel_maps__find_symbol() to look at the kernel and modules mappings for a symbol but didn't call symbol__init() first, that will generate a one-time warning too, alerting the subcommand developer that symbol__init() must be called. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259071517-3242-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 7cc017edb9459193d3b581155a14029e4bef0c49 Author: Arnaldo Carvalho de Melo Date: Tue Nov 24 12:05:14 2009 -0200 perf top: Always show the DSO column, even if its all the same Ingo found it confusing, and I agree with that. For 'perf report' it's OK because it is static, but for a tool that refreshes its display the eventual switch from column to summary at the top may seem confusing.
Suggested-by: Ingo Molnar Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259071517-3242-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit e74328d3a17ed75ffdf72b86f289965823a47240 Author: John Kacur Date: Tue Nov 24 15:35:01 2009 +0100 perf tools: Use common process_event functions for annotate and report Prevent bit-rot in perf-annotate by using common functions where possible. Here we create process_events.[ch] to hold the common functions. Signed-off-by: John Kacur Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: acme@redhat.com LKML-Reference: <1259073301-11506-3-git-send-email-jkacur@redhat.com> Signed-off-by: Ingo Molnar commit c9c7ccaf3a2686ed3a44d69bb1f8b55eeead8a4e Author: John Kacur Date: Tue Nov 24 15:35:00 2009 +0100 perf tools: Add perf.data to .gitignore Signed-off-by: John Kacur Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: acme@redhat.com LKML-Reference: <1259073301-11506-2-git-send-email-jkacur@redhat.com> Signed-off-by: Ingo Molnar commit 1263d736a9031f3d943819662d4bad727d64bf24 Merge: 184d3da 12eac0b Author: Ingo Molnar Date: Tue Nov 24 16:36:03 2009 +0100 Merge branch 'perf/bench' into perf/core Merge reason: Looks mergable - ready it for the merge window. Signed-off-by: Ingo Molnar commit a49ed0bf427a8328a3296eebedc7697fe5098dbf Author: Thomas Gleixner Date: Mon Nov 16 19:57:50 2009 +0100 locking: Use __[SPIN|RW]_LOCK_UNLOCKED in [spin|rw]_lock_init() SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated. Replace them with the __*_LOCK_UNLOCKED variants. Signed-off-by: Thomas Gleixner commit c9286b7e293a1ea054e857ff3f5a23d0ad8d4f36 Author: Thomas Gleixner Date: Mon Nov 16 19:50:38 2009 +0100 locking: Remove unused prototype commit 910067d1(remove generic__raw_read_trylock()) removed the implementation but left the prototype around. Remove it. 
Signed-off-by: Thomas Gleixner commit a3a1de0c34de6f5f8332cd6151c46af7813c0fcb Author: Tim Blechmann Date: Tue Nov 24 11:55:15 2009 +0100 sched, x86: Optimize branch hint in __switch_to() Branch hint profiling on my nehalem machine showed 96% incorrect branch hints: 6548732 174664120 96 __switch_to process_64.c 406 6548745 174565593 96 __switch_to process_64.c 410 Signed-off-by: Tim Blechmann Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <4B0BBB93.3080307@klingt.org> Signed-off-by: Ingo Molnar commit 710390d90f143a9ebb87a475215140f426792efd Author: Tim Blechmann Date: Tue Nov 24 11:55:27 2009 +0100 sched: Optimize branch hint in context_switch() Branch hint profiling on my nehalem machine showed over 90% incorrect branch hints: 10420275 170645395 94 context_switch sched.c 3043 10408421 171098521 94 context_switch sched.c 3050 Signed-off-by: Tim Blechmann Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <4B0BBB9F.6080304@klingt.org> Signed-off-by: Ingo Molnar commit 36ace27e3e60d44ea69ce394b2e45386ae98d9d9 Author: Tim Blechmann Date: Tue Nov 24 11:55:45 2009 +0100 sched: Optimize branch hint in pick_next_task_fair() Branch hint profiling on my nehalem machine showed 90% incorrect branch hints: 15728471 158903754 90 pick_next_task_fair sched_fair.c 1555 Signed-off-by: Tim Blechmann Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <4B0BBBB1.2050100@klingt.org> Signed-off-by: Ingo Molnar commit 184d3da8ef0ca552dffa0fdd35c046e058a2cf9a Author: Stephane Eranian Date: Mon Nov 23 21:40:49 2009 -0800 perf_events: Fix bogus copy_to_user() in perf_event_read_group() When using an event group, the value and id for non leaders events were wrong due to invalid offset into the outgoing buffer. 
Signed-off-by: Stephane Eranian Acked-by: Peter Zijlstra Cc: paulus@samba.org Cc: perfmon2-devel@lists.sourceforge.net LKML-Reference: <4b0b71e1.0508d00a.075e.ffff84a3@mx.google.com> Signed-off-by: Ingo Molnar commit b23d5767a5818caec8547d0bce1588b02bdecd30 Author: Li Zefan Date: Tue Nov 24 13:27:11 2009 +0800 perf kmem: Add help file Add Documentation/perf-kmem.txt Signed-off-by: Li Zefan Acked-by: Pekka Enberg Cc: Eduard - Gabriel Munteanu Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: linux-mm@kvack.org LKML-Reference: <4B0B6EAF.80802@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 079d3f653134e2f2ac99dae28b08c0cc64268103 Author: Li Zefan Date: Tue Nov 24 13:26:55 2009 +0800 perf kmem: Measure kmalloc/kfree CPU ping-pong call-sites Show statistics for allocations and frees on different cpus: ------------------------------------------------------------------------------------------------------ Callsite | Total_alloc/Per | Total_req/Per | Hit | Ping-pong | Frag ------------------------------------------------------------------------------------------------------ perf_event_alloc.clone.0+0 | 7504/682 | 7128/648 | 11 | 0 | 5.011% alloc_buffer_head+16 | 288/57 | 280/56 | 5 | 0 | 2.778% radix_tree_preload+51 | 296/296 | 288/288 | 1 | 0 | 2.703% tracepoint_add_probe+32e | 157/31 | 154/30 | 5 | 0 | 1.911% do_maps_open+0 | 796/12 | 792/12 | 66 | 0 | 0.503% sock_alloc_send_pskb+16e | 23780/495 | 23744/494 | 48 | 38 | 0.151% anon_vma_prepare+9a | 3744/44 | 3740/44 | 85 | 0 | 0.107% d_alloc+21 | 64948/164 | 64944/164 | 396 | 0 | 0.006% proc_alloc_inode+23 | 262292/676 | 262288/676 | 388 | 0 | 0.002% create_object+28 | 459600/200 | 459600/200 | 2298 | 71 | 0.000% journal_start+67 | 14440/40 | 14440/40 | 361 | 0 | 0.000% get_empty_filp+df | 53504/256 | 53504/256 | 209 | 0 | 0.000% getname+2a | 823296/4096 | 823296/4096 | 201 | 0 | 0.000% seq_read+2b0 | 544768/4096 | 544768/4096 | 133 | 0 | 0.000% seq_open+6d | 17024/128 | 17024/128 | 133 | 0 | 0.000% mmap_region+2e6 | 
11704/88 | 11704/88 | 133 | 0 | 0.000% single_open+0 | 1072/16 | 1072/16 | 67 | 0 | 0.000% __alloc_skb+2e | 12544/256 | 12544/256 | 49 | 38 | 0.000% __sigqueue_alloc+4a | 1296/144 | 1296/144 | 9 | 8 | 0.000% tracepoint_add_probe+6f | 80/16 | 80/16 | 5 | 0 | 0.000% ------------------------------------------------------------------------------------------------------ ... Signed-off-by: Li Zefan Acked-by: Pekka Enberg Cc: Eduard - Gabriel Munteanu Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: linux-mm@kvack.org LKML-Reference: <4B0B6E9F.6020309@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 7d0d39459dab20bf60cac30a1a7d50b286c60cc1 Author: Li Zefan Date: Tue Nov 24 13:26:31 2009 +0800 perf kmem: Collect cross node allocation statistics Show cross node memory allocations: # ./perf kmem SUMMARY ======= ... Cross node allocations: 0/3633 Signed-off-by: Li Zefan Acked-by: Pekka Enberg Cc: Eduard - Gabriel Munteanu Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: linux-mm@kvack.org LKML-Reference: <4B0B6E87.10906@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 29b3e15289eb66788a0bf5ea4903f9fbeb1ec751 Author: Li Zefan Date: Tue Nov 24 13:26:10 2009 +0800 perf kmem: Default to sort by fragmentation Make the output sort by fragmentation by default. Also make the usage of "--sort" option consistent with other perf tools. That is, we support multi keys: "--sort key1[,key2]...". # ./perf kmem --stat caller ------------------------------------------------------------------------------ Callsite |Total_alloc/Per | Total_req/Per | Hit | Frag ------------------------------------------------------------------------------ __netdev_alloc_skb+23 | 5048/1682 | 4564/1521 | 3| 9.588% perf_event_alloc.clone.0+0 | 7504/682 | 7128/648 | 11| 5.011% tracepoint_add_probe+32e | 157/31 | 154/30 | 5| 1.911% alloc_buffer_head+16 | 456/57 | 448/56 | 8| 1.754% radix_tree_preload+51 | 584/292 | 576/288 | 2| 1.370% ... 
TODO: - Extract duplicate code in builtin-kmem.c and builtin-sched.c into util/sort.c. Signed-off-by: Li Zefan Acked-by: Pekka Enberg Cc: Eduard - Gabriel Munteanu Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: linux-mm@kvack.org LKML-Reference: <4B0B6E72.7010200@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 7707b6b6f8d9188b612f9fc88c65411264b1ed57 Author: Li Zefan Date: Tue Nov 24 13:25:48 2009 +0800 perf kmem: Add new option to show raw ip Add option "--raw-ip" to show raw ip instead of symbols: # ./perf kmem --stat caller --raw-ip ------------------------------------------------------------------------------ Callsite |Total_alloc/Per | Total_req/Per | Hit | Frag ------------------------------------------------------------------------------ 0xc05301aa | 733184/4096 | 733184/4096 | 179| 0.000% 0xc0542ba0 | 483328/4096 | 483328/4096 | 118| 0.000% ... Also show symbols with format sym+offset instead of sym/offset. Signed-off-by: Li Zefan Acked-by: Pekka Enberg Cc: Eduard - Gabriel Munteanu Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: linux-mm@kvack.org LKML-Reference: <4B0B6E5C.4080900@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit ee3d250446f1c1be4eceab48f3a23794d9a6564c Author: Paul Mackerras Date: Tue Nov 24 15:19:43 2009 +1100 perf tools: Fix compilation on powerpc Currently, perf fails to compile on powerpc with this error: CC util/header.o In file included from util/../perf.h:17, from util/header.c:9: util/../../../arch/powerpc/include/asm/unistd.h:360:27: error: linux/linkage.h: No such file or directory make: *** [util/header.o] Error 1 The reason is that we still have a #define __KERNEL__ in effect at the point where gets included, which means we get extra stuff that we don't need or want. This fixes the problem by undefining __KERNEL__ once we have included the file for which we need __KERNEL__ defined. 
Signed-off-by: Paul Mackerras Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Peter Zijlstra LKML-Reference: <19211.24287.453183.78836@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar commit b3a222e52e4d4be77cc4520a57af1a4a0d8222d1 Author: Serge E. Hallyn Date: Mon Nov 23 16:21:30 2009 -0600 remove CONFIG_SECURITY_FILE_CAPABILITIES compile option As far as I know, all distros currently ship kernels with default CONFIG_SECURITY_FILE_CAPABILITIES=y. Since having the option on leaves a 'no_file_caps' option to boot without file capabilities, the main reason to keep the option is that turning it off saves you (on my s390x partition) 5k. In particular, vmlinux sizes came to: without patch fscaps=n: 53598392 without patch fscaps=y: 53603406 with this patch applied: 53603342 with the security-next tree. Against this we must weigh the fact that there is no simple way for userspace to figure out whether file capabilities are supported, while things like per-process securebits, capability bounding sets, and adding bits to pI if CAP_SETPCAP is in pE are not supported with SECURITY_FILE_CAPABILITIES=n, leaving a bit of a problem for applications wanting to know whether they can use them and/or why something failed. It also adds another subtly different set of semantics which we must maintain at the risk of severe security regressions. So this patch removes the SECURITY_FILE_CAPABILITIES compile option. It drops the kernel size by about 50k over the stock SECURITY_FILE_CAPABILITIES=y kernel, by removing the cap_limit_ptraced_target() function. Changelog: Nov 20: remove cap_limit_ptraced_target() as its logic was ifndef'ed. Signed-off-by: Serge E. Hallyn Acked-by: Andrew G. Morgan Signed-off-by: James Morris commit 0bce95279909aa4cc401a2e3140b4295ca22e72a Author: Eric Paris Date: Mon Nov 23 16:47:23 2009 -0500 SELinux: print denials for buggy kernel with unknown perms Historically we've seen cases where permissions are requested for classes where they do not exist.
In particular we have seen CIFS forget to set i_mode to indicate it is a directory, so when we later check something like remove_name we have problems since it wasn't defined in the tclass file. This used to result in an avc which included the permission 0x2000 or something. Currently the kernel will deny the operations (good thing) but will not print ANY information (bad thing). First, the auditdeny field is not extended to include unknown permissions. After that is fixed, the logic in avc_dump_query to output this information isn't right since it will remove the permission from the av and print the phrase "". This takes us back to the behavior before the classmap rewrite. Signed-off-by: Eric Paris Signed-off-by: James Morris commit fa7c27ee9394fc0d52404b2a89882e95868a60b9 Author: Frederic Weisbecker Date: Mon Nov 23 22:30:12 2009 +0100 hw-breakpoints: Fix misordered ifdef Fix a misplaced ifdef. We need the perf event headers also in off-case to avoid the following build error: include/linux/hw_breakpoint.h:94: error: expected declaration specifiers or '...' before 'perf_callback_t' include/linux/hw_breakpoint.h:102: error: expected declaration specifiers or '...' before 'perf_callback_t' include/linux/hw_breakpoint.h:109: error: expected declaration specifiers or '...' before 'perf_callback_t' include/linux/hw_breakpoint.h:116: error: expected declaration specifiers or '...' before 'perf_callback_t' Reported-by: Kisskb-bot by Michael Ellerman Signed-off-by: Frederic Weisbecker Cc: Prasad LKML-Reference: <1259011812-8093-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit c4a5af54c8ef277a59189fc9358e190f3c1b8206 Author: Andrew G. Morgan Date: Mon Nov 23 04:57:52 2009 +0000 Silence the existing API for capability version compatibility check. When libcap, or other libraries attempt to confirm/determine the supported capability version magic, they generally supply a NULL dataptr to capget().
In this case, while returning the supported/preferred magic (via a modified header content), the return code of this system call may be 0, -EINVAL, or -EFAULT. No libcap code depends on the previous -EINVAL etc. return code, and all of the above three return codes can accompany a valid (successful) attempt to determine the requested magic value. This patch cleans up the system call to return 0, if the call is successfully being used to determine the supported/preferred capability magic value. Signed-off-by: Andrew G. Morgan Acked-by: Steve Grubb Acked-by: Serge Hallyn Signed-off-by: James Morris commit fe542cf59bf0b31afe72b9e9749c0f6645419fa0 Author: Tetsuo Handa Date: Sun Nov 22 11:49:55 2009 +0900 LSM: Move security_path_chmod()/security_path_chown() to after mutex_lock(). We should call security_path_chmod()/security_path_chown() after mutex_lock() in order to avoid races. Signed-off-by: Tetsuo Handa Acked-by: John Johansen Signed-off-by: James Morris commit 1b145ae58035f30353d78d25bea665091df9b438 Author: Arnaldo Carvalho de Melo Date: Mon Nov 23 17:51:09 2009 -0200 perf kmem: Resolve symbols E.g.: [root@doppio linux-2.6-tip]# perf kmem record sleep 3s [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.804 MB perf.data (~35105 samples) ] [root@doppio linux-2.6-tip]# perf kmem --stat caller | head -10 ------------------------------------------------------------------------------ Callsite |Total_alloc/Per | Total_req/Per | Hit | Frag ------------------------------------------------------------------------------ getname/40 | 1519616/4096 | 1519616/4096 | 371| 0.000% seq_read/a2 | 987136/4096 | 987136/4096 | 241| 0.000% __netdev_alloc_skb/43 | 260368/1049 | 259968/1048 | 248| 0.154% __alloc_skb/5a | 77312/256 | 77312/256 | 302| 0.000% proc_alloc_inode/33 | 76480/632 | 76472/632 | 121| 0.010% get_empty_filp/8d | 70272/192 | 70272/192 | 366| 0.000% split_vma/8e | 42064/176 | 42064/176 | 239| 0.000% [root@doppio linux-2.6-tip]# 
Signed-off-by: Arnaldo Carvalho de Melo Acked-by: Pekka Enberg Cc: Eduard - Gabriel Munteanu Cc: Frédéric Weisbecker Cc: linux-mm@kvack.org Cc: Li Zefan Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Steven Rostedt LKML-Reference: <1259005869-13487-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 2890284bcf5c13c10fae8a0c20ad2f575118a092 Author: Arnaldo Carvalho de Melo Date: Mon Nov 23 17:51:08 2009 -0200 perf tools: Move graph_line and graph_dotted_line from top So that they can be used in other tools. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259005869-13487-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 27c13ecec4d8856687b50b959e1146845b478f95 Author: Borislav Petkov Date: Sat Nov 21 14:01:45 2009 +0100 x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes display_cacheinfo() doesn't display anything anymore and it is used to detect CPU cache sizes. Rename it accordingly. Signed-off-by: Borislav Petkov LKML-Reference: <20091121130145.GA31357@liondog.tnic> Signed-off-by: H. Peter Anvin commit cc612d8199089413719397c9d92e5823da578eac Author: Arnaldo Carvalho de Melo Date: Mon Nov 23 16:39:10 2009 -0200 perf symbols: Look for vmlinux in more places Now that we can check the buildid to see if it really matches, this can be done safely: vmlinux /boot/vmlinux /boot/vmlinux- /lib/modules//build/vmlinux /usr/lib/debug/lib/modules/%s/vmlinux More can be added - if you know about distros that put the vmlinux somewhere else please let us know. 
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1259001550-8194-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 0444c9bd0cf4e0eb946a7fcaf34765accfa9404a Author: Jan Beulich Date: Fri Nov 20 14:03:05 2009 +0000 x86: Tighten conditionals on MCE related statistics irq_thermal_count is only being maintained when X86_THERMAL_VECTOR, and both X86_THERMAL_VECTOR and X86_MCE_THRESHOLD don't need extra wrapping in X86_MCE conditionals. Signed-off-by: Jan Beulich Cc: Hidetoshi Seto Cc: Yong Wang Cc: Suresh Siddha Cc: Andi Kleen Cc: Borislav Petkov Cc: Arjan van de Ven LKML-Reference: <4B06AFA902000078000211F8@vpn.id2.novell.com> Signed-off-by: Ingo Molnar commit 429947248f814e90f416ab4f68a871ab628000c3 Author: Jan Blunck Date: Fri Nov 20 17:40:37 2009 +0100 sched_feat_write(): Update ppos instead of file->f_pos sched_feat_write() should update ppos instead of file->f_pos. (This reduces some BKL dependencies of this code.) Signed-off-by: Jan Blunck Cc: jkacur@redhat.com Cc: Arnd Bergmann Cc: Frederic Weisbecker Cc: Jamie Lokier Cc: Peter Zijlstra Cc: Christoph Hellwig Cc: Alan Cox LKML-Reference: <1258735245-25826-8-git-send-email-jblunck@suse.de> Signed-off-by: Ingo Molnar commit 1b290d670ffa883b7e062177463a8efd00eaa2c1 Author: Frederic Weisbecker Date: Mon Nov 23 15:42:35 2009 +0100 perf tools: Add support for breakpoint events in perf tools Add the breakpoint events support with this new synopsis: mem:addr[:access] Where addr is a raw addr value in the kernel and access can be either [r][w][x] Example to profile tasklist_lock: $ grep tasklist_lock /proc/kallsyms ffffffff8189c000 D tasklist_lock $ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1 $ perf report # Samples: 62 # # Overhead Command Shared Object Symbol # ........ ............... ............. ......
# 29.03% swapper [kernel] [k] _raw_read_trylock 29.03% swapper [kernel] [k] _raw_read_unlock 19.35% init [kernel] [k] _raw_read_trylock 19.35% init [kernel] [k] _raw_read_unlock 1.61% events/0 [kernel] [k] _raw_read_trylock 1.61% events/0 [kernel] [k] _raw_read_unlock Coming soon: - Support for symbols in the event definition. - Default period to 1 for breakpoint events because these are not high frequency events. The same thing is needed for trace events. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Prasad LKML-Reference: <1258987355-8751-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Prasad commit f5ffe02e5046003ae7e2ce70d3d1c2a73331268b Author: Frederic Weisbecker Date: Mon Nov 23 15:42:34 2009 +0100 perf: Add kernel side syscall events support for breakpoints Add the remaining necessary bits to support breakpoints created through perf syscall. We don't use the software counter interface as: - We don't need to check against recursion, this is already done in hardware breakpoints arch level. - We already know the perf event we are dealing with when the event is to be committed. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Prasad LKML-Reference: <1258987355-8751-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit fdf6bc95229821e3d9405eba28925b76e92b74d0 Author: Frederic Weisbecker Date: Mon Nov 23 15:42:33 2009 +0100 hw-breakpoints: Check the breakpoint params from perf tools Perf tools create perf events as disabled in the beginning. Breakpoints are then considered like ptrace temporary breakpoints, only meant to reserve a breakpoint slot until we get all the necessary information from the user. In this case, we don't check the address that is breakpointed as it is NULL in the ptrace case.
But perf tools don't have the same purpose, events are created disabled to wait for all events to be created before enabling all of them. We want to check the breakpoint parameters in this case. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Prasad LKML-Reference: <1258987355-8751-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit e6db4876575f3fdd5b1df2cbff826df95ab9af6a Author: Frederic Weisbecker Date: Mon Nov 23 15:42:32 2009 +0100 hw-breakpoints: Include only linux/perf_event.h from kernel part of bp headers As userspace only needs the breakpoints enum types from the breakpoints headers. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Prasad LKML-Reference: <1258987355-8751-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit ba6909b719a5ccc0c8100d2895bb7ff557b2eeae Author: K.Prasad Date: Mon Nov 23 21:17:13 2009 +0530 hw-breakpoint: Attribute authorship of hw-breakpoint related files Attribute authorship to developers of hw-breakpoint related files. Signed-off-by: K.Prasad Cc: Alan Stern Cc: Frederic Weisbecker LKML-Reference: <20091123154713.GA5593@in.ibm.com> [ v2: moved it to latest -tip ] Signed-off-by: Ingo Molnar commit acd1d7c1f8f3d848a3c5327dc09f8c1efb971678 Author: Peter Zijlstra Date: Mon Nov 23 15:00:36 2009 +0100 perf_events: Restore sanity to scaling land It is quite possible to call update_event_times() on a context that isn't actually running and thereby confuse the thing. perf stat was reporting !100% scale values for software counters (2e2af50b perf_events: Disable events when we detach them, solved the worst of that, but there was still some left). The thing that happens is that because we are not self-reaping (we have a caring parent) there is a time between the last schedule (out) and having do_exit() called which will detach the events. 
This period would be accounted as enabled,!running because the event->state==INACTIVE, even though !event->ctx->is_active. Similar issues could have been observed by calling read() on an event while the attached task was not scheduled in. Solve this by teaching update_event_times() about ctx->is_active. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <1258984836.4531.480.camel@laptop> Signed-off-by: Ingo Molnar commit be831297716036de5b24308447ecb69f1706a846 Author: Joerg Roedel Date: Mon Nov 23 12:50:00 2009 +0100 x86/amd-iommu: attach devices to pre-allocated domains early For some devices the ACPI table may define unity map requirements which must be met when the IOMMU is enabled. So we need to attach devices to their domains as early as possible so that these mappings are in place when needed. This patch assigns the domains right after they are allocated. Otherwise this can result in I/O page faults before a driver binds to a device and BIOS is still using it. Cc: stable@kernel.org Signed-off-by: Joerg Roedel commit 9f800de38b05d84809e89f16671d636a140eede7 Author: Joerg Roedel Date: Mon Nov 23 12:45:25 2009 +0100 x86/amd-iommu: un__init iommu_setup_msi This function may be called on the resume path and can not be dropped after booting. Cc: stable@kernel.org Signed-off-by: Joerg Roedel commit 4ed7c92d68a5387ba5f7030dc76eab03558e27f5 Author: Peter Zijlstra Date: Mon Nov 23 11:37:29 2009 +0100 perf_events: Undo some recursion damage Make perf_swevent_get_recursion_context return a context number and disable preemption. This could be used to remove the IRQ disable from the trace bit and index the per-cpu buffer with.
Signed-off-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Paul Mackerras LKML-Reference: <20091123103819.993226816@chello.nl> Signed-off-by: Ingo Molnar commit f67218c3e93abaf0f480bb94b53d234853ffe4de Author: Peter Zijlstra Date: Mon Nov 23 11:37:27 2009 +0100 perf_events: Fix __perf_event_exit_task() vs. update_event_times() locking Move the update_event_times() call in __perf_event_exit_task() into list_del_event() because that holds the proper lock (ctx->lock) and seems a more natural place to do the last time update. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker LKML-Reference: <20091123103819.842455480@chello.nl> Signed-off-by: Ingo Molnar commit 5e942bb33371254a474653123cd9e13a4c89ee44 Author: Peter Zijlstra Date: Mon Nov 23 11:37:26 2009 +0100 perf_events: Update the context time on exit It appeared we did call update_event_times() on exit, but we failed to update the context time, which renders the former moot. Locking is a bit iffy, we call update_event_times under ctx->mutex instead of ctx->lock - the next patch fixes this. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker LKML-Reference: <20091123103819.764207355@chello.nl> Signed-off-by: Ingo Molnar commit 2e2af50b1fab3c40636839a7f439c167ae559533 Author: Peter Zijlstra Date: Mon Nov 23 11:37:25 2009 +0100 perf_events: Disable events when we detach them If we leave the event in STATE_INACTIVE, any read of the event after the detach will increase the running count but not the enabled count and cause funny scaling artefacts. 
Signed-off-by: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker LKML-Reference: <20091123103819.689055515@chello.nl> Signed-off-by: Ingo Molnar commit 6c2bfcbe58e0dd39554be88940149f5aa11e17d1 Author: Peter Zijlstra Date: Mon Nov 23 11:37:24 2009 +0100 perf_events: Fix style nits Signed-off-by: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker LKML-Reference: <20091123103819.613427378@chello.nl> Signed-off-by: Ingo Molnar commit a66a3052e2d4c5815d7ad26887b1d4193206e691 Author: Peter Zijlstra Date: Mon Nov 23 11:37:23 2009 +0100 perf_events: Undo copy/paste damage We had two almost identical functions, avoid the duplication. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091123103819.537537928@chello.nl> Signed-off-by: Ingo Molnar commit a4234bfcf4d72a10a99176cdef007345e9c3b4aa Author: Ingo Molnar Date: Mon Nov 23 10:57:59 2009 +0100 perf_events: Optimize the swcounter hotpath The structure init creates a bit memcpy, which shows up big time in perf annotate output: : ffffffff810a859d <__perf_sw_event>: 1.68 : ffffffff810a859d: 55 push %rbp 1.69 : ffffffff810a859e: 41 89 fa mov %edi,%r10d 0.01 : ffffffff810a85a1: 49 89 c9 mov %rcx,%r9 0.00 : ffffffff810a85a4: 31 c0 xor %eax,%eax 1.71 : ffffffff810a85a6: b9 16 00 00 00 mov $0x16,%ecx 0.00 : ffffffff810a85ab: 48 89 e5 mov %rsp,%rbp 0.00 : ffffffff810a85ae: 48 83 ec 60 sub $0x60,%rsp 1.52 : ffffffff810a85b2: 48 8d 7d a0 lea -0x60(%rbp),%rdi 85.20 : ffffffff810a85b6: f3 ab rep stos %eax,%es:(%rdi) None of the callees depends on the structure being pre-initialized, so only initialize ->addr. This gets rid of the memcpy overhead. 
Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: Signed-off-by: Ingo Molnar commit 0e7810be30f66e9f430c4ce2cd3b14634211690f Author: Jan Beulich Date: Fri Nov 20 14:00:14 2009 +0000 x86: Suppress stack overrun message for init_task init_task doesn't get its stack end location set to STACK_END_MAGIC, and hence the message is confusing rather than helpful in this case. Signed-off-by: Jan Beulich LKML-Reference: <4B06AEFE02000078000211F4@vpn.id2.novell.com> Signed-off-by: Ingo Molnar commit 457dc928f586f3f4b930206965e6db270034e97e Author: Ingo Molnar Date: Mon Nov 23 11:03:28 2009 +0100 tracing, function tracer: Clean up strstrip() usage Clean up strstrip() usage - which also addresses this build warning: kernel/trace/ftrace.c: In function 'ftrace_pid_write': kernel/trace/ftrace.c:3004: warning: ignoring return value of 'strstrip', declared with attribute warn_unused_result Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo LKML-Reference: Signed-off-by: Ingo Molnar commit 6e3d8330ae2c4b2c11a9577a0130d2ecda1c610d Author: Ingo Molnar Date: Mon Nov 23 10:19:20 2009 +0100 perf events: Do not generate function trace entries in perf code Decreases perf overhead when function tracing is enabled, by about 50%. Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: Signed-off-by: Ingo Molnar commit 163d3866cfa79aa5945f1ee5e43fb3ed1455f75c Author: Yinghai Lu Date: Sat Nov 21 00:23:37 2009 -0800 x86: apic: Print out SRAT table APIC id in hex Make it consistent with APIC MADT print out, for big systems APIC id in hex is more readable. 
Signed-off-by: Yinghai Lu LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar commit 37ef2a3029fde884808ff1b369677abc7dd9a79a Author: Yinghai Lu Date: Sat Nov 21 00:23:37 2009 -0800 x86: Re-get cfg_new in case reuse/move irq_desc When irq_desc is moved, we need to make sure to use the right cfg_new. Signed-off-by: Yinghai Lu LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar commit e670761f12f4069d204f433bf547d9c679a4fd05 Author: Yinghai Lu Date: Sat Nov 21 00:23:37 2009 -0800 x86: apic: Remove not needed #ifdef Suresh made dmar_table_init() already have that protection. Signed-off-by: Yinghai Lu LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar commit bfd451184d80301d1ae970b1ebffde1e9c6240f9 Author: Simon Kaempflein Date: Mon Nov 16 15:25:53 2009 +1000 perf record, x86: Print more intelligent error message when sampling fails Print more accurate error message when "perf record" fails because there is no APIC support, on x86. Signed-off-by: Ingo Molnar commit 98e4833ba3c314c99dc364012fba6ac894230ad0 Author: Ingo Molnar Date: Mon Nov 23 08:03:09 2009 +0100 ring-buffer benchmark: Run producer/consumer threads at nice +19 The ring-buffer benchmark threads run on nice 0 by default, using up a lot of CPU time and slowing down the system: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1024 root 20 0 0 0 0 D 95.3 0.0 4:01.67 rb_producer 1023 root 20 0 0 0 0 R 93.5 0.0 2:54.33 rb_consumer 21569 mingo 40 0 14852 1048 772 R 3.6 0.1 0:00.05 top 1 root 40 0 4080 928 668 S 0.0 0.0 0:23.98 init Renice them to +19 to make them less intrusive. Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Mike Galbraith LKML-Reference: Signed-off-by: Ingo Molnar commit 81516c5fc83a13a1d12f466aa7e14f5fd62a63ce Author: Michael S. 
Tsirkin Date: Sun Nov 22 14:13:35 2009 +0200 perf: Use default compiler mode by default gcc with no flags typically is a sane default for systems to use, and looking at the running kernel is probably broken for cross-builds anyway, so let's not do this. Add EXTRA_CFLAGS so that users can override default gcc mode if they want to. Signed-off-by: Michael S. Tsirkin Acked-by: Arjan van de Ven Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091122121335.GA24254@redhat.com> Signed-off-by: Ingo Molnar commit 85c3b529f8ad4d65ba86b982ef050212ae7dd976 Author: Eric Paris Date: Fri Nov 20 11:00:12 2009 -0500 SELinux: header generation may hit infinite loop If a permission name is long enough the selinux class definition generation tool will go into an infinite loop. This is because its macro max() is fooled into thinking it is dealing with unsigned numbers. This patch makes sure the macro always uses signed numbers so 1 > -1. Signed-off-by: Eric Paris Signed-off-by: James Morris commit 6ebb237bece23275d1da149b61a342f0d4d06a08 Author: Paul E. McKenney Date: Sun Nov 22 08:53:50 2009 -0800 rcu: Re-arrange code to reduce #ifdef pain Remove #ifdefs from kernel/rcupdate.c and include/linux/rcupdate.h by moving code to include/linux/rcutiny.h, include/linux/rcutree.h, and kernel/rcutree.c. Also remove some definitions that are no longer used. Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1258908830885-git-send-email-> Signed-off-by: Ingo Molnar commit 9f680ab41485edfdc96331b70afa7513aa0a7720 Author: Paul E.
McKenney Date: Sun Nov 22 08:53:49 2009 -0800 rcu: Eliminate unneeded function wrapping The function rcu_init() is a wrapper for __rcu_init(), and also sets up the CPU-hotplug notifier for rcu_barrier_cpu_hotplug(). But TINY_RCU doesn't need CPU-hotplug notification, and the rcu_barrier_cpu_hotplug() is a simple wrapper for rcu_cpu_notify(). So push rcu_init() out to kernel/rcutree.c and kernel/rcutiny.c and get rid of the wrapper function rcu_barrier_cpu_hotplug(). Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12589088302320-git-send-email-> Signed-off-by: Ingo Molnar commit b668c9cf3e58739dac54a1d6f42f2b4bdd980b3e Author: Paul E. McKenney Date: Sun Nov 22 08:53:48 2009 -0800 rcu: Fix grace-period-stall bug on large systems with CPU hotplug When the last CPU of a given leaf rcu_node structure goes offline, all of the tasks queued on that leaf rcu_node structure (due to having blocked in their current RCU read-side critical sections) are requeued onto the root rcu_node structure. This requeuing is carried out by rcu_preempt_offline_tasks(). However, it is possible that these queued tasks are the only thing preventing the leaf rcu_node structure from reporting a quiescent state up the rcu_node hierarchy. Unfortunately, the old code would fail to do this reporting, resulting in a grace-period stall given the following sequence of events: 1. Kernel built for more than 32 CPUs on 32-bit systems or for more than 64 CPUs on 64-bit systems, so that there is more than one rcu_node structure. (Or CONFIG_RCU_FANOUT is artificially set to a number smaller than CONFIG_NR_CPUS.) 2. The kernel is built with CONFIG_TREE_PREEMPT_RCU. 3.
A task running on a CPU associated with a given leaf rcu_node structure blocks while in an RCU read-side critical section -and- that CPU has not yet passed through a quiescent state for the current RCU grace period. This will cause the task to be queued on the leaf rcu_node's blocked_tasks[] array, in particular, on the element of this array corresponding to the current grace period. 4. Each of the remaining CPUs corresponding to this same leaf rcu_node structure passes through a quiescent state. However, the task is still in its RCU read-side critical section, so these quiescent states cannot be reported further up the rcu_node hierarchy. Nevertheless, all bits in the leaf rcu_node structure's ->qsmask field are now zero. 5. Each of the remaining CPUs goes offline. (The events in steps #4 and #5 can happen in any order as long as each CPU passes through a quiescent state before going offline.) 6. When the last CPU goes offline, __rcu_offline_cpu() will invoke rcu_preempt_offline_tasks(), which will move the task to the root rcu_node structure, but without reporting a quiescent state up the rcu_node hierarchy (and this failure to report a quiescent state is the bug). But because this leaf rcu_node structure's ->qsmask field is already zero and its ->blocked_tasks[] entries are all empty, force_quiescent_state() will skip this rcu_node structure. Therefore, grace periods are now hung. This patch abstracts some code out of rcu_read_unlock_special(), calling the result task_quiet() by analogy with cpu_quiet(), and invokes task_quiet() from both rcu_read_unlock_special() and __rcu_offline_cpu(). Invoking task_quiet() from __rcu_offline_cpu() reports the quiescent state up the rcu_node hierarchy, fixing the bug. This ends up requiring a separate lock_class_key per level of the rcu_node hierarchy, which this patch also provides. Signed-off-by: Paul E.
McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12589088301770-git-send-email-> Signed-off-by: Ingo Molnar commit 50e5095afa8c2be0f35e5c0e21d5f7912340e8f2 Author: Arnaldo Carvalho de Melo Date: Sun Nov 22 14:59:22 2009 -0200 perf report: Do map lookups in resolve_callchain() Bug introduced in 439d473b4777de510e1322168ac6f2f377ecd5bc, making the initial map be used for all IPs, so that symbols outside this initial map would either be erroneously resolved or not resolve at all. Reported-by: Ingo Molnar Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258909162-28496-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 87f8ea4cd3680ef7f4da4391aed97abb25eae333 Author: Arnaldo Carvalho de Melo Date: Sun Nov 22 13:21:41 2009 -0200 perf symbols: Show messages about module loading only if verbose >= 1 Suggested-by: Ingo Molnar Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258903301-20584-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit b197c7ef7169bd5f11fb9d803b322d0daef7e256 Author: Michael S. Tsirkin Date: Sun Nov 22 15:13:11 2009 +0200 perf tools: Suggest static libraries as well On error, suggest installing static libraries along with shared libraries. Signed-off-by: Michael S. Tsirkin Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091122131311.GA24318@redhat.com> Signed-off-by: Ingo Molnar commit 7baed9af4bf0d7850045e36d19a43a2c76872b62 Author: Michael S. 
Tsirkin Date: Sun Nov 22 13:27:27 2009 +0200 perf tools: Add V=2 option to help debug config issues Make standard error show up on console when V=2 is set. Signed-off-by: Michael S. Tsirkin Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091122112726.GC13644@redhat.com> Signed-off-by: Ingo Molnar commit 78a14e273d93dfbea9673f9b10398c538096302d Author: Julia Lawall Date: Sat Nov 21 12:50:34 2009 +0100 drivers/pcmcia: remove unnecessary kzalloc The result of calling kzalloc is never used or freed. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // @r exists@ local idexpression x; statement S; expression E; identifier f,f1,l; position p1,p2; expression *ptr != NULL; @@ x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...); ... if (x == NULL) S <... when != x when != if (...) { <+...x...+> } ( x->f1 = E | (x->f1 == NULL || ...) | f(...,x->f1,...) ) ...> ( return \(0\|<+...x...+>\|ptr\); | return@p2 ...; ) @script:python@ p1 << r.p1; p2 << r.p2; @@ print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line) // Signed-off-by: Julia Lawall Signed-off-by: Dominik Brodowski commit 645e8cc0c9f01f07f384fd522b782e5e6ae9de18 Author: Ingo Molnar Date: Sun Nov 22 12:20:19 2009 +0100 perf_events: Fix modular build Fix: ERROR: "perf_swevent_put_recursion_context" [fs/ext4/ext4.ko] undefined! ERROR: "perf_swevent_get_recursion_context" [fs/ext4/ext4.ko] undefined! 
Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Jason Baron LKML-Reference: <1258864015-10579-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit e57cfcdac6badd846a1cd831de54a1359c2d1eea Author: Pekka Enberg Date: Sun Nov 22 12:29:44 2009 +0200 perf symbols: Fix ELF header errors during "perf kmem record" The write_event() function in builtin-record.c writes out all mmap()'d DSOs including non-ELF files like GNOME resource files and such. Therefore, check for ELF_K_ELF in filename__read_build_id() before attempting to read the ELF header with gelf_getehdr(). Fixes the following error messages when running "perf kmem record": penberg@penberg-laptop:~/src/linux/tools/perf$ perf kmem record ^C[ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.753 MB perf.data (~32885 samples) ] filename__read_build_id: cannot get elf header. filename__read_build_id: cannot get elf header. filename__read_build_id: cannot get elf header. filename__read_build_id: cannot get elf header. filename__read_build_id: cannot get elf header. filename__read_build_id: cannot get elf header. filename__read_build_id: cannot get elf header. filename__read_build_id: cannot get elf header. filename__read_build_id: cannot get elf header. Signed-off-by: Pekka Enberg Cc: Arnaldo Carvalho de Melo Cc: Li Zefan Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Steven Rostedt LKML-Reference: <1258885784-11709-1-git-send-email-penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar commit f3ced7cdb24e7968a353d828955fa2daf4167e72 Author: Pekka Enberg Date: Sun Nov 22 11:58:00 2009 +0200 perf kmem: Add --sort hit and --sort frag This patch adds support for "--sort hit" and "--sort frag" to the "perf kmem" tool. The former was already mentioned in the help text and the latter is useful for finding call-sites that exhibit worst case behavior for SLAB allocators. 
Signed-off-by: Pekka Enberg Cc: Li Zefan Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Steven Rostedt Cc: Eduard - Gabriel Munteanu Cc: linux-mm@kvack.org LKML-Reference: <1258883880-7149-1-git-send-email-penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar commit 96b02d78a7e47cd189f6b307c5513fec6b2155dc Author: Márton Németh Date: Sat Nov 21 23:10:15 2009 +0100 perf_event: Remove redundant zero fill The buffer is first zeroed out by memset(). Then strncpy() is used to fill the content. The strncpy() function also pads the string till the end of the specified length, which is redundant. The strncpy() does not ensures that the string will be properly closed with 0. Use strlcpy() instead. The semantic match that finds this kind of pattern is as follows: (http://coccinelle.lip6.fr/) // @@ expression buffer; expression size; expression str; @@ memset(buffer, 0, size); ... - strncpy( + strlcpy( buffer, str, sizeof(buffer) ); @@ expression buffer; expression size; expression str; @@ memset(&buffer, 0, size); ... - strncpy( + strlcpy( &buffer, str, sizeof(buffer)); @@ expression buffer; identifier field; expression size; expression str; @@ memset(buffer, 0, size); ... - strncpy( + strlcpy( buffer->field, str, sizeof(buffer->field) ); @@ expression buffer; identifier field; expression size; expression str; @@ memset(&buffer, 0, size); ... - strncpy( + strlcpy( buffer.field, str, sizeof(buffer.field)); // On strncpy() vs strlcpy() see http://www.gratisoft.us/todd/papers/strlcpy.html . Signed-off-by: Márton Németh Cc: Julia Lawall Cc: cocci@diku.dk Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <4B086547.5040100@freemail.hu> Signed-off-by: Ingo Molnar commit 12eac0bf0461910ae6dd7f071f156f75461a37cf Author: Hitoshi Mitake Date: Fri Nov 20 12:37:17 2009 +0900 perf bench: Make the mem/memcpy tests more user-friendly mem-memcpy.c uses perf event system calls to obtain CPU clocks. And it suddenly dies with BUG_ON() when it running on Linux doesn't support perf event. 
Also fail at calloc() can occur easily when too large length is passed. Fail of calloc() causes sudden death with assert(). These behaviours are not friendly. So I fixed the treating of errors. Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker LKML-Reference: <1258688237-3797-1-git-send-email-mitake@dcl.info.waseda.ac.jp> [ v2: improved a few small details ] Signed-off-by: Ingo Molnar commit 5093ebad5f2348076fdc3dac7d2358b1ad7f85f7 Author: Frederic Weisbecker Date: Sun Nov 22 05:21:35 2009 +0100 hw-breakpoints: Separate the kernel part from breakpoint headers So that we can include this header from userspace tools, like perf tools, to get the breakpoint types and len definitions. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Prasad LKML-Reference: <1258863695-10464-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit b3a75542d329ce4e1c66b293cefeb4429a2af043 Author: Frederic Weisbecker Date: Sun Nov 22 05:21:34 2009 +0100 hw-breakpoints: Remove x86 specific headers from core file Remove asm/processor.h and asm/debugreg.h as these headers are not used anymore in the hw-breakpoints core file. Signed-off-by: Frederic Weisbecker Cc: Prasad LKML-Reference: <1258863695-10464-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 28889bf9e2db29747d58cd47a92d727f927c3aee Author: Frederic Weisbecker Date: Sun Nov 22 05:21:33 2009 +0100 tracing: Forget about the NMI buffer for syscall events We are never in an NMI context when we commit a syscall trace to perf. So just forget about the nmi buffer there. 
Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Steven Rostedt Cc: Jason Baron LKML-Reference: <1258863695-10464-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit ce71b9df8893ec954e56c5979df6da274f20f65e Author: Frederic Weisbecker Date: Sun Nov 22 05:26:55 2009 +0100 tracing: Use the perf recursion protection from trace event When we commit a trace to perf, we first check if we are recursing in the same buffer so that we don't mess-up the buffer with a recursing trace. But later on, we do the same check from perf to avoid commit recursion. The recursion check is desired early before we touch the buffer but we want to do this check only once. Then export the recursion protection from perf and use it from the trace events before submitting a trace. v2: Put appropriate Reported-by tag Reported-by: Peter Zijlstra Signed-off-by: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Steven Rostedt Cc: Masami Hiramatsu Cc: Jason Baron LKML-Reference: <1258864015-10579-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit e25613683bd5c46d3e8c8ae6416dccc9f357dcdc Author: Arnaldo Carvalho de Melo Date: Sat Nov 21 14:31:26 2009 -0200 perf trace: Read_tracing_data should die() another day It better propagate errors, also if we do a simple: [root@doppio linux-2.6-tip]# perf record -R -a -f sleep 3s ; perf trace [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.182 MB perf.data (~7972 samples) ] Fatal: not an trace data file [root@doppio linux-2.6-tip]# That is what is expected, right? I.e. as we didn't specify any tracepoint event via -e, it should gracefully bail out and not SEGFAULT. 
Signed-off-by: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258821086-11521-3-git-send-email-acme@infradead.org> [ Fixed the error messages some more ] Signed-off-by: Ingo Molnar commit c12e15e71d4b32da045e798ffd21cbb6197d1c65 Author: Arnaldo Carvalho de Melo Date: Sat Nov 21 14:31:25 2009 -0200 perf symbols: Old versions of elf.h don't have NT_GNU_BUILD_ID Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258821086-11521-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 90c83218c32d7c474da810cd3c9973a43ecbcb9b Author: Arnaldo Carvalho de Melo Date: Sat Nov 21 14:31:24 2009 -0200 perf symbols: Fixup kernel_maps__fixup_end end map We better call this routine after both the kernel and modules are loaded, because as it was if there weren't modules it would not be called, resulting in kernel_map->end remaining at zero, so no map would be found and consequently the kernel symtab wouldn't get loaded, i.e. no kernel symbols would be resolved. Also this fixes another case, that is when we _have_ modules, but the last map would have its ->end address not set before we loaded its symbols, which would never happen because ->end was not set. Reported-by: Ingo Molnar Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258821086-11521-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 8904b18046c2f050107f6449e887e7c1142b9ab9 Author: Stephane Eranian Date: Fri Nov 20 22:19:57 2009 +0100 perf_events: Fix default watermark calculation This patch fixes the default watermark value for the sampling buffer. With the existing calculation (watermark = max(PAGE_SIZE, max_size / 2)), no notification was ever received when the buffer was exactly 1 page. 
This was because you would never cross the threshold (there are no partial samples). In certain configurations, there was no possibility of detecting the problem because there was not enough space left to store the LOST record. In fact, there may be a more generic problem here. The kernel should ensure that there is always enough space to store one LOST record. This patch sets the default watermark to half the buffer size. With such a limit, we are guaranteed to get a notification even with a single page buffer assuming no sample is bigger than a page. Signed-off-by: Stephane Eranian Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212509.344964101@chello.nl> Signed-off-by: Ingo Molnar LKML-Reference: <1256302576-6169-1-git-send-email-eranian@gmail.com> commit 6f10581aeaa5543a3b7a8c7a87a064375ec357f8 Author: Peter Zijlstra Date: Fri Nov 20 22:19:56 2009 +0100 perf: Fix locking for PERF_FORMAT_GROUP We should hold event->child_mutex when iterating the inherited counters, we should hold ctx->mutex when iterating siblings. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212509.251030114@chello.nl> Signed-off-by: Ingo Molnar commit 59ed446f792cc07d37b1536b9c4664d14e25e425 Author: Peter Zijlstra Date: Fri Nov 20 22:19:55 2009 +0100 perf: Fix event scaling for inherited counters Properly account the full hierarchy of counters for both the count (we already did so) and the scale times (new). Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212509.153379276@chello.nl> Signed-off-by: Ingo Molnar commit 2b8988c9f7defe319cffe0cd362a7cd356c86f62 Author: Peter Zijlstra Date: Fri Nov 20 22:19:54 2009 +0100 perf: Fix time locking Most sites updating ctx->time and event times do so under ctx->lock, make sure they all do. This was made possible by removing the __perf_event_read() call from __perf_event_sync_stat(), which already had this lock taken.
Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212509.102316434@chello.nl> Signed-off-by: Ingo Molnar commit 58e5ad1de3d6ad931c84f0cc8ef0655c922f30ad Author: Peter Zijlstra Date: Fri Nov 20 22:19:53 2009 +0100 perf: Simplify __perf_event_read cpuctx is always active, task context is always active for current the previous condition verifies that if its a task context its for current, hence we can assume ctx->is_active. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212509.000272254@chello.nl> Signed-off-by: Ingo Molnar commit 3dbebf15c5d3e265f751eec72c1538a00da4be27 Author: Peter Zijlstra Date: Fri Nov 20 22:19:52 2009 +0100 perf: Simplify __perf_event_sync_stat Removes constraints from __perf_event_read() by leaving it with a single callsite; this callsite had ctx->lock held, the other one does not. Removes some superfluous code from __perf_event_sync_stat(). Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.918544317@chello.nl> Signed-off-by: Ingo Molnar commit f6f83785222b0ee037f7be90731f62a649292b5e Author: Peter Zijlstra Date: Fri Nov 20 22:19:51 2009 +0100 perf: Optimize __perf_event_read() Both callers actually have IRQs disabled, no need doing so again. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.863685796@chello.nl> Signed-off-by: Ingo Molnar commit 02ffdbc866c8b1c8644601e9aa6155700eed4c91 Author: Peter Zijlstra Date: Fri Nov 20 22:19:50 2009 +0100 perf: Optimize perf_event_task_sched_out Remove an update_context_time() call from the perf_event_task_sched_out() path and into the branch its needed. The call was both superfluous, because __perf_event_sched_out() already does it, and wrong, because it was done without holding ctx->lock. Place it in perf_event_sync_stat(), which is the only place it is needed and which does already hold ctx->lock. 
Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.779516394@chello.nl> Signed-off-by: Ingo Molnar commit abf4868b8548cae18d4fe8bbfb4e207443be01be Author: Peter Zijlstra Date: Fri Nov 20 22:19:49 2009 +0100 perf: Fix PERF_FORMAT_GROUP scale info As Corey reported, the total_enabled and total_running times could occasionally be 0, even though there were events counted. It turns out this is because we record the times before reading the counter while the latter updates the times. This patch corrects that. While looking at this code I found that there is a lot of locking iffyness around, the following patches correct most of that. Reported-by: Corey Ashford Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.685559857@chello.nl> Signed-off-by: Ingo Molnar commit f6d9dd237da400effb265f3554c64413f8a3e7b4 Author: Peter Zijlstra Date: Fri Nov 20 22:19:48 2009 +0100 perf: Optimize perf_event_mmap_ctx() Remove a rcu_read_{,un}lock() pair and a few conditionals. We can remove the rcu_read_lock() by increasing the scope of one in the calling function. We can do away with the system_state check if the machine still boots after this patch (seems to be the case). We can do away with the list_empty() check because the bare list_for_each_entry_rcu() reduces to that now that we've removed everything else. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.606459548@chello.nl> Signed-off-by: Ingo Molnar commit f6595f3a9680c86b6332f881a7ae2cbbcfdc8619 Author: Peter Zijlstra Date: Fri Nov 20 22:19:47 2009 +0100 perf: Optimize perf_event_comm_ctx() Remove a rcu_read_{,un}lock() pair and a few conditionals. We can remove the rcu_read_lock() by increasing the scope of one in the calling function. We can do away with the system_state check if the machine still boots after this patch (seems to be the case). 
We can do away with the list_empty() check because the bare list_for_each_entry_rcu() reduces to that now that we've removed everything else. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.527608793@chello.nl> Signed-off-by: Ingo Molnar commit d6ff86cfb50a72df820e7e839836d55d245306fb Author: Peter Zijlstra Date: Fri Nov 20 22:19:46 2009 +0100 perf: Optimize perf_event_task_ctx() Remove a rcu_read_{,un}lock() pair and a few conditionals. We can remove the rcu_read_lock() by increasing the scope of one in the calling function. We can do away with the system_state check if the machine still boots after this patch (seems to be the case). We can do away with the list_empty() check because the bare list_for_each_entry_rcu() reduces to that now that we've removed everything else. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.452227115@chello.nl> Signed-off-by: Ingo Molnar commit 81520183878a8813c71c9372de28bb70913ba549 Author: Peter Zijlstra Date: Fri Nov 20 22:19:45 2009 +0100 perf: Optimize perf_swevent_ctx_event() Remove a rcu_read_{,un}lock() pair and a few conditionals. We can remove the rcu_read_lock() by increasing the scope of one in the calling function. We can do away with the system_state check if the machine still boots after this patch (seems to be the case). We can do away with the list_empty() check because the bare list_for_each_entry_rcu() reduces to that now that we've removed everything else. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.378188589@chello.nl> Signed-off-by: Ingo Molnar commit 0cff784ae41cc125368ae77f1c01328ae2fdc6b3 Author: Peter Zijlstra Date: Fri Nov 20 22:19:44 2009 +0100 perf: Optimize some swcounter attr.sample_period==1 paths Avoid the rather expensive perf_swevent_set_period() if we know we have to sample every single event anyway. 
Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.299508332@chello.nl> Signed-off-by: Ingo Molnar commit 453f19eea7dbad837425e9b07d84568d14898794 Author: Peter Zijlstra Date: Fri Nov 20 22:19:43 2009 +0100 perf: Allow for custom overflow handlers in-kernel perf users might wish to have custom actions on the sample interrupt. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091120212508.222339539@chello.nl> Signed-off-by: Ingo Molnar commit ef6ae724253429ac70d81e65d052f6a346d330bd Author: Arnaldo Carvalho de Melo Date: Fri Nov 20 20:51:29 2009 -0200 perf symbols: Change the kernel DSO name if it comes from kallsyms So that the user have a clearer indication about the source of the symbols, as we only state buildid mismatches in verbose mode, because 'perf top' would overwrite such warning anyway. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258757489-5978-6-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit fbd733b815a5a57d7eb0d904edc49d18fd12df5c Author: Arnaldo Carvalho de Melo Date: Fri Nov 20 20:51:28 2009 -0200 perf symbols: Check vmlinux buildid E.g.: [root@doppio linux-2.6-tip]# perf top -v --vmlinux ../build/tip/vmlinux > /dev/null build_id in vmlinux is e96699725a47413a50c231864a8e7a8ced40a31b while expected is 18e7cc53db62a7d35e9d6f6c9ddc23017d38ee9a, ignoring it I.e. perf top was told to use a vmlinux file that is not the one currently running on the machine, it ignores it and falls back to using /proc/kallsyms. This solves many, at first, mysterious results when people have a stale vmlinux file while keeping the default of trying to use the vmlinux file in the current directory in things like 'perf annotate' where the DWARF info is required and thus we can't use just /proc/kallsyms. 
Modules buildids are already being checked as of the previous changeset in this series, because we are using the default dso__load routine, that will look at a series of places looking for the best file with a matching buildid, starting in the -debuginfo directories. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258757489-5978-5-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit c338aee853db197e1855b393e6d6cc667784537f Author: Arnaldo Carvalho de Melo Date: Fri Nov 20 20:51:27 2009 -0200 perf symbols: Do lazy symtab loading for the kernel & modules too Just like we do with the other DSOs. This also simplifies the kernel_maps setup process, now all that the tools need to do is to call kernel_maps__init and the maps for the modules and kernel will be created, then, later, when kernel_maps__find_symbol() is used, it will also call maps__find_symbol that already checks if the symtab was loaded, loading it if needed. Now if one does 'perf top --hide_kernel_symbols' we won't pay the price of loading the (many) symbols in /proc/kallsyms or vmlinux. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258757489-5978-4-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 78075caad99dc36ec6ef5826b7a5273ea14295fc Author: Arnaldo Carvalho de Melo Date: Fri Nov 20 20:51:26 2009 -0200 perf symbols: Introduce dso__build_id_equal Will be used in more places. 
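As an illustration of what a build-id equality helper has to get right, here is a minimal sketch. It is not the body of dso__build_id_equal, which this log does not show: the function names below are hypothetical, and the only assumption is that a raw build-id is a byte string whose formatted form is its lowercase hex encoding. The build-id value is the one printed for vmlinux elsewhere in this log.

```python
def build_id_equal(raw_a, raw_b):
    """Compare two raw build-ids (hypothetical helper, bytes against bytes)."""
    return raw_a == raw_b

def format_build_id(raw):
    """Render a raw build-id as a lowercase hex string, as tools print it."""
    return raw.hex()

# Build-id reported for the running vmlinux in this log, as raw bytes.
recorded = bytes.fromhex("18e7cc53db62a7d35e9d6f6c9ddc23017d38ee9a")
from_file = bytes.fromhex("18e7cc53db62a7d35e9d6f6c9ddc23017d38ee9a")
stale = bytes.fromhex("e96699725a47413a50c231864a8e7a8ced40a31b")

print(build_id_equal(recorded, from_file))  # same bytes on both sides
print(build_id_equal(recorded, stale))      # stale vmlinux, different bytes

# Mixing representations never matches, even for the same build-id:
# 20 raw bytes against 40 ASCII hex characters.
print(build_id_equal(recorded, format_build_id(recorded).encode()))
```

Keeping both operands in the raw form is the design point: a formatted string and a raw byte array differ in length and content even when they describe the same build-id.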
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258757489-5978-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit fd7a346ea292074e9f6cdb5232a57c56bf98fdc9 Author: Arnaldo Carvalho de Melo Date: Fri Nov 20 20:51:25 2009 -0200 perf symbols: Filename__read_build_id should look at .notes section too In the kernel we have more than one notes section, so the linker script combines all and puts them into a ".notes" combined section. So we need to look at both sections and also traverse them looking at multiple GElf_Nhdr entries till we find the one we want, with the build_id. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258757489-5978-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 6671cb1674e69e2aba3d610714bdd3e97a7b51ff Author: Arnaldo Carvalho de Melo Date: Fri Nov 20 20:51:24 2009 -0200 perf symbols: Remove unrelated actions from dso__load_kernel_sym It should just load kernel symbols, not load the list of modules. There are more stuff to move to other routines, but lets do it in several steps. End goal is to be able to defer symbol table loading till we find a hit for that map address range. So that the kernel & modules are handled just like all the other DSOs in the system. 
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258757489-5978-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 96200591a34f8ecb98481c626125df43a2463b55 Merge: 7031281 68efa37 Author: Ingo Molnar Date: Sat Nov 21 14:07:23 2009 +0100 Merge branch 'tracing/hw-breakpoints' into perf/core Conflicts: arch/x86/kernel/kprobes.c kernel/trace/Makefile Merge reason: hw-breakpoints perf integration is looking good in testing and in reviews, plus conflicts are mounting up - so merge & resolve. Signed-off-by: Ingo Molnar commit 7031281e02bf951a2259849217193fb9d75a9762 Merge: ba77c9e d62d77f Author: Ingo Molnar Date: Sat Nov 21 13:57:35 2009 +0100 Merge branch 'perf/urgent' into perf/core Conflicts: tools/perf/util/symbol.c Merge reason: this fix will get merged in .33, not .32, plus resolve the conflict. Signed-off-by: Ingo Molnar commit 6f5f67267dc4faecd9cba63894de92ca92a608b8 Author: Masami Hiramatsu Date: Fri Nov 20 12:13:14 2009 -0500 x86: insn decoder test checks objdump version Check objdump version before using it for insn decoder build test, because some older objdump can't decode AVX code correctly. Signed-off-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Stephen Rothwell Cc: Randy Dunlap Cc: Jim Keniston LKML-Reference: <20091120171314.6715.30390.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: H. Peter Anvin commit 80509e27e40d7554e576405ed9f5b7966c567112 Author: Masami Hiramatsu Date: Fri Nov 20 12:13:08 2009 -0500 x86: Fix insn decoder test typos Fix postest_verbose to posttest_verbose, and add posttest_64bit option for CONFIG_64BIT != y, since old command just passed '-' instead of '-n' when CONFIG_64BIT is not set. Signed-off-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Stephen Rothwell Cc: Randy Dunlap Cc: Jim Keniston LKML-Reference: <20091120171307.6715.66099.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: H. 
Peter Anvin commit 34769945f7cd9ab470413ffe64426e3ad069f49e Author: Thomas Gleixner Date: Fri Nov 20 11:46:21 2009 +0100 genirq: Fix spurious irq seqfile conversion single_open data argument must be PDE(inode)->data instead of NULL otherwise seq_file->private is always NULL and we always read the spurious data of irq 0. Reported-by: Alexey Dobriyan Signed-off-by: Thomas Gleixner commit ba77c9e11111a172c9e8687fe16a6a173a61916f Author: Li Zefan Date: Fri Nov 20 15:53:25 2009 +0800 perf: Add 'perf kmem' tool This tool is mostly a perf version of kmemtrace-user. The following information is provided by this tool: - the total amount of memory allocated and fragmentation per call-site - the total amount of memory allocated and fragmentation per allocation - total memory allocated and fragmentation in the collected dataset - ... Sample output: # ./perf kmem record ^C # ./perf kmem --stat caller --stat alloc -l 10 ------------------------------------------------------------------------------ Callsite | Total_alloc/Per | Total_req/Per | Hit | Fragmentation ------------------------------------------------------------------------------ 0xc052f37a | 790528/4096 | 790528/4096 | 193 | 0.000% 0xc0541d70 | 524288/4096 | 524288/4096 | 128 | 0.000% 0xc051cc68 | 481600/200 | 481600/200 | 2408 | 0.000% 0xc0572623 | 297444/676 | 297440/676 | 440 | 0.001% 0xc05399f1 | 73476/164 | 73472/164 | 448 | 0.005% 0xc05243bf | 51456/256 | 51456/256 | 201 | 0.000% 0xc0730d0e | 31844/497 | 31808/497 | 64 | 0.113% 0xc0734c4e | 17152/256 | 17152/256 | 67 | 0.000% 0xc0541a6d | 16384/128 | 16384/128 | 128 | 0.000% 0xc059c217 | 13120/40 | 13120/40 | 328 | 0.000% 0xc0501ee6 | 11264/88 | 11264/88 | 128 | 0.000% 0xc04daef0 | 7504/682 | 7128/648 | 11 | 5.011% 0xc04e14a3 | 4216/191 | 4216/191 | 22 | 0.000% 0xc05041ca | 3524/44 | 3520/44 | 80 | 0.114% 0xc0734fa3 | 2104/701 | 1620/540 | 3 | 23.004% 0xc05ec9f1 | 2024/289 | 2016/288 | 7 | 0.395% 0xc06a1999 | 1792/256 | 1792/256 | 7 | 0.000% 0xc0463b9a | 1584/144 
| 1584/144 | 11 | 0.000% 0xc0541eb0 | 1024/16 | 1024/16 | 64 | 0.000% 0xc06a19ac | 896/128 | 896/128 | 7 | 0.000% 0xc05721c0 | 772/12 | 768/12 | 64 | 0.518% 0xc054d1e6 | 288/57 | 280/56 | 5 | 2.778% 0xc04b562e | 157/31 | 154/30 | 5 | 1.911% 0xc04b536f | 80/16 | 80/16 | 5 | 0.000% 0xc05855a0 | 64/64 | 36/36 | 1 | 43.750% ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Alloc Ptr | Total_alloc/Per | Total_req/Per | Hit | Fragmentation ------------------------------------------------------------------------------ 0xda884000 | 1052672/4096 | 1052672/4096 | 257 | 0.000% 0xda886000 | 262144/4096 | 262144/4096 | 64 | 0.000% 0xf60c7c00 | 16512/128 | 16512/128 | 129 | 0.000% 0xf59a4118 | 13120/40 | 13120/40 | 328 | 0.000% 0xdfd4b2c0 | 11264/88 | 11264/88 | 128 | 0.000% 0xf5274600 | 7680/256 | 7680/256 | 30 | 0.000% 0xe8395000 | 5948/594 | 5464/546 | 10 | 8.137% 0xe59c3c00 | 5748/479 | 5712/476 | 12 | 0.626% 0xf4cd1a80 | 3524/44 | 3520/44 | 80 | 0.114% 0xe5bd1600 | 2892/482 | 2856/476 | 6 | 1.245% ... | ... | ... | ... | ... ------------------------------------------------------------------------------ SUMMARY ======= Total bytes requested: 2333626 Total bytes allocated: 2353712 Total bytes wasted on internal fragmentation: 20086 Internal fragmentation: 0.853375% TODO: - show sym+offset in 'callsite' column - show cross node allocation stats - collect more useful stats? - ... 
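The Fragmentation column and the summary line in the sample output above are consistent with the ratio (allocated - requested) / allocated. A minimal sketch that reproduces a few of the figures, assuming that formula; the function name is hypothetical and this is not code from the tool:

```python
def fragmentation_pct(allocated, requested):
    """Percentage of allocated bytes that the caller did not request."""
    if allocated == 0:
        return 0.0
    return 100.0 * (allocated - requested) / allocated

# (total bytes allocated, total bytes requested) copied from the table above.
rows = {
    "0xc04daef0": (7504, 7128),
    "0xc0734fa3": (2104, 1620),
    "0xc05855a0": (64, 36),
}

for callsite, (alloc, req) in rows.items():
    print(f"{callsite} | {fragmentation_pct(alloc, req):.3f}%")

# Totals from the SUMMARY block.
total_alloc, total_req = 2353712, 2333626
print(f"Total bytes wasted on internal fragmentation: {total_alloc - total_req}")
print(f"Internal fragmentation: {fragmentation_pct(total_alloc, total_req):.6f}%")
```

Dividing by the allocated size rather than the requested size keeps the value between 0% and 100%, which matches the 43.750% reported for the 64-byte allocation that serves a 36-byte request.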
Signed-off-by: Li Zefan Acked-by: Pekka Enberg Acked-by: Peter Zijlstra Cc: Frederic Weisbecker Cc: Steven Rostedt Cc: Eduard - Gabriel Munteanu Cc: linux-mm@kvack.org LKML-Reference: <4B064AF5.9060208@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit ce64c62074d945fe5f8a7f01bdc30125f994ea67 Author: Masami Hiramatsu Date: Mon Nov 16 18:06:31 2009 -0500 x86: Instruction decoder test should generate build warning Since some instructions are not decoded correctly by older versions of objdump, it may cause false positive error in insn decoder posttest. This changes build error of insn decoder test to build warning. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Stephen Rothwell Cc: Randy Dunlap Cc: Jim Keniston Cc: Stephen Rothwell LKML-Reference: <20091116230631.5250.41579.stgit@harusame> Signed-off-by: Ingo Molnar commit 6b0cb5f9f7033c72b19697c33deab83f0dd9848d Author: Arnaldo Carvalho de Melo Date: Thu Nov 19 14:55:57 2009 -0200 perf tools: Don't die() in mmap_dispatch_perf_file Propagate the error, that, interestingly, are already handled by all callers :-) Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258649757-17554-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit d5eed904bb6010b429b82c47e7cdb6a32f0c1343 Author: Arnaldo Carvalho de Melo Date: Thu Nov 19 14:55:56 2009 -0200 perf tools: Eliminate some more die() uses in library functions This time in perf_header__adds_write, propagating the do_write error returns. 
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258649757-17554-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 4dc0a04bb18fe9b80cefa08694f46a3a19ebfe50 Author: Arnaldo Carvalho de Melo Date: Thu Nov 19 14:55:55 2009 -0200 perf tools: perf_header__read() shouldn't die() And also don't call the constructor in it, this way it adheres to the model the other methods follow. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258649757-17554-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 2446042c93bfc6eeebfc89e88fdef2435d2bb5c4 Author: Arnaldo Carvalho de Melo Date: Wed Nov 18 20:20:53 2009 -0200 perf symbols: Capture the running kernel buildid too [root@doppio linux-2.6-tip]# perf record -a -f sleep 3s ; perf buildid-list | grep vmlinux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.171 MB perf.data (~7489 samples) ] 18e7cc53db62a7d35e9d6f6c9ddc23017d38ee9a vmlinux [root@doppio linux-2.6-tip]# Several refactorings were needed so that we can have symmetry between dsos__load_modules() and dsos__load_kernel(), i.e. those functions will respectively create and add to the dsos list the loaded modules and kernel, with its buildids, but not load its symbols. That is something the subcomands that need will have to call dso__load_kernel_sym(), just like we do with modules with dsos__load_module_sym()/dso__load_module_sym(). Next csets will actually use this info to stop producing bogus results using mismatched vmlinux and .ko files. 
Signed-off-by: Arnaldo Carvalho de Melo Cc: Roland McGrath Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258582853-8579-4-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit f1617b40596cb341ee6602a9d969c5e4cebe9260 Author: Arnaldo Carvalho de Melo Date: Wed Nov 18 20:20:52 2009 -0200 perf symbols: Record the build_ids of kernel modules too [root@doppio linux-2.6-tip]# perf record -a sleep 2s;perf buildid-list|tail [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.162 MB perf.data (~7078 samples) ] 881588fa57b3c1696bc91e5e804a11304f093535 [cfg80211] 4d47ce1da9d16bad00c962c072451b7c681e82df [snd_page_alloc] 5146377e89a7caac617f9782f1a02e46263d3a31 [rfkill] 2153b937bff0d345fea83b63a2e1d3138569f83d [i915] 4e6fb1bb97362e3ee4d306988b9ad6912d5fb9ae [drm_kms_helper] f56ef2bf853e3a798f0d8d51f797622e5dc4420e [drm] b0d157a3b5c4e017329ffc07c64623cd6ad65e95 [i2c_algo_bit] 8125374b905ef9fa8b65d98e166b008ad952f198 [i2c_core] fc875c6e5a90e7b915e9d445d0efc859e1b2678c [video] 4b43c5006589f977e9762fdfc7ac1a92b72fca52 [output] [root@doppio linux-2.6-tip]# elfutils libdwfl/linux-kernel-modules.c was used as reference, as suggested by Roland McGrath. Signed-off-by: Arnaldo Carvalho de Melo Cc: Roland McGrath Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258582853-8579-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit e30a3d12ddf04add3268bfceb0e57ffe47f254c6 Author: Arnaldo Carvalho de Melo Date: Wed Nov 18 20:20:51 2009 -0200 perf symbols: Kill struct build_id_list and die() another day No need for this struct and its allocations, we can just use the ->build_id member we already have in struct dso, then ask for it to be read, and later traverse the dsos list, writing the buildid table to the perf.data file. As a bonus, one more die() function got killed. 
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258582853-8579-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit d3379ab9050e5522da2aac53d413651fc06be562 Author: Arnaldo Carvalho de Melo Date: Wed Nov 18 20:20:50 2009 -0200 perf symbols: Fix comparision of build_ids When we read the build_id from the DSO name to then index into /usr/lib/debug/.buildid/DSO_BUILD_ID[0:2]/DSO_BUILD_ID[2:], we were jumping directly to the comparision with the buildid we already have in dso->build_id (that came from the perf.data build_id section, collected at perf record time) unconditionally, even if we didn't had recorded it, and furthermore, comparing a formatted buildid with a rawbuildid, yikes. Fix it by deleting the dso__read_build_id() function, that was really misdesigned anyway, and do the necessary checks and correct comparison of raw buildids. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258582853-8579-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 827f3b4974c5db2968d4979fe6a0ae00ab37bdd8 Author: Hitoshi Mitake Date: Wed Nov 18 00:20:09 2009 +0900 perf bench: Add memcpy() benchmark 'perf bench mem memcpy' is a benchmark suite for measuring memcpy() performance. Example on a Intel(R) Core(TM)2 Duo CPU E6850 @ 3.00GHz: | % perf bench mem memcpy -l 1GB | # Running mem/memcpy benchmark... | # Copying 1MB Bytes from 0xb7d98008 to 0xb7e99008 ... 
| | 726.216412 MB/Sec Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker LKML-Reference: <1258471212-30281-1-git-send-email-mitake@dcl.info.waseda.ac.jp> [ v2: updated changelog, clarified history of builtin-bench.c ] Signed-off-by: Ingo Molnar commit b269876c8d57fb8c801bea1fc34b461646c5abd0 Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 18:38:02 2009 -0200 perf top: Don't allocate the source parsing members upfront Defer to parse_source() time allocating it. Now we use about this much memory: 1724 root 20 0 42104 10m 940 S 0.0 0.4 0:00.23 perf Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258490282-1821-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 5a8e5a3065bf04b7673262fd6c46123e4b888d2b Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 18:38:01 2009 -0200 perf top: Allocate space only for the number of counters used Reducing memory consumption on a typical desktop machine: From: 32710 root 20 0 172m 142m 1056 S 0.0 4.7 0:00.37 perf To: 420 root 20 0 47528 16m 1056 R 0.3 0.5 0:00.24 perf Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258490282-1821-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 51a472decb845e920137284a5cfef51fb7d61206 Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 18:38:00 2009 -0200 perf top: Introduce helper function to access symbol from sym_entry Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258490282-1821-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 1a105f743d9fa5f7b8eeeca0afb789951164a361 Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 15:40:55 2009 -0200 perf top: Suppress DSO column if only one is present E.g. 
[root@doppio ~]# perf top -U --------------------------------------------------------------------------- PerfTop: 482 irqs/sec kernel:100.0% [1000Hz cycles], (all, 2 CPUs) --------------------------------------------------------------------------- DSO: vmlinux samples pcnt function _______ _____ _________________________ 471.00 47.9% read_hpet 57.00 5.8% acpi_os_read_port 30.00 3.1% hpet_next_event 30.00 3.1% find_busiest_group 22.00 2.2% schedule 18.00 1.8% sched_clock_local 14.00 1.4% _spin_lock_irqsave 14.00 1.4% native_read_tsc 13.00 1.3% trace_hardirqs_off 9.00 0.9% fget_light 9.00 0.9% ioread8 8.00 0.8% do_sys_poll Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258479655-28662-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 13cc5079f235906e60577dbce8da2f9607e67e93 Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 15:40:54 2009 -0200 perf top: Auto adjust symbol and dso widths We pre-calculate the symbol name length, then after we sort the entries to print, calculate the biggest one and use that for the symbol name width justification, then use the dso->long_name->len to justificate the DSO name, deciding whether using the short or long name depending on how much space we have on the terminal. IOW give as much info to the user as the terminal width allows. Suggested-by: Ingo Molnar Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258479655-28662-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit cfc10d3bcc50d70f72c0f43d03eee965c726ccc0 Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 15:40:53 2009 -0200 perf symbols: Add a long_name_len member to struct dso Using a two bytes hole we already had and since we also need to calculate this strlen for fetching the buildids. 
We'll use it in 'perf top' to auto-adjust the output based on the terminal width. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1258479655-28662-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 11ada26c78febe4662a8e848f3bff74e3200c920 Author: Luck, Tony Date: Tue Nov 17 09:05:56 2009 -0800 perf tools: Add ia64 support for tools/perf/ Compiler on ia64 rejects the "-m64" option. Add arch specific pieces to perf.h Signed-off-by: Tony Luck Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <4b02d7f43514327a@agluck-desktop.sc.intel.com> Signed-off-by: Ingo Molnar commit 192dcf1d1775736627280a5dd4cb0f605b21857a Author: Josh Stone Date: Wed Nov 18 13:06:55 2009 -0800 tracing: Remove the stale include/trace/power.h Commit 6161352 moved the power tracing to include/trace/events/, but left the old header behind. No one is using the old header, and its declarations are now incorrect, so it should be removed. Signed-off-by: Josh Stone Acked-by: Arjan van de Ven Cc: Frank Ch. Eigler Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker LKML-Reference: <1258578415-14752-1-git-send-email-jistone@redhat.com> Signed-off-by: Ingo Molnar commit 821d35a56044e522e811f6a1e8632cc230360280 Author: Alan Cox Date: Wed Nov 18 14:39:51 2009 +0000 selinux: Fix warnings scripts/selinux/genheaders/genheaders.c:20: warning: no previous prototype for ‘usage’ scripts/selinux/genheaders/genheaders.c:26: warning: no previous prototype for ‘stoupperx’ Signed-off-by: Alan Cox Acked-by: WANG Cong Signed-off-by: James Morris commit 2ea6dec4a22a6f66f6633876212fd4d195cf8277 Author: Rusty Russell Date: Tue Nov 17 14:27:27 2009 -0800 generic-ipi: Add smp_call_function_any() Andrew points out that acpi-cpufreq uses cpumask_any, when it really would prefer to use the same CPU if possible (to avoid an IPI). In general, this seems a good idea to offer.
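The preference described for smp_call_function_any() can be illustrated with a small Python sketch. It models only the one rule the commit message states (use the calling CPU when it is in the mask, so no IPI is needed); the function name `pick_cpu`, the set-based mask, and the `min()` fallback are assumptions made for illustration and are not the kernel's implementation.

```python
def pick_cpu(mask, this_cpu):
    """Choose a CPU from `mask`, preferring the caller's own CPU.

    `mask` is a set of CPU numbers (a stand-in for a cpumask) and
    `this_cpu` is the CPU the caller is running on. Both names are
    hypothetical and exist only for this sketch.
    """
    if not mask:
        raise ValueError("empty CPU mask")
    if this_cpu in mask:
        # Running the function locally avoids sending an IPI.
        return this_cpu
    # Otherwise any CPU in the mask will do; min() just keeps the
    # sketch deterministic.
    return min(mask)
```

With a mask of {0, 2, 3} and a caller on CPU 2, the sketch keeps the work on CPU 2; a caller on CPU 1 falls back to a remote CPU.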
[ tglx: Documented selection preference and Inlined the UP case to avoid the copy of smp_call_function_single() and the extra EXPORT ] Signed-off-by: Rusty Russell Cc: Ingo Molnar Cc: Venkatesh Pallipadi Cc: Len Brown Cc: Zhao Yakui Cc: Dave Jones Cc: Thomas Gleixner Cc: Mike Galbraith Cc: "Zhang, Yanmin" Signed-off-by: Andrew Morton Signed-off-by: Thomas Gleixner commit a1afb6371bb5341057056194d1168753f6d77242 Author: Alexey Dobriyan Date: Fri Aug 28 22:19:33 2009 +0400 genirq: switch /proc/irq/*/spurious to seq_file [ tglx: compacted it a bit ] Signed-off-by: Alexey Dobriyan LKML-Reference: <20090828181743.GA14050@x200.localdomain> Signed-off-by: Andrew Morton Signed-off-by: Thomas Gleixner commit 6dbfe5a57db3564adf7b2a65068e40f1b4a0d2db Author: Thomas Gleixner Date: Tue Nov 17 18:27:18 2009 +0100 x86: Fixup last users of irq_chip->typename The typename member of struct irq_chip was kept for migration purposes and is obsolete since more than 2 years. Fix up the leftovers. Signed-off-by: Thomas Gleixner commit 638adb0561264a3360a53e93def62288c85d8373 Author: Steven Rostedt Date: Tue Nov 17 10:48:25 2009 -0500 tracing: Only print objcopy version warning once from recordmcount If the user has an older version of objcopy, that can not handle converting local symbols to global and vice versa, then some functions will not be part of the dynamic function tracer. The current code in recordmcount.pl will print a warning in this case. Unfortunately, there exists lots of files that may have this issue with older objcopys and this will cause a warning for every file compiled with this issue. This patch solves this overwhelming output by creating a .tmp_quiet_recordmcount file on the first instance the warning is encountered. The warning will not print if this file exists. The temp file is deleted at the beginning of the compile to ensure that the warning will happen once again on new compiles (because the issue is still present). 
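The warn-once scheme described for recordmcount can be sketched roughly as follows. This is a Python illustration, not the Perl code in recordmcount.pl: the helper names `warn_once` and `reset_warning` are hypothetical, and using O_EXCL to create the marker atomically is an assumption of this sketch that the commit message does not mention.

```python
import os
import sys

# Marker file name taken from the commit message.
MARKER_NAME = ".tmp_quiet_recordmcount"


def warn_once(build_dir, message, stream=sys.stderr):
    """Print `message` only if the marker file does not exist yet.

    Returns True when the warning was printed, False when it was
    suppressed because an earlier call already created the marker.
    """
    marker = os.path.join(build_dir, MARKER_NAME)
    try:
        # Assumption of this sketch: O_EXCL makes creation atomic, so
        # two parallel build jobs cannot both print the warning.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    print(message, file=stream)
    return True


def reset_warning(build_dir):
    """Delete the marker at the start of a build so the warning reappears."""
    try:
        os.remove(os.path.join(build_dir, MARKER_NAME))
    except FileNotFoundError:
        pass
```

Calling `reset_warning()` once at the start of a build and `warn_once()` for every affected object file gives one warning per build instead of one per file.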
Reported-by: Andrew Morton Cc: Sam Ravnborg Signed-off-by: Steven Rostedt commit f6060f46819f313d34a8c8151390cda509c23389 Author: Lai Jiangshan Date: Thu Nov 5 11:16:17 2009 +0800 tracing: Prevent build warning: 'ftrace_graph_buf' defined but not used Prevent build warning when CONFIG_FUNCTION_GRAPH_TRACER is not set. Signed-off-by: Lai Jiangshan LKML-Reference: <4AF24381.5060307@cn.fujitsu.com> Signed-off-by: Steven Rostedt commit d62d77fd18cc82e839e49b7ef3360e4411f7d2e5 Author: Nick Piggin Date: Tue Nov 17 12:29:38 2009 +0100 perf annotate: Allocate history size correctly Symbol offset history table size does not get updated properly when it is being resized. This leads to garbage results in perf annotate. Signed-off-by: Nick Piggin Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith LKML-Reference: Signed-off-by: Ingo Molnar commit c13d2f7c3231e873f30db92b96c8caa48f100f33 Author: Carsten Emde Date: Mon Nov 16 20:56:13 2009 +0100 tracing: Fix trace_marker output When a string was written to /tracing/trace_marker, some strange characters appeared in the trace output instead of the string, since a vprint function erroneously called a vararg print function with a va_list argument. This patch fixes the problem and simplifies the related code. Signed-off-by: Carsten Emde LKML-Reference: <4B01AE5D.1010801@osadl.org> Signed-off-by: Steven Rostedt commit 5a50e33cc916f6a81cb96f0f24f6a88c9ab78b79 Author: Steven Rostedt Date: Tue Nov 17 08:43:01 2009 -0500 ring-buffer: Move access to commit_page up into function used With the change of the way we process commits. Where a commit only happens at the outer most level, and that we don't need to worry about a commit ending after the rb_start_commit() has been called, the code use to grab the commit page before the tail page to prevent a possible race. But this race no longer exists with the rb_start_commit() rb_end_commit() interface. 
Signed-off-by: Steven Rostedt commit 751386507701010831d72c522171753d2cd903d2 Author: Michael S. Tsirkin Date: Thu Oct 29 17:20:02 2009 +0200 perf tools: Support static build This makes it possible to build perf statically, by performing: make LDFLAGS=-static Since static libraries are only searched in the order they are specified, move library list from LDFLAGS to EXTLIBS, so that they are put at the end of linker command line. Signed-off-by: Michael S. Tsirkin Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091029152002.GA5406@redhat.com> [ v2: resolved conflicts ] Signed-off-by: Ingo Molnar commit a7b63425a41cd6a8d50f76fef0660c5110f97e91 Merge: 35039eb 3726cc7 Author: Ingo Molnar Date: Tue Nov 17 10:16:43 2009 +0100 Merge branch 'perf/core' into perf/probes Resolved merge conflict in tools/perf/Makefile Merge reason: we want to queue up a dependent patch. Signed-off-by: Ingo Molnar commit 123bf0e2eddcda36a33bdfc87aa1fb07229f07b5 Author: Ingo Molnar Date: Sun Nov 15 21:19:52 2009 +0900 x86: gart: Clean up the code a bit Clean up various small stylistic details in the GART code. No functionality changed. Cc: FUJITA Tomonori Cc: Jesse Barnes Cc: muli@il.ibm.com Cc: joerg.roedel@amd.com LKML-Reference: <1258287594-8777-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 1f7564ca831a00b21bb493ef174c845b2ba9e64d Author: FUJITA Tomonori Date: Sun Nov 15 21:19:54 2009 +0900 x86: Calgary: Remove unnecessary DMA_ERROR_CODE usage This cleans up iommu_alloc() a bit and removes unnecessary DMA_ERROR_CODE usage. 
Signed-off-by: FUJITA Tomonori Acked-by: Jesse Barnes Cc: muli@il.ibm.com Cc: joerg.roedel@amd.com LKML-Reference: <1258287594-8777-4-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 8fd524b355daef0945692227e726fb444cebcd4f Author: FUJITA Tomonori Date: Sun Nov 15 21:19:53 2009 +0900 x86: Kill bad_dma_address variable This kills bad_dma_address variable, the old mechanism to enable IOMMU drivers to make dma_mapping_error() work in IOMMU's specific way. bad_dma_address variable was introduced to enable IOMMU drivers to make dma_mapping_error() work in IOMMU's specific way. However, it can't handle systems that use both swiotlb and HW IOMMU. SO we introduced dma_map_ops->mapping_error to solve that case. Intel VT-d, GART, and swiotlb already use dma_map_ops->mapping_error. Calgary, AMD IOMMU, and nommu use zero for an error dma address. This adds DMA_ERROR_CODE and converts them to use it (as SPARC and POWER does). Signed-off-by: FUJITA Tomonori Acked-by: Jesse Barnes Cc: muli@il.ibm.com Cc: joerg.roedel@amd.com LKML-Reference: <1258287594-8777-3-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 42109197eb7c01080eea6d9cd48ca23cbc3c566c Author: FUJITA Tomonori Date: Sun Nov 15 21:19:52 2009 +0900 x86: gart: Add own dma_mapping_error function GART IOMMU is the only user of bad_dma_address variable. This patch converts GART to use the newer mechanism, fill in ->mapping_error() in struct dma_map_ops, to make dma_mapping_error() work in IOMMU specific way. Signed-off-by: FUJITA Tomonori Acked-by: Jesse Barnes Cc: muli@il.ibm.com Cc: joerg.roedel@amd.com LKML-Reference: <1258287594-8777-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 99f4c9de2b707795acb215e2e94df7ea266042b5 Merge: 62ad33f 156171c Author: Ingo Molnar Date: Tue Nov 17 07:51:02 2009 +0100 Merge commit 'v2.6.32-rc7' into core/iommu Merge reason: Add fixes we'll depend on. 
Signed-off-by: Ingo Molnar commit 3726cc75e581c157202da93bb2333cce25c15c98 Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 01:18:12 2009 -0200 perf tools: Don't die() in do_write() Propagate the errors instead, the users are the ones to decide what to do if a library call fails. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258427892-16312-4-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit a9a70bbce7ab0bf3b1cba3ac662c4d502da6305c Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 01:18:11 2009 -0200 perf tools: Don't die() in perf_header__new() Propagate the errors instead, the users are the ones to decide what to do if a library call fails. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258427892-16312-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 5875412152ce67fb5087157b86ab6597f91d23e8 Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 01:18:10 2009 -0200 perf tools: Don't die() in perf_header_attr__add_id() Propagate the errors instead, the users are the ones to decide what to do if a library call fails. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258427892-16312-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 11deb1f9f6ca6318fa9470e024b9f0634df48b4c Author: Arnaldo Carvalho de Melo Date: Tue Nov 17 01:18:09 2009 -0200 perf tools: Don't die() in perf_header__add_attr() Propagate the errors instead, the users are the ones to decide what to do if a library call fails. 
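The "propagate instead of die()" pattern in the commits above can be sketched in Python. This is an illustration of the idea only, not the perf code: the function reuses the name do_write for readability, and the "0 on success, negative errno on failure" convention is an assumption of this sketch.

```python
import errno
import os


def do_write(fd, buf):
    """Write all of `buf` to `fd` without ever exiting the process.

    Returns 0 on success or a negative errno value on failure, so the
    caller decides how to react (retry, report, clean up, or abort).
    """
    view = memoryview(buf)
    while len(view) > 0:
        try:
            written = os.write(fd, view)
        except OSError as exc:
            # Hand the error back to the caller instead of dying here.
            return -(exc.errno or errno.EIO)
        # os.write() may write fewer bytes than requested; continue
        # with the remainder.
        view = view[written:]
    return 0
```

A caller can then check the result, for example `if do_write(fd, data) < 0:`, and choose its own recovery path.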
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258427892-16312-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 1124ba73be6a758965340bd997593b2996649d60 Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 21:45:25 2009 -0200 perf buildid-list: Always show the DSO name Porcelain can ignore it, humans can make more sense of it. Suggested-by: Frederic Weisbecker Suggested-by: Ingo Molnar Suggested-by: Peter Zijlstra Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258415125-15019-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 8ffcda17314cfeb698a667567ea63f63362dffbb Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 21:45:24 2009 -0200 perf top: Introduce --hide_{user,kernel}_symbols Default continues to be showing all symbols. 'K' and 'U' can be used to toggle showing kernel and user symbols. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258415125-15019-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 3b6ed98895b0fccd8c387f3fc44016fb922c0658 Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 19:30:27 2009 -0200 perf top: Use all the lines in the screen By querying the current number of rows, if the user specifies the number of entries, use that instead. If the user uses the 'e' command to change the number of lines 0 will mean do it automatically, any other number disables the auto resizing. 
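The row-count rule described for 'perf top' (an explicit entry count wins, 0 means size automatically from the terminal) could look roughly like this in Python. The function name `entries_to_print` and the `HEADER_LINES` value are hypothetical; the commit message does not say how many lines the header occupies.

```python
import shutil

# Hypothetical number of lines reserved for the banner and column
# headings; the real tool's overhead is not stated in the commit.
HEADER_LINES = 9


def entries_to_print(requested, rows=None):
    """Return how many entries to display.

    A non-zero `requested` value disables auto-sizing and is used as
    is. A value of 0 means "fit the terminal": the current number of
    rows is queried and the header overhead is subtracted.
    """
    if requested < 0:
        raise ValueError("entry count must not be negative")
    if requested != 0:
        return requested
    if rows is None:
        rows = shutil.get_terminal_size().lines
    return max(rows - HEADER_LINES, 0)
```

Passing `rows` explicitly is only a convenience for testing; in normal use the terminal is queried on each refresh so that resizing takes effect.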
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258407027-384-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit dc79c0fc08a94b857aed446bfb47cdfde529400c Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 19:30:26 2009 -0200 perf tools: Don't die in perf_header_attr__new() We really should propagate such kinds of errors so that users of these library functions decide what to do in such cases instead of exiting in random places like now. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258407027-384-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 35039eb6b199749943547c8572be6604edf00229 Author: Masami Hiramatsu Date: Mon Nov 16 18:06:24 2009 -0500 x86: Show symbol name if insn decoder test failed Show symbol name if insn decoder test find a difference. This will help us to find out where the issue is. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Stephen Rothwell Cc: Randy Dunlap Cc: Jim Keniston Cc: Stephen Rothwell LKML-Reference: <20091116230624.5250.49813.stgit@harusame> Signed-off-by: Ingo Molnar commit d65ff75fbe6f8ac7c17f18e4108521898468822c Author: Masami Hiramatsu Date: Mon Nov 16 18:06:18 2009 -0500 x86: Add verbose option to insn decoder test Add verbose option to insn decoder test. This dumps decoded instruction when building kernel with V=1. 
Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Stephen Rothwell Cc: Randy Dunlap Cc: Jim Keniston Cc: Stephen Rothwell LKML-Reference: <20091116230618.5250.18762.stgit@harusame> Signed-off-by: Ingo Molnar commit c34984b2bbc77596c97c333539bffc90d2033178 Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 16:32:45 2009 -0200 perf buildid-list: New plumbing command With this we can list the buildids in a perf.data file so that we can pipe them to other, distro specific tools that from the buildids can figure out separate packages (foo-debuginfo) where we can find the matching symtabs so that perf report can do its job. E.g: [acme@doppio linux-2.6-tip]$ perf buildid-list | head -5 8e08b117e5458ad3f85da16d42d0fc5cd21c5869 520c2387a587cc5acfcf881e27dba1caaeab4b1f ec8dd400904ddfcac8b1c343263a790f977159dc 7caedbca5a6d8ab39a7fe44bd28c07d3e14a3f3f 379bb828fd08859dbea73279f04abefabc95a6a3 [acme@doppio linux-2.6-tip]$ perf buildid-list -v | head -5 8e08b117e5458ad3f85da16d42d0fc5cd21c5869 /sbin/init 520c2387a587cc5acfcf881e27dba1caaeab4b1f /lib64/ld-2.10.1.so ec8dd400904ddfcac8b1c343263a790f977159dc /lib64/libc-2.10.1.so 7caedbca5a6d8ab39a7fe44bd28c07d3e14a3f3f /sbin/udevd 379bb828fd08859dbea73279f04abefabc95a6a3 /lib64/libdl-2.10.1.so [acme@doppio linux-2.6-tip]$ Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258396365-29217-5-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 9e03eb2d512e7f3a1e562d4b922aa8b1891750b6 Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 16:32:44 2009 -0200 perf tools: Introduce dsos__fprintf_buildid To print the buildids in the list of dsos. 
Will be used by 'perf buildid-list' Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258396365-29217-4-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 37562eac3767c7f07bb1a1329708ff6453e47570 Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 16:32:43 2009 -0200 perf tools: Generalize perf_header__adds_read() Renaming it to perf_header__process_sections() and passing a callback to handle each feature. The next changesets will introduce 'perf buildid-list' that will handle just the HEADER_BUILD_ID table, ignoring all the other features. Signed-off-by: Arnaldo Carvalho de Melo Acked-by: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258396365-29217-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 8f41146aedf803856fb6477056e3960cb9ba8f9c Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 16:32:42 2009 -0200 perf tools: Debug.h needs to include event.h for event_t Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258396365-29217-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 84fe8488ade7922afa9f3aa77c22d2d92beb9660 Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 16:32:41 2009 -0200 perf symbols: Pass the offset to perf_header__read_build_ids() Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258396365-29217-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 82164161679c448f33092945ea97cb547a13683a Author: Arnaldo Carvalho de Melo Date: Mon Nov 16 13:48:11 2009 -0200 perf symbols: Call the symbol filter in dso__synthesize_plt_symbols() We need to pass the symbol to the filter so that, for instance, 'perf top' can do filtering and also set the private area it manages, 
setting the ->map pointer, etc. I found this while running 'perf top' on a machine where hits happened on PLT symbols, where ->map wasn't being set up and segfaults thus happened. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <1258386491-20278-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit e79c65a97c01d5da4317f44f9f98b3814e091a43 Author: Cyrill Gorcunov Date: Mon Nov 16 18:14:26 2009 +0300 x86: io-apic: IO-APIC MMIO should not fail on resource insertion If IO-APIC base address is 1K aligned we should not fail on resourse insertion procedure. For this sake we define IO_APIC_SLOT_SIZE constant which should cover all IO-APIC direct accessible registers. An example of a such configuration is there http://marc.info/?l=linux-kernel&m=118114792006520 | | Quoting the message | | IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23 | IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47 | IOAPIC[2]: apic_id 4, version 32, address 0xfec80400, GSI 48-71 | IOAPIC[3]: apic_id 5, version 32, address 0xfec84000, GSI 72-95 | IOAPIC[4]: apic_id 8, version 32, address 0xfec84400, GSI 96-119 | Reported-by: "Maciej W. Rozycki" Signed-off-by: Cyrill Gorcunov Acked-by: Yinghai Lu LKML-Reference: <20091116151426.GC5653@lenovo> Signed-off-by: Ingo Molnar commit 3c93ca00eeeb774c7dd666cc7286a9e90c53e998 Author: Frederic Weisbecker Date: Mon Nov 16 15:42:18 2009 +0100 x86: Add missing might_fault() checks to copy_{to,from}_user() On x86-64, copy_[to|from]_user() rely on assembly routines that never call might_fault(), making us missing various lockdep checks. This doesn't apply to __copy_from,to_user() that explicitly handle these calls, neither is it a problem in x86-32 where copy_to,from_user() rely on the "__" prefixed versions that also call might_fault(). 
Signed-off-by: Frederic Weisbecker Cc: Arjan van de Ven Cc: Linus Torvalds Cc: Nick Piggin Cc: Peter Zijlstra LKML-Reference: <1258382538-30979-1-git-send-email-fweisbec@gmail.com> [ v2: fix module export ] Signed-off-by: Ingo Molnar commit 559fdc3c1b624edb1933a875022fe7e27934d11c Author: Peter Zijlstra Date: Mon Nov 16 12:45:14 2009 +0100 perf_event: Optimize perf_output_lock() The purpose of perf_output_{un,}lock() is to: 1) avoid publishing incomplete data [ possible when publishing a head that is ahead of an entry that is still being written ] 2) guarantee fwd progress [ a simple refcount on pending writers doesn't need to drop to 0, making it so would end up implementing something like forced quiecent states of RCU ] To satisfy the above without undue complexity it serializes between CPUs, this means that a pending writer can only be the same cpu in a nested context, and since (under normal operation) a cpu always makes progress we're good -- if the head is only published when the bottom most writer completes. Now we don't need to disable IRQs in order to serialize between CPUs, disabling preemption ought to be sufficient, esp since we already deal with nesting due to NMIs. This avoids potentially expensive (and needless) local IRQ disable/enable ops. Signed-off-by: Peter Zijlstra Cc: Paul Mackerras Cc: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <1258373161.26714.254.camel@laptop> Signed-off-by: Ingo Molnar commit 047106adcc85e3023da210143a6ab8a55df9e0fc Author: Peter Zijlstra Date: Mon Nov 16 10:28:09 2009 +0100 sched: Sched_rt_periodic_timer vs cpu hotplug Heiko reported a case where a timer interrupt managed to reference a root_domain structure that was already freed by a concurrent hot-un-plug operation. Solve this like the regular sched_domain stuff is also synchronized, by adding a synchronize_sched() stmt to the free path, this ensures that a root_domain stays present for any atomic section that could have observed it. 
Reported-by: Heiko Carstens Signed-off-by: Peter Zijlstra Acked-by: Heiko Carstens Cc: Gregory Haskins Cc: Siddha Suresh B Cc: Martin Schwidefsky LKML-Reference: <1258363873.26714.83.camel@laptop> Signed-off-by: Ingo Molnar commit 62ad33f67003b9a7b6013f0511579b9805e11626 Author: Hiroshi Shimamoto Date: Mon Nov 16 11:44:30 2009 +0900 x86: Don't put iommu_shutdown_noop() in init section It causes kernel panic on shutdown or reboot. Signed-off-by: Hiroshi Shimamoto Acked-by: FUJITA Tomonori LKML-Reference: <4B00BC8E.50801@ct.jp.nec.com> Signed-off-by: Ingo Molnar commit 7255fe2a42c612f2b8fe4c347f0a5f0c97d85a46 Author: Lucas De Marchi Date: Sun Nov 15 12:05:08 2009 -0200 perf stat: Do not print ratio when task-clock event is not counted The ratio between the number of events and the time elapsed makes sense only if task-clock event is counted. Otherwise it will be simply a (confusing) # 0.000 M/sec This patch outputs the ratio only if task-clock event is counted. Some test examples of before and after: Before: [lucas@skywalker linux.trees.git]$ sudo perf stat -e branch-misses -a -- sleep 1 Performance counter stats for 'sleep 1': 1367818 branch-misses # 0.000 M/sec 1.001494325 seconds time elapsed After (without task-clock): [lucas@skywalker perf]$ sudo ./perf stat -e branch-misses -a -- sleep 1 Performance counter stats for 'sleep 1': 1135044 branch-misses 1.001370775 seconds time elapsed After (with task-clock): [lucas@skywalker perf]$ sudo ./perf stat -e branch-misses -e task-clock -a -- sleep 1 Performance counter stats for 'sleep 1': 1070111 branch-misses # 0.534 M/sec 2002.730893 task-clock-msecs # 1.999 CPUs 1.001640292 seconds time elapsed Signed-off-by: Lucas De Marchi Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091115140507.GB21561@skywalker.lan> Signed-off-by: Ingo Molnar commit d2fb8b4151a92223da6a84006f8f248ebeb6677d Author: Hitoshi Mitake Date: Sun Nov 15 20:36:53 2009 +0900 perf tools: Add new perf_atoll() function to parse string 
representing size in bytes This patch modifies util/string.[ch] to add new function: perf_atoll() to parse string representing size in bytes. This function parses (\d+)(b|B|kb|KB|mb|MB|gb|GB) (e.g. "256MB") and returns its numeric value. (e.g. 268435456) Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Frederic Weisbecker LKML-Reference: <1258285013-4759-1-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 498657a478c60be092208422fefa9c7b248729c2 Author: Tejun Heo Date: Fri Nov 13 18:33:53 2009 +0900 sched, kvm: Fix race condition involving sched_in_preempt_notifers In finish_task_switch(), fire_sched_in_preempt_notifiers() is called after finish_lock_switch(). However, depending on architecture, preemption can be enabled after finish_lock_switch() which breaks the semantics of preempt notifiers. So move it before finish_arch_switch(). This also makes the in- notifiers symmetric to out- notifiers in terms of locking - now both are called under rq lock. Signed-off-by: Tejun Heo Acked-by: Avi Kivity Cc: Peter Zijlstra LKML-Reference: <4AFD2801.7020900@kernel.org> Signed-off-by: Ingo Molnar commit 0ffa798d947f5f5e40690cc9d38e678080a34f87 Merge: 39dc78b c86e2ea c5659b7 Author: Ingo Molnar Date: Sun Nov 15 09:51:19 2009 +0100 Merge branches 'perf/powerpc' and 'perf/bench' into perf/core Merge reason: Both 'perf bench' and the pending PowerPC changes are now ready for the next merge window. 
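The size grammar given in the perf_atoll() commit above, (\d+)(b|B|kb|KB|mb|MB|gb|GB), can be sketched in Python as shown below. This is not the C code in util/string.[ch]: the name `parse_size` is hypothetical, returning None for malformed input is a choice of this sketch, and the 1024-based multipliers are inferred from the commit's own example ("256MB" gives 268435456).

```python
import re

# Grammar from the commit message: digits followed by one of the
# listed unit spellings (all lower case or all upper case).
_SIZE_RE = re.compile(r"([0-9]+)(b|B|kb|KB|mb|MB|gb|GB)")

# Multipliers are powers of 1024, consistent with 256MB == 268435456.
_UNIT = {"b": 1, "kb": 1 << 10, "mb": 1 << 20, "gb": 1 << 30}


def parse_size(text):
    """Convert a string such as "256MB" into a byte count.

    Returns None when `text` does not match the grammar; how the real
    perf_atoll() reports errors is not stated in the commit message.
    """
    match = _SIZE_RE.fullmatch(text)
    if match is None:
        return None
    number, unit = match.groups()
    return int(number) * _UNIT[unit.lower()]
```

For example, `parse_size("256MB")` yields 268435456, matching the value quoted in the commit message, while a mixed-case unit such as "10Mb" is rejected because the grammar lists only the all-lower and all-upper spellings.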
Signed-off-by: Ingo Molnar commit 39dc78b6510323848e3356452f7dab9499736978 Merge: 4c49b12 156171c Author: Ingo Molnar Date: Sun Nov 15 09:50:38 2009 +0100 Merge commit 'v2.6.32-rc7' into perf/core Merge reason: pick up perf fixlets Signed-off-by: Ingo Molnar commit 14722485830fe6baba738b91d96f06fbd6cf7a18 Author: Jan Beulich Date: Fri Nov 13 11:56:24 2009 +0000 x86-64: __copy_from_user_inatomic() adjustments This v2.6.26 commit: ad2fc2c: x86: fix copy_user on x86 rendered __copy_from_user_inatomic() identical to copy_user_generic(), yet didn't make the former just call the latter from an inline function. Furthermore, this v2.6.19 commit: b885808: [PATCH] Add proper sparse __user casts to __copy_to_user_inatomic converted the return type of __copy_to_user_inatomic() from unsigned long to int, but didn't do the same to __copy_from_user_inatomic(). Signed-off-by: Jan Beulich Cc: Linus Torvalds Cc: Alexander Viro Cc: Arjan van de Ven Cc: Andi Kleen Cc: LKML-Reference: <4AFD5778020000780001F8F4@vpn.id2.novell.com> Signed-off-by: Ingo Molnar commit f4131c6259b46bd84dcfcd3bb9ed08e99e2875a4 Author: FUJITA Tomonori Date: Sat Nov 14 21:26:50 2009 +0900 x86: Make calgary_iommu_init() static This makes calgary_iommu_init() static and moves it to remove the forward declaration. 
Signed-off-by: FUJITA Tomonori Cc: muli@il.ibm.com LKML-Reference: <20091114212603U.fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 6959450e567c1f17d3ce8489099fc56c3721d577 Author: FUJITA Tomonori Date: Sat Nov 14 20:46:38 2009 +0900 swiotlb: Remove duplicate swiotlb_force extern declarations Signed-off-by: FUJITA Tomonori Cc: tony.luck@intel.com LKML-Reference: <1258199198-16657-4-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 94a15564ac63af6bb2ff8d4d04f86d5e7ee0278a Author: FUJITA Tomonori Date: Sat Nov 14 20:46:37 2009 +0900 x86: Move iommu_shutdown_noop to x86_init.c iommu_init_noop() is in arch/x86/kernel/x86_init.c but iommu_shutdown_noop() in arch/x86/include/asm/iommu.h. This moves iommu_shutdown_noop() to x86_init.c for consistency. Signed-off-by: FUJITA Tomonori LKML-Reference: <1258199198-16657-3-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit a3b28ee1090072092e2be043c24df94230e725b2 Author: FUJITA Tomonori Date: Sat Nov 14 20:46:36 2009 +0900 x86: Set dma_ops to nommu_dma_ops by default We set dma_ops to nommu_dma_ops at two different places for x86_32 and x86_64. This unifies them by setting dma_ops to nommu_dma_ops by default. Signed-off-by: FUJITA Tomonori LKML-Reference: <1258199198-16657-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 68efa37df779b3e04280598e8b5b3a1919b65fee Author: Ingo Molnar Date: Sat Nov 14 01:35:29 2009 +0100 hw-breakpoints, x86: Fix modular KVM build This build error: arch/x86/kvm/x86.c:3655: error: implicit declaration of function 'hw_breakpoint_restore' Happens because in the CONFIG_KVM=m case there's no 'CONFIG_KVM' define in the kernel - it's CONFIG_KVM_MODULE in that case. Make the prototype available unconditionally. 
    Cc: Frederic Weisbecker
    Cc: Prasad
    LKML-Reference: <1258114575-32655-1-git-send-email-fweisbec@gmail.com>
    Signed-off-by: Ingo Molnar

commit 31c997cac76e62918858a432fff6e43fd48425f9
Author: Ingo Molnar
Date:   Sat Nov 14 10:34:41 2009 +0100

    x86: Fix cpu_devs[] initialization in early_cpu_init()

    Yinghai Lu noticed that this commit:

      0388423: x86: Minimise printk spew from per-vendor init code

    mistakenly left out the initialization of cpu_devs[] in the
    !PROCESSOR_SELECT case. Fix it.

    Reported-by: Yinghai Lu
    Cc: Dave Jones
    LKML-Reference: <20091113203000.GA19160@redhat.com>
    Signed-off-by: Ingo Molnar

commit 2f51f9884f6a36b0fe9636d5a1937e5cbd25723b
Author: Paul E. McKenney
Date:   Fri Nov 13 19:51:39 2009 -0800

    rcu: Eliminate __rcu_pending() false positives

    Now that there are both ->gpnum and ->completed fields in the
    rcu_node structure, __rcu_pending() should check rdp->gpnum and
    rdp->completed against rnp->gpnum and rnp->completed, respectively,
    instead of the prior comparison against the rcu_state fields
    rsp->gpnum and rsp->completed.

    Given the old comparison, __rcu_pending() could return 1, resulting
    in a needless raise_softirq(RCU_SOFTIRQ). This useless work would
    happen if RCU responded to a scheduling-clock interrupt after the
    rcu_state fields had been updated, but before the rcu_node fields
    had been updated. Changing the comparison from the rcu_state fields
    to the rcu_node fields prevents this useless work from happening.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference: <12581706991966-git-send-email->
    Signed-off-by: Ingo Molnar

commit 560d4bc0df9a5e63b980432282d8c2bd3559ec74
Author: Paul E.
McKenney Date: Fri Nov 13 19:51:38 2009 -0800 rcu: Further cleanups of use of lastcomp Now that a copy of the rsp->completed flag is available in all rcu_node structures, make full use of it. It is still legitimate to access rsp->completed while holding the root rcu_node structure's lock, however. Also, tighten up force_quiescent_state()'s checks for end of current grace period. Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1258170699933-git-send-email-> Signed-off-by: Ingo Molnar commit 4c49b12853fbb5eff4849b7b6a1e895776f027a1 Author: Arjan van de Ven Date: Fri Nov 13 21:47:33 2009 -0800 perf_event: Fix invalid type in ioctl definition u64 is invalid in userspace headers, including ioctl definitions; use __u64 instead Signed-off-by: Arjan van de Ven Cc: LKML-Reference: <20091113214733.7cd76be9@infradead.org> Signed-off-by: Ingo Molnar commit 811cb50baf63461ce0bdb234927046131fc7fa8b Author: Johannes Berg Date: Fri Nov 13 23:40:09 2009 +0100 tracing: Fix event format export For some reason the export of the event print format to userspace uses '#fmt' which breaks if the format string is anything but a plain string, for example if it is built with macros then the macro names are exported instead of their contents. Use "\"%s\"", fmt instead of "%s", #fmt to export the string and not the way it is built. 
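The difference between the two forms can be reproduced in plain C. This is a hedged sketch rather than the kernel's actual macro: EXPORT_FMT_OLD, EXPORT_FMT_NEW and compare_exports() are hypothetical names, and LOCAL_PR_FMT is defined here as a stand-in for a format string built from a macro. It relies only on the standard rule that an argument used with the `#` operator is stringified as spelled, without macro expansion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a format string that is built from a macro. */
#define LOCAL_PR_FMT "%s"

/* Old behaviour: '#fmt' stringifies the argument as spelled, unexpanded. */
#define EXPORT_FMT_OLD(buf, len, fmt) snprintf((buf), (len), "%s", #fmt)

/* New behaviour: print the expanded string literal, wrapped in quotes. */
#define EXPORT_FMT_NEW(buf, len, fmt) snprintf((buf), (len), "\"%s\"", fmt)

/* Exercise both variants and check what each one writes. */
void compare_exports(void)
{
	char old_out[32], new_out[32];

	EXPORT_FMT_OLD(old_out, sizeof(old_out), LOCAL_PR_FMT);
	EXPORT_FMT_NEW(new_out, sizeof(new_out), LOCAL_PR_FMT);

	assert(strcmp(old_out, "LOCAL_PR_FMT") == 0); /* macro name leaks out */
	assert(strcmp(new_out, "\"%s\"") == 0);       /* expanded contents */
}
```

The new form only works when the argument expands to a string, which is the case the message describes.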
    For example, in net/mac80211/driver-trace.h for the trace event
    drv_start there is:

      TP_printk( LOCAL_PR_FMT, LOCAL_PR_ARG )

    Which used to produce:

      print fmt: LOCAL_PR_FMT, REC->wiphy_name

    Now produces:

      print fmt: "%s", REC->wiphy_name

    Signed-off-by: Johannes Berg
    LKML-Reference: <20091113224009.GB23942@elte.hu>
    Signed-off-by: Steven Rostedt

commit b01c845f0f2e3f9e54e6a78d5d56895f5b95e818
Author: Roland Dreier
Date:   Fri Nov 13 14:38:26 2009 -0800

    x86: Remove CPU cache size output for non-Intel too

    As Dave Jones said about the output in intel_cacheinfo.c: "They
    aren't useful, and pollute the dmesg output a lot (especially on
    machines with many cores). Also the same information can be
    trivially found out from userspace."

    Give the generic display_cacheinfo() function the same treatment.

    Signed-off-by: Roland Dreier
    Acked-by: Dave Jones
    Cc: Mike Travis
    Cc: Andi Kleen
    Cc: Heiko Carstens
    Cc: Randy Dunlap
    Cc: Tejun Heo
    Cc: Greg Kroah-Hartman
    Cc: Yinghai Lu
    Cc: David Rientjes
    Cc: Steven Rostedt
    Cc: Rusty Russell
    Cc: Hidetoshi Seto
    Cc: Jack Steiner
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

commit 688bcaff291cf2fe2734e43f2793d4d05b850518
Author: Ingo Molnar
Date:   Sat Nov 14 01:12:47 2009 +0100

    hw-breakpoints: Fix build on !perf architectures

    The arch/alpha build fails with:

      In file included from tip/kernel/exit.c:52:
      tip/include/linux/hw_breakpoint.h: In function 'hw_breakpoint_addr':
      tip/include/linux/hw_breakpoint.h:21: error: 'struct perf_event' has no member named 'attr'
      [...]

    Move these helper inlines inside the CONFIG_HAVE_HW_BREAKPOINT ifdef.
Cc: Frederic Weisbecker Cc: Prasad LKML-Reference: <1258114575-32655-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 0388423dba2217b4e5b6c61690b0506d13b25a49 Author: Dave Jones Date: Fri Nov 13 15:30:00 2009 -0500 x86: Minimise printk spew from per-vendor init code In the default case where the kernel supports all CPU vendors, we currently print out a bunch of not useful messages on every system. 32-bit: KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD NSC Geode by NSC Cyrix CyrixInstead Centaur CentaurHauls Transmeta GenuineTMx86 Transmeta TransmetaCPU UMC UMC UMC UMC 64-bit: KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls Given that "what CPUs does the kernel support" isn't useful for the "support everything" case, we can suppress these printk's. Signed-off-by: Dave Jones LKML-Reference: <20091113203000.GA19160@redhat.com> Signed-off-by: Ingo Molnar commit 687b16fb617bd446439425a368ad7c7bbd202c73 Author: Frederic Weisbecker Date: Fri Nov 13 13:16:15 2009 +0100 hw-breakpoints: Provide an off-case for counter_arch_bp() If an arch doesn't support the hw breakpoints, counter_arch_bp() has no off case to cover the missing breakpoint info structure from the perf event. The result is a build error in non-x86 configs. Reported-by: Ingo Molnar Signed-off-by: Frederic Weisbecker Cc: Prasad LKML-Reference: <1258114575-32655-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar Cc: Prasad commit 8e13c7b772387f55dc05c6a0e5b30010c3c46ff9 Author: Thomas Gleixner Date: Mon Nov 9 15:21:41 2009 +0000 locking: Reduce ifdefs in kernel/spinlock.c With the Kconfig based inline decisions we can remove extra ifdefs in kernel/spinlock.c by creating the complex lockbreak functions as inlines which are inserted into the non inlined lock functions. No functional change. 
    Signed-off-by: Thomas Gleixner
    LKML-Reference: <20091109151428.548614772@linutronix.de>
    Acked-by: Heiko Carstens
    Reviewed-by: Ingo Molnar
    Acked-by: Peter Zijlstra

commit 6beb000923882f6204ea2cfcd932e568e900803f
Author: Thomas Gleixner
Date:   Mon Nov 9 15:21:34 2009 +0000

    locking: Make inlining decision Kconfig based

    Commit 892a7c67 (locking: Allow arch-inlined spinlocks) implements
    the selection of which lock functions are inlined based on defines
    in arch/.../spinlock.h:

      #define __always_inline__LOCK_FUNCTION

    Despite the name __always_inline__*, the lock functions can be built
    out of line depending on config options. Also, if the arch does not
    set some inline defines, the generic code might set them, again
    depending on config options.

    This makes it unnecessarily hard to figure out when and which lock
    functions are inlined. Aside from that, it makes it way harder and
    messier for -rt to manipulate the lock functions.

    Convert the inlining decision to CONFIG switches. Each lock function
    is inlined depending on CONFIG_INLINE_*. The configs implement the
    existing dependencies. The architecture code can select
    ARCH_INLINE_* to signal that it wants the corresponding lock
    function inlined.

    ARCH_INLINE_* is necessary as Kconfig ignores "depends on"
    restrictions when a config element is selected.

    No functional change.

    Signed-off-by: Thomas Gleixner
    LKML-Reference: <20091109151428.504477141@linutronix.de>
    Acked-by: Heiko Carstens
    Reviewed-by: Ingo Molnar
    Acked-by: Peter Zijlstra

commit 67178767b936fb47a3a5e88097cff41ccbda7acb
Author: Frederic Weisbecker
Date:   Fri Nov 13 10:06:34 2009 +0100

    tracing: Rename 'lockdep' event subsystem into 'lock'

    Lockdep events subsystem gathers various locking related events such
    as a request, release, contention or acquisition of a lock.
The name of this event subsystem is a bit of a misnomer since these events are not quite related to lockdep but more generally to locking, ie: these events are not reporting lock dependencies or possible deadlock scenario but pure locking events. Hence this rename. Signed-off-by: Frederic Weisbecker Acked-by: Peter Zijlstra Acked-by: Hitoshi Mitake Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Steven Rostedt Cc: Li Zefan LKML-Reference: <1258103194-843-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 8e9aa8f067d2dcd9457980ced618e1cffbcfba46 Author: Paul E. McKenney Date: Thu Nov 12 22:35:04 2009 -0800 rcu: Simplify association of forced quiescent states with grace periods The force_quiescent_state() function also took a snapshot of the ->completed field, which was as obnoxious as it was in rcu_sched_qs() and friends. So snapshot ->gpnum-1. Also, since the dyntick_record_completed() and dyntick_recall_completed() functions are now simple assignments that are independent of CONFIG_NO_HZ, and since their names are now misleading, get rid of them. Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12580941042308-git-send-email-> Signed-off-by: Ingo Molnar commit b32e9eb6ad29572b4451847d0e8227c9be2b6d69 Author: Paul E. McKenney Date: Thu Nov 12 22:35:03 2009 -0800 rcu: Accelerate callback processing on CPUs not detecting GP end An earlier fix for a race resulted in a situation where the CPUs other than the CPU that detected the end of the grace period would not process their callbacks until the next grace period started. This means that these other CPUs would unnecessarily demand that an extra grace period be started. 
This patch eliminates this extra grace period and speeds callback processing by propagating rsp->completed to the rcu_node structures in the case where the CPU detecting the end of the grace period sees no reason to start a new grace period. Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1258094104417-git-send-email-> Signed-off-by: Ingo Molnar commit fe3bcfe1f6c1fc4ea7706ac2d05e579fd9092682 Author: Peter Zijlstra Date: Thu Nov 12 15:55:29 2009 +0100 sched: More generic WAKE_AFFINE vs select_idle_sibling() Instead of only considering SD_WAKE_AFFINE | SD_PREFER_SIBLING domains also allow all SD_PREFER_SIBLING domains below a SD_WAKE_AFFINE domain to change the affinity target. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith LKML-Reference: <20091112145610.909723612@chello.nl> Signed-off-by: Ingo Molnar commit a50bde5130f65733142b32975616427d0ea50856 Author: Peter Zijlstra Date: Thu Nov 12 15:55:28 2009 +0100 sched: Cleanup select_task_rq_fair() Clean up the new affine to idle sibling bits while trying to grok them. Should not have any function differences. Signed-off-by: Peter Zijlstra Cc: Mike Galbraith LKML-Reference: <20091112145610.832503781@chello.nl> Signed-off-by: Ingo Molnar commit 15cd8812ab2ce62a2f779e93a8398bdad752291a Author: Dave Jones Date: Thu Nov 12 18:15:43 2009 -0500 x86: Remove the CPU cache size printk's They aren't really useful, and they pollute the dmesg output a lot (especially on machines with many cores). Also the same information can be trivially found out from userspace. Reported-by: Mike Travis Signed-off-by: Dave Jones Acked-by: H. 
Peter Anvin
    Cc: Andi Kleen
    Cc: Heiko Carstens
    Cc: Roland Dreier
    Cc: Randy Dunlap
    Cc: Tejun Heo
    Cc: Greg Kroah-Hartman
    Cc: Yinghai Lu
    Cc: David Rientjes
    Cc: Steven Rostedt
    Cc: Rusty Russell
    Cc: Hidetoshi Seto
    Cc: Jack Steiner
    Cc: Frederic Weisbecker
    LKML-Reference: <20091112231542.GA7129@redhat.com>
    Signed-off-by: Ingo Molnar

commit 761b1d26df542fd5eb348837351e4d2f3bc7bffe
Author: Hidetoshi Seto
Date:   Thu Nov 12 13:33:45 2009 +0900

    sched: Fix granularity of task_u/stime()

    Originally task_s/utime() were designed to return clock_t but were
    later changed to return cputime_t by the following commit:

      commit efe567fc8281661524ffa75477a7c4ca9b466c63
      Author: Christian Borntraeger
      Date:   Thu Aug 23 15:18:02 2007 +0200

    It only changed the type of the return value, but not the
    implementation. As a result, the granularity of task_s/utime() is
    still that of clock_t, not that of cputime_t.

    So using task_s/utime() in __exit_signal() makes the values
    accumulated in the signal struct rounded and coarse grained.

    This patch removes the casts to clock_t in task_u/stime(), to keep
    the granularity of cputime_t over the calculation.

    v2:
      Use div_u64() to avoid error "undefined reference to `__udivdi3`"
      on some 32bit systems.

    Signed-off-by: Hidetoshi Seto
    Acked-by: Peter Zijlstra
    Cc: xiyou.wangcong@gmail.com
    Cc: Spencer Candland
    Cc: Oleg Nesterov
    Cc: Stanislaw Gruszka
    LKML-Reference: <4AFB9029.9000208@jp.fujitsu.com>
    Signed-off-by: Ingo Molnar

commit 055a00865dcfc8e61f3cbefbb879c9577bd36ae5
Author: Mike Galbraith
Date:   Thu Nov 12 11:07:44 2009 +0100

    sched: Fix/add missing update_rq_clock() calls

    kthread_bind(), migrate_task() and sched_fork() were missing
    updates, and try_to_wake_up() was updating after having already
    used the stale clock.

    Aside from preventing potential latency hits, there's a side benefit
    in that early boot printk time stamps become monotonic.
Signed-off-by: Mike Galbraith Acked-by: Peter Zijlstra LKML-Reference: <1258020464.6491.2.camel@marge.simson.net> Signed-off-by: Ingo Molnar LKML-Reference: commit db48cccc7c709ccfa7cb4ac702bc27c216bffee7 Author: Hiroshi Shimamoto Date: Thu Nov 12 11:25:34 2009 +0900 perf_event, x86: Annotate init functions and data Annotate init functions and data with __init and __initconst. Signed-off-by: Hiroshi Shimamoto Cc: Peter Zijlstra Cc: Stephane Eranian LKML-Reference: <4AFB721E.8070203@ct.jp.nec.com> Signed-off-by: Ingo Molnar commit cffd377e5879ea58522224a785a083f201afd80e Author: Hidetoshi Seto Date: Thu Nov 12 15:52:40 2009 +0900 x86, mce: Fix __init annotations The intel_init_thermal() is called from resume path, so it cannot be marked as __init. OTOH mce_banks_init() is only called from __mcheck_cpu_cap_init() which is marked as __cpuinit, so it can be also marked as __cpuinit. Signed-off-by: Hidetoshi Seto Acked-by: Yong Wang LKML-Reference: <4AFBB0B8.2070501@jp.fujitsu.com> Signed-off-by: Ingo Molnar commit 8b2a5dac7859dd1954095fce8b6445c3ceb36ef6 Author: Steven Rostedt Date: Wed Nov 11 19:36:03 2009 -0500 tracing: do not disable interrupts for trace_clock_local Disabling interrupts in trace_clock_local takes quite a performance hit to the recording of traces. 
    Using perf top we see:

    ------------------------------------------------------------------------------
       PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
    ------------------------------------------------------------------------------

                 samples    pcnt   kernel function
                 _______   _____   _______________

                 2842.00 - 40.4% : trace_clock_local
                 1043.00 - 14.8% : rb_reserve_next_event
                  784.00 - 11.1% : ring_buffer_lock_reserve
                  600.00 -  8.5% : __rb_reserve_next
                  579.00 -  8.2% : rb_end_commit
                  440.00 -  6.3% : ring_buffer_unlock_commit
                  290.00 -  4.1% : ring_buffer_producer_thread [ring_buffer_benchmark]
                  155.00 -  2.2% : debug_smp_processor_id
                  117.00 -  1.7% : trace_recursive_unlock
                  103.00 -  1.5% : ring_buffer_event_data
                   28.00 -  0.4% : do_gettimeofday
                   22.00 -  0.3% : _spin_unlock_irq
                   14.00 -  0.2% : native_read_tsc
                   11.00 -  0.2% : getnstimeofday

    Where trace_clock_local is 40% of the tracing, and the time for
    recording a trace according to ring_buffer_benchmark is 210ns. After
    converting the interrupts to preemption disabling we have from perf
    top:

    ------------------------------------------------------------------------------
       PerfTop:    1084 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
    ------------------------------------------------------------------------------

                 samples    pcnt   kernel function
                 _______   _____   _______________

                 1277.00 - 16.8% : native_read_tsc
                 1148.00 - 15.1% : rb_reserve_next_event
                  896.00 - 11.8% : ring_buffer_lock_reserve
                  688.00 -  9.1% : __rb_reserve_next
                  664.00 -  8.8% : rb_end_commit
                  563.00 -  7.4% : ring_buffer_unlock_commit
                  508.00 -  6.7% : _spin_unlock_irq
                  365.00 -  4.8% : debug_smp_processor_id
                  321.00 -  4.2% : trace_clock_local
                  303.00 -  4.0% : ring_buffer_producer_thread [ring_buffer_benchmark]
                  273.00 -  3.6% : native_sched_clock
                  122.00 -  1.6% : trace_recursive_unlock
                  113.00 -  1.5% : sched_clock
                  101.00 -  1.3% : ring_buffer_event_data
                   53.00 -  0.7% : tick_nohz_stop_sched_tick

    Where trace_clock_local drops from 40% to only taking 4% of the total
    time. The trace time also goes from 210ns down to 179ns (31ns).

    I talked with Peter Zijlstra about the impact that sched_clock may
    have without having interrupts disabled, and he told me that if a
    timer interrupt comes in, sched_clock may report a wrong time.

    Balancing a seldom incorrect timestamp with a 15% performance boost,
    I'll take the performance boost.

    Acked-by: Peter Zijlstra
    Signed-off-by: Steven Rostedt

commit a6f0eb6adc42e5eed3f35af99c61c0e411b16f8e
Author: Steven Rostedt
Date:   Wed Nov 11 17:14:07 2009 -0500

    ring-buffer: Add multiple iterations between benchmark timestamps

    The ring_buffer_benchmark does a gettimeofday after every write to
    the ring buffer in its measurements. This adds the overhead of the
    call to gettimeofday to the measurements and does not give an
    accurate picture of the length of time it takes to record a trace.

    This was first noticed with perf top:

    ------------------------------------------------------------------------------
       PerfTop:     679 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
    ------------------------------------------------------------------------------

                 samples    pcnt   kernel function
                 _______   _____   _______________

                 1673.00 - 27.8% : trace_clock_local
                  806.00 - 13.4% : do_gettimeofday
                  590.00 -  9.8% : rb_reserve_next_event
                  554.00 -  9.2% : native_read_tsc
                  431.00 -  7.2% : ring_buffer_lock_reserve
                  365.00 -  6.1% : __rb_reserve_next
                  355.00 -  5.9% : rb_end_commit
                  322.00 -  5.4% : getnstimeofday
                  268.00 -  4.5% : ring_buffer_unlock_commit
                  262.00 -  4.4% : ring_buffer_producer_thread [ring_buffer_benchmark]
                  113.00 -  1.9% : read_tsc
                   91.00 -  1.5% : debug_smp_processor_id
                   69.00 -  1.1% : trace_recursive_unlock
                   66.00 -  1.1% : ring_buffer_event_data
                   25.00 -  0.4% : _spin_unlock_irq

    And the length of each write to the ring buffer measured at 310ns.

    This patch adds a new module parameter called "write_interval" which
    is defaulted to 50. This is the number of writes performed between
    timestamps.
    After this patch perf top shows:

    ------------------------------------------------------------------------------
       PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
    ------------------------------------------------------------------------------

                 samples    pcnt   kernel function
                 _______   _____   _______________

                 2842.00 - 40.4% : trace_clock_local
                 1043.00 - 14.8% : rb_reserve_next_event
                  784.00 - 11.1% : ring_buffer_lock_reserve
                  600.00 -  8.5% : __rb_reserve_next
                  579.00 -  8.2% : rb_end_commit
                  440.00 -  6.3% : ring_buffer_unlock_commit
                  290.00 -  4.1% : ring_buffer_producer_thread [ring_buffer_benchmark]
                  155.00 -  2.2% : debug_smp_processor_id
                  117.00 -  1.7% : trace_recursive_unlock
                  103.00 -  1.5% : ring_buffer_event_data
                   28.00 -  0.4% : do_gettimeofday
                   22.00 -  0.3% : _spin_unlock_irq
                   14.00 -  0.2% : native_read_tsc
                   11.00 -  0.2% : getnstimeofday

    do_gettimeofday dropped from 13% usage to a mere 0.4%! (using the
    default 50 interval)

    The measurement for each timestamp went from 310ns to 210ns. That's
    100ns (1/3rd) overhead that the gettimeofday call was introducing.

    Signed-off-by: Steven Rostedt

commit a646365cc330b5aaf4452c91f61b1e0d1acf68d0
Author: Roel Kluin
Date:   Wed Nov 11 22:26:35 2009 +0100

    tracing: Fix return value of tracing_stats_read()

    The function tracing_stats_read() mistakenly returns ENOMEM instead
    of the negative value -ENOMEM.

    Signed-off-by: Roel Kluin
    LKML-Reference: <4AFB2C0B.50605@gmail.com>
    Signed-off-by: Steven Rostedt

commit 0e0fc1c23e04c15e814763f2b366e92d87d8b95d
Author: Paul E. McKenney
Date:   Wed Nov 11 11:28:06 2009 -0800

    rcu: Mark init-time-only rcu_bootup_announce() as __init

    Because rcu_bootup_announce() is used only at boot time, mark it as
    __init, presumably so that its memory can be reclaimed.

    Suggested-by: Joe Perches
    Signed-off-by: Paul E.
McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <20091111192806.GA10073@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar commit b285fab4185a9b3db953726f0dd9d343a6e389db Author: Avi Cohen Stuart Date: Tue Nov 10 22:43:46 2009 +0100 pcmcia: correct handling for Zoomed Video registers in topic.h Fix handling of Zoomed Video Registers in the Topic pcmcia controller ( http://bugzilla.kernel.org/show_bug.cgi?id=14581 ). The information has been retrieved from the Topic manual which can be obtained from Toshiba. The Zoomed Video is used with PCMCIA Cards like the Margi DVD-to-Go. [linux@dominikbrodowski.net: whitespace & commit message fix] Signed-off-by: Avi Cohen Stuart Signed-off-by: Dominik Brodowski commit e657ea17ef2d7f364e5c2625157f6cc0584ac7ad Author: Randy Dunlap Date: Wed Nov 11 09:31:07 2009 -0800 pcmcia: fix printk formats Fix printk format warnings on sizeof() [size_t] arguments. drivers/char/pcmcia/cm4040_cs.c:267: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'size_t' drivers/char/pcmcia/cm4040_cs.c:272: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'size_t' CC: Harald Welte Signed-off-by: Randy Dunlap Signed-off-by: Dominik Brodowski commit b18485e7acfe1a634615d1c628ef644c0d58d472 Author: FUJITA Tomonori Date: Thu Nov 12 00:03:28 2009 +0900 swiotlb: Remove the swiotlb variable usage POWERPC doesn't expect it to be used. 
This fixes the linux-next build failure reported by Stephen Rothwell: lib/swiotlb.c: In function 'setup_io_tlb_npages': lib/swiotlb.c:114: error: 'swiotlb' undeclared (first use in this function) Reported-by: Stephen Rothwell Signed-off-by: FUJITA Tomonori Cc: peterz@infradead.org LKML-Reference: <20091112000258F.fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit ce6b5d768c79b9d5dd6345c033bae781d5ca9b8e Author: Yong Wang Date: Wed Nov 11 15:51:25 2009 +0800 x86: Mark the thermal init functions __init Mark the thermal init functions __init so that the init memory can be freed. Signed-off-by: Yong Wang LKML-Reference: <20091111075125.GA17900@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar commit 5d7bdab75cd56d2bdc0986ae5546be3b09fea70a Author: Michael Cree Date: Wed Nov 11 20:43:03 2009 +1300 perf tools: Test -fstack-protector-all compiler option for inclusion in CFLAGS Some architectures (e.g. Alpha) do not support the -fstack-protector-all compiler option and the use of the option with -Werror causes the compiler to abort and the build fails. Test that the compiler supports -fstack-protector-all before inclusion in CFLAGS. Signed-off-by: Michael Cree Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <20091111074302.GA3728@omega> Signed-off-by: Ingo Molnar commit 9e827dd00a94136b944a538bede67c944d0b740a Author: Frederic Weisbecker Date: Wed Nov 11 04:51:07 2009 +0100 perf tools: Bring linear set of section headers for features Build a set of section headers for features right after the datas. Each implemented feature will have one of such section header that provides the offset and the size of the data manipulated by the feature. The trace informations have moved after the data and are recorded on exit time. 
    The new layout is as follows:

    -----------------------       ___
    [ magic               ]        |
    [ header size         ]        |
    [ attr size           ]        |
    [ attr content offset ]        |
    [ attr content size   ]        |
    [ data offset         ]  File Headers
    [ data size           ]        |
    [ event_types offset  ]        |
    [ event_types size    ]        |
    [ feature bitmap      ]        v

    [ attr section        ]
    [ events section      ]

                                  ___
    [         X           ]        |
    [         X           ]        |
    [         X           ]      Datas
    [         X           ]        |
    [         X           ]        v

                                  ___
    [ Feature 1 offset    ]        |
    [ Feature 1 size      ]  Features headers
    [ Feature 2 offset    ]        |
    [ Feature 2 size      ]        v

    [ Feature 1 content   ]
    [ Feature 2 content   ]
    -----------------------

    We have as many feature's section headers as we have features in use
    for the current file.

    Say Feat 1 and Feat 3 are used by the file, but not Feat 2. Then the
    feature headers will be like follows:

    [ Feature 1 offset    ]        |
    [ Feature 1 size      ]  Features headers
    [ Feature 3 offset    ]        |
    [ Feature 3 size      ]        v

    There is no hole to cover Feature 2 that is not in use here. We only
    need to cover the needed headers in order, from the lowest feature
    bit to the highest.

    Currently we have two features: HEADER_TRACE_INFO and
    HEADER_BUILD_ID. Both have their contents that follow the feature
    headers. Putting the contents right after the feature headers is not
    mandatory though. While we keep the feature headers right after the
    data and in order, their offsets can point everywhere. We have just
    put the two above feature contents in the end of the file for
    convenience.

    The purpose of this layout change is to have a file format that
    scales while keeping it simple: having such linear feature headers
    is less error prone wrt forward/backward compatibility as the
    content of a feature can be put anywhere, its location can even
    change by the time, it's fine because its headers will tell where it
    is. And we know how to find these headers, following the above
    rules.
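The lookup rule described above (headers packed in feature-bit order, with no hole for unused features) can be sketched as follows. This is an illustration under stated assumptions, not the perf tools code: struct feat_section, feat_header_index() and feat_header_offset() are hypothetical names, the header is assumed to be a pair of 64-bit offset/size fields, and the feature bitmap is assumed to fit in a single 64-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* One header per feature in use: where its content lives and its size. */
struct feat_section {
	uint64_t offset;
	uint64_t size;
};

/*
 * Position of a feature's header among the packed headers, or -1 if the
 * feature bit is not set. Only the set bits below 'feat' occupy a slot.
 */
int feat_header_index(uint64_t bitmap, unsigned int feat)
{
	int idx = 0;
	unsigned int bit;

	assert(feat < 64);
	if (!(bitmap & (UINT64_C(1) << feat)))
		return -1;
	for (bit = 0; bit < feat; bit++)
		if (bitmap & (UINT64_C(1) << bit))
			idx++;
	return idx;
}

/* File offset of that header: the headers start right after the data. */
uint64_t feat_header_offset(uint64_t data_offset, uint64_t data_size, int idx)
{
	return data_offset + data_size +
	       (uint64_t)idx * sizeof(struct feat_section);
}
```

With features 1 and 3 in use, feature 1 maps to slot 0 and feature 3 to slot 1, while feature 2 has no slot at all, matching the "no hole" example in the message.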
Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Hitoshi Mitake LKML-Reference: <1257911467-28276-6-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 3e13ab2d83b6867a20663c73c184f29c2fde1558 Author: Frederic Weisbecker Date: Wed Nov 11 04:51:06 2009 +0100 perf tools: Use perf_header__set/has_feat whenever possible And drop the alternate checks/sets using set_bit or other kind of helpers. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Hitoshi Mitake LKML-Reference: <1257911467-28276-5-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 4778d2e4f410c6eea32f594cb2be9590bcb28b84 Author: Frederic Weisbecker Date: Wed Nov 11 04:51:05 2009 +0100 perf tools: Read the build-ids from the header layer Keep the build-ids reading implementation in the data mapping but move its call to the headers so that we have a better control on it (offset seeking, size passing, etc..). Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Hitoshi Mitake LKML-Reference: <1257911467-28276-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 57f395a7eabb913d3605d7392be5bdb0837c9f3d Author: Frederic Weisbecker Date: Wed Nov 11 04:51:04 2009 +0100 perf tools: Split up build id saving into fetch and write We are saving the build id once we stop the profiling. And only after doing that we know if we need to set that feature in the header through the feature bitmap. But if we want a proper feature support in the headers, using a rule of offset/size pairs in sections, we need to know in advance how many features we need to set in the headers, so that we can reserve rooms for their section headers. 
The current state doesn't allow that, as it forces us to first save the build-ids to the file right after the datas instead of planning any structured layout. That's why this splits up the build-ids processing in two parts: one that fetches the build-ids from the Dso objects, and one that saves them into the file. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Hitoshi Mitake LKML-Reference: <1257911467-28276-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 8671dab9d5b2f0b444b8d09792384dccbfd43d14 Author: Frederic Weisbecker Date: Wed Nov 11 04:51:03 2009 +0100 perf tools: Move the build-id storage operations to headers So that it makes easier to control it. Especially because we plan to give it a feature section. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Hitoshi Mitake LKML-Reference: <1257911467-28276-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit de8967214d8ce536161a1ad6538ad1cb82e7428d Author: Frederic Weisbecker Date: Wed Nov 11 04:51:02 2009 +0100 perf tools: Synthetize the targeted process Don't forget to also synthetize the targeted process from perf record or we'll miss its dso in the events and then we won't be able to deal with its build-id. We are missing it because it is created after the existing synthetized tasks but before the counters are enabled and can send its mapping event. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Hitoshi Mitake LKML-Reference: <1257911467-28276-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit c64ac3ce06558e534aec62b1fadeb0a3f111dac1 Author: Paul E. 
McKenney Date: Tue Nov 10 13:37:22 2009 -0800 rcu: Simplify association of quiescent states with grace periods The rdp->passed_quiesc_completed fields are used to properly associate the recorded quiescent state with a grace period. It is OK to wrongly associate a given quiescent state with a preceding grace period, but it is fatal to associate a given quiescent state with a grace period that begins after the quiescent state occurred. Grace periods are numbered, and the following fields track them: o ->gpnum is the number of the grace period currently in progress, or the number of the last grace period to complete if no grace period is currently in progress. o ->completed is the number of the last grace period to have completed. These two fields are equal if there is no grace period in progress, otherwise ->gpnum is one greater than ->completed. But the rdp->passed_quiesc_completed field compared against ->completed, and if equal, the quiescent state is presumed to count against the current grace period. The earlier code copied rdp->completed to rdp->passed_quiesc_completed, which has been made to work, but is error-prone. In contrast, copying one less than rdp->gpnum is guaranteed safe, because rdp->gpnum is not incremented until after the start of the corresponding grace period. At the end of the grace period, when ->completed has incremented, then any quiescent periods recorded previously will be discarded. Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12578890421011-git-send-email-> Signed-off-by: Ingo Molnar commit 4bcfe055030d9e953945def3864f7e6997b27782 Author: Paul E. 
McKenney
Date:   Tue Nov 10 13:37:21 2009 -0800

    rcu: Rename dynticks_completed to completed_fqs

    This field is used whether or not CONFIG_NO_HZ is set, so the old
    name of ->dynticks_completed is quite misleading.

    Change to ->completed_fqs, given that it is the value that
    force_quiescent_state() is trying to drive the ->completed field
    away from.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference: <12578890423298-git-send-email->
    Signed-off-by: Ingo Molnar

commit 956539b75921f561c0956c22d37320780e8b4ba1
Author: Paul E. McKenney
Date:   Tue Nov 10 13:37:20 2009 -0800

    rcu: Enable synchronize_sched_expedited() fastpath

    This patch adds a counter increment to enable tasks to actually
    take the synchronize_sched_expedited() function's fastpath.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference: <1257889042435-git-send-email->
    Signed-off-by: Ingo Molnar

commit dbe01350fa8ce0c11948ab7d6be71a4d901be151
Author: Paul E. McKenney
Date:   Tue Nov 10 13:37:19 2009 -0800

    rcu: Remove inline from forward-referenced functions

    Some variants of gcc are reputed to dislike forward references to
    functions declared "inline". Remove the "inline" keyword from such
    functions.

    Signed-off-by: Paul E.
McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12578890422402-git-send-email-> Signed-off-by: Ingo Molnar commit 200a9ae2801bc725f2c41ab13f6e0fb1610d2fb6 Author: Dimitri Sivanich Date: Tue Nov 10 13:58:35 2009 -0600 x86: Remove asm/apicnum.h arch/x86/include/asm/apicnum.h is not referenced anywhere anymore. Its definitions appear in apicdef.h. Remove it. Signed-off-by: Dimitri Sivanich Acked-by: Cyrill Gorcunov Acked-by: Mike Travis LKML-Reference: <20091110195835.GA4393@sgi.com> Signed-off-by: Ingo Molnar commit ffd44db5f02af32bcc25a8eb5981bf02a141cdab Author: Peter Zijlstra Date: Tue Nov 10 20:12:01 2009 +0100 sched: Make sure task has correct sched_class after policy change From the code in rt_mutex_setprio(), it is evident that the intention is that task's with a RT 'prio' value as a consequence of receiving a PI boost also have their 'sched_class' field set to '&rt_sched_class'. However, Peter noticed that the code in __setscheduler() could result in this intention being frustrated. Fix it. Reported-by: Peter Williams Signed-off-by: Peter Zijlstra Cc: Mike Galbraith LKML-Reference: <1257880321.4108.457.camel@laptop> Signed-off-by: Ingo Molnar commit c5659b74f052150791750234f92dcfb29d27efa5 Author: Hitoshi Mitake Date: Wed Nov 11 00:04:02 2009 +0900 perf bench: Improve sched-message.c with more comfortable output This patch improves sched-message.c with more comfortable output. Change points are comment style description and formatting numerical values and its units. Example: | % perf bench sched messaging | # Running sched/messaging benchmark... 
    | # 20 sender and receiver processes per group
    | # 10 groups == 400 processes run
    |
    | Total time: 1.490 [sec]

    Signed-off-by: Hitoshi Mitake
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference: <1257865442-20252-4-git-send-email-mitake@dcl.info.waseda.ac.jp>
    Signed-off-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Paul Mackerras

commit ff676b193a401b23c84a79a7ec06559f3eaae917
Author: Hitoshi Mitake
Date:   Wed Nov 11 00:04:01 2009 +0900

    perf bench: Improve sched-pipe.c with more comfortable output

    This patch improves sched-pipe.c with more comfortable output.
    The changes are comment-style descriptions and formatted numerical
    values with their units.

    Example:
    | % ./perf bench sched pipe
    | # Running sched/pipe benchmark...
    | # Extecuted 1000000 pipe operations between two tasks
    |
    | Total time:5.822 [sec]
    |
    | 5.822553 usecs/op
    | 171745 ops/sec

    Signed-off-by: Hitoshi Mitake
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference: <1257865442-20252-3-git-send-email-mitake@dcl.info.waseda.ac.jp>
    Signed-off-by: Ingo Molnar

commit 79e295d4bd0f524257299e7c4e42f643f21abcc2
Author: Hitoshi Mitake
Date:   Wed Nov 11 00:04:00 2009 +0900

    perf bench: Improve builtin-bench.c for more friendly output

    This patch makes the output of perf bench more friendly.

    The current style of output, which keeps the user waiting and then
    prints everything at once when the benchmark finishes, may confuse
    users. So I improved it:

    | % perf bench sched messaging
    | # Running sched/messaging benchmark...
<- printed right after invocation | # 20 sender and receiver processes per group | # 10 groups == 400 processes run | | Total time: 1.476 [sec] Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1257865442-20252-2-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit b4941a9a606f0131559cc040b64e8437ac7b32c5 Author: Ingo Molnar Date: Tue Nov 10 14:37:58 2009 +0100 x86: Add iommu_init to x86_init_ops, fix build Most of the time x86_init.h is included in pci-dma.c - but not always, leading to this rare build failure: arch/x86/kernel/pci-dma.c:296: error: 'x86_init' undeclared (first use in this function) So include asm/x86_init.h explicitly. Cc: FUJITA Tomonori Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-2-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 8d8d61aadb9d8cce07f7dcdb77a4c20a25d36d07 Author: Hitoshi Mitake Date: Tue Nov 10 20:50:55 2009 +0900 perf bench: Modify command-list.txt for the entry of perf-bench This patch modifies command-list.txt for the entry of perf-bench. So perf will show 'bench' in command list. Example: % perf usage: perf [--version] [--help] COMMAND [ARGS] The most commonly used perf commands are: annotate Read perf.data (created by perf record) and display annotated code bench General framework for benchmark suites ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ list List all symbolic event types probe Define new dynamic tracepoints record Run a command and record its profile into perf.data report Read perf.data (created by perf record) and display the profile sched Tool to trace/measure scheduler properties (latencies) stat Run a command and gather performance counter statistics timechart Tool to visualize total system behavior during a workload top System profiling tool. 
trace Read perf.data (created by perf record) and display trace output See 'perf help COMMAND' for more information on a specific command. Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1257853855-28934-4-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 9fbc04f2493929a69fd9e53b5fb53c127d7950d5 Author: Hitoshi Mitake Date: Tue Nov 10 20:50:54 2009 +0900 perf bench: Add new document about perf-bench This patch adds new document about perf-bench. Man page and html will be provided for user. Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1257853855-28934-3-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 606bc1e18d346fc7d7fb333909cc95b06b1ca5b1 Author: Ingo Molnar Date: Tue Nov 10 20:50:53 2009 +0900 perf bench: Clean up bench/bench.h Clean up initializers in bench.h: - No need to break the line for function prototypes, they are more readable in a single line. (even if checkpatch complains about it - We try to align definitions / structure fields vertically, to make it all a bit more readable. Signed-off-by: Ingo Molnar Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1257853855-28934-2-git-send-email-mitake@dcl.info.waseda.ac.jp> commit 72d03802b8b5c841ab1da82bff0652628cbadf60 Author: FUJITA Tomonori Date: Tue Nov 10 21:35:17 2009 +0900 x86, 32-bit: Fix swiotlb boot crash Ingo Molnar reported this boot crash: [ 8.655620] pata_amd 0000:00:06.0: version 0.4.1 [ 8.660286] BUG: unable to handle kernel NULL pointer dereference at 00000034 [ 8.663572] IP: [] dma_supported+0x3b/0xa4 [ 8.663572] *pde = 00000000 Initialize dma_ops properly in the 32-bit case. 
    Signed-off-by: Ingo Molnar

commit 75f1cdf1dda92cae037ec848ae63690d91913eac
Author: FUJITA Tomonori
Date:   Tue Nov 10 19:46:20 2009 +0900

    x86: Handle HW IOMMU initialization failure gracefully

    If HW IOMMU initialization fails (Intel VT-d often does this,
    typically due to BIOS bugs), we fall back to nommu. This doesn't
    work for most systems, since nowadays they have more than 4GB of
    memory, so we must use swiotlb instead of nommu.

    The problem is that it's too late to initialize swiotlb when HW
    IOMMU initialization fails. We need to allocate swiotlb memory
    earlier from the bootmem allocator. Chris explained the issue in
    detail:

      http://marc.info/?l=linux-kernel&m=125657444317079&w=2

    The current x86 IOMMU initialization sequence is too complicated
    and handling the above issue makes it more hacky.

    This patch changes the x86 IOMMU initialization sequence to handle
    the above issue cleanly.

    The new x86 IOMMU initialization sequence is:

    1. we initialize the swiotlb (and set swiotlb to 1) in the case
       of (max_pfn > MAX_DMA32_PFN && !no_iommu). dma_ops is set to
       swiotlb_dma_ops or nommu_dma_ops. If swiotlb usage is forced by
       the boot option, we finish here.

    2. we call the detection functions of all the IOMMUs.

    3. the detection function sets x86_init.iommu.iommu_init to the
       IOMMU initialization function (so we can avoid calling the
       initialization functions of all the IOMMUs needlessly).

    4. if the IOMMU initialization function doesn't need swiotlb, it
       sets swiotlb to zero (e.g. when the initialization is
       successful).

    5. if we find that swiotlb is set to zero, we free the swiotlb
       resource.
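The five steps above can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct boot_sketch`, `hw_init_ok()`, `hw_init_broken()` and `iommu_boot_sketch()` are hypothetical names invented here, not kernel symbols, and the real code works with bootmem and `dma_ops` rather than two booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not the kernel's symbols. */
struct boot_sketch {
    bool swiotlb;           /* mirrors the "swiotlb" flag from the steps */
    bool bounce_reserved;   /* is the early bounce-buffer reservation held? */
    int (*iommu_init)(struct boot_sketch *); /* hook filled in by detection */
};

/* A HW IOMMU whose init succeeds: it no longer needs swiotlb (step 4). */
static int hw_init_ok(struct boot_sketch *b)
{
    b->swiotlb = false;
    return 0;
}

/* A HW IOMMU whose init fails (e.g. a BIOS bug): swiotlb stays as it was. */
static int hw_init_broken(struct boot_sketch *b)
{
    (void)b;
    return -1;
}

/* Walks steps 1-5; returns true when the system ends up using swiotlb. */
static bool iommu_boot_sketch(bool big_mem, bool no_iommu, bool forced,
                              int (*detected_init)(struct boot_sketch *))
{
    struct boot_sketch b = { false, false, NULL };

    /* Step 1: reserve early, while the boot-time allocator is usable. */
    if (forced || (big_mem && !no_iommu)) {
        b.swiotlb = true;
        b.bounce_reserved = true;
    }
    if (forced)
        return true;

    /* Steps 2-3: detection only records the init function it found. */
    b.iommu_init = detected_init;

    /* Step 4: call the single recorded init function, if there is one. */
    if (b.iommu_init)
        (void)b.iommu_init(&b);

    /* Step 5: hand the reservation back when it turned out unneeded. */
    if (!b.swiotlb && b.bounce_reserved)
        b.bounce_reserved = false;

    /* The reservation is held exactly when swiotlb is still wanted. */
    assert(b.swiotlb == b.bounce_reserved);
    return b.swiotlb;
}
```

In this model, calling `iommu_boot_sketch(true, false, false, hw_init_broken)` leaves swiotlb in use, which is the failure case the patch is meant to handle: the bounce buffer was reserved before the hardware IOMMU had a chance to fail.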
Signed-off-by: FUJITA Tomonori Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-10-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit ad32e8cb86e7894aac51c8963eaa9f36bb8a4e14 Author: FUJITA Tomonori Date: Tue Nov 10 19:46:19 2009 +0900 swiotlb: Defer swiotlb init printing, export swiotlb_print_info() This enables us to avoid printing swiotlb memory info when we initialize swiotlb. After swiotlb initialization, we could find that we don't need swiotlb. This patch removes the code to print swiotlb memory info in swiotlb_init() and exports the function to do that. Signed-off-by: FUJITA Tomonori Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com Cc: tony.luck@intel.com Cc: benh@kernel.crashing.org LKML-Reference: <1257849980-22640-9-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: merge up conflict ] Signed-off-by: Ingo Molnar commit 5740afdb68abadc473fd5392df733558a58c1254 Author: FUJITA Tomonori Date: Tue Nov 10 19:46:18 2009 +0900 swiotlb: Add swiotlb_free() function swiotlb_free() function frees all allocated memory for swiotlb. We need to initialize swiotlb before IOMMU initialization (x86 and powerpc needs to allocate memory from bootmem allocator). If IOMMU initialization is successful, we need to free swiotlb resource (don't want to waste 64MB). Signed-off-by: FUJITA Tomonori Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-8-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: build fix for the !CONFIG_SWIOTLB case ] Signed-off-by: Ingo Molnar commit 9f993ac3f708b661207ed7de521f245586217a68 Author: FUJITA Tomonori Date: Tue Nov 10 19:46:17 2009 +0900 bootmem: Add free_bootmem_late() Add a new function for freeing bootmem after the bootmem allocator has been released and the unreserved pages given to the page allocator. 
This allows us to reserve bootmem and then release it if we later discover it was not needed. ( This new API will be used by the swiotlb code to recover a significant amount of RAM (64MB). ) Signed-off-by: FUJITA Tomonori Acked-by: Pekka Enberg Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com Cc: hannes@cmpxchg.org Cc: tj@kernel.org Cc: akpm@linux-foundation.org Cc: Linus Torvalds LKML-Reference: <1257849980-22640-7-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 9d5ce73a64be2be8112147a3e0b551ad9cd1247b Author: FUJITA Tomonori Date: Tue Nov 10 19:46:16 2009 +0900 x86: intel-iommu: Convert detect_intel_iommu to use iommu_init hook This changes detect_intel_iommu() to set intel_iommu_init() to iommu_init hook if detect_intel_iommu() finds the IOMMU. Signed-off-by: FUJITA Tomonori Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-6-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: build fix for the !CONFIG_DMAR case ] Signed-off-by: Ingo Molnar commit ea1b0d3945c7374849235b6ecaea1191ee1d9d50 Author: FUJITA Tomonori Date: Tue Nov 10 19:46:15 2009 +0900 x86: amd_iommu: Convert amd_iommu_detect() to use iommu_init hook This changes amd_iommu_detect() to set amd_iommu_init to iommu_init hook if amd_iommu_detect() finds the AMD IOMMU. We can kill the code to check if we found the IOMMU in amd_iommu_init() since amd_iommu_detect() sets amd_iommu_init() only when it found the IOMMU. 
Signed-off-by: FUJITA Tomonori Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-5-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit de957628ce7c84764ff41331111036b3ae5bad0f Author: FUJITA Tomonori Date: Tue Nov 10 19:46:14 2009 +0900 x86: GART: Convert gart_iommu_hole_init() to use iommu_init hook This changes gart_iommu_hole_init() to set gart_iommu_init() to iommu_init hook if gart_iommu_hole_init() finds the GART IOMMU. We can kill the code to check if we found the IOMMU in gart_iommu_init() since gart_iommu_hole_init() sets gart_iommu_init() only when it found the IOMMU. Signed-off-by: FUJITA Tomonori Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com LKML-Reference: <1257849980-22640-4-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit d7b9f7be216b04ff9d108f856bc03d96e7b3439c Author: FUJITA Tomonori Date: Tue Nov 10 19:46:13 2009 +0900 x86: Calgary: Convert detect_calgary() to use iommu_init hook This changes detect_calgary() to set init_calgary() to iommu_init hook if detect_calgary() finds the Calgary IOMMU. We can kill the code to check if we found the IOMMU in init_calgary() since detect_calgary() sets init_calgary() only when it found the IOMMU. Signed-off-by: FUJITA Tomonori Acked-by: Muli Ben-Yehuda Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com LKML-Reference: <1257849980-22640-3-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit d07c1be0693e0902d743160b8b638585b808f8ac Author: FUJITA Tomonori Date: Tue Nov 10 19:46:12 2009 +0900 x86: Add iommu_init to x86_init_ops We call the detections functions of all the IOMMUs then all their initialization functions. The latter is pointless since we don't detect multiple different IOMMUs. What we need to do is calling the initialization function of the detected IOMMU. 
    This adds the iommu_init hook to x86_init_ops so that an IOMMU
    detection function can set its initialization function to the
    hook.

    Signed-off-by: FUJITA Tomonori
    Cc: chrisw@sous-sol.org
    Cc: dwmw2@infradead.org
    Cc: joerg.roedel@amd.com
    Cc: muli@il.ibm.com
    LKML-Reference: <1257849980-22640-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
    Signed-off-by: Ingo Molnar

commit 59d8eb53ea9947db7cad8ebc31b0fb54f23a9851
Author: Frederic Weisbecker
Date:   Tue Nov 10 11:03:12 2009 +0100

    hw-breakpoints: Wrap in the KVM breakpoint active state check

    Wrap in the cpu dr7 check that tells if we have active breakpoints
    that need to be restored in the cpu.

    This wrapper makes the check more self-explanatory and also
    reusable for further uses.

    Reported-by: Jan Kiszka
    Signed-off-by: Frederic Weisbecker
    Cc: Avi Kivity
    Cc: "K. Prasad"

commit f60d24d2ad04977b0bd9e3eb35dba2d2fa569af9
Author: Frederic Weisbecker
Date:   Tue Nov 10 10:17:07 2009 +0100

    hw-breakpoints: Fix broken hw-breakpoint sample module

    The hw-breakpoint sample module has been broken during the
    hw-breakpoint internals refactoring. Propagate the changes to it.

    Reported-by: "K. Prasad"
    Signed-off-by: Frederic Weisbecker

commit 9f6b3c2c30cfbb1166ce7e74a8f9fd93ae19d2de
Author: Frederic Weisbecker
Date:   Mon Nov 9 21:03:43 2009 +0100

    hw-breakpoints: Fix broken a.out format dump

    Fix the broken a.out format dump. For now we only dump the ptrace
    breakpoints.

    TODO: Dump all perf breakpoints for the current thread, not only
    ptrace based ones.

    Reported-by: Ingo Molnar
    Signed-off-by: Frederic Weisbecker
    Cc: "K. Prasad"

commit 676c0dbe6e514fdd8e434a9e623c781aa9b40b15
Author: Paul Mundt
Date:   Mon Nov 9 17:37:34 2009 +0900

    ksym_tracer: Support read accesses independent of read/write.

    All of the infrastructure already exists to support read accesses
    for platforms that support a read access independently of
    read/write (such as in the case of the SuperH UBC). This just
    trivially hooks up the read case by itself.
    Signed-off-by: Paul Mundt
    Cc: Ingo Molnar
    Cc: Li Zefan
    Cc: Prasad
    Cc: Alan Stern
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Steven Rostedt
    Cc: Jan Kiszka
    Cc: Jiri Slaby
    Cc: Avi Kivity
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Masami Hiramatsu
    Cc: Arjan van de Ven
    LKML-Reference: <20091109083733.GA25848@linux-sh.org>
    Signed-off-by: Frederic Weisbecker

commit 41855b77547fa18d90ed6a5d322983d3fdab1959
Author: Joe Perches
Date:   Mon Nov 9 17:58:50 2009 -0800

    x86: GART: pci-gart_64.c: Use correct length in strncmp

    Signed-off-by: Joe Perches
    Cc: # .3x.x
    LKML-Reference: <1257818330.12852.72.camel@Joe-Laptop.home>
    Signed-off-by: Ingo Molnar

commit a2202aa29289db64ca7988b12343158b67b27f10
Author: Yong Wang
Date:   Tue Nov 10 09:38:24 2009 +0800

    x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP value

    On platforms where the BIOS handles the thermal monitor interrupt,
    APIC_LVTTHMR on each logical CPU is programmed to generate an SMI
    and the OS must not touch it.

    Unfortunately, the AP bringup sequence using INIT-SIPI-SIPI clears
    all the LVT entries except the mask bit. Essentially this results
    in all LVT entries, including the thermal monitoring interrupt,
    being set to masked (clearing the BIOS-programmed value for
    APIC_LVTTHMR).

    This leads to the kernel taking over the thermal monitoring
    interrupt on the APs but not on the BSP (leaving the
    BIOS-programmed value only on the BSP).

    As a result of this, we have seen system hangs when the thermal
    monitoring interrupt is generated.

    Fix this by reading the initial value of the thermal LVT entry on
    the BSP and, if the BIOS has taken over control, programming the
    same value on all APs and leaving the thermal monitoring interrupt
    control on all the logical CPUs to the BIOS.
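The fix described above comes down to one decision per AP, sketched below as a pure function. This is an illustration only: `ap_lvtthmr_sketch()` and the `SK_` macros are hypothetical names, the bit positions follow the LVT entry layout in the Intel SDM (delivery mode in bits 8-10, mask in bit 16), and the actual patch reads and writes the APIC register rather than passing values around.

```c
#include <assert.h>
#include <stdint.h>

/* LVT entry layout per the Intel SDM; macro names are local to this sketch. */
#define SK_LVT_DM_MASK  0x00700u  /* delivery mode field, bits 8-10 */
#define SK_LVT_DM_SMI   0x00200u  /* delivery mode 010b: SMI */
#define SK_LVT_MASKED   0x10000u  /* mask bit, bit 16 */

/*
 * Decide what an AP's thermal LVT entry should hold after bringup.
 * bsp_boot_value: the entry as read on the BSP before the OS touched it.
 * ap_value:       the entry on the AP after INIT-SIPI-SIPI (only masked).
 */
static uint32_t ap_lvtthmr_sketch(uint32_t bsp_boot_value, uint32_t ap_value)
{
    /* BIOS routed the thermal interrupt to SMI: mirror the BSP value. */
    if ((bsp_boot_value & SK_LVT_DM_MASK) == SK_LVT_DM_SMI)
        return bsp_boot_value;

    /* Otherwise the OS owns the entry and the AP value is left alone. */
    return ap_value;
}
```

With a BSP value of `0x200` (SMI delivery), an AP that came up with only the mask bit set gets `0x200` back, so every logical CPU ends up with the same BIOS-owned setting.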
    Signed-off-by: Yong Wang
    Reviewed-by: Suresh Siddha
    Cc: Borislav Petkov
    Cc: Arjan van de Ven
    LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com>
    Signed-off-by: Ingo Molnar
    Cc: stable@kernel.org

commit 7abc07531383ac7f727cc9d44e1360a829f2082e
Author: Cyrill Gorcunov
Date:   Tue Nov 10 01:06:59 2009 +0300

    x86: apic: Do not use stacked physid_mask_t

    We should not use physid_mask_t as a stack based variable in apic
    code. This type depends on the MAX_APICS parameter, which may be
    huge.

    It became a problem especially with the apic NOOP driver, which is
    portable between 32 bit and 64 bit environments (where we have a
    really huge MAX_APICS).

    So the apic driver should operate with pointers and a caller in
    turn should be aware of allocating the physid_mask_t variable.

    As a side (but positive) effect, we may use the already
    implemented physid_set_mask_of_physid function, eliminating
    default_apicid_to_cpu_present completely.

    Note that physids_coerce and physids_promote turned into static
    inline from macro (since the macro hides the fact that the
    parameter is being interpreted as unsigned long, make it
    explicit).

    Signed-off-by: Cyrill Gorcunov
    Cc: Yinghai Lu
    Cc: Maciej W. Rozycki
    Cc: Stephen Rothwell
    LKML-Reference: <20091109220659.GA5568@lenovo>
    Signed-off-by: Ingo Molnar

commit 158ba827f6deef4102c5247ed4b6a587f0bd6a07
Author: Hitoshi Mitake
Date:   Tue Nov 10 08:20:02 2009 +0900

    perf bench: Modify builtin-pipe.c for processing common options

    This patch modifies builtin-pipe.c for processing common options.
    The first option added is "--format". Users of perf bench will be
    able to specify output style by --format.
Usage example: % ./perf bench sched pipe # with no style specify (executing 1000000 pipe operations between two tasks) Total time:5.855 sec 5.855061 usecs/op 170792 ops/sec % ./perf bench --format=simple sched pipe # specified simple 5.988 Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1257808802-9420-5-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar Cc: Peter Zijlstra Cc: Paul Mackerras commit cced06c62a9db6bd6d77e3f0a57dbe47a26d881e Author: Hitoshi Mitake Date: Tue Nov 10 08:20:01 2009 +0900 perf bench: Modify bench/bench-messaging.c to adopt unified output formatting This patch modifies bench/bench-messaging.c to adopt unified output formatting: --format option. Usage example: % ./perf bench sched messaging # with no style specify (20 sender and receiver processes per group) (10 groups == 400 processes run) Total time:1.431 sec % ./perf bench --format=simple sched messaging # specified simple 1.431 Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1257808802-9420-4-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 386d7e9e542c2115d5d300747e57f503458a1617 Author: Hitoshi Mitake Date: Tue Nov 10 08:20:00 2009 +0900 perf bench: Modify builtin-bench.c for processing common options This patch modifies builtin-bench.c for processing common options. The first option added is "--format". Users of perf bench will be able to specify output style by --format. 
Usage example: % ./perf bench sched messaging # with no style specify (20 sender and receiver processes per group) (10 groups == 400 processes run) Total time:1.431 sec % ./perf bench --format=simple sched messaging # specified simple 1.431 Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1257808802-9420-3-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 242aa14a67f4e19453fc8a51cffc5ac5ee5bcbd1 Author: Hitoshi Mitake Date: Tue Nov 10 08:19:59 2009 +0900 perf bench: Add format constants to bench.h for unified output formatting This patch adds some constants and extern declaration to bench.h. These are used for unified output formatting of 'perf bench'. Signed-off-by: Hitoshi Mitake Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1257808802-9420-2-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit eae0c9dfb534cb3449888b9601228efa6480fdb5 Author: Mike Galbraith Date: Tue Nov 10 03:50:02 2009 +0100 sched: Fix and clean up rate-limit newidle code Commit 1b9508f, "Rate-limit newidle" has been confirmed to fix the netperf UDP loopback regression reported by Alex Shi. This is a cleanup and a fix: - moved to a more out of the way spot - fix to ensure that balancing doesn't try to balance runqueues which haven't gone online yet, which can mess up CPU enumeration during boot. Reported-by: Alex Shi Reported-by: Zhang, Yanmin Signed-off-by: Mike Galbraith Acked-by: Peter Zijlstra Cc: # .32.x: a1f84a3: sched: Check for an idle shared cache Cc: # .32.x: 1b9508f: sched: Rate-limit newidle Cc: # .32.x: fd21073: sched: Fix affinity logic Cc: # .32.x LKML-Reference: <1257821402.5648.17.camel@marge.simson.net> Signed-off-by: Ingo Molnar commit 9160306e6f5b68bb64630c9031c517ca1cf463db Author: Paul E. McKenney Date: Mon Nov 2 13:52:29 2009 -0800 rcu: Fix note_new_gpnum() uses of ->gpnum Impose a clear locking design on the note_new_gpnum() function's use of the ->gpnum counter. 
This is done by updating rdp->gpnum only from the corresponding leaf rcu_node structure's rnp->gpnum field, and even then only under the protection of that same rcu_node structure's ->lock field. Performance and scalability are maintained using a form of double-checked locking, and excessive spinning is avoided by use of the spin_trylock() function. The use of spin_trylock() is safe due to the fact that CPUs who fail to acquire this lock will try again later. The hierarchical nature of the rcu_node data structure limits contention (which could be limited further if need be using the RCU_FANOUT kernel parameter). Without this patch, obscure but quite possible races could result in a quiescent state that occurred during one grace period to be accounted to the following grace period, causing this following grace period to end prematurely. Not good! Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: # .32.x LKML-Reference: <12571987492350-git-send-email-> Signed-off-by: Ingo Molnar commit d09b62dfa336447c52a5ec9bb88adbc479b0f3b8 Author: Paul E. McKenney Date: Mon Nov 2 13:52:28 2009 -0800 rcu: Fix synchronization for rcu_process_gp_end() uses of ->completed counter Impose a clear locking design on the rcu_process_gp_end() function's use of the ->completed counter. This is done by creating a ->completed field in the rcu_node structure, which can safely be accessed under the protection of that structure's lock. Performance and scalability are maintained by using a form of double-checked locking, so that rcu_process_gp_end() only acquires the leaf rcu_node structure's ->lock if a grace period has recently ended. 
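The double-checked locking mentioned above can be illustrated with a small user-space sketch. Everything here is an assumption made for illustration: `struct node_sketch`, `struct cpu_sketch`, `node_sketch_init()` and `gp_end_sketch()` are hypothetical miniatures, a C11 `atomic_flag` spin stands in for the rcu_node ->lock, and the real rcu_process_gp_end() does considerably more than bump a counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical miniatures of the leaf rcu_node and the per-CPU rcu_data. */
struct node_sketch {
    atomic_flag lock;       /* stands in for the node's ->lock */
    atomic_long completed;  /* last grace period to have completed */
};

struct cpu_sketch {
    long completed;         /* this CPU's private copy of the counter */
    int  advances;          /* how many times callbacks were advanced */
};

static void node_sketch_init(struct node_sketch *rnp, long completed)
{
    atomic_flag_clear(&rnp->lock);
    atomic_init(&rnp->completed, completed);
}

/* Returns true if a newly ended grace period was noticed and handled. */
static bool gp_end_sketch(struct node_sketch *rnp, struct cpu_sketch *rdp)
{
    bool advanced = false;
    long seen;

    /* First check, without the lock: the common case is "nothing ended". */
    if (rdp->completed == atomic_load(&rnp->completed))
        return false;

    /* Something may have changed: take the lock (a simple spin here). */
    while (atomic_flag_test_and_set(&rnp->lock))
        ;

    /* Second check, under the lock: only this reading is acted upon. */
    seen = atomic_load(&rnp->completed);
    if (rdp->completed != seen) {
        rdp->completed = seen;
        rdp->advances++;
        advanced = true;
    }

    atomic_flag_clear(&rnp->lock);
    return advanced;
}
```

The unlocked comparison keeps the fast path free of lock traffic, while the repeated comparison under the lock ensures the per-CPU copy is only ever updated from a value read with the lock held.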
This fix reduces rcutorture failure rate by at least two orders of magnitude under heavy stress with force_quiescent_state() being invoked artificially often. Without this fix, unsynchronized access to the ->completed field can cause rcu_process_gp_end() to advance callbacks whose grace period has not yet expired. (Bad idea!) Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: # .32.x LKML-Reference: <12571987494069-git-send-email-> Signed-off-by: Ingo Molnar commit 281d150c5f8892f158747594ab49ce2823fd8b8c Author: Paul E. McKenney Date: Mon Nov 2 13:52:27 2009 -0800 rcu: Prepare for synchronization fixes: clean up for non-NO_HZ handling of ->completed counter Impose a clear locking design on non-NO_HZ handling of the ->completed counter. This increases the distance between the RCU and the CPU-hotplug mechanisms. Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: # .32.x LKML-Reference: <12571987491353-git-send-email-> Signed-off-by: Ingo Molnar commit 7e1a2766e67a529f62c8cfba0a47d63fc4f7fa8a Merge: c5e0cb3 83f5b01 Author: Ingo Molnar Date: Tue Nov 10 04:10:31 2009 +0100 Merge branch 'core/urgent' into core/rcu Merge reason: Pick up RCU fixlet to base further commits on. Signed-off-by: Ingo Molnar commit dd8dbf2e6880e30c00b18600c962d0cb5a03c555 Author: Eric Paris Date: Tue Nov 3 16:35:32 2009 +1100 security: report the module name to security_module_request For SELinux to do better filtering in userspace we send the name of the module along with the AVC denial when a program is denied module_request. 
    Example output:

    type=SYSCALL msg=audit(11/03/2009 10:59:43.510:9) : arch=x86_64 syscall=write success=yes exit=2 a0=3 a1=7fc28c0d56c0 a2=2 a3=7fffca0d7440 items=0 ppid=1727 pid=1729 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rpc.nfsd exe=/usr/sbin/rpc.nfsd subj=system_u:system_r:nfsd_t:s0 key=(null)
    type=AVC msg=audit(11/03/2009 10:59:43.510:9) : avc: denied { module_request } for pid=1729 comm=rpc.nfsd kmod="net-pf-10" scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system

    Signed-off-by: Eric Paris
    Signed-off-by: James Morris

commit ca2b900f9af1586b9889ccc4b12e453c13268bd5
Author: Zeev Tarantov
Date:   Mon Nov 9 13:26:13 2009 +0200

    perf tools: Fix syntax in documentation

    Fix trivial syntax in perf-events user-space tools documentation.

    Signed-off-by: Zeev Tarantov
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference: <12d7e64c0911081811i7e5b466cu6706ff6ab3e70db4@mail.gmail.com>
    Signed-off-by: Ingo Molnar

commit f84d49b218b7d4c6cba2e0b41f24bd4045403962
Author: Naohiro Ooiwa
Date:   Mon Nov 9 00:46:42 2009 +0900

    signal: Print warning message when dropping signals

    When the system has too many timers or too many aggregate queued
    signals, the EAGAIN error is returned to the application from the
    kernel, including timer_create() [POSIX.1b].

    It means that the app exceeded the limit of pending signals, but
    in general application writers do not expect this outcome and the
    current silent failure can cause rare app failures under very high
    load.

    This patch adds a new message when we reach the limit and if
    print_fatal_signals is enabled:

        task/1234: reached RLIMIT_SIGPENDING, dropping signal

    If you see this message and your system behaved unexpectedly, you
    can run the following command to lift the limit:

        # ulimit -i unlimited

    With help from Hiroshi Shimamoto.
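An application can also inspect the limit named in that warning from inside the process, before it starts creating timers or queueing signals. The sketch below assumes Linux, where `RLIMIT_SIGPENDING` is available to `getrlimit()`; the helper name `sigpending_limit_sketch()` is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Read the per-user limit on queued signals (the value "ulimit -i" shows).
 * Returns 0 on success and fills in the soft and hard limits, -1 on error.
 * RLIMIT_SIGPENDING is Linux-specific.
 */
static int sigpending_limit_sketch(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    assert(soft && hard);
    if (getrlimit(RLIMIT_SIGPENDING, &rl) != 0)
        return -1;

    *soft = rl.rlim_cur;   /* RLIM_INFINITY means "unlimited" */
    *hard = rl.rlim_max;
    return 0;
}
```

A soft limit below the number of timers or queued signals an application expects to have outstanding is a hint that the EAGAIN path described above can be reached.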
Signed-off-by: Naohiro Ooiwa Cc: Andrew Morton Cc: Hiroshi Shimamoto Cc: Roland McGrath Cc: Peter Zijlstra Cc: oleg@redhat.com LKML-Reference: <4AF6E7E2.9080406@miraclelinux.com> [ Modified a few small details, gave surrounding code some love. ] Signed-off-by: Ingo Molnar commit 638bba55fe6440439005f02fcd6b0c1f908d0d11 Author: Dominik Brodowski Date: Sat Nov 7 12:26:17 2009 +0100 pcmcia: autoload module pcmcia Attempt to load the "pcmcia" module for 16-bit PCMCIA cards, so that PCMCIA support becomes available without pcmciautils/udev userspace interaction. Based on a suggestion and a patch Signed-off-by: Komuro but converted it to request_module_nowait() and move it to a later stage. Signed-off-by: Dominik Brodowski commit 55a19b39acb8888af8e9cfe5b762d03c52fdb48c Author: Dominik Brodowski Date: Thu Oct 29 00:54:49 2009 +0100 pcmcia/staging: update comedi drivers Update comedi PCMCIA drivers to work with recent PCMCIA changes documented in Documentation/pcmcia/driver-changes.txt: - use pcmcia_config_loop() - don't use PCMCIA_DEBUG, but use dev_dbg() - don't use cs_error() - re-use prod_id and card_id values already stored Acked-by: Greg Kroah-Hartman Signed-off-by: Dominik Brodowski commit 66024db57d5b9011e274b314affad68f370c0d6f Author: Russell King - ARM Linux Date: Sun Mar 29 22:45:26 2009 +0100 PCMCIA: stop duplicating pci_irq in soc_pcmcia_socket skt->irq is a mere duplication of pcmcia_socket's pci_irq member. Get rid of it. Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit 1689164a272a962572a1f31af715dfe462cf7910 Author: Russell King - ARM Linux Date: Sun Mar 29 22:43:43 2009 +0100 PCMCIA: ss: allow PCI IRQs > 255 Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit f397b9c5dcc30a575973b2e4f0a602fc85b38853 Author: Russell King - ARM Linux Date: Sun Mar 29 22:12:34 2009 +0100 PCMCIA: soc_common: remove 'dev' member from soc_pcmcia_socket The 'dev' member is now only ever written, so we can safely remove it. 
Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit b62d99b5028b6df1c32b864fd9dd32ad6b42d396 Author: Russell King - ARM Linux Date: Sun Mar 29 22:14:32 2009 +0100 PCMCIA: soc_common: constify soc_pcmcia_socket ops member No one should modify the ops structure supplied to soc_pcmcia_socket so make it const. Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit dabd14684bc2375bf69f227f04993a4dc2fd3a16 Author: Russell King - ARM Linux Date: Sun Mar 29 22:35:11 2009 +0100 PCMCIA: sa1111: remove duplicated initializers Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit 701a5dc05ad99a06958b3f97cb69d99b47cebee3 Author: Russell King - ARM Linux Date: Sun Mar 29 19:42:44 2009 +0100 PCMCIA: sa1111: wrap soc_pcmcia_socket to contain sa1111 specific data Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit da4f007375197d6683461b995d404b01a7fdf2f5 Author: Russell King - ARM Linux Date: Sun Mar 29 19:23:42 2009 +0100 PCMCIA: soc_common: push socket probe down into SoC specific support Move the individual socket probing and initialization down into the SoC specific support files, thereby allowing soc_common_drv_pcmcia_probe to be eliminated. soc_common.c now no longer deals with distinct groups of sockets. Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit be85458edce0f165cff62622f5e73b1d17b1e228 Author: Russell King - ARM Linux Date: Thu Mar 26 22:21:18 2009 +0000 PCMCIA: soc_common: push socket removal down to SoC specific support Mechanically transplant the removal code from soc_common into each SoC specific base support file, thereby allowing soc_common_drv_pcmcia_remove to be removed. No other changes. 
Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit 097e296d6175881eba7244de7222de61e9569911 Author: Russell King - ARM Linux Date: Thu Mar 26 21:45:05 2009 +0000 PCMCIA: soc_common: provide single socket add/remove functionality Factor out the functionality for adding and removing a single socket, thereby allowing SoCs to individually register each socket. The advantage of this approach is that SoCs can then extend soc_pcmcia_socket as they wish. Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit 0f767de6a26a07f7d58394512b6f6c96322f047f Author: Russell King - ARM Linux Date: Thu Mar 26 21:14:19 2009 +0000 PCMCIA: soc_common: convert to a stand alone module Convert soc_common.c to be a stand alone module, rather than wrapping it up into the individual SoC specific base modules. In doing this, we need to add init/exit functions for soc_common to register/remove the cpufreq notifier. Signed-off-by: Russell King Signed-off-by: Dominik Brodowski commit a7149f9a26eb44a5658d56335c23104ba529e9f6 Author: Dominik Brodowski Date: Sat Oct 24 18:07:16 2009 +0200 pcmcia: use dev_dbg and dev_print in pd6729.c As suggested by Wolfram Sang , use dev_dbg(), and dev_{err,warn,info}() in pd6729.c, and add some "\n" suggested by Komuro . In the ISR, use pr_devel() and dev_vdbg() as they are only compiled if DEBUG (or, for dev_vdbg(), VERBOSE_DEBUG) are set explicitly. CC: Komuro Acked-by: Wolfram Sang Signed-off-by: Dominik Brodowski commit 9cb495bb41f07a3ebfc60d3b9d26017a1fd7050c Author: Dominik Brodowski Date: Sat Oct 24 15:57:22 2009 +0200 pcmcia: remove now-defunct cs_error, pcmcia_error_{func,ret} As all in-tree drivers have been converted to not use cs_error() any more, drop these functions and definitions, and update the Documentation. 
Signed-off-by: Dominik Brodowski commit 9b44de2015ff4a2ed1d56efedfcc72b917d356a6 Author: Dominik Brodowski Date: Sat Oct 24 15:55:39 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (misc drivers) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: linux-mtd@lists.infradead.org CC: linux-usb@vger.kernel.org Signed-off-by: Dominik Brodowski commit 7c5af6ffd69bb2bb3c86b374153627529d67598c Author: Dominik Brodowski Date: Sat Oct 24 15:55:12 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (sound) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: Jaroslav Kysela CC: alsa-devel@alsa-project.org Signed-off-by: Dominik Brodowski commit 9ec0bf41b5030ccc691049754ed1398cad5e953e Author: Dominik Brodowski Date: Sat Oct 24 15:54:46 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (serial_cs) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. 
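The difference between the macro style and explicit return value checking can be sketched as follows. This is an illustration only: the demo_* functions are hypothetical stand-ins for PCMCIA core calls, and DEMO_CS_CHECK merely imitates the general shape of such a macro (record the call, jump to a common label on failure).

```c
#include <stdio.h>

/* Hypothetical stand-ins for PCMCIA core calls: each simply returns
 * the simulated result it is given (0 or a negative error code). */
static int demo_request_irq(int simulated_ret) { return simulated_ret; }
static int demo_request_configuration(int simulated_ret) { return simulated_ret; }
static void demo_release(void) { /* undo partial setup here */ }

/* Macro style: control flow and error bookkeeping hidden in a macro. */
#define DEMO_CS_CHECK(fn, ret) \
	do { last_fn = (fn); if ((last_ret = (ret)) != 0) goto cs_failed; } while (0)

int demo_config_old(int irq_ret, int cfg_ret)
{
	int last_fn = 0, last_ret = 0;

	DEMO_CS_CHECK(1, demo_request_irq(irq_ret));
	DEMO_CS_CHECK(2, demo_request_configuration(cfg_ret));
	return 0;

cs_failed:
	/* extra error reporting done by the driver itself */
	fprintf(stderr, "call %d failed: %d\n", last_fn, last_ret);
	demo_release();
	return -1;	/* this sketch returns a generic error here */
}

/* Explicit style: every call is visible and its error code is propagated. */
int demo_config_new(int irq_ret, int cfg_ret)
{
	int ret;

	ret = demo_request_irq(irq_ret);
	if (ret)
		goto failed;

	ret = demo_request_configuration(cfg_ret);
	if (ret)
		goto failed;

	return 0;

failed:
	demo_release();
	return ret;
}
```

In the explicit variant the caller sees the real error code, and no driver-side message is needed if the core already reports the failure.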
CC: linux-serial@vger.kernel.org CC: Russell King Signed-off-by: Dominik Brodowski commit 3e7166178a83fef690dcbfcdaeda192f7282a9a4 Author: Dominik Brodowski Date: Sat Oct 24 15:54:14 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (scsi) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: linux-scsi@vger.kernel.org Signed-off-by: Dominik Brodowski commit 2caff14713d53abba273e6095495788e2720f756 Author: Dominik Brodowski Date: Sat Oct 24 15:53:36 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (wireless) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: linux-wireless@vger.kernel.org CC: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski commit 624dd66957e53e15cf40e937b50597c4d41f0e99 Author: Dominik Brodowski Date: Sat Oct 24 15:52:44 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (ray-cs.c) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. 
CC: linux-wireless@vger.kernel.org CC: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski commit dd0fab5b940c0b65f26ac5b01485bac1f690ace6 Author: Dominik Brodowski Date: Sat Oct 24 15:51:05 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (net) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Only some rare debug checks are now hidden behind "#ifdef DEBUG" or "#if 0". Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski commit e773cfe167c320d07b9423bc51fc4ab0221775a4 Author: Dominik Brodowski Date: Sat Oct 24 15:50:13 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (isdn) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: Karsten Keil Signed-off-by: Dominik Brodowski commit cbf624f0e18c4a05219855663a3e5f9fe8f2d876 Author: Dominik Brodowski Date: Sat Oct 24 15:47:29 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (char) Convert PCMCIA drivers to use the dynamic debug infrastructure, instead of requiring manual settings of PCMCIA_DEBUG. Only some rare extra debug checks in cm4000_cs.c cm4040_cs.c are now hidden behind a "#ifdef CM4000_DEBUG" or "#ifdef CM4040_DEBUG". Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. 
The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: Harald Welte CC: Jiri Kosina CC: David Sterba Signed-off-by: Dominik Brodowski commit 5ff0cfc67f00fe0feaa1da0b2359232ea4aa0ee7 Author: Hitoshi Mitake Date: Mon Nov 9 12:31:05 2009 +0900 perf bench: Fix bench/sched-pipe.c to wait for child process Ingo reported this small 'perf bench sched pipe' output problem: | $ ./perf bench sched pipe | (executing 1000000 pipe operations between two tasks) | | Total time:4.898 sec | $ 4.898586 usecs/op | 204140 ops/sec | | the shell prompt came back before the usecs/op and ops/sec line | was printed. Process teardown race, lack of wait() or so? This was caused by the parent process not calling waitpid(), so I added the call. Signed-off-by: Hitoshi Mitake Cc: Rusty Russell Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Jiri Kosina LKML-Reference: <1257737465-7546-1-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 6e65f92ff0d6f18580737321718d09035085a3fb Author: John Johansen Date: Thu Nov 5 17:03:20 2009 -0800 Config option to set a default LSM The LSM currently requires setting a kernel parameter at boot to select a specific LSM. This adds a config option that allows specifying a default LSM that is used unless overridden with the security= kernel parameter. If the config option is not set, the current behavior (the first LSM to register is used) is kept. Signed-off-by: John Johansen Acked-by: Serge Hallyn Signed-off-by: James Morris commit 0e1a6ef2dea88101b056b6d9984f3325c5efced3 Author: Kees Cook Date: Sun Nov 8 09:37:00 2009 -0800 sysctl: require CAP_SYS_RAWIO to set mmap_min_addr Currently the mmap_min_addr value can only be bypassed during mmap when the task has CAP_SYS_RAWIO. However, the mmap_min_addr sysctl value itself can be adjusted to 0 if euid == 0, allowing a bypass without CAP_SYS_RAWIO.
This patch adds a check for the capability before allowing mmap_min_addr to be changed. Signed-off-by: Kees Cook Acked-by: Serge Hallyn Signed-off-by: James Morris commit f4a70c55376683213229af7266dc57ad81aee354 Author: Cyrill Gorcunov Date: Sun Nov 8 16:16:45 2009 +0300 x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit In fact it never gets used on x86-64 (on 64-bit platforms we use a different technique to enumerate io-units). Reported-by: Stephen Rothwell Signed-off-by: Cyrill Gorcunov Cc: Peter Zijlstra LKML-Reference: <20091108131645.GD5300@lenovo> Signed-off-by: Ingo Molnar commit 9ac3e58ceff0b7b8b981c09c38a28742270eea12 Author: Dominik Brodowski Date: Sat Oct 24 15:45:06 2009 +0200 pcmcia: deprecate CS_CHECK (bluetooth) Remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: linux-bluetooth@vger.kernel.org Signed-off-by: Dominik Brodowski commit 444486a5f9d2737b50e53dc140292899b9497808 Author: Dominik Brodowski Date: Fri Oct 23 12:55:28 2009 +0200 pcmcia: use dynamic debug infrastructure, deprecate CS_CHECK (ide) ide-cs.c is the only PCMCIA device driver making use of CONFIG_PCMCIA_DEBUG, so convert it to use the dynamic debug infrastructure. Also, remove all usages of the CS_CHECK macro and replace them with proper Linux style calling and return value checking. The extra error reporting may be dropped, as the PCMCIA core already complains about any (non-driver-author) errors. CC: linux-ide@vger.kernel.org Signed-off-by: Dominik Brodowski commit 6d9a299f675b176e2f81e1f6d5a361a1173971ea Author: Dominik Brodowski Date: Sat Oct 24 12:20:18 2009 +0200 pcmcia: extend error reporting and debug messages in core Add a few more error and debug messages to the PCMCIA core.
Signed-off-by: Dominik Brodowski commit c9f50dddd184a020d64dab63fa795967f0f14aa4 Author: Dominik Brodowski Date: Fri Oct 23 12:56:46 2009 +0200 pcmcia: use dynamic debug in PCMCIA socket drivers Make use of the dynamic debug infrastructure in various PCMCIA socket drivers. By doing so, only the drivers relying on soc_common make use of CONFIG_PCMCIA_DEBUG. Therefore, update the Kconfig entry accordingly. Signed-off-by: Dominik Brodowski commit d50dbec3ce52e1608636b8a624d087da9ced8cde Author: Dominik Brodowski Date: Fri Oct 23 12:51:28 2009 +0200 pcmcia: use dynamic debug instead of custom infrastructure Use the generic "dynamic debug" infrastructure instead of CONFIG_PCMCIA_DEBUG in the PCMCIA core (pcmcia.ko and pcmcia_core.ko). To enable debugging, enable CONFIG_DYNAMIC_DEBUG, mount debugfs and $ echo -n 'module pcmcia_core +p' > /sys/kernel/debug/dynamic_debug/control for the complete module "pcmcia_core", for example. For more detailed instructions, please see Documentation/dynamic-debug-howto.txt Signed-off-by: Dominik Brodowski commit 18a7a19b37838789452e0bd2855a51475628b971 Author: Dominik Brodowski Date: Mon Oct 19 00:07:39 2009 +0200 pcmcia: remove pcmcia_get_{first,next}_tuple() Remove the pcmcia_get_{first,next}_tuple() calls no longer needed by (current) pcmcia device drivers. Signed-off-by: Dominik Brodowski commit 18b61b97294dad74dd00a1aa8efed0cfacb95aff Author: Dominik Brodowski Date: Sun Oct 18 23:57:58 2009 +0200 pcmcia: convert pcmciamtd driver to use new CIS helpers Convert the (broken) pcmciamtd driver to use the new CIS helpers. CC: David.Woodhouse@intel.com CC: linux-mtd@lists.infradead.org Signed-off-by: Dominik Brodowski commit 37ace3d4131ae80f370eb1230fa7db2b3eedf17c Author: Dominik Brodowski Date: Sun Oct 18 23:56:41 2009 +0200 pcmcia: convert ssb pcmcia driver to use new CIS helpers SSB is a prime example of how to make use of the new CIS helpers. CC: Michael Buesch Acked-by: John W.
Linville Signed-off-by: Dominik Brodowski commit dddfbd824b96a25da0b2f1cf35c0be33ef2422fe Author: Dominik Brodowski Date: Sun Oct 18 23:54:24 2009 +0200 pcmcia: convert net pcmcia drivers to use new CIS helpers Use the new CIS helpers in net pcmcia drivers, which allows for a few code cleanups. This revision does not remove the phys_addr assignment in 3c589_cs.c -- a bug noted by Komuro CC: David S. Miller CC: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski commit 91284224da5b15ec6c2b45e10fa5eccd1c92a204 Author: Dominik Brodowski Date: Sun Oct 18 23:32:33 2009 +0200 pcmcia: add new CIS access helpers As a replacement for pcmcia_get_{first,next}_tuple() and pcmcia_get_tuple_data(), three new -- and easier to use -- functions are added: - pcmcia_get_tuple() to get the very first CIS entry of one type. - pcmcia_loop_tuple() to loop over all CIS entries of one type. - pcmcia_get_mac_from_cis() to read out the hardware MAC address from CISTPL_FUNCE. Only a handful of drivers need these functions anyway, as most CIS access is already handled by pcmcia_loop_config(), which now shares the same backend (pccard_loop_tuple()) with pcmcia_loop_tuple(). A pcmcia_get_mac_from_cis() bug noted by Komuro has been fixed in this revision. Signed-off-by: Dominik Brodowski commit af757923a92e6e9dbfdb6b0264be14c564e1c466 Author: Dominik Brodowski Date: Sun Oct 18 19:48:39 2009 +0200 ipwireless: make more use of pcmcia_loop_config() Within the pcmcia_loop_config() callback, we already have all the tuple data we need available. Also add a fix to release the IO resource (at least within pcmcia_loop_config() error path). CC: Jiri Kosina CC: David Sterba Signed-off-by: Dominik Brodowski commit aaa8cfdada648a6bae32f62df76cc60137a2b323 Author: Dominik Brodowski Date: Sun Oct 18 18:28:39 2009 +0200 pcmcia: use pcmcia_loop_config in misc pcmcia drivers Use pcmcia_loop_config() in a few drivers missed during the first round.
On fmvj18x_cs.c it -- strangely -- only requires us to set conf.ConfigIndex, which is done by the core, so include an empty loop function which returns 0 unconditionally. CC: David S. Miller CC: David Sterba CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org For the ipwireless part: Acked-by: Jiri Kosina Acked-by: John W. Linville Signed-off-by: Dominik Brodowski commit 7d2e8d00b47b973c92db4df7444d5e6d3bb945f9 Author: Dominik Brodowski Date: Sun Oct 18 18:22:32 2009 +0200 pcmcia: use pre-determined values A few PCMCIA network drivers can make use of values provided by the pcmcia core, instead of tedious, independent CIS parsing. xirc32ps_cs.c: manf_id hostap_cs.c: multifunction count b43/pcmcia.c: ConfigBase address and "Present" smc91c92_cs.c: By default, mhz_setup() can use VERS_1 as it is stored in struct pcmcia_device. Only some cards require workarounds, such as reading out VERS_1 twice. CC: David S. Miller CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org Acked-by: John W. Linville Signed-off-by: Dominik Brodowski commit 549104f22b3cd4761145eb5fba6ee4d59822da61 Author: Clark Williams Date: Sun Nov 8 09:03:07 2009 -0600 perf tools: Modify perf routines to use new debugfs routines Modify perf.c get_debugfs_mntpnt() to use the util/debugfs.c debugfs_find_mountpoint(), and modify util/parse-events.c to use debugfs_valid_mountpoint(). Signed-off-by: Clark Williams Cc: Arnaldo Carvalho de Melo Cc: Peter Zijlstra LKML-Reference: <20091101155720.624cc87e@torg> Signed-off-by: Ingo Molnar commit afe61f677866ffc484e69c4ecca2d316d564d78b Author: Clark Williams Date: Sun Nov 8 09:01:37 2009 -0600 perf tools: Add debugfs utility routines for perf Add routines to locate the debugfs mount point and to manage the mounting and unmounting of the debugfs.
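One common way to locate a mount point is to scan the mount table for an entry of the wanted filesystem type. The following is a minimal sketch of that idea, assuming the /proc/mounts line format "device dir type options ..."; find_fs_mountpoint() and find_debugfs_dir() are hypothetical names and do not reproduce the routines added in util/debugfs.c.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scan a mount table stream for the first entry
 * whose filesystem type equals fstype and copy its mount directory
 * into out. Returns 0 if found, -1 otherwise (or if out is too small). */
int find_fs_mountpoint(FILE *mounts, const char *fstype,
		       char *out, size_t outlen)
{
	char line[1024], dev[256], dir[512], type[64];

	while (fgets(line, sizeof(line), mounts)) {
		/* fields: device, mount directory, filesystem type */
		if (sscanf(line, "%255s %511s %63s", dev, dir, type) != 3)
			continue;
		if (strcmp(type, fstype) != 0)
			continue;
		if (strlen(dir) >= outlen)
			return -1;
		strcpy(out, dir);
		return 0;
	}
	return -1;
}

/* Hypothetical wrapper: look for debugfs in the live mount table. */
int find_debugfs_dir(char *out, size_t outlen)
{
	FILE *fp = fopen("/proc/mounts", "r");
	int ret;

	if (!fp)
		return -1;
	ret = find_fs_mountpoint(fp, "debugfs", out, outlen);
	fclose(fp);
	return ret;
}
```

Note that /proc/mounts escapes spaces in directory names as \040; this sketch returns such names unescaped.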
Signed-off-by: Clark Williams Cc: Arnaldo Carvalho de Melo Cc: Peter Zijlstra LKML-Reference: <20091101155621.2b3503ee@torg> Signed-off-by: Ingo Molnar commit 4343fe1024e09e17667f95620ed3e69a7a5f4389 Author: Cyrill Gorcunov Date: Sun Nov 8 18:54:31 2009 +0300 x86, ioapic: Use snrpintf while set names for IO-APIC resourses We should be prepared for MAX_IO_APICS being raised one day. To prevent a memory overwrite, use the safe snprintf() when setting the IO-APIC resource name. Signed-off-by: Cyrill Gorcunov Cc: Yinghai Lu LKML-Reference: <20091108155431.GC25940@lenovo> Signed-off-by: Ingo Molnar commit 46dc281b1bb02527195fe2ad50a3af6d7f7f7325 Author: Cyrill Gorcunov Date: Sun Nov 8 18:53:56 2009 +0300 x86, apic: Use PAGE_SIZE instead of numbers The whole page is reserved for the IO-APIC fixmap due to the non-cacheable requirement. So let's note this explicitly instead of playing with numbers. Signed-off-by: Cyrill Gorcunov Cc: Yinghai Lu Cc: Maciej W. Rozycki LKML-Reference: <20091108155356.GB25940@lenovo> Signed-off-by: Ingo Molnar commit 30ff21e31fe5c8b7b1b7d30cc41e32bc4ee9f175 Author: Li Zefan Date: Thu Sep 10 09:35:20 2009 +0800 ksym_tracer: Remove KSYM_SELFTEST_ENTRY The macro used to be used in both trace_selftest.c and trace_ksym.c, but no longer, so remove it from the header file. Signed-off-by: Li Zefan Cc: Prasad Signed-off-by: Frederic Weisbecker commit ba1c813a6b9a0ef14d7112daf51270eff326f037 Author: Frederic Weisbecker Date: Thu Sep 10 09:26:21 2009 +0200 hw-breakpoints: Arbitrate access to pmu following registers constraints Allow or refuse to build a counter using the breakpoints pmu following given constraints. We keep track of the pmu users by using three per cpu variables: - nr_cpu_bp_pinned stores the number of pinned cpu breakpoints counters in the given cpu - nr_bp_flexible stores the number of non-pinned breakpoints counters in the given cpu.
- task_bp_pinned stores the number of pinned task breakpoints in a cpu The latter is not a simple counter but gathers the number of tasks that have n pinned breakpoints. Considering HBP_NUM the number of available breakpoint address registers: task_bp_pinned[0] is the number of tasks having 1 breakpoint task_bp_pinned[1] is the number of tasks having 2 breakpoints [...] task_bp_pinned[HBP_NUM - 1] is the number of tasks having the maximum number of registers (HBP_NUM). When a breakpoint counter is created and wants an access to the pmu, we evaluate the following constraints: == Non-pinned counter == - If attached to a single cpu, check: (per_cpu(nr_bp_flexible, cpu) || (per_cpu(nr_cpu_bp_pinned, cpu) + max(per_cpu(task_bp_pinned, cpu)))) < HBP_NUM -> If there are already non-pinned counters in this cpu, it means there is already a free slot for them. Otherwise, we check that the maximum number of per task breakpoints (for this cpu) plus the number of per cpu breakpoints (for this cpu) doesn't cover every register. - If attached to every cpu, check: (per_cpu(nr_bp_flexible, *) || (max(per_cpu(nr_cpu_bp_pinned, *)) + max(per_cpu(task_bp_pinned, *)))) < HBP_NUM -> This is roughly the same, except we check the number of per cpu bp for every cpu and we keep the max one. Same for the per task breakpoints. == Pinned counter == - If attached to a single cpu, check: ((per_cpu(nr_bp_flexible, cpu) > 1) + per_cpu(nr_cpu_bp_pinned, cpu) + max(per_cpu(task_bp_pinned, cpu))) < HBP_NUM -> Same checks as before. But now the nr_bp_flexible, if any, must keep one register at least (or flexible breakpoints will never be fed). - If attached to every cpu, check: ((per_cpu(nr_bp_flexible, *) > 1) + max(per_cpu(nr_cpu_bp_pinned, *)) + max(per_cpu(task_bp_pinned, *))) < HBP_NUM Changes in v2: - Counter -> event rename Changes in v5: - Fix unreleased non-pinned task-bound-only counters. We only released it in the first cpu.
(Thanks to Paul Mackerras for reporting that) Changes in v6: - Currently, event scheduling is done in this order: cpu context pinned + cpu context non-pinned + task context pinned + task context non-pinned events. So our current constraints are right theoretically but not in practice, because non-pinned counters may be scheduled before we can apply every possible pinned counter. So consider non-pinned counters as pinned for now. Signed-off-by: Frederic Weisbecker Cc: Prasad Cc: Alan Stern Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Ingo Molnar Cc: Jan Kiszka Cc: Jiri Slaby Cc: Li Zefan Cc: Avi Kivity Cc: Paul Mackerras Cc: Mike Galbraith Cc: Masami Hiramatsu Cc: Paul Mundt commit 24f1e32c60c45c89a997c73395b69c8af6f0a84e Author: Frederic Weisbecker Date: Wed Sep 9 19:22:48 2009 +0200 hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events This patch rebases the implementation of the breakpoints API on top of perf events instances. Each breakpoint is now a perf event that handles the register scheduling, thread/cpu attachment, etc. The new layering is now made as follows:

       ptrace       kgdb      ftrace   perf syscall
          \          |          /         /
           \         |         /         /
                                        /
            Core breakpoint API        /
                                      /
                     |               /
                     |              /

              Breakpoints perf events

                     |
                     |

               Breakpoints PMU ---- Debug Register constraints handling
                                    (Part of core breakpoint API)
                     |
                     |

             Hardware debug registers

Reasons for this rewrite: - Use the centralized/optimized pmu registers scheduling, implying an easier arch integration - More powerful register handling: perf attributes (pinned/flexible events, exclusive/non-exclusive, tunable period, etc...) Impact: - New perf ABI: the hardware breakpoints counters - Ptrace breakpoints setting remains tricky and still needs some per thread breakpoints references.
Todo (in the order): - Support breakpoints perf counter events for perf tools (ie: implement perf_bpcounter_event()) - Support from perf tools Changes in v2: - Follow the perf "event " rename - The ptrace regression have been fixed (ptrace breakpoint perf events weren't released when a task ended) - Drop the struct hw_breakpoint and store generic fields in perf_event_attr. - Separate core and arch specific headers, drop asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h - Use new generic len/type for breakpoint - Handle off case: when breakpoints api is not supported by an arch Changes in v3: - Fix broken CONFIG_KVM, we need to propagate the breakpoint api changes to kvm when we exit the guest and restore the bp registers to the host. Changes in v4: - Drop the hw_breakpoint_restore() stub as it is only used by KVM - EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a module - Restore the breakpoints unconditionally on kvm guest exit: TIF_DEBUG_THREAD doesn't anymore cover every cases of running breakpoints and vcpu->arch.switch_db_regs might not always be set when the guest used debug registers. (Waiting for a reliable optimization) Changes in v5: - Split-up the asm-generic/hw-breakpoint.h moving to linux/hw_breakpoint.h into a separate patch - Optimize the breakpoints restoring while switching from kvm guest to host. We only want to restore the state if we have active breakpoints to the host, otherwise we don't care about messed-up address registers. 
- Add asm/hw_breakpoint.h to Kbuild - Fix bad breakpoint type in trace_selftest.c Changes in v6: - Fix wrong header inclusion in trace.h (triggered a build error with CONFIG_FTRACE_SELFTEST) Signed-off-by: Frederic Weisbecker Cc: Prasad Cc: Alan Stern Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Ingo Molnar Cc: Jan Kiszka Cc: Jiri Slaby Cc: Li Zefan Cc: Avi Kivity Cc: Paul Mackerras Cc: Mike Galbraith Cc: Masami Hiramatsu Cc: Paul Mundt commit 2ae8bb75db1f3de422eb5898f2a063c46c36dba8 Author: Tejun Heo Date: Mon Oct 26 15:41:46 2009 +0100 x86: Fix iommu=nodac parameter handling iommu=nodac should forbid dac instead of enabling it. Fix it. Signed-off-by: Tejun Heo Acked-by: FUJITA Tomonori Cc: Matteo Frigo Cc: # .32.x and older LKML-Reference: <4AE5B52A.4050408@kernel.org> Signed-off-by: Ingo Molnar commit d8c80ce091f6ead6710bc71b58f2c32e5bf855e4 Author: Lai Jiangshan Date: Tue Oct 27 15:45:23 2009 +0800 sched, no_hz: Remove unused rq->last_tick_seen field In 15934a37324f32e0fda633dc7984a671ea81cd75, field last_tick_seen was added to struct rq. But it is unused now. Signed-off-by: Lai Jiangshan Cc: Guillaume Chazarain LKML-Reference: <4AE6A513.6010100@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit c82a43d40b93200a10a9fec0a489791e65e135ca Author: Cyrill Gorcunov Date: Mon Oct 26 23:28:11 2009 +0300 irq: Do not attempt to create subdirectories if /proc/irq/ failed If a parent directory (ie /proc/irq/) could not be created we should not attempt to create subdirectories. Otherwise the "smp_affinity" and "spurious" entries may be registered under the /proc root instead of their proper place.
Signed-off-by: Cyrill Gorcunov Cc: Rusty Russell Cc: Yinghai Lu LKML-Reference: <20091026202811.GD5321@lenovo> Signed-off-by: Ingo Molnar commit 338bac527ed0e35b4cb50390972f15d3cbce92ca Author: FUJITA Tomonori Date: Tue Oct 27 16:34:44 2009 +0900 x86: Use x86_platform for iommu_shutdown This patch cleans up pci_iommu_shutdown() a bit to use x86_platform (similar to how IA64 initializes an IOMMU driver). This adds iommu_shutdown() to x86_platform to avoid calling every IOMMUs' shutdown functions in pci_iommu_shutdown() in order. The IOMMU shutdown functions are platform specific (we don't have multiple different IOMMU hardware) so the current way is pointless. An IOMMU driver sets x86_platform.iommu_shutdown to the shutdown function if necessary. Signed-off-by: FUJITA Tomonori Cc: joerg.roedel@amd.com LKML-Reference: <20091027163358F.fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar commit 0d0fbbddcc27c062815732b38c44b544e656c799 Author: Rusty Russell Date: Thu Nov 5 22:45:41 2009 +1030 x86, msr, cpumask: Use struct cpumask rather than the deprecated cpumask_t This makes the declarations match the definitions, which already use 'struct cpumask'. Signed-off-by: Rusty Russell Acked-by: Borislav Petkov LKML-Reference: <200911052245.41803.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar commit c12a229bc5971534537a7d0e49e44f9f1f5d0336 Author: Masami Hiramatsu Date: Thu Nov 5 11:03:59 2009 -0500 x86: Remove unused thread_return label from switch_to() Remove unused thread_return label from switch_to() macro on x86-64. Since this symbol cuts into schedule(), backtrace at the latter half of schedule() was always shown as thread_return(). 
Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE LKML-Reference: <20091105160359.5181.26225.stgit@harusame> Signed-off-by: Ingo Molnar commit 8d06367fa79c053a4a56a2ce0bb9e840f5da1236 Author: Arnaldo Carvalho de Melo Date: Wed Nov 4 18:50:43 2009 -0200 perf symbols: Use the buildids if present With this change 'perf record' will intercept PERF_RECORD_MMAP calls, creating a linked list of DSOs, then when the session finishes, it will traverse this list and read the buildids, stashing them at the end of the file and will set up a new feature bit in the header bitmask. 'perf report' will then notice this feature and populate the 'dsos' list and set the build ids. When reading the symtabs it will refuse to load from a file that doesn't have the same build id. This improves the reliability of the profiler output, as symbols and profiling data is more guaranteed to match. Example: [root@doppio ~]# perf report | head /home/acme/bin/perf with build id b1ea544ac3746e7538972548a09aadecc5753868 not found, continuing without symbols # Samples: 2621434559 # # Overhead Command Shared Object Symbol # ........ ............... ............................. ...... # 7.91% init [kernel] [k] read_hpet 7.64% init [kernel] [k] mwait_idle_with_hints 7.60% swapper [kernel] [k] read_hpet 7.60% swapper [kernel] [k] mwait_idle_with_hints 3.65% init [kernel] [k] 0xffffffffa02339d9 [root@doppio ~]# In this case the 'perf' binary was an older one, vanished, so its symbols probably wouldn't match or would cause subtly different (and misleading) output. Next patches will support the kernel as well, reading the build id notes for it and the modules from /sys. Another patch should also introduce a new plumbing command: 'perf list-buildids' that will then be used in porcelain that is distro specific to fetch -debuginfo packages where such buildids are present. This will in turn allow for one to run 'perf record' in one machine and 'perf report' in another. 
Future work on having the buildid sent directly from the kernel in the PERF_RECORD_MMAP event is needed to close races, as the DSO can be changed during a 'perf record' session, but this patch at least helps with non-corner cases and current/older kernels. Signed-off-by: Arnaldo Carvalho de Melo Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: Jim Keniston Cc: K. Prasad Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: Roland McGrath Cc: Srikar Dronamraju Cc: Steven Rostedt LKML-Reference: <1257367843-26224-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 444a2a3bcd6d5bed5c823136f68fcc93c0fe283f Author: Frederic Weisbecker Date: Fri Nov 6 04:13:05 2009 +0100 tracing, perf_events: Protect the buffer from recursion in perf While tracing using events with perf, if one enables the lockdep:lock_acquire event, it will infect every other perf trace events. Basically, you can enable whatever set of trace events through perf but if this event is part of the set, the only result we can get is a long list of lock_acquire events of rcu read lock, and only that. This is because of a recursion inside perf. 1) When a trace event is triggered, it will fill a per cpu buffer and submit it to perf. 2) Perf will commit this event but will also protect some data using rcu_read_lock 3) A recursion appears: rcu_read_lock triggers a lock_acquire event that will fill the per cpu event and then submit the buffer to perf. 4) Perf detects a recursion and ignores it 5) Perf continues its work on the previous event, but its buffer has been overwritten by the lock_acquire event, it has then been turned into a lock_acquire event of rcu read lock Such scenario also happens with lock_release with rcu_read_unlock(). 
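The overwrite in steps 1) to 5) happens because the nested event reuses the buffer the outer event is still being built in. A counter on the buffer is one way to refuse such nested use. The following is a single-threaded user-space model of that idea only; all demo_* names are hypothetical, and the kernel's per cpu handling, interrupt masking and barriers are deliberately left out.

```c
#include <string.h>

/* Hypothetical model of one trace buffer (the real ones are per cpu). */
struct demo_trace_buf {
	int  recursion;		/* > 0 while an event is being built */
	char data[32];		/* payload of the event being built */
};

struct demo_trace_buf demo_buf;
char demo_committed[32];	/* what the consumer finally received */
int  demo_dropped;		/* nested events refused by the guard */

/* Build and commit one event. If commit_fires_lock_event is set, the
 * commit step itself raises a nested "lock_acquire" event, as
 * rcu_read_lock() does in the scenario above.
 * Returns 1 if the event was committed, 0 if it was dropped. */
int demo_trace_event(const char *name, int commit_fires_lock_event)
{
	if (demo_buf.recursion > 0) {
		/* buffer is owned by an outer event: do not touch it */
		demo_dropped++;
		return 0;
	}
	demo_buf.recursion++;

	strncpy(demo_buf.data, name, sizeof(demo_buf.data) - 1);
	demo_buf.data[sizeof(demo_buf.data) - 1] = '\0';

	if (commit_fires_lock_event)
		demo_trace_event("lock_acquire", 0);	/* nested attempt */

	/* the outer payload is still intact at commit time */
	memcpy(demo_committed, demo_buf.data, sizeof(demo_committed));

	demo_buf.recursion--;
	return 1;
}
```

With the guard in place the nested lock_acquire event is dropped and the outer event keeps its payload; without the check at the top, the nested call would overwrite data[] before the commit.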
We could turn the rcu_read_lock() into __rcu_read_lock() to drop the lock debugging from perf fast path, but that would make us lose the rcu debugging and that doesn't prevent from other possible kind of recursion from perf in the future. This patch adds a recursion protection based on a counter on the perf trace per cpu buffers to solve the problem. -v2: Fixed lost whitespace, added reviewed-by tag Signed-off-by: Frederic Weisbecker Reviewed-by: Masami Hiramatsu Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Steven Rostedt Cc: Li Zefan Cc: Jason Baron LKML-Reference: <1257477185-7838-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit bfde82ef51e3ea6ab8634d0fdbf5adcdd1b429cb Author: Hitoshi Mitake Date: Thu Nov 5 09:31:37 2009 +0900 perf bench: Add subcommand 'bench' to the Makefile This patch modifies Makefile for new files related to 'bench' subcommand. The new code is active from this point on. Signed-off-by: Hitoshi Mitake Cc: Rusty Russell Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: fweisbec@gmail.com Cc: Jiri Kosina LKML-Reference: <1257381097-4743-8-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit dcba8848d3bc83ec9ee0858b9ae6e4f1c1fa7fa3 Author: Hitoshi Mitake Date: Thu Nov 5 09:31:36 2009 +0900 perf bench: Add new subcommand 'bench' to perf.c This patch modifies perf.c for invoking 'bench' subcommand. Signed-off-by: Hitoshi Mitake Cc: Rusty Russell Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: fweisbec@gmail.com Cc: Jiri Kosina LKML-Reference: <1257381097-4743-7-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 11bd341c043348ecb7462d3bd8e1ad6d00f6892a Author: Hitoshi Mitake Date: Thu Nov 5 09:31:35 2009 +0900 perf bench: Modify builtin.h for new prototype This patch modifies builtin.h to add prototype of cmd_bench(). 
Signed-off-by: Hitoshi Mitake Cc: Rusty Russell Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: fweisbec@gmail.com Cc: Jiri Kosina LKML-Reference: <1257381097-4743-6-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 629cc356653719c206a05f4dee5c5e242edb6546 Author: Hitoshi Mitake Date: Thu Nov 5 09:31:34 2009 +0900 perf bench: Add builtin-bench.c: General framework for benchmark suites This patch adds builtin-bench.c builtin-bench.c is a general framework for benchmark suites. Signed-off-by: Hitoshi Mitake Cc: Rusty Russell Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: fweisbec@gmail.com Cc: Jiri Kosina LKML-Reference: <1257381097-4743-5-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit c7d9300f367f480aee4663a0e3695c5b48859a1a Author: Hitoshi Mitake Date: Thu Nov 5 09:31:33 2009 +0900 perf bench: Add sched-pipe.c: Benchmark for pipe() system call This patch adds bench/sched-pipe.c. bench/sched-pipe.c is a benchmark program to measure performance of pipe() system call. This benchmark is based on pipe-test-1m.c by Ingo Molnar: http://people.redhat.com/mingo/cfs-scheduler/tools/pipe-test-1m.c Example of use: % perf bench sched pipe (executing 1000000 pipe operations between two tasks) Total time:4.499 sec 4.499179 usecs/op 222262 ops/sec % perf bench sched pipe -s -l 1000 0.015 Signed-off-by: Hitoshi Mitake Cc: Rusty Russell Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: fweisbec@gmail.com Cc: Jiri Kosina LKML-Reference: <1257381097-4743-4-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit e27454cc6352c4226ddc76f5e3a5dedd7dff456a Author: Hitoshi Mitake Date: Thu Nov 5 09:31:32 2009 +0900 perf bench: Add sched-messaging.c: Benchmark for scheduler and IPC mechanisms based on hackbench This patch adds bench/sched-messaging.c. 
This benchmark measures performance of scheduler and IPC mechanisms, and is based on hackbench by Rusty Russell. Example of usage: % perf bench sched messaging -g 20 -l 1000 -s 5.432 # in sec % perf bench sched messaging # run with default options (20 sender and receiver processes per group) (10 groups == 400 processes run) Total time:0.308 sec % perf bench sched messaging -t -g 20 # # be multi-thread, with 20 groups (20 sender and receiver threads per group) (20 groups == 800 threads run) Total time:0.582 sec ( Rusty is the original author of hackbench.c and he said the code is and was under the GPLv2 so fine to be merged. ) Signed-off-by: Hitoshi Mitake Acked-by: Rusty Russell Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: fweisbec@gmail.com Cc: Jiri Kosina LKML-Reference: <1257381097-4743-3-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit c426bba069e65ea438880a04aa4e7c5b880e1728 Author: Hitoshi Mitake Date: Thu Nov 5 09:31:31 2009 +0900 perf bench: Add new directory and header for new subcommand 'bench' This patch adds bench/ directory and bench/bench.h. bench/ directory will contain modules for bench subcommand. bench/bench.h is for listing prototypes of module functions. Signed-off-by: Hitoshi Mitake Cc: Rusty Russell Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Arnaldo Carvalho de Melo Cc: fweisbec@gmail.com Cc: Jiri Kosina LKML-Reference: <1257381097-4743-2-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Ingo Molnar commit 2da3e160cb3d226d87b907fab26850d838ed8d7c Author: Frederic Weisbecker Date: Thu Nov 5 23:06:50 2009 +0100 hw-breakpoint: Move asm-generic/hw_breakpoint.h to linux/hw_breakpoint.h We plan to make the breakpoints parameters generic among architectures. For that it's better to move the asm-generic header to a generic linux header. 
Signed-off-by: Frederic Weisbecker commit fd210738f6601d0fb462df9a2fe5a41896ff6a8f Author: Mike Galbraith Date: Thu Nov 5 10:57:46 2009 +0100 sched: Fix affinity logic in select_task_rq_fair() Ingo Molnar reported: [ 26.804000] BUG: using smp_processor_id() in preemptible [00000000] code: events/1/10 [ 26.808000] caller is vmstat_update+0x26/0x70 [ 26.812000] Pid: 10, comm: events/1 Not tainted 2.6.32-rc5 #6887 [ 26.816000] Call Trace: [ 26.820000] [] ? printk+0x28/0x3c [ 26.824000] [] debug_smp_processor_id+0xf0/0x110 [ 26.824000] mount used greatest stack depth: 1464 bytes left [ 26.828000] [] vmstat_update+0x26/0x70 [ 26.832000] [] worker_thread+0x188/0x310 [ 26.836000] [] ? worker_thread+0x127/0x310 [ 26.840000] [] ? autoremove_wake_function+0x0/0x60 [ 26.844000] [] ? worker_thread+0x0/0x310 [ 26.848000] [] kthread+0x7c/0x90 [ 26.852000] [] ? kthread+0x0/0x90 [ 26.856000] [] kernel_thread_helper+0x7/0x10 [ 26.860000] BUG: using smp_processor_id() in preemptible [00000000] code: events/1/10 [ 26.864000] caller is vmstat_update+0x3c/0x70 Because this commit: a1f84a3: sched: Check for an idle shared cache in select_task_rq_fair() broke ->cpus_allowed. Signed-off-by: Mike Galbraith Cc: Peter Zijlstra Cc: arjan@infradead.org Cc: LKML-Reference: <1257415066.12867.1.camel@marge.simson.net> Signed-off-by: Ingo Molnar commit 1b9508f6831e10d53256825de8904caa22d1ca2c Author: Mike Galbraith Date: Wed Nov 4 17:53:50 2009 +0100 sched: Rate-limit newidle Rate limit newidle to migration_cost. It's a win for all stages of sysbench oltp tests. Signed-off-by: Mike Galbraith Cc: Peter Zijlstra LKML-Reference: Signed-off-by: Ingo Molnar commit a1f84a3ab8e002159498814eaa7e48c33752b04b Author: Mike Galbraith Date: Tue Oct 27 15:35:38 2009 +0100 sched: Check for an idle shared cache in select_task_rq_fair() When waking affine, check for an idle shared cache, and if found, wake to that CPU/sibling instead of the waker's CPU. This improves pgsql+oltp ramp up by roughly 8%. 
Possibly more for other loads, depending on overlap. The trade-off is a roughly 1% peak downturn if tasks are truly synchronous. Signed-off-by: Mike Galbraith Cc: Arjan van de Ven Cc: Peter Zijlstra Cc: LKML-Reference: <1256654138.17752.7.camel@marge.simson.net> Signed-off-by: Ingo Molnar commit 2a855dd01bc1539111adb7233f587c5c468732ac Author: Sebastian Andrzej Siewior Date: Sun Oct 25 15:37:58 2009 +0100 signal: Fix alternate signal stack check All architectures in the kernel increment/decrement the stack pointer before storing values on the stack. On architectures which have the stack grow down, sas_ss_sp == sp is not on the alternate signal stack while sas_ss_sp + sas_ss_size == sp is on the alternate signal stack. On architectures which have the stack grow up, sas_ss_sp == sp is on the alternate signal stack while sas_ss_sp + sas_ss_size == sp is not on the alternate signal stack. The current implementation fails for architectures which have the stack grow down on the corner case where sas_ss_sp == sp. This was reported as Debian bug #544905 on AMD64. Simplified test case: http://download.breakpoint.cc/tc-sig-stack.c The test case creates the following stack scenario: 0xn0300 stack top 0xn0200 alt stack pointer top (when switching to alt stack) 0xn01ff alt stack end 0xn0100 alt stack start == stack pointer If the signal is sent, the stack pointer is pointing to the base address of the alt stack, and the kernel erroneously decides that it has already switched to the alternate stack because of the current check for "sp - sas_ss_sp < sas_ss_size". On parisc (stack grows up) the scenario would be: 0xn0200 stack pointer 0xn01ff alt stack end 0xn0100 alt stack start = alt stack pointer base (when switching to alt stack) 0xn0000 stack base This is handled correctly by the current implementation. [ tglx: Modified for archs which have the stack grow up (parisc) which would fail with the correct implementation for stack grows down.
Added a check for sp >= current->sas_ss_sp which is not strictly necessary but makes the code symmetric for both variants ] Signed-off-by: Sebastian Andrzej Siewior Cc: Oleg Nesterov Cc: Roland McGrath Cc: Kyle McMartin Cc: stable@kernel.org LKML-Reference: <20091025143758.GA6653@Chamillionaire.breakpoint.cc> Signed-off-by: Thomas Gleixner commit 663e69592856df53ef52969482ef413a96bc4e06 Author: Thomas Gleixner Date: Wed Nov 4 14:22:21 2009 +0100 irq: Remove unused debug_poll_all_shared_irqs() commit 74296a8ed added this function for debug purposes, but it was never used for anything. Remove it. Signed-off-by: Thomas Gleixner commit 24b26d4211130b6455692804c14d537158855cd7 Author: Liuweni Date: Wed Nov 4 20:11:05 2009 +0800 irq: Fix docbook comments Fix docbook comments to match the actual function names (set_irq_msi, handle_percpu_irq). Signed-off-by: Liuwenyi Signed-off-by: Thomas Gleixner commit ce7c42710e2dd133f10b7fc9ed9c73bdd2435f7a Author: Rusty Russell Date: Tue Nov 3 14:53:52 2009 +1030 cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c Ingo wants the certainty of a static cpumask (rather than a cpumask_var_t), but cpumask_t will some day be undefined to avoid on-stack declarations. This is what DECLARE_BITMAP/to_cpumask() is for. Signed-off-by: Rusty Russell LKML-Reference: <200911031453.52394.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar commit acc3f5d7cabbfd6cec71f0c1f9900621fa2d6ae7 Author: Rusty Russell Date: Tue Nov 3 14:53:40 2009 +1030 cpumask: Partition_sched_domains takes array of cpumask_var_t Currently partition_sched_domains() takes a 'struct cpumask *doms_new' which is a kmalloc'ed array of cpumask_t. You can't have such an array if 'struct cpumask' is undefined, as we plan for CONFIG_CPUMASK_OFFSTACK=y. So, we make this an array of cpumask_var_t instead: this is the same for the CONFIG_CPUMASK_OFFSTACK=n case, but requires multiple allocations for the CONFIG_CPUMASK_OFFSTACK=y case.
Hence we add alloc_sched_domains() and free_sched_domains() functions. Signed-off-by: Rusty Russell Cc: Peter Zijlstra LKML-Reference: <200911031453.40668.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar commit e2c880630438f80b474378d5487b511b07665051 Author: Rusty Russell Date: Tue Nov 3 14:53:15 2009 +1030 cpumask: Simplify sched_rt.c find_lowest_rq() wants to call pick_optimal_cpu() on the intersection of sched_domain_span(sd) and lowest_mask. Rather than doing a cpus_and into a temporary, we can open-code it. This actually makes the code slightly clearer, IMHO. Signed-off-by: Rusty Russell Acked-by: Gregory Haskins Cc: Steven Rostedt LKML-Reference: <200911031453.15350.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar commit 09879b99d44d701c603935ef2549004405d7f8f9 Author: Hiroshi Shimamoto Date: Wed Nov 4 12:58:15 2009 +0900 x86: Gitignore: arch/x86/lib/inat-tables.c Ignore generated file arch/x86/lib/inat-tables.c. Signed-off-by: Hiroshi Shimamoto Acked-by: Masami Hiramatsu LKML-Reference: <4AF0FBD7.7000501@ct.jp.nec.com> Signed-off-by: Ingo Molnar commit 77b44d1b7c28360910cdbd427fb62d485c08674c Author: Masami Hiramatsu Date: Tue Nov 3 19:12:47 2009 -0500 tracing/kprobes: Rename Kprobe-tracer to kprobe-event Rename Kprobes-based event tracer to kprobes-based tracing event (kprobe-event), since it is not a tracer but an extensible tracing event interface. This also changes CONFIG_KPROBE_TRACER to CONFIG_KPROBE_EVENT and sets it y by default. Signed-off-by: Masami Hiramatsu Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. 
Eigler Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091104001247.3454.14131.stgit@harusame> Signed-off-by: Ingo Molnar commit 91365bbe4f8c39a821f390f785d606304d6dee3c Author: Masami Hiramatsu Date: Tue Nov 3 19:12:38 2009 -0500 perf/probes: Rename perf probe events group name Rename the group name of perf probe events to 'probe'. Signed-off-by: Masami Hiramatsu Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091104001238.3454.70508.stgit@harusame> Signed-off-by: Ingo Molnar commit a225a1d911f0e434dc0407df29fd08e4388f3fa4 Author: Masami Hiramatsu Date: Tue Nov 3 19:12:30 2009 -0500 perf/probes: Fall back to non-dwarf if possible Fall back to non-dwarf probe point if the probe definition may not need dwarf analysis, when perf can't find vmlinux/debuginfo. This might skip some inlined code of target function. Signed-off-by: Masami Hiramatsu Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091104001229.3454.63987.stgit@harusame> Signed-off-by: Ingo Molnar commit a7f4328b91fb6e71dbe1fa4d46f3597c9555014d Author: Masami Hiramatsu Date: Tue Nov 3 19:12:21 2009 -0500 perf/probes: Improve error messages Improve error messages in perf-probe so that users can figure out problems easily. Reported-by: Ingo Molnar Signed-off-by: Masami Hiramatsu Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. 
Eigler Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091104001221.3454.52030.stgit@harusame> Signed-off-by: Ingo Molnar commit c43f9d1e61e265c6bfafdd65c7f07c8d71a7efc3 Author: Masami Hiramatsu Date: Tue Nov 3 19:12:13 2009 -0500 perf/probes: Update Documentation/perf-probe.txt Update Documentation/perf-probe.txt according to recent syntax changes. Signed-off-by: Masami Hiramatsu Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091104001212.3454.19415.stgit@harusame> Signed-off-by: Ingo Molnar commit 45a5c8bad827ebb9c9798becc15bce2e804d49e0 Author: Dhaval Giani Date: Wed Nov 4 03:15:44 2009 +0530 sched: Add USER_SCHED to feature removal list Peter Zijlstra suggested that we remove USER_SCHED at: http://lkml.org/lkml/2009/3/21/67 Removing USER_SCHED removes a lot of code from the scheduler and simplifies the code. We already have the ability to do user based classification which is tightened using PAM in userspace. Schedule USER_SCHED for removal in 2.6.34. Signed-off-by: Dhaval Giani Acked-by: Peter Zijlstra Cc: Balbir Singh Cc: Bharata B Rao Cc: Serge E. Hallyn Cc: Srivatsa Vaddagiri LKML-Reference: <20091103214544.GI5495@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar commit 2643ce11457a99a85c5bed8dd631e35968e6ca5a Author: Arnaldo Carvalho de Melo Date: Tue Nov 3 21:46:10 2009 -0200 perf symbols: Factor out buildid reading routine So that we can run it without having a DSO instance.
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1257291970-8208-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit a2e71271535fde493c32803b1f34789f97efcb5e Merge: 6d7aa9d b419148 Author: Ingo Molnar Date: Wed Nov 4 11:54:15 2009 +0100 Merge commit 'v2.6.32-rc6' into perf/core Conflicts: tools/perf/Makefile Merge reason: Resolve the conflict, merge to upstream and merge in perf fixes so we can add a dependent patch. Signed-off-by: Ingo Molnar commit 9824a2b728b63e7ff586b9fd9293c819be79f0f3 Author: Hiroshi Shimamoto Date: Wed Nov 4 16:16:54 2009 +0900 sched: Remove unused cpu_nr_migrations() cpu_nr_migrations() is not used, remove it. Signed-off-by: Hiroshi Shimamoto Cc: Peter Zijlstra LKML-Reference: <4AF12A66.6020609@ct.jp.nec.com> Signed-off-by: Ingo Molnar commit 2a2bb3142d326bb28b03875cabfc49baaac9a14a Author: Hiroshi Shimamoto Date: Wed Nov 4 16:16:10 2009 +0900 sched: Remove unused time_sync_thresh declaration time_sync_thresh had been removed. Signed-off-by: Hiroshi Shimamoto Cc: Peter Zijlstra LKML-Reference: <4AF12A3A.5050200@ct.jp.nec.com> Signed-off-by: Ingo Molnar commit 1477b6a7edd9ffa7bba4f9779ce9a76ce92761ed Author: Hiroshi Shimamoto Date: Wed Nov 4 16:14:16 2009 +0900 sched: Remove unused __schedule() declaration __schedule() had been removed. Signed-off-by: Hiroshi Shimamoto Cc: Peter Zijlstra LKML-Reference: <4AF129C8.3030008@ct.jp.nec.com> Signed-off-by: Ingo Molnar commit 97829de5a3b88899c5f3ac8802d11868bf4180ba Author: Brian Gerst Date: Tue Nov 3 14:02:05 2009 -0500 x86, 64-bit: Fix bstep_iret jump This jump should be unconditional. 
Signed-off-by: Brian Gerst LKML-Reference: <1257274925-15713-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar commit 97eaf5300b9d0cd99c310bf8c4a0f2f3296d88a3 Author: Frederic Weisbecker Date: Sun Oct 18 15:33:50 2009 +0200 perf/core: Add a callback to perf events A simple callback in a perf event can be used for multiple purposes. For example it is useful for triggered based events like hardware breakpoints that need a callback to dispatch a triggered breakpoint event. v2: Simplify a bit the callback attribution as suggested by Paul Mackerras Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: "K.Prasad" Cc: Alan Stern Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Ingo Molnar Cc: Paul Mackerras Cc: Mike Galbraith Cc: Paul Mundt commit 6d7aa9d721c8c640066142fd9534afcdf68d7f9d Author: Arnaldo Carvalho de Melo Date: Tue Nov 3 15:52:18 2009 -0200 perf symbols: Initialize dso->loaded Brown paper bag bug introduced in: 66bd8424cc05e800db384053bf7ab967e4658468 ("perf tools: Delay loading symtabs till we hit a map with it") Without this we were not loading any symtabs that happened to be on a DSO for which the allocated memory for ->loaded was !0. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1257270738-5669-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit c1e530178540df26eb39f10a972d06f96302ceb4 Author: Thiago Farina Date: Tue Nov 3 08:28:45 2009 -0500 perf: Clean up trivial style issues in builtin-help.c Pointed out by checkpatch. Signed-off-by: Thiago Farina Cc: a.p.zijlstra@chello.nl Cc: paulus@samba.org Cc: Valdis.Kletnieks@vt.edu LKML-Reference: <1257254925-5423-1-git-send-email-tfransosi@gmail.com> Signed-off-by: Ingo Molnar commit 41a48d14f6991020c9bb6b93e289ca5b411ed09a Author: Paul Mundt Date: Mon Oct 5 19:23:06 2009 +0900 x86/hw-breakpoints: Actually flush thread breakpoints in flush_thread(). 
flush_thread() tries to do a TIF_DEBUG check before calling in to flush_thread_hw_breakpoint() (which subsequently clears the thread flag), but for some reason, the x86 code is manually clearing TIF_DEBUG immediately before the test, so this path will never be taken. This kills off the erroneous clear_tsk_thread_flag() and lets flush_thread_hw_breakpoint() actually get invoked. Presumably folks were getting lucky with testing and the free_thread_info() -> free_thread_xstate() path was taking care of the flush there. Signed-off-by: Paul Mundt Acked-by: "K.Prasad" Cc: Ingo Molnar Cc: Alan Stern LKML-Reference: <20091005102306.GA7889@linux-sh.org> Signed-off-by: Frederic Weisbecker commit fb0459d75c1d0a4ba3cafdd2c754e7486968a676 Author: Arjan van de Ven Date: Fri Sep 25 12:25:56 2009 +0200 perf/core: Provide a kernel-internal interface to get to performance counters There are reasons for kernel code to ask for, and use, performance counters. For example, in CPU freq governors this tends to be a good idea, but there are other examples possible as well of course. This patch adds the needed bits to enable this functionality; they have been tested in an experimental cpufreq driver that I'm working on, and the changes are all that I needed to access counters properly. [fweisbec@gmail.com: added pid to perf_event_create_kernel_counter so that we can profile a particular task too. TODO: Have better error reporting, don't just return NULL in the failure case.] v2: Remove the wrong comment claiming that perf_event_create_kernel_counter must be called from a kernel thread.
Signed-off-by: Arjan van de Ven Acked-by: Peter Zijlstra Cc: "K.Prasad" Cc: Alan Stern Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Ingo Molnar Cc: Jan Kiszka Cc: Jiri Slaby Cc: Li Zefan Cc: Avi Kivity Cc: Paul Mackerras Cc: Mike Galbraith Cc: Masami Hiramatsu Cc: Paul Mundt Cc: Jan Kiszka Cc: Avi Kivity LKML-Reference: <20090925122556.2f8bd939@infradead.org> Signed-off-by: Frederic Weisbecker commit 12e4db4790b1bd2b7ec70eb2a1386c00fc683740 Author: Arnaldo Carvalho de Melo Date: Tue Nov 3 11:29:07 2009 -0200 perf probe: Annotate variable initialization Annotate away this false positive warning on older GCCs: cc1: warnings being treated as errors builtin-probe.c: In function ‘parse_probe_event’: builtin-probe.c:72: warning: ‘nc’ is used uninitialized in this function Signed-off-by: Arnaldo Carvalho de Melo Acked-by: Masami Hiramatsu LKML-Reference: <1257254947-16789-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit a489ca355efaf9efa4990b0f8f30ab650a206273 Author: Arjan van de Ven Date: Mon Nov 2 16:59:15 2009 -0800 x86: Make sure we also print a Code: line for show_regs() show_regs() is called as a mini BUG() equivalent in some places, specifically for the "scheduling while atomic" case. Unfortunately right now it does not print a Code: line unlike a real bug/oops. This patch changes the x86 implementation of show_regs() so that it calls the same function as oopses do to print the registers as well as the Code: line. Signed-off-by: Arjan van de Ven LKML-Reference: <20091102165915.4a980fc0@infradead.org> Signed-off-by: Ingo Molnar commit 31bde71c202722a76686c3cf69a254c8a912275a Author: Matt Domsch Date: Tue Nov 3 12:05:50 2009 +1100 tpm: autoload tpm_tis based on system PnP IDs The tpm_tis driver already has a list of supported pnp_device_ids. This patch simply exports that list as a MODULE_DEVICE_TABLE() so that the module autoloader will discover and load the module at boottime. 
Signed-off-by: Matt Domsch Acked-by: Rajiv Andrade Signed-off-by: Andrew Morton Signed-off-by: James Morris commit 900b20d5900045fb9b48f2fb3d80cbdbae3f44c0 Author: Ingo Molnar Date: Mon Nov 2 19:25:25 2009 +0100 perf tools: Fix missing symtabs printouts Fix: util/map.c: In function ‘map__find_symbol’: util/map.c:97: error: field precision should have type ‘int’, but argument 3 has type ‘size_t’ Also clean up some line wrap damage - we dont line-wrap printk messages. Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256927305-4628-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 4dae560f97fa438f373b53e14b30149c9e44a600 Author: Ananth N Mavinakayanahalli Date: Fri Oct 30 19:23:10 2009 +0530 kprobes: Sanitize struct kretprobe_instance allocations For as long as kretprobes have existed, we've allocated NR_CPUS instances of kretprobe_instance structures. With the default value of CONFIG_NR_CPUS increasing on certain architectures, we are potentially wasting kernel memory. See http://sourceware.org/bugzilla/show_bug.cgi?id=10839#c3 for more details. Use a saner num_possible_cpus() instead of NR_CPUS for allocation. Signed-off-by: Ananth N Mavinakayanahalli Acked-by: Masami Hiramatsu Cc: Jim Keniston Cc: fweisbec@gmail.com LKML-Reference: <20091030135310.GA22230@in.ibm.com> Signed-off-by: Ingo Molnar commit d70a5402f9c2e2671b809363616b3508b4c5a565 Author: Arnaldo Carvalho de Melo Date: Fri Oct 30 16:28:25 2009 -0200 perf tools: Improve message about missing symtabs for deleted DSOs Instead of: no symbols found in /usr/lib/gstreamer-0.10/libgsttypefindfunctions.so (deleted), maybe install a debug package? no symbols found in /usr/lib/gstreamer-0.10/libgstaudioconvert.so (deleted), maybe install a debug package? We now emit: /usr/lib/gstreamer-0.10/libgsttypefindfunctions.so was updated, restart the long running apps that use it! 
/usr/lib/gstreamer-0.10/libgstaudioconvert.so was updated, restart the long running apps that use it! Which is far less misleading about what the cause of the symbol mismatch is. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256927305-4628-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit 00a192b395b0606ad0265243844b3cd68e73420a Author: Arnaldo Carvalho de Melo Date: Fri Oct 30 16:28:24 2009 -0200 perf tools: Simplify the symbol priv area mechanism Before we were storing this in the DSO, but in fact this is a property of the 'symbol' class, not something that will vary among DSOs, so move it to a global variable and initialize it using the existing symbol__init routine. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256927305-4628-2-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit afb7b4f08e274cecd8337f9444affa288a9cd4c1 Author: Arnaldo Carvalho de Melo Date: Fri Oct 30 16:28:23 2009 -0200 perf tools: Factor out the map initialization Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256927305-4628-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit c5e0cb3ddc5f14cedcfc50c0fb3b5fc6b56576da Author: Lai Jiangshan Date: Wed Oct 28 08:14:48 2009 -0700 rcu: Cleanup: balance rcu_irq_enter()/rcu_irq_exit() calls Currently, rcu_irq_exit() is invoked only for CONFIG_NO_HZ, while rcu_irq_enter() is invoked unconditionally. This patch moves rcu_irq_exit() out from under CONFIG_NO_HZ so that the calls are balanced. This patch has no effect on the behavior of the kernel because both rcu_irq_enter() and rcu_irq_exit() are empty for !CONFIG_NO_HZ, but the code is easier to understand if the calls are obviously balanced in all cases. 
Signed-off-by: Lai Jiangshan Signed-off-by: Paul E. McKenney Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12567428891605-git-send-email-> Signed-off-by: Ingo Molnar commit 5231a68614b94f60e8f6c56bc6e3d75955b9e75e Author: Suresh Siddha Date: Mon Oct 26 14:24:36 2009 -0800 x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs() To ensure that we handle all the pending interrupts (destined for this cpu that is going down) in the interrupt subsystem before the cpu goes offline, fixup_irqs() does: local_irq_enable(); mdelay(1); local_irq_disable(); Enabling interrupts is not a good thing as this cpu is already offline. So this patch replaces that logic with: mdelay(1); check the APIC_IRR bits; retrigger the irq at the new destination via IPI if any interrupt has arrived. For IO-APIC level triggered interrupts, this retrigger IPI will appear as an edge interrupt. ack_apic_level() will detect this condition and IO-APIC RTE's remoteIRR is cleared using directed EOI (using IO-APIC EOI register) on Intel platforms and for others it uses the existing mask+edge logic followed by unmask+level. We can also remove mdelay() and then send spurious interrupts to new cpu targets for all the irqs that were handled previously by this cpu that is going offline. While it works, I have seen spurious interrupt messages (nothing wrong but still annoying messages during cpu offline, which can be seen during suspend/resume etc). Signed-off-by: Suresh Siddha Acked-by: Gary Hade Cc: Eric W.
Biederman LKML-Reference: <20091026230002.043281924@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit b3ec0a37a7907813bb4fb85a2d94102c152470b7 Author: Suresh Siddha Date: Mon Oct 26 14:24:35 2009 -0800 x86: Use EOI register in io-apic on intel platforms IO-APICs in Intel chipsets support an EOI register starting from IO-APIC version 2. Use that whenever we need to clear the IO-APIC RTE's RemoteIRR bit explicitly. Signed-off-by: Suresh Siddha Acked-by: Gary Hade Cc: Eric W. Biederman LKML-Reference: <20091026230001.947855317@sbs-t61.sc.intel.com> [ Marked use_eio_reg as __read_mostly, fixed small details ] Signed-off-by: Ingo Molnar commit a5e74b841930bec78a4684ab9f208b2ddfe7c736 Author: Suresh Siddha Date: Mon Oct 26 14:24:34 2009 -0800 x86: Force irq complete move during cpu offline When a cpu goes offline, fixup_irqs() tries to move irqs currently destined to the offline cpu to a new cpu. But this attempt will fail if the irq was recently moved to this cpu and the irq still hasn't arrived at this cpu (for non intr-remapping platforms this is when we free the vector allocation at the previous destination) that is about to go offline. This will end up with the interrupt subsystem still pointing the irq to the offline cpu, causing that irq to not work any more. Fix this by forcing the irq to complete its move (it's been a long time since we moved the irq to this cpu which we are offlining now) and then move this irq to a new cpu before this cpu goes offline. Signed-off-by: Suresh Siddha Acked-by: Gary Hade Cc: Eric W. Biederman LKML-Reference: <20091026230001.848830905@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit 23359a88e7eca3c4f402562b102f23014db3c2aa Author: Suresh Siddha Date: Mon Oct 26 14:24:33 2009 -0800 x86: Remove move_cleanup_count from irq_cfg move_cleanup_count for each irq in irq_cfg is keeping track of the total number of cpus that need to free the corresponding vectors associated with the irq which has now been migrated to a new destination.
As long as this move_cleanup_count is non-zero (i.e., as long as we haven't freed the vector allocations on the old destinations) we were preventing the irq's further migration. This cleanup count is unnecessary and it is enough to not allow the irq migration till we send the cleanup vector to the previous irq destination, for which we already have irq_cfg's move_in_progress. All we need to make sure is that we free the vector at the old destination but we don't need to wait till that gets freed. Signed-off-by: Suresh Siddha Acked-by: Gary Hade Cc: Eric W. Biederman LKML-Reference: <20091026230001.752968906@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit 84e21493a3b28c9fefe99fe827fc0c0c101a813d Author: Suresh Siddha Date: Mon Oct 26 14:24:32 2009 -0800 x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping In the presence of interrupt-remapping, irqs will be migrated in the process context and we don't do (and there is no need to) irq_chip mask/unmask while migrating the interrupt. Similarly fix the fixup_irqs() that gets called during cpu offline and avoid calling irq_chip mask/unmask for irqs that are ok to be migrated in the process context. While we didn't observe any race condition with the existing code, this change takes complete advantage of interrupt-remapping in the newer generation platforms and avoids any potential HW lockups (that often worry Eric :) Signed-off-by: Suresh Siddha Acked-by: Eric W. Biederman Cc: garyhade@us.ibm.com LKML-Reference: <20091026230001.661423939@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit 7a7732bc0f7c46f217dbec723f25366b6285cc42 Author: Suresh Siddha Date: Mon Oct 26 14:24:31 2009 -0800 x86: Unify fixup_irqs() for 32-bit and 64-bit kernels There is no reason to have different fixup_irqs() for 32-bit and 64-bit kernels. Unify by using the superior 64-bit version for both the kernels. Signed-off-by: Suresh Siddha Signed-off-by: Gary Hade Cc: Eric W.
Biederman LKML-Reference: <20091026230001.562512739@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar commit 5e9b397292ca0b9409dced33e3a22ec993377064 Author: Li Zefan Date: Mon Nov 2 08:51:13 2009 +0800 tracing: Fix to use __always_unused attribute ____ftrace_check_##name() is used for compile-time check on F_printk() only, so it should be marked as __unused instead of __used. Signed-off-by: Li Zefan Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Linus Torvalds LKML-Reference: <4AEE2D01.4010305@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 7b2a35132ad0a70902dcd2844c27ed64cda0ce9b Author: Li Zefan Date: Mon Nov 2 08:50:52 2009 +0800 compiler: Introduce __always_unused I wrote some code which is used as compile-time checker, and the code should be elided after compile. So I need to annotate the code as "always unused", compared to "maybe unused". Signed-off-by: Li Zefan Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Linus Torvalds LKML-Reference: <4AEE2CEC.8040206@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 3507d612366a4e81226295f646410130a1f62a5c Author: Rajiv Andrade Date: Thu Sep 10 17:09:35 2009 -0300 tpm_tis: TPM_STS_DATA_EXPECT workaround Some newer Lenovo models are shipped with a TPM that doesn't seem to set the TPM_STS_DATA_EXPECT status bit when sending it a burst of data, so the code understands it as a failure and doesn't proceed sending the chip the intended data. In this patch we bypass this bit check in case the itpm module parameter was set. This patch is based on Andy Isaacson's one: http://marc.info/?l=linux-kernel&m=124650185023495&w=2 It was heavily discussed how should we deal with identifying the chip in kernel space, but the required patch to do so was NACK'd: http://marc.info/?l=linux-kernel&m=124650186423711&w=2 This way we let the user choose using this workaround or not based on his observations on this code behavior when trying to use the TPM. Fixed a checkpatch issue present on the previous patch, thanks to Daniel Walker. 
Signed-off-by: Rajiv Andrade Acked-by: Eric Paris Tested-by: Seiji Munetoh Signed-off-by: James Morris commit 5975c725dfd6f7d36f493ab1453fbdbd35c1f0e3 Author: Serge E. Hallyn Date: Thu Oct 29 11:40:17 2009 -0500 define convenient securebits masks for prctl users (v2) Hi James, would you mind taking the following into security-testing? The securebits are used by passing them to prctl with the PR_{S,G}ET_SECUREBITS commands. But the defines must be shifted to be used in prctl, which begs to be confused and misused by userspace. So define some more convenient values for userspace to specify. This way userspace does prctl(PR_SET_SECUREBITS, SECBIT_NOROOT); instead of prctl(PR_SET_SECUREBITS, 1 << SECURE_NOROOT); (Thanks to Michael for the idea) This patch also adds include/linux/securebits to the installed headers. Then perhaps it can be included by glibc's sys/prctl.h. Changelog: Oct 29: Stephen Rothwell points out that issecure can be under __KERNEL__. Oct 14: (Suggestions by Michael Kerrisk): 1. spell out SETUID in SECBIT_NO_SETUID* 2. SECBIT_X_LOCKED does not imply SECBIT_X 3. add definitions for keepcaps Oct 14: As suggested by Michael Kerrisk, don't use SB_* as that convention is already in use. Use SECBIT_ prefix instead. Signed-off-by: Serge E. Hallyn Acked-by: Andrew G. Morgan Acked-by: Michael Kerrisk Cc: Ulrich Drepper Cc: James Morris Signed-off-by: James Morris commit c4b8ac2c1aee1398b9378b8730bac56294b3410b Author: Li Hong Date: Wed Oct 28 13:07:43 2009 +0800 tracing: Exit with error if a weak function is used in recordmcount.pl If a weak function is used as a relocation reference for mcount callers and that function is overridden, it will cause ftrace to fail at run time. The current code should prevent a weak function from being used, but if one is, the code should exit with an error to fail at compile time. 
Signed-off-by: Li Hong LKML-Reference: <20091028050743.GH30758@uhli> Signed-off-by: Steven Rostedt commit 6092858c60f168c1950f8ad73880d54271696ec5 Author: Li Hong Date: Wed Oct 28 13:07:03 2009 +0800 tracing: Move conditional into update_funcs() in recordmcount.pl Move all the condition validations into the function update_funcs(). Also update_funcs should not die if $ref_func is undefined, because there may be more than one valid section in an object file. Signed-off-by: Li Hong LKML-Reference: <20091028050703.GG30758@uhli> Signed-off-by: Steven Rostedt commit 306dcf47d28aaf9aedfafb17a602768584cfc0f2 Author: Li Hong Date: Wed Oct 28 13:06:19 2009 +0800 tracing: Add regex for weak functions in recordmcount.pl Add a variable to contain the regex needed to find weak functions in the 'nm' output. This will allow other archs to easily override it. Also rename the regex variable $nm_regex to $local_regex to be more descriptive. Signed-off-by: Li Hong LKML-Reference: <20091028050619.GF30758@uhli> Signed-off-by: Steven Rostedt commit db24c7dcf42f78629d89b34e5d5a98ed56ea2ff5 Author: Li Hong Date: Wed Oct 28 13:05:23 2009 +0800 tracing: Move mcount section search to front of loop in recordmcount.pl Move the mcount section check to the beginning of the objdump read loop. This makes the code easier to follow since the search for the mcount section is performed first before the mcount callers are processed. Signed-off-by: Li Hong LKML-Reference: <20091028050523.GE30758@uhli> Signed-off-by: Steven Rostedt commit 7b7edc27683e20624f4daf17c76041719184201c Author: Li Hong Date: Wed Oct 28 13:04:21 2009 +0800 tracing: Fix objcopy revision check in recordmcount.pl The current logic to check objcopy's version is incorrect. This patch fixes the algorithm and disables the use of local functions as a reference if the objcopy version does not support static to global conversions. Also remove some unused variables.
Signed-off-by: Li Hong LKML-Reference: <20091028050421.GD30758@uhli> Signed-off-by: Steven Rostedt commit bdd3b052c63b2c19a0118937f500985c01a19956 Author: Li Hong Date: Wed Oct 28 13:03:32 2009 +0800 tracing: Check absolute path of input file in recordmcount.pl The ftrace.c file may reference the mcount function and this may interfere with the recordmcount.pl processing. To avoid this, the code does not process the kernel/trace/ftrace.o. But currently the check is against a relative path. This patch modifies the check to succeed if the path is an absolute path. Signed-off-by: Li Hong LKML-Reference: <20091028050332.GC30758@uhli> Signed-off-by: Steven Rostedt commit e2d753fac5b3954a3b6001f98479f0435fe7c868 Author: Li Hong Date: Tue Oct 27 14:57:33 2009 +0800 tracing: Correct the check for number of arguments in recordmcount.pl The number of arguments passed into recordmcount.pl is 10, but the code checks if only 7 are passed in. Signed-off-by: Li Hong LKML-Reference: <20091027065733.GB22032@uhli> Signed-off-by: Steven Rostedt commit d49f6aa76d24c60a52530474cb662e8ad9f09471 Author: Li Hong Date: Wed Oct 28 13:01:38 2009 +0800 tracing: Amend documentation in recordmcount.pl to reflect implementation The documentation currently says we will use the first function in a section as a reference. The actual algorithm is: choose the first global function we meet as a reference. If there is none, choose the first local one. Change the documentation to be consistent with the code. Also add several other clarifications. Signed-off-by: Li Hong LKML-Reference: <20091028050138.GA30758@uhli> Signed-off-by: Steven Rostedt commit 4f65ae36f0291ef97b7d4de2f59b2e68f3c8420b Author: Pavel Vasilyev Date: Thu Oct 29 17:16:00 2009 +0100 agp/amd64: Remove GART dependency on AGP_AMD64 The GART IOMMU code has no strong dependency to the AMD64 AGP code. So the automatic selection of AGP_AMD64 for GART can be removed. 
Cc: Dave Jones Signed-off-by: Pavel Vasilyev Signed-off-by: Joerg Roedel commit 9de09ace8d518141a4375e1d216ab64db4377799 Merge: 1beee96 6d3f1e1 Author: Ingo Molnar Date: Thu Oct 29 09:02:15 2009 +0100 Merge branch 'tracing/urgent' into tracing/core Merge reason: Pick up fixes and move base from -rc1 to -rc5. Signed-off-by: Ingo Molnar commit 3ed67776fc23061180896086a206a02be649dd26 Author: Li Zefan Date: Wed Oct 28 17:37:01 2009 +0800 tracing/filters: Fix to make system filter work commit fce29d15b59245597f7f320db4a9f2be0f5fb512 ("tracing/filters: Refactor subsystem filter code") broke system filter accidentally. Signed-off-by: Li Zefan Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Tom Zanussi LKML-Reference: <4AE810BD.3070009@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit b0ef07324310d66f660a311d4a8d669eda74f801 Author: Masami Hiramatsu Date: Tue Oct 27 16:43:19 2009 -0400 perf/probes: Support function entry relative line number Add function-entry relative line number specifying support to perf-probe. This allows users to define probes by line number from entry of the function. e.g. perf probe schedule:16 Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204319.30545.30678.stgit@harusame> Signed-off-by: Ingo Molnar commit 253977b0d87fbb793f12b1661a763ae264028ccf Author: Masami Hiramatsu Date: Tue Oct 27 16:43:10 2009 -0400 perf/probes: Improve probe point syntax of perf-probe This changes probe point syntax of perf-probe as below [:ABS_LN] [ARGS] or [+OFFS|%return][@SRC] [ARGS] And event name and event group name are automatically generated based on probe-symbol and offset as below. perfprobes/SYMBOL_OFFSET[_NUM] Where SYMBOL is the probing symbol and OFFSET is the byte offset from the symbol. 
Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204310.30545.84984.stgit@harusame> Signed-off-by: Ingo Molnar commit 46ab49267d338eb5056d0077e16346509b9e9284 Author: Masami Hiramatsu Date: Tue Oct 27 16:43:02 2009 -0400 perf/probes: Improve command-line option of perf-probe Change command-line option from -P to --add, and accepting probes without --add too. perf probe --add "probe-define" or, just: perf probe "probe-define" Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204301.30545.48600.stgit@harusame> Signed-off-by: Ingo Molnar commit 8030c5f5a57e018fcdeb1f395d7adc123b48ced6 Author: Masami Hiramatsu Date: Tue Oct 27 16:42:53 2009 -0400 perf/probes: Exit searching after finding target function Exit searching after finding real (not-inlined) function, because there should be no same symbol in that CU. Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204252.30545.19251.stgit@harusame> Signed-off-by: Ingo Molnar commit dd004c475cd15a5749b04b0283d41ffdfa57d658 Author: Masami Hiramatsu Date: Tue Oct 27 16:42:44 2009 -0400 kprobe-tracer: Compare both of event-name and event-group to find probe Fix find_probe_event() to compare both of event-name and event-group. Without this fix, kprobe-tracer overwrites existing same event-name probe even if its group-name is different. 
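The lookup rule described above (match on both event name and event group) can be sketched as follows. This is only an illustration under assumptions: `struct probe_event` and `find_probe()` are hypothetical, simplified stand-ins and not the kprobe-tracer's actual data structures or its find_probe_event() implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a registered probe event. */
struct probe_event {
	const char *group;	/* event group name */
	const char *name;	/* event name */
};

/*
 * Return the first entry whose name AND group both match, or NULL.
 * Comparing the name alone would treat "g1/ev" and "g2/ev" as the
 * same probe, so registering one would overwrite the other.
 */
static const struct probe_event *
find_probe(const struct probe_event *events, size_t n,
	   const char *group, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(events[i].name, name) == 0 &&
		    strcmp(events[i].group, group) == 0)
			return &events[i];
	}
	return NULL;
}
```

With two entries that share a name but differ in group, a lookup for the second group returns the second entry instead of the first same-named one.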
Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204244.30545.27516.stgit@harusame> Signed-off-by: Ingo Molnar commit 3f7e454af1dd8b9cea410d9380d3f71477e94f2b Author: Masami Hiramatsu Date: Tue Oct 27 16:42:35 2009 -0400 x86: Add Intel FMA instructions to x86 opcode map Add Intel FMA(FUSED-MULTIPLY-ADD) instructions to x86 opcode map for x86 instruction decoder. Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204235.30545.33997.stgit@harusame> Signed-off-by: Ingo Molnar commit e0e492e99b372c6990a5daca9e4683c341f1330e Author: Masami Hiramatsu Date: Tue Oct 27 16:42:27 2009 -0400 x86: AVX instruction set decoder support Add Intel AVX(Advanced Vector Extensions) instruction set support to x86 instruction decoder. This adds insn.vex_prefix field for storing VEX prefixes, and introduces some original tags for expressing opcodes attributes. Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204226.30545.23451.stgit@harusame> Signed-off-by: Ingo Molnar commit 82cb57028c864822c5a260f806d051e2ce28c86a Author: Masami Hiramatsu Date: Tue Oct 27 16:42:19 2009 -0400 x86: Add pclmulq to x86 opcode map Add pclmulq opcode to x86 opcode map. Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. 
Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204219.30545.82039.stgit@harusame> Signed-off-by: Ingo Molnar commit 04d46c1b13b02e1e5c24eb270a01cf3f94ee4d04 Author: Masami Hiramatsu Date: Tue Oct 27 16:42:11 2009 -0400 x86: Merge INAT_REXPFX into INAT_PFX_* Merge INAT_REXPFX into INAT_PFX_* macro and rename it to INAT_PFX_REX. Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204211.30545.58090.stgit@harusame> Signed-off-by: Ingo Molnar commit 7f387d3f2421781610588faa2f49ae5f1737b137 Author: Masami Hiramatsu Date: Tue Oct 27 16:42:04 2009 -0400 x86: Fix SSE opcode map bug Fix superscripts position because some superscripts of SSE opcode are not put in correct position. Signed-off-by: Masami Hiramatsu Cc: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Jason Baron Cc: K.Prasad Cc: Peter Zijlstra Cc: Srikar Dronamraju LKML-Reference: <20091027204204.30545.97296.stgit@harusame> Signed-off-by: Ingo Molnar commit 66bd8424cc05e800db384053bf7ab967e4658468 Author: Arnaldo Carvalho de Melo Date: Wed Oct 28 21:51:21 2009 -0200 perf tools: Delay loading symtabs till we hit a map with it So that we can have a quicker start on perf top and even speedups in the other tools, as we can have maps with no hits, so no need to load its symtabs. 
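The idea of deferring symtab loading until a map actually receives a hit can be sketched roughly as below. All names here (`struct map_sketch`, `load_symtab_stub()`, `map_record_hit()`) are hypothetical and the loader is a stub; the real perf tools code is organized differently.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a perf map. */
struct map_sketch {
	const char *dso_name;
	bool symtab_loaded;	/* set once the symtab has been read */
	int nr_symbols;
};

static int load_count;		/* how many times the stub loader ran */

/* Stub for the expensive step of parsing a DSO's symbol table. */
static int load_symtab_stub(struct map_sketch *m)
{
	load_count++;
	m->nr_symbols = 42;	/* placeholder value for illustration */
	return 0;
}

/*
 * Called when a sample lands in this map. The symtab is loaded on the
 * first hit only, so maps that never get a hit cost nothing at startup.
 */
static int map_record_hit(struct map_sketch *m)
{
	if (!m->symtab_loaded) {
		if (load_symtab_stub(m) != 0)
			return -1;
		m->symtab_loaded = true;
	}
	return m->nr_symbols;
}
```

Calling map_record_hit() repeatedly on the same map runs the loader once; a map that is never hit never runs it.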
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256773881-4191-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar commit ff76ec18cabb12a6c8f3c65bd1d23f1a770fe908 Author: Randy Dunlap Date: Wed Oct 28 12:26:39 2009 -0700 tpm: fix header for modular build Fix build for TCG_TPM=m. Header file doesn't handle this and incorrectly builds stubs. drivers/char/tpm/tpm.c:720: error: redefinition of 'tpm_pcr_read' include/linux/tpm.h:35: error:previous definition of 'tpm_pcr_read' was here drivers/char/tpm/tpm.c:752: error: redefinition of 'tpm_pcr_extend' include/linux/tpm.h:38: error:previous definition of 'tpm_pcr_extend' was here Repairs linux-next's commit d6ba452128178091dab7a04d54f7e66fdc32fb39 Author: Mimi Zohar Date: Mon Oct 26 09:26:18 2009 -0400 tpm add default function definitions Signed-off-by: Randy Dunlap Cc: Rajiv Andrade Cc: Mimi Zohar Cc: James Morris Cc: Eric Paris Signed-off-by: Andrew Morton Signed-off-by: James Morris commit 024e1a49411a1a7363e65db48edf1b09e9ee68ad Author: Stephen Hemminger Date: Tue Oct 27 19:24:46 2009 -0700 tomoyo: improve hash bucket dispersion When examining the network device name hash, it was discovered that the low order bits of full_name_hash() are not very well dispersed across the possible values. When used by filesystem code, this is handled by folding with the function hash_long(). The only other non-filesystem usage of full_name_hash() at this time appears to be in TOMOYO. This patch should fix that. I do not use TOMOYO at this time, so this patch is build tested only. 
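To illustrate why folding helps when the low-order bits of a hash are poorly dispersed, here is a minimal sketch of multiplicative folding in the spirit of hash_long(). It is an assumption-laden example: `fold_hash()` is a hypothetical helper, and the multiplier is assumed to be the 32-bit golden-ratio prime; check the kernel's hash.h for the authoritative definition.

```c
#include <stdint.h>

/* Assumed multiplier: a 32-bit prime close to 2^32 / golden ratio. */
#define SKETCH_GOLDEN_RATIO_PRIME_32 0x9e370001u

/*
 * Fold a 32-bit hash down to `bits` bits (1 <= bits <= 32).
 * The multiplication mixes every input bit into the high-order bits,
 * and the shift keeps only those, instead of masking off the weak
 * low-order bits directly.
 */
static uint32_t fold_hash(uint32_t hash, unsigned int bits)
{
	uint32_t mixed = hash * SKETCH_GOLDEN_RATIO_PRIME_32;

	return mixed >> (32 - bits);
}
```

Two inputs that differ only above bit 7, such as 0x100 and 0x200, collide under a plain `hash & 0xff` mask but land in different buckets with fold_hash(hash, 8).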
Signed-off-by: Stephen Hemminger Acked-by: Tetsuo Handa Signed-off-by: James Morris commit c86e2eaded39843e1bf4f07d1adfab4494f20894 Author: Anton Blanchard Date: Sun Oct 18 01:24:06 2009 +0000 powerpc: perf_event: Cleanup output by adding symbols Add some dummy symbols for the branches at 0xf00, 0xf20 and 0xf40, otherwise hits end up in trap_0e which is confusing to the user. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit 917e407c762ba6d91d1a4bc1c804d518585082a3 Author: Anton Blanchard Date: Sun Oct 18 01:24:29 2009 +0000 powerpc: perf_event: Hide iseries_check_pending_irqs If CONFIG_PPC_ISERIES isn't defined we end up with iseries_check_pending_irqs and do_work at the same address. perf ends up picking iseries_check_pending_irqs which creates confusing backtraces. Hide it. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit 3cd980dbc1050889acca7306cbcedf79a4ba2f81 Author: Anton Blanchard Date: Sun Oct 18 01:23:28 2009 +0000 powerpc: perf_event: Cleanup copy_page output by hiding setup symbol A lot of hits in "setup" doesn't make much sense, so hide this symbol and allow all the hits to end up in copy_4k_page. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit 907b1f45d901c956e4bcd3f27c4f1f25d6fb36b2 Author: Anton Blanchard Date: Mon Oct 26 18:52:24 2009 +0000 powerpc: Export powerpc_debugfs_root Kernel modules should be able to place their debug output inside our powerpc debugfs directory. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit b3c86ee6d128dea7c671380090488887e73fa774 Author: Anton Blanchard Date: Mon Oct 26 18:51:57 2009 +0000 powerpc: Disable HCALL_STATS by default The overhead of HCALL_STATS is quite high and the functionality is very rarely used. Key statistics are also missing (eg min/max). With the new hcall tracepoints much more powerful tracing can be done in a kernel module. Lets disable this by default. 
Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit 6f26353ca29e96475208bce673efb6a2c58b73f2 Author: Anton Blanchard Date: Mon Oct 26 18:51:09 2009 +0000 powerpc: tracing: Give hypervisor call tracepoints access to arguments While most users of the hcall tracepoints will only want the opcode and return code, some will want all the arguments. To avoid the complexity of using varargs we pass a pointer to the register save area, which contains all the arguments. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit c8cd093a6e9f96ea6b871576fd4e46d7c818bb89 Author: Anton Blanchard Date: Mon Oct 26 18:50:29 2009 +0000 powerpc: tracing: Add hypervisor call tracepoints Add hcall_entry and hcall_exit tracepoints. This replaces the inline assembly HCALL_STATS code and converts it to use the new tracepoints. To keep the disabled case as quick as possible, we embed a status word in the TOC so we can get at it with a single load. By doing so we keep the overhead at a minimum. Time taken for a null hcall: No tracepoint code: 135.79 cycles Disabled tracepoints: 137.95 cycles For reference, before this patch enabling HCALL_STATS resulted in a null hcall of 201.44 cycles! Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit 6795b85c6a4f690e61e7be31aa150d945c723fb5 Author: Anton Blanchard Date: Mon Oct 26 18:49:14 2009 +0000 powerpc: tracing: Add powerpc tracepoints for timer entry and exit We can monitor the effectiveness of our power management of both the kernel and hypervisor by probing the timer interrupt. 
For example, on this box we see 10.37s timer interrupts on an idle core: -0 [010] 3900.671297: timer_interrupt_entry: pt_regs=c0000000ce1e7b10 -0 [010] 3900.671302: timer_interrupt_exit: pt_regs=c0000000ce1e7b10 -0 [010] 3911.042963: timer_interrupt_entry: pt_regs=c0000000ce1e7b10 -0 [010] 3911.042968: timer_interrupt_exit: pt_regs=c0000000ce1e7b10 -0 [010] 3921.414630: timer_interrupt_entry: pt_regs=c0000000ce1e7b10 -0 [010] 3921.414635: timer_interrupt_exit: pt_regs=c0000000ce1e7b10 Since we have a 207MHz decrementer it will go negative and fire every 10.37s even if Linux is completely idle. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit 1bf4af165050d90ea6659ffb2536ec8ca783aab5 Author: Anton Blanchard Date: Mon Oct 26 18:47:42 2009 +0000 powerpc: tracing: Add powerpc tracepoints for interrupt entry and exit This adds powerpc-specific tracepoints for interrupt entry and exit. While we already have generic irq_handler_entry and irq_handler_exit tracepoints there are cases on our virtualised powerpc machines where an interrupt is presented to the OS, but subsequently handled by the hypervisor. This means no OS interrupt handler is invoked. Here is an example on a POWER6 machine with the patch below applied: -0 [006] 3243.949840744: irq_entry: pt_regs=c0000000ce31fb10 -0 [006] 3243.949850520: irq_exit: pt_regs=c0000000ce31fb10 -0 [007] 3243.950218208: irq_entry: pt_regs=c0000000ce323b10 -0 [007] 3243.950224080: irq_exit: pt_regs=c0000000ce323b10 -0 [000] 3244.021879320: irq_entry: pt_regs=c000000000a63aa0 -0 [000] 3244.021883616: irq_handler_entry: irq=87 handler=eth0 -0 [000] 3244.021887328: irq_handler_exit: irq=87 return=handled -0 [000] 3244.021897408: irq_exit: pt_regs=c000000000a63aa0 Here we see two phantom interrupts (no handler was invoked), followed by a real interrupt for eth0. Without the tracepoints in this patch we would have missed the phantom interrupts. 
Signed-off-by: Anton Blanchard Acked-by: Steven Rostedt Signed-off-by: Paul Mackerras commit 196f02bf900c5eb6f85d889c4f70e7cc11fda7e8 Author: Anton Blanchard Date: Sun Oct 18 01:13:00 2009 +0000 powerpc: perf_event: Add alignment-faults and emulation-faults software events Hook up the alignment-faults and emulation-faults events for powerpc. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit eecff81d1fcda22cd0029d11fe2a71dceed11dad Author: Anton Blanchard Date: Tue Oct 27 18:46:55 2009 +0000 powerpc: Create PPC_WARN_ALIGNMENT to match PPC_WARN_EMULATED perf_event wants a separate event for alignment and emulation faults, so create another emulation event. This will make it easy to hook in perf_event at one spot. We pass in regs which will be required for these events. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit f7d7986060b2890fc26db6ab5203efbd33aa2497 Author: Anton Blanchard Date: Sun Oct 18 01:09:29 2009 +0000 perf_event: Add alignment-faults and emulation-faults software events Add two more software events that are common to many cpus. Alignment faults: When a load or store is not aligned properly. Emulation faults: When an instruction is emulated in software. Both cause a very significant slowdown (100x or worse), so identifying and fixing them is very important. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit 81cd5ae303e88a1e9d3a3e0f1fe8abd100edde16 Author: Anton Blanchard Date: Tue Oct 27 18:31:29 2009 +0000 powerpc: perf_event: Enable SDAR in continous sample mode In continuous sampling mode we want the SDAR to update. While we can select between dcache misses and ERAT (L1-TLB) misses, a decent default is to enable both. 
Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit bc284e5d9d6da48934a177db92bf8e09b96a9cb8 Author: Anton Blanchard Date: Mon Sep 21 16:56:10 2009 +0000 powerpc: perf_event: Log invalid data addresses as all 1s When we take an exception and the SDAR isn't synchronised we currently log 0 as the address. Unfortunately this is a pretty common value, so use ~0UL instead. Signed-off-by: Anton Blanchard Signed-off-by: Paul Mackerras commit d6ba452128178091dab7a04d54f7e66fdc32fb39 Author: Mimi Zohar Date: Mon Oct 26 09:26:18 2009 -0400 tpm add default function definitions Add default tpm_pcr_read/extend function definitions required by IMA/Kconfig changes. Signed-off-by: Mimi Zohar Reviewed-by: Eric Paris Signed-off-by: James Morris commit 2c28e2451dba2260e9f88811b29a7787db7e7616 Author: Paul E. McKenney Date: Mon Oct 26 13:57:44 2009 -0700 rcu: Fix TINY_RCU #elif condition Some compilers are happy with "#elif CONFIG_RCU_TINY", while others strongly prefer "#elif defined(CONFIG_RCU_TINY)". Change to the latter to make more compilers happy. Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12565906642768-git-send-email-> Signed-off-by: Ingo Molnar commit 6f9b41006af1bc489030f84ee247abc0df1edccd Author: Andreas Herrmann Date: Tue Oct 27 11:01:38 2009 +0100 x86, apic: Clear APIC Timer Initial Count Register on shutdown Commit a98f8fd24fb24fcb9a359553e64dd6aac5cf4279 (x86: apic reset counter on shutdown) set the counter to max to avoid spurious interrupts when the timer is re-enabled. (In theory) you'll still get a spurious interrupt if spending more than 344 seconds with this interrupt disabled and then unmasking it. The right thing to do is to clear the register. 
This disables the interrupt from happening (at least it does on AMD hardware). Signed-off-by: Andreas Herrmann LKML-Reference: <20091027100138.GB30802@alberich.amd.com> Signed-off-by: Ingo Molnar commit 689d30187828afe1faedf050b2f7593515b90c76 Author: Marti Raudsepp Date: Tue Oct 27 00:33:05 2009 +0000 perf tools: Output 'perf list' to stdout not stderr Writing to stdout is probably the expected behavior because the user explicitly asked for a list. Signed-off-by: Marti Raudsepp Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <4ebb59420ef057972167.1256603585@localhost> Signed-off-by: Ingo Molnar commit 85df6f683efa457440eb922272fd5a71aa022ad4 Author: Marti Raudsepp Date: Tue Oct 27 00:33:04 2009 +0000 perf tools: Notify user when unrecognized event is specified Previously no indication was given about what went wrong. Signed-off-by: Marti Raudsepp Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <03ec9ee96f17cef05424.1256603584@localhost> Signed-off-by: Ingo Molnar commit 5b2bb75a0d4b08cd16bc35ecd674f957fc3b0eb7 Author: Arnaldo Carvalho de Melo Date: Mon Oct 26 19:23:19 2009 -0200 perf top: Support userspace symbols too Example: Compiling the kernel with 'make -k 22 allyesconfig'

    [root@emilia linux-2.6-tip]# perf top -r 90
    ------------------------------------------------------------------------------
       PerfTop: 3669 irqs/sec kernel:59.9% [1000Hz cycles], (all, 8 CPUs)
    ------------------------------------------------------------------------------

    samples  pcnt function                         DSO
    _______ _____ ________________________________ ________________

    3062.00  6.5% clear_page_c                     [kernel]
    2233.00  4.8% _int_malloc                      /lib64/libc-2.5.so
    2100.00  4.5% yylex                            /home/acme/git/build/allyesconfig/scripts/genksyms/genksyms
    2029.00  4.3% memset                           /lib64/libc-2.5.so
    1224.00  2.6% page_fault                       [kernel]
    1075.00  2.3% __GI_strlen                      /lib64/libc-2.5.so
     863.00  1.8% sub_preempt_count                [kernel]
     822.00  1.8% __GI_memcpy                      /lib64/libc-2.5.so
     810.00  1.7% __GI_vfprintf                    /lib64/libc-2.5.so
     786.00  1.7% _int_free                        /lib64/libc-2.5.so
     775.00  1.7% __GI_strcmp                      /lib64/libc-2.5.so
     748.00  1.6% _spin_lock                       [kernel]
     699.00  1.5% main                             /home/acme/git/build/allyesconfig/scripts/basic/fixdep
     659.00  1.4% add_preempt_count                [kernel]
     649.00  1.4% yyparse                          /home/acme/git/build/allyesconfig/scripts/genksyms/genksyms
     645.00  1.4% preempt_trace                    [kernel]
     635.00  1.4% __GI___libc_free                 /lib64/libc-2.5.so
     597.00  1.3% trace_preempt_on                 [kernel]
     551.00  1.2% __GI___libc_malloc               /lib64/libc-2.5.so
     516.00  1.1% _spin_lock_irqsave               [kernel]
     481.00  1.0% copy_user_generic_string         [kernel]
     479.00  1.0% unmap_vmas                       [kernel]
     429.00  0.9% _IO_file_xsputn_internal         /lib64/libc-2.5.so
     425.00  0.9% __GI_strncpy                     /lib64/libc-2.5.so
     416.00  0.9% get_page_from_freelist           [kernel]
     414.00  0.9% malloc_consolidate               /lib64/libc-2.5.so
     406.00  0.9% get_parent_ip                    [kernel]
     362.00  0.8% __rmqueue                        [kernel]
     347.00  0.7% in_lock_functions                [kernel]
     316.00  0.7% __d_lookup                       [kernel]
    [root@emilia linux-2.6-tip]#

More polishing is needed to print just DSO basename when not --verbose, etc. Supporting a 'comm' column requires some more reworking of 'perf top' internals as we will need to use something like the hist entries 'perf report' uses and will be done in another patch. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256592199-9608-3-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit 234fbbf508c58c5084292b11b242377553897459 Author: Arnaldo Carvalho de Melo Date: Mon Oct 26 19:23:18 2009 -0200 perf tools: Generalize event synthesizing routines Because we will need it in 'perf top' to support userspace symbols for existing threads. Now we pass a callback that will receive the synthesized event: 'perf record' writes it to the output file, and in the upcoming patch 'perf top' will just immediately create the in-memory representation of threads and maps.
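The callback-based design described above can be sketched like this. The names are hypothetical (`struct synth_event`, `event_cb`, `synthesize_events()`, `count_event()`) and do not reflect the real perf tools API; the point is only that the synthesizer no longer decides what happens to each event.

```c
#include <stddef.h>

/* Hypothetical, simplified synthesized event for an existing task. */
struct synth_event {
	int pid;
	const char *comm;
};

/* Consumer callback: returns 0 on success, non-zero to stop early. */
typedef int (*event_cb)(const struct synth_event *ev, void *ctx);

/*
 * Walk the existing tasks and hand each synthesized event to the
 * caller-supplied callback. A 'record'-style caller could write the
 * event to a file; a 'top'-style caller could build in-memory state.
 */
static int synthesize_events(const struct synth_event *tasks, size_t n,
			     event_cb process, void *ctx)
{
	for (size_t i = 0; i < n; i++) {
		int err = process(&tasks[i], ctx);

		if (err)
			return err;
	}
	return 0;
}

/* Example consumer: count the events it receives via ctx. */
static int count_event(const struct synth_event *ev, void *ctx)
{
	(void)ev;
	(*(int *)ctx)++;
	return 0;
}
```

Passing a different callback changes what is done with each event without touching the synthesizing loop itself.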
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256592199-9608-2-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit 7f3bedcc93f935631d2363f23de1cc80f04fdf3e Author: Arnaldo Carvalho de Melo Date: Mon Oct 26 19:23:17 2009 -0200 perf record: Fix race where process can disappear while reading its /proc/pid/tasks Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256592199-9608-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit 88b91c7ca49bc8600cf1106eb891d08c1965b9ce Author: Peter Zijlstra Date: Mon Oct 26 10:24:31 2009 -0700 rcu: Simplify creating of lockdep class for root rcu_node Use lockdep_set_class() to simplify the code and to avoid any additional overhead in the !LOCKDEP case. Also move the definition of rcu_root_class into kernel/rcutree.c, as suggested by Lai Jiangshan. Signed-off-by: Peter Zijlstra Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1256577871443-git-send-email-> Signed-off-by: Ingo Molnar commit fcd14b3203b538dca04a2b065c774c0b57863eec Author: Michael Cree Date: Mon Oct 26 21:32:06 2009 +1300 perf tools, Alpha: Add Alpha support to perf.h For the perf tool the patch implements an Alpha specific section in the perf.h header file. 
Signed-off-by: Michael Cree Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1256545926-6972-1-git-send-email-mcree@orcon.net.nz> Signed-off-by: Ingo Molnar commit 4ce5b90340879ce93d169b7b523c2cbbe7c45843 Author: Ingo Molnar Date: Mon Oct 26 07:55:55 2009 +0100 rcu: Do tiny cleanups in rcutiny No change in functionality - just straighten out a few small stylistic details. Cc: Paul E. McKenney Cc: David Howells Cc: Josh Triplett Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: avi@redhat.com Cc: mtosatti@redhat.com LKML-Reference: <12565226351355-git-send-email-> Signed-off-by: Ingo Molnar commit cf886c44ec418a01b2c52493465accb81acbf930 Author: Paul E. McKenney Date: Sun Oct 25 19:03:54 2009 -0700 rcu: Improve rcutorture diagnostics when bad torture_type specified Make rcutorture list the available torture_type values when it doesn't like the one specified. Signed-off-by: Paul E. McKenney Acked-by: Josh Triplett Reviewed-by: Lai Jiangshan Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: avi@redhat.com Cc: mtosatti@redhat.com LKML-Reference: <12565226351868-git-send-email-> Signed-off-by: Ingo Molnar commit 64179861cb801eac4f00c79f39a29ea5ac9470d7 Author: Paul E. McKenney Date: Sun Oct 25 19:03:53 2009 -0700 rcu: Add synchronize_srcu_expedited() to the documentation Signed-off-by: Paul E. 
McKenney Acked-by: Josh Triplett Reviewed-by: Lai Jiangshan Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: avi@redhat.com Cc: mtosatti@redhat.com LKML-Reference: <12565226354176-git-send-email-> Signed-off-by: Ingo Molnar commit 804bb8370522a569bd3a732b9de5fbd55e26f155 Author: Paul E. McKenney Date: Sun Oct 25 19:03:52 2009 -0700 rcu: Add synchronize_srcu_expedited() to the rcutorture test suite Adds the "srcu_expedited" torture type, and also renames sched_ops_sync to sched_sync_ops for consistency while we are in this file. Signed-off-by: Paul E. McKenney Acked-by: Josh Triplett Reviewed-by: Lai Jiangshan Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: avi@redhat.com Cc: mtosatti@redhat.com LKML-Reference: <12565226353636-git-send-email-> Signed-off-by: Ingo Molnar commit 0cd397d33608ae6c97d2ee6c8c43462b419b7e26 Author: Paul E. McKenney Date: Sun Oct 25 19:03:51 2009 -0700 rcu: Add synchronize_srcu_expedited() This patch creates a synchronize_srcu_expedited() that uses synchronize_sched_expedited() where synchronize_srcu() uses synchronize_sched(). The synchronize_srcu() and synchronize_srcu_expedited() functions become one-liners that pass synchronize_sched() or synchronize_sched_expedited(), respectively, to a new __synchronize_srcu() function. While in the file, move the EXPORT_SYMBOL_GPL()s to immediately follow the corresponding functions. Requested-by: Avi Kivity Tested-by: Marcelo Tosatti Signed-off-by: Paul E.
McKenney Acked-by: Josh Triplett Reviewed-by: Lai Jiangshan Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: avi@redhat.com LKML-Reference: <12565226354038-git-send-email-> Signed-off-by: Ingo Molnar commit 9b1d82fa1611706fa7ee1505f290160a18caf95d Author: Paul E. McKenney Date: Sun Oct 25 19:03:50 2009 -0700 rcu: "Tiny RCU", The Bloatwatch Edition This patch is a version of RCU designed for !SMP provided for a small-footprint RCU implementation. In particular, the implementation of synchronize_rcu() is extremely lightweight and high performance. It passes rcutorture testing in each of the four relevant configurations (combinations of NO_HZ and PREEMPT) on x86. This saves about 1K bytes compared to old Classic RCU (which is no longer in mainline), and more than three kilobytes compared to Hierarchical RCU (updated to 2.6.30): CONFIG_TREE_RCU: text data bss dec filename 183 4 0 187 kernel/rcupdate.o 2783 520 36 3339 kernel/rcutree.o 3526 Total (vs 4565 for v7) CONFIG_TREE_PREEMPT_RCU: text data bss dec filename 263 4 0 267 kernel/rcupdate.o 4594 776 52 5422 kernel/rcutree.o 5689 Total (6155 for v7) CONFIG_TINY_RCU: text data bss dec filename 96 4 0 100 kernel/rcupdate.o 734 24 0 758 kernel/rcutiny.o 858 Total (vs 848 for v7) The above is for x86. Your mileage may vary on other platforms. Further compression is possible, but is being procrastinated. Changes from v7 (http://lkml.org/lkml/2009/10/9/388) o Apply Lai Jiangshan's review comments (aside from might_sleep() in synchronize_sched(), which is covered by SMP builds). o Fix up expedited primitives. Changes from v6 (http://lkml.org/lkml/2009/9/23/293). o Forward ported to put it into the 2.6.33 stream. o Added lockdep support. o Make lightweight rcu_barrier. Changes from v5 (http://lkml.org/lkml/2009/6/23/12). o Ported to latest pre-2.6.32 merge window kernel. 
- Renamed rcu_qsctr_inc() to rcu_sched_qs(). - Renamed rcu_bh_qsctr_inc() to rcu_bh_qs(). - Provided trivial rcu_cpu_notify(). - Provided trivial exit_rcu(). - Provided trivial rcu_needs_cpu(). - Fixed up the rcu_*_enter/exit() functions in linux/hardirq.h. o Removed the dependence on EMBEDDED, with a view to making TINY_RCU default for !SMP at some time in the future. o Added (trivial) support for expedited grace periods. Changes from v4 (http://lkml.org/lkml/2009/5/2/91) include: o Squeeze the size down a bit further by removing the ->completed field from struct rcu_ctrlblk. o This permits synchronize_rcu() to become the empty function. Previous concerns about rcutorture were unfounded, as rcutorture correctly handles a constant value from rcu_batches_completed() and rcu_batches_completed_bh(). Changes from v3 (http://lkml.org/lkml/2009/3/29/221) include: o Changed rcu_batches_completed(), rcu_batches_completed_bh() rcu_enter_nohz(), rcu_exit_nohz(), rcu_nmi_enter(), and rcu_nmi_exit(), to be static inlines, as suggested by David Howells. Doing this saves about 100 bytes from rcutiny.o. (The numbers between v3 and this v4 of the patch are not directly comparable, since they are against different versions of Linux.) Changes from v2 (http://lkml.org/lkml/2009/2/3/333) include: o Fix whitespace issues. o Change short-circuit "||" operator to instead be "+" in order to fix performance bug noted by "kraai" on LWN. (http://lwn.net/Articles/324348/) Changes from v1 (http://lkml.org/lkml/2009/1/13/440) include: o This version depends on EMBEDDED as well as !SMP, as suggested by Ingo. o Updated rcu_needs_cpu() to unconditionally return zero, permitting the CPU to enter dynticks-idle mode at any time. This works because callbacks can be invoked upon entry to dynticks-idle mode. o Paul is now OK with this being included, based on a poll at the Kernel Miniconf at linux.conf.au, where about ten people said that they cared about saving 900 bytes on single-CPU systems. 
o Applies to both mainline and tip/core/rcu. Signed-off-by: Paul E. McKenney Acked-by: David Howells Acked-by: Josh Triplett Reviewed-by: Lai Jiangshan Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: avi@redhat.com Cc: mtosatti@redhat.com LKML-Reference: <12565226351355-git-send-email-> Signed-off-by: Ingo Molnar commit ce0e7b28fb75cb003cfc8d0238613aaf1c55e797 Author: Ryota Ozaki Date: Sat Oct 24 01:20:10 2009 +0900 sched, cpuacct: Fix niced guest time accounting CPU time of a guest is always accounted in 'user' time without concern for the nice value of its counterpart process, although the guest is scheduled under that nice value. This patch fixes the defect and accounts CPU time of a niced guest in 'nice' time, the same as for a niced process. The patch also adds 'guest_nice' to cpuacct. The value provides niced guest CPU time, which relates to guest time as 'nice' relates to 'user'. The original discussions can be found here: http://www.mail-archive.com/kvm@vger.kernel.org/msg23982.html http://www.mail-archive.com/kvm@vger.kernel.org/msg23860.html Signed-off-by: Ryota Ozaki Acked-by: Avi Kivity Cc: Peter Zijlstra LKML-Reference: <1256314810-7897-1-git-send-email-ozaki.ryota@gmail.com> Signed-off-by: Ingo Molnar commit 0b9e31e9264f1bad89856afb96da1688292f13b4 Merge: cf82ff7 964fe08 Author: Ingo Molnar Date: Sun Oct 25 17:30:53 2009 +0100 Merge branch 'linus' into sched/core Conflicts: fs/proc/array.c Merge reason: resolve conflict and queue up dependent patch. Signed-off-by: Ingo Molnar commit 6c21a7fb492bf7e2c4985937082ce58ddeca84bd Author: Mimi Zohar Date: Thu Oct 22 17:30:13 2009 -0400 LSM: imbed ima calls in the security hooks Based on discussions on LKML and LSM, where there are consecutive security_ and ima_ calls in the vfs layer, move the ima_ calls to the existing security_ hooks.
Signed-off-by: Mimi Zohar Signed-off-by: James Morris commit 6ae3b84d979308671bf6f6a2123c258a8603d61c Author: Dominik Brodowski Date: Sun Oct 18 18:14:32 2009 +0200 serial_cs: use pcmcia_loop_config() and pre-determined values As the PCMCIA core already determines the multifunction count, the ConfigBase address and the Present value, we can use them directly instead of parsing the CIS again. By making use of pcmcia_loop_config(), we can further remove the remaining call to pcmcia_get_first_tuple() and friends. CC: linux-serial@vger.kernel.org CC: Russell King Signed-off-by: Dominik Brodowski commit bb015f0c85362aa767f8f00f50a40d85e489414f Author: Wolfram Sang Date: Mon Oct 19 11:43:32 2009 +0200 pcmcia: drop already defined PCI_IDs Out of 10 PCI_IDs found in the PCMCIA subsystem, only two were not defined in pci_ids.h. Move them and drop the duplicates. Successfully build-tested. Signed-off-by: Wolfram Sang Cc: Jesse Barnes Signed-off-by: Dominik Brodowski commit 6e8e16c7bc298d7887584c3d027e05db3e86eed9 Author: Eric Paris Date: Thu Oct 22 15:38:26 2009 -0400 SELinux: add .gitignore files for dynamic classes The SELinux dynamic class work in c6d3aaa4e35c71a32a86ececacd4eea7ecfc316c creates a number of dynamic header files and scripts. Add .gitignore files so git doesn't complain about these. Signed-off-by: Eric Paris Acked-by: Stephen D. Smalley Signed-off-by: James Morris commit 5c828713358cb9df8aa174371edcbbb62203a490 Author: Christian Borntraeger Date: Fri Oct 23 14:58:11 2009 +0200 ratelimit: Make suppressed output messages more useful Today I got: [39648.224782] Registered led device: iwl-phy0::TX [40676.545099] __ratelimit: 246 callbacks suppressed [40676.545103] abcdef[23675]: segfault at 0 ... as you can see the ratelimit message contains a function prefix. Since this is always __ratelimit, this wont help much. This patch changes __ratelimit and printk_ratelimit to print the function name that calls ratelimit. 
This will pinpoint the responsible function, as long as several different places do not call ratelimit with the same ratelimit state at the same time. In that case we catch only one random function that calls ratelimit after the wait period. Signed-off-by: Christian Borntraeger Cc: Dave Young Cc: Linus Torvalds CC: Andrew Morton LKML-Reference: <200910231458.11832.borntraeger@de.ibm.com> Signed-off-by: Ingo Molnar commit 72f279b256d520e321a850880d094bc0bcbf45d6 Author: Sheng Yang Date: Thu Oct 22 19:19:34 2009 +0800 generic-ipi: Fix misleading smp_call_function*() description After commit:8969a5ede0f9e17da4b943712429aef2c9bcd82b "generic-ipi: remove kmalloc()", wait = 0 can be guaranteed. Signed-off-by: Sheng Yang Cc: Peter Zijlstra Cc: Jens Axboe Cc: Nick Piggin LKML-Reference: <1256210374-25354-1-git-send-email-sheng@linux.intel.com> Signed-off-by: Ingo Molnar commit 40b1f4e5113eafc5e84f2ba86822df66087fcb25 Author: Michael Neuling Date: Thu Oct 22 14:39:28 2009 +1100 irq: trivial: Fix typo in comment for #endif The comment suggests this #endif is CONFIG_X86 but it's really CONFIG_TRACE_IRQFLAGS_SUPPORT Signed-off-by: Michael Neuling Cc: michael@ellerman.id.au LKML-Reference: <18191.1256182768@neuling.org> Signed-off-by: Ingo Molnar commit b7cb10e790fbd145296e771f789273a875c15719 Author: Arnaldo Carvalho de Melo Date: Wed Oct 21 17:34:06 2009 -0200 perf probe: Print debug messages using pr_*() Use the new pr_{err,warning,debug,etc} printout methods, just like in the kernel. Signed-off-by: Arnaldo Carvalho de Melo Cc: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com> [ Split this patch out, to keep perf/probes separate.
] Signed-off-by: Ingo Molnar commit 43315956509ca6913764861ac7dec128b91eb1ec Merge: 9bf4e7f 6beba7a Author: Ingo Molnar Date: Fri Oct 23 08:23:20 2009 +0200 Merge branch 'perf/core' into perf/probes Conflicts: tools/perf/Makefile Merge reason: - fix the conflict - pick up the pr_*() infrastructure to queue up dependent patch Signed-off-by: Ingo Molnar commit 6beba7adbe092e63dfe8d09fbd1e3ec140474a13 Author: Arnaldo Carvalho de Melo Date: Wed Oct 21 17:34:06 2009 -0200 perf tools: Unify debug messages mechanisms We were using eprintf in some places, which looks at a global 'verbose' level, and in other places passing a 'v' parameter to specify the verbosity level. Unify them by introducing pr_{err,warning,debug,etc}, just like in the kernel. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit 802da5f2289bbe363acef084805195c11f453c48 Author: Frederic Weisbecker Date: Thu Oct 22 23:23:24 2009 +0200 perf tools: Drop asm/types.h wrapper Wrapping the kernel headers is dangerous when it comes to arch headers. Once we wrap asm/types.h, it will also replace the glibc asm/types.h, not only the kernel one. This results in build errors on some machines. Drop this wrapper and do its work from the linux/types.h wrapper; the glibc asm/types.h can already handle most of the type definitions it was doing (typedef __u64, __u32, etc...). Todo: Check the other asm/*.h wrappers to prevent other conflicts.
Reported-by: Ingo Molnar Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Anton Blanchard LKML-Reference: <1256246604-17156-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit a4fb581b15949cfd10b64c8af37bc106e95307f3 Author: Frederic Weisbecker Date: Thu Oct 22 23:23:23 2009 +0200 perf tools: Bind callchains to the first sort dimension column Currently, the callchains are displayed using a constant left margin. So depending on the current sort dimension configuration, callchains may appear to be well attached to the first sort dimension column field which is mostly the case, except when the first dimension of sorting is done by comm, because these are right aligned. This patch binds the callchain to the first letter in the first column, whatever type of column it is (dso, comm, symbol). Before: 0.80% perf [k] __lock_acquire __lock_acquire lock_acquire | |--58.33%-- _spin_lock | | | |--28.57%-- inotify_should_send_event | | fsnotify | | __fsnotify_parent After: 0.80% perf [k] __lock_acquire __lock_acquire lock_acquire | |--58.33%-- _spin_lock | | | |--28.57%-- inotify_should_send_event | | fsnotify | | __fsnotify_parent Also, for clarity, we don't put anymore the callchain as is but: - If we have a top level ancestor in the callchain, start it with a first ascii hook. Before: 0.80% perf [kernel] [k] __lock_acquire __lock_acquire lock_acquire | |--58.33%-- _spin_lock | | | |--28.57%-- inotify_should_send_event | | fsnotify [..] [..] After: 0.80% perf [kernel] [k] __lock_acquire | --- __lock_acquire lock_acquire | |--58.33%-- _spin_lock | | | |--28.57%-- inotify_should_send_event | | fsnotify [..] [..] 
- Otherwise, if we have several top level ancestors, then display these like we did before: 1.69% Xorg | |--21.21%-- vread_hpet | 0x7fffd85b46fc | 0x7fffd85b494d | 0x7f4fafb4e54d | |--15.15%-- exaOffscreenAlloc | |--9.09%-- I830WaitLpRing Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Anton Blanchard LKML-Reference: <1256246604-17156-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit af0a6fa46388e1e0c2d1a672aad84f8f6ef0b20b Author: Frederic Weisbecker Date: Thu Oct 22 23:23:22 2009 +0200 perf tools: Fix missing top level callchain While recursively printing the branches of each callchain, we forget to display the root. It is never printed. Say we have: symbol f1 f2 | -------- f3 | f4 | ---------f5 f6 Actually we never see that, instead it displays: symbol | --------- f3 | f4 | --------- f5 f6 However f1 is always the same as "symbol" and if we are sorting by symbols first then "symbol", f1 and f2 will be well aligned like in the above example, so displaying f1 looks redundant here. But if we are sorting by something else first (dso, comm, etc...), displaying f1 doesn't look redundant but rather necessary because the symbol is not well aligned anymore with its callchain: comm dso symbol f1 f2 | --------- [...] And we want the callchain to be obvious. So we fix the bug by printing the root branch, but we also filter its first entry if we are sorting by symbols first.
Reported-by: Anton Blanchard Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras LKML-Reference: <1256246604-17156-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 9bf4e7fba8006d19846fec877b6da0616b2772de Author: Ingo Molnar Date: Wed Oct 21 14:39:51 2009 +0200 x86, instruction decoder: Fix test_get_len build rules Add the kernel source include file as well to the include files search path, to fix this build bug: In file included from arch/x86/tools/test_get_len.c:28: arch/x86/lib/insn.c:21:26: error: linux/string.h: No such file or directory Cc: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Jim Keniston Cc: Frederic Weisbecker LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit 4e3b799d7dbb2a12ca8dca8d3594d32095772973 Author: Steven Rostedt Date: Tue Oct 20 19:19:35 2009 -0400 perf tools: Use strsep() over strtok_r() for parsing single line The second argument in the strtok_r() function is not to be used generically and can have different implementations. Currently the function parsing of the perf trace code uses the second argument to copy data from. This can crash the tool or just have unpredictable results. The correct solution is to use strsep() which has a defined result. I also added a check to see if the result was correct, and will break out of the loop in case it fails to parse as expected. Reported-by: Arnaldo Carvalho de Melo Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker LKML-Reference: <20091020232034.237814877@goodmis.org> Signed-off-by: Ingo Molnar commit 60d526f7fa6246b8e32d5b45610d625a5608d988 Author: Steven Rostedt Date: Tue Oct 20 19:19:34 2009 -0400 perf tools: Add 'make DEBUG=1' to remove the -O6 cflag When using gdb to debug perf, it is practically impossible to use when perf is compiled with -O6. 
For developers, this patch adds the DEBUG feature to the make command line so that a developer can easily remove the optimization flag. LKML-Reference: <1255590330.8392.446.camel@twins> Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091020232033.984323261@goodmis.org> Signed-off-by: Ingo Molnar commit 9983d60d74db9e544c6cb6f65351849fe8e9c1de Author: Masami Hiramatsu Date: Tue Oct 20 12:55:31 2009 -0400 x86: Add AES opcodes to opcode map Add Intel AES opcodes to x86 opcode map. These opcodes are used in arch/x86/crypto/aesni-intel_asm.S. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Jim Keniston Cc: Frederic Weisbecker LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit 06ed6ba5ecb771cc3a967838a4bb1d9cbd8786b9 Author: Masami Hiramatsu Date: Tue Oct 20 12:55:24 2009 -0400 x86: Fix group attribute decoding bug Fix a typo in inat_get_group_attribute() which should refer to inat_group_tables, not inat_escape_tables. Signed-off-by: Masami Hiramatsu Cc: systemtap Cc: DLE Cc: Jim Keniston Cc: Frederic Weisbecker LKML-Reference: <20091020165524.4145.97333.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit c88e4bf60de6253a048cf4e6b3b0715e543e0460 Author: Arnaldo Carvalho de Melo Date: Tue Oct 20 15:54:55 2009 -0200 perf top: Fix symbol annotation We need to use map->unmap_ip() here too to match section relative symbol address to the absolute address needed to match objdump -dS addresses.
Reported-by: Mike Galbraith Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras LKML-Reference: <1256061295-19835-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit 8f0b037398a909ccf703ad5f5803066db6327f22 Author: Arnaldo Carvalho de Melo Date: Tue Oct 20 15:08:29 2009 -0200 perf annotate: Remove requirement of passing a symbol name If the user doesn't pass a symbol name to annotate, it will annotate all the symbols that have hits, in order, just like 'perf report -s comm,dso,symbol'. This is a natural followup patch to the one that uses output_hists to find the symbols with hits. The common case is to annotate the first few entries at the top of a perf report, so let's type fewer characters. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256058509-19678-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit e42049926ebdcae24fdfdc8f0e3ff8f05f24a60b Author: Arnaldo Carvalho de Melo Date: Tue Oct 20 14:25:40 2009 -0200 perf annotate: Use the sym_priv_size area for the histogram We have this sym_priv_size mechanism for attaching private areas to struct symbol entries but annotate wasn't using it, adding private areas to struct symbol in addition to a ->priv pointer. Scrap all that and use the sym_priv_size mechanism.
Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1256055940-19511-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit ed52ce2e3c33dc7626a40fa2da766d1a6460e543 Author: Arnaldo Carvalho de Melo Date: Mon Oct 19 17:17:57 2009 -0200 perf tools: Add ->unmap_ip operation to struct map We need this because we get section relative addresses when reading the symtabs, but when a tool like 'perf annotate' needs to match these addresses to what 'objdump -dS' produces we need the address + section back again. So in annotate now we look at the 'struct hist_entry' instances (that weren't really being used) so that we iterate only over the symbols that had some hit and get the map where that particular hit happened so that we can get the right address to match with annotate. Verified that at least: perf annotate mmap_read_counter # Uses the ~/bin/perf binary perf annotate --vmlinux /home/acme/git/build/perf/vmlinux intel_pmu_enable_all on a 'perf record perf top' session seems to work. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1255979877-12533-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit bbe2987bea26a684ff11d887dfc4cf39b22c27a2 Author: Arjan van de Ven Date: Tue Oct 20 07:09:39 2009 +0900 perf timechart: Add a process filter During the Kernel Summit demo of perf/ftrace/timechart, there was a feature request to have a process filter for timechart so that you can zoom into one or a few processes that you are really interested in. This patch adds basic support for this feature, the -p (--process) option now can select a PID or a process name to be shown. Multiple -p options are allowed, and the combined set will be included in the output.
Signed-off-by: Arjan van de Ven Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091020070939.7d0fb8a7@infradead.org> Signed-off-by: Ingo Molnar commit c258449bc9d286e2ee6546c9cdf911e96cbc126a Merge: 79b9ad3 2e600d0 Author: Ingo Molnar Date: Tue Oct 20 07:51:41 2009 +0200 Merge branch 'perf/urgent' into perf/core Merge reason: Queue up dependent patch. Signed-off-by: Ingo Molnar commit 3e1c2515acf70448cad1ae3ab835ca80be043d33 Author: James Morris Date: Tue Oct 20 13:48:33 2009 +0900 security: remove root_plug Remove the root_plug example LSM code. It's unmaintained and increasingly broken in various ways. Made at the 2009 Kernel Summit in Tokyo! Acked-by: Greg Kroah-Hartman Signed-off-by: James Morris commit 79b9ad361be8c6f3eeea97dd3883e8bcfa989333 Author: Arnaldo Carvalho de Melo Date: Mon Oct 19 15:31:31 2009 -0200 perf tools: Add bunch of missing headers to LIB_H Build dependencies were not properly mapped out. Signed-off-by: Arnaldo Carvalho de Melo Cc: Peter Zijlstra Cc: Frederic Weisbecker LKML-Reference: <1255973491-11626-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit 20639c15d2e78f180d398a6b6422880fac3258bb Author: Arnaldo Carvalho de Melo Date: Mon Oct 19 15:11:36 2009 -0200 perf tools: Add missing tools/perf/util/include/string.h To cure a bunch of: In file included from util/include/linux/bitmap.h:1, from util/header.h:8, from builtin-trace.c:7: util/include/../../../../include/linux/bitmap.h:8:26: error: linux/string.h: No such file or directory make: *** [builtin-trace.o] Error 1 make: *** Waiting for unfinished jobs.... 
Signed-off-by: Arnaldo Carvalho de Melo Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Peter Zijlstra LKML-Reference: <1255972296-11500-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit b7f3008ad1d795935551e4dd810b0255a7bfa3c9 Author: Stephen Smalley Date: Mon Oct 19 10:08:50 2009 -0400 SELinux: fix locking issue introduced with c6d3aaa4e35c71a3 Ensure that we release the policy read lock on all exit paths from security_compute_av. Signed-off-by: Stephen D. Smalley Signed-off-by: James Morris commit dd86e72abdbc4b436471af5a97927c6145f5298c Author: Ingo Molnar Date: Mon Oct 19 13:33:03 2009 +0200 perf stat: Count branches first Count branches first, cache-misses second. The reason is that on x86 branches are not counted by all counters on all CPUs. Before: Performance counter stats for 'ls': 0.756653 task-clock-msecs # 0.802 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 250 page-faults # 0.330 M/sec 2375725 cycles # 3139.781 M/sec 1628129 instructions # 0.685 IPC 19643 cache-references # 25.960 M/sec 4608 cache-misses # 6.090 M/sec 342532 branches # 452.694 M/sec branch-misses 0.000943356 seconds time elapsed After: Performance counter stats for 'ls': 1.056734 task-clock-msecs # 0.859 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 259 page-faults # 0.245 M/sec 3345932 cycles # 3166.295 M/sec 3074090 instructions # 0.919 IPC 616928 branches # 583.806 M/sec 39279 branch-misses # 6.367 % 21312 cache-references # 20.168 M/sec 3661 cache-misses # 3.464 M/sec 0.001230551 seconds time elapsed (also prettify the printout of branch misses, in case it's getting scaled.) 
Cc: Tim Blechmann Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <4ADC3975.8050109@klingt.org> Signed-off-by: Ingo Molnar --- tools/perf/builtin-stat.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index c373683..95a55ea 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = { { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS }, { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES}, { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS}, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES }, }; --- tools/perf/builtin-stat.c | 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 95a55ea..90e0a26 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -50,17 +50,17 @@ static struct perf_event_attr default_attrs[] = { - { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK }, - { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES}, - { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS }, - { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS }, - - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES }, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS }, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES}, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES }, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS}, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES }, + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK }, + { .type = PERF_TYPE_SOFTWARE, .config = 
PERF_COUNT_SW_CONTEXT_SWITCHES }, + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS }, + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS }, + + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES }, }; commit 56aab464ff6232bcc2f53b26576983dc83f75db7 Author: Ingo Molnar Date: Mon Oct 19 13:27:08 2009 +0200 perf stat: Re-align the default_attrs[] array Clean up the array definition to be vertically aligned. No functional effects. Cc: Tim Blechmann Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <4ADC3975.8050109@klingt.org> Signed-off-by: Ingo Molnar --- tools/perf/builtin-stat.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index c373683..95a55ea 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = { { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS }, { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES}, { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS}, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES }, }; commit 12133afffcc7140eea915b1572189a2ea0cf7b0e Author: Tim Blechmann Date: Mon Oct 19 12:03:33 2009 +0200 perf stat: Add branch performance events to default output Adds performance event information about branches and branch misses to the default output of perf stat. 
Signed-off-by: Tim Blechmann Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <4ADC3975.8050109@klingt.org> Signed-off-by: Ingo Molnar commit 1abc7f5500fff8422f34826a006648d8741d83d3 Author: Randy Dunlap Date: Sun Oct 18 19:20:24 2009 -0700 perf tools: Display better error messages on missing packages Check for libelf headers and glibc headers separately so that the error message correctly identifies which package installation is missing/needed. Signed-off-by: Randy Dunlap Cc: paulus@samba.org Cc: a.p.zijlstra@chello.nl Cc: efault@gmx.de Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo LKML-Reference: <4ADBCCE8.3060300@oracle.com> Signed-off-by: Ingo Molnar commit db9f11e36d0125a5e3e595ea9ef2e4b89f7e8737 Author: Frederic Weisbecker Date: Sat Oct 17 17:57:18 2009 +0200 perf tools: Use DECLARE_BITMAP instead of an open-coded array Use DECLARE_BITMAP instead of an open coded array for our bitmap of featured sections. This makes the array an unsigned long instead of a u64 but since we use a 256 bits bitmap, the array size shouldn't vary between different boxes. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Steven Rostedt LKML-Reference: <1255795038-13751-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 2ba0825075e76236d22a20decd8e2346a99faabe Author: Frederic Weisbecker Date: Sat Oct 17 17:12:34 2009 +0200 perf tools: Introduce bitmask'ed additional headers This provides a new set of bitmasked headers. A new field is added in the perf headers that implements a bitmap storing optional features present in the perf.data file. The layout can be pictured like this: (Usual perf headers)(Features bitmap)[Feature 0][Feature n][Feature 255] If the bit n is set, then the feature n is used in this file. They are all set in order. This brings a backward and forward compatibility. 
The trace_info section has moved into such optional features, this is the first and only one for now. This is backward compatible with the .32 file version although it doesn't support the previous separate trace.info file. And finally it doesn't support the current interim development version. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Steven Rostedt LKML-Reference: <1255792354-11304-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 5a116dd2797677cad48fee2f42267e3cb69f5502 Author: Frederic Weisbecker Date: Sat Oct 17 17:12:33 2009 +0200 perf tools: Use kernel bitmap library Use the kernel bitmap library for internal perf tools uses. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Steven Rostedt LKML-Reference: <1255792354-11304-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 11018201b831e19304c0d639f105ad6c27e120b1 Author: Anton Blanchard Date: Sun Oct 18 22:29:23 2009 +1100 perf stat: Add branch performance metric When we count both branches and branch-misses it is useful to print out the percentage of branch-misses: # perf stat -e branches -e branch-misses /bin/true Performance counter stats for '/bin/true': 401684 branches # 0.000 M/sec 23301 branch-misses # 5.801 % Signed-off-by: Anton Blanchard Cc: paulus@samba.org Cc: a.p.zijlstra@chello.nl LKML-Reference: <20091018112923.GQ4808@kryten> Signed-off-by: Ingo Molnar commit 0f8f86c7bdd1c954fbe153af437a0d91a6c5721a Merge: dca2d6a f39cdf2 Author: Frederic Weisbecker Date: Sun Oct 18 01:09:09 2009 +0200 Merge commit 'perf/core' into perf/hw-breakpoint Conflicts: kernel/Makefile kernel/trace/Makefile kernel/trace/trace.h samples/Makefile Merge reason: We need to be uptodate with the perf events development branch because we plan to rewrite the breakpoints API on top of perf events. 
commit bb3c3e807140816b5f5fd4840473ee52a916ad4f Merge: 595c364 012abee Author: Ingo Molnar Date: Sat Oct 17 09:58:25 2009 +0200 Merge commit 'v2.6.32-rc5' into perf/probes Conflicts: kernel/trace/trace_event_profile.c Merge reason: update to -rc5 and resolve conflict. Signed-off-by: Ingo Molnar commit 595c36490deb49381dc51231a3d5e6b66786ed27 Author: Masami Hiramatsu Date: Fri Oct 16 20:08:27 2009 -0400 perf: Add perf-probe document Add perf-probe subcommand document and add it to command-list. Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <20091017000827.16556.73539.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit 9769833b8e4425dc93fc837bf124c6cb02a51abb Author: Masami Hiramatsu Date: Fri Oct 16 20:08:18 2009 -0400 perf: Add DIE_IF() macro for error checking Add DIE_IF() macro and replace ERR_IF() with it, and use linux/stringify.h. Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <20091017000818.16556.82452.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit 89c69c0eee7515cdc217f4278de43547284b3458 Author: Masami Hiramatsu Date: Fri Oct 16 20:08:10 2009 -0400 perf: Use eprintf() for debug messages in perf-probe Replace debug() macro with eprintf() and add -v option for showing those messages in perf-probe. Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <20091017000810.16556.38013.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit 074fc0e4b3f5d24306c2995f2f3b0bd4759e8aeb Author: Masami Hiramatsu Date: Fri Oct 16 20:08:01 2009 -0400 perf: Use die() for error cases in perf-probe Use die() for exiting perf-probe with errors. 
    This replaces perror_exit(), msg_exit() and fprintf()+exit() with die(),
    and uses die() in semantic_error(). It also renames the 'die' local
    variables to 'dw_die' to avoid a name conflict.

    Signed-off-by: Masami Hiramatsu
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    LKML-Reference: <20091017000801.16556.46866.stgit@dhcp-100-2-132.bos.redhat.com>
    Signed-off-by: Ingo Molnar

commit 4c20194c2de151bca14224ae384b47abf7636a95
Author: Masami Hiramatsu
Date:   Fri Oct 16 20:07:52 2009 -0400

    perf: Check libdwarf APIs for perf probe

    Check libdwarf APIs for perf probe in tools/perf/Makefile. Since
    dwarf_get_ranges() was added in libdwarf 20081231 (and it is the newest
    function used in probe-finder.c), this just checks whether the function
    is defined.

    Signed-off-by: Masami Hiramatsu
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    LKML-Reference: <20091017000752.16556.92051.stgit@dhcp-100-2-132.bos.redhat.com>
    Signed-off-by: Ingo Molnar

commit d1baf5a5a6088e2991b7dbbd370ff200bd6615ce
Author: Masami Hiramatsu
Date:   Fri Oct 16 20:07:44 2009 -0400

    x86: Add AMD prefetch and 3DNow! opcodes to opcode map

    Add AMD prefetch and 3DNow! opcodes, including FEMMS. Since 3DNow! uses
    the last immediate byte as an opcode extension byte, x86 insn just
    treats the extension byte as an immediate byte instead of as part of
    the opcode (insn_get_opcode() decodes the first "0x0f 0x0f" bytes).
    Users who are interested in analyzing 3DNow! opcodes can still decode
    them by analyzing the immediate byte.
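As a rough illustration of decoding a 3DNow! instruction from its trailing byte, the sketch below handles only the register-to-register form, where the encoding is exactly `0x0f 0x0f ModRM suffix`. It does not use the kernel's instruction decoder: `now3d_suffix_regform()` and `now3d_mnemonic()` are hypothetical helpers, and the three suffix values shown should be checked against AMD's 3DNow! documentation before being relied on.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the 3DNow! suffix byte of a
 * register-to-register instruction, or -1 if the bytes do not match.
 */
static int now3d_suffix_regform(const unsigned char *insn, size_t len)
{
	if (len < 4 || insn[0] != 0x0f || insn[1] != 0x0f)
		return -1;
	/* ModRM.mod != 3 means a memory operand; SIB/displacement not handled. */
	if ((insn[2] & 0xc0) != 0xc0)
		return -1;
	return insn[3];
}

/* Hypothetical lookup covering only a few suffix values. */
static const char *now3d_mnemonic(int suffix)
{
	switch (suffix) {
	case 0x9a: return "pfsub";
	case 0x9e: return "pfadd";
	case 0xb4: return "pfmul";
	default:   return "(unknown)";
	}
}
```

A caller that already has a decoded instruction would skip the first helper and pass the immediate byte straight to the lookup.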
Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <20091017000744.16556.27881.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit 8c95bc3e206cff7a55edd2fc5f0e2b305d57903f Author: Masami Hiramatsu Date: Fri Oct 16 20:07:36 2009 -0400 x86: Add MMX/SSE opcode groups to opcode map Add missing MMX/SSE opcode groups to x86 opcode map. Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <20091017000736.16556.29061.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit e63cc2397ecc0f2b604f22fb9cdbb05911c1e5d4 Author: Masami Hiramatsu Date: Fri Oct 16 20:07:28 2009 -0400 tracing/kprobes: Add failure messages for debugging Add verbose failure messages to kprobe-tracer for debugging. Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <20091017000728.16556.16713.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit f397af06e4c9bf5a0bc92facb8cb29905e338ab0 Author: Masami Hiramatsu Date: Fri Oct 16 20:07:20 2009 -0400 tracing/kprobes: Update kprobe-tracer selftest against new syntax Update kprobe-tracer selftest since command syntax has been changed. Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Paul Mackerras Cc: Peter Zijlstra LKML-Reference: <20091017000720.16556.26343.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Ingo Molnar commit f39cdf25bf77219676ec5360980ac40b1a7e144a Author: Julia Lawall Date: Sat Oct 17 08:43:17 2009 +0200 perf tools: Move dereference after NULL test In each case, if the NULL test on thread is needed, then the dereference should be after the NULL test. 
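The ordering problem can be shown with a small sketch. The `thread_sketch` type and `thread_pid_or_default()` are hypothetical and stand in for the perf tools code that the patch touches.

```c
#include <stddef.h>

/* Hypothetical stand-in for the structure being tested. */
struct thread_sketch {
	int pid;
};

/*
 * Problem pattern (not compiled here):
 *
 *	int pid = thread->pid;      dereference first ...
 *	if (thread == NULL)         ... test afterwards, too late
 *		return fallback;
 *
 * Fixed pattern: test the pointer, then dereference it.
 */
static int thread_pid_or_default(const struct thread_sketch *thread,
				 int fallback)
{
	if (thread == NULL)
		return fallback;
	return thread->pid;
}
```

If the NULL test can never be true, the alternative fix is to drop the test; the sketch assumes the test is needed.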
A simplified version of the semantic match that detects this problem is as follows (http://coccinelle.lip6.fr/): // @match exists@ expression x, E; identifier fld; @@ * x->fld ... when != \(x = E\|&x\) * x == NULL // Signed-off-by: Julia Lawall LKML-Reference: Signed-off-by: Ingo Molnar commit b33a6363649f0ff83ec81597ea7fe7e688f973cb Author: Borislav Petkov Date: Fri Oct 16 12:31:33 2009 +0200 x86, mce: Add a global MCE init helper Add an early initcall (pre SMP) which sets up global MCE functionality. Signed-off-by: Borislav Petkov Cc: Andi Kleen LKML-Reference: <1255689093-26921-2-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar commit 5e09954a9acc3b435ffe318b95afd3c02fae069f Author: Borislav Petkov Date: Fri Oct 16 12:31:32 2009 +0200 x86, mce: Fix up MCE naming nomenclature Prefix global/setup routines with "mcheck_" thus differentiating from the internal facilities prefixed with "mce_". Also, prefix the per cpu calls with mcheck_cpu and rename them to reflect the MCE setup hierarchy of calls better. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov Cc: Andi Kleen LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar commit 6b50f5c7c7163d50af0946a93b61c05e448f6038 Merge: 8968f9d fb25319 93ae501 Author: Ingo Molnar Date: Fri Oct 16 14:42:20 2009 +0200 Merge branches 'x86/mce' and 'x86/urgent' into perf/mce Merge reason: Put all MCE changes into this branch, we are queueing up a dependent patch. Signed-off-by: Ingo Molnar commit f88f2b4fdb1e098433ad2b005b6f7353f7268ce1 Author: Cyrill Gorcunov Date: Thu Oct 15 19:04:16 2009 +0400 x86: apic: Allow noop operations to be called almost at any time As only apic noop is used we allow to use almost any operation caller wants (and which of them noop driver supports of course). 
Initially it was reported by Ingo Molnar that apic noop issue a warning for pkg id (which is actually false positive and should be eliminated). So we save checking (and warning issue) for read/write operations while allow any other ops to be freely used. Also: - fix noop_cpu_to_logical_apicid, it should be 0. - rename noop_default_phys_pkg_id to noop_phys_pkg_id (we use default_ prefix for more general routines in apic subsystem). Reported-by: Ingo Molnar Signed-off-by: Cyrill Gorcunov Cc: Yinghai Lu Cc: Maciej W. Rozycki LKML-Reference: <20091015150416.GC5331@lenovo> Signed-off-by: Ingo Molnar commit 434a83c3fbb951908a3a52040f7f0e0b8ba00dd0 Author: Ingo Molnar Date: Thu Oct 15 11:50:39 2009 +0200 events: Harmonize event field names and print output names Now that we can filter based on fields via perf record, people will start using filter expressions and will expect them to be obvious. The primary way to see which fields are available is by looking at the trace output, such as: gcc-18676 [000] 343.011728: irq_handler_entry: irq=0 handler=timer cc1-18677 [000] 343.012727: irq_handler_entry: irq=0 handler=timer cc1-18677 [000] 343.032692: irq_handler_entry: irq=0 handler=timer cc1-18677 [000] 343.033690: irq_handler_entry: irq=0 handler=timer cc1-18677 [000] 343.034687: irq_handler_entry: irq=0 handler=timer cc1-18677 [000] 343.035686: irq_handler_entry: irq=0 handler=timer cc1-18677 [000] 343.036684: irq_handler_entry: irq=0 handler=timer While 'irq==0' filters work, the 'handler==' filter expression does not work: $ perf record -R -f -a -e irq:irq_handler_entry --filter handler=timer sleep 1 Error: failed to set filter with 22 (Invalid argument) The problem is that while an 'irq' field exists and is recognized as a filter field - 'handler' does not exist - its name is 'name' in the output. To solve this, we need to synchronize the printout and the field names, wherever possible. 
In cases where the printout prints a non-field, we enclose that information in square brackets, such as: perf-1380 [013] 724.903505: softirq_exit: vec=9 [action=RCU] perf-1380 [013] 724.904482: softirq_exit: vec=1 [action=TIMER] This way users can use filter expressions more intuitively: all fields that show up as 'primary' (non-bracketed) information is filterable. This patch harmonizes the field names for all irq, bkl, power, sched and timer events. We might in fact think about dropping the print format bit of generic tracepoints altogether, and just print the fields that are being recorded. Cc: Li Zefan Cc: Tom Zanussi Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo LKML-Reference: Signed-off-by: Ingo Molnar commit a66abe7fbf7805a1a02f241bd5283265ff6706ec Author: Ingo Molnar Date: Thu Oct 15 12:24:04 2009 +0200 tracing/events: Fix locking imbalance in the filter code Américo Wang noticed that we have a locking imbalance in the error paths of ftrace_profile_set_filter(), causing potential leakage of event_mutex. Also clean up other error codepaths related to event_mutex while at it. Plus fix an initialized variable in the subsystem filter code. Reported-by: Américo Wang Cc: Li Zefan Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <2375c9f90910150247u5ccb8e2at58c764e385ffa490@mail.gmail.com> Signed-off-by: Ingo Molnar commit c171b552a7d316c7e1c3ad6f70a30178dd53e14c Author: Li Zefan Date: Thu Oct 15 11:22:07 2009 +0800 perf trace: Add filter Suppport Add a new option "--filter " to perf record, and it should be right after "-e trace_point": #./perf record -R -f -e irq:irq_handler_entry --filter irq==18 ^C # ./perf trace perf-4303 ... irq_handler_entry: irq=18 handler=eth0 init-0 ... irq_handler_entry: irq=18 handler=eth0 init-0 ... irq_handler_entry: irq=18 handler=eth0 init-0 ... irq_handler_entry: irq=18 handler=eth0 init-0 ... 
irq_handler_entry: irq=18 handler=eth0 See Documentation/trace/events.txt for the syntax of filter expressions. Signed-off-by: Li Zefan Acked-by: Peter Zijlstra Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <4AD6955F.90602@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830 Author: Li Zefan Date: Thu Oct 15 11:21:42 2009 +0800 tracing/profile: Add filter support - Add an ioctl to allocate a filter for a perf event. - Free the filter when the associated perf event is to be freed. - Do the filtering in perf_swevent_match(). Signed-off-by: Li Zefan Acked-by: Peter Zijlstra Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <4AD69546.8050401@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit b0f1a59a98d7ac2102e7e4f22904c26d564a5628 Author: Li Zefan Date: Thu Oct 15 11:21:12 2009 +0800 tracing/filters: Use a different op for glob match "==" will always do a full match, and "~" will do a glob match. In the future, we may add "=~" for regex match. Signed-off-by: Li Zefan Acked-by: Peter Zijlstra Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <4AD69528.3050309@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit fce29d15b59245597f7f320db4a9f2be0f5fb512 Author: Li Zefan Date: Thu Oct 15 11:20:34 2009 +0800 tracing/filters: Refactor subsystem filter code Change: for_each_pred for_each_subsystem To: for_each_subsystem for_each_pred This change also prepares for later patches. 
Signed-off-by: Li Zefan Acked-by: Peter Zijlstra Acked-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <4AD69502.8060903@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 713490e02eed242b4c1c672b3c0c8b708f8b6f1d Merge: c4dc775 1beee96 Author: Ingo Molnar Date: Thu Oct 15 11:33:56 2009 +0200 Merge branch 'tracing/core' into perf/core Merge reason: to add event filter support we need the following commits from the tracing tree: 3f6fe06: tracing/filters: Unify the regex parsing helpers 1889d20: tracing/filters: Provide basic regex support 737f453: tracing/filters: Cleanup useless headers Signed-off-by: Ingo Molnar commit 0edf1a683e499191b27a067956ae9f5fa6e046c6 Author: Paul E. McKenney Date: Wed Oct 14 10:15:59 2009 -0700 rcu: Update trace.txt documentation for blocked-tasks lists Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: npiggin@suse.de Cc: jens.axboe@oracle.com LKML-Reference: <12555405592804-git-send-email-> Signed-off-by: Ingo Molnar commit bd58b430039435e4c981cf802b5b11d511d73abd Author: Paul E. McKenney Date: Wed Oct 14 10:15:54 2009 -0700 rcu: Update trace.txt documentation to reflect recent changes o Remove the CONFIG_PREEMPT_RCU documentation since this config option has now been removed. o Change the now-incorrect references to "rcu" labels to instead be "rcu_sched". o Add notes stating that CONFIG_TREE_PREEMPT_RCU kernels will have additional "rcu_preempt" output. o Note the new "oqlen" field in the rcuhier output (for RCU callbacks orphaned by an offlined CPU). Signed-off-by: Paul E. 
McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: npiggin@suse.de Cc: jens.axboe@oracle.com LKML-Reference: <1255540559799-git-send-email-> Signed-off-by: Ingo Molnar commit 3397e040dfacbb303498ced1baa96be983dcea06 Author: Paul E. McKenney Date: Wed Oct 14 16:36:38 2009 -0700 rcu: Add rnp->blocked_tasks to tracing Signed-off-by: Paul E. McKenney Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: npiggin@suse.de Cc: jens.axboe@oracle.com Cc: Josh Triplett LKML-Reference: <20091014233638.GE6763@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar kernel/rcutree_trace.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) commit c4dc775f53136cd6af8f88bce67cce9b42751768 Author: Steven Rostedt Date: Wed Oct 14 15:43:44 2009 -0400 perf tools: Remove all char * typecasts and use const in prototype The (char *) for all the static strings was a fix for the symptom and not the disease. The real issue was that the function prototypes needed to be declared "const char *". Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194400.635935008@goodmis.org> Signed-off-by: Ingo Molnar commit afdf1a404eed236d6f762ee44cc0f1dcc97206e0 Author: Steven Rostedt Date: Wed Oct 14 15:43:43 2009 -0400 perf tools: Handle - and + in parsing trace print format The opterators '-' and '+' are not handled in the trace print format. To do: '++' and '--'. 
Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194400.330843045@goodmis.org> Signed-off-by: Ingo Molnar commit cda48461c7fb8431a99b7960480f5f42cc1a5324 Author: Steven Rostedt Date: Wed Oct 14 15:43:42 2009 -0400 perf tools: Add latency format to trace output Add the irqs disabled, preemption count, need resched, and other info that is shown in the latency format of ftrace. # perf trace -l perf-16457 2..s2. 53636.260344: kmem_cache_free: call_site=ffffffff811198f perf-16457 2..s2. 53636.264330: kmem_cache_free: call_site=ffffffff811198f perf-16457 2d.s4. 53636.300006: kmem_cache_free: call_site=ffffffff810d889 Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194400.076588953@goodmis.org> Signed-off-by: Ingo Molnar commit 0d1da915c76838c9ee7af7cdefbcb2bae9424161 Author: Steven Rostedt Date: Wed Oct 14 15:43:41 2009 -0400 perf tools: Handle both versions of ftrace output The ftrace output events can have either arguments or no arguments. The parser needs to be able to handle both. Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194359.790221427@goodmis.org> Signed-off-by: Ingo Molnar commit ffa1895561645103d8f8059b35d9c06e6eeead2e Author: Steven Rostedt Date: Wed Oct 14 15:43:40 2009 -0400 perf tools: Fix bprintk reading in trace output The bprintk parsing was broken in more ways than one. The file parsing was incorrect, and the words used by the arguments are always 4 bytes aligned, even on 64-bit machines. 
Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194359.520931637@goodmis.org> Signed-off-by: Ingo Molnar commit 07a4bdddcf2546ccfbfb3c782deab636c371edeb Author: Steven Rostedt Date: Wed Oct 14 15:43:39 2009 -0400 perf tools: Still continue on failed parsing of an event Even though an event may fail to parse, we should not kill the entire report. The trace should still be able to show what it can. If an event fails to parse, a warning is printed, and the output continues. Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194359.190809589@goodmis.org> Signed-off-by: Ingo Molnar commit 13999e59343b042b0807be2df6ae5895d29782a0 Author: Steven Rostedt Date: Wed Oct 14 15:43:38 2009 -0400 perf tools: Handle the case with and without the "signed" trace field The trace format files now have a "signed" field. But we should still be able to handle the kernels that do not have this field. Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194358.888239553@goodmis.org> Signed-off-by: Ingo Molnar commit f1d1feecf07261d083859ecfef0d4399036f9683 Author: Steven Rostedt Date: Wed Oct 14 15:43:37 2009 -0400 perf tools: Handle newlines in trace parsing better New lines between args in the trace format can break the parsing. This should not be the case. Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194358.637991808@goodmis.org> Signed-off-by: Ingo Molnar commit b99af874829cba2b30d212bc6fd31b56275ee4d2 Author: Steven Rostedt Date: Wed Oct 14 15:43:36 2009 -0400 perf tools: Handle * as typecast in trace parsing The '*' is currently only treated as a multiplication, and it needs to be handled as a typecast pointer. This is the version used by trace-cmd. 
Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194358.409327875@goodmis.org> Signed-off-by: Ingo Molnar commit 0959b8d65ce26131c2d5ccfa518a7b76529280fa Author: Steven Rostedt Date: Wed Oct 14 15:43:35 2009 -0400 perf tools: Handle arrays in print fields for trace parsing The array used by the ftrace stack events (caller[x]) causes issues with the parser. This adds code to handle the case, but it also assumes that the array is of type long. Note, this is a special case used (currently) only by the ftrace user and kernel stack records. Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194358.124833639@goodmis.org> Signed-off-by: Ingo Molnar commit 298ebc3ef2a6c569b3eb51651f04e26aecbf8a1d Author: Steven Rostedt Date: Wed Oct 14 15:43:34 2009 -0400 perf tools: Handle trace parsing of < and > The code to handle the '<' and '>' ops was all in place, but they were not in the switch statement to consider them as valid ops. Signed-off-by: Steven Rostedt Cc: Peter Zijlstra Cc: Frederic Weisbecker Cc: Arnaldo Carvalho de Melo LKML-Reference: <20091014194357.807434040@goodmis.org> Signed-off-by: Ingo Molnar commit 91ff2bc191827f0d3f5ad0a433ff7df7d2dd9aee Author: Steven Rostedt Date: Wed Oct 14 15:43:33 2009 -0400 perf tools: Fix backslash processing on trace print formats The handling of backslashes was broken. It would stop parsing when encountering one. Also, '\n', '\t', '\r' and '\\' were not converted. 
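A minimal sketch of the escape conversion described above follows. It is not the parser code from the patch: `unescape_in_place()` is a hypothetical helper, and its handling of unknown escapes (keep the character after the backslash) and of a trailing lone backslash (keep it) are assumptions.

```c
#include <stddef.h>

/*
 * Convert the two-character sequences \n, \t, \r and \\ into single
 * characters, in place. Returns the new string length.
 */
static size_t unescape_in_place(char *s)
{
	const char *in = s;
	char *out = s;

	while (*in) {
		if (*in == '\\' && in[1] != '\0') {
			in++;	/* skip the backslash, look at the next char */
			switch (*in) {
			case 'n':  *out++ = '\n'; break;
			case 't':  *out++ = '\t'; break;
			case 'r':  *out++ = '\r'; break;
			case '\\': *out++ = '\\'; break;
			default:   *out++ = *in;  break; /* assumption: keep as-is */
			}
			in++;
		} else {
			*out++ = *in++;
		}
	}
	*out = '\0';
	return (size_t)(out - s);
}
```

The loop keeps going after a backslash instead of stopping, which is the behavior the fix restores.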
    Signed-off-by: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference: <20091014194357.521974680@goodmis.org>
    Signed-off-by: Ingo Molnar

commit 924a79af2cdee26a034b9bdce8c9c76995b5c901
Author: Steven Rostedt
Date:   Wed Oct 14 15:43:32 2009 -0400

    perf tools: Handle print concatenations in event format file

    The kmem_alloc ftrace event format had a string that was broken up
    into two tokens: "string 1" "string 2". This patch lets the parser
    handle the concatenation.

    Signed-off-by: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference: <20091014194357.253818714@goodmis.org>
    Signed-off-by: Ingo Molnar

commit b226f744d40b052ac126c4cb16c76f66e5185128
Merge: d5b889f a3ccf63
Author: Ingo Molnar
Date:   Thu Oct 15 08:44:42 2009 +0200

    Merge branch 'linus' into perf/core

    Merge reason: pick up tools/perf/ changes from upstream.

    Signed-off-by: Ingo Molnar

commit 1beee96bae0daf7f491356777c3080cc436950f5
Author: Frederic Weisbecker
Date:   Wed Oct 14 20:50:32 2009 +0200

    ftrace: Rename set_bootup_ftrace into set_cmdline_ftrace

    set_cmdline_ftrace is a better match for what this function does:
    apply a tracer name from the kernel command line.

    Reported-by: Steven Rostedt
    Signed-off-by: Frederic Weisbecker
    Cc: Li Zefan

commit 06f43d66ec36388056f5c697bf1e67c0e0a1645c
Author: Frederic Weisbecker
Date:   Wed Oct 14 20:43:39 2009 +0200

    ftrace: Copy ftrace_graph_filter boot param using strlcpy

    We are using strncpy in the wrong way to copy the ftrace_graph_filter
    boot param because we pass the buffer size instead of the max string
    size it can contain (buffer size - 1). The end result might not be
    NUL-terminated as we are abusing the max string size.

    Let's use strlcpy() instead.
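The difference between the two calls can be sketched in portable C. `strncpy(dst, src, size)` writes no terminator when `strlen(src) >= size`, whereas an strlcpy-style copy always terminates a non-empty buffer. Since `strlcpy()` is not available in every C library, the sketch uses a hypothetical stand-in, `copy_bounded()`, that follows the same contract.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(): copy at most size - 1 bytes,
 * always NUL-terminate when size > 0, and return strlen(src) so the
 * caller can detect truncation (return value >= size).
 */
static size_t copy_bounded(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size != 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

With a 4-byte buffer and the source "abcdef", `strncpy(buf, src, sizeof(buf))` would fill all four bytes with "abcd" and leave no terminator, while the bounded copy stores "abc" followed by a NUL.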
Reported-by: Li Zefan Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt commit 9636bc0555e3f383c120ddcffe4b7c5c58a10b1a Author: Cyrill Gorcunov Date: Wed Oct 14 19:09:04 2009 +0400 x86, apic: Explain show_lapic= in kernel parameters list Signed-off-by: Cyrill Gorcunov Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091014150904.GA5259@lenovo> Signed-off-by: Ingo Molnar commit 05d86412eab6a18cf57697474cc4f8fbfcd6936f Author: Thomas Gleixner Date: Fri Oct 9 19:02:20 2009 +0200 x86: Remove BKL from apm_32 The lock/unlock kernel pair in do_open() got there with the BKL push down and protects nothing. Remove it. Replace the lock/unlock kernel in the ioctl code with a mutex to protect standbys_pending and suspends_pending. Signed-off-by: Thomas Gleixner LKML-Reference: <20091010153349.365236337@linutronix.de> commit ac06ea2cd06291e63951b51dd7c9a23e6a1f2683 Author: Thomas Gleixner Date: Sat Oct 10 09:35:48 2009 +0200 x86: Remove BKL from microcode cycle_lock_kernel() in microcode_open() is a worthless exercise as there is nothing to wait for. Remove it. Signed-off-by: Thomas Gleixner LKML-Reference: <20091010153349.196074920@linutronix.de> commit 7ec13187ef48b04bb7f6dfa266c7271a52d009c2 Author: Ingo Molnar Date: Wed Oct 14 15:06:42 2009 +0200 x86, apic: Fix prototype in hw_irq.h This warning: In file included from arch/x86/include/asm/ipi.h:23, from arch/x86/kernel/apic/apic_noop.c:27: arch/x86/include/asm/hw_irq.h:105: warning: ‘struct irq_desc’ declared inside parameter list arch/x86/include/asm/hw_irq.h:105: warning: its scope is only this definition or declaration, which is probably not what you want triggers because irq_desc is defined after hw_irq.h is included in irq.h. Since it's pointer reference only, a forward declaration of the type will solve the problem. 
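The forward-declaration fix can be illustrated with a self-contained sketch. The names `irq_desc_sketch` and `irq_number_of()` are hypothetical and only mirror the shape of the problem; they are not the declarations in hw_irq.h.

```c
/*
 * Forward declaration: enough for any prototype that only passes the
 * type by pointer. Without it, the struct tag would be declared inside
 * the parameter list below and be scoped to that one prototype, which
 * is what the compiler warning complains about.
 */
struct irq_desc_sketch;

int irq_number_of(const struct irq_desc_sketch *desc);

/* The full definition can arrive later, for example from another header. */
struct irq_desc_sketch {
	int irq;
};

int irq_number_of(const struct irq_desc_sketch *desc)
{
	return desc->irq;
}
```

The prototype compiles before the structure layout is known because only a pointer to the incomplete type is needed at that point.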
    LKML-Reference:
    Signed-off-by: Ingo Molnar

commit 459c6d15a0c52bae43842ff2cd0dd41aa7de9b7f
Author: Frederic Weisbecker
Date:   Sat Sep 19 07:14:15 2009 +0200

    tracing: Document HAVE_SYSCALL_TRACEPOINTS needs

    Document what an arch must provide to support syscall tracing.

    v2: HAVE_FTRACE_SYSCALLS was recently renamed to
        HAVE_SYSCALL_TRACEPOINTS. Update the config name in the
        documentation accordingly.

    Signed-off-by: Frederic Weisbecker
    Acked-by: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Li Zefan
    Cc: Masami Hiramatsu
    Cc: Jason Baron
    Cc: Lai Jiangshan
    Cc: Martin Schwidefsky
    Cc: Paul Mundt

commit c44fc770845163f8d9e573f37f92a7b7a7ade14e
Author: Frederic Weisbecker
Date:   Sat Sep 19 06:50:42 2009 +0200

    tracing: Move syscalls metadata handling from arch to core

    Most of the syscalls metadata processing is done from arch.
    But these operations are mostly generic across archs. Especially now
    that we have a common variable name that expresses the number of
    syscalls supported by an arch: NR_syscalls, the only remaining bit
    that needs to reside in arch is the syscall nr to addr translation.

    v2: Compare syscalls symbols only after the "sys" prefix so that we
        avoid spurious mismatches with archs that have syscalls wrappers,
        in which case syscalls symbols have "SyS" prefixed aliases.
        (Reported by: Heiko Carstens)

    Signed-off-by: Frederic Weisbecker
    Acked-by: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Li Zefan
    Cc: Masami Hiramatsu
    Cc: Jason Baron
    Cc: Lai Jiangshan
    Cc: Martin Schwidefsky
    Cc: Paul Mundt

commit 9338ad6ffb70eca97f335d93c54943828c8b209e
Author: Dimitri Sivanich
Date:   Tue Oct 13 15:32:36 2009 -0500

    x86, apic: Move SGI UV functionality out of generic IO-APIC code

    Move UV specific functionality out of the generic IO-APIC code.

    Signed-off-by: Dimitri Sivanich
    LKML-Reference: <20091013203236.GD20543@sgi.com>
    [ Cleaned up the code some more in their new places.
] Signed-off-by: Ingo Molnar commit 6c2c502910247d2820cb630e7b28fb6bdecdbf45 Author: Dimitri Sivanich Date: Wed Sep 30 11:02:59 2009 -0500 x86: SGI UV: Fix irq affinity for hub based interrupts This patch fixes handling of uv hub irq affinity. IRQs with ALL or NODE affinity can be routed to cpus other than their originally assigned cpu. Those with CPU affinity cannot be rerouted. Signed-off-by: Dimitri Sivanich LKML-Reference: <20090930160259.GA7822@sgi.com> Signed-off-by: Ingo Molnar commit 2626eb2b2fd958dc0f683126aa84e93b939699a1 Author: Cyrill Gorcunov Date: Wed Oct 14 00:07:05 2009 +0400 x86, apic: Limit apic dumping, introduce new show_lapic= setup option In case if a system has a large number of cpus printing apics contents may consume a long time period. We limit such an output by 1 apic by default. But to have an ability to see all apics or some part of them we introduce "show_lapic" setup option which allow us to limit/unlimit the number of APICs being dumped. Example: apic=debug show_lapic=5, or apic=debug show_lapic=all Also move apic_verbosity checking upper that way so helper routines do not need to inspect it at all. Suggested-by: Yinghai Lu Signed-off-by: Cyrill Gorcunov Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.926793122@openvz.org> Signed-off-by: Ingo Molnar commit a933c61829509eb27083146dda392132baa0969a Author: Cyrill Gorcunov Date: Wed Oct 14 00:07:04 2009 +0400 x86, apic: Use apic noop driver In case if apic were disabled we may use the whole apic NOOP driver instead of sparse poking the some functions in apic driver. Also NOOP would catch any inappropriate apic operation calls (not just read/write). 
    Signed-off-by: Cyrill Gorcunov
    Cc: yinghai@kernel.org
    Cc: macro@linux-mips.org
    LKML-Reference: <20091013201022.747817361@openvz.org>
    Signed-off-by: Ingo Molnar

commit 9844ab11c763bfed9f054c82366b19dcda66aca9
Author: Cyrill Gorcunov
Date:   Wed Oct 14 00:07:03 2009 +0400

    x86, apic: Introduce the NOOP apic driver

    Introduce the NOOP APIC driver. We should use it when the apic has
    been disabled due to hardware or software/firmware problems
    (including the case where the user requested that it be disabled).

    The driver attempts to catch any inappropriate apic operation call
    and issues a warning.

    It is also possible to use some apic operations, such as IPI calls
    and read/write, without checking for apic presence, which should
    make the callers' code simpler.

    Signed-off-by: Cyrill Gorcunov
    Cc: yinghai@kernel.org
    Cc: macro@linux-mips.org
    LKML-Reference: <20091013201022.534682104@openvz.org>
    Signed-off-by: Ingo Molnar

commit 4d8289494a37e19cd7f3beacea9c957ad3debad6
Author: Jiri Olsa
Date:   Tue Oct 13 16:33:54 2009 -0400

    tracing: Enable "__cold" functions

    Based on the commit:

        a586df06 "x86: Support __attribute__((__cold__)) in gcc 4.3"

    some of the functions go to the ".text.unlikely" section.

    It looks like there are not many of them (I found printk, panic,
    __ssb_dma_not_implemented, fat_fs_error), but I think they are
    still worth including.

    Signed-off-by: Jiri Olsa
    Cc: Frederic Weisbecker
    Signed-off-by: Steven Rostedt
    LKML-Reference: <20091013203426.175845614@goodmis.org>
    Signed-off-by: Ingo Molnar

commit 5cb084bb1f3fd4dcdaf7e4cf564994346ec8f783
Author: Jiri Olsa
Date:   Tue Oct 13 16:33:53 2009 -0400

    tracing: Enable records during the module load

    I was debugging a module using the "function" and "function_graph"
    tracers and noticed that if you load a module after you have enabled
    tracing, the module's hooks are converted only to NOP instructions.

    The attached patch enables the module's hooks if function tracing is
    already on, thus allowing module functions to be traced.
Signed-off-by: Jiri Olsa Cc: Frederic Weisbecker Signed-off-by: Steven Rostedt LKML-Reference: <20091013203425.896285120@goodmis.org> Signed-off-by: Ingo Molnar commit 756d17ee7ee4fbc8238bdf97100af63e6ac441ef Author: jolsa@redhat.com Date: Tue Oct 13 16:33:52 2009 -0400 tracing: Support multiple pids in set_pid_ftrace file Add the possibility to set more than 1 pid in the set_ftrace_pid file, thus allowing more than 1 independent process to be traced. Usage: sh-4.0# echo 284 > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid 284 sh-4.0# echo 1 >> ./set_ftrace_pid sh-4.0# echo 0 >> ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid swapper tasks 1 284 sh-4.0# echo 4 > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid 4 sh-4.0# echo > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid no pid sh-4.0# Signed-off-by: Jiri Olsa Cc: Frederic Weisbecker LKML-Reference: <20091013203425.565454612@goodmis.org> Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar commit 194ec34184869f0de1cf255c924fc5299e1b3d27 Author: Steven Rostedt Date: Tue Oct 13 16:33:50 2009 -0400 function-graph/x86: Replace unbalanced ret with jmp The function graph tracer replaces the return address with a hook to trace the exit of the function call. This hook will finish by returning to the real location the function should return to. But the current implementation uses a ret to jump to the real return location. This causes an imbalance between calls and rets. That is, the original function does a call, the ret goes to the handler and then the handler does a ret without a matching call. Although the function graph tracer itself still breaks the branch predictor by replacing the original ret, by using a second ret and causing an imbalance, it breaks the predictor even more. This patch replaces the ret with a jmp to keep the calls and rets balanced. I tested this on one box and it showed a 1.7% increase in performance. Another box only showed a small 0.3% increase.
But no box that I tested this on showed a decrease in performance by making this change. Signed-off-by: Steven Rostedt Acked-by: Mathieu Desnoyers Cc: Frederic Weisbecker LKML-Reference: <20091013203425.042034383@goodmis.org> Signed-off-by: Ingo Molnar commit 825332e4ff1373c55d931b49408df7ec2298f71e Author: Arjan van de Ven Date: Wed Oct 14 08:17:36 2009 +1100 capabilities: simplify bound checks for copy_from_user() The capabilities syscall has a copy_from_user() call where gcc currently cannot prove to itself that the copy is always within bounds. This patch adds a very explicit bounds check to prove to gcc that this copy_from_user cannot overflow its destination buffer. Signed-off-by: Arjan van de Ven Acked-by: James Morris Signed-off-by: Andrew Morton Signed-off-by: James Morris commit d5b889f2ecec7849e851ddd31c34bdfb3482b5de Author: Arnaldo Carvalho de Melo Date: Tue Oct 13 11:16:29 2009 -0300 perf tools: Move threads & last_match to threads.c This was just being copy'n'pasted all over. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <20091013141629.GD21809@ghostprotocols.net> Signed-off-by: Ingo Molnar commit f4f0b418188cc7995375acbb54e87c80f21861bd Author: Mike Galbraith Date: Tue Oct 13 14:57:20 2009 +0200 perf tools: Remove expensive old debug code from perf top Calling gettimeofday() at high frequency is painful for handicapped boxen. The spot calling gettimeofday() is old unneeded debug code, so remove it.
Reported-by: Ingo Molnar Signed-off-by: Mike Galbraith Cc: Peter Zijlstra Cc: Peter Zijlstra LKML-Reference: <1255438640.7173.1.camel@marge.simson.net> Signed-off-by: Ingo Molnar commit 1bac0497ef9af8d933860672223e38bd6ac4934a Merge: 2c96c14 bf7c5b4 Author: Ingo Molnar Date: Tue Oct 13 12:03:08 2009 +0200 Merge branch 'tracing/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into tracing/core commit cfed95a693e1ea5d08b9c9019bc30e448437ee2f Author: Vincent Legoll Date: Tue Oct 13 10:18:16 2009 +0200 perf tools: Do not manually count string lengths Use strlen & macros instead of manually counting string lengths as this is error prone and may lead to bugs. Signed-off-by: Vincent Legoll Cc: Linus Torvalds LKML-Reference: <4727185d0910130118m5387058dndb02ac9b384af9f0@mail.gmail.com> Signed-off-by: Ingo Molnar commit 8968f9d3dc23d9a1821d97c6f11e72a59382e56c Author: Hidetoshi Seto Date: Tue Oct 13 16:19:41 2009 +0900 perf_event, x86, mce: Use TRACE_EVENT() for MCE logging This approach is the first baby step towards solving many of the structural problems the x86 MCE logging code is having today: - It has a private ring-buffer implementation that has a number of limitations and has been historically fragile and buggy. - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE specific. /dev/mcelog is not part of any larger logging framework and hence has remained on the fringes for many years. - The MCE logging code is still very unclean partly due to its ABI limitations. Fields are being reused for multiple purposes, and the whole message structure is limited and x86 specific to begin with. All in all, the x86 tree would like to move away from this private implementation of an event logging facility to a broader framework. By using perf events we gain the following advantages: - Multiple user-space agents can access MCE events.
We can have an mcelog daemon running but also a system-wide tracer capturing important events in flight-recorder mode. - Sampling support: the kernel and the user-space call-chain of MCE events can be stored and analyzed as well. This way actual patterns of bad behavior can be matched to precisely what kind of activity happened in the kernel (and/or in the app) around that moment in time. - Coupling with other hardware and software events: the PMU can track a number of other anomalies - monitoring software might choose to monitor those plus the MCE events as well - in one coherent stream of events. - Discovery of MCE sources - tracepoints are enumerated and tools can act upon the existence (or non-existence) of various channels of MCE information. - Filtering support: we just subscribe to and act upon the events we are interested in. Then even on a per event source basis there are in-kernel filter expressions available that can restrict the amount of data that hits the event channel. - Arbitrarily deep per cpu buffering of events - we can buffer 32 entries or we can buffer as much as we want, as long as we have the RAM. - An NMI-safe ring-buffer implementation - mappable to user-space. - Built-in support for timestamping of events, PID markers, CPU markers, etc. - A rich ABI accessible over system call interface. Per cpu, per task and per workload monitoring of MCE events can be done this way. The ABI itself has a nice, meaningful structure. - Extensible ABI: new fields can be added without breaking tooling. New tracepoints can be added as the hardware side evolves. There are various parsers that can be used. - Lots of scheduling/buffering/batching modes of operation for MCE events. poll() support. mmap() support. read() support. You name it.
- Rich tooling support: even without any MCE specific extensions added the 'perf' tool today offers various views of MCE data: perf report, perf stat, perf trace can all be used to view logged MCE events and perhaps correlate them to certain user-space usage patterns. But it can be used directly as well, for user-space agents and policy action in mcelog, etc. With this we hope to achieve significant code cleanup and feature improvements in the MCE code, and we hope to be able to drop the /dev/mcelog facility in the end. This patch is just a plain dumb dump of mce_log() records to the tracepoints / perf events framework - a first proof of concept step. Signed-off-by: Hidetoshi Seto Cc: Huang Ying Cc: Andi Kleen LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com> Signed-off-by: Ingo Molnar commit bf7c5b43a12614847b83f507fb169ad30640e406 Author: Frederic Weisbecker Date: Mon Oct 12 22:31:32 2009 +0200 tracing: Remove unused ftrace_trace_addr helper Remove the ftrace_trace_addr() function as only its off-case is implemented and there are no users of it currently. But we keep the ftrace_graph_addr() off-case, in case someone comes to use the function graph tracer to benefit from top-level callers filtering. Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Li Zefan commit aef6f81b55f462082699c06e8e67e6eb5630ed45 Author: Frederic Weisbecker Date: Mon Oct 12 22:23:24 2009 +0200 tracing: Rename set_ftrace to set_bootup_ftrace Do this rename because set_ftrace is too generic and not self-explanatory enough as a name. Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Li Zefan commit 9dbdd6c41c12fb42ee7188eafa7e1917b192af3a Merge: 7a693d3 1612913 Author: Ingo Molnar Date: Tue Oct 13 09:31:28 2009 +0200 Merge commit 'v2.6.32-rc4' into perf/core Merge reason: we were on an -rc1 base, merge up to -rc4.
Signed-off-by: Ingo Molnar commit 2c96c142e941041973faab20ca3b82d57f435c5e Merge: 3c35586 8ad8073 Author: Ingo Molnar Date: Tue Oct 13 09:24:51 2009 +0200 Merge branch 'tracing/urgent' into tracing/core Merge reason: Pick up tracing/filters fix from the urgent queue, we will queue up dependent patches. Signed-off-by: Ingo Molnar commit 7a693d3f0d10f978ebdf3082c41404ab97106567 Author: Ingo Molnar Date: Tue Oct 13 08:16:30 2009 +0200 perf_events, x86: Fix event constraints code There was namespace overlap due to a rename i did - this caused the following build warning, reported by Stephen Rothwell against linux-next x86_64 allmodconfig: arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx': arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function This is a real bug not just a warning: fix it by renaming the global event-constraints table pointer to 'event_constraints'. Reported-by: Stephen Rothwell Cc: Stephane Eranian Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar commit 23e8ec0d1c410f2f1d81050ee155db229abb1707 Author: Masami Hiramatsu Date: Wed Oct 7 18:28:30 2009 -0400 perf probe: Add perf probe command support without libdwarf Enables 'perf probe' even if libdwarf is not installed. If libdwarf is not found, 'perf probe' just disables dwarf support. Users can use 'perf probe' to set up new events by using kprobe_events format. Signed-off-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Christoph Hellwig Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Frank Ch. 
Eigler LKML-Reference: <20091007222830.1684.25665.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 4ea42b181434bfc6a0a18d32214130a242d489bf Author: Masami Hiramatsu Date: Thu Oct 8 17:17:38 2009 -0400 perf: Add perf probe subcommand, a kprobe-event setup helper Add perf probe subcommand that implements a kprobe-event setup helper to the perf command. This allows user to define kprobe events using C expressions (C line numbers, C function names, and C local variables). Usage ----- perf probe [] -P 'PROBEDEF' [-P 'PROBEDEF' ...] -k, --vmlinux vmlinux/module pathname -P, --probe probe point definition, where p: kprobe probe r: kretprobe probe GRP: Group name (optional) NAME: Event name FUNC: Function name OFFS: Offset from function entry (in byte) SRC: Source code path LINE: Line number ARG: Probe argument (local variable name or kprobe-tracer argument format is supported.) Changes in v4: - Add _GNU_SOURCE macro for strndup(). Changes in v3: - Remove -r option because perf always be used for online kernel. - Check malloc/calloc results. Changes in v2: - Check synthesized string length. - Rename perf kprobe to perf probe. - Use spaces for separator and update usage comment. - Check error paths in parse_probepoint(). - Check optimized-out variables. Signed-off-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Christoph Hellwig Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Frank Ch. 
Eigler LKML-Reference: <20091008211737.29299.14784.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit e93f4d8539d5e9dd59f4af9d8ef4e9b62cfa1f81 Author: Masami Hiramatsu Date: Wed Oct 7 18:28:14 2009 -0400 tracing/kprobes: Robustify fixed field names against variable field names conflicts Rename probe-common fixed field names to harder conflictable names, because current 'ip', 'func', and other probe field names are easily in conflict with user-specified variable names. Signed-off-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Christoph Hellwig Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Frank Ch. Eigler LKML-Reference: <20091007222814.1684.407.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit a703d946e883d8e447d0597de556e2effd110372 Author: Masami Hiramatsu Date: Wed Oct 7 18:28:07 2009 -0400 tracing/kprobes: Avoid field name confliction Check whether the argument name is in conflict with other field names while creating a kprobe through the debugfs interface. Changes in v3: - Check strcmp() == 0 instead of !strcmp(). Changes in v2: - Add common_lock_depth to reserved name list. Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Christoph Hellwig Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Frank Ch. 
Eigler LKML-Reference: <20091007222807.1684.26880.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 2e06ff6389aedafc4a3a374344ac70672252f9b5 Author: Masami Hiramatsu Date: Wed Oct 7 18:27:59 2009 -0400 tracing/kprobes: Make special variable names more self-explainable Rename special variables to more self-explainable names as below: - $rv to $retval - $sa to $stack - $aN to $argN - $sN to $stackN Signed-off-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Christoph Hellwig Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Frank Ch. Eigler LKML-Reference: <20091007222759.1684.3319.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 98272ed0d2e6509fe7dc571e77956c99bf653bb6 Author: H. Peter Anvin Date: Mon Oct 12 14:14:10 2009 -0700 x86: use kernel_stack_pointer() in kprobes.c The way to obtain a kernel-mode stack pointer from a struct pt_regs in 32-bit mode is "subtle": the stack doesn't actually contain the stack pointer, but rather the location where it would have been marks the actual previous stack frame. For clarity, use kernel_stack_pointer() instead of coding this weirdness explicitly. Signed-off-by: H. Peter Anvin Cc: Ananth N Mavinakayanahalli Cc: Anil S Keshavamurthy Cc: "David S. Miller" Cc: Masami Hiramatsu commit 5ca6c0ca5dbf105d7b0ffdae2289519982189730 Author: H. Peter Anvin Date: Mon Oct 12 14:12:18 2009 -0700 x86: use kernel_stack_pointer() in kgdb.c The way to obtain a kernel-mode stack pointer from a struct pt_regs in 32-bit mode is "subtle": the stack doesn't actually contain the stack pointer, but rather the location where it would have been marks the actual previous stack frame. For clarity, use kernel_stack_pointer() instead of coding this weirdness explicitly. Signed-off-by: H. Peter Anvin Cc: Jason Wessel commit a343c75d338aa2afaea4a2a8e40de9e67b6fb4a7 Author: H. 
Peter Anvin Date: Mon Oct 12 14:11:09 2009 -0700 x86: use kernel_stack_pointer() in dumpstack.c The way to obtain a kernel-mode stack pointer from a struct pt_regs in 32-bit mode is "subtle": the stack doesn't actually contain the stack pointer, but rather the location where it would have been marks the actual previous stack frame. For clarity, use kernel_stack_pointer() instead of coding this weirdness explicitly. Furthermore, user_mode() is only valid when the process is known to not run in V86 mode. Use the safer user_mode_vm() instead. Signed-off-by: H. Peter Anvin commit def3c5d0a34e4b09b3cea4435c17209ad347104d Author: H. Peter Anvin Date: Mon Oct 12 14:09:07 2009 -0700 x86: use kernel_stack_pointer() in process_32.c The way to obtain a kernel-mode stack pointer from a struct pt_regs in 32-bit mode is "subtle": the stack doesn't actually contain the stack pointer, but rather the location where it would have been marks the actual previous stack frame. For clarity, use kernel_stack_pointer() instead of coding this weirdness explicitly. Signed-off-by: H. Peter Anvin commit ad8f4356af58f7ded6b4a5787c67c7cab51066b5 Author: Arjan van de Ven Date: Tue Oct 6 07:04:52 2009 -0700 x86: Don't use the strict copy checks when branch profiling is in use The branch profiling creates very complex code for each if statement, to the point that gcc has trouble even analyzing something as simple as if (count > 5) count = 5; This then means that causing an error on code that gcc cannot analyze for copy_from_user() and co is not very productive. This patch excludes the strict copy checks in the case of branch profiling being enabled. 
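The clamp pattern mentioned above, `if (count > 5) count = 5;`, is what lets a compiler prove that a following copy stays inside its destination. A minimal userspace sketch, assuming `memcpy()` as a stand-in for copy_from_user() and using the hypothetical names `copy_items()` and `MAX_ITEMS`:

```c
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 5	/* illustrative capacity of the destination buffer */

/*
 * Hypothetical helper: copy at most MAX_ITEMS ints from src into dst.
 * The explicit clamp bounds the copy length by the destination size,
 * which is the property a compile-time object-size check needs to see.
 * Returns the number of items actually copied.
 */
static size_t copy_items(int dst[MAX_ITEMS], const int *src, size_t count)
{
	if (count > MAX_ITEMS)
		count = MAX_ITEMS;

	memcpy(dst, src, count * sizeof(int));
	return count;
}
```

With branch profiling enabled, each `if` is wrapped in extra bookkeeping code, so per the commit message gcc can lose track of even this simple bound; that is the stated reason for switching the strict checks off in that configuration.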
Signed-off-by: Arjan van de Ven Cc: Steven Rostedt LKML-Reference: <20091006070452.5e1fc119@infradead.org> Signed-off-by: Ingo Molnar commit 369bc18f9a6c4e2686204c1d7476ab684a720968 Author: Stefan Assmann Date: Mon Oct 12 22:17:21 2009 +0200 ftrace: add kernel command line graph function filtering Add a command line parameter to allow limiting the function graphs that are traced on boot up from the given top-level callers , when ftrace=function_graph is specified. This patch adds the following command line option: ftrace_graph_filter=function-list Where function-list is a comma separated list of functions to filter. [fweisbec@gmail.com: picked the documentation changes from the v2 patch] Signed-off-by: Stefan Assmann Acked-by: Steven Rostedt LKML-Reference: <4AD2DEB9.2@redhat.com> Signed-off-by: Frederic Weisbecker commit 99329c44f28a1b7ac83beebfb4319e612042e319 Author: Masami Hiramatsu Date: Wed Oct 7 18:27:48 2009 -0400 tracing/kprobes: Remove '$ra' special variable Remove '$ra' (return address) because it is already shown at the head of each entry. Signed-off-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Christoph Hellwig Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Frank Ch. Eigler LKML-Reference: <20091007222748.1684.12711.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 405b2651e4bedf8d3932b64cad649b4d26b067f5 Author: Masami Hiramatsu Date: Wed Oct 7 18:27:40 2009 -0400 tracing/kprobes: Add $ prefix to special variables Add $ prefix to the special variables(e.g. sa, rv) of kprobe-tracer. This resolves consistency issues between kprobe_events and perf-kprobe. The main goal is to avoid conflicts between local variable names of probed functions, used by perf probe, and special variables used in the kprobe event creation interface (stack values, etc...) and also available from perf probe. 
ie: we don't want rv (return value) to conflict with a local variable named rv in a probed function. Signed-off-by: Masami Hiramatsu Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Arnaldo Carvalho de Melo Cc: Steven Rostedt Cc: Mike Galbraith Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Christoph Hellwig Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Frank Ch. Eigler LKML-Reference: <20091007222740.1684.91170.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit ae24ffe5ecec17c956ac25371d7c2e12b4b36e53 Author: Brian Gerst Date: Mon Oct 12 10:18:23 2009 -0400 x86, 64-bit: Move K8 B step iret fixup to fault entry asm Move the handling of truncated %rip from an iret fault to the fault entry path. This allows x86-64 to use the standard search_extable() function. Signed-off-by: Brian Gerst Cc: Linus Torvalds Cc: Jan Beulich LKML-Reference: <1255357103-5418-1-git-send-email-brgerst@gmail.com> Signed-off-by: Ingo Molnar commit fb2531953fd8855abdcf458459020fd382c5deca Author: Borislav Petkov Date: Wed Oct 7 13:20:38 2009 +0200 mce, edac: Use an atomic notifier for MCEs decoding Add an atomic notifier which ensures proper locking when conveying MCE info to EDAC for decoding. The actual notifier call overrides a default, negative priority notifier. Note: make sure we register the default decoder only once since mcheck_init() runs on each CPU. Signed-off-by: Borislav Petkov LKML-Reference: <20091003065752.GA8935@liondog.tnic> Signed-off-by: Ingo Molnar commit 55ffb7a6bd45d0083ffb132381cb46964a4afe01 Author: Mike Galbraith Date: Sat Oct 10 14:46:04 2009 +0200 perf sched: Add -C option to measure on a specific CPU To refresh, trying to sched record only one CPU results in bogus latencies as below. I fixed^Wmade it stop doing the bad thing today, by following task migration events properly. 
Before: marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10 marge:/root/tmp # perf sched lat ----------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------------- Xorg:4943 | 1.290 ms | 1 | avg: 1670.132 ms | max: 1670.132 ms | hald-addon-stor:3569 | 0.091 ms | 3 | avg: 658.609 ms | max: 1975.797 ms | hald-addon-stor:3573 | 0.209 ms | 4 | avg: 499.138 ms | max: 1990.565 ms | audispd:4270 | 0.012 ms | 1 | avg: 0.015 ms | max: 0.015 ms | .... marge:/root/tmp # perf sched trace|grep 'Xorg:4943' swapper-0 [000] 401.184013288: sched_stat_runtime: task: Xorg:4943 runtime: 1233188 [ns], vruntime: 19105169779 [ns] rt2870TimerQHan-4947 [000] 402.854140127: sched_stat_wait: task: Xorg:4943 wait: 580073 [ns] rt2870TimerQHan-4947 [000] 402.854141770: sched_migrate_task: task Xorg:4943 [140] from: 1 to: 0 rt2870TimerQHan-4947 [000] 402.854143854: sched_stat_wait: task: Xorg:4943 wait: 0 [ns] rt2870TimerQHan-4947 [000] 402.854145397: sched_switch: task rt2870TimerQHan:4947 [140] (D) ==> Xorg:4943 [140] Xorg-4943 [000] 402.854193133: sched_stat_runtime: task: Xorg:4943 runtime: 56546 [ns], vruntime: 11766332500 [ns] Xorg-4943 [000] 402.854196842: sched_switch: task Xorg:4943 [140] (S) ==> swapper:0 [140] After: marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10 marge:/root/tmp # perf sched lat ----------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------------- amarokapp:11150 | 271.297 ms | 878 | avg: 0.130 ms | max: 1.057 ms | konsole:5965 | 1.370 ms | 12 | avg: 0.092 ms | max: 0.855 ms | Xorg:4943 | 179.980 ms | 1109 | avg: 0.087 ms | max: 1.206 ms | hald-addon-stor:3574 | 0.212 ms | 9 | avg: 
0.040 ms | max: 0.169 ms | hald-addon-stor:3570 | 0.223 ms | 9 | avg: 0.037 ms | max: 0.223 ms | klauncher:5864 | 0.550 ms | 8 | avg: 0.032 ms | max: 0.048 ms | The 'Maximum delay ms' results are now sane. Signed-off-by: Mike Galbraith LKML-Reference: Signed-off-by: Ingo Molnar commit 7e4ff9e3e8f88de8a8536f43294cd32b4e7d9123 Author: Mike Galbraith Date: Mon Oct 12 07:56:03 2009 +0200 perf tools: Fix counter sample frequency breakage Commit 42e59d7d19dc4b4 switched to a default sample frequency of 1KHz, which overrides any user supplied count, causing sched, top and timechart to miss events due to their discrete events being flagged PERF_SAMPLE_PERIOD. Override default sample frequency when the user provides a period count, and make both record and top honor that user supplied option. Signed-off-by: Mike Galbraith Cc: Peter Zijlstra Cc: Arjan van de Ven LKML-Reference: <1255326963.15107.2.camel@marge.simson.net> Signed-off-by: Ingo Molnar commit 3c355863fb32070a2800f41106519c5c3038623a Author: Joe Perches Date: Sun Oct 4 17:53:40 2009 -0700 testmmiotrace.c: Add and use pr_fmt(fmt) - Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt. - Strip MODULE_NAME from pr_s. - Remove MODULE_NAME definition. Signed-off-by: Joe Perches LKML-Reference: <3bb66cc7f85f77b9416902e1be7076f7e3f4ad48.1254701151.git.joe@perches.com> Signed-off-by: Ingo Molnar commit 3bb258bf430d29a24350fe4f44f8bf07b7b7a8f6 Author: Joe Perches Date: Sun Oct 4 17:53:29 2009 -0700 ftrace.c: Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - Remove prefixes from pr_, use pr_fmt(fmt). No change in output. Signed-off-by: Joe Perches Acked-by: Steven Rostedt Cc: Frederic Weisbecker LKML-Reference: <9b377eefae9e28c599dd4a17bdc81172965e9931.1254701151.git.joe@perches.com> Signed-off-by: Ingo Molnar commit a27ab9f26b729326778271c1efd895aef4fda1c4 Author: Tetsuo Handa Date: Sun Oct 4 21:49:49 2009 +0900 LSM: Pass original mount flags to security_sb_mount().
This patch allows LSM modules to determine based on original mount flags passed to mount(). A LSM module can get masked mount flags (if needed) by flags &= ~(MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_ACTIVE | MS_NOATIME | MS_NODIRATIME | MS_RELATIME| MS_KERNMOUNT | MS_STRICTATIME); Signed-off-by: Tetsuo Handa Signed-off-by: James Morris commit 8b8efb44033c7e86b3dc76f825c693ec92ae30e9 Author: Tetsuo Handa Date: Sun Oct 4 21:49:48 2009 +0900 LSM: Add security_path_chroot(). This patch allows pathname based LSM modules to check chroot() operations. This hook is used by TOMOYO. Signed-off-by: Tetsuo Handa Signed-off-by: James Morris commit 89eda06837094ce9f34fae269b8773fcfd70f046 Author: Tetsuo Handa Date: Sun Oct 4 21:49:47 2009 +0900 LSM: Add security_path_chmod() and security_path_chown(). This patch allows pathname based LSM modules to check chmod()/chown() operations. Since notify_change() does not receive "struct vfsmount *", we add security_path_chmod() and security_path_chown() to the caller of notify_change(). These hooks are used by TOMOYO. Signed-off-by: Tetsuo Handa Signed-off-by: James Morris commit f3834b9ef68067199486740b31f691afb14dbdf5 Author: Peter Zijlstra Date: Fri Oct 9 10:12:46 2009 +0200 x86: Generate cmpxchg build failures Rework the x86 cmpxchg() implementation to generate build failures when used on improper types. Signed-off-by: Peter Zijlstra Acked-by: Linus Torvalds LKML-Reference: <1254771187.21044.22.camel@laptop> Signed-off-by: Ingo Molnar commit fe9081cc9bdabb0be953a39ad977cea14e35bce5 Author: Peter Zijlstra Date: Thu Oct 8 11:56:07 2009 +0200 perf, x86: Add simple group validation Refuse to add events when the group wouldn't fit onto the PMU anymore. Naive implementation. 
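A naive group validation of the kind described above can be sketched as a simple capacity check. The `event_group` struct, `group_try_add()` and the counter count passed in are hypothetical and only illustrate the idea of refusing an event once the group no longer fits; they are not the perf_event code.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one event group (illustration only). */
struct event_group {
	int nr_events;	/* events already accepted into the group */
};

/*
 * Naive validation: accept a new event only if the whole group would
 * still fit onto a PMU with num_counters counters. Per-event counter
 * constraints are ignored here, which is what makes the check naive.
 */
static bool group_try_add(struct event_group *group, int num_counters)
{
	if (group->nr_events + 1 > num_counters)
		return false;	/* group would no longer fit: refuse */

	group->nr_events++;
	return true;
}
```

Rejecting the event at creation time gives the caller an immediate error instead of a group that can never be scheduled onto the hardware.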
Signed-off-by: Peter Zijlstra Cc: Stephane Eranian LKML-Reference: <1254911461.26976.239.camel@twins> Signed-off-by: Ingo Molnar commit b690081d4d3f6a23541493f1682835c3cd5c54a1 Author: Stephane Eranian Date: Tue Oct 6 16:42:09 2009 +0200 perf_events: Add event constraints support for Intel processors On some Intel processors, not all events can be measured in all counters. Some events can only be measured in one particular counter, for instance. Assigning an event to the wrong counter does not crash the machine but it yields bogus counts, i.e., a silent error. This patch changes the event to counter assignment logic to take into account event constraints for Intel P6, Core and Nehalem processors. There are no constraints on Intel Atom. There are constraints on Intel Yonah (Core Duo) but they are not provided in this patch given that this processor is not yet supported by perf_events. As a result of the constraints, it is possible for some event groups to never actually be loaded onto the PMU if they contain two events which can only be measured on a single counter. That situation can be detected with the scaling information extracted with read(). Signed-off-by: Stephane Eranian Signed-off-by: Peter Zijlstra LKML-Reference: <1254840129-6198-3-git-send-email-eranian@gmail.com> Signed-off-by: Ingo Molnar commit 04a705df47d1ea27ca2b066f24b1951c51792d0d Author: Stephane Eranian Date: Tue Oct 6 16:42:08 2009 +0200 perf_events: Check for filters on fixed counter events Intel fixed counters do not support all the filters possible with a generic counter. Thus, if a fixed counter event is passed but with certain filters set, then the fixed_mode_idx() function must fail and the event must be measured in a generic counter instead. Rejected filters are: inv, edge, cnt-mask.
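The filter check described above amounts to testing a few bits of the raw event configuration. The sketch below assumes the Intel architectural event-select layout (edge detect at bit 18, invert at bit 23, counter mask in bits 24-31); the macro names and `fixed_counter_usable()` are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed event-select bit positions; verify against the Intel SDM. */
#define EVTSEL_EDGE	(1ULL << 18)	/* edge detect filter */
#define EVTSEL_INV	(1ULL << 23)	/* invert counter mask comparison */
#define EVTSEL_CMASK	(0xffULL << 24)	/* counter mask (cnt-mask) field */

/*
 * Hypothetical helper: a fixed counter is only usable when none of the
 * filters it cannot implement (inv, edge, cnt-mask) are requested.
 * When this returns false, the event should go to a generic counter.
 */
static bool fixed_counter_usable(uint64_t config)
{
	return (config & (EVTSEL_EDGE | EVTSEL_INV | EVTSEL_CMASK)) == 0;
}
```

For example, a plain unhalted-core-cycles configuration (event code 0x3c, no filters) passes the check, while the same event with a counter mask of 1 does not and would have to fall back to a generic counter.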
Signed-off-by: Stephane Eranian Signed-off-by: Peter Zijlstra LKML-Reference: <1254840129-6198-2-git-send-email-eranian@gmail.com> Signed-off-by: Ingo Molnar commit 5a943617ef52e9f79cd7cf437aad8870be27aabb Author: John Kacur Date: Thu Oct 8 17:20:15 2009 +0200 x86, cpuid: Simplify the code in cpuid_open Peter picked up my patch for tip/x86/cpu that removes the bkl in cpuid_open. Ingo subsequently merged that into tip/master. This patch folds back in tglx's 55968ede164ae523692f00717f50cd926f1382a0 to my patch that removed the bkl. This simplifies the code, and makes it consistent with the changes to kill the bkl in msr.c as well. Originally-by: Thomas Gleixner Signed-off-by: John Kacur Signed-off-by: H. Peter Anvin commit 26dd2cb074d9dc41c9e3cddd7bf175fd0a41febc Author: Frederic Weisbecker Date: Thu Oct 8 22:07:29 2009 +0200 perf tools: Provide backward compatibility with previous perf.data version We have merged the trace.info file into perf.data by adding one section in the perf headers. This makes it incompatible with previous version: the new perf tools can't read the older perf.data. To support the previous format, we check the headers size. If they have the same size than in the previous format, then ignore the trace info section that doesn't exist. Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras LKML-Reference: <1255032449-12022-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 97ea1a7fa62af0d8d49a0fc12796b0073537c9d8 Author: Frederic Weisbecker Date: Thu Oct 8 21:04:17 2009 +0200 perf tools: Fix thread comm resolution in perf sched This reverts commit 9a92b479b2f088ee2d3194243f4c8e59b1b8c9c2 ("perf tools: Improve thread comm resolution in perf sched") and fixes the real bug. The bug was elsewhere: We are failing to resolve thread names in perf sched because the table of threads we are building, on top of comm events, has a per process granularity. 
But perf sched, unlike the other perf tools, needs a per thread granularity as we are profiling every task individually. So fix it by building our threads table using the tid instead of the pid as the thread identifier. v2: Revert the previous fix - it is not really needed Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 2e538c4a1847291cf01218d4fe7bb4dc60fef7cf Author: Arnaldo Carvalho de Melo Date: Wed Oct 7 13:48:56 2009 -0300 perf tools: Improve kernel/modules symbol lookup This removes the overlapping of vmlinux addresses with modules, using the ELF section name when using --vmlinux and creating a unique DSO name when using /proc/kallsyms ([kernel].N). This is done by creating multiple 'struct map' instances for address ranges backed by DSOs that have just the symbols for that range and a name that is derived from the ELF section name. Now it is possible to ask for just the symbols in some particular kernel section: $ perf report -m --vmlinux ../build/tip-recvmmsg/vmlinux \ --dsos [kernel].vsyscall_fn | head -15 52.73% Xorg [.] vread_hpet 18.61% firefox [.] vread_hpet 14.50% npviewer.bin [.] vread_hpet 6.83% compiz [.] vread_hpet 5.73% glxgears [.] vread_hpet 0.63% java [.] vread_hpet 0.30% gnome-terminal [.] vread_hpet 0.23% perf [.] vread_hpet 0.18% xchat [.] vread_hpet $ Now we don't have to first look up the list of modules and then, if that fails, the vmlinux symbols; it's just a simple lookup for the map and then the symbols, just like for threads. Reports generated using /proc/kallsyms and --vmlinux should provide the same results, modulo the DSO name for sections other than ".text".
But they don't right now because things like: ffffffff81011c20-ffffffff81012068 system_call ffffffff81011c30-ffffffff81011c9b system_call_after_swapgs ffffffff81011c9c-ffffffff81011cb6 system_call_fastpath ffffffff81011cb7-ffffffff81011cbb ret_from_sys_call I.e. overlapping symbols, again some ASM special case that we have to fixup. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1254934136-8503-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit da21d1b547cbaa2c026cf645753651c25d340923 Author: Arnaldo Carvalho de Melo Date: Wed Oct 7 10:49:00 2009 -0300 perf tools: Up the verbose level for some really verbose stuff Like printing every symbol created. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Paul Mackerras Cc: Mike Galbraith LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar commit 9a92b479b2f088ee2d3194243f4c8e59b1b8c9c2 Author: Frederic Weisbecker Date: Thu Oct 8 16:37:12 2009 +0200 perf tools: Improve thread comm resolution in perf sched When we get sched traces that involve a task that was already created before opening the event, we won't have the comm event for it. So if we can't find the comm event for a given thread, we look at the traces that may contain these informations. 
    Before:

     ata/1:371         |    0.000 ms |  1 | avg: 3988.693 ms | max: 3988.693 ms |
     kondemand/1:421   |    0.096 ms |  3 | avg:  345.346 ms | max: 1035.989 ms |
     kondemand/0:420   |    0.025 ms |  3 | avg:  421.332 ms | max:  964.014 ms |
     :5124:5124        |    0.103 ms |  5 | avg:   74.082 ms | max:  277.194 ms |
     :6244:6244        |    0.691 ms |  9 | avg:  125.655 ms | max:  271.306 ms |
     firefox:5080      |    0.924 ms |  5 | avg:   53.833 ms | max:  257.828 ms |
     npviewer.bin:6225 |   21.871 ms | 53 | avg:   22.462 ms | max:  220.835 ms |
     :6245:6245        |    9.631 ms | 21 | avg:   41.864 ms | max:  213.349 ms |

    After:

     ata/1:371         |    0.000 ms |  1 | avg: 3988.693 ms | max: 3988.693 ms |
     kondemand/1:421   |    0.096 ms |  3 | avg:  345.346 ms | max: 1035.989 ms |
     kondemand/0:420   |    0.025 ms |  3 | avg:  421.332 ms | max:  964.014 ms |
     firefox:5124      |    0.103 ms |  5 | avg:   74.082 ms | max:  277.194 ms |
     npviewer.bin:6244 |    0.691 ms |  9 | avg:  125.655 ms | max:  271.306 ms |
     firefox:5080      |    0.924 ms |  5 | avg:   53.833 ms | max:  257.828 ms |
     npviewer.bin:6225 |   21.871 ms | 53 | avg:   22.462 ms | max:  220.835 ms |
     npviewer.bin:6245 |    9.631 ms | 21 | avg:   41.864 ms | max:  213.349 ms |

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com>
    Signed-off-by: Ingo Molnar

commit 016e92fbc9ef33689cf654f343a94383d43235e7
Author: Frederic Weisbecker
Date:   Wed Oct 7 12:47:31 2009 +0200

    perf tools: Unify perf.data mapping and events handling

    This librarizes the perf.data file mapping and handling in various perf tools, reducing the amount of code and fixing the places that mmap from the beginning of the file whereas we want to mmap from the beginning of the data. Mapping from the start of the file leads to a page fault because the mmap window is too small, since the trace info is written in the file too.
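The mapping fix described above comes down to page-aligning the data offset before mapping the file. A minimal sketch of that arithmetic, written in Python rather than the tools' C; the function name `aligned_window` and its parameters are illustrative assumptions, not names taken from the perf sources:

```python
import mmap


def aligned_window(data_offset, page_size=mmap.PAGESIZE):
    """Split a file offset into a page-aligned mmap start and the
    remaining offset inside the mapped buffer (hypothetical helper)."""
    shift = data_offset % page_size   # bytes between the page start and the data
    map_start = data_offset - shift   # always a multiple of page_size
    return map_start, shift


if __name__ == "__main__":
    # With 4 KiB pages, data that starts at byte 10000 is mapped from
    # byte 8192 and read starting 1808 bytes into the mapping.
    print(aligned_window(10000, 4096))
```

Starting the window at `map_start` keeps the first event inside the mapping no matter how large the header and trace info sections grow.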
    TODO:

    - convert perf timechart too

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arjan van de Ven
    LKML-Reference: <20091007104729.GD5043@nowhere>
    Signed-off-by: Ingo Molnar

commit 170a0bc3808909d8ea0f3f9c725c6565efe7f9c4
Author: John Kacur
Date:   Wed Oct 7 20:19:32 2009 +0200

    x86, cpuid: Remove the bkl from cpuid_open()

    Most of the variables are local to the function. It IS possible that, for struct cpuinfo_x86 *c, c could point to the same area. However, this is used read-only.

    Signed-off-by: John Kacur
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

commit d6c304055b3cecd4ca865769ac7cea97a320727b
Author: Frederic Weisbecker
Date:   Wed Oct 7 21:43:22 2009 +0200

    x86, msr: Remove the bkl from msr_open()

    Remove the big kernel lock from msr_open() as it doesn't protect anything there.

    The only racy event that can happen here is a concurrent cpu shutdown. So let's look at what could be racy during/after the above event:

    - The cpu_online() check is racy, but the bkl doesn't help with that anyway: it disables preemption, but we may be checking a cpu other than the current one. Also the cpu can still become offlined between open and read calls.

    - The cpu_data(cpu) returns a safe pointer too. It won't be released on cpu offlining. But some fields can be changed from arch/x86/kernel/smpboot.c:remove_siblinginfo() :

      - phys_proc_id
      - cpu_core_id

      Those are not read from msr_open(). What we are checking is the x86_capability that is left untouched on offlining.

    So this removal looks safe.

    Signed-off-by: Frederic Weisbecker
    Cc: John Kacur
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Sven-Thorsten Dietrich
    LKML-Reference: <1254944602-7382-1-git-send-email-fweisbec@gmail.com>
    Signed-off-by: H. Peter Anvin

commit 941fc5b2bf8f7dd1d0a9c502e152fa719ff6578e
Author: Stephen Smalley
Date:   Thu Oct 1 14:48:23 2009 -0400

    selinux: drop remapping of netlink classes

    Drop remapping of netlink classes and bypass of permission checking based on netlink message type for policy version < 18. This removes compatibility code introduced when the original single netlink security class used for all netlink sockets was split into finer-grained netlink classes based on netlink protocol and when permission checking was added based on netlink message type in Linux 2.6.8. The only known distribution that shipped with SELinux and policy < 18 was Fedora Core 2, which was EOL'd on 2005-04-11.

    Given that the remapping code was never updated to address the addition of newer netlink classes, that the corresponding userland support was dropped in 2005, and that the assumptions made by the remapping code about the fixed ordering among netlink classes in the policy may be violated in the future due to the dynamic class/perm discovery support, we should drop this compatibility code now.

    Signed-off-by: Stephen Smalley
    Signed-off-by: James Morris

commit 8753f6bec352392b52ed9b5e290afb34379f4612
Author: Stephen Smalley
Date:   Wed Sep 30 13:41:02 2009 -0400

    selinux: generate flask headers during kernel build

    Add a simple utility (scripts/selinux/genheaders) and invoke it to generate the kernel-private class and permission indices in flask.h and av_permissions.h automatically during the kernel build from the security class mapping definitions in classmap.h. Adding new kernel classes and permissions can then be done just by adding them to classmap.h.
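As an illustration of the idea only, the following Python sketch derives class indices and permission bits from a single class map. The real utility is the C program scripts/selinux/genheaders; the sample `classmap` contents, the function name `gen_headers`, and the exact output format are assumptions made for this example.

```python
def gen_headers(classmap):
    """Return (flask_lines, av_permission_lines) for a class map given as
    a list of (class_name, [permission_name, ...]) pairs.
    Hypothetical helper, not the kernel's genheaders."""
    flask, av_perms = [], []
    # Index 0 is left unused here so that "mapped to zero" can mean "undefined".
    for class_index, (cls, perms) in enumerate(classmap, start=1):
        flask.append(f"#define SECCLASS_{cls.upper()} {class_index}")
        for bit, perm in enumerate(perms):
            # Each permission gets its own bit within the class's access vector.
            av_perms.append(
                f"#define {cls.upper()}__{perm.upper()} 0x{1 << bit:08x}UL"
            )
    return flask, av_perms


if __name__ == "__main__":
    sample = [("file", ["read", "write"]), ("dir", ["search"])]
    for line in sum(gen_headers(sample), []):
        print(line)
```

Adding a class or permission then means touching only the one table; the numeric values follow from position.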
Signed-off-by: Stephen Smalley Signed-off-by: James Morris commit c6d3aaa4e35c71a32a86ececacd4eea7ecfc316c Author: Stephen Smalley Date: Wed Sep 30 13:37:50 2009 -0400 selinux: dynamic class/perm discovery Modify SELinux to dynamically discover class and permission values upon policy load, based on the dynamic object class/perm discovery logic from libselinux. A mapping is created between kernel-private class and permission indices used outside the security server and the policy values used within the security server. The mappings are only applied upon kernel-internal computations; similar mappings for the private indices of userspace object managers is handled on a per-object manager basis by the userspace AVC. The interfaces for compute_av and transition_sid are split for kernel vs. userspace; the userspace functions are distinguished by a _user suffix. The kernel-private class indices are no longer tied to the policy values and thus do not need to skip indices for userspace classes; thus the kernel class index values are compressed. The flask.h definitions were regenerated by deleting the userspace classes from refpolicy's definitions and then regenerating the headers. Going forward, we can just maintain the flask.h, av_permissions.h, and classmap.h definitions separately from policy as they are no longer tied to the policy values. The next patch introduces a utility to automate generation of flask.h and av_permissions.h from the classmap.h definitions. The older kernel class and permission string tables are removed and replaced by a single security class mapping table that is walked at policy load to generate the mapping. The old kernel class validation logic is completely replaced by the mapping logic. The handle unknown logic is reworked. reject_unknown=1 is handled when the mappings are computed at policy load time, similar to the old handling by the class validation logic. 
    allow_unknown=1 is handled when computing and mapping decisions - if the permission was not able to be mapped (i.e. undefined, mapped to zero), then it is automatically added to the allowed vector. If the class was not able to be mapped (i.e. undefined, mapped to zero), then all permissions are allowed for it if allow_unknown=1.

    avc_audit leverages the new security class mapping table to lookup the class and permission names from the kernel-private indices.

    The mdp program is updated to use the new table when generating the class definitions and allow rules for a minimal boot policy for the kernel. It should be noted that this policy will not include any userspace classes, nor will its policy index values for the kernel classes correspond with the ones in refpolicy (they will instead match the kernel-private indices).

    Signed-off-by: Stephen Smalley
    Signed-off-by: James Morris

commit 03456a158d9067d2f657bec170506009db81756d
Author: Frederic Weisbecker
Date:   Tue Oct 6 23:36:47 2009 +0200

    perf tools: Merge trace.info content into perf.data

    This drops the trace.info file and moves its contents into the common perf.data file. This is done by creating a new trace_info section in this file. A user of perf headers needs to call perf_header__set_trace_info() to save the trace meta information into the perf.data file.

    A file created by perf after this patch is unsupported by previous versions because the size of the headers has increased.

    That said, only two new fields have been added at the end of the headers, and those could be ignored by previous versions if they handled the dynamic header size and ignored the unknown part. The offsets guarantee the compatibility.

    We'll do a -stable fix for that. But current previous versions handle the header size using its static size, not the dynamic one, so the new format is not backward compatible with trace records.
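The compatibility argument above (trust the header size stored in the file and ignore fields you do not know) can be sketched as follows. The layout used here, an 8-byte size followed by 8-byte little-endian fields, is entirely hypothetical and is not the real perf.data header format; `parse_header` and `KNOWN` are names invented for the example.

```python
import struct

# Hypothetical fields an "old" reader understands: data_offset, data_size.
KNOWN = struct.Struct("<QQ")


def parse_header(buf):
    """Parse the known fields and report how many trailing header bytes
    were skipped because a newer writer appended fields."""
    (stored_size,) = struct.unpack_from("<Q", buf, 0)  # size recorded in the file
    if stored_size < KNOWN.size:
        raise ValueError("header smaller than the fields this reader needs")
    data_offset, data_size = KNOWN.unpack_from(buf, 8)
    ignored = stored_size - KNOWN.size                 # unknown tail, skipped
    return {"data_offset": data_offset, "data_size": data_size, "ignored": ignored}


if __name__ == "__main__":
    # A newer writer appended two extra 8-byte fields (values 7 and 9).
    new_file = struct.pack("<QQQQQ", 32, 100, 200, 7, 9)
    print(parse_header(new_file))
```

A reader that instead compares the stored size against its own compiled-in struct size rejects the newer file, which is the incompatibility the message describes.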
Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Paul Mackerras Cc: Mike Galbraith Cc: Paul Mackerras LKML-Reference: <20091006213643.GA5343@nowhere> Signed-off-by: Ingo Molnar commit b209aa1f83964d49a332a7b6b818ebede5cdc6ef Author: Frederic Weisbecker Date: Tue Oct 6 21:21:26 2009 +0200 perf tools: Start the perf.data mapping at data offset in perf trace Currently, we are mapping perf.data in the beginning of the file and use the data offset as a buffer offset. This may exceed the mapping area if the data offset is upper than page_size * mmap_window and result in a page fault (thing that happen if we merge trace.info in perf.data). Instead, let's start the mapping in the page that matches our data offset. v2: Drop a junk from another patch (trace_report() removal) Signed-off-by: Frederic Weisbecker Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Mike Galbraith Cc: Paul Mackerras Cc: Tom Zanussi LKML-Reference: <1254856886-10348-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar commit 42e59d7d19dc4b49feab2a860fd9a8ca3248c833 Author: Ingo Molnar Date: Tue Oct 6 15:14:21 2009 +0200 perf tools: Default to 1 KHz auto-sampling freq events Use auto-freq events by default in perf record and perf top. This allows more consistent hardware event sampling, regardless of the intensity of the underlying event. It also keeps us from over-sampling on larger/busier systems. (also make surrounding initializations more consistent) Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker LKML-Reference: Signed-off-by: Ingo Molnar commit 064739bc4b3d7f424b2f25547e6611bcf0132415 Author: Tom Zanussi Date: Tue Oct 6 01:09:52 2009 -0500 perf trace: Add string/dynamic cases to format_flags Needed for distinguishing string fields in event stream processing. 
Signed-off-by: Tom Zanussi Acked-by: Frederic Weisbecker Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: hch@infradead.org Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo LKML-Reference: <1254809398-8078-4-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 2774601811bedd04ee7e38624343ea80b4a62d7e Author: Tom Zanussi Date: Tue Oct 6 01:09:51 2009 -0500 perf trace: Add subsystem string to struct event Needed to fully qualify event names for event stream processing. Signed-off-by: Tom Zanussi Acked-by: Frederic Weisbecker Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: hch@infradead.org Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo LKML-Reference: <1254809398-8078-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit 26a50744b21fff65bd754874072857bee8967f4d Author: Tom Zanussi Date: Tue Oct 6 01:09:50 2009 -0500 tracing/events: Add 'signed' field to format files The sign info used for filters in the kernel is also useful to applications that process the trace stream. Add it to the format files and make it available to userspace. Signed-off-by: Tom Zanussi Acked-by: Frederic Weisbecker Cc: rostedt@goodmis.org Cc: lizf@cn.fujitsu.com Cc: hch@infradead.org Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo LKML-Reference: <1254809398-8078-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar commit d9b2002c406011164f245de7a81304625989f1c9 Merge: c3b32fc 906010b Author: Ingo Molnar Date: Tue Oct 6 15:02:30 2009 +0200 Merge branch 'perf/urgent' into perf/core Merge reason: Upcoming patch is dependent on a fix in perf/urgent. Signed-off-by: Ingo Molnar commit cf82ff7ea7695b0e82ba07bc5e9f1bd03a74e1aa Author: Jayson R. 
King Date: Mon Oct 5 05:21:26 2009 -0500 sched: Remove obsolete comment in sched_init() Remove the comment about calling alloc_bootmem() as it is not called here since commit 36b7b6d465489c4754c4fd66fcec6086eba87896. Signed-off-by: Jayson R. King Cc: Peter Zijlstra Cc: Jiri Kosina LKML-Reference: <4AC9C8A6.6010209@jaysonking.com> Signed-off-by: Ingo Molnar commit c3b32fcbc7f4fd9a9b84718b991b175b0fd53f8c Author: Arnaldo Carvalho de Melo Date: Mon Oct 5 14:26:16 2009 -0300 perf report: Use kernel_maps__find_symbol as fallback to find vdsos, etc In resolve_symbol, as we're moving to breaking the kernel symbols list per address ranges, i.e. kernel linking sections, so that we don't have a big kernel_map that in its range covers what is in the modules. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Peter Zijlstra Cc: Mike Galbraith LKML-Reference: Signed-off-by: Ingo Molnar commit a2a99e8e12798706ec1026e5d8fc36f7c86122ce Author: Arnaldo Carvalho de Melo Date: Mon Oct 5 14:26:18 2009 -0300 perf tools: /proc/modules names don't always match its name $ cut -d' ' -f1 /proc/modules|grep _|wc -l 29 $ cut -d' ' -f1 /proc/modules|grep _|sed 's/$/.ko'/g|while read n;do find /lib/modules/`uname -r` -name $n;done|wc -l 12 For instance: $ grep ^aes_x86 /proc/modules aes_x86_64 9056 2 - Live 0xffffffffa0091000 $ l /lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko -rw-r--r-- 1 root root 136438 2009-09-22 19:05 /lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko Handle that by introducing a strxfrchar routine that replaces dashes with underscores when matching file names to loaded modules. 
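The name mismatch shown above can be handled by normalizing the file name before comparing it with the /proc/modules entry. A minimal Python sketch of that idea; the perf routine itself is C, and the signature and the helper `module_name_from_path` used here are assumptions for illustration:

```python
import os


def strxfrchar(s, from_char, to_char):
    """Return s with every from_char replaced by to_char (sketch of the
    routine named in the commit message; signature assumed)."""
    return s.replace(from_char, to_char)


def module_name_from_path(path):
    """Turn a .ko path into the name expected in /proc/modules
    (hypothetical helper)."""
    base = os.path.basename(path)
    if base.endswith(".ko"):
        base = base[: -len(".ko")]
    return strxfrchar(base, "-", "_")


if __name__ == "__main__":
    ko = "/lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko"
    print(module_name_from_path(ko))
```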
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

commit af427bf529c5991be8d1a36f43e2d0141f532f63
Author: Arnaldo Carvalho de Melo
Date:   Mon Oct 5 14:26:17 2009 -0300

    perf tools: Create maps for modules when processing kallsyms

    So that we get kallsyms processing closer to vmlinux + modules symtabs processing.

    One change in behaviour: just as -m must be used to ask for modules when one specifies --vmlinux, it is now required for kallsyms as well.

    Also continue if one manages to load the vmlinux data but module processing fails, so that at least some analysis can be done with part of the needed symbols.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

commit 5c2068059a0e852f72b7c2608d92170b752d821f
Author: Arnaldo Carvalho de Melo
Date:   Mon Oct 5 14:26:15 2009 -0300

    perf top: Keep the default of asking for kernel module symbols

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

commit ec218fc4a796a1b584741d59ef22615d96981188
Author: Arnaldo Carvalho de Melo
Date:   Sat Oct 3 20:30:48 2009 -0300

    perf tools: Remove show_mask bitmask

    As it was not being exposed via any command line and with --dsos/--comms we can do this and even more, like asking for just kernel + some module:

    [root@doppio linux-2.6-tip]# perf report --dsos \[kernel\],\[drm\] --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules | head -15
    # Samples: 619669
    #
    # Overhead          Command  Shared Object  Symbol
    # ........  ...............  .............  ......
    #
         7.12%          swapper  [kernel]       [k] read_hpet
         6.86%             init  [kernel]       [k] read_hpet
         6.22%             init  [kernel]       [k] mwait_idle_with_hints
         5.34%          swapper  [kernel]       [k] mwait_idle_with_hints
         3.01%          firefox  [kernel]       [.] vread_hpet
         2.14%             Xorg  [drm]          [k] drm_clflush_pages
         2.09%           pidgin  [kernel]       [.] vread_hpet
         1.58%     npviewer.bin  [kernel]       [.] vread_hpet
         1.37%          swapper  [kernel]       [k] hpet_next_event
         1.23%             Xorg  [kernel]       [k] read_hpet
    [root@doppio linux-2.6-tip]#

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    LKML-Reference: <20091003233048.GA30535@ghostprotocols.net>
    Signed-off-by: Ingo Molnar

commit 9735abf11bec48bfbbb1b54772a02deb2ae0c403
Author: Arnaldo Carvalho de Melo
Date:   Sat Oct 3 10:42:45 2009 -0300

    perf tools: Move hist_entry__add common code to hist.c

    Now perf report and annotate do the callgraph/hit processing in their specialized hist_entry__add functions.

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Frédéric Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

commit 88f70d7590538e427c8405a2e02ac2624847386c
Author: Masami Hiramatsu
Date:   Fri Sep 25 11:20:54 2009 -0700

    tracing/ftrace: Fix to check create_event_dir() when adding new events

    Check the result of event_create_dir() and add ftrace_event_call to the ftrace_events list only if it succeeded. Thanks to Li for pointing it out.

    Signed-off-by: Masami Hiramatsu
    Acked-by: Steven Rostedt
    Acked-by: Ingo Molnar
    Cc: Jim Keniston
    Cc: Ananth N Mavinakayanahalli
    Cc: Andi Kleen
    Cc: Christoph Hellwig
    Cc: Frank Ch. Eigler
    Cc: H. Peter Anvin
    Cc: Jason Baron
    Cc: K.Prasad
    Cc: Lai Jiangshan
    Cc: Li Zefan
    Cc: Peter Zijlstra
    Cc: Srikar Dronamraju
    Cc: Tom Zanussi
    LKML-Reference: <20090925182054.10157.55219.stgit@omoto>
    Signed-off-by: Frederic Weisbecker

commit c0b11d3af164947c71e2491912c5b8418900dafb
Author: Masami Hiramatsu
Date:   Fri Sep 25 11:20:38 2009 -0700

    x86: Add VIA processor instructions in opcodes decoder

    Add VIA processor's Padlock instructions (MONTMUL, XSHA1, XSHA256) as parts of the kernel may use them.

    This fixes the following crash in opcodes decoder selftests:

    make[2]: `scripts/unifdef' is up to date.
TEST posttest Error: c145cf71: f3 0f a6 d0 repz xsha256 Error: objdump says 4 bytes, but insn_get_length() says 3 (attr:0) make[1]: *** [posttest] Error 2 make: *** [bzImage] Error 2 Reported-by: Ingo Molnar Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Acked-by: Ingo Molnar Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Tom Zanussi LKML-Reference: <20090925182037.10157.3180.stgit@omoto> Signed-off-by: Frederic Weisbecker commit a1a138d05fa060ac4238c19a1e890aacc25ed3ba Author: Masami Hiramatsu Date: Fri Sep 25 11:20:12 2009 -0700 tracing/kprobes: Use global event perf buffers in kprobe tracer Use new percpu global event buffer instead of stack in kprobe tracer while tracing through perf. Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Acked-by: Ingo Molnar Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Tom Zanussi LKML-Reference: <20090925182011.10157.60140.stgit@omoto> Signed-off-by: Frederic Weisbecker commit 98059e3463383b18fd79181179cd539b74846b47 Author: Matteo Croce Date: Thu Oct 1 17:11:10 2009 +0200 x86: AMD Geode LX optimizations Add CPU optimizations for AMD Geode LX. Signed-off-by: Matteo Croce LKML-Reference: <40101cc30910010811v5d15ff4cx9dd57c9cc9b4b045@mail.gmail.com> Signed-off-by: H. 
Peter Anvin commit 63312b6a6faae3f2e5577f2b001e3b504f10a2aa Author: Arjan van de Ven Date: Fri Oct 2 07:50:50 2009 -0700 x86: Add a Kconfig option to turn the copy_from_user warnings into errors For automated testing it is useful to have the option to turn the warnings on copy_from_user() etc checks into errors: In function ‘copy_from_user’, inlined from ‘fd_copyin’ at drivers/block/floppy.c:3080, inlined from ‘fd_ioctl’ at drivers/block/floppy.c:3503: linux/arch/x86/include/asm/uaccess_32.h:213: error: call to ‘copy_from_user_overflow’ declared with attribute error: copy_from_user buffer size is not provably correct Signed-off-by: Arjan van de Ven Cc: Linus Torvalds Cc: Andrew Morton LKML-Reference: <20091002075050.4e9f7641@infradead.org> Signed-off-by: Ingo Molnar commit 439d473b4777de510e1322168ac6f2f377ecd5bc Author: Arnaldo Carvalho de Melo Date: Fri Oct 2 03:29:58 2009 -0300 perf tools: Rewrite and improve support for kernel modules Representing modules as struct map entries, backed by a DSO, etc, using /proc/modules to find where the module is loaded. DSOs now can have a short and long name, so that in verbose mode we can show exactly which .ko or vmlinux image was used. As kernel modules now are a DSO separate from the kernel, we can ask for just the hits for a particular set of kernel modules, just like we can do with shared libraries: [root@doppio linux-2.6-tip]# perf report -n --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15 84.58% 13266 Xorg [k] drm_clflush_pages 4.02% 630 Xorg [k] trace_kmalloc.clone.0 3.95% 619 Xorg [k] drm_ioctl 2.07% 324 Xorg [k] drm_addbufs 1.68% 263 Xorg [k] drm_gem_close_ioctl 0.77% 120 Xorg [k] drm_setmaster_ioctl 0.70% 110 Xorg [k] drm_lastclose 0.68% 106 Xorg [k] drm_open 0.54% 85 Xorg [k] drm_mm_search_free [root@doppio linux-2.6-tip]# Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko would have the same effect. 
Allowing specifying just 'drm.ko' is left for another patch. Processing kallsyms so that per kernel module struct map are instantiated was also left for another patch. That will allow removing the module name from each of its symbols. struct symbol was reduced by removing the ->module backpointer and moving it (well now the map) to struct symbol_entry in perf top, that is its only user right now. The total linecount went down by ~500 lines. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: "H. Peter Anvin" Cc: Peter Zijlstra Cc: Mike Galbraith Cc: Avi Kivity Signed-off-by: Ingo Molnar commit 4a3127693001c61a21d1ce680db6340623f52e93 Author: Arjan van de Ven Date: Wed Sep 30 13:05:23 2009 +0200 x86: Turn the copy_from_user check into an (optional) compile time warning A previous patch added the buffer size check to copy_from_user(). One of the things learned from analyzing the result of the previous patch is that in general, gcc is really good at proving that the code contains sufficient security checks to not need to do a runtime check. But that for those cases where gcc could not prove this, there was a relatively high percentage of real security issues. This patch turns the case of "gcc cannot prove" into a compile time warning, as long as a sufficiently new gcc is in use that supports this. The objective is that these warnings will trigger developers checking new cases out before a security hole enters a linux kernel release. Signed-off-by: Arjan van de Ven Cc: Linus Torvalds Cc: "David S. Miller" Cc: James Morris Cc: Jan Beulich LKML-Reference: <20090930130523.348ae6c4@infradead.org> Signed-off-by: Ingo Molnar commit 0aa73ba1c4e1ad1d51a29e0df95ccd9f746918b6 Merge: 925936e 3397409 Author: Ingo Molnar Date: Thu Oct 1 11:20:33 2009 +0200 Merge branch 'tracing/urgent' into tracing/core Merge reason: Pick up latest fixes and update to latest upstream. 
Signed-off-by: Ingo Molnar commit 2ccdc450e658053681202d42ac64b3638f22dc1a Author: Arnaldo Carvalho de Melo Date: Thu Sep 24 14:24:00 2009 -0700 perf top: Remove dead {min,max}_ip unused variables Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: "H. Peter Anvin" Cc: Peter Zijlstra Cc: Mike Galbraith LKML-Reference: <20090924212400.GA15321@ghostprotocols.net> Signed-off-by: Ingo Molnar commit 23acb98de5a4109a60b5fe3f0439389218b039d7 Author: Rajiv Andrade Date: Wed Sep 30 12:26:55 2009 -0300 TPM: fix pcrread The previously sent patch: http://marc.info/?l=tpmdd-devel&m=125208945007834&w=2 Had its first hunk cropped when merged, submitting only this first hunk again. Signed-off-by: Jason Gunthorpe Cc: Debora Velarde Cc: Marcel Selhorst Cc: James Morris Signed-off-by: Andrew Morton Signed-off-by: Rajiv Andrade Acked-by: Mimi Zohar Tested-by: Mimi Zohar Signed-off-by: James Morris commit cad3071424edd7854f63aa80d09473e84f49ed79 Author: Arnaldo Carvalho de Melo Date: Mon Sep 28 17:08:18 2009 -0300 perf trace: Remove dead code Several variables are not used at all, cut'n'paste leftovers. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: Peter Zijlstra Cc: Mike Galbraith Cc: "H. Peter Anvin" LKML-Reference: <20090928200818.GF3361@ghostprotocols.net> Signed-off-by: Ingo Molnar commit a80deb622dba7dfb65d9e27b6b74b7c1963c3635 Author: Arnaldo Carvalho de Melo Date: Mon Sep 28 15:23:51 2009 -0300 perf sched: Remove dead code Several variables are not used at all, cut'n'paste leftovers. Also check if the sample_type is RAW earlier, to avoid needless searches. Signed-off-by: Arnaldo Carvalho de Melo Cc: Frédéric Weisbecker Cc: "H. Peter Anvin" Cc: Peter Zijlstra Cc: Mike Galbraith Signed-off-by: Ingo Molnar commit 1b46cddfccfec4cc67b187fb53d78198de6a057c Author: Arnaldo Carvalho de Melo Date: Mon Sep 28 14:48:46 2009 -0300 perf tools: Use rb_tree for maps Threads can have many and kernel modules will be represented as a tree of maps as well. 
    Ah, and for a perf.data with 146607 samples:

    Before:

    [root@doppio ~]# perf stat -r 5 perf report > /dev/null

     Performance counter stats for 'perf report' (5 runs):

         699.823680  task-clock-msecs   #      0.991 CPUs    ( +-   0.454% )
                 74  context-switches   #      0.000 M/sec   ( +-   1.709% )
                  2  CPU-migrations     #      0.000 M/sec   ( +-  17.008% )
              23114  page-faults        #      0.033 M/sec   ( +-   0.000% )
         1381257019  cycles             #   1973.721 M/sec   ( +-   0.290% )
         1456894438  instructions       #      1.055 IPC     ( +-   0.007% )
           18779818  cache-references   #     26.835 M/sec   ( +-   0.380% )
             641799  cache-misses       #      0.917 M/sec   ( +-   1.200% )

        0.705972729  seconds time elapsed   ( +-   0.501% )

    [root@doppio ~]#

    After

     Performance counter stats for 'perf report' (5 runs):

         691.261451  task-clock-msecs   #      0.993 CPUs    ( +-   0.307% )
                 72  context-switches   #      0.000 M/sec   ( +-   0.829% )
                  6  CPU-migrations     #      0.000 M/sec   ( +-  18.409% )
              23127  page-faults        #      0.033 M/sec   ( +-   0.000% )
         1366395876  cycles             #   1976.670 M/sec   ( +-   0.153% )
         1443136016  instructions       #      1.056 IPC     ( +-   0.012% )
           17956402  cache-references   #     25.976 M/sec   ( +-   0.325% )
             661924  cache-misses       #      0.958 M/sec   ( +-   1.335% )

        0.696127275  seconds time elapsed   ( +-   0.377% )

    I.e. we see some speedup too.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: "H. Peter Anvin"
    LKML-Reference: <20090928174846.GA3361@ghostprotocols.net>
    Signed-off-by: Ingo Molnar

commit 3d1d07ecd2009f65cb2091563fa21f9600c36774
Author: John Kacur
Date:   Mon Sep 28 15:32:55 2009 +0200

    perf tools: Put common histogram functions in their own file

    Move histogram related functions into their own files (hist.c and hist.h) and make use of them in builtin-annotate.c and builtin-report.c.
Signed-off-by: John Kacur Acked-by: Frederic Weisbecker Cc: Peter Zijlstra LKML-Reference: Signed-off-by: Ingo Molnar commit af8ff04917169805b151280155bf772d3ca9bec0 Author: Eric Paris Date: Sun Sep 20 21:23:01 2009 -0400 SELinux: reset the security_ops before flushing the avc cache This patch resets the security_ops to the secondary_ops before it flushes the avc. It's still possible that a task on another processor could have already passed the security_ops dereference and be executing an selinux hook function which would add a new avc entry. That entry would still not be freed. This should however help to reduce the number of needless avcs the kernel has when selinux is disabled at run time. There is no wasted memory if selinux is disabled on the command line or not compiled. Signed-off-by: Eric Paris Signed-off-by: James Morris commit 1669b049db50fc7f1d4e694fb115a0f408c63fce Merge: 7f36678 17d857b Author: James Morris Date: Wed Sep 30 07:47:33 2009 +1000 Merge branch 'master' into next commit ff60fab71bb3b4fdbf8caf57ff3739ffd0887396 Author: Arjan van de Ven Date: Mon Sep 28 14:21:22 2009 +0200 x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy GCC provides reasonable memset/memcpy functions itself, with __builtin_memset and __builtin_memcpy. For the "unknown" cases, it'll fall back to our current existing functions, but for fixed size versions it'll inline something smart. Quite often that will be the same as we have now, but sometimes it can do something smarter (for example, if the code then sets the first member of a struct, it can do a shorter memset). In addition, and this is more important, gcc knows which registers and such are not clobbered (while for our asm version it pretty much acts like a compiler barrier), so for various cases it can avoid reloading values. 
    The effect on codesize is shown below on my typical laptop .config:

       text    data     bss      dec    hex filename
    5605675 2041100 6525148 14171923 d83f13 vmlinux.before
    5595849 2041668 6525148 14162665 d81ae9 vmlinux.after

    Due to some not-so-good behavior in the gcc 3.x series, this change is only done for GCC 4.x and above.

    Signed-off-by: Arjan van de Ven
    LKML-Reference: <20090928142122.6fc57e9c@infradead.org>
    Signed-off-by: H. Peter Anvin

commit 925936ebf35a95c290e010b784c962164e6728f3
Author: Frederic Weisbecker
Date:   Mon Sep 28 17:12:49 2009 +0200

    tracing: Pushdown the bkl tracepoints calls

    Currently we are calling the bkl tracepoint callbacks just before the bkl lock/unlock operations, i.e. the tracepoint call is not inside a lock_kernel() function but inside a lock_kernel() macro. Hence the bkl trace event header must be included from smp_lock.h. This raises some nasty circular header dependencies:

    linux/smp_lock.h -> trace/events/bkl.h -> trace/define_trace.h -> trace/ftrace.h -> linux/ftrace_event.h -> linux/hardirq.h -> linux/smp_lock.h

    This results in incomplete event declarations, spurious event definitions and other kinds of funny behaviours.

    This is hardly fixable without ugly workarounds. So instead, we push the file name, line number and function name as lock_kernel() parameters, so that we only deal with the trace event header from lib/kernel_lock.c

    This adds two parameters to lock_kernel() and unlock_kernel() but it should be fine wrt performance because this pair does not seem to be called in fast paths.

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Ingo Molnar
    Cc: Li Zefan

commit 9f0cf4adb6aa0bfccf675c938124e68f7f06349d
Author: Arjan van de Ven
Date:   Sat Sep 26 14:33:01 2009 +0200

    x86: Use __builtin_object_size() to validate the buffer size for copy_from_user()

    gcc (4.x) supports the __builtin_object_size() builtin, which reports the size of the object that a pointer points to, when known at compile time.
If the buffer size is not known at compile time, a constant -1 is returned. This patch uses this feature to add a sanity check to copy_from_user(); if the target buffer is known to be smaller than the copy size, the copy is aborted and a WARNing is emitted in memory debug mode. These extra checks compile away when the object size is not known, or if both the buffer size and the copy length are constants. Signed-off-by: Arjan van de Ven LKML-Reference: <20090926143301.2c396b94@infradead.org> Signed-off-by: Ingo Molnar commit 7f366784f5c2b8fc0658b5b374f4c63ee42c789f Author: Rajiv Andrade Date: Thu Sep 24 16:27:46 2009 -0300 TPM: increase default TPM buffer The TPM Working Group requested this communication buffer increase given that a particular TPM vendor can support a TPM_SHA1Start command input bigger than the current size. Signed-off-by: Rajiv Andrade Signed-off-by: James Morris commit 3f6fe06dbf67b46d36fedec502300e04dffeb67a Author: Frederic Weisbecker Date: Thu Sep 24 21:31:51 2009 +0200 tracing/filters: Unify the regex parsing helpers The filter code has stolen the regex parsing function from ftrace to get the regex support. We have duplicated this code, so factorize it in the filter area and make it generally available, as the filter code is the most suited to host this feature. Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Tom Zanussi Cc: Li Zefan commit 1889d20922d14a97b2099fa4d47587217c0ba48b Author: Frederic Weisbecker Date: Thu Sep 24 21:10:44 2009 +0200 tracing/filters: Provide basic regex support This patch provides basic support for regular expressions in filters. 
It supports the following types of regexp: - *match_beginning - *match_middle* - match_end* - !don't match Example: cd /debug/tracing/events/bkl/lock_kernel echo 'file == "*reiserfs*"' > filter echo 1 > enable gedit-4941 [000] 457.735437: lock_kernel: depth: 0, fs/reiserfs/namei.c:334 reiserfs_lookup() sync_supers-227 [001] 461.379985: lock_kernel: depth: 0, fs/reiserfs/super.c:69 reiserfs_sync_fs() sync_supers-227 [000] 461.383096: lock_kernel: depth: 0, fs/reiserfs/journal.c:1069 flush_commit_list() reiserfs/1-1369 [001] 461.479885: lock_kernel: depth: 0, fs/reiserfs/journal.c:3509 flush_async_commits() Every string is now handled as a regexp in the filter framework, which helps to factorize the code for handling both simple strings and regexp comparisons. (The regexp parsing code has been wildly cherry picked from ftrace.c written by Steve.) v2: Simplify the whole and drop the filter_regex file Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Tom Zanussi Cc: Li Zefan commit dd68ada2d417e57b848822a1407b5317a54136c5 Author: John Kacur Date: Thu Sep 24 18:02:49 2009 +0200 perf tools: Create util/sort.and use it Create util/sort.[ch] and move common functionality for builtin-report.c and builtin-annotate.c there, and make use of it. Signed-off-by: John Kacur LKML-Reference: Signed-off-by: Ingo Molnar commit 8b40f521cf1c9750eab0c04da9075e7484675e9c Author: John Kacur Date: Thu Sep 24 18:02:18 2009 +0200 perf tools: Protect header files with a consistent style There was a colorful mix of header guards - standardize them. Signed-off-by: John Kacur LKML-Reference: Signed-off-by: Ingo Molnar commit cbfeb267cb0ff632dbc8ff02685012bee2e87434 Author: John Kacur Date: Thu Sep 24 18:01:51 2009 +0200 perf annotate: Add the cmp_null function and make use of it This function exists in builtin-report.c but not in builtin-annotate.c Functions that use cmp_null are shorter and clearer. 
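To illustrate why a helper such as cmp_null shortens comparison functions, here is a minimal Python sketch. The ordering it uses (two missing values compare equal, a missing value sorts before a present one) and the name `compare_by_name` are assumptions made for illustration; the commit message does not spell out the C implementation.

```python
def cmp_null(left, right):
    """Order two values when at least one of them is missing (None).

    Assumed convention for this sketch: both missing -> equal,
    only the left missing -> left sorts first, otherwise right sorts first.
    """
    if left is None and right is None:
        return 0
    if left is None:
        return -1
    return 1


def compare_by_name(left, right):
    """Hypothetical comparison callback that delegates the None cases."""
    if left is None or right is None:
        return cmp_null(left, right)
    # Ordinary three-way comparison for two present values.
    return (left > right) - (left < right)
```

With the missing-value cases handled in one place, each comparison callback only needs a single guard before its real comparison.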
Synchronizing functions between these two files will also make it easier to potentially share code in the future. Signed-off-by: John Kacur Cc: Peter Zijlstra LKML-Reference: Signed-off-by: Ingo Molnar commit f3f3f0092477d0165f3f1bf0fd518550b2abd097 Author: Frederic Weisbecker Date: Thu Sep 24 15:27:41 2009 +0200 tracing/event: Cleanup the useless dentry variable Clean up the useless dentry variable used while creating a kernel event's set of files. trace_create_file() warns if it fails to create the file anyway, and we don't store the dentry anywhere. v2: Fix a small conflict in kernel/trace/trace_events.c Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Li Zefan commit 737f453fd115ea0c9642ed6b30e37e296a4e3ed7 Author: Frederic Weisbecker Date: Sat Aug 1 03:42:44 2009 +0200 tracing/filters: Cleanup useless headers Clean up the remaining header inclusions that were only useful when the filter framework and its tracing related filesystem user interface weren't yet separated. v2: Keep module.h, needed for EXPORT_SYMBOL_GPL Signed-off-by: Frederic Weisbecker Cc: Tom Zanussi Cc: Steven Rostedt Cc: Li Zefan commit 96a2c464de07d7c72988db851c029b204fc59108 Author: Frederic Weisbecker Date: Sat Aug 1 01:34:24 2009 +0200 tracing/bkl: Add bkl ftrace events Add two events, lock_kernel and unlock_kernel, to trace uses of the bkl. This opens the door for userspace tools to gather statistics about the callsites that use it, dependencies with other locks (by pairing the trace with lock events), recursive use and so on... The {__reacquire,release}_kernel_lock() events are not traced because these are called from schedule, thus the sched events are sufficient to trace them.
Example of a trace: hald-addon-stor-4152 [000] 165.875501: unlock_kernel: depth: 0, fs/block_dev.c:1358 __blkdev_put() hald-addon-stor-4152 [000] 167.832974: lock_kernel: depth: 0, fs/block_dev.c:1167 __blkdev_get() How to get the callsites that acquire it recursively: cd /debug/tracing/events/bkl echo "lock_depth > 0" > filter firefox-4951 [001] 206.276967: unlock_kernel: depth: 1, fs/reiserfs/super.c:575 reiserfs_dirty_inode() You can also filter by file and/or line. v2: Use of FILTER_PTR_STRING attribute for files and lines fields to make them traceable. Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt Cc: Li Zefan commit d7a4b414eed51f1653bb05ebe84122bf9a7ae18b Merge: 1f0ab40 a724ead Author: Frederic Weisbecker Date: Wed Sep 23 23:08:43 2009 +0200 Merge commit 'linus/master' into tracing/kprobes Conflicts: kernel/trace/Makefile kernel/trace/trace.h kernel/trace/trace_event_types.h kernel/trace/trace_export.c Merge reason: Sync with latest significant tracing core changes. commit 3fff4c42bd0a89869a0eb1e7874cc06ffa4aa0f5 Author: Ingo Molnar Date: Tue Sep 22 16:18:09 2009 +0200 printk: Remove ratelimit.h from kernel.h Decouple kernel.h from ratelimit.h: the global declaration of printk's ratelimit_state is not needed, and it leads to messy circular dependencies due to ratelimit.h's (new) adding of a spinlock_types.h include. Cc: Peter Zijlstra Cc: Andrew Morton Cc: Linus Torvalds Cc: David S. Miller LKML-Reference: Signed-off-by: Ingo Molnar commit edaac8e3167501cda336231d00611bf59c164346 Author: Ingo Molnar Date: Tue Sep 22 14:44:11 2009 +0200 ratelimit: Fix/allow use in atomic contexts I'd like to use printk_ratelimit() in NMI context, but it's not robust right now due to spinlock usage in lib/ratelimit.c. If an NMI is unlucky enough to hit just that spot we might lock up trying to take the spinlock again. Fix that by using a trylock variant. 
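The trylock idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `threading.Lock` stands in for the kernel spinlock, and `RateLimitState`, `ratelimit`, and the `interval`/`burst` fields are hypothetical names, not the code in lib/ratelimit.c.

```python
import threading
import time


class RateLimitState:
    """Hypothetical per-context state: at most `burst` messages per `interval` seconds."""

    def __init__(self, interval, burst):
        self.lock = threading.Lock()
        self.interval = interval
        self.burst = burst
        self.begin = None   # start of the current window
        self.printed = 0    # messages allowed in this window
        self.missed = 0     # messages suppressed in this window


def ratelimit(state, now=None):
    """Return True if the caller may print, False if the message is dropped."""
    # Trylock: never wait for the lock. If it is already held, the state is
    # busy, so the message is simply dropped instead of risking a deadlock.
    if not state.lock.acquire(blocking=False):
        return False
    try:
        if now is None:
            now = time.monotonic()
        if state.begin is None or now - state.begin >= state.interval:
            # Start a new window and reset the counters.
            state.begin = now
            state.printed = 0
            state.missed = 0
        if state.printed < state.burst:
            state.printed += 1
            return True
        state.missed += 1
        return False
    finally:
        state.lock.release()
```

The point of the design is that a caller which cannot sleep or spin safely never blocks: contention is treated as one more reason to suppress the message.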
If we contend on that lock we can genuinely skip the message because the state is just being accessed by another CPU (or by this CPU). ( We could use atomics for the suppressed messages field, but i doubt it matters in practice and it makes the code heavier. ) Cc: Peter Zijlstra Cc: Andrew Morton Cc: Linus Torvalds Cc: David S. Miller LKML-Reference: Signed-off-by: Ingo Molnar commit 979f693def9084a452846365dfde5dcb28366333 Author: Ingo Molnar Date: Tue Sep 22 14:44:11 2009 +0200 ratelimit: Use per ratelimit context locking I'd like to use printk_ratelimit() in atomic context, but that's not possible right now due to the spinlock usage this commit introduced more than a year ago: 717115e: printk ratelimiting rewrite As a first step push the lock into the ratelimit state structure. This allows us to deal with locking failures to be considered as an event related to that state being too busy. Also clean up the code a bit (without changing functionality): - tidy up the definitions - clean up the code flow This also shrinks the code a tiny bit: text data bss dec hex filename 264 0 4 268 10c ratelimit.o.before 255 0 0 255 ff ratelimit.o.after ( Whole-kernel data size got a bit larger, because we have two ratelimit-state data structures right now. ) Cc: Peter Zijlstra Cc: Andrew Morton Cc: Linus Torvalds Cc: David S. Miller LKML-Reference: Signed-off-by: Ingo Molnar commit d01d4827858cdc2e1c437c87ab65ec0a00fd40f8 Author: Heiko Carstens Date: Mon Sep 21 11:06:27 2009 +0200 sched: Always show Cpus_allowed field in /proc//status The Cpus_allowed fields in /proc//status is currently only shown in case of CONFIG_CPUSETS. However their contents are also useful for the !CONFIG_CPUSETS case. So change the current behaviour and always show these fields. 
Signed-off-by: Heiko Carstens Cc: Andrew Morton Cc: Oleg Nesterov Cc: Peter Zijlstra LKML-Reference: <20090921090627.GD4649@osiris.boeblingen.de.ibm.com> Signed-off-by: Ingo Molnar commit 1f0ab40976460bc4673fa204ce917a725185d8f2 Author: Ananth N Mavinakayanahalli Date: Tue Sep 15 10:43:07 2009 +0530 kprobes: Prevent re-registration of the same kprobe Prevent re-registration of the same kprobe. This situation, though unlikely, needs to be flagged since it can lead to a system crash if it's not handled. The core change itself is small, but the helper routine needed to be moved around a bit; hence the diffstat. Signed-off-by: Ananth N Mavinakayanahalli Acked-by: Masami Hiramatsu Cc: Jim Keniston Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <20090915051307.GB26458@in.ibm.com> Signed-off-by: Frederic Weisbecker commit 5a0d9050db4d1147722b42afef9011251b2651ee Author: Masami Hiramatsu Date: Mon Sep 14 16:49:37 2009 -0400 tracing/kprobes: Disable kprobe events by default after creation Disable newly created kprobe events by default, so as not to disturb another user using ftrace. "Disturb" here means that when someone is using ftrace and another user defines a new kprobe event via perf-tools (in the near future), the new events will mess up the ftrace buffer. Fix this to allow proper and transparent concurrent usage of kprobes events between ftrace users and perf users. Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H.
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Tom Zanussi LKML-Reference: <20090914204937.18779.59422.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 74ebb63e7cd25f6fb02a45fc2ea7735bce1217c9 Author: Masami Hiramatsu Date: Mon Sep 14 16:49:28 2009 -0400 tracing/kprobes: Fix profiling alignment for perf_counter buffer Fix *probe_profile_func() to align buffer size, since perf_counter requires its buffer entries to be 8 bytes aligned. Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Tom Zanussi LKML-Reference: <20090914204928.18779.60029.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 50d780560785b068c358675c5f0bf6c83b5c373e Author: Masami Hiramatsu Date: Mon Sep 14 16:49:20 2009 -0400 tracing/kprobes: Add probe handler dispatcher to support perf and ftrace concurrent use Add kprobe_dispatcher and kretprobe_dispatcher to dispatch event in both profile and tracing handlers. This allows simultaneous kprobe uses by ftrace and perf. Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Tom Zanussi LKML-Reference: <20090914204920.18779.57555.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 4fead8e46fded93cc0d432ced774d9a3a8d21bad Author: Masami Hiramatsu Date: Mon Sep 14 16:49:12 2009 -0400 ftrace: Fix trace_remove_event_call() to lock trace_event_mutex Lock not only event_mutex but also trace_event_mutex in trace_remove_event_call() to protect __unregister_ftrace_event(). Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Tom Zanussi LKML-Reference: <20090914204912.18779.68734.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 588bebb74fe87270f94c2810652bd683d63c4b54 Author: Masami Hiramatsu Date: Wed Sep 16 11:42:55 2009 -0400 ftrace: Fix trace_add_event_call() to initialize list Handle failure path in trace_add_event_call() to fix the below bug which occurred when I tried to add invalid event twice. Could not create debugfs 'kmalloc' directory Failed to register kprobe event: kmalloc Faild to register probe event(-1) ------------[ cut here ]------------ WARNING: at /home/mhiramat/ksrc/random-tracing/lib/list_debug.c:26 __list_add+0x27/0x5c() Hardware name: list_add corruption. next->prev should be prev (c07d78cc), but was 00001000. (next=d854236c). Modules linked in: sunrpc uinput virtio_net virtio_balloon i2c_piix4 pcspkr i2c_core virtio_blk virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] Pid: 1394, comm: tee Not tainted 2.6.31-rc9 #51 Call Trace: [] warn_slowpath_common+0x65/0x7c [] ? 
__list_add+0x27/0x5c [] warn_slowpath_fmt+0x24/0x27 [] __list_add+0x27/0x5c [] list_add+0xa/0xc [] trace_add_event_call+0x60/0x97 [] command_trace_probe+0x42c/0x51b [] ? remove_wait_queue+0x22/0x27 [] ? __wake_up+0x32/0x3b [] probes_write+0xd4/0x10a [] ? probes_write+0x0/0x10a [] vfs_write+0x80/0xdf [] sys_write+0x3b/0x5d [] syscall_call+0x7/0xb ---[ end trace 2b962b5dc1fdc07d ]--- Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Tom Zanussi LKML-Reference: <4AB1077F.6020107@redhat.com> Signed-off-by: Frederic Weisbecker commit 2d5e067edc4635ff7515bfa9ab3edb38bc344cab Author: Masami Hiramatsu Date: Mon Sep 14 16:48:56 2009 -0400 tracing/kprobes: Fix trace_probe registration order Fix trace_probe registration order. ftrace_event_call and ftrace_event must be registered before kprobe/kretprobe, because tracing/profiling handlers dereference the event-id. Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Tom Zanussi LKML-Reference: <20090914204856.18779.52961.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit f52487e9c0041842eeb77c6c48774414b1cede08 Author: Masami Hiramatsu Date: Thu Sep 10 19:53:53 2009 -0400 tracing/kprobes: Support custom subsystem for each kprobe event Support specifying a custom subsystem(group) for each kprobe event. This allows users to create new group to control several probes at once, or add events to existing groups as additional tracepoints. 
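The probe-definition form this commit introduces, `p[:[subsys/]event-name] KADDR|KSYM[+offs] [ARGS]`, can be pulled apart with a short Python sketch. The function name `parse_probe`, the returned dictionary keys, and the sample definition strings are hypothetical and only illustrate the grammar; this is not the kernel's parser.

```python
import re

# Pattern for "p[:[subsys/]event-name] KADDR|KSYM[+offs] [ARGS]".
_PROBE_RE = re.compile(
    r"^p(?::(?:(?P<subsys>[^/\s]+)/)?(?P<event>[^/\s]+))?"      # optional :[subsys/]event
    r"\s+(?P<target>[^\s+]+)"                                   # KADDR or KSYM
    r"(?:\+(?P<offs>0[xX][0-9a-fA-F]+|\d+))?"                   # optional non-negative +offs
    r"(?:\s+(?P<args>.*))?$"                                    # optional argument list
)


def parse_probe(definition):
    """Split a probe definition into its parts (illustrative only)."""
    match = _PROBE_RE.match(definition.strip())
    if match is None:
        raise ValueError("not a probe definition: %r" % definition)
    offs = match.group("offs")
    if offs is None:
        offset = 0
    elif offs.lower().startswith("0x"):
        offset = int(offs, 16)
    else:
        offset = int(offs, 10)
    args = match.group("args")
    return {
        "subsys": match.group("subsys"),   # None when no group was given
        "event": match.group("event"),     # None when no event name was given
        "target": match.group("target"),
        "offset": offset,
        "args": args.split() if args else [],
    }
```

For example, `parse_probe("p:mygroup/myopen do_sys_open+0x10 dfd=a0")` yields the group `mygroup`, the event `myopen`, the symbol `do_sys_open`, and the offset 16.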
New synopsis: p[:[subsys/]event-name] KADDR|KSYM[+offs] [ARGS] Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <20090910235353.22412.15149.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit dca2d6ac09d9ef59ff46820d4f0c94b08a671202 Merge: d6a65df 1824090 Author: Ingo Molnar Date: Tue Sep 15 12:18:15 2009 +0200 Merge branch 'linus' into tracing/hw-breakpoints Conflicts: arch/x86/kernel/process_64.c Semantic conflict fixed in: arch/x86/kvm/x86.c Signed-off-by: Ingo Molnar commit b8a4754147d61f5359a765a3afd3eb03012aa052 Author: Borislav Petkov Date: Thu Jul 30 11:10:02 2009 +0200 x86, msr: Unify rdmsr_on_cpus/wrmsr_on_cpus Since rdmsr_on_cpus and wrmsr_on_cpus are almost identical, unify them into a common __rwmsr_on_cpus helper thus avoiding code duplication. While at it, convert cpumask_t's to const struct cpumask *. Signed-off-by: Borislav Petkov Signed-off-by: H. Peter Anvin Signed-off-by: Ingo Molnar commit 6e9f23d1619f7badaf9090dac09e86a22d6061d8 Author: Masami Hiramatsu Date: Thu Sep 10 19:53:45 2009 -0400 tracing/kprobes: Show event name in trace output Show event name in tracing/trace output. This also fixes kprobes events format to comply with other tracepoint events formats. Before patching: <...>-1447 [001] 1038282.286875: do_sys_open+0x0/0xd6: ... <...>-1447 [001] 1038282.286878: sys_openat+0xc/0xe <- do_sys_open: ... After patching: <...>-1447 [001] 1038282.286875: myprobe: (do_sys_open+0x0/0xd6) ... <...>-1447 [001] 1038282.286878: myretprobe: (sys_openat+0xc/0xe <- do_sys_open) ... Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. 
Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <20090910235345.22412.76527.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit eca0d916f6429785bbc88db3ff66631cde62b432 Author: Masami Hiramatsu Date: Thu Sep 10 19:53:38 2009 -0400 tracing/kprobes: Add argument name support Add argument name assignment support and remove "alias" lines from format. This allows user to assign unique name to each argument. For example, $ echo p do_sys_open dfd=a0 filename=a1 flags=a2 mode=a3 > kprobe_events This assigns dfd, filename, flags, and mode to 1st - 4th arguments respectively. Trace buffer shows those names too. <...>-1439 [000] 1200885.933147: do_sys_open+0x0/0xdf: dfd=ffffff9c filename=bfa898ac flags=8000 mode=0 This helps users to know what each value means. Users can filter each events by these names too. Note that you can not filter by argN anymore. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <20090910235337.22412.77383.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit e08d1c657f70bcaca11401cd6ac5c8fe59bd2bb7 Author: Masami Hiramatsu Date: Thu Sep 10 19:53:30 2009 -0400 tracing/kprobes: Add event profiling support Add *probe_profile_enable/disable to support kprobes raw events sampling from perf counters, like other ftrace events, when CONFIG_PROFILE_EVENT=y. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <20090910235329.22412.94731.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 4a846b443b4e8633057946a2234e23559a67ce42 Author: Masami Hiramatsu Date: Fri Sep 11 05:31:21 2009 +0200 tracing/kprobes: Cleanup kprobe tracer code. Simplify trace_probe to remove a union, and remove some redundant wrappers. And also, cleanup create_trace_probe() function. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <20090910235322.22412.52525.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 2fba0c8867af47f6455490e7b59e512dd180c027 Author: Masami Hiramatsu Date: Thu Sep 10 19:53:14 2009 -0400 tracing/kprobes: Fix probe offset to be unsigned Prohibit user to specify negative offset from symbols. Since kprobe.offset is unsigned int, the offset must be always positive value. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <20090910235314.22412.64631.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit ad5cafcdb09c57008c990edd309c0a563b09f238 Author: Masami Hiramatsu Date: Thu Sep 10 19:53:06 2009 -0400 x86/ptrace: Fix regs_get_argument_nth() to add correct offset Fix regs_get_argument_nth() to add correct offset bytes. 
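The class of bug being fixed comes from C pointer arithmetic: adding n to a `T *` advances the address by n * sizeof(T) bytes, so a byte offset must be added to a `char *`. The Python sketch below only models that arithmetic on plain integers; the helper names and the sample base address are hypothetical.

```python
import ctypes

# Size of the platform's `unsigned long`, the pointee type in the buggy form.
WORD = ctypes.sizeof(ctypes.c_ulong)


def add_as_ulong_ptr(base, offset):
    """Model `(unsigned long *)base + offset`: the offset is scaled by the word size."""
    return base + offset * WORD


def add_as_char_ptr(base, offset):
    """Model `(char *)base + offset`: the offset is taken in bytes."""
    return base + offset


if __name__ == "__main__":
    base = 0x1000      # hypothetical address of a register block
    byte_offset = 8    # a byte offset of the kind an offsetof-style macro yields
    print(hex(add_as_char_ptr(base, byte_offset)))   # intended address
    print(hex(add_as_ulong_ptr(base, byte_offset)))  # overshoots by a factor of WORD
```

With a byte offset in hand, only the `char *` form lands on the intended field; the scaled form reads from an unrelated location.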
Because offset_of() returns an offset in bytes, the offset should be added to a char * instead of an unsigned long *. Signed-off-by: Masami Hiramatsu Acked-by: Steven Rostedt Cc: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Peter Zijlstra Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi LKML-Reference: <20090910235306.22412.31613.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit a00e817f42663941ea0aa5f85a9d1c4f8b212839 Author: Masami Hiramatsu Date: Tue Sep 8 12:47:55 2009 -0400 kprobes/x86-32: Move irq-exit functions to kprobes section Move irq-exit functions to the .kprobes.text section to protect against kprobes recursion. When I ran the kprobe stress test on x86-32, I found that the symbols below cause unrecoverable recursive probing: ret_from_exception ret_from_intr check_userspace restore_all restore_all_notrace restore_nocheck irq_return I also found some interrupt/exception entry points that cause similar problems. This patch moves those symbols (including their container functions) to the .kprobes.text section to prevent any kprobes probing. Signed-off-by: Masami Hiramatsu Cc: Frederic Weisbecker Cc: Ananth N Mavinakayanahalli Cc: Jim Keniston Cc: Ingo Molnar LKML-Reference: <20090908164755.24050.81182.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit f12b4f546b4e327d5620a544a2bddab68de66027 Author: Masami Hiramatsu Date: Tue Sep 8 12:32:46 2009 -0400 x86: Add MMX support for instruction decoder Add MMX/SSE instructions to x86 opcode maps, since some of those instructions are used in the kernel. This also fixes failures in the x86 instruction decoder selftest. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: H.
Peter Anvin Cc: Sam Ravnborg Cc: Frederic Weisbecker Cc: Ingo Molnar LKML-Reference: <20090908163246.23516.78835.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: Frederic Weisbecker commit 8f8ffe2485bcaa890800681451d380779cea06af Merge: 7006957 d28daf9 Author: Frederic Weisbecker Date: Fri Sep 11 01:09:23 2009 +0200 Merge commit 'tracing/core' into tracing/kprobes Conflicts: kernel/trace/trace_export.c kernel/trace/trace_kprobe.c Merge reason: This topic branch lacks an important build fix in tracing/core: 0dd7b74787eaf7858c6c573353a83c3e2766e674: tracing: Fix double CPP substitution in TRACE_EVENT_FN which prevents crashes when multiple tracepoint headers are included. Signed-off-by: Frederic Weisbecker commit d6a65dffb30d8636b1e5d4c201564ef401a246cf Author: Frederic Weisbecker Date: Mon Sep 7 03:23:20 2009 +0200 tracing: Fix ring-buffer and ksym tracer merge interaction The compiler warns us about: kernel/trace/trace_ksym.c: In function ksym_hbp_handler: kernel/trace/trace_ksym.c:92: warning: passing argument 1 of trace_buffer_lock_reserve from incompatible pointer type kernel/trace/trace_ksym.c:106: warning: passing argument 1 of trace_buffer_unlock_commit from incompatible pointer type Commit "e77405ad" (tracing: pass around ring buffer instead of tracer) changed the central tracing APIs. That change updated every call site of these APIs except those that aren't in tracing/core, such as the ksym tracer. Cc: Steven Rostedt Signed-off-by: Ingo Molnar commit a1922ed661ab2c1637d0b10cde933bd9cd33d965 Merge: 75e3375 d28daf9 Author: Ingo Molnar Date: Mon Sep 7 08:19:51 2009 +0200 Merge branch 'tracing/core' into tracing/hw-breakpoints Conflicts: arch/Kconfig kernel/trace/trace.h Merge reason: resolve the conflicts, plus adapt to the new ring-buffer APIs.
Signed-off-by: Ingo Molnar commit 70069577323e6f72b845166724f34b9858134437 Author: Masami Hiramatsu Date: Fri Aug 28 18:13:26 2009 -0400 x86: Remove unused config macros from instruction decoder selftest Remove dummy definitions of CONFIG_X86_64 and CONFIG_X86_32 because those macros are not used in the instruction decoder anymore. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: Ingo Molnar LKML-Reference: <20090828221326.8778.70723.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 50a482fbd96943516b7a2783900e8fe61a6425e7 Author: Masami Hiramatsu Date: Fri Aug 28 18:13:19 2009 -0400 x86: Allow x86-32 instruction decoder selftest on x86-64 Pass $(CONFIG_64BIT) to the x86 insn decoder selftest in case we are decoding 32bit code on x86-64, which will happen when building kernel with ARCH=i386 on x86-64. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: Ingo Molnar LKML-Reference: <20090828221319.8778.88508.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 65e234ec2c4a0659ca22531dc1372a185f088517 Author: Masami Hiramatsu Date: Thu Aug 27 13:23:32 2009 -0400 kprobes: Prohibit to probe native_get_debugreg Since do_debug() calls get_debugreg(), native_get_debugreg() will be called from singlestepping. This can cause an int3 infinite loop. We can't put it in the .text.kprobes section because it is inlined, then we blacklist its name. Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Ingo Molnar LKML-Reference: <20090827172332.8246.34194.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 8222d718b3ad3ae49c48f69ae4b6a1128c9a92cf Author: Masami Hiramatsu Date: Thu Aug 27 13:23:25 2009 -0400 kprobes/x86-64: Fix to move common_interrupt to .kprobes.text Since nmi, debug and int3 returns to irq_return inside common_interrupt, probing this function will cause int3-loop, so it should be marked as __kprobes. 
Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Ingo Molnar LKML-Reference: <20090827172325.8246.40000.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 8f270083587a4cb70fa14f0e2fd698eb08a4dd07 Author: Masami Hiramatsu Date: Thu Aug 27 13:23:18 2009 -0400 kprobes: Fix to add __kprobes to notify_die Add __kprobes to notify_die() because do_int3() calls notify_die() instead of atomic_notify_call_chain() which is already marked as __kprobes. Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Ingo Molnar LKML-Reference: <20090827172318.8246.53702.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 62c9295f9dd250ea1bb2c8078642a275a9ce82f8 Author: Masami Hiramatsu Date: Thu Aug 27 13:23:11 2009 -0400 kprobes/x86: Fix to add __kprobes to in-kernel fault handing functions Add __kprobes to the functions which handle in-kernel fixable page faults. Since kprobes can cause those in-kernel page faults by accessing kprobe data structures, probing those fault functions will cause fault-int3-loop (do_page_fault has already been marked as __kprobes). Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Ingo Molnar LKML-Reference: <20090827172311.8246.92725.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit f5ad31158d60946b9fd18c8a79c283a6bc432430 Author: Masami Hiramatsu Date: Thu Aug 27 13:23:04 2009 -0400 kprobes/x86-64: Allow to reenter probe on post_handler Allow to reenter probe on the post_handler of another probe on x86-64, because x86-64 already allows reentering int3. In that case, reentered probe just increases kp.nmissed and returns. 
Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Ingo Molnar LKML-Reference: <20090827172304.8246.4822.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit e9afe9e1b3fdbd56cca53959a2519e70db9c8095 Author: Masami Hiramatsu Date: Thu Aug 27 13:22:58 2009 -0400 kprobes/x86: Call BUG() when reentering probe into KPROBES_HIT_SS Call BUG() when a probe has been hit in the kprobe processing path, because probes of that kind are currently unrecoverable (recovering would cause an infinite loop and a stack overflow). The original code seems to assume that this is caused by an int3 which another subsystem inserted in the out-of-line singlestep buffer if the probe being hit is the same as the current probe. However, in that case, the address where the int3 hit is in the out-of-line buffer and should be different from the first (current) int3 address. Thus, I decided to remove the code. I also remove arch_disarm_kprobe() because it would involve other things in text_poke(). Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Ingo Molnar LKML-Reference: <20090827172258.8246.61889.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit f8468f3695209735c1595342f6bd95f7bdab66e1 Author: Frederic Weisbecker Date: Thu Aug 27 05:23:29 2009 +0200 tracing: Remove unneeded pointer casts Clean up unneeded casts from void * to char * in the syscalls tracing file. Reported-by: Li Zefan Signed-off-by: Frederic Weisbecker commit aeaeae1187d7520f1c5559623f0a149da6a1c96e Author: Frederic Weisbecker Date: Thu Aug 27 05:09:51 2009 +0200 tracing: Restore the const qualifier for field names and types definition Restore the const qualifier in the field name and type parameters of trace_define_field that was lost while solving a conflict. Field names and types are defined as builtin constant strings in static TRACE_EVENTs. But kprobes allocates these dynamically.
That said, we still want to always pass these strings as const char * in trace_define_fields() to avoid any further accidental writes on the pointed strings. Reported-by: Li Zefan Signed-off-by: Frederic Weisbecker Cc: Steven Rostedt commit 24851d2447830e6cba4c4b641cb73e713f312373 Author: Frederic Weisbecker Date: Wed Aug 26 23:38:30 2009 +0200 tracing/kprobes: Dump the culprit kprobe in case of kprobe recursion Kprobes can enter into a probing recursion, ie: a kprobe that does an endless loop because one of its core mechanism function used during probing is also probed itself. This patch helps pinpointing the kprobe that raised such recursion by dumping it and raising a BUG instead of a warning (we also disarm the kprobe to try avoiding recursion in BUG itself). Having a BUG instead of a warning stops the stacktrace in the right place and doesn't pollute the logs with hundreds of traces that eventually end up in a stack overflow. Signed-off-by: Frederic Weisbecker Cc: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli commit 30a7e073b590ebd1829a906164b0a637e77cc967 Author: Masami Hiramatsu Date: Fri Aug 21 15:43:51 2009 -0400 tracing/kprobes: Change trace_arg to probe_arg Change trace_arg_string() and parse_trace_arg() to probe_arg_string() and parse_probe_arg(), since those are kprobe-tracer local functions. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: H. Peter Anvin Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. 
Eigler Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090821194351.12478.15247.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 38a47497d9e34632abbeb484603cedf10c4b05e4 Author: Masami Hiramatsu Date: Fri Aug 21 15:43:43 2009 -0400 tracing/kprobes: Fix format typo in trace_kprobes Fix a format typo in kprobe-tracer. Currently, it shows 'tsize' in format; $ cat /debug/tracing/events/kprobes/event/format ... field: unsigned long ip; offset:16;tsize:8; field: int nargs; offset:24;tsize:4; ... This should be '\tsize'; $ cat /debug/tracing/events/kprobes/event/format ... field: unsigned long ip; offset:16; size:8; field: int nargs; offset:24; size:4; ... Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: H. Peter Anvin Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090821194343.12478.37618.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 69d991f32152283cbc373136fa45bbb152b32048 Author: Masami Hiramatsu Date: Fri Aug 21 15:43:16 2009 -0400 x86: Check awk features before generating inat-tables.c Check some awk mandatory features to generate inat-tables.c that old mawk doesn't support. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: H. Peter Anvin Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. 
Eigler Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090821194316.12478.57394.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 8d7d14fb27818eb08ebedf9f4a6e286970fe9977 Author: Masami Hiramatsu Date: Fri Aug 21 15:43:07 2009 -0400 x86: Fix x86 instruction decoder selftest to check only .text Fix x86 instruction decoder selftest to check only .text because other sections (e.g. .notes) will have random bytes which don't need to be checked. Signed-off-by: Masami Hiramatsu Cc: Jim Keniston Cc: H. Peter Anvin Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090821194307.12478.76938.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit cd7e7bd5e44718c7625ce1e1f0fda53d77cd3797 Author: Masami Hiramatsu Date: Thu Aug 13 16:35:42 2009 -0400 tracing: Add kprobes event profiling interface Add profiling interfaces for each kprobes event. This interface provides how many times each probe hit or missed. Signed-off-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203541.31965.8452.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit ff50d99136c3315513ef3b2921e77f35ab04d081 Author: Masami Hiramatsu Date: Thu Aug 13 16:35:34 2009 -0400 tracing: Kprobe tracer assigns new event ids for each event Assign new event ids for each kprobes event. This doesn't clear ring_buffer when unregistering each kprobe event. Thus, if you mind 'Unknown event' messages, clear the buffer manually after changing kprobe events. Signed-off-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203534.31965.49105.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 4263565d491145b57621a761714f2ca6f1293a45 Author: Masami Hiramatsu Date: Thu Aug 13 16:35:26 2009 -0400 tracing: Generate names for each kprobe event automatically Generate names for each kprobe event based on the probe point. (SYMBOL+offs or MEMADDR). Also remove generic k*probe event types because there is no user of those types. Signed-off-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203526.31965.56672.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit a82378d8802717b9776a7d9b54422f65c414d6cc Author: Masami Hiramatsu Date: Thu Aug 13 16:35:18 2009 -0400 tracing: Kprobe-tracer supports more than 6 arguments Support up to 128 arguments to fetch for each kprobes event. Signed-off-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203518.31965.96979.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit d8ec91850efaf6cee9234c80260fe03881242374 Author: Masami Hiramatsu Date: Wed Aug 19 21:13:57 2009 +0200 tracing: Add kprobe-based event tracer documentation Add the documentation to use the kprobe based event tracer. [fweisbec@gmail.com: Split tracer and its Documentation in two patchs] Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203510.31965.29123.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 413d37d1eb69c1765b9ace0a612dac9b6c990e66 Author: Masami Hiramatsu Date: Thu Aug 13 16:35:11 2009 -0400 tracing: Add kprobe-based event tracer Add kprobes-based event tracer on ftrace. This tracer is similar to the events tracer which is based on Tracepoint infrastructure. Instead of Tracepoint, this tracer is based on kprobes (kprobe and kretprobe). It probes anywhere where kprobes can probe(this means, all functions body except for __kprobes functions). Similar to the events tracer, this tracer doesn't need to be activated via current_tracer, instead of that, just set probe points via /sys/kernel/debug/tracing/kprobe_events. And you can set filters on each probe events via /sys/kernel/debug/tracing/events/kprobes//filter. This tracer supports following probe arguments for each probe. %REG : Fetch register REG sN : Fetch Nth entry of stack (N >= 0) sa : Fetch stack address. @ADDR : Fetch memory at ADDR (ADDR should be in kernel) @SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol) aN : Fetch function argument. (N >= 0) rv : Fetch return value. ra : Fetch return address. +|-offs(FETCHARG) : fetch memory at FETCHARG +|- offs address. See Documentation/trace/kprobetrace.txt in the next patch for details. Changes from v13: - Support 'sa' for stack address. - Use call->data instead of container_of() macro. [fweisbec@gmail.com: Fixed conflict against latest tracing/core] Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203510.31965.29123.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit d93f12f3f417e49a175800da85c6fcb2a5096e03 Author: Masami Hiramatsu Date: Thu Aug 13 16:35:01 2009 -0400 tracing: Introduce TRACE_FIELD_ZERO() macro Use TRACE_FIELD_ZERO(type, item) instead of TRACE_FIELD_ZERO_CHAR(item). This also includes a typo fix of TRACE_ZERO_CHAR() macro. Signed-off-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203501.31965.30172.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit bd1a5c849bdcc5c89e4a6a18216cd2b9a7a8a78f Author: Masami Hiramatsu Date: Thu Aug 13 16:34:53 2009 -0400 tracing: Ftrace dynamic ftrace_event_call support Add dynamic ftrace_event_call support to ftrace. Trace engines can add new ftrace_event_call to ftrace on the fly. Each operator function of the call takes an ftrace_event_call data structure as an argument, because these functions may be shared among several ftrace_event_calls. Changes from v13: - Define remove_subsystem_dir() always (revirt a2ca5e03), because trace_remove_event_call() uses it. - Modify syscall tracer because of ftrace_event_call change. [fweisbec@gmail.com: Fixed conflict against latest tracing/core] Signed-off-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203453.31965.71901.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit b1cf540f0e5278ecfe8532557e547d833ed269d7 Author: Masami Hiramatsu Date: Thu Aug 13 16:34:44 2009 -0400 x86: Add pt_regs register and stack access APIs Add following APIs for accessing registers and stack entries from pt_regs. These APIs are required by kprobes-based event tracer on ftrace. Some other debugging tools might be able to use it too. - regs_query_register_offset(const char *name) Query the offset of "name" register. - regs_query_register_name(unsigned int offset) Query the name of register by its offset. - regs_get_register(struct pt_regs *regs, unsigned int offset) Get the value of a register by its offset. - regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr) Check the address is in the kernel stack. - regs_get_kernel_stack_nth(struct pt_regs *reg, unsigned int nth) Get Nth entry of the kernel stack. (N >= 0) - regs_get_argument_nth(struct pt_regs *reg, unsigned int nth) Get Nth argument at function call. (N >= 0) Signed-off-by: Masami Hiramatsu Cc: linux-arch@vger.kernel.org Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203444.31965.26374.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 89ae465b0ee470f7d3f8a1c61353445c3acbbe2a Author: Masami Hiramatsu Date: Thu Aug 13 16:34:36 2009 -0400 kprobes: Cleanup fix_riprel() using insn decoder on x86 Cleanup fix_riprel() in arch/x86/kernel/kprobes.c by using the new x86 instruction decoder instead of using comparisons with raw ad hoc numeric opcodes. Signed-off-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203436.31965.34374.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit b46b3d70c9c017d7c4ec49f7f3ffd0af5a622277 Author: Masami Hiramatsu Date: Thu Aug 13 16:34:28 2009 -0400 kprobes: Checks probe address is instruction boudary on x86 Ensure safeness of inserting kprobes by checking whether the specified address is at the first byte of an instruction on x86. This is done by decoding probed function from its head to the probe point. Signed-off-by: Masami Hiramatsu Acked-by: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. 
Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: Jim Keniston Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203428.31965.21939.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit ca0e9badd1a39fecdd235f4bf1481b9da756e27b Author: Masami Hiramatsu Date: Thu Aug 13 16:34:21 2009 -0400 x86: X86 instruction decoder build-time selftest Add a user-space selftest of x86 instruction decoder at kernel build time. When CONFIG_X86_DECODER_SELFTEST=y, Kbuild builds a test harness of x86 instruction decoder and performs it after building vmlinux. The test compares the results of objdump and x86 instruction decoder code and check there are no differences. Signed-off-by: Masami Hiramatsu Signed-off-by: Jim Keniston Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203421.31965.29006.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit eb13296cfaf6c699566473669a96a38a90562384 Author: Masami Hiramatsu Date: Thu Aug 13 16:34:13 2009 -0400 x86: Instruction decoder API Add x86 instruction decoder to arch-specific libraries. This decoder can decode x86 instructions used in kernel into prefix, opcode, modrm, sib, displacement and immediates. This can also show the length of instructions. This version introduces instruction attributes for decoding instructions. The instruction attribute tables are generated from the opcode map file (x86-opcode-map.txt) by the generator script(gen-insn-attr-x86.awk). 
Currently, the opcode maps are based on opcode maps in Intel(R) 64 and IA-32 Architectures Software Developers Manual Vol.2: Appendix.A, and consist of below two types of opcode tables. 1-byte/2-bytes/3-bytes opcodes, which has 256 elements, are written as below; Table: table-name Referrer: escaped-name opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...] (or) opcode: escape # escaped-name EndTable Group opcodes, which has 8 elements, are written as below; GrpTable: GrpXXX reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...] EndTable These opcode maps include a few SSE and FP opcodes (for setup), because those opcodes are used in the kernel. Signed-off-by: Masami Hiramatsu Signed-off-by: Jim Keniston Acked-by: H. Peter Anvin Cc: Ananth N Mavinakayanahalli Cc: Avi Kivity Cc: Andi Kleen Cc: Christoph Hellwig Cc: Frank Ch. Eigler Cc: Ingo Molnar Cc: Jason Baron Cc: K.Prasad Cc: Lai Jiangshan Cc: Li Zefan Cc: Przemysław Pawełczyk Cc: Roland McGrath Cc: Sam Ravnborg Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Tom Zanussi Cc: Vegard Nossum LKML-Reference: <20090813203413.31965.49709.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker commit 75e33751ca8bbb72dd6f1a74d2810ddc8cbe4bdf Author: Xiao Guangrong Date: Thu Jul 23 12:01:22 2009 +0800 tracing/ksym_tracer: support quick clear for ksym_trace_filter -- v2 It's rather boring to clear symbol one by one in ksym_trace_filter file, so, this patch will let ksym_trace_filter file support quickly clear all break points. 
We can write "0" to this file and it will clear all symbols, for example: # cat ksym_trace_filter ksym_filter_head:rw- global_trace:rw- # echo 0 > ksym_trace_filter # cat ksym_trace_filter # Changelog v1->v2: Add other ways to clear all breakpoints by writing NULL or "*:---" to the ksym_trace_filter file, based on K.Prasad's suggestion Signed-off-by: Xiao Guangrong LKML-Reference: <4A67E092.3080202@cn.fujitsu.com> Signed-off-by: Steven Rostedt commit 8e068542a8d9efec55126284d2f5cb32f003d507 Author: Xiao Guangrong Date: Wed Jul 22 11:23:41 2009 +0800 tracing/ksym_tracer: fix write operation of ksym_trace_filter This patch fixes 2 bugs: - fix the return value of ksym_trace_filter_write() when we want to clear a symbol in the ksym_trace_filter file, for example: # echo global_trace:rw- > /debug/tracing/ksym_trace_filter # echo global_trace:--- > /debug/tracing/ksym_trace_filter -bash: echo: write error: Invalid argument # cat /debug/tracing/ksym_trace_filter # We want to clear 'global_trace' in ksym_trace_filter; it complains with "Invalid argument", but the operation is successful - the "r--" access type is not allowed, but the ksym_trace_filter file accepts it, for example: # echo ksym_tracer_mutex:r-- > ksym_trace_filter -bash: echo: write error: Resource temporarily unavailable # dmesg ksym_tracer request failed. Try again later!!
The error occurs at register_kernel_hw_breakpoint(), but it should be caught by the access type parser Signed-off-by: Xiao Guangrong LKML-Reference: <4A66863D.5090802@cn.fujitsu.com> Signed-off-by: Steven Rostedt commit d857ace143df3884954887e1899a65831ca72ece Author: Xiao Guangrong Date: Wed Jul 22 11:21:31 2009 +0800 tracing/ksym_tracer: fix the output of ksym tracer Fix the output format of ksym tracer, make it properly aligned Before patch: # tracer: ksym_tracer # # TASK-PID CPU# Symbol Type Function # | | | | | bash 1378 1 ksym_tracer_mutex W mutex_lock+0x11/0x27 bash 1378 1 ksym_filter_head W process_new_ksym_entry+0xd2/0x10c bash 1378 1 ksym_tracer_mutex W mutex_unlock+0x12/0x1b cat 1429 0 ksym_tracer_mutex W mutex_lock+0x11/0x27 After patch: # tracer: ksym_tracer # # TASK-PID CPU# Symbol Type Function # | | | | | cat-1423 [000] ksym_tracer_mutex RW mutex_lock+0x11/0x27 cat-1423 [000] ksym_filter_head RW ksym_trace_filter_read+0x6e/0x10d cat-1423 [000] ksym_tracer_mutex RW mutex_unlock+0x12/0x1b cat-1423 [000] ksym_tracer_mutex RW mutex_lock+0x11/0x27 cat-1423 [000] ksym_filter_head RW ksym_trace_filter_read+0x6e/0x10d cat-1423 [000] ksym_tracer_mutex RW mutex_unlock+0x12/0x1b Signed-off-by: Xiao Guangrong LKML-Reference: <4A6685BB.2090809@cn.fujitsu.com> Signed-off-by: Steven Rostedt commit 9d7e934408b52cd53dd85270eb36941a6a318cc5 Author: Li Zefan Date: Tue Jul 7 13:55:18 2009 +0800 ksym_tracer: Fix the output of stat tracing - make ksym_tracer_stat_start() return head->first instead of &head->first - make the output properly aligned Before: Access type Symbol Counter NA 0 RW pid_max 0 After: Access Type Symbol Counter ----------- ------ ------- RW pid_max 0 Signed-off-by: Li Zefan Acked-by: Frederic Weisbecker Cc: "K.Prasad" Cc: Alan Stern Cc: Steven Rostedt LKML-Reference: <4A52E346.5050608@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 558df6c8f74ac4a0b9026ef85b0028280f364d96 Author: Li Zefan Date: Tue Jul 7 13:54:48 2009 +0800 ksym_tracer: Fix memory leak - When
removing a filter, we leak entry->ksym_hbp->info.name. - With CONFIG_FTRACE_SELFTEST enabled, we leak ->info.name: # echo ksym_tracer > current_tracer # echo 'ksym_selftest_dummy:rw-' > ksym_trace_filter # echo nop > current_tracer Signed-off-by: Li Zefan Acked-by: Frederic Weisbecker Cc: "K.Prasad" Cc: Alan Stern Cc: Steven Rostedt LKML-Reference: <4A52E328.8010200@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 0d109c8f70eab8b9f693bd5caea23012394e4876 Author: Li Zefan Date: Tue Jul 7 13:54:28 2009 +0800 ksym_tracer: Report error when failed to re-register hbp When the access type is changed, the hardware breakpoint is unregistered and then registered again with the new access type. The registration may fail, in which case -errno should be returned. Signed-off-by: Li Zefan Acked-by: Frederic Weisbecker Cc: "K.Prasad" Cc: Alan Stern Cc: Steven Rostedt LKML-Reference: <4A52E314.7070004@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 011ed56853e07e30653d6f1bfddc56b396218664 Author: Li Zefan Date: Tue Jul 7 13:54:08 2009 +0800 ksym_tracer: NIL-terminate user input filter Make sure the user input string is NUL-terminated.
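A minimal userspace sketch of the termination fix described above, assuming a fixed-size filter buffer. The helper name copy_filter_input() and its parameters are hypothetical and do not come from the patch; kernel code would copy with copy_from_user() rather than memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the kernel function): copy `count` raw bytes
 * of input into `dst` and NUL-terminate the result. Returns 0 on
 * success, -1 if the input leaves no room for the terminator.
 */
static int copy_filter_input(char *dst, size_t dst_size,
                             const char *src, size_t count)
{
    assert(dst != NULL && src != NULL);

    /* Reserve one byte for the terminating NUL. */
    if (count >= dst_size)
        return -1;

    memcpy(dst, src, count);
    dst[count] = '\0';  /* raw input carries no terminator of its own */
    return 0;
}
```

Without the explicit terminator, a later strlen() or strchr() on the buffer could run past the bytes that were actually written.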
Signed-off-by: Li Zefan Acked-by: Frederic Weisbecker Cc: "K.Prasad" Cc: Alan Stern Cc: Steven Rostedt LKML-Reference: <4A52E300.7020601@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 92cf9f8f7e89c6bdbb1a724f879b8b18fc0dfe0f Author: Li Zefan Date: Tue Jul 7 13:53:47 2009 +0800 ksym_tracer: Fix validation of length of access type Don't take newline into account, otherwise: # echo 'pid_max:-w-' > ksym_trace_filter # echo -n 'pid_max:rw-' > ksym_trace_filter bash: echo: write error: Invalid argument Signed-off-by: Li Zefan Acked-by: Frederic Weisbecker Cc: "K.Prasad" Cc: Alan Stern Cc: Steven Rostedt LKML-Reference: <4A52E2EB.9070503@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit f088e5471297cc78d7465e1fd997cb1a91a48019 Author: Li Zefan Date: Tue Jul 7 13:53:18 2009 +0800 ksym_tracer: Fix validation of access type # echo 'pid_max:rw-' > ksym_trace_filter # cat ksym_trace_filter pid_max:rw- # echo 'pid_max:ww-' > ksym_trace_filter (should return -EINVAL) # cat ksym_trace_filter (but it ended up removing filter entry) Signed-off-by: Li Zefan Acked-by: Frederic Weisbecker Cc: "K.Prasad" Cc: Alan Stern Cc: Steven Rostedt LKML-Reference: <4A52E2CE.6080409@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit be9742e6cb107fe1d77db7a081ea4eb25e79e1ad Author: Li Zefan Date: Tue Jul 7 13:52:52 2009 +0800 ksym_tracer: Rewrite ksym_trace_filter_read() Reading ksym_trace_filter gave me some arbitrary characters, when it should show nothing. It's because buf is not initialized when there's no filter. Also reduce stack usage by about 512 bytes. Signed-off-by: Li Zefan Acked-by: Frederic Weisbecker Cc: "K.Prasad" Cc: Alan Stern Cc: Steven Rostedt LKML-Reference: <4A52E2B4.6030706@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit db59504d89db1462a5281fb55b1d962cb74a398f Author: Li Zefan Date: Tue Jul 7 13:52:36 2009 +0800 ksym_tracer: Extract trace entry from struct trace_ksym struct trace_ksym is used as an entry in hbp list, and is also used as trace_entry stored in ring buffer. 
This is not necessary and is a waste of memory in ring buffer. There is also a bug that dereferencing field->ksym_hbp in ksym_trace_output() can be invalid. Signed-off-by: Li Zefan Acked-by: Frederic Weisbecker Cc: "K.Prasad" Cc: Alan Stern Cc: Steven Rostedt LKML-Reference: <4A52E2A4.4050007@cn.fujitsu.com> Signed-off-by: Ingo Molnar commit 9d22b536609abf0d64648f99518676ea58245e3b Author: Jaswinder Singh Rajput Date: Wed Jul 1 19:52:30 2009 +0530 x86: Mark ptrace_get_debugreg() as static This sparse warning: arch/x86/kernel/ptrace.c:560:15: warning: symbol 'ptrace_get_debugreg' was not declared. Should it be static? triggers because ptrace_get_debugreg() is global but is only used in a single .c file. change ptrace_get_debugreg() to static to fix that - this also addresses the sparse warning. Signed-off-by: Jaswinder Singh Rajput Cc: Steven Rostedt LKML-Reference: <1246458150.6940.19.camel@hpdv5.satnam> Signed-off-by: Ingo Molnar commit 4555835b707d5c778ee1c9076670bc99b1eeaf61 Author: Jaswinder Singh Rajput Date: Wed Jun 17 14:44:19 2009 +0530 x86: hw_breakpoint.c arch_check_va_in_kernelspace and hw_breakpoint_handler should be static arch_check_va_in_kernelspace() and hw_breakpoint_handler() is used only by same file so it should be static. Also fixed non-ANSI function declaration of function 'arch_uninstall_thread_hw_breakpoint' Fixed following sparse warnings : arch/x86/kernel/hw_breakpoint.c:124:42: warning: non-ANSI function declaration of function 'arch_uninstall_thread_hw_breakpoint' arch/x86/kernel/hw_breakpoint.c:169:5: warning: symbol 'arch_check_va_in_kernelspace' was not declared. Should it be static? arch/x86/kernel/hw_breakpoint.c:313:15: warning: symbol 'hw_breakpoint_handler' was not declared. Should it be static? 
Signed-off-by: Jaswinder Singh Rajput Cc: Alan Stern Cc: "K.Prasad" Cc: Frederic Weisbecker LKML-Reference: <1245230059.2662.4.camel@ht.satnam> Signed-off-by: Ingo Molnar commit eadb8a091b27a840de7450f84ecff5ef13476424 Merge: 7387400 65795ef Author: Ingo Molnar Date: Wed Jun 17 12:52:15 2009 +0200 Merge branch 'linus' into tracing/hw-breakpoints Conflicts: arch/x86/Kconfig arch/x86/kernel/traps.c arch/x86/power/cpu.c arch/x86/power/cpu_32.c kernel/Makefile Semantic conflict: arch/x86/kernel/hw_breakpoint.c Merge reason: Resolve the conflicts, move from put_cpu_no_sched() to put_cpu() in arch/x86/kernel/hw_breakpoint.c. Signed-off-by: Ingo Molnar commit 73874005cd8800440be4299bd095387fff4b90ac Author: Frederic Weisbecker Date: Wed Jun 3 01:43:38 2009 +0200 hw-breakpoints: fix undeclared ksym_tracer_mutex ksym_tracer_mutex is declared inside an #ifdef CONFIG_PROFILE_KSYM_TRACER section. This makes it unavailable for the hardware breakpoint tracer if it is configured without the breakpoint profiler. This patch fixes the following build error: kernel/trace/trace_ksym.c: In function ‘ksym_trace_filter_read’: kernel/trace/trace_ksym.c:226: erreur: ‘ksym_tracer_mutex’ undeclared (first use in this function) kernel/trace/trace_ksym.c:226: erreur: (Each undeclared identifier is reported only once kernel/trace/trace_ksym.c:226: erreur: for each function it appears in.) 
kernel/trace/trace_ksym.c: In function ‘ksym_trace_filter_write’: kernel/trace/trace_ksym.c:273: erreur: ‘ksym_tracer_mutex’ undeclared (first use in this function) kernel/trace/trace_ksym.c: In function ‘ksym_trace_reset’: kernel/trace/trace_ksym.c:335: erreur: ‘ksym_tracer_mutex’ undeclared (first use in this function) make[1]: *** [kernel/trace/trace_ksym.o] Erreur 1 [ Impact: fix a build error ] Reported-by: Ingo Molnar Signed-off-by: Frederic Weisbecker commit 62edab9056a6cf0c9207339c8892c923a5217e45 Author: K.Prasad Date: Mon Jun 1 23:47:06 2009 +0530 hw-breakpoints: reset bits in dr6 after the corresponding exception is handled This patch resets the bit in dr6 after the corresponding exception is handled in code, so that we keep a clean track of the current virtual debug status register. [ Impact: keep track of breakpoints triggering completion ] Signed-off-by: K.Prasad Signed-off-by: Frederic Weisbecker commit 0722db015c246204044299eae3b02d18d3ca4faf Author: K.Prasad Date: Mon Jun 1 23:46:40 2009 +0530 hw-breakpoints: ftrace plugin for kernel symbol tracing using HW Breakpoint interfaces This patch adds an ftrace plugin to detect and profile memory access over kernel variables. It uses HW Breakpoint interfaces to 'watch' memory addresses. Signed-off-by: K.Prasad Signed-off-by: Frederic Weisbecker commit 432039933a16b8227b7b267f46ac1c1b9b3adf14 Author: K.Prasad Date: Mon Jun 1 23:46:20 2009 +0530 hw-breakpoints: sample HW breakpoint over kernel data address This patch introduces a sample kernel module to demonstrate the use of Hardware Breakpoint feature. It places a breakpoint over the kernel variable 'pid_max' to monitor all write operations and emits a function-backtrace when done.
Signed-off-by: K.Prasad Signed-off-by: Frederic Weisbecker commit 17f557e5b5d43a2af66c969f6560ac7105020672 Author: K.Prasad Date: Mon Jun 1 23:46:03 2009 +0530 hw-breakpoints: cleanup HW Breakpoint registers before kexec This patch disables Hardware breakpoints before doing a 'kexec' on the machine so that the cpu doesn't keep debug register values which would be out of sync for the new image. Original-patch-by: Alan Stern Signed-off-by: K.Prasad Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker commit 72f674d203cd230426437cdcf7dd6f681dad8b0d Author: K.Prasad Date: Mon Jun 1 23:45:48 2009 +0530 hw-breakpoints: modify Ptrace routines to access breakpoint registers This patch modifies the ptrace code to use the new wrapper routines around the debug/breakpoint registers. [ Impact: adapt x86 ptrace to the new breakpoint Api ] Original-patch-by: Alan Stern Signed-off-by: K.Prasad Signed-off-by: Maneesh Soni Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker commit da0cdc14f5f7e0faee6b2393fefed056cdb17146 Author: K.Prasad Date: Mon Jun 1 23:45:03 2009 +0530 hw-breakpoints: modify signal handling code to refrain from re-enabling HW Breakpoints This patch disables re-enabling of Hardware Breakpoint registers through the signal handling code. This is now done from hw_breakpoint_handler(). Original-patch-by: Alan Stern Signed-off-by: K.Prasad Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker commit 66cb5917295958652ff6ba36d83f98f2379c46b4 Author: K.Prasad Date: Mon Jun 1 23:44:55 2009 +0530 hw-breakpoints: use the new wrapper routines to access debug registers in process/thread code This patch enables the use of abstract debug registers in process-handling routines, according to the new hardware breakpoint Api.
[ Impact: adapt thread breakpoints handling code to the new breakpoint Api ] Original-patch-by: Alan Stern Signed-off-by: K.Prasad Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker commit 1e3500666f7c5daaadadb8431a2927cdbbdb7dd4 Author: K.Prasad Date: Mon Jun 1 23:44:26 2009 +0530 hw-breakpoints: use wrapper routines around debug registers in processor related functions This patch enables the use of wrapper routines to access the debug/breakpoint registers on cpu management. The hardcoded debug registers save and restore operations for threads breakpoints are replaced by wrappers. And now that we handle the kernel breakpoints too, we also need to handle them on cpu hotplug operations. [ Impact: adapt new hardware breakpoint api to cpu hotplug ] Original-patch-by: Alan Stern Signed-off-by: K.Prasad Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker commit 08d68323d1f0c34452e614263b212ca556dae47f Author: K.Prasad Date: Mon Jun 1 23:44:08 2009 +0530 hw-breakpoints: modifying generic debug exception to use thread-specific debug registers This patch modifies the breakpoint exception handler code to use the new abstract debug register names. [ fweisbec@gmail.com: fix conflict against kmemcheck ] [ Impact: refactor and cleanup x86 debug exception handler ] Original-patch-by: Alan Stern Signed-off-by: K.Prasad Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker commit 0067f1297241ea567f2b22a455519752d70fcca9 Author: K.Prasad Date: Mon Jun 1 23:43:57 2009 +0530 hw-breakpoints: x86 architecture implementation of Hardware Breakpoint interfaces This patch introduces the arch-specific implementation of the generic hardware breakpoints in kernel/hw_breakpoint.c inside x86 specific directories. It contains functions which help to validate and serve requests using Hardware Breakpoint registers on x86 processors. 
[ fweisbec@gmail.com: fix conflict against kmemcheck ] Original-patch-by: Alan Stern Signed-off-by: K.Prasad Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker commit 62a038d34db26771756cf3689e36de638bedd2c4 Author: K.Prasad Date: Mon Jun 1 23:43:33 2009 +0530 hw-breakpoints: introducing generic hardware breakpoint handler interfaces This patch introduces the generic Hardware Breakpoint interfaces for both user and kernel space requests. This core Api handles the hardware breakpoints through new helpers. It handles the user-space breakpoints and kernel breakpoints in front of arch implementation. One can choose kernel wide breakpoints using the following helpers and passing them a generic struct hw_breakpoint: - register_kernel_hw_breakpoint() - unregister_kernel_hw_breakpoint() - modify_kernel_hw_breakpoint() On the other side, you can choose per task breakpoints. - register_user_hw_breakpoint() - unregister_user_hw_breakpoint() - modify_user_hw_breakpoint() [ fweisbec@gmail.com: fix conflict against perfcounter ] Original-patch-by: Alan Stern Signed-off-by: K.Prasad Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker commit b332828c39326b1dca617f387dd15d12e81cd5f0 Author: K.Prasad Date: Mon Jun 1 23:43:10 2009 +0530 hw-breakpoints: prepare the code for Hardware Breakpoint interfaces The generic hardware breakpoint interface provides an abstraction of hardware breakpoints in front of specific arch implementations for both kernel and user side breakpoints. This includes execution breakpoints and read/write breakpoints, also known as "watchpoints". This patch introduces header files containing constants, structure definitions and declaration of functions used by the hardware breakpoint core and x86 specific code. It also introduces an array based storage for the debug-register values in 'struct thread_struct', while modifying all users of debugreg member in the structure. 
[ Impact: add headers for new hardware breakpoint interface ] Original-patch-by: Alan Stern Signed-off-by: K.Prasad Reviewed-by: Alan Stern Signed-off-by: Frederic Weisbecker