summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/Makefile
AgeCommit message (Collapse)Author
2019-11-12x86/cpu: Add a "tsx=" cmdline option with TSX disabled by defaultPawan Gupta
commit 95c5824f75f3ba4c9e8e5a4b1a623c95390ac266 upstream. Add a kernel cmdline parameter "tsx" to control the Transactional Synchronization Extensions (TSX) feature. On CPUs that support TSX control, use "tsx=on|off" to enable or disable TSX. Not specifying this option is equivalent to "tsx=off". This is because on certain processors TSX may be used as a part of a speculative side channel attack. Carve out the TSX controlling functionality into a separate compilation unit because TSX is a CPU feature while the TSX async abort control machinery will go to cpu/bugs.c. [ bp: - Massage, shorten and clear the arg buffer. - Clarifications of the tsx= possible options - Josh. - Expand on TSX_CTRL availability - Pawan. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-10x86 / CPU: Always show current CPU frequency in /proc/cpuinfoRafael J. Wysocki
commit 7d5905dc14a87805a59f3c5bf70173aac2bb18f8 upstream. After commit 890da9cf0983 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") the "cpu MHz" number in /proc/cpuinfo on x86 can be either the nominal CPU frequency (which is constant) or the frequency most recently requested by a scaling governor in cpufreq, depending on the cpufreq configuration. That is somewhat inconsistent and is different from what it was before 4.13, so in order to restore the previous behavior, make it report the current CPU frequency like the scaling_cur_freq sysfs file in cpufreq. To that end, modify the /proc/cpuinfo implementation on x86 to use aperfmperf_snapshot_khz() to snapshot the APERF and MPERF feedback registers, if available, and use their values to compute the CPU frequency to be reported as "cpu MHz". However, do that carefully enough to avoid accumulating delays that lead to unacceptable access times for /proc/cpuinfo on systems with many CPUs. Run aperfmperf_snapshot_khz() once on all CPUs asynchronously at the /proc/cpuinfo open time, add a single delay upfront (if necessary) at that point and simply compute the current frequency while running show_cpuinfo() for each individual CPU. Also, to avoid slowing down /proc/cpuinfo accesses too much, reduce the default delay between consecutive APERF and MPERF reads to 10 ms, which should be sufficient to get large enough numbers for the frequency computation in all cases. Fixes: 890da9cf0983 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-25x86/cpuid: Add generic table for CPUID dependenciesAndi Kleen
commit 0b00de857a648dafe7020878c7a27cf776f5edf4 upstream. Some CPUID features depend on other features. Currently it's possible to to clear dependent features, but not clear the base features, which can cause various interesting problems. This patch implements a generic table to describe dependencies between CPUID features, to be used by all code that clears CPUID. Some subsystems (like XSAVE) had an own implementation of this, but it's better to do it all in a single place for everyone. Then clear_cpu_cap and setup_clear_cpu_cap always look up this table and clear all dependencies too. This is intended to be a practical table: only for features that make sense to clear. If someone for example clears FPU, or other features that are essentially part of the required base feature set, not much is going to work. Handling that is right now out of scope. We're only handling features which can be usefully cleared. Signed-off-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jonathan McDowell <noodles@earth.li> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171013215645.23166-3-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-10Revert "x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo"Linus Torvalds
This reverts commit 941f5f0f6ef5338814145cf2b813cf1f98873e2f. Sadly, it turns out that we really can't just do the cross-CPU IPI to all CPU's to get their proper frequencies, because it's much too expensive on systems with lots of cores. So we'll have to revert this for now, and revisit it using a smarter model (probably doing one system-wide IPI at open time, and doing all the frequency calculations in parallel). Reported-by: WANG Chao <chao.wang@ucloud.cn> Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-03x86: CPU: Fix up "cpu MHz" in /proc/cpuinfoRafael J. Wysocki
Commit 890da9cf0983 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") is not sufficient to restore the previous behavior of "cpu MHz" in /proc/cpuinfo on x86 due to some changes made after the commit it has reverted. To address this, make the code in question use arch_freq_get_on_cpu() which also is used by cpufreq for reporting the current frequency of CPUs and since that function doesn't really depend on cpufreq in any way, drop the CONFIG_CPU_FREQ dependency for the object file containing it. Also refactor arch_freq_get_on_cpu() somewhat to avoid IPIs and return cached values right away if it is called very often over a short time (to prevent user space from triggering IPI storms through it). Fixes: 890da9cf0983 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") Cc: stable@kernel.org # 4.13 - together with 890da9cf0983 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-01x86/intel_rdt: Prepare for RDT monitor data supportVikas Shivappa
Rename the intel_rdt_schemata file to intel_rdt_ctrlmondata as we now want to add support for RDT monitoring data for the events that are supported in later patches. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-19-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt/cqm: Add RDT monitoring initializationVikas Shivappa
Add common data structures for RDT resource monitoring and perform RDT monitoring related data structure initializations which include setting up the RMID(Resource monitoring ID) lists and event list which the resource supports. [ tony: some cleanup to make adding MBM easier later, remove "cqm" from some names, make some data structure local to intel_rdt_monitor.c static. Add copyright header] [ tglx: Made it readable ] Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-9-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt: Introduce a common compile option for RDTVikas Shivappa
We currently have a CONFIG_RDT_A which is for RDT(Resource directory technology) allocation based resctrl filesystem interface. As a preparation to add support for RDT monitoring as well into the same resctrl filesystem, change the config option to be CONFIG_RDT which would include both RDT allocation and monitoring code. No functional change. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-4-git-send-email-vikas.shivappa@linux.intel.com
2017-06-27x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERFLen Brown
The goal of this change is to give users a uniform and meaningful result when they read /sys/...cpufreq/scaling_cur_freq on modern x86 hardware, as compared to what they get today. Modern x86 processors include the hardware needed to accurately calculate frequency over an interval -- APERF, MPERF, and the TSC. Here we provide an x86 routine to make this calculation on supported hardware, and use it in preference to any driver driver-specific cpufreq_driver.get() routine. MHz is computed like so: MHz = base_MHz * delta_APERF / delta_MPERF MHz is the average frequency of the busy processor over a measurement interval. The interval is defined to be the time between successive invocations of aperfmperf_khz_on_cpu(), which are expected to to happen on-demand when users read sysfs attribute cpufreq/scaling_cur_freq. As with previous methods of calculating MHz, idle time is excluded. base_MHz above is from TSC calibration global "cpu_khz". This x86 native method to calculate MHz returns a meaningful result no matter if P-states are controlled by hardware or firmware and/or if the Linux cpufreq sub-system is or is-not installed. When this routine is invoked more frequently, the measurement interval becomes shorter. However, the code limits re-computation to 10ms intervals so that average frequency remains meaningful. Discerning users are encouraged to take advantage of the turbostat(8) utility, which can gracefully handle concurrent measurement intervals of arbitrary length. Signed-off-by: Len Brown <len.brown@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-22Merge branch 'x86-cache-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache allocation interface from Thomas Gleixner: "This provides support for Intel's Cache Allocation Technology, a cache partitioning mechanism. The interface is odd, but the hardware interface of that CAT stuff is odd as well. We tried hard to come up with an abstraction, but that only allows rather simple partitioning, but no way of sharing and dealing with the per package nature of this mechanism. In the end we decided to expose the allocation bitmaps directly so all combinations of the hardware can be utilized. There are two ways of associating a cache partition: - Task A task can be added to a resource group. It uses the cache partition associated to the group. - CPU All tasks which are not member of a resource group use the group to which the CPU they are running on is associated with. That allows for simple CPU based partitioning schemes. The main expected user sare: - Virtualization so a VM can only trash only the associated part of the cash w/o disturbing others - Real-Time systems to seperate RT and general workloads. - Latency sensitive enterprise workloads - In theory this also can be used to protect against cache side channel attacks" [ Intel RDT is "Resource Director Technology". The interface really is rather odd and very specific, which delayed this pull request while I was thinking about it. The pull request itself came in early during the merge window, I just delayed it until things had calmed down and I had more time. But people tell me they'll use this, and the good news is that it is _so_ specific that it's rather independent of anything else, and no user is going to depend on the interface since it's pretty rare. So if push comes to shove, we can just remove the interface and nothing will break ] * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) x86/intel_rdt: Implement show_options() for resctrlfs x86/intel_rdt: Call intel_rdt_sched_in() with preemption disabled x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount x86/intel_rdt: Fix setting of closid when adding CPUs to a group x86/intel_rdt: Update percpu closid immeditately on CPUs affected by changee x86/intel_rdt: Reset per cpu closids on unmount x86/intel_rdt: Select KERNFS when enabling INTEL_RDT_A x86/intel_rdt: Prevent deadlock against hotplug lock x86/intel_rdt: Protect info directory from removal x86/intel_rdt: Add info files to Documentation x86/intel_rdt: Export the minimum number of set mask bits in sysfs x86/intel_rdt: Propagate error in rdt_mount() properly x86/intel_rdt: Add a missing #include MAINTAINERS: Add maintainer for Intel RDT resource allocation x86/intel_rdt: Add scheduler hook x86/intel_rdt: Add schemata file x86/intel_rdt: Add tasks files x86/intel_rdt: Add cpus file x86/intel_rdt: Add mkdir to resctrl file system x86/intel_rdt: Add "info" files to resctrl file system ...
2016-10-30x86/intel_rdt: Add schemata fileTony Luck
Last of the per resource group files. Also mode 0644. This one shows the resources available to the group. Syntax depends on whether the "cdp" mount option was given. With code/data prioritization disabled it is simply a list of masks for each cache domain. Initial value allows access to all of the L3 cache on all domains. E.g. on a 2 socket Broadwell: L3:0=fffff;1=fffff With CDP enabled, separate masks for data and instructions are provided: L3DATA:0=fffff;1=fffff L3CODE:0=fffff;1=fffff Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com> Cc: "Shaohua Li" <shli@fb.com> Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: "Stephane Eranian" <eranian@google.com> Cc: "Dave Hansen" <dave.hansen@intel.com> Cc: "David Carrillo-Cisneros" <davidcc@google.com> Cc: "Nilay Vaish" <nilayvaish@gmail.com> Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com> Cc: "Ingo Molnar" <mingo@elte.hu> Cc: "Borislav Petkov" <bp@suse.de> Cc: "H. Peter Anvin" <h.peter.anvin@intel.com> Link: http://lkml.kernel.org/r/1477692289-37412-9-git-send-email-fenghua.yu@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-30x86/intel_rdt: Add basic resctrl filesystem supportFenghua Yu
Use kernfs as basis for our user interface filesystem. This patch supports mount/umount, and one mount parameter "cdp" to enable code/data prioritization (though all we do at this point is ensure that the system can support CDP). The file system is not populated yet in this patch. [ tglx: Fixed up a few nits and added cdp handling in case of error ] Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com> Cc: "Tony Luck" <tony.luck@intel.com> Cc: "Shaohua Li" <shli@fb.com> Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: "Stephane Eranian" <eranian@google.com> Cc: "Dave Hansen" <dave.hansen@intel.com> Cc: "David Carrillo-Cisneros" <davidcc@google.com> Cc: "Nilay Vaish" <nilayvaish@gmail.com> Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com> Cc: "Ingo Molnar" <mingo@elte.hu> Cc: "Borislav Petkov" <bp@suse.de> Cc: "H. Peter Anvin" <h.peter.anvin@intel.com> Link: http://lkml.kernel.org/r/1477692289-37412-4-git-send-email-fenghua.yu@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-26x86/intel_rdt: Add CONFIG, Makefile, and basic initializationFenghua Yu
Introduce CONFIG_INTEL_RDT_A (default: no, dependent on CPU_SUP_INTEL) to control inclusion of Resource Director Technology in the build. Simple init() routine just checks which features are present. If they are pr_info() one line summary for each feature for now. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com> Cc: "Tony Luck" <tony.luck@intel.com> Cc: "David Carrillo-Cisneros" <davidcc@google.com> Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: "Stephane Eranian" <eranian@google.com> Cc: "Dave Hansen" <dave.hansen@intel.com> Cc: "Shaohua Li" <shli@fb.com> Cc: "Nilay Vaish" <nilayvaish@gmail.com> Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com> Cc: "Ingo Molnar" <mingo@elte.hu> Cc: "Borislav Petkov" <bp@suse.de> Cc: "H. Peter Anvin" <h.peter.anvin@intel.com> Link: http://lkml.kernel.org/r/1477142405-32078-7-git-send-email-fenghua.yu@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-25x86/cpu: Merge bugs.c and bugs_64.cBorislav Petkov
Should be easier when following boot paths. It probably is a left over from the x86 unification eons ago. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161024173844.23038-3-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-22kernel: add kcov code coverageDmitry Vyukov
kcov provides code coverage collection for coverage-guided fuzzing (randomized testing). Coverage-guided fuzzing is a testing technique that uses coverage feedback to determine new interesting inputs to a system. A notable user-space example is AFL (http://lcamtuf.coredump.cx/afl/). However, this technique is not widely used for kernel testing due to missing compiler and kernel support. kcov does not aim to collect as much coverage as possible. It aims to collect more or less stable coverage that is function of syscall inputs. To achieve this goal it does not collect coverage in soft/hard interrupts and instrumentation of some inherently non-deterministic or non-interesting parts of kernel is disbled (e.g. scheduler, locking). Currently there is a single coverage collection mode (tracing), but the API anticipates additional collection modes. Initially I also implemented a second mode which exposes coverage in a fixed-size hash table of counters (what Quentin used in his original patch). I've dropped the second mode for simplicity. This patch adds the necessary support on kernel side. The complimentary compiler support was added in gcc revision 231296. We've used this support to build syzkaller system call fuzzer, which has found 90 kernel bugs in just 2 months: https://github.com/google/syzkaller/wiki/Found-Bugs We've also found 30+ bugs in our internal systems with syzkaller. Another (yet unexplored) direction where kcov coverage would greatly help is more traditional "blob mutation". For example, mounting a random blob as a filesystem, or receiving a random blob over wire. Why not gcov. Typical fuzzing loop looks as follows: (1) reset coverage, (2) execute a bit of code, (3) collect coverage, repeat. A typical coverage can be just a dozen of basic blocks (e.g. an invalid input). In such context gcov becomes prohibitively expensive as reset/collect coverage steps depend on total number of basic blocks/edges in program (in case of kernel it is about 2M). Cost of kcov depends only on number of executed basic blocks/edges. On top of that, kernel requires per-thread coverage because there are always background threads and unrelated processes that also produce coverage. With inlined gcov instrumentation per-thread coverage is not possible. kcov exposes kernel PCs and control flow to user-space which is insecure. But debugfs should not be mapped as user accessible. Based on a patch by Quentin Casasnovas. [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode'] [akpm@linux-foundation.org: unbreak allmodconfig] [akpm@linux-foundation.org: follow x86 Makefile layout standards] Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: syzkaller <syzkaller@googlegroups.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Tavis Ormandy <taviso@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@google.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: David Drysdale <drysdale@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "This is another big update. Main changes are: - lots of x86 system call (and other traps/exceptions) entry code enhancements. In particular the complex parts of the 64-bit entry code have been migrated to C code as well, and a number of dusty corners have been refreshed. (Andy Lutomirski) - vDSO special mapping robustification and general cleanups (Andy Lutomirski) - cpufeature refactoring, cleanups and speedups (Borislav Petkov) - lots of other changes ..." * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) x86/cpufeature: Enable new AVX-512 features x86/entry/traps: Show unhandled signal for i386 in do_trap() x86/entry: Call enter_from_user_mode() with IRQs off x86/entry/32: Change INT80 to be an interrupt gate x86/entry: Improve system call entry comments x86/entry: Remove TIF_SINGLESTEP entry work x86/entry/32: Add and check a stack canary for the SYSENTER stack x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed x86/entry: Vastly simplify SYSENTER TF (single-step) handling x86/entry/traps: Clear DR6 early in do_debug() and improve the comment x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions x86/entry/32: Restore FLAGS on SYSEXIT x86/entry/32: Filter NT and speed up AC filtering in SYSENTER x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test selftests/x86: In syscall_nt, test NT|TF as well x86/asm-offsets: Remove PARAVIRT_enabled x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled uprobes: __create_xol_area() must nullify xol_mapping.fault x86/cpufeature: Create a new synthetic cpu capability for machine check recovery ...
2016-02-17perf/x86: Move perf_event_msr.c .............. => x86/events/msr.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-17-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-16-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-15-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-14-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_uncore_snbep.c => ↵Borislav Petkov
x86/events/intel/uncore_snbep.c Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-13-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_uncore_snb.c => x86/events/intel/uncore_snb.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-12-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_uncore_nhmex.c => ↵Borislav Petkov
x86/events/intel/uncore_nmhex.c Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-11-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch]Borislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-10-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-9-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch]Borislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-8-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-6-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.cBorislav Petkov
Start moving the Intel bits. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09perf/x86: Move perf_event_amd_uncore.c .... => x86/events/amd/uncore.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1454947748-28629-6-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09perf/x86: Move perf_event_amd_iommu.[ch] .. => x86/events/amd/iommu.[ch]Borislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1454947748-28629-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09perf/x86: Move perf_event_amd_ibs.c ....... => x86/events/amd/ibs.cBorislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1454947748-28629-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09perf/x86: Move perf_event_amd.c ........... => x86/events/amd/core.cBorislav Petkov
We distribute those in vendor subdirs, starting with .../events/amd/. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1454947748-28629-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-09perf/x86: Move perf_event.c ............... => x86/events/core.cBorislav Petkov
Also, keep the churn at minimum by adjusting the include "perf_event.h" when each file gets moved. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1454947748-28629-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-30x86/cpufeature: Carve out X86_FEATURE_*Borislav Petkov
Move them to a separate header and have the following dependency: x86/cpufeatures.h <- x86/processor.h <- x86/cpufeature.h This makes it easier to use the header in asm code and not include the whole cpufeature.h and add guards for asm. Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1453842730-28463-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06perf/x86: Add Intel cstate PMUs supportKan Liang
This patch adds new PMUs to support cstate related free running (read-only) counters. These counters may be used simultaneously by other tools, such as turbostat. However, it still make sense to implement them in perf. Because we can conveniently collect them together with other events, and allow to use them from tools without special MSR access code. These counters include CORE_C*_RESIDENCY and PKG_C*_RESIDENCY. According to counters' scope and category, two PMUs are registered with the perf_event core subsystem. - 'cstate_core': The counter is available for each physical core. The counters include CORE_C*_RESIDENCY. - 'cstate_pkg': The counter is available for each physical package. The counters include PKG_C*_RESIDENCY. The events are exposed in sysfs for use by perf stat and other tools. The files are: /sys/devices/cstate_core/events/c*-residency /sys/devices/cstate_pkg/events/c*-residency These events only support system-wide mode counting. The /sys/devices/cstate_*/cpumask file can be used by tools to figure out which CPUs to monitor by default. The PMU type (attr->type) is dynamically allocated and is available from /sys/devices/core_misc/type and /sys/device/cstate_*/type. Sampling is not supported. Here is an example. - To caculate the fraction of time when the core is running in C6 state CORE_C6_time% = CORE_C6_RESIDENCY / TSC # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5 11838820015,,cstate_core/c6-residency/,5175919658,100.00 11877130740,,msr/tsc/,5175922010,100.00 For sleep, 99.7% of time we ran in C6 state. # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 busyloop 1253316,,cstate_core/c6-residency/,4360969154,100.00 10012635248,,msr/tsc/,4360972366,100.00 For busyloop, 0.01% of time we ran in C6 state. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1443443404-8581-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-04perf/x86: Add an MSR PMU driverAndy Lutomirski
This patch adds an MSR PMU to support free running MSR counters. Such as time and freq related counters includes TSC, IA32_APERF, IA32_MPERF and IA32_PPERF, but also SMI_COUNT. The events are exposed in sysfs for use by perf stat and other tools. The files are under /sys/devices/msr/events/ Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kan Liang <kan.liang@intel.com> [ s/freq/msr/, added SMI_COUNT, fixed bugs. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: adrian.hunter@intel.com Cc: dsahern@gmail.com Cc: eranian@google.com Cc: jolsa@kernel.org Cc: mark.rutland@arm.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1437407346-31186-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02perf/x86/intel/bts: Add BTS PMU driverAlexander Shishkin
Add support for Branch Trace Store (BTS) via kernel perf event infrastructure. The difference with the existing implementation of BTS support is that this one is a separate PMU that exports events' trace buffers to userspace by means of AUX area of the perf buffer, which is zero-copy mapped into userspace. The immediate benefit is that the buffer size can be much bigger, resulting in fewer interrupts and no kernel side copying is involved and little to no trace data loss. Also, kernel code can be traced with this driver. The old way of collecting BTS traces still works. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kaixu Xia <kaixu.xia@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Robert Richter <rric@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@infradead.org Cc: adrian.hunter@intel.com Cc: kan.liang@intel.com Cc: markus.t.metzger@intel.com Cc: mathieu.poirier@linaro.org Link: http://lkml.kernel.org/r/1422614435-114702-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02perf/x86/intel/pt: Add Intel PT PMU driverAlexander Shishkin
Add support for Intel Processor Trace (PT) to kernel's perf events. PT is an extension of Intel Architecture that collects information about software execuction such as control flow, execution modes and timings and formats it into highly compressed binary packets. Even being compressed, these packets are generated at hundreds of megabytes per second per core, which makes it impractical to decode them on the fly in the kernel. This driver exports trace data by through AUX space in the perf ring buffer, which is zero-copy mapped into userspace for faster data retrieval. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kaixu Xia <kaixu.xia@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Robert Richter <rric@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@infradead.org Cc: adrian.hunter@intel.com Cc: kan.liang@intel.com Cc: markus.t.metzger@intel.com Cc: mathieu.poirier@linaro.org Link: http://lkml.kernel.org/r/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23perf/x86/intel: Fix Makefile to actually build the cqm driverMatt Fleming
Someone fat fingered a merge conflict and lost the Makefile hunk. Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <acme@redhat.com> Cc: <hpa@zytor.com> Cc: <jolsa@redhat.com> Cc: <kanaka.d.juvva@intel.com> Cc: <tglx@linutronix.de> Cc: <torvalds@linux-foundation.org> Cc: <vikas.shivappa@linux.intel.com> Link: http://lkml.kernel.org/r/1424976420.15321.35.camel@mfleming-mobl1.ger.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-23x86/build: Clean auto-generated processor feature filesBjørn Mork
Commit 9def39be4e96 ("x86: Support compiling out human-friendly processor feature names") made two source file targets conditional. Such conditional targets will not be cleaned automatically by make mrproper. Fix by adding explicit clean-files targets for the two files. Fixes: 9def39be4e96 ("x86: Support compiling out human-friendly processor feature names") Signed-off-by: Bjørn Mork <bjorn@mork.no> Cc: Josh Triplett <josh@joshtriplett.org> Link: http://lkml.kernel.org/r/1419335863-10608-1-git-send-email-bjorn@mork.no Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28perf/x86: Fix compile warnings for intel_uncorePeter Zijlstra
The uncore drivers require PCI and generate compile time warnings when !CONFIG_PCI. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Stephane Eranian <eranian@google.com> Cc: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-13Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side updates: - Fix and enhance poll support (Jiri Olsa) - Re-enable inheritance optimization (Jiri Olsa) - Enhance Intel memory events support (Stephane Eranian) - Refactor the Intel uncore driver to be more maintainable (Zheng Yan) - Enhance and fix Intel CPU and uncore PMU drivers (Peter Zijlstra, Andi Kleen) - [ plus various smaller fixes/cleanups ] User visible tooling updates: - Add +field argument support for --field option, so that one can add fields to the default list of fields to show, ie now one can just do: perf report --fields +pid And the pid will appear in addition to the default fields (Jiri Olsa) - Add +field argument support for --sort option (Jiri Olsa) - Honour -w in the report tools (report, top), allowing to specify the widths for the histogram entries columns (Namhyung Kim) - Properly show submicrosecond times in 'perf kvm stat' (Christian Borntraeger) - Add beautifier for mremap flags param in 'trace' (Alex Snast) - perf script: Allow callchains if any event samples them - Don't truncate Intel style addresses in 'annotate' (Alex Converse) - Allow profiling when kptr_restrict == 1 for non root users, kernel samples will just remain unresolved (Andi Kleen) - Allow configuring default options for callchains in config file (Namhyung Kim) - Support operations for shared futexes. (Davidlohr Bueso) - "perf kvm stat report" improvements by Alexander Yarygin: - Save pid string in opts.target.pid - Enable the target.system_wide flag - Unify the title bar output - [ plus lots of other fixes and small improvements. ] Tooling infrastructure changes: - Refactor unit and scale function parameters for PMU parsing routines (Matt Fleming) - Improve DSO long names lookup with rbtree, resulting in great speedup for workloads with lots of DSOs (Waiman Long) - We were not handling POLLHUP notifications for event file descriptors Fix it by filtering entries in the events file descriptor array after poll() returns, refcounting mmaps so that when the last fd pointing to a perf mmap goes away we do the unmap (Arnaldo Carvalho de Melo) - Intel PT prep work, from Adrian Hunter, including: - Let a user specify a PMU event without any config terms - Add perf-with-kcore script - Let default config be defined for a PMU - Add perf_pmu__scan_file() - Add a 'perf test' for tracking with sched_switch - Add 'flush' callback to scripting API - Use ring buffer consume method to look like other tools (Arnaldo Carvalho de Melo) - hists browser (used in top and report) refactorings, getting rid of unused variables and reducing source code size by handling similar cases in a fewer functions (Namhyung Kim). - Replace thread unsafe strerror() with strerror_r() accross the whole tools/perf/ tree (Masami Hiramatsu) - Rename ordered_samples to ordered_events and allow setting a queue size for ordering events (Jiri Olsa) - [ plus lots of fixes, cleanups and other improvements ]" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (198 commits) perf/x86: Tone down kernel messages when the PMU check fails in a virtual environment perf/x86/intel/uncore: Fix minor race in box set up perf record: Fix error message for --filter option not coming after tracepoint perf tools: Fix build breakage on arm64 targets perf symbols: Improve DSO long names lookup speed with rbtree perf symbols: Encapsulate dsos list head into struct dsos perf bench futex: Sanitize -q option in requeue perf bench futex: Support operations for shared futexes perf trace: Fix mmap return address truncation to 32-bit perf tools: Refactor unit and scale function parameters perf tools: Fix line number in the config file error message perf tools: Convert {record,top}.call-graph option to call-graph.record-mode perf tools: Introduce perf_callchain_config() perf callchain: Move some parser functions to callchain.c perf tools: Move callchain config from record_opts to callchain_param perf hists browser: Fix callchain print bug on TUI perf tools: Use ACCESS_ONCE() instead of volatile cast perf tools: Modify error code for when perf_session__new() fails perf tools: Fix perf record as non root with kptr_restrict == 1 perf stat: Fix --per-core on multi socket systems ...
2014-08-17x86: Support compiling out human-friendly processor feature namesJosh Triplett
The table mapping CPUID bits to human-readable strings takes up a non-trivial amount of space, and only exists to support /proc/cpuinfo and a couple of kernel messages. Since programs depend on the format of /proc/cpuinfo, force inclusion of the table when building with /proc support; otherwise, support omitting that table to save space, in which case the kernel messages will print features numerically instead. In addition to saving 1408 bytes out of vmlinux, this also saves 1373 bytes out of the uncompressed setup code, which contributes directly to the size of bzImage. Signed-off-by: Josh Triplett <josh@joshtriplett.org>
2014-08-17x86: Drop support for /proc files when !CONFIG_PROC_FSJosh Triplett
arch/x86/kernel/cpu/proc.c only exists to support files in /proc; omit that file when compiling without CONFIG_PROC_FS. Saves 645 additional bytes on 32-bit x86 when !CONFIG_PROC_FS: add/remove: 0/5 grow/shrink: 0/0 up/down: 0/-645 (-645) function old new delta c_stop 1 - -1 c_next 11 - -11 cpuinfo_op 16 - -16 c_start 24 - -24 show_cpuinfo 593 - -593 Signed-off-by: Josh Triplett <josh@joshtriplett.org>
2014-08-13perf/x86/uncore: move NHM-EX/WSM-EX specific code to seperate fileYan, Zheng
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1406704935-27708-4-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>