summaryrefslogtreecommitdiff
path: root/arch/arc/kernel/perf_event.c
AgeCommit message (Collapse)Author
2019-01-17ARC: perf: avoid kernel killing where it is possibleEugeniy Paltsev
No, not gonna die tonight. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-01-17ARC: perf: move HW events mapping to separate functionEugeniy Paltsev
Move HW events mapping to separate function to make code more readable. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-01-17ARC: perf: introduce Kernel PMU events supportEugeniy Paltsev
Export all available ARC architected hardware events as kernel PMU events to make non-generic events accessible. ARC PMU HW allow us to read the list of all available events names. So we generate kernel PMU event list dynamically in arc_pmu_device_probe() using human-readable events names we got from HW instead of using pre-defined events list. -------------------------->8-------------------------- $ perf list [snip] arc_pmu/bdata64/ [Kernel PMU event] arc_pmu/bdcstall/ [Kernel PMU event] arc_pmu/bdslot/ [Kernel PMU event] arc_pmu/bfbmp/ [Kernel PMU event] arc_pmu/bfirqex/ [Kernel PMU event] arc_pmu/bflgstal/ [Kernel PMU event] arc_pmu/bflush/ [Kernel PMU event] -------------------------->8-------------------------- Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-01-17ARC: perf: trivial code cleanupEugeniy Paltsev
* Use BIT(), lower_32_bits(), upper_32_bits() macroses, fix code style violations. * Use u32, u64, s64 instead of uint32_t, uint64_t, int64_t * Fix description comment as this code doesn't belong only to ARC700 anymore. * Use SPDX License Identifier. * Remove useless ifdefs. ifdef around 'arc_pmu_match' structure declaration is useless as we refer to 'arc_pmu_match' in several places which aren't guarded with ifdef. Nevertheless 'ARC' option selects 'OF' unconditionally so we can simply get rid of this ifdef. Acked-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-11-21ARCv2: perf: optimize given that num counters <= 32Vineet Gupta
use ffz primitive which maps to ARCv2 instruction, vs. non atomic __test_and_set_bit It is unlikely if we will even have more than 32 counters, but still add a BUILD_BUG to catch that Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-11-21ARCv2: perf: tweak overflow interruptVineet Gupta
Current perf ISR loops thru all 32 counters, checking for each if it caused the interrupt. Instead only loop thru counters which actually interrupted (typically 1). Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30arc: perf: Enable generic "cache-references" and "cache-misses" eventsAlexey Brodkin
We used to live with PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_REFERENCES not specified on ARC. Those events are actually aliases to 2 cache events that we do support and so this change sets "cache-reference" and "cache-misses" events in the same way as "L1-dcache-loads" and L1-dcache-load-misses. And while at it adding debug info for cache events as well as doing a subtle fix in HW events debug info - config value is much better represented by hex so we may see not only event index but as well other control bits set (if they exist). Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-05-30Fix typosAndrea Gelmini
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-05-16perf core: Pass max stack as a perf_callchain_entry contextArnaldo Carvalho de Melo
This makes perf_callchain_{user,kernel}() receive the max stack as context for the perf_callchain_entry, instead of accessing the global sysctl_perf_event_max_stack. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-12ARCv2: perf: Ensure perf intr gets enabled on all coresVineet Gupta
This was the second perf intr issue perf sampling on multicore requires intr to be enabled on all cores. ARC perf probe code used helper arc_request_percpu_irq() which calls - request_percpu_irq() on core0 - enable_percpu_irq() on all all cores (including core0) genirq requires that request be made ahead of enable call. However if perf probe happened on non core0 (observed on a 3.18 kernel), enable would get called ahead of request, failing obviously and rendering perf intr disabled on all such cores [ 11.120000] 1 ARC perf : 8 counters (48 bits), 113 conditions, [overflow IRQ support] [ 11.130000] 1 -----> enable_percpu_irq() IRQ 20 failed [ 11.140000] 3 -----> enable_percpu_irq() IRQ 20 failed [ 11.140000] 2 -----> enable_percpu_irq() IRQ 20 failed [ 11.140000] 0 =====> request_percpu_irq() IRQ 20 [ 11.140000] 0 -----> enable_percpu_irq() IRQ 20 Fix this fragility, by calling request_percpu_irq() on whatever core calls probe (there is no requirement on which core calls this anyways) and then calling enable on each cores. Interestingly this started as invesigation of STAR 9000838902: "sporadically IRQs enabled on perf prob" which was about occassional boot spew as request_percpu_irq got called non-locally (from an IPI), and re-enabled interrupts in following path proc_mkdir -> spin_unlock_irq() which the irq work code didn't like. | ARC perf : 8 counters (48 bits), 113 conditions, [overflow IRQ support] | | BUG: failure at ../kernel/irq_work.c:135/irq_work_run_list()! | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.10-01127-g285efb8e66d1 #2 | | Stack Trace: | arc_unwind_core.constprop.1+0x94/0x104 | dump_stack+0x62/0x98 | irq_work_run_list+0xb0/0xb4 | irq_work_run+0x22/0x3c | do_IPI+0x74/0x9c | handle_irq_event_percpu+0x34/0x164 | handle_percpu_irq+0x58/0x78 | generic_handle_irq+0x1e/0x2c | arch_do_IRQ+0x3c/0x60 | ret_from_exception+0x0/0x8 Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: Alexey Brodkin <abrodkin@synopsys.com> Cc: <stable@vger.kernel.org> #4.2+ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27ARCv2: perf: Finally introduce HS perf unitVineet Gupta
With all features in place, the ARC HS pct block can now be effectively allowed to be probed/used Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27ARCv2: perf: SMP supportAlexey Brodkin
* split off pmu info into singleton and per-cpu bits * setup PMU on all cores Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27ARCv2: perf: implement exclusion of event counting in user or kernel modeAlexey Brodkin
Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27ARCv2: perf: Support sampling events using overflow interruptsAlexey Brodkin
In times of ARC 700 performance counters didn't have support of interrupt an so for ARC we only had support of non-sampling events. Put simply only "perf stat" was functional. Now with ARC HS we have support of interrupts in performance counters which this change introduces support of. ARC performance counters act in the following way in regard of interrupts generation. [1] A counter counts starting from value set in PCT_COUNT register pair [2] Once counter reaches value set in PCT_INT_CNT interrupt is raised Basic setup look like this: [1] PCT_COUNT = 0; [2] PCT_INT_CNT = __limit_value__; [3] Enable interrupts for that counter and let it run [4] Let counter reach its limit [5] Handle interrupt when it happens Note that PCT HW block is build in CPU core and so ints interrupt line (which is basically OR of all counters IRQs) is wired directly to top-level IRQC. That means do de-assert PCT interrupt it's required to reset IRQs from all counters that have reached their limit values. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27ARCv2: perf: implement "event_set_period"Alexey Brodkin
This generalization prepares for support of overflow interrupts. Hardware event counters on ARC work that way: Each counter counts from programmed start value (set in ARC_REG_PCT_COUNT) to a limit value (set in ARC_REG_PCT_INT_CNT) and once limit value is reached this timer generates an interrupt. Even though this hardware implementation allows for more flexibility, in Linux kernel we decided to mimic behavior of other architectures this way: [1] Set limit value as half of counter's max value (to allow counter to run after reaching it limit, see below for more explanation): ---------->8----------- arc_pmu->max_period = (1ULL << counter_size) / 2 - 1ULL; ---------->8----------- [2] Set start value as "arc_pmu->max_period - sample_period" and then count up to the limit Our event counters don't stop on reaching max value (the one we set in ARC_REG_PCT_INT_CNT) but continue to count until kernel explicitly stops each of them. And setting a limit as half of counter capacity is done to allow capturing of additional events in between moment when interrupt was triggered until we're actually processing PMU interrupts. That way we're trying to be more precise. For example if we count CPU cycles we keep track of cycles while running through generic IRQ handling code: [1] We set counter period as say 100_000 events of type "crun" [2] Counter reaches that limit and raises its interrupt [3] Once we get in PMU IRQ handler we read current counter value from ARC_REG_PCT_SNAP ans see there something like 105_000. If counters stop on reaching a limit value then we would miss additional 5000 cycles. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27ARC: perf: cap the number of counters to hardware max of 32Vineet Gupta
The number of counters in PCT can never be more than 32 (while countable conditions could be 100+) for both ARCompact and ARCv2 And while at it update copyright dates. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20ARC: add/fix some comments in code - no functional changeVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19ARC: perf: Remove unnecessary local variableTobias Klauser
Directly return the result of perf_pmu_register() in arc_pmu_device_probe() instead of assigning and returning variable ret. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-19arc: fix use of uninitialized arc_pmuMax Filippov
static arc_pmu in the arch/arc/kernel/perf_event.c is not initialized as it's shadowed by a local variable of the same name in the arc_pmu_device_probe. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Fixes: 03c94fcf954d "ARC: perf: make @arc_pmu static global" CC: <stable@vger.kernel.org> # 4.1 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: don't add code for impossible caseVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: Rename DT binding to not confuse with power mgmtVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: add user space attribution in callchainsVineet Gupta
The actual user space unwinding is more involved, so simply capture the user space PC Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: Add kernel callchain supportVineet Gupta
Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: Add some comments/debug stuffVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-04-20ARC: perf: make @arc_pmu static globalVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2014-10-13ARC: boot: cpu feature print enhancementsVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2014-06-18arc, perf: Use common PMU interrupt disabled codeVince Weaver
Transition to using the new generic PERF_PMU_CAP_NO_INTERRUPT method for failing a sampling event when no PMU interrupt is available. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Rob Herring <robh+dt@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1406150159280.16738@vincent-weaver-1.umelst.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-28ARC: [perf] Fix a few thinkosVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-15ARC: perf: ARC 700 PMU doesn't support sampling eventsMischa Jonker
The ARC 700 does not have an interrupt associated with it, and as such it cannot trigger when a counter overflows. As the counters are 48 bit, it will usually take at least 100 days before a counter overflows, so for mere counting of events, there is no problem. Sampling is not supported though. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-11-12ARC: Add perf support for ARC700 coresMischa Jonker
This adds basic perf support for ARC700 cores. Most PERF_COUNT_HW* events are supported now. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>