summaryrefslogtreecommitdiff
path: root/kernel/trace/trace.c
AgeCommit message (Collapse)Author
2014-06-10tracing: Fix leak of per cpu max data in instancesSteven Rostedt (Red Hat)
The freeing of an instance, if max data is configured, there will be per cpu data structures created. But these are not freed when the instance is deleted, which causes a memory leak. A new helper function is added that frees the individual buffers within a trace array, instead of duplicating the code. This way changes made for one are applied to the other (normal buffer vs max buffer). Link: http://lkml.kernel.org/r/87k38pbake.fsf@sejong.aot.lge.com Reported-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-10tracing: Cleanup saved_cmdlines_size changesNamhyung Kim
The recent addition of saved_cmdlines_size file had some remaining (minor - mostly coding style) issues. Fix them by passing pointer name to sizeof() and using scnprintf(). Link: http://lkml.kernel.org/p/1402384295-23680-1-git-send-email-namhyung@kernel.org Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-10ring-buffer: Check if buffer exists before pollingSteven Rostedt (Red Hat)
The per_cpu buffers are created one per possible CPU. But these do not mean that those CPUs are online, nor do they even exist. With the addition of the ring buffer polling, it assumes that the caller polls on an existing buffer. But this is not the case if the user reads trace_pipe from a CPU that does not exist, and this causes the kernel to crash. Simple fix is to check the cpu against buffer bitmask against to see if the buffer was allocated or not and return -ENODEV if it is not. More updates were done to pass the -ENODEV back up to userspace. Link: http://lkml.kernel.org/r/5393DB61.6060707@oracle.com Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06tracing: Fix memory leak on instance deletionSteven Rostedt (Red Hat)
When an instance is created, it also gets a snapshot ring buffer allocated (with minimum of pages). But when it is deleted the snapshot buffer is not. There was a helper function added to match the allocation of these ring buffers to a way to free them, but it wasn't used by the deletion of an instance. Using that helper function solves this memory leak. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06tracing: Fix leak of ring buffer data when new instances creation failsSteven Rostedt (Red Hat)
Yoshihiro Yunomae reported that the ring buffer data for a trace instance does not get properly cleaned up when it fails. He proposed a patch that manually cleaned the data up and addad a bunch of labels. The labels are not needed because all trace array is allocated with a kzalloc which initializes it to 0 and all kfree()s can take a NULL pointer and will ignore it. Adding a new helper function free_trace_buffers() that can also take null buffers to free the buffers that were allocated by allocate_trace_buffers(). Link: http://lkml.kernel.org/r/20140605223522.32311.31664.stgit@yunodevel Reported-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-05tracing: Introduce saved_cmdlines_size fileYoshihiro YUNOMAE
Introduce saved_cmdlines_size file for changing the number of saved pid-comms. saved_cmdlines currently stores 128 command names using SAVED_CMDLINES, but 'no-existing processes' names are often lost in saved_cmdlines when we read the trace data. So, by introducing saved_cmdlines_size file, we can now change the 128 command names saved to something much larger if needed. When we write a value to saved_cmdlines_size, the number of the value will be stored in pid-comm list: # echo 1024 > /sys/kernel/debug/tracing/saved_cmdlines_size Here, 1024 command names can be stored. The default number is 128 and the maximum number is PID_MAX_DEFAULT (=32768 if CONFIG_BASE_SMALL is not set). So, if we want to avoid losing any command names, we need to set 32768 to saved_cmdlines_size. We can read the maximum number of the list: # cat /sys/kernel/debug/tracing/saved_cmdlines_size 128 Link: http://lkml.kernel.org/p/20140605012427.22115.16173.stgit@yunodevel Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-03tracing: Eliminate double free on failure of allocation on boot upYoshihiro YUNOMAE
If allocation of the max_buffer fails on boot up, the error path will free both per_cpu data structures from the buffers. With the new redesign of the code, those structures are freed if allocations failed. That is, the helper function that allocates the buffers will free the per cpu data on failure. No need to do it again. In fact, the second free will cause a bug as the code can not handle a double free. Link: http://lkml.kernel.org/p/20140603042803.27308.30956.stgit@yunodevel Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-30tracing: Move locking of trace_cmdline_lock into start/stop seq callsSteven Rostedt (Red Hat)
With the conversion of the saved_cmdlines output to use seq_read, there is now a race between accessing the values of the saved_cmdlines and the writing to them. The trace_cmdline_lock needs to be taken at the start and stop of the seq calls. A new __trace_find_cmdline() call is created to allow for the look up to happen without taking the lock. Fixes: 42584c81c5ad tracing: Have saved_cmdlines use the seq_read infrastructure Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-30tracing: Try again for saved cmdline if failed due to lockingSteven Rostedt (Red Hat)
In order to prevent the saved cmdline cache from being filled when tracing is not active, the comms are only recorded after a trace event is recorded. The problem is, a comm can fail to be recorded if the trace_cmdline_lock is held. That lock is taken via a trylock to allow it to happen from any context (including NMI). If the lock fails to be taken, the comm is skipped. No big deal, as we will try again later. But! Because of the code that was added to only record after an event, we may not try again later as the recording is made as a oneshot per event per CPU. Only disable the recording of the comm if the comm is actually recorded. Fixes: 7ffbd48d5cab "tracing: Cache comms only after an event occurred" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-29tracing: Have saved_cmdlines use the seq_read infrastructureYoshihiro YUNOMAE
Current tracing_saved_cmdlines_read() implementation is naive; It allocates a large buffer, constructs output data to that buffer for each read operation, and then copies a portion of the buffer to the user space buffer. This has several issues such as slow memory allocation, high CPU usage, and even corruption of the output data. The seq_read infrastructure is made to handle this type of work. By converting it to use seq_read() the code becomes smaller, simplified, as well as correct. Link: http://lkml.kernel.org/p/20140220084431.3839.51793.stgit@yunodevel Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-29tracing: Print nasty banner when trace_printk() is in useSteven Rostedt
trace_printk() is used to debug fast paths within the kernel. Places that gets called in any context (interrupt or NMI) or thousands of times a second. Something you do not want to do with a printk(). In order to make it completely lockless as it needs a temporary buffer to handle some of the string formatting, a page is created per cpu for every context (four per cpu; normal, softirq, irq, NMI). Since trace_printk() should only be used for debugging purposes, there's no reason to waste memory on these buffers on a production system. That means, trace_printk() should never be used unless a developer is debugging their kernel. There's macro magic to allocate the buffers if trace_printk() is used anywhere in the kernel. To help enforce that trace_printk() isn't used outside of development, when it is used, a nasty banner is displayed on bootup (or when a module is loaded that uses trace_printk() and the kernel core does not). Here's the banner: ********************************************************** ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** ** ** ** trace_printk() being used. Allocating extra memory. ** ** ** ** This means that this is a DEBUG kernel and it is ** ** unsafe for produciton use. ** ** ** ** If you see this message and you are not debugging ** ** the kernel, report this immediately to your vendor! ** ** ** ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** ********************************************************** That should hopefully keep developers from trying to sneak in a trace_printk() or two. Link: http://lkml.kernel.org/p/20140528131440.2283213c@gandalf.local.home Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-05tracing: Replace __get_cpu_var uses with this_cpu_ptrChristoph Lameter
Replace uses of &__get_cpu_var for address calculation with this_cpu_ptr. Link: http://lkml.kernel.org/p/alpine.DEB.2.10.1404291415560.18364@gentwo.org Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-30tracing: Remove mock up poll wait functionSteven Rostedt (Red Hat)
Now that the ring buffer has a built in way to wake up readers when there's data, using irq_work such that it is safe to do it in any context. But it was still using the old "poor man's" wait polling that checks every 1/10 of a second to see if it should wake up a waiter. This makes the latency for a wake up excruciatingly long. No need to do that anymore. Completely remove the different wait_poll types from the tracers and have them all use the default one now. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-29tracing: Break out of tracing_wait_pipe() before wait_pipe() is calledSteven Rostedt (Red Hat)
When reading from trace_pipe, if tracing is off but nothing was read it should block. If something is read and tracing is off, then EOF is returned. If tracing is on and there's nothing to read, it will block. But because the check of whether tracing is off and something was read is done after the block on the pipe, it is hit or miss if the EOF is returned or not leading to inconsistent behavior. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21tracing: Add static to local functionsFabian Frederick
This patch adds static to the following functions: -cycle_t buffer_ftrace_now -void free_snapshot -int trace_selftest_startup_dynamic_tracing Link: http://lkml.kernel.org/p/20140417214442.d7abc7c0b0e4b90e7fedecc9@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21tracing: Move ftrace_max_lock into trace_arraySteven Rostedt (Red Hat)
In preparation for having tracers enabled in instances, the max_lock should be unique as updating the max for one tracer is a separate operation than updating it for another tracer using a different max. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21tracing: Move tracing_max_latency into trace_arraySteven Rostedt (Red Hat)
In preparation for letting the latency tracers be used by instances, remove the global tracing_max_latency variable and add a max_latency field to the trace_array that the latency tracers will now use. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21ftrace: Remove global function list and call function directlySteven Rostedt (Red Hat)
Instead of having a list of global functions that are called, as only one global function is allow to be enabled at a time, there's no reason to have a list. Instead, simply have all the users of the global ops, use the global ops directly, instead of registering their own ftrace_ops. Just switch what function is used before enabling the function tracer. This removes a lot of code as well as the complexity involved with it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "The first vfs pile, with deep apologies for being very late in this window. Assorted cleanups and fixes, plus a large preparatory part of iov_iter work. There's a lot more of that, but it'll probably go into the next merge window - it *does* shape up nicely, removes a lot of boilerplate, gets rid of locking inconsistencie between aio_write and splice_write and I hope to get Kent's direct-io rewrite merged into the same queue, but some of the stuff after this point is having (mostly trivial) conflicts with the things already merged into mainline and with some I want more testing. This one passes LTP and xfstests without regressions, in addition to usual beating. BTW, readahead02 in ltp syscalls testsuite has started giving failures since "mm/readahead.c: fix readahead failure for memoryless NUMA nodes and limit readahead pages" - might be a false positive, might be a real regression..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) missing bits of "splice: fix racy pipe->buffers uses" cifs: fix the race in cifs_writev() ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure kill generic_file_buffered_write() ocfs2_file_aio_write(): switch to generic_perform_write() ceph_aio_write(): switch to generic_perform_write() xfs_file_buffered_aio_write(): switch to generic_perform_write() export generic_perform_write(), start getting rid of generic_file_buffer_write() generic_file_direct_write(): get rid of ppos argument btrfs_file_aio_write(): get rid of ppos kill the 5th argument of generic_file_buffered_write() kill the 4th argument of __generic_file_aio_write() lustre: don't open-code kernel_recvmsg() ocfs2: don't open-code kernel_recvmsg() drbd: don't open-code kernel_recvmsg() constify blk_rq_map_user_iov() and friends lustre: switch to kernel_sendmsg() ocfs2: don't open-code kernel_sendmsg() take iov_iter stuff to mm/iov_iter.c process_vm_access: tidy up a bit ...
2014-04-12missing bits of "splice: fix racy pipe->buffers uses"Al Viro
that commit has fixed only the parts of that mess in fs/splice.c itself; there had been more in several other ->splice_read() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-10tracing: Add missing function triggers dump and cpudump to READMESteven Rostedt (Red Hat)
The debugfs tracing README file lists all the function triggers except for dump and cpudump. These should be added too. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-03Merge tag 'trace-3.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "Most of the changes were largely clean ups, and some documentation. But there were a few features that were added: Uprobes now work with event triggers and multi buffers and have support under ftrace and perf. The big feature is that the function tracer can now be used within the multi buffer instances. That is, you can now trace some functions in one buffer, others in another buffer, all functions in a third buffer and so on. They are basically agnostic from each other. This only works for the function tracer and not for the function graph trace, although you can have the function graph tracer running in the top level buffer (or any tracer for that matter) and have different function tracing going on in the sub buffers" * tag 'trace-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (45 commits) tracing: Add BUG_ON when stack end location is over written tracepoint: Remove unused API functions Revert "tracing: Move event storage for array from macro to standalone function" ftrace: Constify ftrace_text_reserved tracepoints: API doc update to tracepoint_probe_register() return value tracepoints: API doc update to data argument ftrace: Fix compilation warning about control_ops_free ftrace/x86: BUG when ftrace recovery fails ftrace: Warn on error when modifying ftrace function ftrace: Remove freelist from struct dyn_ftrace ftrace: Do not pass data to ftrace_dyn_arch_init ftrace: Pass retval through return in ftrace_dyn_arch_init() ftrace: Inline the code from ftrace_dyn_table_alloc() ftrace: Cleanup of global variables ftrace_new_pgs and ftrace_update_cnt tracing: Evaluate len expression only once in __dynamic_array macro tracing: Correctly expand len expressions from __dynamic_array macro tracing/module: Replace include of tracepoint.h with jump_label.h in module.h tracing: Fix event header migrate.h to include tracepoint.h tracing: Fix event header writeback.h to include tracepoint.h tracing: Warn if a tracepoint is not set via debugfs ...
2014-04-01pipe: kill ->map() and ->unmap()Al Viro
all pipe_buffer_operations have the same instances of those... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-03-25tracing: Fix traceon trigger condition to actually turn tracing onSteven Rostedt (Red Hat)
While working on my tutorial for 2014 Linux Collaboration Summit I found that the traceon trigger did not work when conditions were used. The other triggers worked fine though. Looking into it, it is because of the way the triggers use the ring buffer to store the fields it will use for the condition. But if tracing is off, nothing is stored in the buffer, and the tracepoint exits before calling the trigger to test the condition. This is fine for all the triggers that only work when tracing is on, but for traceon trigger that is to work when tracing is off, nothing happens. The fix is simple, just use a temp ring buffer to record the event if tracing is off and the event has a trace event conditional trigger enabled. The rest of the tracepoint code will work just fine, but the tracepoint wont be recorded in the other buffers. Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Add trace_clock=<clock> kernel parameterSteven Rostedt
Being able to change the trace clock at boot can be advantageous if you need a better source of when things happen across CPUs. The default trace clock is the fastest, but it uses local clocks which may not be synced across CPUs and it does not let you know when events took place with respect to events on other CPUs. The global trace clock can help in this case, and if you do not care about timings, the counter "clock" is the best, as that is just a simple atomic counter that is incremented for every event. Usage is to add "trace_clock=counter" on the kernel command line. You can replace counter with "global" or any of the clocks listed in /sys/kernel/debug/tracing/trace_clock Suggested-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Appreciated-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20ftrace: Allow for function tracing instance to filter functionsSteven Rostedt (Red Hat)
Create a "set_ftrace_filter" and "set_ftrace_notrace" files in the instance directories to let users filter of functions to trace for the given instance. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Convert tracer->enabled to counterSteven Rostedt (Red Hat)
As tracers will soon be used by instances, the tracer enabled field needs to be converted to a counter instead of a boolean. This counter is protected by the trace_types_lock mutex. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Disable tracers before deletion of instanceSteven Rostedt (Red Hat)
When an instance is about to be deleted, make sure the tracer is set to nop. If it isn't reset the tracer and set it to the nop tracer, otherwise memory leaks and bad pointers may result. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Only let top level have option filesSteven Rostedt (Red Hat)
Currently, only the top level instance can have tracing options. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Set up infrastructure to allow tracers for instancesSteven Rostedt (Red Hat)
Currently the tracers (function, function_graph, irqsoff, etc) can only be used by the top level tracing directory (not for instances). This sets up the infrastructure to allow instances to be able to run a separate tracer apart from the what the top level tracing is doing. As tracers need to adapt for being used by instances, the tracers must flag if they can be used by instances or not. Currently only the 'nop' tracer can be used by all instances. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Pass trace_array to flag_changed callbackSteven Rostedt (Red Hat)
As options (flags) may affect instances instead of being global the flag_changed() callbacks need to receive the trace_array descriptor of the instance they will be modifying. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Pass trace_array to set_flag callbackSteven Rostedt (Red Hat)
As options (flags) may affect instances instead of being global the set_flag() callbacks need to receive the trace_array descriptor of the instance they will be modifying. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-23tracing: Check if tracing is enabled in trace_puts()Steven Rostedt (Red Hat)
If trace_puts() is used very early in boot up, it can crash the machine if it is called before the ring buffer is allocated. If a trace_printk() is used with no arguments, then it will be converted into a trace_puts() and suffer the same fate. Cc: stable@vger.kernel.org # 3.10+ Fixes: 09ae72348ecc "tracing: Add trace_puts() for even faster trace_printk() tracing" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-23tracing: Fix formatting of trace README fileSteven Rostedt (Red Hat)
Fix the formatting of the README file in the trace debugfs to fit in an 80 character window. Also add a comment about the event trigger counter with regards to traceon and traceoff. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-22tracing/README: Add event file usage to tracing mini-HOWTOTom Zanussi
It would be useful to have a cheat-sheet for everything under tracing/events/ alongside the existing text describing the other files in the tracing/ dir. Add short descriptions of the directories and files under events/ along with examples, similar to the existing text for the other files in tracing/. Also clean up a few minor alignment problems noticed when adding the new text. Link: http://lkml.kernel.org/r/1389993104.3040.445.camel@empanada Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-19tracing: Fix buggered tee(2) on tracing_pipeAl Viro
In kernel/trace/trace.c we have this: static void tracing_pipe_buf_release(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { __free_page(buf->page); } static const struct pipe_buf_operations tracing_pipe_buf_ops = { .can_merge = 0, .map = generic_pipe_buf_map, .unmap = generic_pipe_buf_unmap, .confirm = generic_pipe_buf_confirm, .release = tracing_pipe_buf_release, .steal = generic_pipe_buf_steal, .get = generic_pipe_buf_get, }; with void generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { page_cache_get(buf->page); } and I don't see anything that would've prevented tee(2) called on the pipe that got stuff spliced into it from that sucker. ->ops->get() will be called, then buf gets copied into target pipe's ->bufs[] and eventually readers get to both copies of the buffer. With get_page(page) look at that page __free_page(page) look at that page __free_page(page) which is not a good thing, to put it mildly. AFAICS, that ought to use the normal generic_pipe_buf_release() (aka page_cache_release(buf->page)), shouldn't it? [ SDR - As trace_pipe just allocates the page with alloc_page(GFP_KERNEL), and doesn't do anything special with it (no LRU logic). The __free_page() should be fine, as it wont actually free a page with reference count. Maybe there's a chance to leak memory? Anyway, This change is at a minimum good for being symmetric with generic_pipe_buf_get, it is fine to add. ] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [ SDR - Removed no longer used tracing_pipe_buf_release ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-14tracing: Have trace buffer point back to trace_arraySteven Rostedt (Red Hat)
The trace buffer has a descriptor pointer that goes back to the trace array. But it was never assigned. Luckily, nothing uses it (yet), but it will in the future. Although nothing currently uses this, if any of the new features get backported to older kernels, and because this is such a simple change, I'm marking it for stable too. Cc: stable@vger.kernel.org # v3.10+ Fixes: 12883efb670c "tracing: Consolidate max_tr into main trace_array structure" Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-02tracing: Add generic tracing_lseek() functionSteven Rostedt (Red Hat)
Trace event triggers added a lseek that uses the ftrace_filter_lseek() function. Unfortunately, when function tracing is not configured in that function is not defined and the kernel fails to build. This is the second time that function was added to a file ops and it broke the build due to requiring special config dependencies. Make a generic tracing_lseek() that all the tracing utilities may use. Also, modify the old ftrace_filter_lseek() to return 0 instead of 1 on WRONLY. Not sure why it was a 1 as that does not make sense. This also changes the old tracing_seek() to modify the file pos pointer on WRONLY as well. Reported-by: kbuild test robot <fengguang.wu@intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-21tracing: Add 'snapshot' event trigger commandTom Zanussi
Add 'snapshot' event_command. snapshot event triggers are added by the user via this command in a similar way and using practically the same syntax as the analogous 'snapshot' ftrace function command, but instead of writing to the set_ftrace_filter file, the snapshot event trigger is written to the per-event 'trigger' files: echo 'snapshot' > .../somesys/someevent/trigger The above command will turn on snapshots for someevent i.e. whenever someevent is hit, a snapshot will be done. This also adds a 'count' version that limits the number of times the command will be invoked: echo 'snapshot:N' > .../somesys/someevent/trigger Where N is the number of times the command will be invoked. The above command will snapshot N times for someevent i.e. whenever someevent is hit N times, a snapshot will be done. Also adds a new tracing_alloc_snapshot() function - the existing tracing_snapshot_alloc() function is a special version of tracing_snapshot() that also does the snapshot allocation - the snapshot triggers would like to be able to do just the allocation but not take a snapshot; the existing tracing_snapshot_alloc() in turn now also calls tracing_alloc_snapshot() underneath to do that allocation. Link: http://lkml.kernel.org/r/c9524dd07ce01f9dcbd59011290e0a8d5b47d7ad.1382622043.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> [ fix up from kbuild test robot <fengguang.wu@intel.com report ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-16Merge tag 'trace-3.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing update from Steven Rostedt: "This batch of changes is mostly clean ups and small bug fixes. The only real feature that was added this release is from Namhyung Kim, who introduced "set_graph_notrace" filter that lets you run the function graph tracer and not trace particular functions and their call chain. Tom Zanussi added some updates to the ftrace multibuffer tracing that made it more consistent with the top level tracing. One of the fixes for perf function tracing required an API change in RCU; the addition of "rcu_is_watching()". As Paul McKenney is pushing that change in this release too, he gave me a branch that included all the changes to get that working, and I pulled that into my tree in order to complete the perf function tracing fix" * tag 'trace-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Add rcu annotation for syscall trace descriptors tracing: Do not use signed enums with unsigned long long in fgragh output tracing: Remove unused function ftrace_off_permanent() tracing: Do not assign filp->private_data to freed memory tracing: Add helper function tracing_is_disabled() tracing: Open tracer when ftrace_dump_on_oops is used tracing: Add support for SOFT_DISABLE to syscall events tracing: Make register/unregister_ftrace_command __init tracing: Update event filters for multibuffer recordmcount.pl: Add support for __fentry__ ftrace: Have control op function callback only trace when RCU is watching rcu: Do not trace rcu_is_watching() functions ftrace/x86: skip over the breakpoint for ftrace caller trace/trace_stat: use rbtree postorder iteration helper instead of opencoding ftrace: Add set_graph_notrace filter ftrace: Narrow down the protected area of graph_lock ftrace: Introduce struct ftrace_graph_data ftrace: Get rid of ftrace_graph_filter_enabled tracing: Fix potential out-of-bounds in trace_get_user() tracing: Show more exact help information about snapshot
2013-11-11ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHEDPeter Zijlstra
Since the introduction of PREEMPT_NEED_RESCHED in: f27dde8deef3 ("sched: Add NEED_RESCHED to the preempt_count") we need to be able to look at both TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED to understand the full preemption behaviour. Add it to the trace output. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com> Link: http://lkml.kernel.org/r/20131004152826.GP3081@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06tracing: Remove unused function ftrace_off_permanent()Steven Rostedt (Red Hat)
In the past, ftrace_off_permanent() was called if something strange was detected. But the ftrace_bug() now handles all the anomolies that can happen with ftrace (function tracing), and there are no uses of ftrace_off_permanent(). Get rid of it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06tracing: Add helper function tracing_is_disabled()Geyslan G. Bem
This patch creates the function 'tracing_is_disabled', which can be used outside of trace.c. Link: http://lkml.kernel.org/r/1382141754-12155-1-git-send-email-geyslan@gmail.com Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06tracing: Open tracer when ftrace_dump_on_oops is usedCody P Schafer
With ftrace_dump_on_oops, we previously did not open the tracer in question, sometimes causing the trace output to be useless. For example, the function_graph tracer with tracing_thresh set dumped via ftrace_dump_on_oops would show a series of '}' indented at different levels, but no function names. call trace->open() (and do a few other fixups copied from the normal dump path) to make the output more intelligible. Link: http://lkml.kernel.org/r/1382554197-16961-1-git-send-email-cody@linux.vnet.ibm.com Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05tracing: Make register/unregister_ftrace_command __initTom Zanussi
register/unregister_ftrace_command() are only ever called from __init functions, so can themselves be made __init. Also make register_snapshot_cmd() __init for the same reason. Link: http://lkml.kernel.org/r/d4042c8cadb7ae6f843ac9a89a24e1c6a3099727.1382620672.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05tracing: Update event filters for multibufferTom Zanussi
The trace event filters are still tied to event calls rather than event files, which means you don't get what you'd expect when using filters in the multibuffer case: Before: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 2048 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 Setting the filter in tracing/instances/test1/events shouldn't affect the same event in tracing/events as it does above. After: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 We'd like to just move the filter directly from ftrace_event_call to ftrace_event_file, but there are a couple cases that don't yet have multibuffer support and therefore have to continue using the current event_call-based filters. For those cases, a new USE_CALL_FILTER bit is added to the event_call flags, whose main purpose is to keep the old behavior for those cases until they can be updated with multibuffer support; at that point, the USE_CALL_FILTER flag (and the new associated call_filter_check_discard() function) can go away. The multibuffer support also made filter_current_check_discard() redundant, so this change removes that function as well and replaces it with filter_check_discard() (or call_filter_check_discard() as appropriate). Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-18tracing: Fix potential out-of-bounds in trace_get_user()Steven Rostedt
Andrey reported the following report: ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3 ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3) Accessed by thread T13003: #0 ffffffff810dd2da (asan_report_error+0x32a/0x440) #1 ffffffff810dc6b0 (asan_check_region+0x30/0x40) #2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20) #3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260) #4 ffffffff812a1065 (__fput+0x155/0x360) #5 ffffffff812a12de (____fput+0x1e/0x30) #6 ffffffff8111708d (task_work_run+0x10d/0x140) #7 ffffffff810ea043 (do_exit+0x433/0x11f0) #8 ffffffff810eaee4 (do_group_exit+0x84/0x130) #9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30) #10 ffffffff81928782 (system_call_fastpath+0x16/0x1b) Allocated by thread T5167: #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0) #1 ffffffff8128337c (__kmalloc+0xbc/0x500) #2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90) #3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0) #4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40) #5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430) #6 ffffffff8129b668 (finish_open+0x68/0xa0) #7 ffffffff812b66ac (do_last+0xb8c/0x1710) #8 ffffffff812b7350 (path_openat+0x120/0xb50) #9 ffffffff812b8884 (do_filp_open+0x54/0xb0) #10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0) #11 ffffffff8129d4b7 (SyS_open+0x37/0x50) #12 ffffffff81928782 (system_call_fastpath+0x16/0x1b) Shadow bytes around the buggy address: ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap redzone: fa Heap kmalloc redzone: fb Freed heap region: fd Shadow gap: fe The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;' Although the crash happened in ftrace_regex_open() the real bug occurred in trace_get_user() where there's an incrementation to parser->idx without a check against the size. The way it is triggered is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop that reads the last character stores it and then breaks out because there is no more characters. Then the last character is read to determine what to do next, and the index is incremented without checking size. Then the caller of trace_get_user() usually nulls out the last character with a zero, but since the index is equal to the size, it writes a nul character after the allocated space, which can corrupt memory. Luckily, only root user has write access to this file. Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-09tracing: Show more exact help information about snapshotWang YanQing
The current "help" that comes out of the snapshot file when it is not allocated looks like this: # * Snapshot is freed * # # Snapshot commands: # echo 0 > snapshot : Clears and frees snapshot buffer # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated. # Takes a snapshot of the main buffer. # echo 2 > snapshot : Clears snapshot buffer (but does not allocate) # (Doesn't have to be '2' works with any number that # is not a '0' or '1') Echo 2 says that it does not allocate the buffer, which is correct, but to be more consistent with "echo 0" it should also state that it does not free. Link: http://lkml.kernel.org/r/20130914045916.GA4243@udknight Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-08-22tracing: Make tracing_cpumask available for all instancesAlexander Z Lam
Allow tracer instances to disable tracing by cpu by moving the static global tracing_cpumask into trace_array. Link: http://lkml.kernel.org/r/921622317f239bfc2283cac2242647801ef584f2.1375980149.git.azl@google.com Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Alexander Z Lam <lambchop468@gmail.com> Signed-off-by: Alexander Z Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-08-02tracing: Fix reset of time stamps during trace_clock changesAlexander Z Lam
Fixed two issues with changing the timestamp clock with trace_clock: - The global buffer was reset on instance clock changes. Change this to pass the correct per-instance buffer - ftrace_now() is used to set buf->time_start in tracing_reset_online_cpus(). This was incorrect because ftrace_now() used the global buffer's clock to return the current time. Change this to use buffer_ftrace_now() which returns the current time for the correct per-instance buffer. Also removed tracing_reset_current() because it is not used anywhere Link: http://lkml.kernel.org/r/1375493777-17261-2-git-send-email-azl@google.com Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Alexander Z Lam <lambchop468@gmail.com> Cc: stable@vger.kernel.org # 3.10 Signed-off-by: Alexander Z Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>