stapprobes(3stap) — Linux manual page


STAPPROBES(3stap)                                      STAPPROBES(3stap)

NAME

       stapprobes - systemtap probe points

DESCRIPTION

       The following sections enumerate the variety of probe points
       supported by the systemtap translator, and some of the additional
       aliases defined by standard tapset scripts.  Many are
       individually documented in the 3stap manual section, with the
       probe:: prefix.

SYNTAX

              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }

       A probe declaration may list multiple comma-separated probe
       points in order to attach a handler to all of the named events.
       Normally, the handler statements are run whenever any of the events
       occur.  Depending on the type of probe point, the handler
       statements may refer to context variables (denoted with a dollar-
       sign prefix like $foo) to read or write state.  This may include
       function parameters for function probes, or local variables for
       statement probes.
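
       As a minimal sketch of one handler attached to two probe points,
       the script below assumes that the kernel functions vfs_read and
       vfs_write exist, that both take a parameter named count, and that
       kernel debuginfo is installed.  These names are illustrative
       choices and are not taken from this page.

```systemtap
# One handler, two comma-separated probe points: the body runs whenever
# either function is entered.
probe kernel.function("vfs_read"), kernel.function("vfs_write")
{
    # $count reads the "count" parameter (assumed to exist in both functions).
    printf("%s: %s (pid %d) count=%d\n", ppfunc(), execname(), pid(), $count)
}
```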

       The syntax of a single probe point is a general dotted-symbol
       sequence.  This allows a breakdown of the event namespace into
       parts, somewhat like the Domain Name System does on the Internet.
       Each component identifier may be parametrized by a string or
       number literal, with a syntax like a function call.  A component
       may include a "*" character, to expand to a set of matching probe
       points.  It may also include "**" to match multiple sequential
       components at once.  Probe aliases likewise expand to other probe
       points.

       Probe aliases can be given on their own, or with a suffix. The
       suffix attaches to the underlying probe point that the alias is
       expanded to. For example,

              syscall.read.return.maxactive(10)

       expands to

              kernel.function("sys_read").return.maxactive(10)

       with the component maxactive(10) being recognized as a suffix.

       Normally, each and every probe point resulting from wildcard- and
       alias-expansion must be resolved to some low-level system
       instrumentation facility (e.g., a kprobe address, marker, or a
       timer configuration), otherwise the elaboration phase will fail.

       However, a probe point may be followed by a "?" character, to
       indicate that it is optional, and that no error should result if
       it fails to resolve.  Optionalness passes down through all levels
       of alias/wildcard expansion.  Alternately, a probe point may be
       followed by a "!" character, to indicate that it is both optional
       and sufficient.  (Think vaguely of the Prolog cut operator.) If
       it does resolve, then no further probe points in the same comma-
       separated list will be resolved.  Therefore, the "!"  sufficiency
       mark only makes sense in a list of probe point alternatives.

       Additionally, a probe point may be followed by an "if (expr)"
       statement, in order to enable/disable the probe point on-the-fly.
       With the "if" statement, if "expr" is false when the probe point
       is hit, the whole probe body, including any alias body, is
       skipped.  The condition is stacked up through all levels of
       alias/wildcard expansion, so the final condition becomes the
       logical-and of the conditions of all expanded aliases/wildcards.
       The expressions are necessarily restricted to global variables.

       These are all syntactically valid probe points.  (They are
       generally semantically invalid, depending on the contents of the
       tapsets, and the versions of kernel/user software installed.)

              kernel.function("no_such_function") ?
              module("awol").function("no_such_function") !
              signal.*? if (switch)
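
       The optional and sufficient markers might be combined in a script
       along the following lines.  This is only a sketch: whether any of
       the nfs* functions resolve depends on the kernel build, and the
       begin probe is added here so that at least one probe always
       resolves.

```systemtap
# Prefer functions built into the kernel; fall back to the nfs module
# only if the first alternative does not resolve.  If neither resolves,
# no error is raised.
probe kernel.function("nfs*") !, module("nfs").function("nfs*") !
{
    printf("%s\n", ppfunc())
}

# Optional probe: silently dropped if the function does not exist.
probe kernel.function("no_such_function") ?
{
    println("not expected to run on most kernels")
}

# A probe point that always resolves, so the script is never empty.
probe begin { println("probing...") }
```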

       Probes may be broadly classified into "synchronous" and
       "asynchronous".  A "synchronous" event is deemed to occur when
       any processor executes an instruction matched by the
       specification.  This gives these probes a reference point
       (instruction address) from which more contextual data may be
       available.  Other families of probe points refer to
       "asynchronous" events such as timers/counters rolling over, where
       there is no fixed reference point that is related.  Each probe
       point specification may match multiple locations (for example,
       using wildcards or aliases), and all of them are then probed.  A
       probe declaration may also contain several comma-separated
       specifications, all of which are probed.

       Brace expansion is a mechanism which allows a list of probe
       points to be generated. It is very similar to shell expansion. A
       component may be surrounded by a pair of curly braces to indicate
       that the comma-separated sequence of one or more subcomponents
       will each constitute a new probe point. The braces may be
       arbitrarily nested. The ordering of expanded results is based on
       product order.

       The question mark (?), exclamation mark (!) indicators and probe
       point conditions may not be placed in any expansions that are
       before the last component.

       The following is an example of brace expansion.

              syscall.{write,read}
              # Expands to
              syscall.write, syscall.read

              {kernel,module("nfs")}.function("nfs*")!
              # Expands to
              kernel.function("nfs*")!, module("nfs").function("nfs*")!


       Resolving some probe points requires DWARF debuginfo or "debug
       symbols" for the specific program being instrumented.  For some
       others, DWARF is automatically synthesized on the fly from source
       code header files.  For others, it is not needed at all.  Since a
       systemtap script may use any mixture of probe points together,
       the union of their DWARF requirements has to be met on the
       computer where script compilation occurs.  (See the --use-server
       option and the stap-server(8) man page for information about the
       remote compilation facility, which allows these requirements to
       be met on a different machine.)

       The following table lists many of the available probe point
       families, to classify them with respect to their need for DWARF
       debuginfo for the specific program for that probe point.

       DWARF                          NON-DWARF                    SYMBOL-TABLE

       kernel.function, .statement    kernel.mark                  kernel.function*
       module.function, .statement    process.mark, process.plt    module.function*
       process.function, .statement   begin, end, error, never     process.function*
       process.mark*                  timer
       .function.callee               perf
       python2, python3               procfs
       debuginfod                     kernel.statement.absolute
       AUTO-GENERATED-DWARF           kprobe.function
       kernel.trace                   process.statement.absolute
                                      process.begin, .end

       Probe types marked with an asterisk (*) are fallbacks, where
       systemtap can sometimes infer a subset of the information or
       substitute for it.  In general, the more symbolic / debugging
       information is available, the higher the quality of probing.

ON-THE-FLY ARMING

       The following types of probe points may be armed/disarmed on-the-
       fly to save overheads during uninteresting times.  Arming
       conditions may also be added to other types of probes, but will
       be treated as a wrapping conditional and won't benefit from
       overhead savings.

       DISARMABLE                                exceptions
       kernel.function, kernel.statement
       module.function, module.statement
       process.*.function, process.*.statement
       process.*.plt, process.*.mark
       timer.                                    timer.profile
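
       A sketch of an arming condition on a disarmable probe type
       follows.  The global names active and hits, the five-second
       toggle, and the choice of vfs_read are assumptions made for
       illustration.

```systemtap
global active = 0
global hits

# Toggle the arming condition every five seconds.
probe timer.s(5) { active = !active }

# kernel.function is a disarmable probe type, so this probe can be
# disarmed while "active" is 0 rather than merely skipping its body.
probe kernel.function("vfs_read") if (active)
{
    hits++
}

probe end { printf("vfs_read hits while armed: %d\n", hits) }
```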


       The probe points begin and end are defined by the translator to
       refer to the time of session startup and shutdown.  All "begin"
       probe handlers are run, in some sequence, during the startup of
       the session.  All global variables will have been initialized
       prior to this point.  All "end" probes are run, in some sequence,
       during the normal shutdown of a session, such as in the aftermath
       of an exit() function call, or an interruption from the user.
       In the case of an error-triggered shutdown, "end" probes are not
       run.  There are no target variables available in either context.

       If the order of execution among "begin" or "end" probes is
       significant, then an optional sequence number may be provided:

              begin(N)
              end(N)

       The number N may be positive or negative.  The probe handlers are
       run in increasing order, and the order between handlers with the
       same sequence number is unspecified.  When "begin" or "end" are
       given without a sequence, they are effectively sequence zero.

       The error probe point is similar to the end probe, except that
       each such probe handler is run when the session ends after errors
       have occurred.  In such cases, "end" probes are skipped, but each
       "error" probe is still attempted.  This kind of probe can be used
       to clean up or emit a "final gasp".  It may also be numerically
       parametrized to set a sequence.
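
       The ordering rules can be tried with a short script such as the
       following sketch; the sequence numbers -10 and 10 are arbitrary.

```systemtap
# begin handlers run in increasing sequence order.
probe begin(-10) { println("first: sequence -10") }
probe begin      { println("second: implicit sequence 0") }
probe begin(10)  { println("third: sequence 10"); exit() }

# Runs on normal shutdown, here triggered by exit().
probe end        { println("normal shutdown") }

# Runs instead of "end" when the session ends after errors.
probe error      { println("shutdown after errors") }
```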

       The probe point never is specially defined by the translator to
       mean "never".  Its probe handler is never run, though its
       statements are analyzed for symbol / type correctness as usual.
       This probe point may be useful in conjunction with optional
       probes.

       The syscall.* and nd_syscall.*  aliases define several hundred
       probes, too many to detail here.  They are of the general form:

              syscall.NAME
              nd_syscall.NAME
              syscall.NAME.return
              nd_syscall.NAME.return

       Generally, a pair of probes are defined for each normal system
       call as listed in the syscalls(2) manual page, one for entry and
       one for return.  Those system calls that never return do not have
       a corresponding .return probe.  The nd_* family of probes is
       about the same, except that it uses non-DWARF based searching
       mechanisms, which may result in a lower quality of symbolic
       context data (parameters), and may miss some system calls.  You
       may want to try them first, in case kernel debugging information
       is not immediately available.

       Each probe alias provides a variety of variables.  Looking at the
       tapset source code is the most reliable way to find them.
       Generally, each variable listed in the standard manual page is
       made available as a script-level variable, so syscall.open
       exposes filename, flags, and mode.  In addition, a standard suite
       of variables is available at most aliases:

       argstr A pretty-printed form of the entire argument list, without
              parentheses.

       name   The name of the system call.

       retval For return probes, the raw numeric system-call result.

       retstr For return probes, a pretty-printed string form of the
              system-call result.

       As usual for probe aliases, these variables are all initialized
       once from the underlying $context variables, so that later
       changes to $context variables are not automatically reflected.
       Not all probe aliases obey all of these general guidelines.
       Please report any bothersome ones you encounter as a bug.  Note
       that on some kernel/userspace architecture combinations (e.g.,
       32-bit userspace on 64-bit kernel), the underlying $context
       variables may need explicit sign extension / masking.  When this
       is an issue, consider using the tapset-provided variables instead
       of raw $context variables.
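
       The standard variables can be used as in the following sketch,
       which assumes that the syscall.openat alias is present in the
       installed tapset.

```systemtap
# Entry probe: "name" and "argstr" are set by the tapset alias.
probe syscall.openat
{
    printf("%s(%d): %s(%s)\n", execname(), pid(), name, argstr)
}

# Return probe: "retstr" is the pretty-printed result.
probe syscall.openat.return
{
    printf("%s(%d): %s = %s\n", execname(), pid(), name, retstr)
}
```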

       If debuginfo availability is a problem, you may try using the
       non-DWARF syscall probe aliases instead.  Use the nd_syscall.
       prefix instead of syscall.  The same context variables are
       available, as far as possible.

       nd_syscall probes on kernels that use syscall wrappers to pass
       arguments via pt_regs (currently 4.17+ on x86_64 and 4.19+ on
       aarch64) support syscall argument writing when guru mode is
       enabled. If a probe syscall parameter is modified in the probe
       body then immediately before the probe exits the parameter's
       current value will be written to pt_regs. This overwrites the
       previous value.  nd_syscall probes also include two parameters
       for each of the syscall's string parameters.  One holds a quoted
       version of the string passed to the syscall. The other holds an
       unquoted version of the string intended to be used when modifying
       the parameter.  If the probe modifies the unquoted string
       variable then as the probe is about to exit the contents of this
       variable will be written to the user space buffer passed to the
       syscall. It is the user's responsibility to ensure that this
       buffer is large enough to hold the modified string and that it is
       located in a writable memory segment.

       There are two main types of timer probes: "jiffies" timer probes
       and time interval timer probes.

       Intervals defined by the standard kernel "jiffies" timer may be
       used to trigger probe handlers asynchronously.  Two probe point
       variants are supported by the translator:

              timer.jiffies(N)
              timer.jiffies(N).randomize(M)

       The probe handler is run every N jiffies (a kernel-defined unit
       of time, typically between 1 and 60 ms).  If the "randomize"
       component is given, a linearly distributed random value in the
       range [-M..+M] is added to N every time the handler is run.  N is
       restricted to a reasonable range (1 to around a million), and M
       is restricted to be smaller than N.  There are no target
       variables provided in either context.  It is possible for such
       probes to be run concurrently on a multi-processor computer.

       Alternatively, intervals may be specified in units of time.
       There are two probe point variants similar to the jiffies timer:

              timer.ms(N)
              timer.ms(N).randomize(M)

       Here, N and M are specified in milliseconds, but the full options
       for units are seconds (s/sec), milliseconds (ms/msec),
       microseconds (us/usec), nanoseconds (ns/nsec), and hertz (hz).
       Randomization is not supported for hertz timers.

       The actual resolution of the timers depends on the target kernel.
       For kernels prior to 2.6.17, timers are limited to jiffies
       resolution, so intervals are rounded up to the nearest jiffies
       interval.  After 2.6.17, the implementation uses hrtimers for
       tighter precision, though the actual resolution will be arch-
       dependent.  In either case, if the "randomize" component is
       given, then the random value will be added to the interval before
       any rounding occurs.
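
       The timer variants described above might be exercised as in this
       sketch.  The intervals and the global array name fired are
       arbitrary choices.

```systemtap
global fired

# Every 100 jiffies.
probe timer.jiffies(100)          { fired["jiffies"]++ }

# Every 250 ms, with a random offset in [-50..+50] ms added each time.
probe timer.ms(250).randomize(50) { fired["ms"]++ }

# Ten times per second; randomize is not supported for hertz timers.
probe timer.hz(10)                { fired["hz"]++ }

# Report after five seconds and end the session.
probe timer.s(5)
{
    foreach (kind in fired)
        printf("%-8s %d\n", kind, fired[kind])
    exit()
}
```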

       Profiling timers are also available to provide probes that
       execute on all CPUs at the rate of the system tick (CONFIG_HZ) or
       at a given frequency (hz). On some kernels, this is a one-
       concurrent-user-only or disabled facility, resulting in error -16
       (EBUSY) during probe registration.

              timer.profile.tick
              timer.profile.freq.hz(N)

       Full context information of the interrupted process is available,
       making this probe suitable for a time-based sampling profiler.
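
       A very small sampling profiler could look like the sketch below.
       It uses the tapset alias timer.profile; the ten-second run time
       and the array name samples are assumptions for illustration.

```systemtap
global samples

# Count one sample for whichever process was interrupted on each CPU.
probe timer.profile { samples[execname()]++ }

# After ten seconds, print the ten most frequently sampled commands.
probe timer.s(10)
{
    foreach (comm in samples- limit 10)
        printf("%-16s %d\n", comm, samples[comm])
    exit()
}
```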

       It is recommended to use the tapset probe timer.profile rather
       than timer.profile.tick. This probe point behaves identically to
       timer.profile.tick when the underlying functionality is
       available, and falls back to using perf.sw.cpu_clock on some
       recent kernels which lack the corresponding profile timer
       facility.

       Profiling timers with specified frequencies are only accurate up
       to around 100 hz. You may need to provide a larger value to
       achieve the desired rate.

       Note that if a timer probe is set to fire at a very high rate and
       if the probe body is complex, succeeding timer probes can get
       skipped, since the time for them to run has already passed.
       Normally systemtap reports missed probes, but it will not report
       these skipped probes.

       This family of probe points uses symbolic debugging information
       for the target kernel/module/program, as may be found in
       unstripped executables, or the separate debuginfo packages.  They
       allow placement of probes logically into the execution path of
       the target program, by specifying a set of points in the source
       or object code.  When a matching statement executes on any
       processor, the probe handler is run in that context.

       Probe points in the DWARF family can be identified by the target
       kernel module (or user process), source file, line number,
       function name, or some combination of these.

       Here is a list of DWARF probe points currently supported:


       (See the USER-SPACE section below for more information on the
       process probes.)

       The list above includes multiple variants and modifiers which
       provide additional functionality or filters. They are:

              .function
                     Places a probe near the beginning of the named
                     function, so that parameters are available as
                     context variables.

              .return
                     Places a probe at the moment after the return from
                     the named function, so the return value is
                     available as the "$return" context variable.

              .inline
                     Filters the results to include only instances of
                     inlined functions. Note that inlined functions do
                     not have an identifiable return point, so .return
                     is not supported on .inline probes.

              .call  Filters the results to include only non-inlined
                     functions (the opposite set of .inline)

              .exported
                     Filters the results to include only exported
                     functions.

              .statement
                     Places a probe at the exact spot, exposing those
                     local variables that are visible there.

              .statement.nearest
                     Places a probe at the nearest available line number
                     for each line number given in the statement.

              .callee
                     Places a probe on the callee function given in the
                     .callee modifier, where the callee must be a
                     function called by the target function given in
                     .function. The advantage of doing this over
                     directly probing the callee function is that this
                     probe point is run only when the callee is called
                     from the target function (add the
                     -DSTAP_CALLEE_MATCHALL directive to override this
                     when calling stap(1)).

                     Note that only callees that can be statically
                     determined are available.  For example, calls
                     through function pointers are not available.
                     Additionally, calls to functions located in other
                     objects (e.g.  libraries) are not available
                     (instead use another probe point). This feature
                     will only work for code compiled with GCC 4.7+.

              .callees
                     Shortcut for .callee("*"), which places a probe on
                     all callees of the function.

              .callees(N)
                     Recursively places probes on callees. For example,
                     .callees(2) will probe both callees of the target
                     function, as well as callees of those callees. And
                     .callees(3) goes one level deeper, etc...  A callee
                     probe at depth N is only triggered when the N
                     callers in the callstack match those that were
                     statically determined during analysis (this also
                     may be overridden using -DSTAP_CALLEE_MATCHALL).
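
       The entry, return and callee modifiers might be combined as in
       the following sketch.  The function names vfs_read and
       rw_verify_area and the parameter name count are assumptions about
       the target kernel; the callee probe is marked optional because
       the call may not be statically visible in every build.

```systemtap
# Non-inlined entries only; parameters are available as $variables.
probe kernel.function("vfs_read").call
{
    printf("enter %s count=%d\n", ppfunc(), $count)
}

# Return probe; the return value is available as $return.
probe kernel.function("vfs_read").return
{
    printf("leave %s return=%d\n", ppfunc(), $return)
}

# Fires only when rw_verify_area is called from vfs_read.
probe kernel.function("vfs_read").callee("rw_verify_area") ?
{
    printf("callee %s\n", ppfunc())
}
```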

       In the above list of probe points, MPATTERN stands for a string
       literal that aims to identify the loaded kernel module of
       interest. For in-tree kernel modules, the name suffices (e.g.
       "btrfs"). The name may also include the "*", "[]", and "?"
       wildcards to match multiple in-tree modules. Out-of-tree modules
       are also supported by specifying the full path to the .ko file;
       wildcards are not supported in that path. The file must follow
       the convention of being named <module_name>.ko (characters ','
       and '-' are replaced by '_').

       LPATTERN stands for a source program label. It may also contain
       "*", "[]", and "?" wildcards. PATTERN stands for a string literal
       that aims to identify a point in the program.  It is made up of
       three parts:

       •   The first part is the name of a function, as would appear in
           the nm program's output.  This part may use the "*" and "?"
           wildcarding operators to match multiple names.

       •   The second part is optional and begins with the "@"
           character.  It is followed by the path to the source file
           containing the function, which may include a wildcard
           pattern, such as mm/slab*.  If it does not match as is, an
           implicit "*/" is optionally added before the pattern, so that
           a script need only name the last few components of a possibly
           long source directory path.

       •   Finally, the third part is optional if the file name part was
           given, and identifies the line number in the source file
           preceded by a ":" or a "+".  The line number is assumed to be
           an absolute line number if preceded by a ":", or relative to
           the declaration line of the function if preceded by a "+".
           All the lines in the function can be matched with ":*".  A
           range of lines x through y can be matched with ":x-y". Ranges
           and specific lines can be mixed using commas, e.g. ":x,y-z".
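
       The three parts of PATTERN might be combined as in the sketch
       below.  The function and file names are assumptions about the
       target kernel source tree, and which lines can actually be probed
       depends on the compiled code, so the statement probes are marked
       optional.

```systemtap
# Part 1 and 2: a function-name wildcard, limited to source files
# matching mm/slab*.
probe kernel.function("kmem_cache_*@mm/slab*")
{
    printf("%s\n", pp())
}

# Part 3 with "+": two lines after the declaration line of vfs_read.
probe kernel.statement("vfs_read@fs/read_write.c+2") ?
{
    printf("%s\n", pp())
}

# Part 3 with ":*": every probeable line of vfs_read.
probe kernel.statement("vfs_read@fs/read_write.c:*") ?
{
    printf("%s\n", pp())
}
```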

       As an alternative, PATTERN may be a numeric constant, indicating
       an address.  Such an address may be found from symbol tables of
       the appropriate kernel / module object file.  It is verified
       against known statement code boundaries, and will be relocated
       for use at run time.

       In guru mode only, absolute kernel-space addresses may be
       specified with the ".absolute" suffix.  Such an address is
       considered already relocated, as if it came from /proc/kallsyms,
       so it cannot be checked against statement/instruction boundaries.

       Many of the source-level context variables, such as function
       parameters, locals, globals visible in the compilation unit, may
       be visible to probe handlers.  They may refer to these variables
       by prefixing their name with "$" within the scripts.  In
       addition, a special syntax allows limited traversal of
       structures, pointers, and arrays.  More syntax allows pretty-
       printing of individual variables or their groups.  See also
       @cast.  Note that variables may be inaccessible due to them being
       paged out, or for a few other reasons.  See also man
       error::fault(7stap).

       Functions called from DWARF class probe points and from
       process.mark probes may also refer to context variables.

       $var   refers to an in-scope variable or thread local storage
              variable "var".  If it's an integer-like type, it will be
              cast to a 64-bit int for systemtap script use.  String-
              like pointers (char *) may be copied to systemtap string
              values using the kernel_string or user_string functions.

       @var("varname")
              an alternative syntax for $varname

       @var("varname", "module")
              The global variable or global thread local storage
              variable in scope of the given module already loaded into
              the current probed process.  Useful to get an exported
              variable in a shared library loaded into the process being
              probed, or a global variable in a process while a shared
              library probe is being executed.  For user-space modules
              only.  For example: @var("_r_debug","/lib/")

       @var("varname@src/file.c")
              refers to the global (either file local or external)
              variable varname defined when the file src/file.c was
              compiled. The CU in which the variable is resolved is the
              first CU in the module of the probe point which matches
              the given file name at the end and has the shortest file
              name path (e.g. given @var("foo@bar/baz.c") and CUs with
              file name paths src/sub/module/bar/baz.c and src/bar/baz.c
              the second CU will be chosen to resolve the (file) global
              variable foo).

       @var("varname@src/file.c", "module")
              The global variable in scope of the given CU, defined in
              the given module, even if the variable is static (so the
              name is not unique without the CU name).

       $var->field traversal via a structure's or a pointer's field.
              This generalized indirection operator may be repeated to
              follow more levels.  Note that the . operator is not used for
              plain structure members, only -> for both purposes.  (This
              is because "." is reserved for string concatenation.) Also
              note that for direct dereferencing of $var pointer
              {kernel,user}_{char,int,...}($var) should be used. (Refer
              to stapfuncs(5) for more details.)

       $return
              is available in return probes only for functions that are
              declared with a return value, which can be determined
              using @defined($return).

       $var[N]
              indexes into an array.  The index may be given as a literal
              number or an arbitrary numeric expression.

       A number of operators exist for such basic context variable
       expressions:

       $$vars expands to a character string that is equivalent to

              sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
                      parm1, ..., parmN, var1, ..., varN)

              for each variable in scope at the probe point.  Some
              values may be printed as =?  if their run-time location
              cannot be found.

       $$locals
              expands to a subset of $$vars for only local variables.

       $$parms
              expands to a subset of $$vars for only function
              parameters.

       $$return
              is available in return probes only.  It expands to a
              string that is equivalent to sprintf("return=%x", $return)
              if the probed function has a return value, or else an
              empty string.

       & $EXPR
              expands to the address of the given context variable
              expression, if it is addressable.

       @defined($EXPR)
              expands to 1 if the given context variable expression is
              resolvable and to 0 otherwise, for use in conditionals such as

              @defined($foo->bar) ? $foo->bar : 0

              see the PROBES section of stap(1).

       $EXPR$ expands to a string with all of $EXPR's members,
              equivalent to

              sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
                       $EXPR->a, $EXPR->b)

       $EXPR$$
              expands to a string with all of $EXPR's members and
              submembers, equivalent to

              sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
                      $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])

       @errno expands to the last value the C library global variable
              errno was set to.
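
       Several of these operators are shown together in the sketch
       below.  It assumes that vfs_read has a parameter named file of
       type struct file *, and that the member chain f_path, dentry,
       d_name, name exists in the target kernel; the @defined guard
       covers the case where it does not.

```systemtap
probe kernel.function("vfs_read")
{
    # All function parameters as "name=value" pairs.
    printf("parms: %s\n", $$parms)

    # Traverse with ->, then pretty-print one level of members with "$".
    printf("path:  %s\n", $file->f_path$)

    # Guard a member chain that may not resolve on every kernel.
    if (@defined($file->f_path->dentry))
        printf("name:  %s\n",
               kernel_string($file->f_path->dentry->d_name->name))

    # Stop after the first hit to keep the output short.
    exit()
}
```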

       For the kernel ".return" probes, only a certain fixed number of
       returns may be outstanding.  The default is a relatively small
       number, on the order of a few times the number of physical CPUs.
       If many different threads concurrently call the same blocking
       function, such as futex(2) or read(2), this limit could be
       exceeded, and skipped "kretprobes" would be reported by "stap
       -t".  To work around this, specify a

              probe FOO.return.maxactive(NNN)

       suffix, with a large enough NNN to cover all expected
       concurrently blocked threads.  Alternately, use the

              stap -DKRETACTIVE=NNNN

       stap command line macro setting to override the default for all
       ".return" probes.

       For ".return" probes, context variables other than the "$return"
       may be accessible, as a convenience for a script programmer
       wishing to access function parameters.  These values are
       snapshots taken at the time of function entry.  (Local variables
       within the function are not generally accessible, since those
       variables did not exist in allocated/initialized form at the
       snapshot moment.)  These entry-snapshot variables should be
       accessed via @entry($var).

       In addition, arbitrary entry-time expressions can also be saved
       for ".return" probes using the @entry(expr) operator.  For
       example, one can compute the elapsed time of a function:

              probe kernel.function("do_filp_open").return {
                  println( get_timeofday_us() - @entry(get_timeofday_us()) )
              }

       The following table summarizes how values related to a function
       parameter context variable, a pointer named addr, may be accessed
       from a .return probe.

       at-entry value   past-exit value

       $addr            not available
       $addr->x->y      @cast(@entry($addr),"struct zz")->x->y
       $addr[0]         {kernel,user}_{char,int,...}(& $addr[0])
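
       For a plain integer parameter, the entry-time value and the
       return value can be printed side by side.  The sketch below
       assumes that vfs_read has a parameter named count.

```systemtap
probe kernel.function("vfs_read").return
{
    # @entry($count) is the snapshot taken when the function was entered.
    printf("%s: requested=%d returned=%d\n",
           execname(), @entry($count), $return)
}
```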

       In absence of debugging information, entry & exit points of
       kernel & module functions can be probed using the "kprobe" family
       of probes.  However, these do not permit looking up the arguments
       / local variables of the function.  The following constructs are
       supported:


       Probes of type function are recommended for kernel functions,
       whereas probes of type module are recommended for probing
       functions of the specified module.  In case the absolute address
       of a kernel or module function is known, statement probes can be

       Note that FUNCTION and MODULE names must not contain wildcards,
       or the probe will not be registered.  Also, statement probes may
       be used only in guru mode.
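
       A sketch of entry and return probing without debuginfo follows.
       The function name vfs_read is an assumption about the target
       kernel; no $variables are used because none are available to
       kprobe probes.

```systemtap
global entered, returned

# Entry and return of a kernel function, located via the symbol table.
probe kprobe.function("vfs_read")        { entered++ }
probe kprobe.function("vfs_read").return { returned++ }

# Report after five seconds and end the session.
probe timer.s(5)
{
    printf("vfs_read: %d entries, %d returns\n", entered, returned)
    exit()
}
```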

       Support for user-space probing is available for kernels that are
       configured with the utrace extensions, or have the uprobes
       facility in Linux 3.5.  (Various kernel build configuration
       options need to be enabled; systemtap will advise if these are
       missing.)

       There are several forms.  First, a non-symbolic probe point:

              process(PID).statement(ADDRESS).absolute

       is analogous to kernel.statement(ADDRESS).absolute in that both
       use raw (unverified) virtual addresses and provide no $variables.
       The target PID parameter must identify a running process, and
       ADDRESS should identify a valid instruction address.  All threads
       of that process will be probed.

       Second, non-symbolic user-kernel interface events handled by
       utrace may be probed:


       A process.begin probe gets called when a new process described by
       PID or FULLPATH gets created.  In addition, it is called once
       from the context of each preexisting process, at systemtap script
       startup.  This is useful to track live processes.  A
       process.thread.begin probe gets called when a new thread
       described by PID or FULLPATH gets created.  A process.end probe
       gets called when a process described by PID or FULLPATH dies.  A
       process.thread.end probe gets called when a thread described by
       PID or FULLPATH dies.  A process.syscall probe gets called when a
       thread described by PID or FULLPATH makes a system call.  The
       system call number is available in the $syscall context variable,
       and the first 6 arguments of the system call are available in the
       $argN (e.g. $arg1, $arg2, ...) context variables.  A
       process.syscall.return probe gets called when a thread described
       by PID or FULLPATH returns from a system call.  The system call
       number is available in the $syscall context variable, and the
       return value of the system call is available in the $return
       context variable.  A

       If a process probe is specified without a PID or FULLPATH, all
       user threads will be probed.  However, if systemtap was invoked
       with the -c or -x options, then process probes are restricted to
       the process hierarchy associated with the target process.  If a
       process probe is given without a PID or FULLPATH but systemtap
       was invoked with the -c option, the path of the -c command is
       heuristically filled in as the process PATH.  In that case, the
       -c command may contain only a command and its parameters (no
       command substitution and none of the characters '|&;<>(){}').
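
       The process lifecycle and system call probes described above
       might be combined as in the following sketch.  It leaves out
       PID and FULLPATH and relies on -c to restrict probing to the
       target command; the script name and the command in the comment
       are hypothetical.

```systemtap
# Possible invocation (names are illustrative):
#   stap lifecycle.stp -c "ls /tmp"
probe process.begin { printf("begin  %s[%d]\n", execname(), pid()) }
probe process.end   { printf("end    %s[%d]\n", execname(), pid()) }

probe process.syscall {
  # $syscall is the system call number, $arg1 its first argument.
  printf("enter  %s[%d] syscall=%d arg1=0x%x\n",
         execname(), pid(), $syscall, $arg1)
}

probe process.syscall.return {
  printf("return %s[%d] syscall=%d ret=%d\n",
         execname(), pid(), $syscall, $return)
}
```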

       Third, symbolic static instrumentation compiled into programs and
       shared libraries may be probed:


       A .mark probe gets called via a static probe which is defined in
       the application by STAP_PROBE1(PROVIDER,LABEL,arg1), a macro
       defined in sys/sdt.h.  The PROVIDER is an arbitrary application
       identifier, LABEL is the marker site identifier, and arg1 is the
       integer-typed argument.  STAP_PROBE1 is used for probes with 1
       argument, STAP_PROBE2 is used for probes with 2 arguments, and so
       on.  The arguments of the probe are available in the context
       variables $arg1, $arg2, ...  An alternative to using the
       STAP_PROBE macros is to use the dtrace script to create custom
       macros.  Additionally, the variables $$name and $$provider are
       available as parts of the probe point name.  The sys/sdt.h macro
       names DTRACE_PROBE* are available as aliases for STAP_PROBE*.
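
       A minimal sketch of a .mark probe follows.  It assumes an
       application at the hypothetical path /usr/bin/myapp that
       contains a one-argument marker with the hypothetical label
       request_start.

```systemtap
# "/usr/bin/myapp" and "request_start" are hypothetical names.
probe process("/usr/bin/myapp").mark("request_start") {
  # $$provider and $$name come from the probe point; $arg1 is the
  # integer argument passed to the marker macro.
  printf("%s:%s arg1=%d\n", $$provider, $$name, $arg1)
}
```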

       Finally, full symbolic source-level probes in user-space programs
       and shared libraries are supported.  These are exactly analogous
       to the symbolic DWARF-based kernel/module probes described above.
       They expose the same sorts of context $variables for function
       parameters, local variables, and so on.


       Note that for all process probes, PATH names refer to executables
       that are searched for the same way shells do: relative to the
       working directory if they contain a "/" character, otherwise in
       $PATH.  If PATH names refer to scripts, the actual interpreters
       (specified in the script in the first line after the #!
       characters) are probed.  In the debuginfod probe family PATH
       names likewise refer to executables, but are searched for in the
       currently defined $DEBUGINFOD_URLS.
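
       As an illustration of a source-level user-space probe and of
       PATH lookup, the sketch below probes main in /bin/ls.  The
       choice of binary and function is an assumption made for the
       example, and debuginfo for that binary is expected to be
       installed.

```systemtap
# "/bin/ls" contains a "/" character, so it is not looked up in $PATH.
probe process("/bin/ls").function("main") {
  # $$parms expands to a string of the function parameters.
  printf("%s called with %s\n", ppfunc(), $$parms)
}

probe process("/bin/ls").function("main").return {
  printf("main returned %d\n", $return)
}
```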

       Tapset process probes placed in the special directory
       $prefix/share/systemtap/tapset/PATH/ with relative paths will
       have their process parameter prefixed with the location of the
       tapset. For example,


       expands to


       when placed in $prefix/share/systemtap/tapset/PATH/usr/bin/

       If PATH is a process component parameter referring to shared
       libraries then all processes that map it at runtime would be
       selected for probing.  If PATH is a library component parameter
       referring to shared libraries then the process specified by the
       process component would be selected.  Note that the PATH pattern
       in a library component will always apply to libraries statically
       determined to be in use by the process. However, you may also
       specify the full path to any library file even if not statically
       needed by the process.

       A .plt probe will probe functions in the program linkage table
       corresponding to the rest of the probe point.  .plt can be
       specified as a shorthand for .plt("*").  The symbol name is
       available as a $$name context variable; function arguments are
       not available, since PLTs are processed without debuginfo.  A
       .plt.return probe places a probe at the moment after the return
       from the named function.
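
       For instance, a .plt probe could be used to count calls made
       through the program linkage table, as in this sketch.  The
       binary /bin/ls is only an example target.

```systemtap
global calls

# .plt is shorthand for .plt("*"); $$name holds the symbol name.
probe process("/bin/ls").plt {
  calls[$$name]++
}

probe end {
  # Print the ten most frequently called PLT entries.
  foreach (name in calls- limit 10)
    printf("%-24s %d\n", name, calls[name])
}
```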

       If the PATH string contains wildcards as in the MPATTERN case,
       then standard globbing is performed to find all matching paths.
       In this case, the $PATH environment variable is not used.

       If systemtap was invoked with the -c or -x options, then process
       probes are restricted to the process hierarchy associated with
       the target process.

       These probes take the form


       They are very similar to the process("PATH").** probe family.
       The key difference is that the process probes search for PATH in
       the host filesystem, while debuginfod probes search the current
       federation of debuginfod servers, using the currently defined
       $DEBUGINFOD_URLS (see debuginfod(8)).

       To probe the contents of one or more ELF/archive files, or of
       directories containing them, start a debuginfod server as shown
       below; it scans and processes the ELF files within and prepares
       them for systemtap.

              $ debuginfod [options] [-F -R -Z etc.] /path1 /path2
              $ env DEBUGINFOD_URLS=http://localhost:8002/ stap ...

       Support for probing Java methods is available using Byteman as a
       backend. Byteman is an instrumentation tool from the JBoss
       project which systemtap can use to monitor invocations for a
       specific method or line in a Java program.

       Systemtap does so by generating a Byteman script listing the
       probes to instrument and then invoking the Byteman bminstall

       This Java instrumentation support is currently a prototype
       feature with major limitations.  Moreover, Java probing currently
       does not work across users; the stap script must run (with
       appropriate permissions) under the same user as the Java
       process being probed.  (Thus a stap script under root currently
       cannot probe Java methods in a non-root-user Java process.)

       The first probe type refers to Java processes by the name of the
       Java process:


       The PNAME argument must be a pre-existing jvm pid, and be
       identifiable via a jps listing.

       The PATTERN parameter specifies the signature of the Java method
       to probe. The signature must consist of the exact name of the
       method, followed by a bracketed list of the types of the
       arguments, for instance "myMethod(int,double,Foo)". Wildcards are
       not supported.

       The probe can be set to trigger at a specific line within the
       method by appending a line number with colon, just as in other
       types of probes: "myMethod(int,double,Foo):245".

       The CLASSNAME parameter identifies the Java class the method
       belongs to, either with or without the package qualification. By
       default, the probe only triggers on descendants of the class that
       do not override the method definition of the original class.
       However, CLASSNAME can take an optional caret prefix (^), which
       specifies that the probe should also trigger on all descendants
       of the class that override the original method. For instance,
       every method with signature foo(int) in a program can be probed
       at once using


       The second probe type works analogously, but refers to Java
       processes by PID:


       (PIDs for an already running process can be obtained using the
       jps(1) utility.)

       Context variables defined within java probes include $arg1
       through $arg10 (for up to the first 10 arguments of a method),
       represented as character-pointers for the toString() form of each
       actual argument.  The arg1 through arg10 script variables provide
       access to these as ordinary strings, fetched via

       Prior to systemtap version 3.1, $arg1 through $arg10 could
       contain either integers or character pointers, depending on the
       types of the objects being passed to each particular java method.
       This previous behaviour may be invoked with the stap
       --compatible=3.0 flag.
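
       Putting the Java probe components together, a probe might be
       written as in the sketch below.  The process name org.my.MyApp
       and the class org.my.MyClass are hypothetical, the method
       signature is the one used as an example above, and the arg1
       through arg3 string variables assume systemtap 3.1 or later.

```systemtap
# Hypothetical process and class names; the caret also selects
# descendants that override the method.
probe java("org.my.MyApp").class("^org.my.MyClass").method("myMethod(int,double,Foo)") {
  # arg1..arg3 are the toString() forms of the actual arguments.
  printf("myMethod(%s, %s, %s)\n", arg1, arg2, arg3)
}
```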

       These probe points allow procfs "files" in
       /proc/systemtap/MODNAME to be created, read and written using a
       permission that may be modified using the proper umask value.
       Default permissions are 0400 for read probes, and 0200 for write
       probes. If both a read and write probe are being used on the same
       file, a default permission of 0600 will be used.  Using
       procfs.umask(0040).read would result in a 0404 permission set for
       the file.  (MODNAME is the name of the systemtap module). The
       proc filesystem is a pseudo-filesystem which is used as an
       interface to kernel data structures. There are several probe
       point variants supported by the translator:


       Note that there are a few differences when procfs probes are used
       in the stapbpf runtime.  FIFO special files are used instead of
       proc filesystem files.  These files are created in
       /var/tmp/systemtap-USER/MODNAME.  (USER is the name of the user).
       Additionally, users cannot create both read and write probes on
       the same file.

       PATH is the file name (relative to /proc/systemtap/MODNAME or
       /var/tmp/systemtap-USER/MODNAME) to be created.  If no PATH is
       specified (as in the last two variants above), PATH defaults to
       "command". The file name "__stdin" is used internally by
       systemtap for input probes and should not be used as a PATH for
       procfs probes; see the input probe section below.

       When a user reads /proc/systemtap/MODNAME/PATH (normal runtime)
       or /var/tmp/systemtap-USER/MODNAME/PATH (stapbpf runtime), the
       corresponding procfs read probe is triggered.  The string data to
       be read should be assigned to a variable named $value, like this:

              procfs("PATH").read { $value = "100\n" }

       When a user writes into /proc/systemtap/MODNAME/PATH (normal
       runtime) or /var/tmp/systemtap-USER/MODNAME/PATH (stapbpf
       runtime), the corresponding procfs write probe is triggered.  The
       data the user wrote is available in the string variable named
       $value, like this:

              procfs("PATH").write { printf("user wrote: %s", $value) }
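
       The read and write probes can be paired on one file to expose a
       tunable, as in the following sketch for the normal (non-stapbpf)
       runtime.  The file name "threshold" and the global variable are
       hypothetical.

```systemtap
global threshold = 100

# cat /proc/systemtap/MODNAME/threshold
probe procfs("threshold").read {
  $value = sprintf("%d\n", threshold)
}

# echo 250 > /proc/systemtap/MODNAME/threshold
probe procfs("threshold").write {
  # Parse the written text as a base-10 integer.
  threshold = strtol($value, 10)
}
```

       Because both a read and a write probe are attached to the same
       file, the default permission described above would be 0600.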

       MAXSIZE is the size of the procfs read buffer.  Specifying
       MAXSIZE allows larger procfs output.  If no MAXSIZE is specified,
       the procfs read buffer defaults to STP_PROCFS_BUFSIZE (which
       defaults to MAXSTRINGLEN, the maximum length of a string).  If
       setting the procfs read buffers for more than one file is needed,
       it may be easiest to override the STP_PROCFS_BUFSIZE definition.
       Here's an example of using MAXSIZE:

              procfs.read.maxsize(1024) {
                  $value = "long string..."
                  $value .= "another long string..."
                  $value .= "another long string..."
                  $value .= "another long string..."
              }

       These probe points make input from stdin available to the script
       during runtime.  The translator currently supports two variants
       of this family:


       input.char is triggered each time a character is read from stdin.
       The current character is available in the string variable named
       char.  There is no newline buffering; the next character is read
       from stdin as soon as it becomes available.

       input.line causes all characters read from stdin to be buffered
       until a newline is read, at which point the probe will be
       triggered. The current line of characters (including the newline)
       is made available in a string variable named line.  Note that no
       more than MAXSTRINGLEN characters will be buffered. Any
       additional characters will not be included in line.

       Input probes are aliases for procfs("__stdin").write.  Systemtap
       reconfigures stdin if the presence of this procfs probe is
       detected, therefore "__stdin" should not be used as a path
       argument for procfs probes.  Additionally, input probes will not
       work with the -F and --remote options.
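
       A small sketch of line-oriented input handling follows.  The
       "quit" command is an assumption made for the example, not
       something systemtap defines.

```systemtap
global lines

probe input.line {
  # "line" includes the trailing newline.
  if (line == "quit\n")
    exit()
  lines++
  printf("%d: %s", lines, line)
}
```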

       These probe points allow observation of network packets using the
       netfilter mechanism. A netfilter probe in systemtap corresponds
       to a netfilter hook function in the original netfilter probes
       API. It is probably more convenient to use
       tapset::netfilter(3stap), which wraps the primitive netfilter
       hooks and does the work of extracting useful information from the
       context variables.

       There are several probe point variants supported by the
       translator:


       PROTOCOL_F is the protocol family to listen for, currently one of

       HOOKNAME is the point, or 'hook', in the protocol stack at which
       to intercept the packet. The available hook names for each
       protocol family are taken from the kernel header files
       <linux/netfilter_ipv4.h>, <linux/netfilter_ipv6.h>,
       <linux/netfilter_arp.h> and <linux/netfilter_bridge.h>. For
       instance, allowable hook names for NFPROTO_IPV4 are

       PRIORITY is an integer priority giving the order in which the
       probe point should be triggered relative to any other netfilter
       hook functions which trigger on the same packet. Hook functions
       execute on each packet in order from smallest priority number to
       largest priority number. If no PRIORITY is specified (as in the
       first two probe point variants above), PRIORITY defaults to "0".

       There are a number of predefined priority names of the form
       NF_IP_PRI_* and NF_IP6_PRI_* which are defined in the kernel
       header files <linux/netfilter_ipv4.h> and
       <linux/netfilter_ipv6.h> respectively. The script is permitted to
       use these instead of specifying an integer priority. (The probe
       points for NFPROTO_ARP and NFPROTO_BRIDGE currently do not expose
       any named hook priorities to the script writer.)  Thus, allowable
       ways to specify the priority include:


       A script using guru mode is permitted to specify any identifier
       or number as the parameter for hook, pf, and priority. This
       feature should be used with caution, as the parameter is inserted
       verbatim into the C code generated by systemtap.
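
       As a non-guru sketch, the following counts IPv4 packets at an
       early hook with a named priority.  The hook name
       NF_INET_PRE_ROUTING and the priority NF_IP_PRI_FIRST are
       assumed to be provided by the kernel headers on the target
       system, and the five-second run time is arbitrary.

```systemtap
global packets

probe netfilter.pf("NFPROTO_IPV4").hook("NF_INET_PRE_ROUTING").priority("NF_IP_PRI_FIRST") {
  packets++
}

# Report after five seconds and stop.
probe timer.s(5) {
  printf("%d IPv4 packets seen\n", packets)
  exit()
}
```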

       The netfilter probe points define the following context
       variables:

              The hook number.

       $skb   The address of the sk_buff struct representing the packet.
              See <linux/skbuff.h> for details on how to use this
              struct, or alternatively use the tapset
              tapset::netfilter(3stap) for easy access to key

       $in    The address of the net_device struct representing the
              network device on which the packet was received (if any).
              May be 0 if the device is unknown or undefined at that
              stage in the protocol stack.

       $out   The address of the net_device struct representing the
              network device on which the packet will be sent (if any).
              May be 0 if the device is unknown or undefined at that
              stage in the protocol stack.

       $verdict
              (Guru mode only.) Assigning one of the verdict values
              defined in <linux/netfilter.h> to this variable alters the
              further progress of the packet through the protocol stack.
              For instance, the following guru mode script forces all
              ipv6 network packets to be dropped:

              probe netfilter.pf("NFPROTO_IPV6").hook("NF_IP6_PRE_ROUTING") {
                $verdict = 0 /* nf_drop */
              }

              For convenience, unlike the primitive probe points
              discussed here, the probes defined in
              tapset::netfilter(3stap) export the lowercase names of the
              verdict constants (e.g. NF_DROP becomes nf_drop) as local
              variables.

       This family of probe points hooks up to static probing
       tracepoints inserted into the kernel or modules.  As with
       markers, these tracepoints are special macro calls inserted by
       kernel developers to make probing faster and more reliable than
       with DWARF-based probes, and DWARF debugging information is not
       required to probe tracepoints.  Tracepoints have an extra
       advantage of more strongly-typed parameters than markers.

       Tracepoint probes look like: kernel.trace("name").  The
       tracepoint name string, which may contain the usual wildcard
       characters, is matched against the names defined by the kernel
       developers in the tracepoint header files. To restrict the search
       to specific subsystems (e.g. sched, ext3), the following
       syntax can be used: kernel.trace("system:name").  The tracepoint
       system string may also contain the usual wildcard characters.

       The handler associated with a tracepoint-based probe may read the
       optional parameters specified at the macro call site.  These are
       named according to the declaration by the tracepoint author.  For
       example, the tracepoint probe kernel.trace("sched:sched_switch")
       provides the parameters $prev and $next.  If the parameter is a
       complex type, as in a struct pointer, then a script can access
       fields with the same syntax as DWARF $target variables.  Also,
       tracepoint parameters cannot be modified, but in guru-mode a
       script may modify fields of parameters.

       The subsystem and name of the tracepoint are available in
       $$system and $$name and a string of name=value pairs for all
       parameters of the tracepoint is available in $$vars or $$parms.
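
       For example, the sched_switch tracepoint mentioned above could
       be traced as in this sketch.  It treats $prev and $next as task
       pointers and passes them to the task_execname() and task_pid()
       tapset functions; the limit of 20 events is arbitrary.

```systemtap
global seen

probe kernel.trace("sched:sched_switch") {
  printf("%s:%s %s[%d] -> %s[%d]\n", $$system, $$name,
         task_execname($prev), task_pid($prev),
         task_execname($next), task_pid($next))
  # Stop after a handful of events to keep the output short.
  if (++seen >= 20)
    exit()
}
```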

       This family of probe points hooks up to an older style of static
       probing markers inserted into older kernels or modules.  These
       markers are special STAP_MARK macro calls inserted by kernel
       developers to make probing faster and more reliable than with
       DWARF-based probes.  Further, DWARF debugging information is not
       required to probe markers.

       Marker probe points begin with kernel.  The next part names the
       marker itself: mark("name").  The marker name string, which may
       contain the usual wildcard characters, is matched against the
       names given to the marker macros when the kernel and/or module
       was compiled.  Optionally, you can specify format("format").
       Specifying the marker format string allows differentiation
       between two markers with the same name but different marker
       format strings.

       The handler associated with a marker-based probe may read the
       optional parameters specified at the macro call site.  These are
       named $arg1 through $argNN, where NN is the number of parameters
       supplied by the macro.  Number and string parameters are passed
       in a type-safe manner.

       The marker format string associated with a marker is available in
       $format.  The marker name string is available in $name.

       This family of probes is used to set hardware watchpoints for a
       (global) kernel symbol.  The probes take three components as
       inputs:

       1. The virtual address or name of the kernel symbol to be traced
       is supplied as an argument to this class of probes.  (Only data
       segment variables can be probed; local variables of a function
       cannot.)

       2. The nature of the access to be probed:
          a.  A .write probe is triggered when a write happens at the
              specified address or symbol name.
          b.  A .rw probe is triggered when either a read or a write
              happens.

       3.  .length (optional) Users have the option of specifying the
       address interval to be probed using "length" constructs. The
       user-specified length gets approximated to the closest possible
       address length that the architecture can support. If the
       specified length exceeds the limits imposed by architecture, an
       error message is flagged and probe registration fails.  Wherever
       'length' is not specified, the translator requests a hardware
       breakpoint probe of length 1. It should be noted that the
       "length" construct is not valid with symbol names.

       The following constructs are supported:


       This set of probes makes use of the debug registers of the
       processor, which are a scarce resource (4 on x86, 1 on powerpc).
       The script translation flags a warning if a user requests more
       hardware breakpoint probes than the limits set by the
       architecture.  For example, a pass-2 warning is issued when an
       input script requests 5 hardware breakpoint probes on an x86
       system, since the x86 architecture supports a maximum of 4
       breakpoints.  Users are cautioned to set probes judiciously.
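
       A sketch of a kernel write watchpoint on a symbol name is shown
       below.  The symbol pid_max is used only as an example of a
       global kernel variable and is assumed to exist on the target
       kernel; no length is given, because length is not valid with
       symbol names.

```systemtap
# "pid_max" is an illustrative global kernel symbol.
probe kernel.data("pid_max").write {
  printf("pid_max written by %s[%d]\n", execname(), pid())
  print_backtrace()
}
```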

       It is possible to specify userspace virtual memory addresses in
       this family of probes and the handlers would trigger upon the
       corresponding memory read/write events in those processes. But
       one cannot easily control which processes are monitored.  Using
       `if (pid() == target())` is a workaround, but it is inefficient.
       It is better to use the userland hardware breakpoint probes below
       instead.

       This family of probes is very similar to its kernel-space
       counterpart but it targets the userland processes only.

       The following constructs are currently supported:


       Currently, only the target process specified by -x PID or -c CMD
       has the watchpoints registered. The ADDRESS must be a valid
       virtual memory address in that process's address space.

       This family of probe points interfaces to the kernel "perf event"
       infrastructure for controlling hardware performance counters.
       The events being attached to are described by the "type" and
       "config" fields of the perf_event_attr structure, and are sampled
       at an interval governed by the "sample_period" and "sample_freq"
       fields.

       These fields are made available to systemtap scripts using the
       following syntax:

              probe perf.type(NN).config(MM).sample(XX)
              probe perf.type(NN).config(MM).hz(XX)
              probe perf.type(NN).config(MM)
              probe perf.type(NN).config(MM).process("PROC")
              probe perf.type(NN).config(MM).counter("COUNTER")
              probe perf.type(NN).config(MM).process("PROC").counter("NAME")

       The systemtap probe handler is called once per XX increments of
       the underlying performance counter when using the .sample field
       or at a frequency in hertz when using the .hz field. When not
       specified, the default behavior is to sample at a count of
       1000000.  The range of valid type/config is described by the
       perf_event_open(2) system call, and/or the linux/perf_event.h
       file.  Invalid combinations or exhausted hardware counter
       resources result in errors during systemtap script startup.
       Systemtap does not sanity-check the values: it merely passes them
       through to the kernel for error- and safety-checking.  By default
       the perf event probe is systemwide unless .process is specified,
       which will bind the probe to a specific task.  If the name is
       omitted then it is inferred from the stap -c argument.  A perf
       event can be read on demand using .counter.  The body of the perf
       probe handler will not be invoked for a .counter probe; instead,
       the counter is read in a user space probe via:

          process("PROC").statement("func@file") {stat <<< @perf("NAME")}
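
       As a sketch of a sampling perf probe, the script below samples
       on CPU cycles systemwide and reports which programs were
       running.  It assumes that type 0 and config 0 denote
       PERF_TYPE_HARDWARE and PERF_COUNT_HW_CPU_CYCLES as listed in
       linux/perf_event.h, and that the hardware counter is available.

```systemtap
global samples

# One handler call per 1000000 increments of the cycle counter.
probe perf.type(0).config(0).sample(1000000) {
  samples[execname()]++
}

probe timer.s(5) {
  # Show the ten most frequently sampled programs, then stop.
  foreach (name in samples- limit 10)
    printf("%-16s %d\n", name, samples[name])
  exit()
}
```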

       Support for probing python 2 and python 3 functions is available
       with the help of an extra python support module. Note that the
       debuginfo for the version of python being probed is required. To
       run a python script with the extra python support module, add
       the '-m HelperSDT' option to your python command, like this:

              stap foo.stp -c "python -m HelperSDT"

       Python probes look like the following:


       The list above includes multiple variants and modifiers which
       provide additional functionality or filters. They are:

                     Places a probe at the beginning of the named
                     function by default, unless modified by PATTERN.
                     Parameters are available as context variables.

              .call  Places a probe at the beginning of the named
                     function. Parameters are available as context
                     variables.

                     Places a probe at the moment before the return from
                     the named function. Parameters and local/global
                     python variables are available as context
                     variables.

       PATTERN stands for a string literal that aims to identify a point
       in the python program.  It is made up of three parts:

       •   The first part is the name of a function (e.g. "foo") or
           class method (e.g. "bar.baz"). This part may use the "*" and
           "?" wildcarding operators to match multiple names.

       •   The second part is optional and begins with the "@"
           character.  It is followed by the path to the source file
           containing the function, which may include a wildcard
           pattern. The python path is searched for a matching filename.

       •   Finally, the third part is optional if the file name part was
           given, and identifies the line number in the source file
           preceded by a ":" or a "+".  The line number is assumed to be
           an absolute line number if preceded by a ":", or relative to
           the declaration line of the function if preceded by a "+".
           All the lines in the function can be matched with ":*".  A
           range of lines x through y can be matched with ":x-y". Ranges
           and specific lines can be mixed using commas, e.g. ":x,y-z".
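
       The pieces of a python probe point might fit together as in
       the sketch below.  The module name myscript, the function name
       foo, and the file names in the comment are all hypothetical.

```systemtap
# Possible invocation (names are illustrative):
#   stap pyfoo.stp -c "python3 -m HelperSDT myscript.py"
probe python3.module("myscript").function("foo") {
  printf("foo entered\n")
}

probe python3.module("myscript").function("foo").return {
  printf("foo returning\n")
}
```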

       In the above list of probe points, MPATTERN stands for a python
       module or script name that names the python module of interest.
       This part may use the "*" and "?" wildcarding operators to match
       multiple names. The python path is searched for a matching
       filename.

EXAMPLES         top

       Here are some example probe points, defining the associated
       events.

       begin, end, end
              refers to the startup and normal shutdown of the session.
              In this case, the handler would run once during startup
              and twice during shutdown.

              refers to a periodic interrupt, every 1000 +/- 200

       kernel.function("*init*"), kernel.function("*exit*")
              refers to all kernel functions with "init" or "exit" in
              the name.

              refers to any functions within the "kernel/time.c" file
              that span line 240.  Note that this is not a probe at the
              statement at that line number.  Use the kernel.statement
              probe instead.

              refers to all scheduler-related (really, prefixed)
              tracepoints in the kernel.

              refers to an obsolete STAP_MARK(getuid, ...) macro call in
              the kernel.

              refers to the moment of return from all functions with
              "sync" in the name in any of the USB drivers.

              refers to the first byte of the statement whose compiled
              instructions include the given address in the kernel.

              refers to the statement of line 296 within

              refers to the statement at line bio_init+3 within

              refers to a hardware breakpoint of type "write" set on

              refers to the group of probe aliases with any name in the
              third position

SEE ALSO         top


COLOPHON         top

       This page is part of the systemtap (a tracing and live-system
       analysis tool) project.  Information about the project can be
       found at ⟨⟩.  If you have a bug
       report for this manual page, send it to
       This page was obtained from the project's upstream Git repository
       ⟨git://⟩ on 2023-12-22.  (At that
       time, the date of the most recent commit that was found in the
       repository was 2023-12-21.)  If you discover any rendering
       problems in this HTML version of the page, or you believe there
       is a better or more up-to-date source for the page, or you have
       corrections or improvements to the information in this COLOPHON
       (which is not part of the original manual page), send a mail to

