stapprobes(3stap) — Linux manual page


STAPPROBES(3stap)                                          STAPPROBES(3stap)

NAME

       stapprobes - systemtap probe points

DESCRIPTION

       The following sections enumerate the variety of probe points
       supported by the systemtap translator, and some of the additional
       aliases defined by standard tapset scripts.  Many are individually
       documented in the 3stap manual section, with the probe:: prefix.

SYNTAX

              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }

       A probe declaration may list multiple comma-separated probe points in
       order to attach a handler to all of the named events.  Normally, the
       handler statements are run whenever any of the events occur.  Depending
       on the type of probe point, the handler statements may refer to con‐
       text variables (denoted with a dollar-sign prefix like $foo) to read
       or write state.  This may include function parameters for function
       probes, or local variables for statement probes.

       The syntax of a single probe point is a general dotted-symbol se‐
       quence.  This allows a breakdown of the event namespace into parts,
       somewhat like the Domain Name System does on the Internet.  Each com‐
       ponent identifier may be parametrized by a string or number literal,
       with a syntax like a function call.  A component may include a "*"
       character, to expand to a set of matching probe points.  It may also
       include "**" to match multiple sequential components at once.  Probe
       aliases likewise expand to other probe points.

       Probe aliases can be given on their own, or with a suffix. The suffix
       attaches to the underlying probe point that the alias is expanded to.
       For example,

              syscall.read.return.maxactive(10)

       expands to

              kernel.function("sys_read").return.maxactive(10)

       with the component maxactive(10) being recognized as a suffix.

       Normally, each and every probe point resulting from wildcard- and
       alias-expansion must be resolved to some low-level system instrumen‐
       tation facility (e.g., a kprobe address, marker, or a timer configu‐
       ration), otherwise the elaboration phase will fail.

       However, a probe point may be followed by a "?" character, to indi‐
       cate that it is optional, and that no error should result if it fails
       to resolve.  Optionalness passes down through all levels of
       alias/wildcard expansion.  Alternately, a probe point may be followed
       by a "!" character, to indicate that it is both optional and suffi‐
       cient.  (Think vaguely of the Prolog cut operator.) If it does re‐
       solve, then no further probe points in the same comma-separated list
       will be resolved.  Therefore, the "!"  sufficiency mark only makes
       sense in a list of probe point alternatives.

       Additionally, a probe point may be followed by an "if (expr)"
       statement, in order to enable/disable the probe point on-the-fly.
       With the "if" statement, if the "expr" is false when the probe point
       is hit, the whole probe body, including any alias's body, is skipped.
       The condition is stacked up through all levels of alias/wildcard
       expansion, so the final condition becomes the logical-and of the
       conditions of all expanded aliases/wildcards.  The expressions are
       necessarily restricted to global variables.

       These are all syntactically valid probe points.  (They are generally
       semantically invalid, depending on the contents of the tapsets, and
       the versions of kernel/user software installed.)

              kernel.function("no_such_function") ?
              module("awol").function("no_such_function") !
              signal.*? if (switch)
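
       As a hedged illustration of the "if (expr)" condition, the following
       sketch toggles a probe with a global variable.  The variable name
       tracing, the choice of syscall.read, and the timer intervals are
       assumptions made only for this example.

              global tracing = 0

              # Flip the condition every five seconds.
              probe timer.s(5) { tracing = !tracing }

              # The handler body is skipped whenever tracing is zero.
              probe syscall.read if (tracing) {
                  printf("%s read(%s)\n", execname(), argstr)
              }

              # Stop the session after thirty seconds.
              probe timer.s(30) { exit() }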

       Probes may be broadly classified into "synchronous" and "asynchro‐
       nous".  A "synchronous" event is deemed to occur when any processor
       executes an instruction matched by the specification.  This gives
       these probes a reference point (instruction address) from which more
       contextual data may be available.  Other families of probe points re‐
       fer to "asynchronous" events such as timers/counters rolling over,
       where there is no fixed reference point that is related.  Each probe
       point specification may match multiple locations (for example, using
       wildcards or aliases), and all of them are then probed.  A probe
       declaration may also contain several comma-separated specifications,
       all of which are probed.

       Brace expansion is a mechanism which allows a list of probe points to
       be generated. It is very similar to shell expansion. A component may
       be surrounded by a pair of curly braces to indicate that the comma-
       separated sequence of one or more subcomponents will each constitute
       a new probe point. The braces may be arbitrarily nested. The ordering
       of expanded results is based on product order.

       The question mark (?), exclamation mark (!) indicators and probe
       point conditions may not be placed in any expansions that are before
       the last component.

       The following is an example of brace expansion.

              syscall.{write,read}
              # Expands to
              syscall.write, syscall.read

              {kernel,module("nfs")}.function("nfs*")!
              # Expands to
              kernel.function("nfs*")!, module("nfs").function("nfs*")!


       Resolving some probe points requires DWARF debuginfo or "debug
       symbols" for the specific program being instrumented.  For some
       others, DWARF is automatically synthesized on the fly from source
       code header files.  For others, it is not needed at all.  Since a
       systemtap script may use any mixture of probe points together, the
       union of their DWARF requirements has to be met on the computer where
       script compilation occurs.  (See the --use-server option and the
       stap-server(8) man page for information about the remote compilation
       facility, which allows these requirements to be met on a different
       machine.)

       The following table lists many of the available probe point families
       and classifies them by their need for DWARF debuginfo for the
       specific program being probed.

       DWARF                          NON-DWARF                    SYMBOL-TABLE

       kernel.function, .statement    kernel.mark                  kernel.function*
       module.function, .statement    process.mark, process.plt    module.function*
       process.function, .statement   begin, end, error, never     process.function*
       process.mark*                  timer
       .function.callee               perf
       python2, python3               procfs
       kernel.trace                   process.statement.absolute
                                      process.begin, .end

       Probe types marked with an asterisk (*) are fallbacks, where
       systemtap can sometimes infer subset or substitute information.  In
       general, the more symbolic / debugging information is available, the
       higher the quality of probing that is possible.

ON-THE-FLY ARMING

       The following types of probe points may be armed/disarmed on-the-fly
       to save overheads during uninteresting times.  Arming conditions may
       also be added to other types of probes, but will be treated as a
       wrapping conditional and won't benefit from overhead savings.

       DISARMABLE                                exceptions
       kernel.function, kernel.statement
       module.function, module.statement
       process.*.function, process.*.statement
       process.*.plt, process.*.mark
       timer.                                    timer.profile


       The probe points begin and end are defined by the translator to refer
       to the time of session startup and shutdown.  All "begin" probe
       handlers are run, in some sequence, during the startup of the
       session.  All global variables will have been initialized prior to
       this point.  All "end" probes are run, in some sequence, during the
       normal shutdown of a session, such as in the aftermath of an exit ()
       function call, or an interruption from the user.  In the case of an
       error-triggered shutdown, "end" probes are not run.  There are no
       target variables available in either context.

       If the order of execution among "begin" or "end" probes is
       significant, then an optional sequence number may be provided:

              begin(N)
              end(N)

       The number N may be positive or negative.  The probe handlers are run
       in increasing order, and the order between handlers with the same se‐
       quence number is unspecified.  When "begin" or "end" are given with‐
       out a sequence, they are effectively sequence zero.
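
       A minimal sketch of sequence numbers on "begin" and "end" probes
       follows.  The particular numbers and messages are arbitrary choices
       for illustration; only their relative order matters.

              probe begin(-1) { println("runs first") }
              probe begin     { println("runs second (sequence zero)") }
              probe begin(1)  { println("runs third"); exit() }
              probe end(10)   { println("runs at shutdown") }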

       The error probe point is similar to the end probe, except that each
       such probe handler is run when the session ends after errors have
       occurred.  In such cases, "end" probes are skipped, but each "error"
       probe is still attempted.  This kind of probe can be used to clean up
       or emit a "final gasp".  It may also be numerically parametrized to
       set a sequence.

       The probe point never is specially defined by the translator to mean
       "never".  Its probe handler is never run, though its statements are
       analyzed for symbol / type correctness as usual.  This probe point
       may be useful in conjunction with optional probes.

       The syscall.* and nd_syscall.* aliases define several hundred
       probes, too many to detail here.  They are of the general form:

              syscall.NAME
              nd_syscall.NAME
              syscall.NAME.return
              nd_syscall.NAME.return

       Generally, a pair of probes is defined for each normal system call
       as listed in the syscalls(2) manual page, one for entry and one for
       return.  Those system calls that never return do not have a
       corresponding .return probe.  The nd_* family of probes is about the
       same, except that it uses non-DWARF based searching mechanisms, which
       may result in a lower quality of symbolic context data (parameters),
       and may miss some system calls.  You may want to try them first, in
       case kernel debugging information is not immediately available.

       Each probe alias provides a variety of variables.  Looking at the
       tapset source code is the most reliable way to find them.  Generally,
       each variable listed in the standard manual page is made available as
       a script-level variable, so syscall.open exposes filename, flags, and
       mode.  In addition, a standard suite of variables is available at
       most aliases:

       argstr A pretty-printed form of the entire argument list, without
              parentheses.

       name   The name of the system call.

       retval For return probes, the raw numeric system-call result.

       retstr For return probes, a pretty-printed string form of the system-
              call result.

       As usual for probe aliases, these variables are all initialized once
       from the underlying $context variables, so that later changes to
       $context variables are not automatically reflected.  Not all probe
       aliases obey all of these general guidelines.  Please report any
       bothersome ones you encounter as a bug.  Note that on some ker‐
       nel/userspace architecture combinations (e.g., 32-bit userspace on
       64-bit kernel), the underlying $context variables may need explicit
       sign extension / masking.  When this is an issue, consider using the
       tapset-provided variables instead of raw $context variables.

       If debuginfo availability is a problem, you may try using the non-
       DWARF syscall probe aliases instead.  Use the nd_syscall.  prefix in‐
       stead of syscall.  The same context variables are available, as far
       as possible.
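
       The standard variables described above can be used as in the
       following sketch.  The choice of the read system call and the
       five-second limit are assumptions for the example; replacing the
       syscall. prefix with nd_syscall. selects the non-DWARF variants.

              probe syscall.read {
                  printf("%s[%d] -> %s(%s)\n", execname(), pid(), name, argstr)
              }

              probe syscall.read.return {
                  printf("%s[%d] <- %s = %s\n", execname(), pid(), name, retstr)
              }

              probe timer.s(5) { exit() }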

       There are two main types of timer probes: "jiffies" timer probes and
       time interval timer probes.

       Intervals defined by the standard kernel "jiffies" timer may be used
       to trigger probe handlers asynchronously.  Two probe point variants
       are supported by the translator:

              timer.jiffies(N)
              timer.jiffies(N).randomize(M)

       The probe handler is run every N jiffies (a kernel-defined unit of
       time, typically between 1 and 60 ms).  If the "randomize" component
       is given, a linearly distributed random value in the range [-M..+M]
       is added to N every time the handler is run.  N is restricted to a
       reasonable range (1 to around a million), and M is restricted to be
       smaller than N.  There are no target variables provided in either
       context.  It is possible for such probes to be run concurrently on a
       multi-processor computer.

       Alternatively, intervals may be specified in units of time.  There
       are two probe point variants similar to the jiffies timer:

              timer.ms(N)
              timer.ms(N).randomize(M)

       Here, N and M are specified in milliseconds, but the full options for
       units are seconds (s/sec), milliseconds (ms/msec), microseconds
       (us/usec), nanoseconds (ns/nsec), and hertz (hz).  Randomization is
       not supported for hertz timers.

       The actual resolution of the timers depends on the target kernel.
       For kernels prior to 2.6.17, timers are limited to jiffies resolu‐
       tion, so intervals are rounded up to the nearest jiffies interval.
       After 2.6.17, the implementation uses hrtimers for tighter precision,
       though the actual resolution will be arch-dependent.  In either case,
       if the "randomize" component is given, then the random value will be
       added to the interval before any rounding occurs.
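
       A short sketch combining several timer forms follows.  The variable
       name fired and all interval values are arbitrary assumptions.

              global fired

              # Roughly every 100 ms, varied by up to 20 ms either way.
              probe timer.ms(100).randomize(20) { fired++ }

              # Twice per second; hertz timers take no randomize component.
              probe timer.hz(2) { printf("fired so far: %d\n", fired) }

              probe timer.s(3) { exit() }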

       Profiling timers are also available to provide probes that execute on
       all CPUs at the rate of the system tick (CONFIG_HZ) or at a given
       frequency (hz).  On some kernels, this is a one-concurrent-user-only
       or disabled facility, resulting in error -16 (EBUSY) during probe
       registration.

              timer.profile.tick
              timer.profile.freq.hz(N)

       Full context information of the interrupted process is available,
       making this probe suitable for a time-based sampling profiler.

       It is recommended to use the tapset probe timer.profile rather than
       timer.profile.tick. This probe point behaves identically to
       timer.profile.tick when the underlying functionality is available,
       and falls back to using perf.sw.cpu_clock on some recent kernels
       which lack the corresponding profile timer facility.
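
       As a sketch of the time-based sampling use mentioned above, the
       following script counts profile ticks per executable name.  The array
       name samples, the five-second duration, and the limit of ten entries
       are assumptions for illustration.

              global samples

              probe timer.profile { samples[execname()]++ }

              probe timer.s(5) {
                  # Print the ten most frequently sampled names.
                  foreach (name in samples- limit 10)
                      printf("%-16s %d\n", name, samples[name])
                  exit()
              }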

       Profiling timers with specified frequencies are only accurate up to
       around 100 hz. You may need to provide a larger value to achieve the
       desired rate.

       Note that if a timer probe is set to fire at a very high rate and if
       the probe body is complex, succeeding timer probes can get skipped,
       since the time for them to run has already passed. Normally systemtap
       reports missed probes, but it will not report these skipped probes.

       This family of probe points uses symbolic debugging information for
       the target kernel/module/program, as may be found in unstripped exe‐
       cutables, or the separate debuginfo packages.  They allow placement
       of probes logically into the execution path of the target program, by
       specifying a set of points in the source or object code.  When a
       matching statement executes on any processor, the probe handler is
       run in that context.

       Probe points in the DWARF family can be identified by the target ker‐
       nel module (or user process), source file, line number, function
       name, or some combination of these.

       Here is a list of DWARF probe points currently supported:


       (See the USER-SPACE section below for more information on the process
       probes.)

       The list above includes multiple variants and modifiers which provide
       additional functionality or filters. They are:

              .function
                     Places a probe near the beginning of the named
                     function, so that parameters are available as context
                     variables.

              .return
                     Places a probe at the moment after the return from the
                     named function, so the return value is available as the
                     "$return" context variable.

              .inline
                     Filters the results to include only instances of
                     inlined functions.  Note that inlined functions do not
                     have an identifiable return point, so .return is not
                     supported on .inline probes.

              .call  Filters the results to include only non-inlined
                     functions (the opposite set of .inline).

              .exported
                     Filters the results to include only exported functions.

              .statement
                     Places a probe at the exact spot, exposing those local
                     variables that are visible there.

              .statement.nearest
                     Places a probe at the nearest available line number for
                     each line number given in the statement.

              .callee
                     Places a probe on the callee function given in the
                     .callee modifier, where the callee must be a function
                     called by the target function given in .function.  The
                     advantage of doing this over directly probing the
                     callee function is that this probe point is run only
                     when the callee is called from the target function (add
                     the -DSTAP_CALLEE_MATCHALL directive to override this
                     when calling stap(1)).

                     Note that only callees that can be statically
                     determined are available.  For example, calls through
                     function pointers are not available.  Additionally,
                     calls to functions located in other objects (e.g.
                     libraries) are not available (instead use another probe
                     point).  This feature will only work for code compiled
                     with GCC 4.7+.

              .callees
                     Shortcut for .callee("*"), which places a probe on all
                     callees of the function.

              .callees(DEPTH)
                     Recursively places probes on callees.  For example,
                     .callees(2) will probe both callees of the target
                     function, as well as callees of those callees, and
                     .callees(3) goes one level deeper, and so on.  A callee
                     probe at depth N is only triggered when the N callers
                     in the callstack match those that were statically
                     determined during analysis (this also may be overridden
                     using -DSTAP_CALLEE_MATCHALL).

       In the above list of probe points, MPATTERN stands for a string
       literal that aims to identify the loaded kernel module of interest.
       For in-tree kernel modules, the name suffices (e.g. "btrfs").  The
       name may also include the "*", "[]", and "?" wildcards to match
       multiple in-tree modules.  Out-of-tree modules are also supported by
       specifying the full path to the .ko file; wildcards are not supported
       in this form.  The file must follow the convention of being named
       <module_name>.ko (characters ',' and '-' are replaced by '_').

       LPATTERN stands for a source program label.  It may also contain "*",
       "[]", and "?" wildcards.  PATTERN stands for a string literal that
       aims to identify a point in the program.  It is made up of three
       parts:
       ·   The first part is the name of a function, as would appear in the
           nm program's output.  This part may use the "*" and "?" wildcard‐
           ing operators to match multiple names.

       ·   The second part is optional and begins with the "@" character.
           It is followed by the path to the source file containing the
           function, which may include a wildcard pattern, such as mm/slab*.
           If it does not match as is, an implicit "*/" is optionally added
           before the pattern, so that a script need only name the last few
           components of a possibly long source directory path.

       ·   Finally, the third part is optional if the file name part was
           given, and identifies the line number in the source file preceded
           by a ":" or a "+".  The line number is assumed to be an absolute
           line number if preceded by a ":", or relative to the declaration
           line of the function if preceded by a "+".  All the lines in the
           function can be matched with ":*".  A range of lines x through y
           can be matched with ":x-y". Ranges and specific lines can be
           mixed using commas, e.g. ":x,y-z".
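
       The three parts of PATTERN can be combined as in the following
       sketch.  It assumes a kernel whose vfs_read function is compiled from
       a source file matching fs/read_write.c; on other kernels the names
       and line offsets may not resolve, which is why the line-relative
       probe is marked optional.

              # Function part with a wildcard, plus a source file part.
              probe kernel.function("vfs_*@fs/read_write.c") { println(pp()) }

              # All lines of one function.
              probe kernel.statement("vfs_read@fs/read_write.c:*") { println(pp()) }

              # Two lines past the declaration line, if a statement is there.
              probe kernel.statement("vfs_read@fs/read_write.c+2") ? { println(pp()) }

              probe timer.s(1) { exit() }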

       As an alternative, PATTERN may be a numeric constant, indicating an
       address.  Such an address may be found from symbol tables of the ap‐
       propriate kernel / module object file.  It is verified against known
       statement code boundaries, and will be relocated for use at run time.

       In guru mode only, absolute kernel-space addresses may be specified
       with the ".absolute" suffix.  Such an address is considered already
       relocated, as if it came from /proc/kallsyms, so it cannot be checked
       against statement/instruction boundaries.

       Many of the source-level context variables, such as function parame‐
       ters, locals, globals visible in the compilation unit, may be visible
       to probe handlers.  They may refer to these variables by prefixing
       their name with "$" within the scripts.  In addition, a special syn‐
       tax allows limited traversal of structures, pointers, and arrays.
       More syntax allows pretty-printing of individual variables or their
       groups.  See also @cast.  Note that variables may be inaccessible due
       to them being paged out, or for a few other reasons.  See also man
       error::fault(7stap).

       Functions called from DWARF class probe points and from process.mark
       probes may also refer to context variables.

       $var   refers to an in-scope variable "var".  If it's an integer-like
              type, it will be cast to a 64-bit int for systemtap script
              use.  String-like pointers (char *) may be copied to systemtap
              string values using the kernel_string or user_string
              functions.

       @var("varname")
              an alternative syntax for $varname

       @var("varname@src/file.c")
              refers to the global (either file local or external) variable
              varname defined when the file src/file.c was compiled.  The CU
              in which the variable is resolved is the first CU in the
              module of the probe point which matches the given file name at
              the end and has the shortest file name path (e.g. given
              @var("foo@bar/baz.c") and CUs with file name paths
              src/sub/module/bar/baz.c and src/bar/baz.c, the second CU will
              be chosen to resolve the (file) global variable foo).

       $var->field
              traversal via a structure's or a pointer's field.  This
              generalized indirection operator may be repeated to follow
              more levels.  Note that the . operator is not used for plain
              structure members, only -> for both purposes.  (This is
              because "." is reserved for string concatenation.)  Also note
              that for direct dereferencing of a $var pointer,
              {kernel,user}_{char,int,...}($var) should be used.  (Refer to
              stapfuncs(5) for more details.)

       $return
              is available in return probes only, for functions that are
              declared with a return value, which can be determined using
              @defined($return).

       $var[N]
              indexes into an array.  The index may be given as a literal
              number or even an arbitrary numeric expression.

       A number of operators exist for such basic context variable
       expressions:
       $$vars expands to a character string that is equivalent to

              sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x",
                      parm1, ..., parmN, var1, ..., varN)

              for each variable in scope at the probe point.  Some values
              may be printed as =?  if their run-time location cannot be
              found.

       $$locals
              expands to a subset of $$vars for only local variables.

       $$parms
              expands to a subset of $$vars for only function parameters.

       $$return
              is available in return probes only.  It expands to a string
              that is equivalent to sprintf("return=%x", $return) if the
              probed function has a return value, or else an empty string.

       & $EXPR
              expands to the address of the given context variable expres‐
              sion, if it is addressable.

       @defined($EXPR)
              expands to 1 if the given context variable expression is
              resolvable and to 0 otherwise, for use in conditionals such as

              @defined($foo->bar) ? $foo->bar : 0

       $EXPR$ expands to a string with all of $EXPR's members, equivalent to

              sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}",
                       $EXPR->a, $EXPR->b)

       $EXPR$$
              expands to a string with all of $EXPR's members and
              submembers, equivalent to

              sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}",
                      $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])
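
       The operators above might be used as in the following sketch.  It
       assumes that the running kernel has a vfs_read function with a
       parameter named count; the @defined guard keeps the script valid if
       that parameter is named differently.

              probe kernel.function("vfs_read") {
                  # All parameters, formatted by the translator.
                  printf("%s: %s\n", execname(), $$parms)

                  # One parameter, used only if it can be resolved.
                  if (@defined($count))
                      printf("  requested bytes: %d\n", $count)
              }

              probe timer.s(2) { exit() }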

       For the kernel ".return" probes, only a certain fixed number of re‐
       turns may be outstanding.  The default is a relatively small number,
       on the order of a few times the number of physical CPUs.  If many
       different threads concurrently call the same blocking function, such
       as futex(2) or read(2), this limit could be exceeded, and skipped
       "kretprobes" would be reported by "stap -t".  To work around this,
       specify a

              probe FOO.return.maxactive(NNN)

       suffix, with a large enough NNN to cover all expected concurrently
       blocked threads.  Alternately, use the

              stap -DKRETACTIVE=NNNN

       stap command line macro setting to override the default for all ".re‐
       turn" probes.
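
       For instance, a return probe with a raised limit could be written as
       in this sketch.  The function vfs_read and the value 200 are
       arbitrary assumptions; a suitable value depends on how many threads
       are expected to be inside the function at once.

              probe kernel.function("vfs_read").return.maxactive(200) {
                  printf("%s returned %d\n", ppfunc(), $return)
              }

              probe timer.s(2) { exit() }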

       For ".return" probes, context variables other than the "$return" may
       be accessible, as a convenience for a script programmer wishing to
       access function parameters.  These values are snapshots taken at the
       time of function entry.  (Local variables within the function are not
       generally accessible, since those variables did not exist in allocat‐
       ed/initialized form at the snapshot moment.)  These entry-snapshot
       variables should be accessed via @entry($var).

       In addition, arbitrary entry-time expressions can also be saved for
       ".return" probes using the @entry(expr) operator.  For example, one
       can compute the elapsed time of a function:

              probe kernel.function("do_filp_open").return {
                  println( get_timeofday_us() - @entry(get_timeofday_us()) )
              }

       The following table summarizes how values related to a function pa‐
       rameter context variable, a pointer named addr, may be accessed from
       a .return probe.

       at-entry value   past-exit value

       $addr            not available
       $addr->x->y      @cast(@entry($addr),"struct zz")->x->y
       $addr[0]         {kernel,user}_{char,int,...}(& $addr[0])

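       In the absence of debugging information, entry and exit points of
       kernel and module functions can be probed using the "kprobe" family
       of probes.  However, these do not permit looking up the arguments or
       local variables of the function.  The following constructs are
       supported:

              kprobe.function(FUNCTION)
              kprobe.function(FUNCTION).call
              kprobe.function(FUNCTION).return
              kprobe.module(NAME).function(FUNCTION)
              kprobe.module(NAME).function(FUNCTION).call
              kprobe.module(NAME).function(FUNCTION).return
              kprobe.statement(ADDRESS).absolute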

       Probes of type function are recommended for kernel functions, whereas
       probes of type module are recommended for probing functions of the
       specified module.  In case the absolute address of a kernel or module
       function is known, statement probes can be utilized.

       Note that FUNCTION and MODULE names must not contain wildcards, or
       the probe will not be registered.  Also, statement probes must be run
       under guru-mode only.
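
       A minimal sketch of the "kprobe" family follows.  The function name
       vfs_read is an assumption for the example; note that it is spelled
       out in full because wildcards are not accepted here, and that no
       $variables are used because none are available.

              probe kprobe.function("vfs_read") {
                  printf("%s entered vfs_read\n", execname())
              }

              probe kprobe.function("vfs_read").return {
                  printf("%s left vfs_read\n", execname())
              }

              probe timer.s(2) { exit() }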

       Support for user-space probing is available for kernels that are
       configured with the utrace extensions, or that have the uprobes
       facility (Linux 3.5 and later).  (Various kernel build configuration
       options need to be enabled; systemtap will advise if these are
       missing.)

       There are several forms.  First, a non-symbolic probe point:

              process(PID).statement(ADDRESS).absolute

       is analogous to kernel.statement(ADDRESS).absolute in that both use
       raw (unverified) virtual addresses and provide no $variables.  The
       target PID parameter must identify a running process, and ADDRESS
       should identify a valid instruction address.  All threads of that
       process will be probed.

       Second, non-symbolic user-kernel interface events handled by utrace
       may be probed:


       A process.begin probe gets called when a new process described by PID
       or FULLPATH gets created.  In addition, it is called once from the
       context of each preexisting process, at systemtap script startup.
       This is useful to track live processes.  A process.thread.begin probe
       gets called when a new thread described by PID or FULLPATH gets
       created.  A process.end probe gets called when a process described by
       PID or FULLPATH dies.  A process.thread.end probe gets called when a
       thread described by PID or FULLPATH dies.  A process.syscall probe
       gets called when a thread described by PID or FULLPATH makes a system
       call.  The system call number is available in the $syscall context
       variable, and the first 6 arguments of the system call are available
       in the $argN (e.g. $arg1, $arg2, ...) context variables.  A
       process.syscall.return probe gets called when a thread described by
       PID or FULLPATH returns from a system call.  The system call number
       is available in the $syscall context variable, and the return value
       of the system call is available in the $return context variable.  A
       process.insn probe gets called for every single-stepped instruction
       of the process described by PID or FULLPATH.  A process.insn.block
       probe gets called for every block-stepped instruction of the process
       described by PID or FULLPATH.

       If a process probe is specified without a PID or FULLPATH, all user
       threads will be probed.  However, if systemtap was invoked with the
       -c or -x options, then process probes are restricted to the process
       hierarchy associated with the target process.  If a process probe is
       unspecified (i.e. without a PID or FULLPATH), but with the -c option,
       the PATH of the -c cmd will be heuristically filled into the process
       PATH. In that case, only command parameters are allowed in the -c
       command (i.e. no command substitution allowed and no occurrences of
       any of these characters: '|&;<>(){}').
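
       The following sketch uses process probes without a PID or FULLPATH,
       so it is intended to be run with the -c or -x option to restrict it
       to one process hierarchy, for example with an invocation such as
       stap trace.stp -c "ls /" (the script name trace.stp and the command
       are hypothetical).

              probe process.begin {
                  printf("begin   %s[%d]\n", execname(), pid())
              }

              probe process.end {
                  printf("end     %s[%d]\n", execname(), pid())
              }

              probe process.syscall {
                  printf("syscall %d arg1=%x\n", $syscall, $arg1)
              }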

       Third, symbolic static instrumentation compiled into programs and
       shared libraries may be probed:


       A .mark probe gets called via a static probe which is defined in the
       application by STAP_PROBE1(PROVIDER,LABEL,arg1), one of a family of
       macros defined in sys/sdt.h.  The PROVIDER is an arbitrary
       application identifier, LABEL is the marker site identifier, and arg1
       is the integer-typed argument.  STAP_PROBE1 is used for probes with 1
       argument,
       STAP_PROBE2 is used for probes with 2 arguments, and so on.  The ar‐
       guments of the probe are available in the context variables $arg1,
       $arg2, ...  An alternative to using the STAP_PROBE macros is to use
       the dtrace script to create custom macros.  Additionally, the vari‐
       ables $$name and $$provider are available as parts of the probe point
       name.  The sys/sdt.h macro names DTRACE_PROBE* are available as
       aliases for STAP_PROBE*.
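
       On the script side, a marker defined this way might be probed as in
       the sketch below.  The executable path ./myapp and the label
       request_start are hypothetical; they stand for an application that
       was built with a one-argument STAP_PROBE1 marker of that name.

              probe process("./myapp").mark("request_start") {
                  printf("%s:%s arg1=%d\n", $$provider, $$name, $arg1)
              }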

       Finally, full symbolic source-level probes in user-space programs and
       shared libraries are supported.  These are exactly analogous to the
       symbolic DWARF-based kernel/module probes described above.  They ex‐
       pose the same sorts of context $variables for function parameters,
       local variables, and so on.


       Note that for all process probes, PATH names refer to executables
       that are searched for the same way shells do: relative to the working
       directory if they contain a "/" character, otherwise in $PATH.  If
       PATH names refer to scripts, the actual interpreters (specified in
       the first line of the script, after the #! characters) are probed.

       Tapset process probes placed in the special directory
       $prefix/share/systemtap/tapset/PATH/ with relative paths will have
       their process parameter prefixed with the location of the tapset.
       For example,


       expands to


       when placed in $prefix/share/systemtap/tapset/PATH/usr/bin/
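
       To illustrate with hypothetical names: suppose a tapset file stored
       in $prefix/share/systemtap/tapset/PATH/usr/bin/ declares an alias on
       a relative process path.  The program name foo, the function name
       main, and the alias name foo_main are placeholders for this sketch:

              # As written in the tapset file (relative path):
              probe foo_main = process("foo").function("main") { }

              # How it is treated after the tapset location is prefixed:
              # probe foo_main = process("/usr/bin/foo").function("main") { }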

       If PATH is a process component parameter referring to shared li‐
       braries then all processes that map it at runtime would be selected
       for probing.  If PATH is a library component parameter referring to
       shared libraries then the process specified by the process component
       would be selected.  Note that the PATH pattern in a library component
       will always apply to libraries statically determined to be in use by
       the process. However, you may also specify the full path to any li‐
       brary file even if not statically needed by the process.

       A .plt probe will probe functions in the program linkage table corre‐
       sponding to the rest of the probe point.  .plt can be specified as a
       shorthand for .plt("*").  The symbol name is available as a $$name
       context variable; function arguments are not available, since PLTs
       are processed without debuginfo.  A .plt.return probe places a probe
       at the moment after the return from the named function.
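
       A small sketch of a .plt probe, assuming /bin/ls as an arbitrary
       target program (any dynamically linked executable would do).  The
       array name calls is illustrative:

              # Count calls made through each PLT entry of the program.
              global calls

              probe process("/bin/ls").plt
              {
                calls[$$name]++
              }

              # On shutdown, print the ten most frequently called symbols.
              probe end
              {
                foreach (sym in calls- limit 10)
                  printf("%-24s %d\n", sym, calls[sym])
              }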

       If the PATH string contains wildcards as in the MPATTERN case, then
       standard globbing is performed to find all matching paths.  In this
       case, the $PATH environment variable is not used.

       If systemtap was invoked with the -c or -x options, then process
       probes are restricted to the process hierarchy associated with the
       target process.

       Support for probing Java methods is available using Byteman as a
       backend. Byteman is an instrumentation tool from the JBoss project
       which systemtap can use to monitor invocations for a specific method
       or line in a Java program.

       Systemtap does so by generating a Byteman script listing the probes
       to instrument and then invoking the Byteman bminstall utility.

       This Java instrumentation support is currently a prototype feature
       with major limitations.  Moreover, Java probing currently does not
       work across users; the stap script must run (with appropriate
       permissions) as the same user as the Java process being probed.
       (Thus a stap script run as root currently cannot probe Java methods
       in a non-root-user Java process.)

       The first probe type refers to Java processes by the name of the Java
       process:


       The PNAME argument must be a pre-existing jvm pid, and be identifi‐
       able via a jps listing.

       The PATTERN parameter specifies the signature of the Java method to
       probe. The signature must consist of the exact name of the method,
       followed by a bracketed list of the types of the arguments, for in‐
       stance "myMethod(int,double,Foo)". Wildcards are not supported.

       The probe can be set to trigger at a specific line within the method
       by appending a line number with colon, just as in other types of
       probes: "myMethod(int,double,Foo):245".

       The CLASSNAME parameter identifies the Java class the method belongs
       to, either with or without the package qualification.  By default,
       the probe only triggers on descendants of the class that do not
       override the method definition of the original class.  However,
       CLASSNAME can take an optional caret prefix, as in "^MyClass", which
       specifies that the probe should also trigger on all descendants of
       MyClass that override the original method.  For instance, every
       method with signature foo(int) in a program PNAME can be probed at
       once using
       java("PNAME").class("^java.lang.Object").method("foo(int)").


       The second probe type works analogously, but refers to Java processes
       by PID:


       (PIDs for an already running process can be obtained using the jps(1)
       utility.)

       Context variables defined within java probes include $arg1 through
       $arg10 (for up to the first 10 arguments of a method), represented as
       character-pointers for the toString() form of each actual argument.
       The arg1 through arg10 script variables provide access to these as
       ordinary strings, fetched via user_string_warn().
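
       A hedged sketch tying these pieces together.  It assumes the
       java("PNAME").class("CLASSNAME").method("PATTERN") probe form, and
       the names MyApp, MyClass, and foo are hypothetical:

              # Trigger on foo(int) in MyClass and in overriding descendants.
              probe java("MyApp").class("^MyClass").method("foo(int)")
              {
                # arg1 is the toString() form of the first argument.
                printf("foo called with %s\n", arg1)
              }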

       Prior to systemtap version 3.1, $arg1 through $arg10 could contain
       either integers or character pointers, depending on the types of the
       objects being passed to each particular java method.  This previous
       behaviour may be invoked with the stap --compatible=3.0 flag.

       These probe points allow procfs "files" in /proc/systemtap/MODNAME to
       be created, read and written using a permission that may be modified
       using the proper umask value. Default permissions are 0400 for read
       probes, and 0200 for write probes. If both a read and write probe are
       being used on the same file, a default permission of 0600 will be
       used.  Using procfs.umask(0040).read would result in a 0404 permis‐
       sion set for the file.  (MODNAME is the name of the systemtap mod‐
       ule). The proc filesystem is a pseudo-filesystem which is used as an
       interface to kernel data structures. There are several probe point
       variants supported by the translator:


       Note that there are a few differences when procfs probes are used in
       the stapbpf runtime.  FIFO special files are used instead of proc
       filesystem files.  These files are created in
       /var/tmp/systemtap-USER/MODNAME.  (USER is the name of the user.)
       Additionally, users cannot create both read and write probes on the
       same file.

       PATH is the file name (relative to /proc/systemtap/MODNAME or
       /var/tmp/systemtap-USER/MODNAME) to be created.  If no PATH is speci‐
       fied (as in the last two variants above), PATH defaults to "command".
       The file name "__stdin" is used internally by systemtap for input
       probes and should not be used as a PATH for procfs probes; see the
       input probe section below.

       When a user reads /proc/systemtap/MODNAME/PATH (normal runtime) or
       /var/tmp/systemtap-USER/MODNAME (stapbpf runtime), the corresponding
       procfs read probe is triggered.  The string data to be read should be
       assigned to a variable named $value, like this:

              procfs("PATH").read { $value = "100\n" }

       When a user writes into /proc/systemtap/MODNAME/PATH (normal runtime)
       or /var/tmp/systemtap-USER/MODNAME (stapbpf runtime), the correspond‐
       ing procfs write probe is triggered.  The data the user wrote is
       available in the string variable named $value, like this:

              procfs("PATH").write { printf("user wrote: %s", $value) }
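
       The two handlers can be combined into a small runtime tunable.  This
       is a sketch for the normal (non-stapbpf) runtime under assumed
       names: the file name "threshold" and the global variable are
       illustrative, and strtol() is the string-to-number function from the
       standard tapset:

              global threshold = 100

              # Triggered by: cat /proc/systemtap/MODNAME/threshold
              probe procfs("threshold").read
              {
                $value = sprintf("%d\n", threshold)
              }

              # Triggered by: echo 250 > /proc/systemtap/MODNAME/threshold
              probe procfs("threshold").write
              {
                threshold = strtol($value, 10)
              }

       Because both a read and a write probe share the file, it would get
       the default 0600 permission described above.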

       MAXSIZE is the size of the procfs read buffer.  Specifying MAXSIZE
       allows larger procfs output.  If no MAXSIZE is specified, the procfs
       read buffer defaults to STP_PROCFS_BUFSIZE (which defaults to
       MAXSTRINGLEN, the maximum length of a string).  If setting the procfs
       read buffers for more than one file is needed, it may be easiest to
       override the STP_PROCFS_BUFSIZE definition.  Here's an example of
       using MAXSIZE:

              procfs.read.maxsize(1024) {
                  $value = "long string..."
                  $value .= "another long string..."
                  $value .= "another long string..."
                  $value .= "another long string..."
              }

       These probe points make input from stdin available to the script
       during runtime.  The translator currently supports two variants of
       this family: input.char and input.line.


       input.char is triggered each time a character is read from stdin. The
       current character is available in the string variable named char.
       There is no newline buffering; the next character is read from stdin
       as soon as it becomes available.

       input.line causes all characters read from stdin to be buffered until
       a newline is read, at which point the probe will be triggered. The
       current line of characters (including the newline) is made available
       in a string variable named line.  Note that no more than MAXSTRINGLEN
       characters will be buffered. Any additional characters will not be
       included in line.

       Input probes are aliases for procfs("__stdin").write.  Systemtap re‐
       configures stdin if the presence of this procfs probe is detected,
       therefore "__stdin" should not be used as a path argument for procfs
       probes.  Additionally, input probes will not work with the -F and
       --remote options.
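
       For illustration, a short sketch that echoes each line typed on
       stdin and stops on a chosen keyword.  The keyword "quit" is an
       arbitrary choice for this example:

              probe input.line
              {
                # line still ends with the newline that triggered the probe.
                if (line == "quit\n")
                  exit()
                else
                  printf("read: %s", line)
              }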

       These probe points allow observation of network packets using the
       netfilter mechanism. A netfilter probe in systemtap corresponds to a
       netfilter hook function in the original netfilter probes API. It is
       probably more convenient to use tapset::netfilter(3stap), which wraps
       the primitive netfilter hooks and does the work of extracting useful
       information from the context variables.

       There are several probe point variants supported by the translator:


       PROTOCOL_F is the protocol family to listen for, currently one of
       NFPROTO_IPV4, NFPROTO_IPV6, NFPROTO_ARP, or NFPROTO_BRIDGE.

       HOOKNAME is the point, or 'hook', in the protocol stack at which to
       intercept the packet.  The available hook names for each protocol
       family are taken from the kernel header files
       <linux/netfilter_ipv4.h>, <linux/netfilter_ipv6.h>,
       <linux/netfilter_arp.h> and <linux/netfilter_bridge.h>.  For
       instance, allowable hook names for NFPROTO_IPV4 are
       NF_INET_PRE_ROUTING, NF_INET_LOCAL_IN, NF_INET_FORWARD,
       NF_INET_LOCAL_OUT, and NF_INET_POST_ROUTING.

       PRIORITY is an integer priority giving the order in which the probe
       point should be triggered relative to any other netfilter hook func‐
       tions which trigger on the same packet. Hook functions execute on
       each packet in order from smallest priority number to largest priori‐
       ty number. If no PRIORITY is specified (as in the first two probe
       point variants above), PRIORITY defaults to "0".

       There are a number of predefined priority names of the form
       NF_IP_PRI_* and NF_IP6_PRI_* which are defined in the kernel header
       files <linux/netfilter_ipv4.h> and <linux/netfilter_ipv6.h>
       respectively.  A script may use these names instead of specifying an
       integer priority.  (The probe points for NFPROTO_ARP and
       NFPROTO_BRIDGE currently do not expose any named hook priorities to
       the script writer.)  Thus, allowable ways to specify the priority
       include an integer value or one of these predefined names.


       A script using guru mode is permitted to specify any identifier or
       number as the parameter for hook, pf, and priority. This feature
       should be used with caution, as the parameter is inserted verbatim
       into the C code generated by systemtap.

       The netfilter probe points define the following context variables:

       $hooknum
              The hook number.

       $skb   The address of the sk_buff struct representing the packet. See
              <linux/skbuff.h> for details on how to use this struct, or al‐
              ternatively use the tapset tapset::netfilter(3stap) for easy
              access to key information.

       $in    The address of the net_device struct representing the network
              device on which the packet was received (if any). May be 0 if
              the device is unknown or undefined at that stage in the proto‐
              col stack.

       $out   The address of the net_device struct representing the network
              device on which the packet will be sent (if any). May be 0 if
              the device is unknown or undefined at that stage in the proto‐
              col stack.

       $verdict
              (Guru mode only.) Assigning one of the verdict values defined
              in <linux/netfilter.h> to this variable alters the further
              progress of the packet through the protocol stack.  For
              instance, the following guru mode script forces all IPv6
              network packets to be dropped:

              probe netfilter.pf("NFPROTO_IPV6").hook("NF_IP6_PRE_ROUTING") {
                $verdict = 0 /* nf_drop */
              }
              For convenience, unlike the primitive probe points discussed
              here, the probes defined in tapset::netfilter(3stap) export
              the lowercase names of the verdict constants (e.g. NF_DROP be‐
              comes nf_drop) as local variables.
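
       By contrast, a script that only observes packets does not need guru
       mode.  The following is a sketch, assuming the
       netfilter.pf("PROTOCOL_F").hook("HOOKNAME") form and the
       NF_INET_LOCAL_IN hook name from the kernel headers.  The variable
       name and the five-second interval are arbitrary:

              global packets

              # Count IPv4 packets destined for this host.
              probe netfilter.pf("NFPROTO_IPV4").hook("NF_INET_LOCAL_IN")
              {
                packets++
              }

              # Report and reset the counter every five seconds.
              probe timer.s(5)
              {
                printf("%d IPv4 packets delivered locally\n", packets)
                packets = 0
              }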

       This family of probe points hooks up to static probing tracepoints
       inserted into the kernel or modules.  As with markers, these trace‐
       points are special macro calls inserted by kernel developers to make
       probing faster and more reliable than with DWARF-based probes, and
       DWARF debugging information is not required to probe tracepoints.
       Tracepoints have an extra advantage of more strongly-typed parameters
       than markers.

       Tracepoint probes look like: kernel.trace("name").  The tracepoint
       name string, which may contain the usual wildcard characters, is
       matched against the names defined by the kernel developers in the
       tracepoint header files. To restrict the search to specific subsys‐
       tems (e.g. sched, ext3, etc...), the following syntax can be used:
       kernel.trace("system:name").  The tracepoint system string may also
       contain the usual wildcard characters.

       The handler associated with a tracepoint-based probe may read the
       optional parameters specified at the macro call site.  These are
       named according to the declaration by the tracepoint author.  For
       example, the tracepoint probe kernel.trace("sched:sched_switch")
       provides the parameters $prev and $next.  If the parameter is a
       complex type, as in a struct pointer, then a script can access fields
       with the same syntax as DWARF $target variables.  Tracepoint
       parameters themselves cannot be modified, but in guru mode a script
       may modify fields of parameters.

       The subsystem and name of the tracepoint are available in $$system
       and $$name and a string of name=value pairs for all parameters of the
       tracepoint is available in $$vars or $$parms.
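
       As a brief sketch of these variables in use, the handler below
       prints the tracepoint identity and follows the $prev and $next
       pointers.  It assumes both are task_struct pointers with a pid
       field, as in current kernels:

              probe kernel.trace("sched:sched_switch")
              {
                # $$system and $$name are strings; pid is an integer field.
                printf("%s:%s %d -> %d\n", $$system, $$name,
                       $prev->pid, $next->pid)
              }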

       This family of probe points hooks up to an older style of static
       probing markers inserted into older kernels or modules.  These mark‐
       ers are special STAP_MARK macro calls inserted by kernel developers
       to make probing faster and more reliable than with DWARF-based
       probes.  Further, DWARF debugging information is not required to
       probe markers.

       Marker probe points begin with kernel.  The next part names the
       marker itself: mark("name").  The marker name string, which may
       contain the usual wildcard characters, is matched against the names
       given to the marker macros when the kernel and/or module was
       compiled.  Optionally, you can specify format("format").  Specifying
       the marker format string allows differentiation between two markers
       with the same name but different marker format strings.

       The handler associated with a marker-based probe may read the
       optional parameters specified at the macro call site.  These are
       named $arg1 through $argNN, where NN is the number of parameters
       supplied by the macro.  Number and string parameters are passed in a
       type-safe manner.

       The marker format string associated with a marker is available in
       $format, and the marker name string is available in $name.

       This family of probes is used to set hardware watchpoints for a given
       (global) kernel symbol.  The probes take three components as inputs:

       1. The virtual address or name of the kernel symbol to be traced is
       supplied as the argument to this class of probes.  (Only variables in
       the data segment are supported.  Local variables of a function cannot
       be probed.)

       2. The nature of the access to be probed: a .write probe is triggered
       when a write happens at the specified address or symbol name; a .rw
       probe is triggered when either a read or a write happens.

       3. .length (optional).  Users can specify the address interval to be
       probed using the "length" construct.  The user-specified length is
       approximated to the closest address length that the architecture can
       support.  If the specified length exceeds the limits imposed by the
       architecture, an error message is flagged and probe registration
       fails.  Where "length" is not specified, the translator requests a
       hardware breakpoint probe of length 1.  Note that the "length"
       construct is not valid with symbol names.

       The following constructs are supported:


       This set of probes makes use of the debug registers of the processor,
       which are a scarce resource (4 on x86, 1 on powerpc).  The script
       translation flags a warning if a user requests more hardware
       breakpoint probes than the limit set by the architecture.  For
       example, a pass-2 warning is issued when an input script requests 5
       hardware breakpoint probes on an x86 system, since the x86
       architecture supports a maximum of 4 breakpoints.  Users are
       cautioned to set probes judiciously.
       This family of probe points interfaces to the kernel "perf event"
       infrastructure for controlling hardware performance counters.  The
       events being attached to are described by the "type" and "config"
       fields of the perf_event_attr structure, and are sampled at an
       interval governed by the "sample_period" and "sample_freq" fields.

       These fields are made available to systemtap scripts using the fol‐
       lowing syntax:

              probe perf.type(NN).config(MM).sample(XX)
              probe perf.type(NN).config(MM).hz(XX)
              probe perf.type(NN).config(MM)
              probe perf.type(NN).config(MM).process("PROC")
              probe perf.type(NN).config(MM).counter("COUNTER")
              probe perf.type(NN).config(MM).process("PROC").counter("NAME")

       The systemtap probe handler is called once per XX increments of the
       underlying performance counter when using the .sample field or at a
       frequency in hertz when using the .hz field. When not specified, the
       default behavior is to sample at a count of 1000000.  The range of
       valid type/config is described by the perf_event_open(2) system call,
       and/or the linux/perf_event.h file.  Invalid combinations or exhaust‐
       ed hardware counter resources result in errors during systemtap
       script startup.  Systemtap does not sanity-check the values: it mere‐
       ly passes them through to the kernel for error- and safety-checking.
       By default the perf event probe is systemwide unless .process is
       specified, which will bind the probe to a specific task.  If the name
       is omitted then it is inferred from the stap -c argument.   A perf
       event can be read on demand using .counter.  The body of the perf
       probe handler will not be invoked for a .counter probe; instead, the
       counter is read in a user space probe via:

          process("PROC").statement("func@file") {stat <<< @perf("NAME")}
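
       As a sketch of the sampling form, the script below attributes CPU
       cycle samples to process names.  The array name and the report size
       are arbitrary; the type and config values are the ones defined in
       linux/perf_event.h:

              global samples

              # type 0 is PERF_TYPE_HARDWARE and config 0 is
              # PERF_COUNT_HW_CPU_CYCLES in linux/perf_event.h.
              probe perf.type(0).config(0).sample(1000000)
              {
                samples[execname()]++
              }

              # On shutdown, list the ten most frequently sampled names.
              probe end
              {
                foreach (name in samples- limit 10)
                  printf("%-16s %d\n", name, samples[name])
              }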

       Support for probing python 2 and python 3 functions is available
       with the help of an extra python support module.  Note that the
       debuginfo for the version of python being probed is required.  To run
       a python script with the extra python support module, add the '-m
       HelperSDT' option to your python command, like this:

              stap foo.stp -c "python -m HelperSDT"

       Python probes look like the following:


       The list above includes multiple variants and modifiers which provide
       additional functionality or filters. They are:

              .function
                     Places a probe at the beginning of the named function
                     by default, unless modified by PATTERN. Parameters are
                     available as context variables.

              .call  Places a probe at the beginning of the named function.
                     Parameters are available as context variables.

              .return
                     Places a probe at the moment before the return from the
                     named function. Parameters and local/global python
                     variables are available as context variables.

       PATTERN stands for a string literal that aims to identify a point in
       the python program.  It is made up of three parts:

       ·   The first part is the name of a function (e.g. "foo") or class
           method (e.g. "bar.baz"). This part may use the "*" and "?" wild‐
           carding operators to match multiple names.

       ·   The second part is optional and begins with the "@" character.
           It is followed by the path to the source file containing the
           function, which may include a wildcard pattern. The python path
           is searched for a matching filename.

       ·   Finally, the third part is optional if the file name part was
           given, and identifies the line number in the source file preceded
           by a ":" or a "+".  The line number is assumed to be an absolute
           line number if preceded by a ":", or relative to the declaration
           line of the function if preceded by a "+".  All the lines in the
           function can be matched with ":*".  A range of lines x through y
           can be matched with ":x-y". Ranges and specific lines can be
           mixed using commas, e.g. ":x,y-z".

       In the above list of probe points, MPATTERN stands for a python mod‐
       ule or script name that names the python module of interest. This
       part may use the "*" and "?" wildcarding operators to match multiple
       names. The python path is searched for a matching filename.
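
       A hedged sketch of a python probe, assuming the
       python3.module("MPATTERN").function("PATTERN") probe form.  The
       module name myscript and the function name compute are hypothetical:

              # Fires on entry to compute() defined in myscript.py.
              probe python3.module("myscript").function("compute").call
              {
                printf("compute() entered\n")
              }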

EXAMPLES

       Here are some example probe points, defining the associated events.

       begin, end, end
              refers to the startup and normal shutdown of the session.  In
              this case, the handler would run once during startup and twice
              during shutdown.

       timer.jiffies(1000).randomize(200)
              refers to a periodic interrupt, every 1000 +/- 200 jiffies.

       kernel.function("*init*"), kernel.function("*exit*")
              refers to all kernel functions with "init" or "exit" in the
              name.

       kernel.function("*@kernel/time.c:240")
              refers to any functions within the "kernel/time.c" file that
              span line 240.  Note that this is not a probe at the
              statement at that line number.  Use the kernel.statement probe
              instead.

       kernel.trace("sched_*")
              refers to all scheduler-related (really, prefixed) tracepoints
              in the kernel.

       kernel.mark("getuid")
              refers to an obsolete STAP_MARK(getuid, ...) macro call in the
              kernel.

       module("usb*").function("*sync*").return
              refers to the moment of return from all functions with "sync"
              in the name in any of the USB drivers.

       kernel.statement(ADDRESS)
              refers to the first byte of the statement whose compiled
              instructions include the given address in the kernel.

       kernel.statement("*@kernel/time.c:296")
              refers to the statement of line 296 within "kernel/time.c".

       kernel.statement("bio_init@fs/bio.c+3")
              refers to the statement at line bio_init+3 within "fs/bio.c".

       kernel.data("pid_max").write
              refers to a hardware breakpoint of type "write" set on pid_max

              refers to the group of probe aliases with any name in the
              third position

SEE ALSO


COLOPHON

       This page is part of the systemtap (a tracing and live-system
       analysis tool) project.  This page was obtained from the project's
       upstream Git repository on 2020-09-18.  (At that time, the date of
       the most recent commit that was found in the repository was
       2020-09-18.)
