NAME | SYNOPSIS | DESCRIPTION | PARAMETERS | EXAMPLES | FURTHER READING | SEE ALSO | AUTHORS | COLOPHON

BPF classifier and actions in tc(8) LinuxBPF classifier and actions in tc(8)

NAME         top

       BPF  -  BPF  programmable  classifier  and actions for ingress/egress
       queueing disciplines

SYNOPSIS         top

   eBPF classifier (filter) or action:
       tc filter ... bpf [ object-file OBJ_FILE ] [ section CLS_NAME ] [
       export UDS_FILE ] [ verbose ] [ skip_hw | skip_sw ] [ police
       POLICE_SPEC ] [ action ACTION_SPEC ] [ classid CLASSID ]
       tc action ... bpf [ object-file OBJ_FILE ] [ section CLS_NAME ] [
       export UDS_FILE ] [ verbose ]

   cBPF classifier (filter) or action:
       tc filter ... bpf [ bytecode-file BPF_FILE | bytecode BPF_BYTECODE ]
       [ police POLICE_SPEC ] [ action ACTION_SPEC ] [ classid CLASSID ]
       tc action ... bpf [ bytecode-file BPF_FILE | bytecode BPF_BYTECODE ]

DESCRIPTION         top

       Extended Berkeley Packet Filter ( eBPF ) and classic Berkeley Packet
       Filter (originally known as BPF, for better distinction referred to
       as cBPF here) are both available as a fully programmable and highly
       efficient classifier and actions. They both offer a minimal
       instruction set for implementing small programs which can safely be
       loaded into the kernel and thus executed in a tiny virtual machine
       from kernel space. An in-kernel verifier guarantees that a specified
       program always terminates and neither crashes nor leaks data from the
       kernel.

       In Linux, it's generally considered that eBPF is the successor of
       cBPF.  The kernel internally transforms cBPF expressions into eBPF
       expressions and executes the latter. Execution of them can be
       performed in an interpreter or at setup time, they can be just-in-
       time compiled (JIT'ed) to run as native machine code. Currently,
       x86_64, ARM64, s390, ppc64 and sparc64 architectures have eBPF JIT
       support, whereas PPC, SPARC, ARM and MIPS have cBPF, but did not
       (yet) switch to eBPF JIT support.

       eBPF's instruction set has similar underlying principles as the cBPF
       instruction set, it however is modelled closer to the underlying
       architecture to better mimic native instruction sets with the aim to
       achieve a better run-time performance. It is designed to be JIT'ed
       with a one to one mapping, which can also open up the possibility for
       compilers to generate optimized eBPF code through an eBPF backend
       that performs almost as fast as natively compiled code. Given that
       LLVM provides such an eBPF backend, eBPF programs can therefore
       easily be programmed in a subset of the C language. Other than that,
       eBPF infrastructure also comes with a construct called "maps". eBPF
       maps are key/value stores that are shared between multiple eBPF
       programs, but also between eBPF programs and user space applications.

       For the traffic control subsystem, classifier and actions that can be
       attached to ingress and egress qdiscs can be written in eBPF or cBPF.
       The advantage over other classifier and actions is that eBPF/cBPF
       provides the generic framework, while users can implement their
       highly specialized use cases efficiently. This means that the
       classifier or action written that way will not suffer from feature
       bloat, and can therefore execute its task highly efficient. It allows
       for non-linear classification and even merging the action part into
       the classification. Combined with efficient eBPF map data structures,
       user space can push new policies like classids into the kernel
       without reloading a classifier, or it can gather statistics that are
       pushed into one map and use another one for dynamically load
       balancing traffic based on the determined load, just to provide a few
       examples.

PARAMETERS         top

   object-file
       points to an object file that has an executable and linkable format
       (ELF) and contains eBPF opcodes and eBPF map definitions. The LLVM
       compiler infrastructure with clang(1) as a C language front end is
       one project that supports emitting eBPF object files that can be
       passed to the eBPF classifier (more details in the EXAMPLES section).
       This option is mandatory when an eBPF classifier or action is to be
       loaded.

   section
       is the name of the ELF section from the object file, where the eBPF
       classifier or action resides. By default the section name for the
       classifier is called "classifier", and for the action "action". Given
       that a single object file can contain multiple classifier and
       actions, the corresponding section name needs to be specified, if it
       differs from the defaults.

   export
       points to a Unix domain socket file. In case the eBPF object file
       also contains a section named "maps" with eBPF map specifications,
       then the map file descriptors can be handed off via the Unix domain
       socket to an eBPF "agent" herding all descriptors after tc lifetime.
       This can be some third party application implementing the IPC
       counterpart for the import, that uses them for calling into bpf(2)
       system call to read out or update eBPF map data from user space, for
       example, for monitoring purposes or to push down new policies.

   verbose
       if set, it will dump the eBPF verifier output, even if loading the
       eBPF program was successful. By default, only on error, the verifier
       log is being emitted to the user.

   skip_hw | skip_sw
       hardware offload control flags. By default TC will try to offload
       filters to hardware if possible.  skip_hw explicitly disables the
       attempt to offload.  skip_sw forces the offload and disables running
       the eBPF program in the kernel.  If hardware offload is not possible
       and this flag was set kernel will report an error and filter will not
       be installed at all.

   police
       is an optional parameter for an eBPF/cBPF classifier that specifies a
       police in tc(1) which is attached to the classifier, for example, on
       an ingress qdisc.

   action
       is an optional parameter for an eBPF/cBPF classifier that specifies a
       subsequent action in tc(1) which is attached to a classifier.

   classid
   flowid
       provides the default traffic control class identifier for this
       eBPF/cBPF classifier. The default class identifier can also be
       overwritten by the return code of the eBPF/cBPF program. A default
       return code of -1 specifies the here provided default class
       identifier to be used. A return code of the eBPF/cBPF program of 0
       implies that no match took place, and a return code other than these
       two will override the default classid. This allows for efficient,
       non-linear classification with only a single eBPF/cBPF program as
       opposed to having multiple individual programs for various class
       identifiers which would need to reparse packet contents.

   bytecode
       is being used for loading cBPF classifier and actions only. The cBPF
       bytecode is directly passed as a text string in the form of ´s,c t f
       k,c t f k,c t f k,...´ , where s denotes the number of subsequent
       4-tuples. One such 4-tuple consists of c t f k decimals, where c
       represents the cBPF opcode, t the jump true offset target, f the jump
       false offset target and k the immediate constant/literal. There are
       various tools that generate code in this loadable format, for
       example, bpf_asm that ships with the Linux kernel source tree under
       tools/net/ , so it is certainly not expected to hack this by hand.
       The bytecode or bytecode-file option is mandatory when a cBPF
       classifier or action is to be loaded.

   bytecode-file
       also being used to load a cBPF classifier or action. It's effectively
       the same as bytecode only that the cBPF bytecode is not passed
       directly via command line, but rather resides in a text file.

EXAMPLES         top

   eBPF TOOLING
       A full blown example including eBPF agent code can be found inside
       the iproute2 source package under: examples/bpf/

       As prerequisites, the kernel needs to have the eBPF system call
       namely bpf(2) enabled and ships with cls_bpf and act_bpf kernel
       modules for the traffic control subsystem. To enable eBPF/eBPF JIT
       support, depending which of the two the given architecture supports:

           echo 1 > /proc/sys/net/core/bpf_jit_enable

       A given restricted C file can be compiled via LLVM as:

           clang -O2 -emit-llvm -c bpf.c -o - | llc -march=bpf -filetype=obj
           -o bpf.o

       The compiler invocation might still simplify in future, so for now,
       it's quite handy to alias this construct in one way or another, for
       example:

           __bcc() {
                   clang -O2 -emit-llvm -c $1 -o - | \
                   llc -march=bpf -filetype=obj -o "`basename $1 .c`.o"
           }

           alias bcc=__bcc

       A minimal, stand-alone unit, which matches on all traffic with the
       default classid (return code of -1) looks like:

           #include <linux/bpf.h>

           #ifndef __section
           # define __section(x)  __attribute__((section(x), used))
           #endif

           __section("classifier") int cls_main(struct __sk_buff *skb)
           {
                   return -1;
           }

           char __license[] __section("license") = "GPL";

       More examples can be found further below in subsection eBPF
       PROGRAMMING as focus here will be on tooling.

       There can be various other sections, for example, also for actions.
       Thus, an object file in eBPF can contain multiple entrance points.
       Always a specific entrance point, however, must be specified when
       configuring with tc. A license must be part of the restricted C code
       and the license string syntax is the same as with Linux kernel
       modules.  The kernel reserves its right that some eBPF helper
       functions can be restricted to GPL compatible licenses only, and thus
       may reject a program from loading into the kernel when such a license
       mismatch occurs.

       The resulting object file from the compilation can be inspected with
       the usual set of tools that also operate on normal object files, for
       example objdump(1) for inspecting ELF section headers:

           objdump -h bpf.o
           [...]
           3 classifier    000007f8  0000000000000000  0000000000000000  00000040  2**3
                           CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
           4 action-mark   00000088  0000000000000000  0000000000000000  00000838  2**3
                           CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
           5 action-rand   00000098  0000000000000000  0000000000000000  000008c0  2**3
                           CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
           6 maps          00000030  0000000000000000  0000000000000000  00000958  2**2
                           CONTENTS, ALLOC, LOAD, DATA
           7 license       00000004  0000000000000000  0000000000000000  00000988  2**0
                           CONTENTS, ALLOC, LOAD, DATA
           [...]

       Adding an eBPF classifier from an object file that contains a
       classifier in the default ELF section is trivial (note that instead
       of "object-file" also shortcuts such as "obj" can be used):

           bcc bpf.c
           tc filter add dev em1 parent 1: bpf obj bpf.o flowid 1:1

       In case the classifier resides in ELF section "mycls", then that same
       command needs to be invoked as:

           tc filter add dev em1 parent 1: bpf obj bpf.o sec mycls flowid
           1:1

       Dumping the classifier configuration will tell the location of the
       classifier, in other words that it's from object file "bpf.o" under
       section "mycls":

           tc filter show dev em1
           filter parent 1: protocol all pref 49152 bpf
           filter parent 1: protocol all pref 49152 bpf handle 0x1 flowid
           1:1 bpf.o:[mycls]

       The same program can also be installed on ingress qdisc side as
       opposed to egress ...

           tc qdisc add dev em1 handle ffff: ingress
           tc filter add dev em1 parent ffff: bpf obj bpf.o sec mycls flowid
           ffff:1

       ... and again dumped from there:

           tc filter show dev em1 parent ffff:
           filter protocol all pref 49152 bpf
           filter protocol all pref 49152 bpf handle 0x1 flowid ffff:1
           bpf.o:[mycls]

       Attaching a classifier and action on ingress has the restriction that
       it doesn't have an actual underlying queueing discipline. What
       ingress can do is to classify, mangle, redirect or drop packets. When
       queueing is required on ingress side, then ingress must redirect
       packets to the ifb device, otherwise policing can be used. Moreover,
       ingress can be used to have an early drop point of unwanted packets
       before they hit upper layers of the networking stack, perform network
       accounting with eBPF maps that could be shared with egress, or have
       an early mangle and/or redirection point to different networking
       devices.

       Multiple eBPF actions and classifier can be placed into a single
       object file within various sections. In that case, non-default
       section names must be provided, which is the case for both actions in
       this example:

           tc filter add dev em1 parent 1: bpf obj bpf.o flowid 1:1 \
                                    action bpf obj bpf.o sec action-mark \
                                    action bpf obj bpf.o sec action-rand ok

       The advantage of this is that the classifier and the two actions can
       then share eBPF maps with each other, if implemented in the programs.

       In order to access eBPF maps from user space beyond tc(8) setup
       lifetime, the ownership can be transferred to an eBPF agent via Unix
       domain sockets. There are two possibilities for implementing this:

       1) implementation of an own eBPF agent that takes care of setting up
       the Unix domain socket and implementing the protocol that tc(8)
       dictates. A code example of this can be found inside the iproute2
       source package under: examples/bpf/

       2) use tc exec for transferring the eBPF map file descriptors through
       a Unix domain socket, and spawning an application such as sh(1) .
       This approach's advantage is that tc will place the file descriptors
       into the environment and thus make them available just like stdin,
       stdout, stderr file descriptors, meaning, in case user applications
       run from within this fd-owner shell, they can terminate and restart
       without losing eBPF maps file descriptors. Example invocation with
       the previous classifier and action mixture:

           tc exec bpf imp /tmp/bpf
           tc filter add dev em1 parent 1: bpf obj bpf.o exp /tmp/bpf flowid
           1:1 \
                                    action bpf obj bpf.o sec action-mark \
                                    action bpf obj bpf.o sec action-rand ok

       Assuming that eBPF maps are shared with classifier and actions, it's
       enough to export them once, for example, from within the classifier
       or action command. tc will setup all eBPF map file descriptors at the
       time when the object file is first parsed.

       When a shell has been spawned, the environment will have a couple of
       eBPF related variables. BPF_NUM_MAPS provides the total number of
       maps that have been transferred over the Unix domain socket.
       BPF_MAP<X>'s value is the file descriptor number that can be accessed
       in eBPF agent applications, in other words, it can directly be used
       as the file descriptor value for the bpf(2) system call to retrieve
       or alter eBPF map values. <X> denotes the identifier of the eBPF map.
       It corresponds to the id member of struct bpf_elf_map  from the tc
       eBPF map specification.

       The environment in this example looks as follows:

           sh# env | grep BPF
               BPF_NUM_MAPS=3
               BPF_MAP1=6
               BPF_MAP0=5
               BPF_MAP2=7
           sh# ls -la /proc/self/fd
               [...]
               lrwx------. 1 root root 64 Apr 14 16:46 5 -> anon_inode:bpf-map
               lrwx------. 1 root root 64 Apr 14 16:46 6 -> anon_inode:bpf-map
               lrwx------. 1 root root 64 Apr 14 16:46 7 -> anon_inode:bpf-map
           sh# my_bpf_agent

       eBPF agents are very useful in that they can prepopulate eBPF maps
       from user space, monitor statistics via maps and based on that
       feedback, for example, rewrite classids in eBPF map values during
       runtime. Given that eBPF agents are implemented as normal
       applications, they can also dynamically receive traffic control
       policies from external controllers and thus push them down into eBPF
       maps to dynamically adapt to network conditions. Moreover, eBPF maps
       can also be shared with other eBPF program types (e.g. tracing), thus
       very powerful combination can therefore be implemented.

   eBPF PROGRAMMING
       eBPF classifier and actions are being implemented in restricted C
       syntax (in future, there could additionally be new language frontends
       supported).

       The header file linux/bpf.h provides eBPF helper functions that can
       be called from an eBPF program.  This man page will only provide two
       minimal, stand-alone examples, have a look at examples/bpf from the
       iproute2 source package for a fully fledged flow dissector example to
       better demonstrate some of the possibilities with eBPF.

       Supported 32 bit classifier return codes from the C program and their
       meanings:
           0 , denotes a mismatch
           -1 , denotes the default classid configured from the command line
           else , everything else will override the default classid to
           provide a facility for non-linear matching

       Supported 32 bit action return codes from the C program and their
       meanings ( linux/pkt_cls.h ):
           TC_ACT_OK (0) , will terminate the packet processing pipeline and
           allows the packet to proceed
           TC_ACT_SHOT (2) , will terminate the packet processing pipeline
           and drops the packet
           TC_ACT_UNSPEC (-1) , will use the default action configured from
           tc (similarly as returning -1 from a classifier)
           TC_ACT_PIPE (3) , will iterate to the next action, if available
           TC_ACT_RECLASSIFY (1) , will terminate the packet processing
           pipeline and start classification from the beginning
           else , everything else is an unspecified return code

       Both classifier and action return codes are supported in eBPF and
       cBPF programs.

       To demonstrate restricted C syntax, a minimal toy classifier example
       is provided, which assumes that egress packets, for instance
       originating from a container, have previously been marked in interval
       [0, 255]. The program keeps statistics on different marks for user
       space and maps the classid to the root qdisc with the marking itself
       as the minor handle:

           #include <stdint.h>
           #include <asm/types.h>

           #include <linux/bpf.h>
           #include <linux/pkt_sched.h>

           #include "helpers.h"

           struct tuple {
                   long packets;
                   long bytes;
           };

           #define BPF_MAP_ID_STATS        1 /* agent's map identifier */
           #define BPF_MAX_MARK            256

           struct bpf_elf_map __section("maps") map_stats = {
                   .type           =       BPF_MAP_TYPE_ARRAY,
                   .id             =       BPF_MAP_ID_STATS,
                   .size_key       =       sizeof(uint32_t),
                   .size_value     =       sizeof(struct tuple),
                   .max_elem       =       BPF_MAX_MARK,
           };

           static inline void cls_update_stats(const struct __sk_buff *skb,
                                               uint32_t mark)
           {
                   struct tuple *tu;

                   tu = bpf_map_lookup_elem(&map_stats, &mark);
                   if (likely(tu)) {
                           __sync_fetch_and_add(&tu->packets, 1);
                           __sync_fetch_and_add(&tu->bytes, skb->len);
                   }
           }

           __section("cls") int cls_main(struct __sk_buff *skb)
           {
                   uint32_t mark = skb->mark;

                   if (unlikely(mark >= BPF_MAX_MARK))
                           return 0;

                   cls_update_stats(skb, mark);

                   return TC_H_MAKE(TC_H_ROOT, mark);
           }

           char __license[] __section("license") = "GPL";

       Another small example is a port redirector which demuxes destination
       port 80 into the interval [8080, 8087] steered by RSS, that can then
       be attached to ingress qdisc. The exercise of adding the egress
       counterpart and IPv6 support is left to the reader:

           #include <asm/types.h>
           #include <asm/byteorder.h>

           #include <linux/bpf.h>
           #include <linux/filter.h>
           #include <linux/in.h>
           #include <linux/if_ether.h>
           #include <linux/ip.h>
           #include <linux/tcp.h>

           #include "helpers.h"

           static inline void set_tcp_dport(struct __sk_buff *skb, int nh_off,
                                            __u16 old_port, __u16 new_port)
           {
                   bpf_l4_csum_replace(skb, nh_off + offsetof(struct tcphdr, check),
                                       old_port, new_port, sizeof(new_port));
                   bpf_skb_store_bytes(skb, nh_off + offsetof(struct tcphdr, dest),
                                       &new_port, sizeof(new_port), 0);
           }

           static inline int lb_do_ipv4(struct __sk_buff *skb, int nh_off)
           {
                   __u16 dport, dport_new = 8080, off;
                   __u8 ip_proto, ip_vl;

                   ip_proto = load_byte(skb, nh_off +
                                        offsetof(struct iphdr, protocol));
                   if (ip_proto != IPPROTO_TCP)
                           return 0;

                   ip_vl = load_byte(skb, nh_off);
                   if (likely(ip_vl == 0x45))
                           nh_off += sizeof(struct iphdr);
                   else
                           nh_off += (ip_vl & 0xF) << 2;

                   dport = load_half(skb, nh_off + offsetof(struct tcphdr, dest));
                   if (dport != 80)
                           return 0;

                   off = skb->queue_mapping & 7;
                   set_tcp_dport(skb, nh_off - BPF_LL_OFF, __constant_htons(80),
                                 __cpu_to_be16(dport_new + off));
                   return -1;
           }

           __section("lb") int lb_main(struct __sk_buff *skb)
           {
                   int ret = 0, nh_off = BPF_LL_OFF + ETH_HLEN;

                   if (likely(skb->protocol == __constant_htons(ETH_P_IP)))
                           ret = lb_do_ipv4(skb, nh_off);

                   return ret;
           }

           char __license[] __section("license") = "GPL";

       The related helper header file helpers.h in both examples was:

           /* Misc helper macros. */
           #define __section(x) __attribute__((section(x), used))
           #define offsetof(x, y) __builtin_offsetof(x, y)
           #define likely(x) __builtin_expect(!!(x), 1)
           #define unlikely(x) __builtin_expect(!!(x), 0)

           /* Used map structure */
           struct bpf_elf_map {
               __u32 type;
               __u32 size_key;
               __u32 size_value;
               __u32 max_elem;
               __u32 id;
           };

           /* Some used BPF function calls. */
           static int (*bpf_skb_store_bytes)(void *ctx, int off, void *from,
                                             int len, int flags) =
                 (void *) BPF_FUNC_skb_store_bytes;
           static int (*bpf_l4_csum_replace)(void *ctx, int off, int from,
                                             int to, int flags) =
                 (void *) BPF_FUNC_l4_csum_replace;
           static void *(*bpf_map_lookup_elem)(void *map, void *key) =
                 (void *) BPF_FUNC_map_lookup_elem;

           /* Some used BPF intrinsics. */
           unsigned long long load_byte(void *skb, unsigned long long off)
               asm ("llvm.bpf.load.byte");
           unsigned long long load_half(void *skb, unsigned long long off)
               asm ("llvm.bpf.load.half");

       Best practice, we recommend to only have a single eBPF classifier
       loaded in tc and perform all necessary matching and mangling from
       there instead of a list of individual classifier and separate
       actions. Just a single classifier tailored for a given use-case will
       be most efficient to run.

   eBPF DEBUGGING
       Both tc filter and action commands for bpf support an optional
       verbose parameter that can be used to inspect the eBPF verifier log.
       It is dumped by default in case of an error.

       In case the eBPF/cBPF JIT compiler has been enabled, it can also be
       instructed to emit a debug output of the resulting opcode image into
       the kernel log, which can be read via dmesg(1) :

           echo 2 > /proc/sys/net/core/bpf_jit_enable

       The Linux kernel source tree ships additionally under tools/net/ a
       small helper called bpf_jit_disasm that reads out the opcode image
       dump from the kernel log and dumps the resulting disassembly:

           bpf_jit_disasm -o

       Other than that, the Linux kernel also contains an extensive
       eBPF/cBPF test suite module called test_bpf . Upon ...

           modprobe test_bpf

       ... it performs a diversity of test cases and dumps the results into
       the kernel log that can be inspected with dmesg(1) . The results can
       differ depending on whether the JIT compiler is enabled or not. In
       case of failed test cases, the module will fail to load. In such
       cases, we urge you to file a bug report to the related JIT authors,
       Linux kernel and networking mailing lists.

   cBPF
       Although we generally recommend switching to implementing eBPF
       classifier and actions, for the sake of completeness, a few words on
       how to program in cBPF will be lost here.

       Likewise, the bpf_jit_enable switch can be enabled as mentioned
       already. Tooling such as bpf_jit_disasm is also independent whether
       eBPF or cBPF code is being loaded.

       Unlike in eBPF, classifier and action are not implemented in
       restricted C, but rather in a minimal assembler-like language or with
       the help of other tooling.

       The raw interface with tc takes opcodes directly. For example, the
       most minimal classifier matching on every packet resulting in the
       default classid of 1:1 looks like:

           tc filter add dev em1 parent 1: bpf bytecode '1,6 0 0
           4294967295,' flowid 1:1

       The first decimal of the bytecode sequence denotes the number of
       subsequent 4-tuples of cBPF opcodes. As mentioned, such a 4-tuple
       consists of c t f k decimals, where c represents the cBPF opcode, t
       the jump true offset target, f the jump false offset target and k the
       immediate constant/literal. Here, this denotes an unconditional
       return from the program with immediate value of -1.

       Thus, for egress classification, Willem de Bruijn implemented a
       minimal stand-alone helper tool under the GNU General Public License
       version 2 for iptables(8) BPF extension, which abuses the libpcap
       internal classic BPF compiler, his code derived here for usage with
       tc(8) :

           #include <pcap.h>
           #include <stdio.h>

           int main(int argc, char **argv)
           {
                   struct bpf_program prog;
                   struct bpf_insn *ins;
                   int i, ret, dlt = DLT_RAW;

                   if (argc < 2 || argc > 3)
                           return 1;
                   if (argc == 3) {
                           dlt = pcap_datalink_name_to_val(argv[1]);
                           if (dlt == -1)
                                   return 1;
                   }

                   ret = pcap_compile_nopcap(-1, dlt, &prog, argv[argc - 1],
                                             1, PCAP_NETMASK_UNKNOWN);
                   if (ret)
                           return 1;

                   printf("%d,", prog.bf_len);
                   ins = prog.bf_insns;

                   for (i = 0; i < prog.bf_len - 1; ++ins, ++i)
                           printf("%u %u %u %u,", ins->code,
                                  ins->jt, ins->jf, ins->k);
                   printf("%u %u %u %u",
                          ins->code, ins->jt, ins->jf, ins->k);

                   pcap_freecode(&prog);
                   return 0;
           }

       Given this small helper, any tcpdump(8) filter expression can be
       abused as a classifier where a match will result in the default
       classid:

           bpftool EN10MB 'tcp[tcpflags] & tcp-syn != 0' > /var/bpf/tcp-syn
           tc filter add dev em1 parent 1: bpf bytecode-file /var/bpf/tcp-
           syn flowid 1:1

       Basically, such a minimal generator is equivalent to:

           tcpdump -iem1 -ddd 'tcp[tcpflags] & tcp-syn != 0' | tr '\n' ',' >
           /var/bpf/tcp-syn

       Since libpcap does not support all Linux' specific cBPF extensions in
       its compiler, the Linux kernel also ships under tools/net/ a minimal
       BPF assembler called bpf_asm for providing full control. For detailed
       syntax and semantics on implementing such programs by hand, see
       references under FURTHER READING .

       Trivial toy example in bpf_asm for classifying IPv4/TCP packets,
       saved in a text file called foobar :

           ldh [12]
           jne #0x800, drop
           ldb [23]
           jneq #6, drop
           ret #-1
           drop: ret #0

       Similarly, such a classifier can be loaded as:

           bpf_asm foobar > /var/bpf/tcp-syn
           tc filter add dev em1 parent 1: bpf bytecode-file /var/bpf/tcp-
           syn flowid 1:1

       For BPF classifiers, the Linux kernel provides additionally under
       tools/net/ a small BPF debugger called bpf_dbg , which can be used to
       test a classifier against pcap files, single-step or add various
       breakpoints into the classifier program and dump register contents
       during runtime.

       Implementing an action in classic BPF is rather limited in the sense
       that packet mangling is not supported. Therefore, it's generally
       recommended to make the switch to eBPF, whenever possible.

FURTHER READING         top

       Further and more technical details about the BPF architecture can be
       found in the Linux kernel source tree under
       Documentation/networking/filter.txt .

       Further details on eBPF tc(8) examples can be found in the iproute2
       source tree under examples/bpf/ .

SEE ALSO         top

       tc(8), tc-ematch(8) bpf(2) bpf(4)

AUTHORS         top

       Manpage written by Daniel Borkmann.

       Please report corrections or improvements to the Linux kernel
       networking mailing list: <netdev@vger.kernel.org>

COLOPHON         top

       This page is part of the iproute2 (utilities for controlling TCP/IP
       networking and traffic) project.  Information about the project can
       be found at 
       ⟨http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2⟩.
       If you have a bug report for this manual page, send it to
       netdev@vger.kernel.org, shemminger@osdl.org.  This page was obtained
       from the project's upstream Git repository 
       ⟨git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git⟩
       on 2017-07-05.  If you discover any rendering problems in this HTML
       version of the page, or you believe there is a better or more up-to-
       date source for the page, or you have corrections or improvements to
       the information in this COLOPHON (which is not part of the original
       manual page), send a mail to man-pages@man7.org

iproute2                         18 May 20B1P5F classifier and actions in tc(8)

Pages that refer to this page: bpf(2)tc(8)