Tetragon Archives - Linuxcent

Detection Engineering with eBPF: Kernel-Level Visibility for Cloud Incidents

July 6, 2026 by Vamshi Krishna Santhapuri

Reading Time: 13 minutes


TL;DR

Detection engineering with eBPF addresses OWASP A09 directly: most process-level attack techniques leave no trace in CloudTrail, VPC Flow Logs, or syslog — eBPF hooks in the kernel observe them before the attacker has any ability to suppress the record
CloudTrail is API-plane only; VPC Flow Logs are network-plane only, take roughly 10–15 minutes to become queryable (aggregation interval plus delivery), and carry no process context; syslog captures only what userspace processes voluntarily emit. All three miss the OS-level attack surface entirely
eBPF attaches to kernel syscall tracepoints and kprobes to capture connect(), execve(), mount(), setuid(), and open() with full context: PID, process name, container cgroup, parent process, timestamp — in real time
Falco and Tetragon are the production-grade always-on options; bpftrace is the ad-hoc investigation tool — use each for what it is designed for
Tetragon’s TracingPolicy can kill a process at the moment of the violating syscall, before the attack completes — this is enforcement, not just alerting
Every attack in EP07 through EP10 has a detectable kernel-level signal; this episode maps each one to a concrete eBPF detection rule

OWASP Mapping: A09 Security Logging and Monitoring Failures — the structural gap this series has referenced from EP04 onward: attacks that succeed not because defenses are absent, but because the telemetry layer cannot see the OS surface where the attacks execute.

The Big Picture

┌─────────────────────────────────────────────────────────────────────────┐
│                  DETECTION ENGINEERING WITH eBPF                        │
│                                                                         │
│   KERNEL SPACE                          USERSPACE                       │
│                                                                         │
│   syscall/kprobe hooks                                                  │
│   ┌──────────────────┐                                                  │
│   │ connect()        │──▶ ring buffer ──▶ Tetragon ──▶ Hubble/SIEM     │
│   │ execve()         │                                                  │
│   │ mount()          │──▶ ring buffer ──▶ Falco   ──▶ Slack/PagerDuty │
│   │ setuid()         │                                                  │
│   │ open()           │──▶ perf buffer ──▶ bpftrace ──▶ stdout/log     │
│   └──────────────────┘                                                  │
│          │                                                              │
│          │  Context captured at hook:                                   │
│          │  PID · comm · cgroup (container ID) · args · timestamp      │
│          │  parent PID · network namespace · mount namespace           │
│                                                                         │
│   ═══════════════════════════════════════════════════════════           │
│   WHAT OTHER TOOLS SEE                                                  │
│   CloudTrail:     API calls only — nothing below the AWS SDK            │
│   VPC Flow Logs:  src/dst IP+port only — 15-min delay, no PID          │
│   Syslog:         What the process chose to log — attacker controls it  │
│   eBPF:           Every syscall — attacker cannot suppress it          │
│                   without kernel access                                 │
└─────────────────────────────────────────────────────────────────────────┘

Detection engineering with eBPF closes the observability gap that every previous episode in this series exploited. The SSRF in EP07 made an outbound connection to 169.254.169.254, the EC2 metadata endpoint, from a web application process. VPC Flow Logs never record it: traffic to and from the instance metadata address is excluded from flow logs. CloudTrail shows nothing. eBPF shows the connect() syscall with the PID, the process name, the container cgroup ID, and the timestamp, as it occurs.

The Problem: Your SIEM Has a 15-Minute Hole

During a cloud incident response engagement, the question came up in the first hour: did this process make any outbound connections in the last 30 minutes?

Four telemetry sources, four answers:

CloudTrail: Not applicable. CloudTrail records AWS API calls. A process inside an EC2 instance making a raw TCP connection to an external IP — or to the metadata endpoint — is OS-level activity. CloudTrail has no record of it.

VPC Flow Logs: Maybe, eventually. Flow Logs aggregate over a 1-minute or 10-minute interval (configurable), then land in S3 or CloudWatch Logs with additional delivery delay. In practice, you’re looking at 10–15 minutes before the data is queryable. The flow record describes the flow and nothing about its origin: source IP, destination IP, source port, destination port, protocol, bytes, packets, plus interface, action, and timing fields. There is no PID. There is no process name. There is no indication of which container inside the EC2 instance made the connection. If ten pods are running on the same node, VPC Flow Logs tell you the node talked to an external IP. You don’t know which pod.
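To make the missing process context concrete, here is a minimal Python sketch that parses one flow log record, assuming the default (version 2) field order. The sample record, its account and interface IDs, and the helper name parse_flow_record are illustrative assumptions, not data from the engagement.

```python
# Field order of the default (version 2) VPC Flow Logs record format.
DEFAULT_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_record(line: str) -> dict:
    """Split a space-separated default-format record into named fields."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError(
            f"expected {len(DEFAULT_FIELDS)} fields, got {len(values)}"
        )
    return dict(zip(DEFAULT_FIELDS, values))


# Illustrative record with made-up account and interface IDs.
SAMPLE = (
    "2 123456789010 eni-0123456789abcdef0 10.0.1.15 203.0.113.9 "
    "44321 443 6 12 3040 1700000000 1700000060 ACCEPT OK"
)

record = parse_flow_record(SAMPLE)
window_seconds = int(record["end"]) - int(record["start"])

print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"])
print("aggregation window:", window_seconds, "seconds")
# No field identifies the process or container that opened the connection.
print("process fields present:",
      any(key in record for key in ("pid", "comm", "container_id")))
```

Whatever you do with the parsed record downstream, the attribution question (which process, which container) has to be answered from another source.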

Syslog: Nothing logged. The process — a compromised web application exploited via SSRF — didn’t log the connection. It wouldn’t. Application code doesn’t emit syslog entries for every outbound connection it makes. And an attacker controlling the process would not add logging.

eBPF hook on connect(): Every TCP connection attempt, captured at the syscall tracepoint or the tcp_connect kprobe, with PID, process name, container cgroup ID, destination IP, destination port, source IP, and timestamp, streamed in real time with no aggregation delay.

That is the gap. Everything in EP04 through EP10 of this series lived in it.

The OWASP A09 framing is exactly right: these are not failures of detection rules, they are failures of the telemetry layer. You cannot write a SIEM rule for data that is never collected. eBPF collects the data that the other layers structurally cannot.

What eBPF Detects That Other Tools Miss

Technique	CloudTrail	VPC Flow Logs	Syslog	eBPF
Process spawn inside container	No	No	Maybe (if auditd configured)	Yes — execve(): PID, command, args, parent PID, container cgroup
Outbound TCP connection	No	IP+port, 15-min delay, no PID	No	connect(): IP+port+PID+comm+container, real-time
File write to /etc/passwd	No	No	No	openat()+write(): exact path, PID, comm, container
Privilege escalation (setuid/setgid)	No	No	Maybe (auditd)	Yes — setuid() syscall args: target UID, calling PID, comm
Container escape attempt via mount	No	No	No	mount(): args, mount namespace ID, calling PID — namespace mismatch detectable
SSRF to 169.254.169.254	No	No (metadata traffic is not logged)	No	connect() from app process to metadata IP: PID, comm, container, real-time
Binary execution with unusual parent	No	No	No	execve(): full parent chain — detects shell spawned from web process
Kubernetes secret file read	No	No	No	openat() on /run/secrets/kubernetes.io/serviceaccount/token
STS credential fetch from Lambda	API event only, no process context	Endpoint IP only	No	connect() to the resolved STS endpoint IP from an unexpected process

The pattern across the table is consistent: CloudTrail covers the AWS control plane. VPC Flow Logs cover the network plane with delay and no process context. Syslog covers what processes choose to emit. eBPF covers the syscall surface — the layer where every one of these events must pass, regardless of what the attacker wants.

For operators not writing eBPF: This table tells you what your current SIEM can and cannot see. If your threat model includes container escapes, SSRF-to-metadata attacks, or post-compromise lateral movement through process execution, the detection signal for those techniques does not exist in your CloudTrail or your flow logs. It exists only at the kernel level.

Detection Rule 1: Unexpected Outbound from an Application Container

The SSRF attack in EP07 — and the lateral movement in EP10 — both required an outbound TCP connection from a process that had no legitimate reason to make one. This is the detection.

Ad-hoc investigation with bpftrace

When you’re on a node right now and need to know what’s connecting outbound:

# Shows PID, process name, and destination IP in real time
# Run on the node (requires root or CAP_BPF)
bpftrace -e '
#include <linux/socket.h>
#include <linux/in.h>

tracepoint:syscalls:sys_enter_connect {
  $sa = (struct sockaddr_in *)args->uservaddr;
  if ($sa->sin_family == AF_INET) {
    printf("connect: pid=%-6d comm=%-20s dst=%s:%d\n",
           pid,
           comm,
           ntop($sa->sin_addr.s_addr),
           (uint16)bswap($sa->sin_port));
  }
}
'

Sample output — what you’d see during an SSRF exploit targeting the EC2 metadata service:

connect: pid=18422  comm=python3              dst=169.254.169.254:80
connect: pid=18422  comm=python3              dst=169.254.169.254:80
connect: pid=18432  comm=curl                 dst=169.254.169.254:80

The python3 process, your web application, is connecting to 169.254.169.254, the metadata endpoint. Unless the application deliberately uses the instance role through the AWS SDK, that’s not a legitimate application dependency. That’s the SSRF signal.
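When the trace runs for more than a few seconds, the metadata connections are easier to spot with a small filter than by eye. The following Python sketch assumes the exact printf format used in the one-liner above; the names parse_connects and imds_hits and the allowed_comms parameter are hypothetical helpers introduced for illustration.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

IMDS_ADDR = "169.254.169.254"

# Matches lines such as:
# connect: pid=18422  comm=python3              dst=169.254.169.254:80
LINE_RE = re.compile(
    r"connect:\s+pid=(?P<pid>\d+)\s+comm=(?P<comm>\S+)\s+"
    r"dst=(?P<ip>[\d.]+):(?P<port>\d+)"
)


class Connect(NamedTuple):
    pid: int
    comm: str
    ip: str
    port: int


def parse_connects(lines: Iterable[str]) -> Iterator[Connect]:
    """Yield one Connect per line that matches the bpftrace output format."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield Connect(int(m["pid"]), m["comm"], m["ip"], int(m["port"]))


def imds_hits(lines: Iterable[str],
              allowed_comms: frozenset = frozenset()) -> List[Connect]:
    """Return connections to the metadata address from non-allowed processes."""
    return [
        c for c in parse_connects(lines)
        if c.ip == IMDS_ADDR and c.comm not in allowed_comms
    ]


if __name__ == "__main__":
    import sys
    # Usage sketch: bpftrace -e '...' | python3 imds_filter.py
    for hit in imds_hits(sys.stdin):
        print(f"IMDS access: pid={hit.pid} comm={hit.comm} port={hit.port}")
```

The allowed_comms set is where a process that legitimately fetches instance-role credentials would be listed, so that only unexpected callers are reported.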

The bpftrace post in this series (“kernel answers in one line”) goes deep on the tracepoint/kprobe model and how to filter by cgroup for container-specific traces. The one-liner above is the starting point; that post covers building targeted investigation scripts.

Production-grade enforcement with Tetragon

bpftrace is for investigation. Tetragon is for always-on detection — and optionally, prevention.

# TracingPolicy: alert on outbound connections from non-host network namespaces
# (any container making outbound TCP connections)
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "detect-outbound-connections"
spec:
  kprobes:
  - call: "tcp_connect"
    syscall: false
    args:
    - index: 0
      type: "sock"
    selectors:
    - matchNamespaces:
      - namespace: Net
        operator: NotIn
        values:
        - "host_ns"    # Tetragon's keyword for the host namespace
      matchActions:
      - action: Post   # Generate an alert event; change to Sigkill to prevent

To detect specifically the SSRF-to-metadata pattern — connections to 169.254.169.254:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "detect-imds-access"
spec:
  kprobes:
  - call: "tcp_connect"
    syscall: false
    args:
    - index: 0
      type: "sock"
    selectors:
    - matchArgs:
      - index: 0
        operator: "DAddr"        # destination-address match for sock arguments
        values:
        - "169.254.169.254/32"
      matchActions:
      - action: Post
        rateLimit: "1m"          # rate-limit window expressed as a duration

Tetragon emits process_kprobe JSON events that include the pod name, namespace, container ID, binary path, parent binary, and the hooked function’s arguments. Export them (JSON log file or gRPC stream) to your SIEM or to the flow-log tooling you already run alongside Hubble.
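Before wiring the events into a SIEM, it can help to flatten each one into the handful of fields an analyst triages on. The Python sketch below is written under assumptions: the key names (process_kprobe, process.pod, parent.binary, args[].sock_arg, policy_name) follow the event shape described above, but verify them against real output from your Tetragon version. The sample event is hand-written for illustration, not captured output.

```python
import json
from typing import Optional


def summarize_kprobe(raw_line: str) -> Optional[dict]:
    """Flatten one JSON event line; return None for non-kprobe events."""
    event = json.loads(raw_line)
    kprobe = event.get("process_kprobe")
    if not kprobe:
        return None
    process = kprobe.get("process", {})
    pod = process.get("pod", {})
    parent = kprobe.get("parent", {})

    # Take the first socket argument, if any, as the connection destination.
    destination = None
    for arg in kprobe.get("args", []):
        sock = arg.get("sock_arg")
        if sock:
            destination = f"{sock.get('daddr')}:{sock.get('dport')}"
            break

    return {
        "policy": kprobe.get("policy_name"),
        "function": kprobe.get("function_name"),
        "binary": process.get("binary"),
        "parent_binary": parent.get("binary"),
        "pod": pod.get("name"),
        "namespace": pod.get("namespace"),
        "destination": destination,
    }


# Hand-written sample event (assumed field names, hypothetical pod).
SAMPLE_EVENT = json.dumps({
    "process_kprobe": {
        "process": {
            "binary": "/usr/bin/python3",
            "pod": {"namespace": "shop", "name": "web-7d9f"},
        },
        "parent": {"binary": "/usr/bin/gunicorn"},
        "function_name": "tcp_connect",
        "policy_name": "detect-imds-access",
        "args": [{"sock_arg": {"daddr": "169.254.169.254", "dport": 80}}],
    }
})

summary = summarize_kprobe(SAMPLE_EVENT)
print(summary)
```

Using .get() throughout means a missing or renamed field yields None in the summary instead of an exception, which is the safer failure mode while the schema is still being confirmed.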

Detection Rule 2: Process Execution Inside a Container

A shell spawning inside a container that has no business running a shell is a post-compromise indicator. It covers the container escape setup from EP08, the supply chain implant from EP09, and any hands-on-keyboard phase after initial access.

Falco rule: shell spawned from application container

# Falco rule: detect any shell spawned in a container
# Add to /etc/falco/rules.d/purple-team.yaml
- list: shell_binaries
  items: [bash, sh, zsh, ksh, fish, tcsh, csh, dash]

- list: allowed_shell_images
  items: [
    "debug-tools",     # Your approved debug container image names
    "toolbox"
  ]

- rule: Shell Spawned in Container
  desc: >
    A shell was spawned inside a container. In application containers (web servers,
    APIs, data processors) this is almost always a post-compromise indicator.
  condition: >
    evt.type = execve and
    evt.dir = < and
    container and
    container.image.repository != "" and
    proc.name in (shell_binaries) and
    not proc.pname in (shell_binaries) and
    not container.image.repository in (allowed_shell_images) and
    not k8s.ns.name in (kube-system, kube-public)
  output: >
    Shell spawned in container
    (user=%user.name
     container=%container.name
     image=%container.image.repository
     cmd=%proc.cmdline
     parent=%proc.pname
     pod=%k8s.pod.name
     ns=%k8s.ns.name)
  priority: WARNING
  tags: [purple-team, post-compromise, container]

The proc.pname condition is the key signal: a shell spawned by a web server process (nginx, node, gunicorn, java) is a different threat than a shell spawned by another shell in a debug context. The rule above skips the second case through the not proc.pname in (shell_binaries) clause, and skips approved debug images through the allowed_shell_images exclusion; it flags the first.
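The boolean logic of the rule is easy to misread in YAML, so here is a small Python model of the same condition that can be exercised against example inputs. It is a sketch only: it mirrors the lists from the rule above, treats a non-empty image repository as the container check, and the function name shell_rule_fires is hypothetical. Falco evaluates the real condition in its own engine.

```python
from typing import Optional

SHELL_BINARIES = {"bash", "sh", "zsh", "ksh", "fish", "tcsh", "csh", "dash"}
ALLOWED_SHELL_IMAGES = {"debug-tools", "toolbox"}
EXCLUDED_NAMESPACES = {"kube-system", "kube-public"}


def shell_rule_fires(proc_name: str,
                     parent_name: str,
                     image_repo: str,
                     k8s_namespace: Optional[str] = None) -> bool:
    """Model of the 'Shell Spawned in Container' condition."""
    return (
        bool(image_repo)                              # running in a container image
        and proc_name in SHELL_BINARIES               # the new process is a shell
        and parent_name not in SHELL_BINARIES         # and its parent is not a shell
        and image_repo not in ALLOWED_SHELL_IMAGES    # not an approved debug image
        and k8s_namespace not in EXCLUDED_NAMESPACES  # not a system namespace
    )


# Web process spawning a shell: fires.
print(shell_rule_fires("sh", "node", "shop/web", "shop"))
# Shell spawned from another shell: suppressed by the parent check.
print(shell_rule_fires("bash", "bash", "shop/web", "shop"))
# Shell in an approved debug image: suppressed by the image allowlist.
print(shell_rule_fires("sh", "containerd-shim", "debug-tools", "shop"))
```

Running a few expected-positive and expected-negative cases through a model like this before loading the rule is a cheap way to catch an inverted not or a misplaced list.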

Detecting the supply chain implant pattern

EP09 covered supply chain attacks where a build artifact executes unexpected binaries at runtime. The bpftrace version for ad-hoc investigation of what a specific container is executing:

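# bpftrace: trace execve() calls on the node (all processes unless filtered)
# To scope the trace to one container, first find its cgroup path:
# systemd-cgls | grep <pod-name>
# Or: cat /sys/fs/cgroup/unified/<cgroup-path>/cgroup.procs
# Then add a predicate to the probe, for example:
#   /cgroup == cgroupid("/sys/fs/cgroup/unified/<cgroup-path>")/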

bpftrace -e '
tracepoint:syscalls:sys_enter_execve {
  printf("execve: pid=%-6d ppid=%-6d comm=%-20s file=%s\n",
         pid,
         curtask->real_parent->tgid,
         comm,
         str(args->filename));
}
' 2>/dev/null | grep -v "^\[" | head -50

Sample output during a supply chain compromise scenario — unexpected binary execution from a package manager implant:

execve: pid=31204  ppid=31190  comm=node                 file=/bin/sh
execve: pid=31205  ppid=31204  comm=sh                   file=/tmp/.x/beacon
execve: pid=31206  ppid=31205  comm=beacon               file=/usr/bin/curl

The chain node → sh → /tmp/.x/beacon → curl — application process spawning a shell, which executes an unknown binary from /tmp, which runs curl — is the supply chain implant execution pattern. None of this appears in CloudTrail.
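The same reading of the chain can be automated for longer traces. This Python sketch assumes the printf format of the execve one-liner above and applies two simple heuristics: a shell executed by an application process, and any binary executed from a world-writable directory. The APP_COMMS set, the directory list, and the name classify_exec are illustrative assumptions to tune per workload.

```python
import posixpath
import re
from typing import List

# Matches lines such as:
# execve: pid=31204  ppid=31190  comm=node                 file=/bin/sh
EXEC_RE = re.compile(
    r"execve:\s+pid=(?P<pid>\d+)\s+ppid=(?P<ppid>\d+)\s+"
    r"comm=(?P<comm>\S+)\s+file=(?P<file>\S+)"
)

WRITABLE_DIRS = ("/tmp/", "/var/tmp/", "/dev/shm/")
SHELLS = {"sh", "bash", "dash", "zsh"}
# Assumption: process names of your application runtimes.
APP_COMMS = {"node", "python3", "java", "gunicorn", "nginx"}


def classify_exec(line: str) -> List[str]:
    """Return the list of heuristics a single execve line trips."""
    m = EXEC_RE.search(line)
    if not m:
        return []
    reasons = []
    # comm is the caller's name before the exec; file is what it executes.
    if m["comm"] in APP_COMMS and posixpath.basename(m["file"]) in SHELLS:
        reasons.append("shell-from-app-process")
    if m["file"].startswith(WRITABLE_DIRS):
        reasons.append("exec-from-writable-dir")
    return reasons


SAMPLE = [
    "execve: pid=31204  ppid=31190  comm=node                 file=/bin/sh",
    "execve: pid=31205  ppid=31204  comm=sh                   file=/tmp/.x/beacon",
    "execve: pid=31206  ppid=31205  comm=beacon               file=/usr/bin/curl",
]

for entry in SAMPLE:
    print(classify_exec(entry), entry)
```

The third line (beacon running curl) trips neither heuristic on its own; it only becomes suspicious through its parent, which is why the full chain matters more than any single event.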

Detection Rule 3: Privilege Escalation — setuid(0) and Capability Abuse

A process calling setuid(0) to elevate to root, or gaining new capabilities (through capset() or through a binary that was given file capabilities with setcap), is a privilege escalation indicator. The EP08 container escape path used a setuid binary to gain root inside the container as the first step toward escaping the namespace.

bpftrace: catch setuid(0) calls in real time

# bpftrace: alert on any process calling setuid(0)
# Any process attempting to switch to UID 0
bpftrace -e '
tracepoint:syscalls:sys_enter_setuid {
  if (args->uid == 0) {
    printf("ALERT setuid(0): pid=%-6d comm=%-20s ppid=%d pcomm=%s\n",
           pid,
           comm,
           curtask->real_parent->tgid,
           str(curtask->real_parent->comm));
  }
}
tracepoint:syscalls:sys_enter_setresuid {
  if (args->ruid == 0 || args->euid == 0) {
    printf("ALERT setresuid(root): pid=%-6d comm=%-20s\n", pid, comm);
  }
}
'

Falco rule: setuid binary execution inside container

- rule: Setuid Binary Executed in Container
  desc: >
    A setuid binary was executed inside a container. Setuid binaries inside
    containers are a privilege escalation path: they run with the file owner's
    UID (usually root) regardless of the container's user setting, unless
    no_new_privs (allowPrivilegeEscalation: false) is set.
  condition: >
    evt.type = execve and
    evt.dir = < and
    container and
    proc.is_suid_exe = true
  output: >
    Setuid binary executed in container
    (binary=%proc.exepath
     user=%user.name
     container=%container.name
     pod=%k8s.pod.name
     cmd=%proc.cmdline)
  priority: ERROR
  tags: [purple-team, privilege-escalation, container]

Detection Rule 4: Container Escape Attempt via Namespace-Crossing Mount

The privileged container escape path from EP08 requires calling mount() from a container namespace to access the host filesystem. The kernel records the mount namespace of the calling process — an eBPF kprobe on mount() can detect when the caller’s mount namespace differs from the host namespace.

Tetragon policy: kill any mount from a non-host namespace

# This covers the --privileged container escape path documented in EP08
# The mount() call that crosses from container namespace to host filesystem
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "detect-container-mount-escape"
spec:
  kprobes:
  - call: "security_sb_mount"
    syscall: false
    args:
    - index: 0
      type: "string"     # dev_name
    - index: 2
      type: "string"     # filesystem type
    - index: 3
      type: "uint64"     # mount flags (unsigned long, not a string)
    selectors:
    - matchNamespaces:
      - namespace: Mnt
        operator: NotIn
        values:
        - "host_ns"      # Tetragon's keyword for the host namespace
      matchArgs:
      - index: 2
        operator: "NotEqual"
        values:
        - "proc"
        - "sysfs"
        - "tmpfs"        # Common legitimate mounts in containers
      matchActions:
      - action: Sigkill  # rateLimit is a Post-action setting, so it is omitted here

Start with action: Post and tune the exclusions for your environment before switching to Sigkill. See the production gotchas below.

bpftrace: ad-hoc namespace crossing investigation

# bpftrace: trace mount() calls and show the mount namespace of the caller
# Mount namespace ID of the host: read from /proc/1/ns/mnt
HOST_MNT_NS=$(readlink /proc/1/ns/mnt | grep -oP '\d+')
echo "host mnt_ns=${HOST_MNT_NS}"

# Relies on kernel BTF for task_struct / nsproxy / mnt_namespace,
# so no #include lines are needed. The syscall tracepoint is not x86-specific.
bpftrace -e '
tracepoint:syscalls:sys_enter_mount {
  $mnt_ns_id = curtask->nsproxy->mnt_ns->ns.inum;
  printf("mount: pid=%-6d comm=%-20s mnt_ns=%u\n",
         pid, comm, $mnt_ns_id);
}
' 2>/dev/null

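Compare the mnt_ns value in the output against $HOST_MNT_NS. Any mount call with a different mnt_ns value comes from a process in its own mount namespace, which on a Kubernetes node almost always means a container (systemd units with private mounts also appear). A privileged container attempting host filesystem access shows a container namespace ID.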
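The comparison itself is mechanical, so it can be scripted. The Python sketch below assumes the mount: line format printed by the trace above and the mnt:[inode] form that readlink returns for /proc/<pid>/ns/mnt. The helper names and the sample PIDs and namespace IDs are hypothetical.

```python
import os
import re
from typing import Iterable, List, Tuple

NS_LINK_RE = re.compile(r"^mnt:\[(\d+)\]$")
MOUNT_LINE_RE = re.compile(
    r"mount:\s+pid=(?P<pid>\d+)\s+comm=(?P<comm>\S+)\s+mnt_ns=(?P<ns>\d+)"
)


def parse_ns_link(link: str) -> int:
    """Extract the namespace inode from a link target like 'mnt:[4026531841]'."""
    m = NS_LINK_RE.match(link)
    if not m:
        raise ValueError(f"unexpected namespace link: {link!r}")
    return int(m.group(1))


def host_mnt_ns(proc_root: str = "/proc") -> int:
    """Read the mount namespace of PID 1 (normally requires root)."""
    return parse_ns_link(os.readlink(os.path.join(proc_root, "1", "ns", "mnt")))


def non_host_mounts(lines: Iterable[str],
                    host_ns: int) -> List[Tuple[int, str, int]]:
    """Return (pid, comm, mnt_ns) for mount calls outside the host namespace."""
    hits = []
    for line in lines:
        m = MOUNT_LINE_RE.search(line)
        if m and int(m["ns"]) != host_ns:
            hits.append((int(m["pid"]), m["comm"], int(m["ns"])))
    return hits


# Hypothetical trace lines; 4026531841 stands in for the host namespace ID.
SAMPLE = [
    "mount: pid=812    comm=systemd              mnt_ns=4026531841",
    "mount: pid=4021   comm=mount                mnt_ns=4026532760",
]
print(non_host_mounts(SAMPLE, host_ns=4026531841))
```

On a live node, pass host_mnt_ns() as host_ns instead of the hard-coded sample value.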

Building a Detection Pipeline

Ad-hoc bpftrace commands answer questions during an incident. Always-on detection requires a pipeline that runs continuously, routes alerts to a durable destination, and survives pod restarts. The two production-grade options in this stack:

eBPF hooks
    │
    ├── Tetragon (always-on, Kubernetes-native)
    │       └── TracingPolicy CRDs
    │               └── JSON events → Hubble → Grafana
    │                               → SIEM (Splunk/Elastic)
    │                               → PagerDuty
    │
    └── Falco (rule-based, declarative)
            └── /etc/falco/rules.d/*.yaml
                    └── falcosidekick
                            ├── Slack
                            ├── PagerDuty
                            ├── Elasticsearch
                            └── AWS Lambda (custom response)

The TC eBPF pod-level network policy post covers how Cilium and Tetragon share the same underlying kernel attachment points — understanding TC hooks helps explain why Tetragon’s network-level policies fire at the same layer as Cilium’s NetworkPolicy enforcement.

Falco with falcosidekick: complete local testing setup

Use this to validate your Falco rules before deploying to a cluster. It routes Falco alerts to Slack in real time.

# docker-compose.yml: local Falco + falcosidekick testing
# Requires: a kernel with BTF support (5.8+) for the modern eBPF driver
version: "3.8"

services:
  falco:
    image: falcosecurity/falco-no-driver:latest
    privileged: true
    depends_on:
      - falcosidekick
    volumes:
      - /var/run/docker.sock:/host/var/run/docker.sock
      - /dev:/host/dev
      - /proc:/host/proc:ro
      - /boot:/host/boot:ro
      - /lib/modules:/host/lib/modules:ro
      - /usr:/host/usr:ro
      # Only the custom rules are mounted; mounting the host's /etc/falco
      # would hide the falco.yaml that ships in the image.
      - ./rules:/etc/falco/rules.d:ro
    # Falco pushes JSON alerts to falcosidekick over HTTP
    command: >
      /usr/bin/falco
        -o "engine.kind=modern_ebpf"
        -o "json_output=true"
        -o "http_output.enabled=true"
        -o "http_output.url=http://falcosidekick:2801/"

  falcosidekick:
    image: falcosecurity/falcosidekick:latest
    environment:
      SLACK_WEBHOOKURL: "${SLACK_WEBHOOK}"
      SLACK_MINIMUMPRIORITY: "warning"
      # Go template; output field keys contain dots, so look them up with index
      SLACK_MESSAGEFORMAT: >-
        [{{ .Priority }}] {{ .Rule }}
        | pod={{ index .OutputFields "k8s.pod.name" }}
        | ns={{ index .OutputFields "k8s.ns.name" }}
        | cmd={{ index .OutputFields "proc.cmdline" }}
    ports:
      - "2801:2801"

# Start the stack (set SLACK_WEBHOOK first)
export SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
docker compose up -d

# Trigger a test alert: exec into any running container
docker exec -it <any-container> /bin/sh

# Check falcosidekick received it
curl -s http://localhost:2801/metrics | grep falcosidekick_inputs

Deploying Falco to Kubernetes with Helm

# Add Falco Helm repo
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update

# Install Falco with the modern eBPF driver (preferred over the kernel module on Kubernetes)
# --set-file is used for the rules file because multi-line YAML does not survive --set
helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set driver.kind=modern_ebpf \
  --set falcosidekick.enabled=true \
  --set falcosidekick.config.slack.webhookurl="${SLACK_WEBHOOK}" \
  --set falcosidekick.config.slack.minimumpriority=warning \
  --set-file "customRules.purple-team\.yaml=./rules/purple-team.yaml"

# Verify Falco pods are running on all nodes
kubectl get pods -n falco -o wide

# Tail Falco logs for a specific node's pod
kubectl logs -n falco -l app.kubernetes.io/name=falco -f

# Validate a specific rule is loaded
kubectl exec -n falco <falco-pod> -- falco -L 2>/dev/null | grep "Shell Spawned"

What This Means for Each Prior Attack

Every attack in EP07 through EP10 had a detectable kernel-level signal that the standard telemetry stack missed. Here’s the detection mapping:

Episode	Attack	What Standard Telemetry Missed	eBPF Detection Signal
EP07	SSRF to EC2 IMDS	CloudTrail: nothing. VPC Flow Logs: nothing (metadata traffic is not logged)	kprobe: `tcp_connect` to `169.254.169.254` from app process, with PID, comm, container, in real time
EP08	Container escape via privileged mount	CloudTrail: nothing. Syslog: nothing	kprobe: `security_sb_mount()` from non-host mount namespace — namespace ID mismatch fires alert
EP09	Supply chain implant execution	CloudTrail: nothing (OS-level). GuardDuty: maybe if beacon calls AWS APIs	kprobe: `execve()` with anomalous parent chain — web process → shell → unknown binary from `/tmp`
EP10	Lateral movement via cross-account role chaining	CloudTrail: AssumeRole events present but no process context	kprobe: `tcp_connect` to a resolved `sts.amazonaws.com` endpoint IP from an unexpected process (on compute where you can load eBPF; the Lambda runtime itself does not expose the kernel)

The table is not theoretical. It reflects what you would actually observe running these detection rules against the attack simulations in those episodes.

For the SSRF case (EP07): the connection to 169.254.169.254 from the web application process would fire within milliseconds of the exploit. VPC Flow Logs would never show it, because traffic to the instance metadata address is not captured, and any follow-on traffic they do record arrives 10–15 minutes later with no information about which process made it. By then, the attacker has the IAM credentials and may have made subsequent API calls in a different region.

For the container escape (EP08): the mount() from a non-host mount namespace is the earliest detectable signal of the escape attempt. It fires before the attacker has host filesystem access. With action: Sigkill in the Tetragon policy, the process is terminated at this syscall — the escape does not complete.

⚠ Production Gotchas

Use the eBPF driver for Falco in Kubernetes, not the kernel module. The kernel module requires installing a kernel module on every node, which creates a dependency on kernel headers being present and compatible. The modern_ebpf driver (Falco 0.35+) uses BTF and CO-RE — it works on kernels 5.8+ without kernel module installation and survives kernel upgrades. In managed Kubernetes (EKS, GKE, AKS), the kernel module path often doesn’t work at all due to the OS image restrictions.

Test Tetragon’s Sigkill action exhaustively before enabling it in production. The Sigkill action terminates the process at the moment of the violating syscall — before it completes. This is powerful for prevention but catastrophic if your exclusions are wrong. Common false positive sources: debug containers (kubectl debug), init containers that perform legitimate mounts, Kubernetes admission webhooks calling shell scripts. Always deploy with action: Post first, tune for two weeks of normal traffic, then switch to Sigkill only on rules with zero false positives in your environment.

bpftrace is an investigation tool, not a production detector. bpftrace compiles and loads an eBPF program per invocation. It has no persistence, no alerting, and no output routing to your SIEM. It also cannot look backward: for the question from the opening (“did this process make any outbound connections in the last 30 minutes?”), bpftrace can only show what happens from the moment it attaches. For always-on detection, and for any retrospective answer, use Tetragon or Falco. Running bpftrace as a daemon substitute introduces overhead without the management plane that production tools provide.

The shell-in-container rule will fire on kubectl exec sessions. Any time an operator runs kubectl exec -it <pod> -- /bin/bash, the Falco rule above triggers. This is working as intended: kubectl exec is a post-compromise technique as well as an operational tool. Falco’s user.name is the UNIX user inside the container, not the Kubernetes identity that ran kubectl, so scope the exclusion by namespace (or image) and attribute sessions to operators through the Kubernetes audit log:

# Add to the rule condition to exclude namespaces where operator exec sessions are expected
and not k8s.ns.name in (ops-tooling, debug-ns)

High-frequency kprobes on hot paths add measurable overhead. Attaching to tcp_connect fires on every outbound connection from every process on the node. On a node handling hundreds of microservices with high connection rates (service mesh with short-lived connections), this adds CPU overhead. Profile before deploying. Tetragon’s namespace-scoped selectors (matchNamespaces with operator: NotIn and the host namespace) help by skipping host-namespace processes. Filter as narrowly as your threat model allows.

Ring buffer overflow drops events on high-throughput nodes. Both Falco and bpftrace pass events to userspace through kernel buffers. If the userspace consumer (the Falco daemon, the bpftrace process) cannot keep up with the event rate, events are dropped and no detection rule ever sees them. Falco reports drops through its syscall_event_drops settings and its metrics output, so monitor them. Increase the driver buffer size (buf_size_preset under the engine section of falco.yaml in recent releases) if drops occur on high-throughput nodes.
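Whatever metric your Falco version exposes for drops, the alerting logic is the same: compare two samples of the cumulative counters and alert on the ratio. A minimal Python sketch, assuming you can scrape a delivered-events counter and a dropped-events counter; the function name, the tuple layout, and the 1% threshold are illustrative assumptions, not Falco defaults.

```python
from typing import Tuple

# (events delivered to userspace, events dropped), both cumulative counters.
CounterSample = Tuple[int, int]


def drop_ratio(prev: CounterSample, curr: CounterSample) -> float:
    """Fraction of events lost between two samples of cumulative counters."""
    delivered = curr[0] - prev[0]
    dropped = curr[1] - prev[1]
    if delivered < 0 or dropped < 0:
        # Counters went backwards: the agent restarted, so skip this interval.
        raise ValueError("counter reset detected between samples")
    total = delivered + dropped
    return dropped / total if total else 0.0


def should_alert(prev: CounterSample, curr: CounterSample,
                 threshold: float = 0.01) -> bool:
    """True when more than `threshold` of events were dropped in the interval."""
    return drop_ratio(prev, curr) > threshold


earlier = (1_000, 0)
later = (1_900, 100)
print(f"drop ratio: {drop_ratio(earlier, later):.2%}")
print("alert:", should_alert(earlier, later))
```

Alerting on the ratio rather than the raw drop count keeps the signal comparable between a quiet node and a busy one.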

Quick Reference

Use Case	Tool	Hook Type	Detection Latency
Ad-hoc outbound connection investigation	bpftrace	tracepoint:syscalls:sys_enter_connect	Real-time
Always-on container shell detection	Falco	eBPF modern driver / syscall	< 100ms
Container escape prevention	Tetragon + Sigkill	kprobe: security_sb_mount	Blocking (pre-completion)
Privilege escalation detection	Falco / bpftrace	tracepoint:syscalls:sys_enter_setuid	Real-time
Supply chain implant execution	Falco execve rule	eBPF modern driver	< 100ms
SSRF-to-metadata detection	Tetragon kprobe	kprobe: tcp_connect	Real-time
Lateral movement via unexpected STS call	Tetragon kprobe	kprobe: tcp_connect + process filter	Real-time
Audit trail for incident response	Tetragon JSON events	kprobe / tracepoint	Persistent, SIEM-routable

Tool	Best For	Not For
bpftrace	Ad-hoc node investigation during IR	Always-on production detection
Falco	Rule-based behavioral detection	Network-layer enforcement
Tetragon	Always-on detection + optional enforcement	Ad-hoc one-liner investigation

Key Takeaways

Detection engineering with eBPF closes the telemetry gap that CloudTrail, VPC Flow Logs, and syslog cannot close: OS-level process activity is only visible at the kernel syscall layer, and eBPF reads it without kernel module risk and with process and container context attached to every event
Every attack in EP07 through EP10 has a real-time kernel-level signal — SSRF connections, container mount calls, unexpected execve chains, privilege escalation attempts — none of which appear in your current SIEM unless you’ve built this layer
Falco provides declarative, rule-based behavioral detection; Tetragon provides syscall-level enforcement that can terminate an attack before it completes — use both with complementary scopes
bpftrace is the incident response tool for asking the kernel a direct question right now; it is not a monitoring agent and should not be treated as one
The false positive problem is real and must be addressed before enabling enforcement: kubectl exec, debug containers, init containers with legitimate mounts — exclusions must be tuned per environment before moving from action: Post to action: Sigkill

What’s Next

EP11 closed the detection gap. You’ve instrumented the kernel, you’re receiving Falco alerts, Tetragon is firing on namespace-crossing mount attempts. Then the alert fires at 2:47 AM on a Sunday — not a test, not a false positive. Something got in.

EP12 is the playbook for the first 24 hours after a confirmed cloud breach: what to isolate and how without destroying forensic evidence, what to preserve before it rotates out of CloudTrail’s 90-day window, what eBPF data to capture while the node is still live, who to call and in what order, and how to avoid the common mistakes that turn a containable incident into a regulatory event. The response phase — where everything you built in EP04 through EP11 either pays off or reveals what you missed.


LSM and Tetragon — When the Kernel Says No

July 6, 2026 / June 12, 2026 by Vamshi Krishna Santhapuri

Reading Time: 8 minutes

eBPF: From Kernel to Cloud, Episode 12

TL;DR

Tetragon combines Linux Security Module (LSM) hooks with eBPF programs: enforcement happens inside the kernel, before the operation completes, with no detect-and-respond window
(LSM hook = Linux Security Module hook: a callback point built into the kernel that fires before a security-relevant operation completes, allowing the security module to approve or reject it)
Falco and similar userspace detection tools alert after the fact: the syscall returns, the file is written, the connection is established, the alert fires; with LSM, the syscall never returns success
BPF_PROG_TYPE_LSM is the eBPF program type that attaches to LSM hooks. It was introduced in kernel 5.7 and is stable in 5.10+; it is compiled into most current Ubuntu LTS, Fedora, and EKS/GKE node kernels, but it is only active when bpf appears in the kernel’s LSM list
Tetragon attaches eBPF programs to LSM hooks and kprobes simultaneously — observing and enforcing from the same kernel attachment point
Tetragon’s enforcement sends SIGKILL from within the kernel context — not from a userspace agent reading an audit log and then killing the process
Production caution: LSM enforce mode without thorough policy testing in audit mode first will kill legitimate workloads; always audit before enforce

EP11 showed how to observe DNS queries at the kernel level: seeing what a workload resolves before it establishes a connection. But observation is passive. It tells you what happened. Tetragon with BPF LSM changes the question entirely: instead of watching the workload, the kernel refuses the operation. This episode covers how that enforcement layer works and why the difference between “detect” and “prevent” matters in runtime security.

Quick Check: Is Your Cluster Running LSM-Based Enforcement?

# On any cluster node — what security modules are active?
cat /sys/kernel/security/lsm

# Expected output on a modern kernel:
# lockdown,capability,landlock,yama,apparmor,bpf
#                                              ^^^
#                            "bpf" here means BPF LSM is enabled

# Is Tetragon running on this cluster?
kubectl get pods -n kube-system -l app.kubernetes.io/name=tetragon

# If Tetragon is present, check what TracingPolicies are enforcing:
kubectl get tracingpolicies -A

# Sample output:
# NAMESPACE    NAME                      AGE
# kube-system  block-privileged-exec     3d
# kube-system  restrict-sensitive-paths  3d

# See what eBPF programs Tetragon has loaded
bpftool prog list | grep -i tetragon

# Output sample:
# 89: lsm  name tetragon_lsm_bprm  tag 8f2a1c3e4d5b7a9f  gpl
#     loaded_at 2026-04-22T09:13:45+0530  uid 0
#     xlated 3312B  jited 2184B  memlock 8192B
# 91: kprobe  name tetragon_kp_exec tag 3c1d8e2f7a4b5c9d  gpl

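The lsm program type confirms attachment through BPF LSM. If you see lsm-type entries belonging to Tetragon, it is enforcing through LSM hooks on this node. Policies written with kprobes: on security_* functions may appear as kprobe-type programs instead; they instrument the same hook functions. Program names vary by Tetragon version.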
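A plain substring search on that file can mislead, so here is a minimal sketch of an exact-entry check. The helper name lsm_has_bpf is hypothetical (it is not part of any tool), and it only inspects the string you pass in.

```shell
#!/bin/sh
# Report whether "bpf" appears as a whole entry in a comma-separated LSM list.
# lsm_has_bpf is a hypothetical helper name used for illustration only.
lsm_has_bpf() {
    # Wrap the list in commas so the first and last entries match the same way.
    case ",$1," in
        *,bpf,*) echo "enabled" ;;
        *)       echo "not enabled" ;;
    esac
}

# Example using the sample list shown above.
lsm_has_bpf "lockdown,capability,landlock,yama,apparmor,bpf"
```

On a node you would feed it the live value: lsm_has_bpf "$(cat /sys/kernel/security/lsm)". Matching whole entries avoids a false positive if some other module name ever contains the letters "bpf".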

Not running Tetragon? Check if your cluster uses AppArmor or seccomp profiles instead: kubectl get pod <name> -o jsonpath='{.metadata.annotations}' and look for seccomp.security.alpha.kubernetes.io or container.apparmor.security.beta.kubernetes.io annotations. On newer Kubernetes versions these annotations are deprecated in favor of the securityContext.seccompProfile and securityContext.appArmorProfile fields, so check the pod spec too. These are userspace-applied profiles that the kernel enforces. Tetragon is additive: it can run alongside AppArmor/seccomp and provides per-process, dynamic policy that static profiles cannot.

Falco fired at 03:14 AM. The alert: a process inside a production container had opened /etc/passwd for writing. By the time I was on the call, the container had been restarted by a health check failure — the compromised process had already exited. The file had already been modified. Falco had detected the open, emitted the alert, and by the time any automated response could have acted, the syscall had returned, the write had completed, and the file was changed.

Falco did exactly what it’s designed to do: observe and alert. The gap isn’t in Falco — it’s in the architecture. When a tool detects from userspace by reading kernel audit events, there is always a window between the operation completing and the alert firing. For a fast exploit, that window is the entire attack.

I added a Tetragon TracingPolicy the following week:

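spec:
  kprobes:
    - call: "security_inode_permission"   # hook signature: (struct inode *inode, int mask)
      syscall: false                      # kernel function, not a syscall
      return: false
      args:
        - index: 0
          type: "inode"
        - index: 1
          type: "int"                     # requested access mask
      selectors:
        - matchArgs:
            - index: 0
              operator: "Prefix"
              values: ["/etc/passwd", "/etc/shadow"]
            - index: 1
              operator: "Mask"
              values: ["2"]               # MAY_WRITE; without this, reads match too
          matchActions:
            - action: Sigkill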

The next time a process in a container covered by that policy tries to open /etc/passwd for writing, Tetragon's eBPF program sends SIGKILL from kernel context at the hook. The process never returns to user space to perform the write, and no userspace responder has to win a race.

How LSM Hooks Are Placed in the Kernel

Linux Security Modules (LSM) is a framework built into the Linux kernel that inserts hook points before security-sensitive operations. The hook fires before the operation is allowed to complete — the LSM module can return an error code that causes the kernel to reject the operation and return -EPERM to the calling process.

Process calls open("/etc/passwd", O_WRONLY)
      ↓
VFS (Virtual Filesystem) layer receives the request
      ↓
VFS calls security_inode_permission()   ← LSM hook fires here
      ↓
LSM module checks policy
      ↓
      ├── ALLOW → open() proceeds, file descriptor returned
      └── DENY  → open() returns -EPERM, process gets "Operation not permitted"
                  File is never touched

LSM hook: a callback point embedded in Linux kernel source at every security-sensitive operation: file open, execute, socket connect, capability check, mount, ptrace, and more. The kernel calls registered LSM modules at each hook. Before BPF LSM (kernel 5.7), only security modules compiled into the kernel (SELinux, AppArmor, Smack, and similar) could register at these hooks.

BPF_PROG_TYPE_LSM: the eBPF program type that attaches to LSM hooks. Introduced in kernel 5.7. Requires a kernel built with CONFIG_BPF_LSM=y and bpf present in the active LSM list, typically by adding bpf to the existing lsm= list on the kernel command line (lsm=bpf on its own would replace the default list rather than extend it). When this program type is loaded and attached to an LSM hook, the eBPF program runs at the hook point and returns 0 (allow) or a negative error code (deny).
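If bpf is missing from the active list, the usual approach is to keep the modules you already run and append bpf. The sketch below only prints the parameter string; lsm_param_with_bpf is a hypothetical helper name, and how you persist the value (bootloader configuration, node image build) depends on your distribution and is not shown.

```shell
#!/bin/sh
# Build an lsm= boot parameter that keeps the current modules and appends bpf.
# lsm_param_with_bpf is a hypothetical helper; it prints a string and changes nothing.
lsm_param_with_bpf() {
    current=$1
    if [ -z "$current" ]; then
        echo "lsm=bpf"                           # nothing to preserve
        return
    fi
    case ",$current," in
        *,bpf,*) echo "lsm=$current" ;;          # already present, keep the order
        *)       echo "lsm=$current,bpf" ;;      # append bpf after the existing modules
    esac
}

# Example with a list that lacks bpf.
lsm_param_with_bpf "lockdown,capability,landlock,yama,apparmor"
```

On a node, pass "$(cat /sys/kernel/security/lsm)" as the argument. The change only takes effect after a reboot, and only if the kernel was built with CONFIG_BPF_LSM=y.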

The full list of LSM hooks:

# LSM hook points exposed to BPF LSM on this kernel (requires CONFIG_BPF_LSM)
grep -o 'bpf_lsm_[a-z0-9_]*' /proc/kallsyms | sort -u | head -20

# Or browse the kernel source:
# include/linux/lsm_hook_defs.h defines every hook;
# include/linux/security.h declares the security_*() wrappers that call them

There are 200+ LSM hook points. The most operationally relevant for container security:

LSM Hook	What it guards
`security_bprm_check`	Process execution (execve)
`security_inode_permission`	File read/write/execute
`security_inode_create`	File creation
`security_socket_connect`	Outbound TCP/UDP connect
`security_socket_bind`	Port binding
`security_ptrace_access_check`	ptrace (debugger attach)
`security_capable`	Capability checks (CAP_SYS_ADMIN etc.)
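To check whether a specific hook from the table is attachable on a given kernel, you can look for its bpf_lsm_ symbol. This is a sketch under two assumptions: the kernel was built with CONFIG_BPF_LSM, and you pass the hook's internal name, which usually drops the security_ prefix (socket_connect, inode_permission) but not always (security_bprm_check wraps the hook named bprm_check_security). The helper name hook_available is hypothetical.

```shell
#!/bin/sh
# Read kallsyms-style lines on stdin and report whether a BPF LSM attach
# point exists for the given hook name. hook_available is a hypothetical helper.
hook_available() {
    hook=$1
    # -w matches whole words, so "bprm_check" does not match "bprm_check_security".
    if grep -q -w "bpf_lsm_${hook}"; then
        echo "$hook: available"
    else
        echo "$hook: missing"
    fi
}

# Example with two made-up kallsyms lines (addresses are placeholders).
sample='0000000000000000 T bpf_lsm_bprm_check_security
0000000000000000 T bpf_lsm_socket_connect'
printf '%s\n' "$sample" | hook_available socket_connect
```

On a node: hook_available socket_connect < /proc/kallsyms.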

How Tetragon Combines LSM and kprobe

Tetragon attaches two types of programs simultaneously for comprehensive runtime security:

kprobe programs          LSM programs
(observation layer)      (enforcement layer)
       │                        │
       ↓                        ↓
Process executes              Kernel LSM hook fires
kernel function               BEFORE operation completes
       │                        │
       ↓                        ↓
Tetragon reads context:       Tetragon checks TracingPolicy:
  - process name                - selectors match?
  - PID, UID                    - action = Sigkill?
  - namespace, pod name         │
  - parent process              ↓
  - capabilities                SIGKILL sent from kernel context
       │                        Process terminated
       ↓                        Operation never completes
Tetragon exports event
  to userspace observer

The kprobe side provides the rich context (pod name, namespace, process tree) because it has access to Kubernetes metadata that Tetragon’s userspace component has pre-populated into maps. The LSM side provides the enforcement capability. Together, they give you context-aware kernel enforcement.

SIGKILL from kernel vs userspace kill: When a userspace agent runs kill -9 <pid>, it first has to receive the event, make a decision, and issue a kill syscall; the target keeps running the whole time and may complete many more syscalls before the signal lands. When an eBPF program calls bpf_send_signal(SIGKILL) from within the hook, the signal is queued on the current task while it is still inside the kernel, and the task is terminated before it returns to user space. When the program instead returns a negative error code from the hook, the kernel rejects the operation itself. This is not just a speed difference: it is a structural difference in when the enforcement happens relative to the operation.

Writing a Tetragon TracingPolicy for Enforcement

Tetragon policies are Kubernetes custom resources. Here’s a policy that prevents any container from executing shells:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: block-shell-exec
spec:
  kprobes:
    - call: "security_bprm_check"
      syscall: false
      args:
        - index: 0
          type: "linux_binprm"
      selectors:
        - matchArgs:
            - index: 0
              operator: "Equal"      # match the binary being executed
              values:
                - "/bin/sh"
                - "/bin/bash"
                - "/bin/dash"
                - "/usr/bin/sh"
                - "/usr/bin/bash"
          matchNamespaces:
            - namespace: Pid
              operator: "NotIn"
              values: ["host_ns"]    # skip processes in the host PID namespace
          matchActions:
            - action: Override       # argError belongs to Override, not Sigkill
              argError: -1           # hook returns -EPERM to the caller
            - action: Sigkill

Apply and verify:

kubectl apply -f block-shell-exec.yaml

# Confirm it's active
kubectl get tracingpolicies
# NAME               ENABLED   REASON   AGE
# block-shell-exec   true               5s

# Verify Tetragon loaded the eBPF program for this policy
bpftool prog list | grep bprm
# 94: lsm  name tetragon_lsm_bprm  tag 8f2a1c3e4d5b7a9f  gpl
#     loaded_at 2026-04-22T14:22:13+0530  uid 0

Test it (in a non-production namespace):

kubectl exec -it test-pod -- /bin/sh

# Expected output:
# OCI runtime exec failed: exec failed: unable to start container process:
# error during container init: error starting executable ["/bin/sh"]:
# container_linux.go: ... starting container process caused: process_linux.go:
# ... SIGKILL

The shell never started. The security_bprm_check hook fired, Tetragon's eBPF program evaluated the policy, and the enforcement action ran in kernel context before the new program image could execute. The container runtime saw the exec fail, and no shell ever ran.
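When you script this kind of test and the killed process is a direct child of your shell (for example, a test harness running on the node rather than through kubectl exec), the exit status tells you how it died: a shell reports 128 plus the signal number, so SIGKILL shows up as 137. The sketch below demonstrates the check with a child that kills itself, so it runs anywhere; killed_by_sigkill is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if an exit status indicates death by SIGKILL (128 + 9 = 137).
# killed_by_sigkill is a hypothetical helper name.
killed_by_sigkill() {
    [ "$1" -eq 137 ]
}

# Self-contained demonstration: a child shell that sends SIGKILL to itself.
status=0
sh -c 'kill -KILL $$' || status=$?

if killed_by_sigkill "$status"; then
    echo "exit status $status: terminated by SIGKILL"
else
    echo "exit status $status: not a SIGKILL"
fi
```

Through kubectl exec the container runtime reports the failure in its own words, as in the output above, so treat this as a check for node-level test scripts only.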

Audit Mode Before Enforce Mode

Running a new LSM policy in enforce mode without prior testing will kill legitimate workloads. Tetragon supports audit mode for every policy:

          matchActions:
            - action: Post     # audit mode: log event, do NOT kill

Post emits a Tetragon event that you can observe:

# Watch audit events for the policy (before switching to Sigkill)
kubectl exec -n kube-system -it \
  $(kubectl get pod -n kube-system -l app.kubernetes.io/name=tetragon -o name | head -1) \
  -- tetra getevents --event-types PROCESS_KPROBE | grep bprm

Sample audit event:

{
  "process_kprobe": {
    "process": {
      "pod": {"name": "my-app-6d4f9-xk2p1", "namespace": "production"},
      "binary": "/bin/sh",
      "pid": 18293
    },
    "function_name": "security_bprm_check",
    "action": "KPROBE_ACTION_POST"
  }
}

If my-app legitimately needs /bin/sh for its health check script, you’ll see it here before you kill it. Refine the policy (add a podSelector that excludes that specific deployment, or narrow the list of matched binaries) and then switch to Sigkill.
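During a long audit window it helps to collapse the event stream into a count per binary. The sketch below assumes one JSON event per line and takes the first "binary" key on each line to be the process that triggered the event; both are assumptions about the output layout. A JSON-aware tool such as jq reading .process_kprobe.process.binary would be more robust where it is installed. count_by_binary is a hypothetical helper name, and the sample lines are made up to mirror the event shown above.

```shell
#!/bin/sh
# Count audit events per binary from JSON events on stdin (one event per line).
# Assumption: the first "binary" key on a line belongs to the triggering process.
count_by_binary() {
    awk '
        match($0, /"binary": *"[^"]*"/) {
            s = substr($0, RSTART, RLENGTH)   # the matched key/value text
            sub(/^"binary": *"/, "", s)       # drop the key and opening quote
            sub(/"$/, "", s)                  # drop the closing quote
            count[s]++
        }
        END { for (b in count) print count[b], b }
    ' | sort -rn
}

# Made-up sample events for illustration.
printf '%s\n' \
  '{"process_kprobe":{"process":{"binary":"/bin/sh","pid":1},"function_name":"security_bprm_check"}}' \
  '{"process_kprobe":{"process":{"binary":"/bin/sh","pid":2},"function_name":"security_bprm_check"}}' \
  '{"process_kprobe":{"process":{"binary":"/bin/bash","pid":3},"function_name":"security_bprm_check"}}' |
  count_by_binary
```

In practice you would pipe the tetra getevents output from the command above into count_by_binary and review every binary in the result before enforcing.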

⚠ Production Gotchas

Enforce mode kills anything the selector matches — including health checks and init containers. Most production containers have some shell usage: liveness probes that run sh -c, init containers that chmod files, entrypoint wrappers. Run in Post (audit) mode for at least 48 hours across a representative workload set before switching to Sigkill. Track all matched events and understand every process in the trace before enforcing.

LSM hooks fire in kernel context — eBPF program complexity is limited. The verifier enforces strict limits on LSM programs because they run synchronously in the kernel’s hot path. Policies with many conditions or complex map lookups may be rejected by the verifier. Tetragon’s policy engine compiles your TracingPolicy into eBPF that stays within verifier limits, but very complex matchArgs chains with many values can hit limits. Test with kubectl apply and check Tetragon pod logs for verifier rejection messages.

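BPF_PROG_TYPE_LSM requires kernel 5.7+ and BPF LSM enabled. Check /sys/kernel/security/lsm for bpf in the list; a new-enough kernel is necessary but not sufficient, because the module must also be compiled in (CONFIG_BPF_LSM=y) and included in the active LSM list. EKS nodes running Amazon Linux 2 with kernel 5.10+, GKE nodes with kernel 5.10+ on Container-Optimized OS, and Ubuntu 22.04 (kernel 5.15) all meet the kernel requirement, but whether bpf is in the default list depends on the image, so confirm on a real node rather than assuming. Ubuntu 20.04's original 5.4 kernel predates BPF LSM entirely; check your actual kernel version.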

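Policy scope: Tetragon TracingPolicies are cluster-wide by default. A policy with no pod or namespace restriction applies to every pod on every node. Start with namespace-scoped policies during testing: the TracingPolicyNamespaced resource (Tetragon 0.10+) limits a policy to a single Kubernetes namespace, and a podSelector narrows it further. Note that matchNamespaces inside a selector filters on Linux namespaces (Pid, Mnt, Net, and so on), not Kubernetes namespaces.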

bpf_send_signal(SIGKILL) vs returning an error code. Tetragon’s Sigkill action uses bpf_send_signal() rather than returning a negative error from the hook. The process is terminated, but the kernel operation that triggered the hook is not itself rejected, so it can still complete before the process dies. For critical enforcement paths, combining a deny (return -EPERM, the Override action in Tetragon terms) with bpf_send_signal(SIGKILL) is the belt-and-suspenders approach; Tetragon’s maintainers have documented which actions use which mechanism.
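The kernel-version gotcha above is easy to turn into a preflight check before you roll a policy out to a node pool. This is a minimal sketch that only compares version strings; it says nothing about whether BPF LSM is compiled in or active. kernel_at_least is a hypothetical helper name, and it relies on sort -V from GNU coreutils.

```shell
#!/bin/sh
# Succeed if a kernel release string is at least the given minimum version.
# kernel_at_least is a hypothetical helper; requires sort -V (GNU coreutils).
kernel_at_least() {
    min=$1
    rel=${2%%-*}      # strip the distro suffix: "5.15.0-105-generic" -> "5.15.0"
    # Version-sort the two strings; the minimum must come out first (or tie).
    lowest=$(printf '%s\n%s\n' "$min" "$rel" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}

# Example with a fixed release string so the result does not depend on the host.
if kernel_at_least 5.7 "5.15.0-105-generic"; then
    echo "kernel is new enough for BPF_PROG_TYPE_LSM"
else
    echo "kernel predates BPF_PROG_TYPE_LSM"
fi
```

On a node: kernel_at_least 5.7 "$(uname -r)". Pair it with the /sys/kernel/security/lsm check, since a recent kernel with bpf missing from the LSM list still cannot load LSM programs.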

Quick Reference

What you want	Command
Is BPF LSM enabled?	`cat /sys/kernel/security/lsm` (look for `bpf`)
What LSM programs are loaded?	`bpftool prog list | grep lsm`
What Tetragon policies exist?	`kubectl get tracingpolicies -A`
Audit events (before enforce)	`tetra getevents --event-types PROCESS_KPROBE`
Watch Tetragon enforcement	`kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -f`
Test a policy safely	Set `action: Post` before `action: Sigkill`

Tetragon action	Effect
`Post`	Log event only — audit mode
`Sigkill`	Send SIGKILL from kernel context
`Override`	Return custom error code to syscall caller
`FollowFD`	Track file descriptor for future hook correlation

LSM hook	Protects
`security_bprm_check`	exec (block shell spawning)
`security_inode_permission`	file access (block reads/writes to sensitive paths)
`security_socket_connect`	outbound connections (block C2 connections)
`security_capable`	capability escalation (block CAP_SYS_ADMIN attempts)

Key Takeaways

Tetragon's LSM-hook enforcement acts at the hook itself: the operation either never completes or returns an error before the kernel performs the action, with no detect-and-respond window
Falco, Datadog, and sidecar-based tools detect events after the syscall returns; this is architectural, not a product limitation — they operate at a layer where the operation has already occurred
BPF_PROG_TYPE_LSM attaches eBPF programs directly to Linux Security Module hooks; it needs kernel 5.7+ with BPF LSM enabled, which you should verify on each EKS/GKE node image via /sys/kernel/security/lsm
Tetragon sends SIGKILL from kernel context using bpf_send_signal() — not from a userspace agent polling an audit log
Always run Tetragon policies in Post (audit) mode for 48+ hours before switching to Sigkill — legitimate workloads trigger many of the same LSM hooks that attacks use
The combination of kprobe (rich context: pod name, namespace, process tree) and LSM (enforcement) gives Tetragon context-aware kernel enforcement that static profiles (AppArmor, seccomp) cannot provide dynamically

What’s Next

LSM hooks prevent operations in the moment. But after an incident — when enforcement failed, or when you’re doing post-hoc forensics — the question changes: what did this process spawn, what files did it touch, what connections did it make, and in what order? Answering that from logs alone is guesswork. Answering it from kernel-level process lineage is reconstruction.

EP13 covers how eBPF kprobe hooks on fork and exec build a complete, tamper-resistant process tree. Even after the attacker’s process has exited, the record remains — in kernel maps, exported to a persistent store, tied to the pod that ran it.

Next: process lineage with eBPF — reconstructing what happened after the fact
