LSM and Tetragon — When the Kernel Says No


eBPF: From Kernel to Cloud, Episode 12


Architecture Overview

[Figure: LSM BPF and Tetragon kernel security enforcement architecture, showing syscall interception and policy evaluation. Caption: LSM BPF hooks fire before every sensitive syscall; Tetragon uses them to enforce and kill, not just observe.]

TL;DR

  • Tetragon pairs Linux Security Module (LSM) hooks with eBPF programs: enforcement happens inside the kernel, before the operation completes, so there is no detect-and-respond window
    (LSM hook = Linux Security Module hook: a callback point built into the kernel that fires before a security-relevant operation completes, allowing the security module to approve or reject it)
  • Falco and similar sidecar-based tools detect after the fact — the syscall returns, the file is written, the connection is established, the alert fires; with LSM, the syscall never returns success
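  • BPF_PROG_TYPE_LSM is the eBPF program type that attaches to LSM hooks. It was introduced in kernel 5.7 and is widely shipped from the 5.10 LTS series onward; it needs a kernel built with CONFIG_BPF_LSM and bpf in the active LSM list, so verify it on each Ubuntu LTS, Fedora, or EKS/GKE node image rather than assuming it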
  • Tetragon attaches eBPF programs to LSM hooks and kprobes simultaneously — observing and enforcing from the same kernel attachment point
  • Tetragon’s enforcement sends SIGKILL from within the kernel context — not from a userspace agent reading an audit log and then killing the process
  • Production caution: LSM enforce mode without thorough policy testing in audit mode first will kill legitimate workloads; always audit before enforce

EP11 showed how to observe DNS queries at the kernel level: seeing what a workload resolves before it establishes a connection. But observation is passive. It tells you what happened. LSM-based enforcement with Tetragon changes the question entirely: instead of an agent watching the workload, the kernel refuses the operation. This episode covers how that enforcement layer works and why the difference between “detect” and “prevent” matters in runtime security.

Quick Check: Is Your Cluster Running LSM-Based Enforcement?

# On any cluster node — what security modules are active?
cat /sys/kernel/security/lsm

# Expected output on a modern kernel:
# lockdown,capability,landlock,yama,apparmor,bpf
#                                            ^^^
#                            "bpf" here means BPF LSM is enabled

# Is Tetragon running on this cluster?
kubectl get pods -n kube-system -l app.kubernetes.io/name=tetragon

# If Tetragon is present, check what TracingPolicies are enforcing:
kubectl get tracingpolicies -A

# Sample output:
# NAMESPACE    NAME                      AGE
# kube-system  block-privileged-exec     3d
# kube-system  restrict-sensitive-paths  3d

# See what eBPF programs Tetragon has loaded
bpftool prog list | grep -i tetragon

# Output sample:
# 89: lsm  name tetragon_lsm_bprm  tag 8f2a1c3e4d5b7a9f  gpl
#     loaded_at 2026-04-22T09:13:45+0530  uid 0
#     xlated 3312B  jited 2184B  memlock 8192B
# 91: kprobe  name tetragon_kp_exec tag 3c1d8e2f7a4b5c9d  gpl

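The lsm program type confirms LSM hook attachment. If you see lsm-type entries owned by Tetragon, it is attached at LSM hooks on this node. Treat the program names above as illustrative: the names bpftool prints depend on the Tetragon version, and a policy written under kprobes: loads as type kprobe even when it targets a security_* function.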
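The first check above is easy to script across nodes. The following is a minimal Python sketch, not part of Tetragon: the function names are made up for illustration, and it assumes only that /sys/kernel/security/lsm holds a comma-separated list of active modules, as shown in the sample output.

```python
from pathlib import Path

# Path taken from the check above; everything else here is illustrative.
LSM_LIST_PATH = Path("/sys/kernel/security/lsm")


def bpf_lsm_active(lsm_list: str) -> bool:
    """Return True if 'bpf' appears as its own entry in a comma-separated LSM list."""
    return "bpf" in [name.strip() for name in lsm_list.split(",")]


def node_has_bpf_lsm(path: Path = LSM_LIST_PATH) -> bool:
    """Read the active LSM list on this node; an unreadable file counts as 'not active'."""
    try:
        return bpf_lsm_active(path.read_text())
    except OSError:
        return False
```

Matching on whole entries rather than a substring avoids a false positive if some other module name ever contains the letters bpf. Calling node_has_bpf_lsm() on a node gives the same answer as reading the file by eye.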

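Not running Tetragon? Check whether your cluster uses AppArmor or seccomp profiles instead. On current Kubernetes versions these live in the pod spec: run kubectl get pod <name> -o jsonpath='{.spec.securityContext}' (and check each container’s securityContext as well) and look for seccompProfile or appArmorProfile. Older clusters may still carry the seccomp.security.alpha.kubernetes.io or container.apparmor.security.beta.kubernetes.io annotations, visible with -o jsonpath='{.metadata.annotations}'. These are profiles applied from userspace that the kernel enforces. Tetragon is additive: it can run alongside AppArmor/seccomp and provides per-process, dynamic policy that static profiles cannot.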


Falco fired at 03:14 AM. The alert: a process inside a production container had opened /etc/passwd for writing. By the time I was on the call, the container had been restarted by a health check failure — the compromised process had already exited. The file had already been modified. Falco had detected the open, emitted the alert, and by the time any automated response could have acted, the syscall had returned, the write had completed, and the file was changed.

Falco did exactly what it’s designed to do: observe and alert. The gap isn’t in Falco — it’s in the architecture. When a tool detects from userspace by reading kernel audit events, there is always a window between the operation completing and the alert firing. For a fast exploit, that window is the entire attack.

I added a Tetragon TracingPolicy the following week:

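spec:
  kprobes:
    - call: "security_file_permission"
      syscall: false
      return: false
      args:
        - index: 0
          type: "file"          # struct file *, resolved to its path
        - index: 1
          type: "int"           # access mask: 0x02 = MAY_WRITE, 0x04 = MAY_READ
      selectors:
        - matchArgs:
            - index: 0
              operator: "Prefix"
              values: ["/etc/passwd", "/etc/shadow"]
            - index: 1
              operator: "Equal"
              values: ["2"]     # writes only; reads of /etc/passwd are routine
          matchActions:
            - action: Sigkill

Next time a process in a container covered by that policy tries to write to /etc/passwd, Tetragon’s eBPF program sends SIGKILL from within the kernel, at the security_file_permission hook, while the process is still inside the write syscall. It never returns to userspace to continue. The response no longer waits on a userspace agent noticing the event after the fact.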


How LSM Hooks Are Placed in the Kernel

Linux Security Modules (LSM) is a framework built into the Linux kernel that inserts hook points before security-sensitive operations. The hook fires before the operation is allowed to complete — the LSM module can return an error code that causes the kernel to reject the operation and return -EPERM to the calling process.

Process calls open("/etc/passwd", O_WRONLY)
      ↓
VFS (Virtual Filesystem) layer receives the request
      ↓
VFS calls security_inode_permission()   ← LSM hook fires here
      ↓
LSM module checks policy
      ↓
      ├── ALLOW → open() proceeds, file descriptor returned
      └── DENY  → open() fails with EPERM, process gets "Operation not permitted"
                  File is never touched

LSM hook: a callback point embedded in Linux kernel source at every security-sensitive operation, such as file open, execute, socket connect, capability check, mount, and ptrace. The kernel calls registered LSM modules at each hook. Before BPF LSM (kernel 5.7), only security modules compiled into the kernel (SELinux, AppArmor, Smack, and similar) could register at these hooks; BPF LSM lets eBPF programs attach to them at runtime.

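BPF_PROG_TYPE_LSM: the eBPF program type that attaches to LSM hooks. Introduced in kernel 5.7. It requires a kernel built with CONFIG_BPF_LSM=y and bpf in the active LSM list, either through the build-time default or by appending bpf to the existing lsm= list on the kernel command line (for example lsm=lockdown,capability,landlock,yama,apparmor,bpf; passing lsm=bpf on its own would leave the other modules out of that list). When a program of this type is loaded and attached to an LSM hook, it runs at the hook point and returns 0 (allow) or a negative error code (deny).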
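The allow/deny convention is easier to see in a toy model. The Python sketch below is a simplified illustration of the return-value semantics described above, not kernel code: the hook functions and the request dictionary are hypothetical, and real hooks receive kernel structures rather than dicts.

```python
import errno
from typing import Callable, Iterable

# A hook takes a description of the request and returns 0 (allow)
# or a negative errno value (deny), mirroring the kernel convention.
Hook = Callable[[dict], int]


def capability_hook(request: dict) -> int:
    """Hypothetical built-in module that allows everything in this sketch."""
    return 0


def bpf_deny_passwd_write(request: dict) -> int:
    """Hypothetical BPF LSM program: deny writes to /etc/passwd."""
    if request["path"] == "/etc/passwd" and request["write"]:
        return -errno.EPERM
    return 0


def call_hooks(hooks: Iterable[Hook], request: dict) -> int:
    """Call each registered hook in order; the first non-zero result denies."""
    for hook in hooks:
        result = hook(request)
        if result != 0:
            return result
    return 0
```

With both hooks registered, a write request for /etc/passwd comes back as a negative EPERM while a read of the same path returns 0. An operation proceeds only if every registered module allows it, which is why BPF LSM can add restrictions on top of AppArmor or SELinux but cannot loosen theirs.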

The full list of LSM hooks:

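# All LSM hook points available for eBPF attachment (needs kernel BTF)
bpftool btf dump file /sys/kernel/btf/vmlinux format raw | grep "FUNC 'bpf_lsm_" | head -20

# Or browse the kernel source list:
# include/linux/lsm_hook_defs.h lists every hook; the security_*() wrappers
# that invoke them are declared in include/linux/security.h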

There are 200+ LSM hook points. The most operationally relevant for container security:

LSM Hook                       What it guards
-----------------------------  --------------------------------------
security_bprm_check            Process execution (execve)
security_inode_permission      File read/write/execute
security_inode_create          File creation
security_socket_connect        Outbound TCP/UDP connect
security_socket_bind           Port binding
security_ptrace_access_check   ptrace (debugger attach)
security_capable               Capability checks (CAP_SYS_ADMIN etc.)

How Tetragon Combines LSM and kprobe

Tetragon attaches two types of programs simultaneously for comprehensive runtime security:

kprobe programs          LSM programs
(observation layer)      (enforcement layer)
       │                        │
       ↓                        ↓
Process executes              Kernel LSM hook fires
kernel function               BEFORE operation completes
       │                        │
       ↓                        ↓
Tetragon reads context:       Tetragon checks TracingPolicy:
  - process name                - selectors match?
  - PID, UID                    - action = Sigkill?
  - namespace, pod name         │
  - parent process              ↓
  - capabilities                SIGKILL sent from kernel context
       │                        Process terminated
       ↓                        Operation never completes
Tetragon exports event
  to userspace observer

The kprobe side provides the rich context (pod name, namespace, process tree) because it has access to Kubernetes metadata that Tetragon’s userspace component has pre-populated into maps. The LSM side provides the enforcement capability. Together, they give you context-aware kernel enforcement.

SIGKILL from kernel vs userspace kill: When a userspace agent runs kill -9 <pid>, it first has to receive the event, decide, and issue its own kill syscall. The target keeps running that whole time and may finish the operation and start others before the signal arrives. A BPF program attached at the hook has two in-kernel options. Returning a negative error code from a BPF_PROG_TYPE_LSM program makes the kernel reject the operation on the spot. Calling bpf_send_signal(SIGKILL) marks the current task for termination while it is still inside the syscall; the kernel acts on the signal before the task returns to userspace, although only the deny guarantees that the in-flight operation itself is stopped. Either way, the decision is made in the execution context of the offending syscall. This is not a speed difference; it is a structural difference in when the enforcement happens relative to the operation.


Writing a Tetragon TracingPolicy for Enforcement

Tetragon policies are Kubernetes custom resources. Here’s a policy that prevents any container from executing shells:

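apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: block-shell-exec
spec:
  kprobes:
    - call: "security_bprm_check"
      syscall: false
      args:
        - index: 0
          type: "linux_binprm"     # resolves to the path of the binary being executed
      selectors:
        # matchArgs tests the exec target; matchBinaries would instead test
        # the binary of the process that is calling exec.
        - matchArgs:
            - index: 0
              operator: "Equal"
              values:
                - "/bin/sh"
                - "/bin/bash"
                - "/bin/dash"
                - "/usr/bin/sh"
                - "/usr/bin/bash"
          matchNamespaces:
            - namespace: Pid
              operator: "NotIn"
              values: ["host_ns"]  # skip processes in the host PID namespace
          matchActions:
            - action: Override
              argError: -1         # the hook returns -EPERM, so the exec is rejected
            - action: Sigkill      # and the calling process is killed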

Apply and verify:

kubectl apply -f block-shell-exec.yaml

# Confirm it's active
kubectl get tracingpolicies
# NAME               ENABLED   REASON   AGE
# block-shell-exec   true               5s

# Check which eBPF programs are loaded on the node. Program names and types
# depend on the Tetragon version: policies under kprobes: load as type kprobe,
# LSM-attached programs show up as type lsm.
bpftool prog list | grep -E '^[0-9]+: (kprobe|lsm)'

Test it (in a non-production namespace):

kubectl exec -it test-pod -- /bin/sh

# Expected output:
# OCI runtime exec failed: exec failed: unable to start container process:
# error during container init: error starting executable ["/bin/sh"]:
# container_linux.go: ... starting container process caused: process_linux.go:
# ... SIGKILL

The shell never started. The security_bprm_check hook fired, the Tetragon eBPF program evaluated the policy and sent SIGKILL from kernel space, and the process attempting the exec was terminated before the new program ran a single instruction. No shell process was created.
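The decision a policy like block-shell-exec encodes can be sketched in a few lines of Python. This is a simplified model for reasoning about selectors, not Tetragon’s implementation: match_arg and decide are hypothetical names, and the model assumes the policy compares the path being executed against a fixed list and skips the host PID namespace.

```python
def match_arg(operator: str, values: list[str], actual: str) -> bool:
    """Evaluate one matchArgs-style condition against a string argument."""
    if operator == "Equal":
        return actual in values
    if operator == "Prefix":
        return any(actual.startswith(v) for v in values)
    if operator == "Postfix":
        return any(actual.endswith(v) for v in values)
    raise ValueError(f"unsupported operator: {operator}")


# Shell paths copied from the policy above.
SHELLS = ["/bin/sh", "/bin/bash", "/bin/dash", "/usr/bin/sh", "/usr/bin/bash"]


def decide(exec_path: str, in_host_pid_ns: bool) -> str:
    """Return the action a block-shell-exec style policy would take."""
    if in_host_pid_ns:
        return "allow"  # host processes are excluded by the namespace selector
    if match_arg("Equal", SHELLS, exec_path):
        return "Sigkill"
    return "allow"
```

Walking real exec paths from your workloads through a model like this is a cheap way to spot gaps before touching a cluster. An exact-match list, for instance, does not cover a shell installed at a path that is missing from the list.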


Audit Mode Before Enforce Mode

Running a new LSM policy in enforce mode without prior testing will kill legitimate workloads. Tetragon supports audit mode for every policy:

          matchActions:
            - action: Post     # audit mode: log event, do NOT kill

Post emits a Tetragon event that you can observe:

# Watch audit events for the policy (before switching to Sigkill)
kubectl exec -n kube-system -it \
  $(kubectl get pod -n kube-system -l app.kubernetes.io/name=tetragon -o name | head -1) \
  -c tetragon -- tetra getevents --event-types PROCESS_KPROBE | grep bprm

Sample audit event:

{
  "process_kprobe": {
    "process": {
      "pod": {"name": "my-app-6d4f9-xk2p1", "namespace": "production"},
      "binary": "/bin/sh",
      "pid": 18293
    },
    "function_name": "security_bprm_check",
    "action": "KPROBE_ACTION_POST"
  }
}

If my-app legitimately needs /bin/sh for its health check script, you’ll see it here before you kill it. Refine the policy (use podSelector with a matchExpressions rule to leave that deployment out of scope, or remove the binary it needs from the blocked list) and then switch to Sigkill.
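Reading audit events one at a time does not scale past a handful of pods. The sketch below is one way to summarize them in Python, assuming the events arrive as one JSON object per line with the fields shown in the sample above; the function name is illustrative and the script is not part of the tetra CLI.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_audit_events(lines: Iterable[str], function_name: str) -> Counter:
    """Count process_kprobe events per (namespace, pod, binary) for one hooked function."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON event
        if not isinstance(event, dict):
            continue
        kprobe = event.get("process_kprobe")
        if not kprobe or kprobe.get("function_name") != function_name:
            continue
        process = kprobe.get("process", {})
        pod = process.get("pod", {})
        key = (
            pod.get("namespace", "<host>"),   # events without pod info are host processes
            pod.get("name", "<none>"),
            process.get("binary", "?"),
        )
        counts[key] += 1
    return counts
```

Feed it the JSON stream from tetra getevents (for example by passing sys.stdin) and print counts.most_common(). Every (namespace, pod, binary) row in the result is a workload that the policy would kill once the action changes to Sigkill.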


⚠ Production Gotchas

Enforce mode kills anything the selector matches — including health checks and init containers. Most production containers have some shell usage: liveness probes that run sh -c, init containers that chmod files, entrypoint wrappers. Run in Post (audit) mode for at least 48 hours across a representative workload set before switching to Sigkill. Track all matched events and understand every process in the trace before enforcing.

LSM hooks fire in kernel context — eBPF program complexity is limited. The verifier enforces strict limits on LSM programs because they run synchronously in the kernel’s hot path. Policies with many conditions or complex map lookups may be rejected by the verifier. Tetragon’s policy engine compiles your TracingPolicy into eBPF that stays within verifier limits, but very complex matchArgs chains with many values can hit limits. Test with kubectl apply and check Tetragon pod logs for verifier rejection messages.

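BPF_PROG_TYPE_LSM requires kernel 5.7+ and BPF LSM enabled. Check /sys/kernel/security/lsm for bpf in the list. A kernel can be built with CONFIG_BPF_LSM and still not list bpf there, in which case it has to be added through the lsm= boot parameter. EKS nodes running Amazon Linux 2 with kernel 5.10+ and GKE nodes with kernel 5.10+ on Container-Optimized OS are new enough, as is Ubuntu 22.04 (kernel 5.15), but confirm on the actual node image rather than assuming the default. Ubuntu 20.04 on its original 5.4 kernel predates BPF LSM entirely, so check your actual kernel version.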
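The version half of that check can be automated. The following Python sketch only parses a release string of the kind uname -r prints and compares it against 5.7; the function names are illustrative, and passing this check says nothing about whether bpf is actually in the active LSM list.

```python
import platform
import re


def kernel_version(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_bpf_lsm_minimum(release: str) -> bool:
    """True if the kernel is at least 5.7, the release that introduced BPF_PROG_TYPE_LSM."""
    return kernel_version(release) >= (5, 7)


def current_node_meets_minimum() -> bool:
    """Apply the check to the kernel this script is running on."""
    return meets_bpf_lsm_minimum(platform.release())
```

Comparing (major, minor) tuples avoids the classic string-comparison mistake where "5.10" sorts before "5.7".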

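Policy scope: Tetragon TracingPolicies are cluster-wide by default. A policy with no podSelector and no other scoping applies to every pod on every node. Note that matchNamespaces filters on Linux namespaces (Pid, Mnt, and so on), not Kubernetes namespaces. Start with narrowly scoped policies during testing: use the TracingPolicyNamespaced kind (Tetragon 0.10+) to limit a policy to one Kubernetes namespace, and podSelector with matchLabels to limit it to specific workloads.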

bpf_send_signal(SIGKILL) vs returning an error code. Tetragon’s Sigkill action uses bpf_send_signal() rather than returning a negative error from the LSM hook. The signal is acted on before the process returns to userspace, but the in-flight syscall is not itself rejected, so the operation may still take effect. For critical enforcement paths, combining a deny (an Override action that makes the hook return -EPERM) with Sigkill is the belt-and-suspenders approach; Tetragon’s maintainers have documented which actions use which mechanism.
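A toy model makes the difference between the two mechanisms concrete. The Python sketch below is deliberately pessimistic and is not a description of Tetragon internals: Outcome and simulate are hypothetical names, and it assumes that an operation which is not explicitly denied may run to completion even when the caller is killed afterwards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    operation_may_complete: bool   # can the guarded operation still take effect?
    process_survives: bool         # is the caller still alive afterwards?
    return_code: Optional[int]     # value the caller sees, None if it never sees one


def simulate(actions: set[str]) -> Outcome:
    """Toy model of how Override and Sigkill combine on a matched hook."""
    denied = "Override" in actions
    killed = "Sigkill" in actions
    return Outcome(
        # Without a deny, the in-flight operation is allowed to proceed.
        operation_may_complete=not denied,
        process_survives=not killed,
        # A killed process never returns to userspace to read a result.
        return_code=None if killed else (-1 if denied else 0),
    )
```

In this model, simulate({"Sigkill"}) leaves operation_may_complete set to True, while simulate({"Override", "Sigkill"}) both blocks the operation and removes the process. That combined case is the one worth aiming for on critical paths.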


Quick Reference

What you want                   Command
------------------------------  ------------------------------------------------------------------
Is BPF LSM enabled?             cat /sys/kernel/security/lsm (look for bpf)
What LSM programs are loaded?   bpftool prog list | grep lsm
What Tetragon policies exist?   kubectl get tracingpolicies -A
Audit events (before enforce)   tetra getevents --event-types PROCESS_KPROBE
Watch Tetragon enforcement      kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -f
Test a policy safely            Set action: Post before action: Sigkill

Tetragon action   Effect
----------------  ------------------------------------------------
Post              Log event only (audit mode)
Sigkill           Send SIGKILL from kernel context
Override          Return custom error code to syscall caller
FollowFD          Track file descriptor for future hook correlation

LSM hook                    Protects
--------------------------  -----------------------------------------------------
security_bprm_check         exec (block shell spawning)
security_inode_permission   file access (block reads/writes to sensitive paths)
security_socket_connect     outbound connections (block C2 connections)
security_capable            capability escalation (block CAP_SYS_ADMIN attempts)

Key Takeaways

  • Tetragon with LSM hooks enforces inside the kernel at the point of the operation: the hook runs before the kernel performs the action, so a deny or kill does not wait on a detect-and-respond loop in userspace
  • Falco, Datadog, and sidecar-based tools detect events after the syscall returns; this is architectural, not a product limitation — they operate at a layer where the operation has already occurred
  • BPF_PROG_TYPE_LSM attaches eBPF programs directly to Linux Security Module hooks; it needs kernel 5.7+ with BPF LSM enabled, which recent EKS/GKE node images generally provide but which is worth verifying per node
  • Tetragon sends SIGKILL from kernel context using bpf_send_signal() — not from a userspace agent polling an audit log
  • Always run Tetragon policies in Post (audit) mode for 48+ hours before switching to Sigkill — legitimate workloads trigger many of the same LSM hooks that attacks use
  • The combination of kprobe (rich context: pod name, namespace, process tree) and LSM (enforcement) gives Tetragon context-aware kernel enforcement that static profiles (AppArmor, seccomp) cannot provide dynamically

What’s Next

LSM hooks prevent operations in the moment. But after an incident — when enforcement failed, or when you’re doing post-hoc forensics — the question changes: what did this process spawn, what files did it touch, what connections did it make, and in what order? Answering that from logs alone is guesswork. Answering it from kernel-level process lineage is reconstruction.

EP13 covers how eBPF kprobe hooks on fork and exec build a complete, tamper-resistant process tree. Even after the attacker’s process has exited, the record remains — in kernel maps, exported to a persistent store, tied to the pod that ran it.

Next: process lineage with eBPF — reconstructing what happened after the fact
