eBPF Program Types — What’s Actually Running on Your Nodes

eBPF: From Kernel to Cloud, Episode 4
Earlier in this series: What Is eBPF? · The BPF Verifier · eBPF vs Kernel Modules


By Episode 3, we’d covered what eBPF is, why the verifier makes it safe for production, and why it’s replaced kernel modules for observability workloads. What we hadn’t answered — and what a 2am incident eventually forced — is what kind of eBPF programs are actually running on your nodes, and why the difference matters when something breaks.

A pod in production was dropping roughly one in fifty outbound TCP connections. Not all of them — just enough to cause intermittent timeouts in the application logs. NetworkPolicy showed egress allowed. Cilium reported no violations. Running curl manually from inside the pod worked every time.

I spent the better part of three hours eliminating possibilities. DNS. MTU. Node-level conntrack table exhaustion. Upstream firewall rules. Nothing.

Eventually, almost as an afterthought, I ran this:

sudo bpftool prog list

There were two TC programs attached to that pod’s veth interface. One from the current Cilium version. One from the previous version — left behind by a rolling upgrade that hadn’t cleaned up properly. Two programs. Different policy state. One was occasionally dropping packets based on rules that no longer existed in the current policy model.

The answer had been sitting in the kernel the whole time. I just didn’t know where to look.

That incident forced me to actually understand something I’d been hand-waving for two years: eBPF isn’t a single hook. It’s a family of program types, each attached to a different location in the kernel, each seeing different data, each suited for different problems. Understanding the difference is what separates “I run Cilium and Falco” from “I understand what Cilium and Falco are actually doing on my nodes” — and that difference matters when something breaks at 2am.

The Command You Should Run on Your Cluster Right Now

Before getting into the theory, do this:

# See every eBPF program loaded on the node
sudo bpftool prog list

# See every eBPF program attached to a network interface
sudo bpftool net list

On a node running Cilium and Falco, you’ll see something like this:

42: xdp           name cil_xdp_entry       loaded_at 2026-04-01T09:23:41
43: sched_cls     name cil_from_netdev      loaded_at 2026-04-01T09:23:41
44: sched_cls     name cil_to_netdev        loaded_at 2026-04-01T09:23:41
51: cgroup_sock_addr  name cil_sock4_connect loaded_at 2026-04-01T09:23:41
88: raw_tracepoint  name sys_enter          loaded_at 2026-04-01T09:23:55
89: raw_tracepoint  name sys_exit           loaded_at 2026-04-01T09:23:55

Each line is a different program type. Each one fires at a different point in the kernel. The type column — xdp, sched_cls, raw_tracepoint, cgroup_sock_addr — tells you where in the kernel execution path that program is attached and therefore what it can and cannot see.

If you see more programs than you expect on a specific interface — like I did — that’s your first clue.

Why Program Types Exist

The Linux kernel isn’t a single pipeline. Network packets, system calls, file operations, process scheduling — these all run through different subsystems with different execution contexts and different available data.

eBPF lets you attach programs to specific points within those subsystems. The “program type” is the contract: it defines where the hook fires, what data the program receives, and what it’s allowed to do with it. A program designed to process network packets before they hit the kernel stack looks completely different from one designed to intercept system calls across all containers simultaneously.

Most of us will interact with four or five program types through the tools we already run. Understanding what each one actually is — where it sits, what it sees — is what makes you effective when those tools behave unexpectedly.

The Types Behind the Tools You Already Use

TC — Why Cilium Can Tell Which Pod Sent a Packet

TC stands for Traffic Control. It’s where Cilium enforces your NetworkPolicy, and it’s what caused my incident.

TC programs attach to network interfaces — specifically to the ingress and egress directions of the pod’s virtual interface (lxcXXXXX in Cilium’s naming). They fire after the kernel has already processed the packet enough to know its context: which socket created it, which cgroup that socket belongs to. Cgroup maps to container, container maps to pod.

This is the critical piece: TC is how Cilium knows which pod a packet belongs to. Without that cgroup context, per-pod policy enforcement isn’t possible.

# See TC programs on a pod's veth interface
sudo tc filter show dev lxc12345 ingress
sudo tc filter show dev lxc12345 egress

# If you see two entries on the same direction — that's the incident I described
# The priority number (pref 1, pref 2) tells you the order they run

When there are two TC programs on the same interface, the first one to return “drop” wins. The second program never runs. This is why the issue was intermittent rather than consistent — the stale program only matched specific connection patterns.

Fixing it is straightforward once you know what to look for:

# Remove a stale TC filter by its priority number
sudo tc filter del dev lxc12345 egress pref 2

Add this check to your post-upgrade runbook. Cilium upgrades are generally clean but not always.

XDP — Why Cilium Doesn’t Use TC for Everything

If TC is good enough for pod-level policy, why does Cilium also run an XDP program on the node’s main interface? Look at the bpftool prog list output again — there’s an xdp program loaded alongside the TC programs.

XDP fires earlier. Much earlier. Before the kernel allocates any memory for the packet. Before routing. Before connection tracking. Before anything.

The tradeoff is exactly what you’d expect: XDP is fast but context-poor. It sees raw packet bytes. It doesn’t know which pod the packet came from. It can’t read cgroup information because no socket buffer has been allocated yet.

Cilium uses XDP specifically for ClusterIP service load balancing — when a packet arrives at the node destined for a service VIP, XDP rewrites the destination to the actual pod IP in a single map lookup and sends it on its way. No iptables. No conntrack. The work is done before the kernel stack is involved.

There’s a silent failure mode worth knowing about here. XDP runs in one of two modes:

  • Native mode — runs inside the NIC driver itself, before any kernel allocation. This is where the performance comes from.
  • Generic mode — fallback when the NIC driver doesn’t support XDP. Runs later, after sk_buff allocation. No performance benefit over iptables.

If your NIC doesn’t support native XDP, Cilium silently falls back to generic mode. The policy still works — but the performance characteristics you assumed aren’t there.

# Check which XDP mode is active on your node's main interface
ip link show eth0 | grep xdp
# xdpdrv  ← native mode (fast)
# xdpgeneric ← generic mode (no perf benefit)

Most cloud provider instance types with modern Mellanox/Intel NICs support native mode. Worth verifying rather than assuming.

Tracepoints — How Falco Sees Every Container

Falco loads two programs: sys_enter and sys_exit. These are raw tracepoints — they fire on every single system call, from every process, in every container on the node.

Tracepoints are explicitly defined and maintained instrumentation points in the kernel. Unlike hooks that attach to specific internal function names (which can be renamed or inlined between kernel versions), tracepoints are stable interfaces. They’re part of the kernel’s public contract with tooling that wants to instrument it.

This matters operationally. When you patch your nodes — and cloud-managed nodes get patched frequently — tools built on tracepoints keep working. Tools built on kprobes (internal function hooks) may silently stop firing if the function they’re attached to gets renamed or inlined by the compiler in a new kernel build.

# Verify what Falco is actually using
sudo bpftool prog list | grep -E "kprobe|tracepoint"

# Falco's current eBPF driver should show raw_tracepoint entries
# If you see kprobe entries from Falco, you're on the older driver
# Check: falco --version and the driver being loaded at startup

If you’re running Falco on a cluster that gets regular OS patch upgrades and you haven’t verified the driver mode, check it. The older kprobe-based driver has a real failure mode on certain kernel versions.

LSM — How Tetragon Blocks Operations at the Kernel Level

LSM hooks run at the kernel’s security decision points: file opens, socket connections, process execution, capability checks. The defining characteristic is that they can deny an operation. Return an error from an LSM hook and the kernel refuses the syscall before it completes.

This is qualitatively different from observability hooks. kprobes and tracepoints watch. LSM hooks enforce.

When you see Tetragon configured to kill a process attempting a privileged operation, or block a container from writing to a specific path, that’s an LSM hook making the decision inside the kernel — not a sidecar watching traffic, not an admission webhook running before pod creation, not a userspace agent trying to act fast enough. The enforcement is in the kernel itself.

# See if any LSM eBPF programs are active on the node
sudo bpftool prog list | grep lsm

# Verify LSM eBPF support on your kernel (required for Tetragon enforcement mode)
grep CONFIG_BPF_LSM /boot/config-$(uname -r)
# CONFIG_BPF_LSM=y   ← required

The Practical Summary

What’s happening on your node Program type Where to look
Cilium service load balancing XDP ip link show eth0 \| grep xdp
Cilium pod network policy TC (sched_cls) tc filter show dev lxcXXXX egress
Falco syscall monitoring Tracepoint bpftool prog list \| grep tracepoint
Tetragon enforcement LSM bpftool prog list \| grep lsm
Anything unexpected All types bpftool prog list, bpftool net list

The Incident, Revisited

Three hours of debugging. The answer was a stale TC program sitting at priority 2 on a pod’s veth interface, left behind by an incomplete Cilium upgrade.

# What I should have run first
sudo bpftool net list
sudo tc filter show dev lxc12345 egress

Two commands. Thirty seconds. If I’d known that TC programs can stack on the same interface, I’d have started there.

That’s the point of understanding program types — not to write eBPF programs yourself, but to know where to look when the tools you depend on don’t behave the way you expect. The programs are already there, running on your nodes right now. bpftool prog list shows you all of them.

Key Takeaways

  • bpftool prog list and bpftool net list show every eBPF program on a node — run these before anything else when debugging eBPF-based tool behavior
  • TC programs can stack on the same interface; stale programs from incomplete Cilium upgrades cause intermittent drops — check tc filter show after every Cilium upgrade
  • XDP runs before the kernel stack — fastest hook, but no pod identity; Cilium uses it for service load balancing, not pod policy
  • XDP silently falls back to generic mode on unsupported NICs — verify with ip link show | grep xdp
  • Tracepoints are stable across kernel versions; kprobe-based tools may silently break after node OS patches — verify your Falco driver mode
  • LSM hooks enforce at the kernel level — this is what makes Tetragon’s enforcement mode fundamentally different from sidecar-based approaches

What’s Next

Every eBPF program fires, does its work, and exits — but the work always involves data. Counting connections. Tracking processes. Streaming events to a detection engine. In EP05, I’ll cover eBPF maps: the persistent data layer that connects kernel programs to the tools consuming their output. Understanding maps explains a class of production issues — and makes bpftool map dump useful rather than cryptic.

eBPF vs Kernel Modules: An Honest Comparison for K8s Engineers

~2,100 words · Reading time: 8 min · Series: eBPF: From Kernel to Cloud, Episode 3 of 18

In Episode 1 we covered what eBPF is. In Episode 2 we covered why it is safe. The question that comes next is the one most tutorials skip entirely:

If eBPF can do everything a kernel module does for observability, why do kernel modules still exist? And when should you still reach for one?

Most comparisons on this topic are written by people who have used one or the other. I have used both — device driver work from 2012 to 2014 and eBPF in production Kubernetes clusters for the last several years. This is the honest version of that comparison, including the cases where kernel modules are still the right answer.


What Kernel Modules Actually Are

A kernel module is a piece of compiled code that loads directly into the running Linux kernel. Once loaded, it operates with full kernel privileges — the same level of access as the kernel itself. There is no sandbox. There is no safety check. There is no verifier.

This is both the power and the problem.

Kernel modules can do things that nothing else in the Linux ecosystem can do: implement new filesystems, add hardware drivers, intercept and modify kernel data structures, hook into scheduler internals. They are how the kernel extends itself without requiring a recompile or a reboot.

But the operating model is unforgiving:

  • A bug in a kernel module causes an immediate kernel panic — no exceptions, no recovery
  • Modules must be compiled against the exact kernel headers of the running kernel
  • A module that works on RHEL 8 may refuse to load on RHEL 9 without recompilation
  • Loading a module requires root privileges and deliberate coordination in production
  • Debugging a module failure means kernel crash dumps, kdump analysis, and time

I experienced all of these during device driver work. The discipline that environment instils is real — you think very carefully before touching anything, because mistakes are instantaneous and complete.


What eBPF Does Differently

eBPF was not designed to replace kernel modules. It was designed to provide a safe, programmable interface to kernel internals for the specific use cases where modules had always been used but were too dangerous: observability, networking, and security monitoring.

The fundamental difference is the verifier, covered in depth in Episode 2. Before any eBPF program runs, the kernel proves it is safe. Before any kernel module runs, nothing checks anything.

That single architectural decision produces a completely different operational profile:

Property Kernel module eBPF program
Safety check before load None BPF verifier — mathematical proof of safety
A bug causes Kernel panic, immediate Program rejected at load time
Kernel version coupling Compiled per kernel version CO-RE: compile once, run on any kernel 5.4+
Hot load / unload Risky, requires coordination Safe, zero downtime, zero pod restarts
Access scope Full kernel, unrestricted Restricted, granted per program type
Debugging Kernel crash dumps, kdump bpftool, bpftrace, readable error messages
Portability Recompile per distro per version Single binary runs across distros and versions
Production risk High — no safety net Low — verifier enforced before execution

CO-RE: Why Portability Matters More Than Most Engineers Realise

The portability column in that table deserves more than a one-line entry, because it is the operational advantage that compounds over time.

A kernel module written for RHEL 8 ships compiled against 4.18.0-xxx.el8.x86_64 kernel headers. When RHEL 8 moves to a new minor version, the module may need recompilation. When you migrate to RHEL 9 — kernel 5.14 with a completely different ABI in places — the module almost certainly needs a full rewrite of any code that touches kernel internals that changed between versions.

If you are running Falco with its kernel module driver and you upgrade a node from Ubuntu 20.04 to 22.04, Falco needs a pre-built module for your exact new kernel or it needs to compile one. If the pre-built is not available and compilation fails — no runtime security monitoring until it is resolved.

eBPF with CO-RE works differently. CO-RE (Compile Once, Run Everywhere) uses the kernel’s embedded BTF (BPF Type Format) information to patch field offsets and data structure layouts at load time to match the running kernel. The eBPF program was compiled once, against a reference kernel. When it loads on a different kernel, libbpf reads the BTF data from /sys/kernel/btf/vmlinux and fixes up the relocations automatically.

The practical result: a Cilium or Falco binary built six months ago loads and runs correctly on a node you just upgraded to a newer kernel version — without any module rebuilding, without any intervention, without any downtime.

In a Kubernetes environment where node images update regularly — especially on managed services like EKS, GKE, and AKS — this is not a minor convenience. It is the difference between eBPF tooling that survives an upgrade cycle and kernel module tooling that breaks one.


Security Implications: Container Escape and Privilege Escalation

The security difference between the two approaches matters specifically for container environments, and it goes beyond the verifier’s protection of your own nodes.

Kernel modules as an attack surface

Historically, kernel module vulnerabilities have been a primary vector for container escape. The attack pattern is straightforward: exploit a vulnerability in a loaded kernel module to gain kernel-level code execution, then use that access to break out of the container namespace into the host. Several high-profile CVEs over the past decade have followed this pattern.

The risk is compounded in environments that load third-party kernel modules — hardware drivers, filesystem modules, observability agents using the kernel module approach — because each additional module is an additional attack surface at the highest privilege level on the system.

eBPF’s security boundaries

eBPF does not eliminate the attack surface entirely, but it constrains it in important ways.

First, eBPF programs cannot leak kernel memory addresses to userspace. This is verifier-enforced and closes the class of KASLR bypass attacks that kernel module vulnerabilities have historically enabled.

Second, eBPF programs are sandboxed by design. They cannot access arbitrary kernel memory, cannot call arbitrary kernel functions, and cannot modify kernel data structures they were not explicitly granted access to. A vulnerability in an eBPF program is contained within that sandbox.

Third, the program type system controls what each eBPF program can see and do. A kprobe program watching syscalls cannot suddenly start modifying network packets. The scope is fixed at load time by the program type and verified by the kernel.

For EKS specifically: Falco running in eBPF mode on your nodes is not a kernel module that could be exploited for container escape. It is a verifier-checked program with a constrained access scope. The tool designed to detect container escapes is not itself a container escape vector — which is the correct security architecture.

Audit and visibility

eBPF programs are auditable in ways that kernel modules are not. You can list every eBPF program currently loaded on a node:

$ bpftool prog list
14: kprobe  name sys_enter_execve  tag abc123...  gpl
    loaded_at 2025-03-01T07:30:00+0000  uid 0
    xlated 240B  jited 172B  memlock 4096B  map_ids 3,4

27: cgroup_skb  name egress_filter  tag def456...  gpl
    loaded_at 2025-03-01T07:30:01+0000  uid 0

Every program is listed with its load time, its type, its tag (a hash of the program), and the maps it accesses. You can audit exactly what is running in your kernel at any point. Kernel modules offer no equivalent — lsmod tells you what is loaded but nothing about what it is actually doing.


EKS and Managed Kubernetes: Where the Difference Is Most Visible

The eBPF vs kernel module distinction plays out most clearly in managed Kubernetes environments, because you do not control when nodes upgrade.

On EKS, when AWS releases a new optimised AMI for a node group and you update it, your nodes are replaced. Any kernel module-based tooling on those nodes needs pre-built modules for the new kernel, or it needs to compile them at node startup, or it fails. AWS does not provide the kernel source for EKS-optimised AMIs in the same way a standard distribution does, which makes module compilation at runtime unreliable.

This is precisely why the EKS 1.33 migration covered in the EKS 1.33 post was painful for Rocky Linux: it involved kernel-level networking behaviour that had been assumed stable. When the kernel networking stack changed, everything built on top of those assumptions broke.

eBPF-based tooling on EKS does not have this problem, provided the node OS ships with BTF enabled — which Amazon Linux 2023 and Ubuntu 22.04 EKS-optimised AMIs do. Cilium and Falco survive node replacements without any module rebuilding because CO-RE handles the kernel version differences automatically.

For GKE and AKS the story is similar. Both use node images with BTF enabled on current versions, and both upgrade nodes on a managed schedule that is difficult to predict precisely. eBPF tooling survives this. Kernel module tooling fights it.


When You Should Still Use Kernel Modules

eBPF is not the right answer for every use case. Kernel modules remain the correct tool when:

You are implementing hardware support. Device drivers for new hardware still require kernel modules. eBPF cannot provide the low-level hardware interrupt handling, DMA operations, or hardware register access that a device driver needs. If you are bringing up a new network interface card, storage controller, or GPU, you are writing a kernel module.

You need to modify kernel behaviour, not just observe it. eBPF can observe and filter. It can drop packets, block syscalls via LSM hooks, and redirect traffic. But it cannot fundamentally change how the kernel handles a syscall, implement a new scheduling algorithm from scratch, or add a new filesystem type. Those changes require kernel modules or upstream kernel patches.

You are on a kernel older than 5.4. Without BTF and CO-RE, eBPF programs must be compiled per kernel version — which largely eliminates the portability advantage. On RHEL 7 or very old Ubuntu LTS versions still in production, kernel modules may be the more practical path for instrumentation work, though migrating the underlying OS is a better long-term answer.

You need capabilities the eBPF verifier rejects. The verifier’s safety constraints occasionally reject programs that are logically safe but that the verifier cannot prove safe statically. Complex loops, large stack allocations, and certain pointer arithmetic patterns hit verifier limits. In these edge cases, a kernel module can do what the verifier would not allow. These situations are rare and becoming rarer as the verifier improves across kernel versions.


The Practical Decision Framework

For most engineers reading this — Linux admins, DevOps engineers, SREs managing Kubernetes clusters — the decision is straightforward:

  • Observability, security monitoring, network policy, performance profiling on Linux 5.4+ → eBPF
  • Hardware drivers, new kernel subsystems, or kernels older than 5.4 → kernel modules
  • Production Kubernetes on EKS, GKE, or AKS → eBPF, always, because CO-RE survives managed upgrades and kernel modules do not

The overlap between the two technologies — the use cases where both could work — has been shrinking for five years and continues to shrink as the verifier becomes more capable and CO-RE becomes more widely supported. The direction of travel is clear.

Kernel modules are a precision instrument for modifying kernel behaviour. eBPF is a safe, portable interface for observing and influencing it. In 2025, if you are reaching for a kernel module to instrument a production system, there is almost certainly a better path.


Up Next

Episode 4 covers the five things eBPF can observe that no other tool can — without agents, without sidecars, and without any changes to your application code. If you are running production Kubernetes and want to understand what true zero-instrumentation observability looks like, that is the post.

The full series is on LinkedIn — search #eBPFSeries — and all episodes are indexed on linuxcent.com under the eBPF Series tag.


Further Reading


Questions or corrections? Reach me on LinkedIn. If this was useful, the full series index is on linuxcent.com — search the eBPF Series tag for all episodes.

What Is eBPF? A Plain-English Guide for Linux and Kubernetes Engineers

~1,900 words · Reading time: 7 min · Series: eBPF: From Kernel to Cloud, Episode 1 of 18

Your Linux kernel has had a technology built into it since 2014 that most engineers working with Linux every day have never looked at directly. You have almost certainly been using it — through Cilium, Falco, Datadog, or even systemd — without knowing it was there.

This post is the plain-English introduction to eBPF that I wished existed when I first encountered it. No kernel engineering background required. No bytecode, no BPF maps, no JIT compilation. Just a clear answer to the question every Linux admin and DevOps engineer eventually asks: what actually is eBPF, and why does it matter for the infrastructure I run every day?


First: Forget the Name

eBPF stands for extended Berkeley Packet Filter. It is one of the most misleading names in computing for what the technology actually does.

The original BPF was a 1992 mechanism for filtering network packets — the engine behind tcpdump. The extended version, introduced in Linux 3.18 (2014) and significantly matured through Linux 5.x, is a completely different technology. It is no longer just about packets. It is no longer just about filtering.

Forget the name. Here is what eBPF actually is:

eBPF lets you run small, safe programs directly inside the Linux kernel — without writing a kernel module, without rebooting, and without modifying your applications.

That is the complete definition. Everything else is implementation detail. The one-liner above is what matters for how you use it day to day.


What the Linux Kernel Can See That Nothing Else Can

To understand why eBPF is significant, you need to understand what the Linux kernel already sees on every server and every Kubernetes node you run.

The kernel is the lowest layer of software on your machine. Every action that happens — every file opened, every process started, every network packet sent — passes through the kernel. That means it has a complete, real-time view of everything:

  • Every syscall — every open(), execve(), connect(), write() from every process in every container on the node, in real time
  • Every network packet — source, destination, port, protocol, bytes, and latency for every pod-to-pod and pod-to-external connection
  • Every process event — every fork, exec, and exit, including processes spawned inside containers that your container runtime never reports
  • Every file access — which process opened which file, when, and with what permissions, across all workloads on the node simultaneously
  • CPU and memory usage — per-process CPU time, function-level latency, and memory allocation patterns without profiling agents

The kernel has always had this visibility. The problem was that there was no safe, practical way to access it without writing kernel modules — which are complex, kernel version-specific, and genuinely dangerous to run in production. eBPF is the safe, practical way to access it.


The Problem eBPF Solves — A Real Kubernetes Scenario

Here is a situation every Kubernetes engineer has faced. A production pod starts behaving strangely — elevated CPU, slow responses, occasional connection failures. You want to understand what is happening at a low level: what syscalls is it making, what network connections is it opening, is something spawning unexpected processes?

The old approaches and their problems

Restart the pod with a debug sidecar. You lose the current state immediately. The issue may not reproduce. You have modified the workload.

Run strace inside the container via kubectl exec. strace uses ptrace, which adds 50–100% CPU overhead to the traced process and is unavailable in hardened containers. You are tracing one process at a time with no cluster-wide view.

Poll /proc with a monitoring agent. Snapshot-based. Any event that happens between polls is invisible. A process that starts, does something, and exits between intervals is completely missed.

The eBPF approach

# Use a debug pod on the node — no changes to your workload
$ kubectl debug node/your-node -it --image=cilium/hubble-cli

# Real-time kernel events from every container on this node:
sys_enter_execve  pid=8821  comm=sh    args=["/bin/sh","-c","curl http://..."]
sys_enter_connect pid=8821  comm=curl  dst=203.0.113.42:443
sys_enter_openat  pid=8821  comm=curl  path=/etc/passwd

# Something inside the pod spawned a shell, made an outbound connection,
# and read /etc/passwd — all visible without touching the pod.

Real-time visibility. No overhead on your workload. Nothing restarted. Nothing modified. That is what eBPF makes possible.


Tools You Are Probably Already Running on eBPF

eBPF is not a standalone product — it is the foundation that many tools in the cloud-native ecosystem are built on. You may already be running eBPF on your nodes without thinking about it explicitly.

Tool What eBPF does for it Without eBPF
Cilium Replaces kube-proxy and iptables with kernel-level packet routing. 2–3× faster at scale. iptables rules — linear lookup, degrades with service count
Falco Watches every syscall in every container for security rule violations. Sub-millisecond detection. Kernel module (risky) or ptrace (high overhead)
Tetragon Runtime security enforcement — can kill a process or drop a network packet at the kernel level. No practical alternative at this detection speed
Datadog Agent Network performance monitoring and universal service monitoring without application code changes. Language-specific agents injected into application code
systemd cgroup resource accounting and network traffic control on your Linux nodes. Legacy cgroup v1 interfaces with limited visibility

eBPF vs the Old Ways

Before eBPF, getting deep visibility into a running Linux system meant choosing between three approaches, each with a significant trade-off:

Approach Visibility Cost Production safe?
Kernel modules Full kernel access One bug = kernel panic. Version-specific, must recompile per kernel update. No
ptrace / strace One process at a time 50–100% CPU overhead on the traced process. Unusable in production. No
Polling /proc Snapshots only Events between polls are invisible. Short-lived processes are missed entirely. Partial
eBPF Full kernel visibility 1–3% overhead. Verifier-guaranteed safety. Real-time stream, not polling. Yes

Is It Safe to Run in Production?

This is always the first question from any experienced Linux admin, and it is exactly the right question to ask. The answer is yes — and the reason is the BPF verifier.

Before any eBPF program is allowed to run on your node, the Linux kernel runs it through a built-in static safety analyser. This analyser examines every possible execution path and asks: could this program crash the kernel, loop forever, or access memory it should not?

If the answer is yes — or even maybe — the program is rejected at load time. It never runs.

This is fundamentally different from kernel modules. A kernel module loads immediately with no safety check. If it has a bug, you find out at runtime — usually as a kernel panic. An eBPF program that would cause a panic is rejected before it ever loads. The safety guarantee is mathematical, not hopeful.

Episode 2 of this series covers the BPF verifier in full: what it checks, how it makes Cilium and Falco safe on your production nodes, and what questions to ask eBPF tool vendors about their implementation.


Common Misconceptions

eBPF is not a specific tool or product. It is a kernel technology — a platform. Cilium, Falco, Tetragon, and Pixie are tools built on top of it. When a vendor says “we use eBPF”, they mean they build on this kernel capability, not that they share a single implementation.

eBPF is not only for networking. The Berkeley Packet Filter name suggests networking, but modern eBPF covers security, observability, performance profiling, and tracing. The networking origin is historical, not a limitation.

eBPF is not only for Kubernetes. It works on any Linux system running kernel 4.9+, including bare metal servers, Docker hosts, and VMs. K8s is the most popular deployment target because of the observability challenges at scale, but it is not a requirement.

You do not need to write eBPF programs to benefit from eBPF. Most Linux admins and DevOps engineers will use eBPF through tools like Cilium, Falco, and Datadog — never writing a line of BPF code themselves. This series covers the writing side later. Understanding what eBPF is makes you a significantly better user of these tools today.


Kernel Version Requirements

eBPF is a Linux kernel feature. The capabilities available depend directly on the kernel version running on your nodes. Run uname -r on any node to check.

Kernel What becomes available
4.9+ Basic eBPF support. Tracing, socket filtering. Most production systems today meet this minimum.
5.4+ BTF (BPF Type Format) and CO-RE — programs that adapt to different kernel versions without recompile. Recommended minimum for production tooling.
5.8+ Ring buffers for high-performance event streaming. Global variables. The target kernel for Cilium, Falco, and Tetragon full feature support.
6.x Open-coded iterators, improved verifier, LSM security enforcement hooks. Amazon Linux 2023 and Ubuntu 22.04+ ship 5.15 or newer and are fully eBPF-ready.

EKS users: Amazon Linux 2023 AMIs ship with kernel 6.1+ and support the full modern eBPF feature set out of the box. If you are still on AL2, the migration also resolves the NetworkManager deprecation issues covered in the EKS 1.33 post.


The Bottom Line

eBPF is the answer to a question Linux engineers have been asking for years: how do I get deep visibility into what is happening on my servers and Kubernetes nodes — without adding massive overhead, injecting sidecars, or risking a kernel panic?

The answer is: run small, safe programs at the kernel level, where everything is already visible. Let the BPF verifier guarantee those programs are safe before they run. Stream the results to your observability tools through shared memory maps.

The tools you already use — Cilium for networking, Falco for security, Datadog for APM — are built on this foundation. Understanding eBPF means understanding why those tools work the way they do, what they can and cannot see, and how to evaluate new tools that claim to use it.

Every eBPF-based tool you run on your nodes passed through the BPF verifier before it touched your cluster. Episode 2 covers exactly what that means — and why it matters for your infrastructure decisions.


Further Reading


Questions or corrections? Reach me on LinkedIn. If this was useful, the full series index is on linuxcent.com — search the eBPF Series tag for all episodes.