<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>AWS EKS Archives - Linuxcent</title>
	<atom:link href="https://linuxcent.com/tag/aws-eks/feed/" rel="self" type="application/rss+xml" />
	<link>https://linuxcent.com/tag/aws-eks/</link>
	<description>Infrastructure security, from the kernel up.</description>
	<lastBuildDate>Sat, 28 Feb 2026 17:46:23 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://linuxcent.com/wp-content/uploads/2026/04/favicon-512x512-1-150x150.png</url>
	<title>AWS EKS Archives - Linuxcent</title>
	<link>https://linuxcent.com/tag/aws-eks/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">211632295</site>	<item>
		<title>EKS 1.33 Upgrade Blocker: Fixing Dead Nodes &#038; NetworkManager on Rocky Linux</title>
		<link>https://linuxcent.com/eks-1-33-networkmanager-systemd-networkd-migration-fix/</link>
					<comments>https://linuxcent.com/eks-1-33-networkmanager-systemd-networkd-migration-fix/#respond</comments>
		
		<dc:creator><![CDATA[Vamshi Krishna Santhapuri]]></dc:creator>
		<pubDate>Tue, 17 Feb 2026 19:42:18 +0000</pubDate>
				<category><![CDATA[Bash]]></category>
		<category><![CDATA[Devops]]></category>
		<category><![CDATA[Linux Tutorials]]></category>
		<category><![CDATA[SRE]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[AMI Build]]></category>
		<category><![CDATA[AWS EKS]]></category>
		<category><![CDATA[AWS networking]]></category>
		<category><![CDATA[cloud-init]]></category>
		<category><![CDATA[CoreDNS]]></category>
		<category><![CDATA[Kubernetes 1.33]]></category>
		<category><![CDATA[NetworkManager]]></category>
		<category><![CDATA[Packer]]></category>
		<category><![CDATA[resolv.conf]]></category>
		<category><![CDATA[Rocky Linux EKS]]></category>
		<category><![CDATA[systemd-networkd]]></category>
		<category><![CDATA[systemd-networkd migration]]></category>
		<category><![CDATA[systemd-resolved]]></category>
		<category><![CDATA[VPC]]></category>
		<guid isPermaLink="false">https://linuxcent.com/?p=1400</guid>

					<description><![CDATA[<p>The EKS 1.33+ NetworkManager Trap: A Complete systemd-networkd Migration Guide for Rocky &#038; Alma Linux TL;DR: The Blocker: Upgrading to EKS 1.33+ is breaking worker nodes, especially on free community distributions like Rocky Linux and AlmaLinux. Boot times are spiking past 6 minutes, and nodes are failing to get IPs. The Root Cause: AWS is ... <a title="EKS 1.33 Upgrade Blocker: Fixing Dead Nodes &#038; NetworkManager on Rocky Linux" class="read-more" href="https://linuxcent.com/eks-1-33-networkmanager-systemd-networkd-migration-fix/" aria-label="Read more about EKS 1.33 Upgrade Blocker: Fixing Dead Nodes &#038; NetworkManager on Rocky Linux">Read more</a></p>
<p>The post <a href="https://linuxcent.com/eks-1-33-networkmanager-systemd-networkd-migration-fix/">EKS 1.33 Upgrade Blocker: Fixing Dead Nodes &#038; NetworkManager on Rocky Linux</a> appeared first on <a href="https://linuxcent.com">Linuxcent</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h1>The EKS 1.33+ NetworkManager Trap: A Complete systemd-networkd Migration Guide for Rocky &#038; Alma Linux</h1>
<h2>TL;DR:</h2>
<ul>
<li><strong>The Blocker:</strong> Upgrading to EKS 1.33+ is breaking worker nodes, especially on free community distributions like Rocky Linux and AlmaLinux. Boot times are spiking past 6 minutes, and nodes are failing to get IPs.</li>
<li><strong>The Root Cause:</strong> AWS is deprecating <code class="" data-line="">NetworkManager</code> in favor of <code class="" data-line="">systemd-networkd</code>. However, ripping out NetworkManager can leave stale VPC IPs in <code class="" data-line="">/etc/resolv.conf</code>. Combined with the <code class="" data-line="">systemd-resolved</code> stub listener (<code class="" data-line="">127.0.0.53</code>) and a few configuration missteps, it causes a total internal DNS collapse where CoreDNS pods crash and burn.</li>
<li><strong>The Subtext:</strong> AWS is pushing this modern networking standard hard. In practice, the change puts Rocky/Alma AMIs at a real disadvantage and quietly steers frustrated engineers toward Amazon Linux 2023 (AL2023) as the &#8220;easy&#8221; way out.</li>
<li><strong>The &#8220;Super Hack&#8221;:</strong> Automate the clean removal of NetworkManager, bypass the DNS stub listener by symlinking <code class="" data-line="">/etc/resolv.conf</code> directly to the <code class="" data-line="">systemd-resolved</code> uplink file (<code class="" data-line="">/run/systemd/resolve/resolv.conf</code>), and enforce strict state validation during the AMI build.</li>
</ul>
<hr>
<p>If you’ve been in the DevOps and SRE space long enough, you know that vendor upgrades rarely go exactly as planned. But lately, if you are running enterprise Linux distributions like Rocky Linux or AlmaLinux on AWS EKS, you might have noticed the ground silently shifting beneath your feet.</p>
<p>With the push to EKS 1.33+, AWS is mandating a shift toward modern, cloud-native networking standards. Specifically, they are phasing out the legacy <code class="" data-line="">NetworkManager</code> in favor of <code class="" data-line="">systemd-networkd</code>.</p>
<p>While this makes sense on paper, the transition for community distributions has been incredibly painful. AWS support couldn&#8217;t resolve our issues, and my SRE team had practically given up, officially halting our EKS upgrade process. It’s hard not to notice that this massive, undocumented friction in Rocky Linux and AlmaLinux conveniently positions AWS&#8217;s own Amazon Linux 2023 (AL2023) as the path of least resistance.</p>
<p>I’m hoping the incredible maintainers at free distributions like Rocky Linux and AlmaLinux take note of this architectural shift. But until the official AMIs catch up, we have to fix it ourselves. Here is the exact breakdown of the cascading failure that brought our clusters to their knees, and the &#8220;super hack&#8221; script we used to fix it.</p>
<h2>The Investigation: A Cascading SRE Failure</h2>
<p>When our EKS 1.33+ worker nodes started booting with 6+ minute latencies or outright failing to join the cluster, I pulled apart our Rocky Linux AMIs to monitor the network startup sequence. What I found was a classic cascading failure of services, stale data, and human error.</p>
<h3>Step 1: The Race Condition</h3>
<p>Initially, the problem was a violent tug-of-war. <code class="" data-line="">NetworkManager</code> was not correctly disabled by default, and <code class="" data-line="">cloud-init</code> was still trying to invoke it. This conflicted directly with <code class="" data-line="">systemd-networkd</code>, paralyzing the network stack during boot. To fix this, we initially disabled the <code class="" data-line="">NetworkManager</code> service and removed it from <code class="" data-line="">cloud-init</code>.</p>
<h3>Step 2: The Stale Data Landmine</h3>
<p>Here is where the trap snapped shut. Because <code class="" data-line="">NetworkManager</code> was historically the primary service responsible for dynamically generating and updating <code class="" data-line="">/etc/resolv.conf</code>, completely disabling it stopped that file from being updated.</p>
<p>When we baked the new AMI via Packer, <code class="" data-line="">/etc/resolv.conf</code> was orphaned and preserved the old configuration: specifically, a stale VPC resolver address (the <code class="" data-line="">.2</code> address, which is the VPC network base plus two) from the temporary subnet where the AMI build ran.</p>
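<p>One way to catch this kind of leftover is to compare the nameservers in the file against the resolver the current VPC actually provides. The sketch below is a hypothetical helper, not part of the original script: the function names and the example CIDR are assumptions, and it relies on the AWS convention that the Amazon-provided resolver sits at the VPC network base plus two.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the original script).

# Print the Amazon-provided resolver for a VPC CIDR: network base plus two.
# Assumes the CIDR is written with its network address, e.g. 10.20.0.0/16.
vpc_resolver_ip() {
    local base="${1%/*}"        # drop the /prefix length
    local prefix="${base%.*}"   # first three octets
    local last="${base##*.}"    # last octet
    echo "${prefix}.$((last + 2))"
}

# Print every nameserver in FILE that is not the expected resolver.
# Empty output means the file carries no stale entries.
stale_nameservers() {
    local file="$1" expected="$2"
    awk -v want="$expected" '$1 == "nameserver" && $2 != want { print $2 }' "$file"
}

# Example (10.20.0.0/16 is a made-up VPC CIDR):
#   stale_nameservers /etc/resolv.conf "$(vpc_resolver_ip 10.20.0.0/16)"
```

<p>Any address this prints is a resolver the node was never supposed to use in its current VPC, which is exactly the situation described above.</p>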
<h3>Step 3: The Human Element</h3>
<p>We&#8217;ve all been there: during a stressful outage, wires get crossed. While troubleshooting the dead nodes, one of our SREs mistakenly stopped the <code class="" data-line="">systemd-resolved</code> service entirely, thinking it was conflicting with something else.</p>
<h3>Step 4: Total DNS Collapse</h3>
<p>When the new AMI booted up and joined the EKS node group, the environment was a disaster zone:</p>
<ol>
<li><code class="" data-line="">NetworkManager</code> was dead (intentional).</li>
<li><code class="" data-line="">systemd-resolved</code> was stopped (accidental).</li>
<li><code class="" data-line="">/etc/resolv.conf</code> contained a dead, stale IP address from a completely different subnet.</li>
</ol>
<p>When <code class="" data-line="">kubelet</code> started, it dutifully read the host&#8217;s broken <code class="" data-line="">/etc/resolv.conf</code> and handed it to the CoreDNS pods. CoreDNS tried to forward upstream queries to the stale IP, failed, and started crash-looping. Internal DNS resolution (<code class="" data-line="">service.namespace.svc.cluster.local</code>) totally collapsed. The cluster was dead in the water.</p>
<figure class="wp-block-image size-large">
<img fetchpriority="high" decoding="async" src="https://linuxcent.com/wp-content/uploads/2026/02/Untitled-Diagram-EKS-Cascading-Failure.jpg" alt="Flowchart showing the cascading DNS failure in EKS worker nodes" width="221" height="661" class="alignnone size-full wp-image-1410" srcset="https://linuxcent.com/wp-content/uploads/2026/02/Untitled-Diagram-EKS-Cascading-Failure.jpg 221w, https://linuxcent.com/wp-content/uploads/2026/02/Untitled-Diagram-EKS-Cascading-Failure-100x300.jpg 100w" sizes="(max-width: 221px) 100vw, 221px" /><figcaption>The perfect storm: How stale data and disabled services led to a total CoreDNS collapse.</figcaption></figure>
<hr>
<h2>Linux Internals: How systemd Manages DNS (And Why CoreDNS Breaks)</h2>
<p>To understand how to permanently fix this, we need to look at how <code class="" data-line="">systemd</code> actually handles DNS under the hood. When using <code class="" data-line="">systemd-networkd</code>, <code class="" data-line="">resolv.conf</code> management is handled through a strict partnership with <code class="" data-line="">systemd-resolved</code>. </p>
<figure class="wp-block-image size-large">
<img decoding="async" src="https://linuxcent.com/wp-content/uploads/2026/02/Untitled-Diagram-Page-2-1024x299.jpg" alt="Architecture diagram of systemd-networkd and systemd-resolved D-Bus communication" width="1024" height="299" class="alignnone size-large wp-image-1411" srcset="https://linuxcent.com/wp-content/uploads/2026/02/Untitled-Diagram-Page-2-1024x299.jpg 1024w, https://linuxcent.com/wp-content/uploads/2026/02/Untitled-Diagram-Page-2-300x87.jpg 300w, https://linuxcent.com/wp-content/uploads/2026/02/Untitled-Diagram-Page-2-768x224.jpg 768w, https://linuxcent.com/wp-content/uploads/2026/02/Untitled-Diagram-Page-2.jpg 1101w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption>How systemd collects network data and the critical symlink choice that dictates EKS DNS health.</figcaption></figure>
<p>Here is how the flow works: <code class="" data-line="">systemd-networkd</code> collects network and DNS information (from DHCP, Router Advertisements, or static configs) and pushes it to <code class="" data-line="">systemd-resolved</code> via D-Bus. To manage your DNS resolution effectively, you must configure the <code class="" data-line="">/etc/resolv.conf</code> symbolic link to match your desired mode of operation. You have three choices:</p>
<h3>1. The &#8220;Recommended&#8221; Local DNS Stub (The EKS Killer)</h3>
<p>By default, systemd recommends using <code class="" data-line="">systemd-resolved</code> as a local DNS cache and manager, providing features like DNS-over-TLS and mDNS.</p>
<ul>
<li><strong>The Symlink:</strong> <code class="" data-line="">ln -sf /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf</code></li>
<li><strong>Contents:</strong> Points to <code class="" data-line="">127.0.0.53</code> as the only nameserver.</li>
<li><strong>The Problem:</strong> This is a disaster for Kubernetes. If the kubelet passes <code class="" data-line="">127.0.0.53</code> to CoreDNS, CoreDNS sends its upstream queries to the loopback interface inside its own pod network namespace, where no resolver is listening, blackholing all cluster DNS.</li>
</ul>
<h3>2. Direct Uplink DNS (The &#8220;Super Hack&#8221; Solution)</h3>
<p>This mode bypasses the local stub entirely. The system lists the actual upstream DNS servers (e.g., your AWS VPC nameservers) discovered by <code class="" data-line="">systemd-networkd</code> directly in the file.</p>
<ul>
<li><strong>The Symlink:</strong> <code class="" data-line="">ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf</code></li>
<li><strong>Contents:</strong> Lists all actual VPC DNS servers currently known to <code class="" data-line="">systemd-resolved</code>.</li>
<li><strong>The Benefit:</strong> CoreDNS gets the real AWS VPC nameservers, allowing it to route external queries correctly while managing internal cluster resolution perfectly.</li>
</ul>
<h3>3. Static Configuration (Manual)</h3>
<p>If you want to manage DNS manually without systemd modifying the file, you remove the symlink (<code class="" data-line="">rm /etc/resolv.conf</code>) and create a regular file in its place. <code class="" data-line="">systemd-networkd</code> still receives DNS info from DHCP, but <code class="" data-line="">systemd-resolved</code> won&#8217;t touch this file, which makes this mode a poor fit for dynamic cloud environments.</p>
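<p>To tell which of the three modes a node is actually in, you can inspect where <code class="" data-line="">/etc/resolv.conf</code> points. This is a minimal sketch; the function name and the mode labels are illustrative assumptions, while the two symlink targets are the ones listed above.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the resolv.conf mode by its symlink target.
# Prints one of: stub, uplink, static, other, missing.
resolv_conf_mode() {
    local path="${1:-/etc/resolv.conf}"

    if [ -L "$path" ]; then
        # readlink prints the target exactly as stored in the symlink.
        case "$(readlink "$path")" in
            */systemd/resolve/stub-resolv.conf) echo "stub" ;;    # 127.0.0.53 stub
            */systemd/resolve/resolv.conf)      echo "uplink" ;;  # real upstream servers
            *)                                  echo "other" ;;   # some other manager
        esac
    elif [ -f "$path" ]; then
        echo "static"    # regular file, nobody updates it
    else
        echo "missing"
    fi
}

# Example: resolv_conf_mode /etc/resolv.conf
```

<p>On an EKS worker node, anything other than <code class="" data-line="">uplink</code> deserves a second look before the node joins the cluster.</p>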
<hr>
<h2>The Solution: A Surgical systemd Cutover</h2>
<p>Knowing the internals, the path forward is clear. We needed to not only remove the legacy stack but explicitly rewire the DNS resolution to the <strong>Direct Uplink</strong> to prevent the stale data trap and bypass the notorious <code class="" data-line="">127.0.0.53</code> stub listener.</p>
<p>Here are the exact steps we took to reach that state:</p>
<ol>
<li><strong>Lock down <code class="" data-line="">cloud-init</code></strong> so it stops triggering legacy network services.</li>
<li><strong>Completely mask <code class="" data-line="">NetworkManager</code></strong> to ensure it never wakes up.</li>
<li><strong>Ensure <code class="" data-line="">systemd-resolved</code> is enabled and running</strong>, but with the <code class="" data-line="">DNSStubListener</code> explicitly disabled (<code class="" data-line="">DNSStubListener=no</code>) so nothing listens on <code class="" data-line="">127.0.0.53</code> and the stub address can never leak into a pod.</li>
<li><strong>Destroy the stale <code class="" data-line="">/etc/resolv.conf</code></strong> and create a symlink to the <strong>Direct Uplink</strong> (<code class="" data-line="">ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf</code>).</li>
<li><strong>Reconfigure and restart <code class="" data-line="">systemd-networkd</code></strong>.</li>
</ol>
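<p>The five steps above can be sketched as a single function. This is a hedged illustration, not the contents of <code class="" data-line="">eks-production-ami-prep.sh</code>: the function names, the drop-in file names, the <code class="" data-line="">ROOT</code> and <code class="" data-line="">DRY_RUN</code> variables, and the use of the cloud-init <code class="" data-line="">renderers</code> setting to keep cloud-init away from NetworkManager are all assumptions. Run it only inside an image build, never over a live SSH session on a production node.</p>

```shell
#!/usr/bin/env bash
# Hypothetical cutover sketch. ROOT lets you rehearse against a scratch
# directory; DRY_RUN=1 prints systemctl calls instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "DRY-RUN: $*"
    else
        "$@"
    fi
}

networkd_cutover() {
    local root="${ROOT:-}"

    # 1. Ask cloud-init to render network config for systemd-networkd only
    #    (assumed approach to "locking down" cloud-init).
    mkdir -p "$root/etc/cloud/cloud.cfg.d"
    printf 'system_info:\n  network:\n    renderers: [networkd]\n' \
        > "$root/etc/cloud/cloud.cfg.d/99-networkd.cfg"

    # 2. Stop NetworkManager and mask it so nothing can start it again.
    run systemctl disable --now NetworkManager.service
    run systemctl mask NetworkManager.service

    # 3. Keep systemd-resolved running, but turn off the 127.0.0.53 stub.
    mkdir -p "$root/etc/systemd/resolved.conf.d"
    printf '[Resolve]\nDNSStubListener=no\n' \
        > "$root/etc/systemd/resolved.conf.d/10-no-stub.conf"
    run systemctl enable systemd-resolved.service
    run systemctl restart systemd-resolved.service

    # 4. Replace the stale file with a symlink to the direct uplink list.
    rm -f "$root/etc/resolv.conf"
    ln -s /run/systemd/resolve/resolv.conf "$root/etc/resolv.conf"

    # 5. Enable systemd-networkd and restart it so it pushes fresh DNS data.
    run systemctl enable systemd-networkd.service
    run systemctl restart systemd-networkd.service
}

# Rehearsal example:
#   ROOT=/tmp/ami-root DRY_RUN=1 networkd_cutover
```

<p>Rehearsing with <code class="" data-line="">ROOT</code> and <code class="" data-line="">DRY_RUN=1</code> first lets you inspect the generated files and the order of the service calls before anything touches a real image.</p>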
<blockquote><p>
<strong>Pro-Tip for Debugging:</strong> To ensure <code class="" data-line="">systemd-networkd</code> is successfully pushing DNS info to the resolver, verify your <code class="" data-line="">.network</code> files in <code class="" data-line="">/etc/systemd/network/</code>. Ensure <code class="" data-line="">UseDNS=yes</code> (which is the default) is set in the <code class="" data-line="">[DHCPv4]</code> section. You can always run <code class="" data-line="">resolvectl status</code> to see exactly which DNS servers are currently assigned to each interface over D-Bus!
</p></blockquote>
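<p>If you would rather script that check than eyeball each file, a small helper can flag any <code class="" data-line="">.network</code> file that turns <code class="" data-line="">UseDNS</code> off. This is a sketch under assumptions: the function name is hypothetical, and it only looks at plain <code class="" data-line="">UseDNS=</code> lines inside the <code class="" data-line="">[DHCPv4]</code> section, without resolving drop-in overrides.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed unless FILE disables UseDNS under [DHCPv4].
# A missing UseDNS line counts as enabled, because yes is the default.
dhcp_dns_enabled() {
    awk '
        /^\[/ { section = $0 }                      # remember the current section
        section == "[DHCPv4]" && /^UseDNS=/ {
            value = tolower(substr($0, 8))          # text after "UseDNS="
            if (value == "no" || value == "false" || value == "0" || value == "off")
                bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Example:
#   for f in /etc/systemd/network/*.network; do
#       dhcp_dns_enabled "$f" || echo "UseDNS disabled in $f"
#   done
```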
<h2>The Automation: Production AMI Prep Script</h2>
<p>Manual hacks are great for debugging, but SRE is about repeatable automation. We&#8217;ve open-sourced the <code class="" data-line="">eks-production-ami-prep.sh</code> script to handle this cutover automatically during your Packer or Image Builder pipeline. It standardizes the cutover, wipes out the stale data, and includes a strict validation suite.</p>
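<p>To give a feel for what such a validation pass can look like, here is a minimal sketch. It is not taken from the published script: the function names, the failure messages, and the choice of checks are assumptions based on the target state described earlier.</p>

```shell
#!/usr/bin/env bash
# Hypothetical validation sketch; returns non-zero when any check fails.

# File-level checks: symlink target, at least one nameserver, no loopback.
validate_resolv_conf() {
    local path="${1:-/etc/resolv.conf}"
    local expected="${2:-/run/systemd/resolve/resolv.conf}"
    local failures=0

    if [ "$(readlink "$path")" != "$expected" ]; then
        echo "FAIL: $path is not a symlink to $expected"
        failures=$((failures + 1))
    fi
    if ! grep -Eq '^nameserver[[:space:]]+' "$path" 2>/dev/null; then
        echo "FAIL: $path lists no nameserver"
        failures=$((failures + 1))
    fi
    if grep -Eq '^nameserver[[:space:]]+127\.' "$path" 2>/dev/null; then
        echo "FAIL: $path points at a loopback resolver"
        failures=$((failures + 1))
    fi
    [ "$failures" -eq 0 ]
}

# Service-level checks: NetworkManager masked, resolved and networkd active.
validate_services() {
    [ "$(systemctl is-enabled NetworkManager.service 2>/dev/null)" = "masked" ] || return 1
    systemctl is-active --quiet systemd-resolved.service || return 1
    systemctl is-active --quiet systemd-networkd.service || return 1
}

# Example for a Packer provisioner step:
#   validate_resolv_conf && validate_services || exit 1
```

<p>Failing the image build on any of these checks keeps a broken AMI from ever reaching a node group.</p>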
<div class="wp-block-buttons">
<div class="wp-block-button"><a class="wp-block-button__link" href="https://github.com/rrskris/NetworkManager-systemd-networkd-config/blob/main/eks-production-ami-prep.sh" target="_blank" rel="noreferrer noopener">View Migration Script on GitHub</a></div>
</div>
<h2>The Results</h2>
<p>By actively taking control of the <code class="" data-line="">systemd</code> stack and ensuring <code class="" data-line="">/etc/resolv.conf</code> was dynamically linked rather than statically abandoned, we completely unblocked our EKS 1.33+ upgrade.</p>
<p>More impressively, <strong>our system bootup time dropped from a crippling 6+ minutes down to under 2 minutes.</strong> We shouldn&#8217;t have to abandon fantastic, free enterprise distributions just because a cloud provider shifts their networking paradigm. If your team is struggling with AWS EKS upgrades on Rocky Linux or AlmaLinux, integrate this automation into your pipeline and get your clusters back in the fast lane.</p>
<p>The post <a href="https://linuxcent.com/eks-1-33-networkmanager-systemd-networkd-migration-fix/">EKS 1.33 Upgrade Blocker: Fixing Dead Nodes &#038; NetworkManager on Rocky Linux</a> appeared first on <a href="https://linuxcent.com">Linuxcent</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://linuxcent.com/eks-1-33-networkmanager-systemd-networkd-migration-fix/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1400</post-id>	</item>
	</channel>
</rss>
