EKS 1.33 Upgrade Blocker: Fixing Dead Nodes & NetworkManager on Rocky Linux

The EKS 1.33+ NetworkManager Trap: A Complete systemd-networkd Migration Guide for Rocky & Alma Linux

TL;DR:

  • The Blocker: Upgrading to EKS 1.33+ is breaking worker nodes, especially on free community distributions like Rocky Linux and AlmaLinux. Boot times are spiking past 6 minutes, and nodes are failing to get IPs.
  • The Root Cause: AWS is deprecating NetworkManager in favor of systemd-networkd. However, ripping out NetworkManager can leave stale VPC IPs in /etc/resolv.conf. Combined with the systemd-resolved stub listener (127.0.0.53) and a few configuration missteps, it causes a total internal DNS collapse where CoreDNS pods crash and burn.
  • The Subtext: AWS is pushing this modern networking standard hard. Subtly, this acts as a major drawback for Rocky/Alma AMIs, silently steering frustrated engineers toward Amazon Linux 2023 (AL2023) as the “easy” way out.
  • The “Super Hack”: Automate the clean removal of NetworkManager, bypass the DNS stub listener by symlinking /etc/resolv.conf directly to the systemd uplink, and enforce strict state validation during the AMI build. (Full script below).

If you’ve been in the DevOps and SRE space long enough, you know that vendor upgrades rarely go exactly as planned. But lately, if you are running enterprise Linux distributions like Rocky Linux or AlmaLinux on AWS EKS, you might have noticed the ground silently shifting beneath your feet.

With the push to EKS 1.33+, AWS is mandating a shift toward modern, cloud-native networking standards. Specifically, they are phasing out the legacy NetworkManager in favor of systemd-networkd.

While this makes sense on paper, the transition for community distributions has been incredibly painful. AWS support couldn’t resolve our issues, and my SRE team had practically given up, officially halting our EKS upgrade process. It’s hard not to notice that this massive, undocumented friction in Rocky Linux and AlmaLinux conveniently positions AWS’s own Amazon Linux 2023 (AL2023) as the path of least resistance.

I’m hoping the incredible maintainers at free distributions like Rocky Linux and AlmaLinux take note of this architectural shift. But until the official AMIs catch up, we have to fix it ourselves. Here is the exact breakdown of the cascading failure that brought our clusters to their knees, and the “super hack” script we used to fix it.

The Investigation: A Cascading SRE Failure

When our EKS 1.33+ worker nodes started booting with 6+ minute latencies or outright failing to join the cluster, I pulled apart our Rocky Linux AMIs to monitor the network startup sequence. What I found was a classic cascading failure of services, stale data, and human error.

Step 1: The Race Condition

Initially, the problem was a violent tug-of-war. NetworkManager was not correctly disabled by default, and cloud-init was still trying to invoke it. This conflicted directly with systemd-networkd, paralyzing the network stack during boot. To fix this, we initially disabled the NetworkManager service and removed it from cloud-init.

Step 2: The Stale Data Landmine

Here is where the trap snapped shut. Because NetworkManager was historically the primary service responsible for dynamically generating and updating /etc/resolv.conf, completely disabling it stopped that file from being updated.

When we baked the new AMI via Packer, /etc/resolv.conf was orphaned and preserved the old configuration—specifically, a stale .2 VPC IP address from the temporary subnet where the AMI build ran.

Step 3: The Human Element

We’ve all been there: during a stressful outage, wires get crossed. While troubleshooting the dead nodes, one of our SREs mistakenly stopped the systemd-resolved service entirely, thinking it was conflicting with something else.

Step 4: Total DNS Collapse

When the new AMI booted up and joined the EKS node group, the environment was a disaster zone:

  1. NetworkManager was dead (intentional).
  2. systemd-resolved was stopped (accidental).
  3. /etc/resolv.conf contained a dead, stale IP address from a completely different subnet.

When kubelet started, it dutifully read the host’s broken /etc/resolv.conf and passed it up to CoreDNS. CoreDNS attempted to route traffic to the stale IP, failed, and started crash-looping. Internal DNS resolution (pod.namespace.svc.cluster.local) totally collapsed. The cluster was dead in the water.

Flowchart showing the cascading DNS failure in EKS worker nodes
The perfect storm: How stale data and disabled services led to a total CoreDNS collapse.

Linux Internals: How systemd Manages DNS (And Why CoreDNS Breaks)

To understand how to permanently fix this, we need to look at how systemd actually handles DNS under the hood. When using systemd-networkd, resolv.conf management is handled through a strict partnership with systemd-resolved.

Architecture diagram of systemd-networkd and systemd-resolved D-Bus communication
How systemd collects network data and the critical symlink choice that dictates EKS DNS health.

Here is how the flow works: systemd-networkd collects network and DNS information (from DHCP, Router Advertisements, or static configs) and pushes it to systemd-resolved via D-Bus. To manage your DNS resolution effectively, you must configure the /etc/resolv.conf symbolic link to match your desired mode of operation. You have three choices:

1. The “Recommended” Local DNS Stub (The EKS Killer)

By default, systemd recommends using systemd-resolved as a local DNS cache and manager, providing features like DNS-over-TLS and mDNS.

  • The Symlink: ln -sf /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf
  • Contents: Points to 127.0.0.53 as the only nameserver.
  • The Problem: This is a disaster for Kubernetes. If Kubelet passes 127.0.0.53 to CoreDNS, CoreDNS queries its own loopback interface inside the pod network namespace, blackholing all cluster DNS.

2. Direct Uplink DNS (The “Super Hack” Solution)

This mode bypasses the local stub entirely. The system lists the actual upstream DNS servers (e.g., your AWS VPC nameservers) discovered by systemd-networkd directly in the file.

  • The Symlink: ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
  • Contents: Lists all actual VPC DNS servers currently known to systemd-resolved.
  • The Benefit: CoreDNS gets the real AWS VPC nameservers, allowing it to route external queries correctly while managing internal cluster resolution perfectly.

3. Static Configuration (Manual)

If you want to manage DNS manually without systemd modifying the file, you remove the symlink (rm /etc/resolv.conf) and create a regular file in its place with your own nameserver entries. systemd-networkd still receives DNS info from DHCP, but nothing writes it to this file. (Not ideal for dynamic cloud environments.)
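
To see which of the three modes a node is actually in, a small check along these lines can help. This is a minimal sketch, not part of our production script: the function name resolv_mode is hypothetical, and it only inspects where the symlink points, not the file contents.

```bash
#!/bin/bash
# Hypothetical helper: report which DNS mode a resolv.conf path is in,
# based only on where the symlink points.
resolv_mode() {
    local path="${1:-/etc/resolv.conf}"
    local target

    if [ ! -L "$path" ]; then
        # Not a symlink: systemd-resolved is not managing this path.
        echo "static"
        return 0
    fi

    target=$(readlink "$path")
    case "$target" in
        */systemd/resolve/stub-resolv.conf) echo "stub" ;;
        */systemd/resolve/resolv.conf)      echo "uplink" ;;
        *)                                  echo "other: $target" ;;
    esac
}

resolv_mode /etc/resolv.conf
```

On a node prepared as described in this article, the expectation would be "uplink"; "stub" means pods may be handed 127.0.0.53.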


The Solution: A Surgical systemd Cutover

Knowing the internals, the path forward is clear. We needed to not only remove the legacy stack but explicitly rewire the DNS resolution to the Direct Uplink to prevent the stale data trap and bypass the notorious 127.0.0.53 stub listener.

Here is the exact state we achieved:

  1. Lock down cloud-init so it stops triggering legacy network services.
  2. Completely mask NetworkManager to ensure it never wakes up.
  3. Ensure systemd-resolved is enabled and running, but with the DNSStubListener explicitly disabled (DNSStubListener=no) so it doesn’t conflict with anything.
  4. Destroy the stale /etc/resolv.conf and create a symlink to the Direct Uplink (ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf).
  5. Reconfigure and restart systemd-networkd.

Pro-Tip for Debugging: To ensure systemd-networkd is successfully pushing DNS info to the resolver, verify your .network files in /etc/systemd/network/. Ensure UseDNS=yes (which is the default) is set in the [DHCPv4] section. You can always run resolvectl status to see exactly which DNS servers are currently assigned to each interface over D-Bus!
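
Alongside resolvectl, it can be useful to scan the nameserver lines themselves, since that file is what kubelet hands to pods. The sketch below is illustrative only: check_nameservers is a hypothetical helper, and treating every 127.x address as unsuitable is our assumption, not a rule from AWS or systemd.

```bash
#!/bin/bash
# Hypothetical helper: fail if a resolv.conf file has no nameservers
# or lists a loopback nameserver such as 127.0.0.53.
check_nameservers() {
    local file="${1:-/etc/resolv.conf}"
    local servers ns

    # Collect the second field of every "nameserver" line.
    servers=$(awk '$1 == "nameserver" { print $2 }' "$file" 2>/dev/null)

    if [ -z "$servers" ]; then
        echo "FAIL: no nameservers in $file"
        return 1
    fi

    # Word splitting on $servers is intentional: one address per word.
    for ns in $servers; do
        case "$ns" in
            127.*) echo "FAIL: loopback nameserver $ns in $file"; return 1 ;;
        esac
    done

    # Join the addresses onto one line for the report.
    printf 'OK: %s\n' "${servers//$'\n'/ }"
}

check_nameservers /etc/resolv.conf || true
```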

The Automation: Production AMI Prep Script

Manual hacks are great for debugging, but SRE is about repeatable automation. I wrote the following script to run during our AMI build process. It standardizes the cutover, wipes out the stale data, and includes a strict validation suite so no engineer has to manually fight this issue again.

#!/bin/bash
# ---------------------------------------------------------------------
# Author: Vamshi Krishna Santhapuri
# Script: eks-production-ami-prep.sh
# Objective: Migrate to systemd-networkd, fix stale DNS & disable Stub
# ---------------------------------------------------------------------

set -euo pipefail

log() { echo -e "\n[$(date +'%Y-%m-%dT%H:%M:%S')] $1"; }

log "Starting production network stack migration..."

# 1. Identify Primary Interface
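# "ip route show default" exits 0 even when there is no default route, so the
# fallback below stays reachable under "set -euo pipefail".
PRIMARY_IF=$(ip route show default | awk 'NR==1 {print $5}')
[ -z "$PRIMARY_IF" ] && PRIMARY_IF=$(ip -o link show | awk -F': ' '{print $2}' | grep -E '^(eth|en)' | head -n1 || true)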

if [ -z "$PRIMARY_IF" ]; then
    echo "[ERROR] No primary interface detected."
    exit 1
fi
log "Targeting interface: $PRIMARY_IF"

# 2. Lock Cloud-Init Network Config
log "Step 1: Disabling cloud-init network management..."
sudo mkdir -p /etc/cloud/cloud.cfg.d/
echo "network: {config: disabled}" | sudo tee /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg

# 3. Deploy systemd-networkd Config
log "Step 2: Configuring systemd-networkd..."
sudo mkdir -p /etc/systemd/network/
cat <<EOF | sudo tee /etc/systemd/network/10-$PRIMARY_IF.network
[Match]
Name=$PRIMARY_IF

[Network]
DHCP=ipv4
LinkLocalAddressing=yes
IPv6AcceptRA=no

[DHCPv4]
ClientIdentifier=mac
RouteMetric=100
UseMTU=true
EOF

# 4. Disable DNS Stub for EKS/CoreDNS compatibility
log "Step 3: Disabling systemd-resolved stub listener..."
sudo mkdir -p /etc/systemd/resolved.conf.d/
cat <<EOF | sudo tee /etc/systemd/resolved.conf.d/99-eks-dns.conf
[Resolve]
DNSStubListener=no
EOF

# 5. Destroy Stale Data & Point resolv.conf to Uplink
log "Step 4: Linking /etc/resolv.conf to Uplink file (Fixing stale IPs)..."
sudo rm -f /etc/resolv.conf
sudo ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf

# 6. Decommission NetworkManager
log "Step 5: Stopping and masking NetworkManager..."
sudo systemctl stop NetworkManager NetworkManager-wait-online.service || true
sudo systemctl mask NetworkManager NetworkManager-wait-online.service

# 7. Clean up Legacy Artifacts
log "Step 6: Purging legacy network artifacts..."
sudo rm -f /etc/NetworkManager/system-connections/*
sudo rm -f /etc/sysconfig/network-scripts/ifcfg-$PRIMARY_IF || true
sudo rm -rf /run/systemd/netif/leases/*

# 8. Restart and Enable Services
log "Step 7: Activating new network stack..."
sudo systemctl unmask systemd-networkd
sudo systemctl enable --now systemd-networkd systemd-resolved
sudo systemctl restart systemd-resolved
sudo systemctl restart systemd-networkd
sudo networkctl reconfigure "$PRIMARY_IF"

# =====================================================================
# PRODUCTION VALIDATION SUITE
# =====================================================================
log "RUNNING PRODUCTION VALIDATION..."

# CHECK 1: Ensure systemd-networkd is managing the interface
if ! networkctl status "$PRIMARY_IF" | grep -q "configured"; then
    echo "[FAIL] $PRIMARY_IF is not in 'configured' state."
    exit 1
fi

# CHECK 2: Ensure the Stub Listener (127.0.0.53) is DEAD
if ss -lnpt | grep -q "127.0.0.53:53"; then
    echo "[FAIL] DNS Stub Listener is still active on 127.0.0.53! CoreDNS will fail."
    exit 1
fi

# CHECK 3: Verify /etc/resolv.conf link integrity
if [ "$(readlink /etc/resolv.conf)" != "/run/systemd/resolve/resolv.conf" ]; then
    echo "[FAIL] /etc/resolv.conf is not pointing to the uplink resolv.conf."
    exit 1
fi

# CHECK 4: Verify DNS resolution is functional on the node
if ! host -W 2 google.com > /dev/null 2>&1; then
    # Fallback to check if nameservers exist if internet isn't available in VPC during build
    if ! grep -q "nameserver" /etc/resolv.conf; then
        echo "[FAIL] No nameservers found in /etc/resolv.conf."
        exit 1
    fi
fi

log "SUCCESS: All production validations passed."
echo "--------------------------------------------------------"
echo "Primary IF: $PRIMARY_IF"
echo "Resolv.conf Target: $(readlink /etc/resolv.conf)"
grep "nameserver" /etc/resolv.conf
resolvectl status "$PRIMARY_IF" | grep "DNS Servers" || true
echo "--------------------------------------------------------"

The Results

By actively taking control of the systemd stack and ensuring /etc/resolv.conf was dynamically linked rather than statically abandoned, we completely unblocked our EKS 1.33+ upgrade.

More impressively, our system bootup time dropped from a crippling 6+ minutes down to under 2 minutes. We shouldn’t have to abandon fantastic, free enterprise distributions just because a cloud provider shifts their networking paradigm. If your team is struggling with AWS EKS upgrades on Rocky Linux or AlmaLinux, drop this script into your Packer pipeline and get your clusters back in the fast lane.

Bash How to Loop through Files

You can use a for loop for this. Here is the syntax: for ELEMENT in $ARRAY; do command; done

Let’s try it in a live example. First, create a new directory:

mkdir /root/test
cd /root/test
touch /root/test/sample.sh
touch /root/test/example.sh
touch /root/test/document.doc

Next, we process all files that have the .sh suffix with a for loop:

for FILE in /root/test/*.sh
do echo "Processing ${FILE} file.."
done

Output:
Processing /root/test/example.sh file..
Processing /root/test/sample.sh file..

Based on the output above, we can see that the loop processes all files in /root/test that have the .sh suffix.

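If you want to process command-line arguments, use:

for FILE in "$@"
do
echo "Processing $FILE file."
done

"$@" expands to the script’s positional parameters (its command-line arguments), each as a separate word. "$*" would merge them all into a single word, so the loop would run only once.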
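
One case the glob loop does not cover is a directory with no matching files: by default Bash leaves the pattern unexpanded, so the loop body runs once with the literal pattern. A minimal sketch using the nullglob shell option follows; it works in a temporary directory (an assumption made here so it does not depend on /root/test).

```bash
#!/bin/bash
# Work in a throwaway directory so the example does not depend on /root/test.
dir=$(mktemp -d)
touch "$dir/sample.sh" "$dir/example.sh" "$dir/document.doc"

# nullglob makes a pattern with no matches expand to nothing
# instead of staying as the literal string.
shopt -s nullglob

count=0
for FILE in "$dir"/*.sh; do
    echo "Processing ${FILE##*/} file.."
    count=$((count + 1))
done
echo "Processed $count .sh files"

# No *.txt files exist, so this loop body never runs.
skipped=0
for FILE in "$dir"/*.txt; do
    skipped=$((skipped + 1))
done
echo "Processed $skipped .txt files"

rm -rf "$dir"
```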

Bash How to Wait Seconds

How can a script wait until the system completes some task? The answer is to use sleep. This command suspends the script, so the script uses almost no system resources while it waits. The timing is accurate enough for most purposes.

Do you want to wait some seconds? In the next example, we will wait one second:

sleep 1

Want to pause the script for only a split second? You can. This example sleeps for 100 ms (fractional values are supported by GNU sleep):

sleep 0.1

In this example, we will wait 20 minutes:

sleep 20m

In the next example, we will wait 8 hours:

sleep 8h

Do you want to wait several days? This is possible with the d suffix. However, consider using the cron scheduler instead. It is robust, and it lets you run your scripts at specified times and at regular intervals.

In the last example, we will wait 7 days:

sleep 7d

Bash also provides the wait command, which waits for the process or processes given as arguments:

wait $PID

PID is the process ID. ‘wait $!’ waits until the most recently started background process has completed.
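
As a short illustration of wait, the sketch below starts two background jobs, records their process IDs from $!, and collects each exit status. The variable names are made up for the example.

```bash
#!/bin/bash
# Start two background jobs and remember their process IDs.
sleep 1 &
first_pid=$!

( exit 3 ) &
second_pid=$!

# wait returns the exit status of the process it waited for.
wait "$first_pid"
first_status=$?

wait "$second_pid"
second_status=$?

echo "first job exited with $first_status"
echo "second job exited with $second_status"
```

Because wait passes the job's exit status through, a script can start several tasks in parallel and still react to the failure of any one of them.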

Bash How to Print Array

Arrays are collections of elements. Arrays in Bash are indexed from 0 (zero-based).
Below is the definition of an array in Bash:
my_array=(zero one two three four)

Now our array is defined.
Here is exactly how my_array is stored in Bash:

my_array=([0]="zero" [1]="one" [2]="two" [3]="three" [4]="four")

You can explicitly declare an array:

declare -a MY_ARRAY

You can view an array’s declaration using the declare command with the -p option:

declare -p my_array
declare -a my_array=([0]="zero" [1]="one" [2]="two" [3]="three" [4]="four")

Now if you try to print the array:

my_array=(zero one two three four)
echo $my_array
zero

By default, only the first element (index 0) is printed.

To print the first element of the array using the indexing:

my_array=(zero one two three four)
echo ${my_array[0]}
zero

The change here is the use of curly braces ‘{}’, which are required to refer to an item in the array. Without them, Bash expands $my_array on its own and treats [0] as literal text that is then subject to pathname expansion.

To read all elements of the array use the symbols “@” or “*”.

echo ${my_array[@]}
zero one two three four
echo ${my_array[*]}
zero one two three four

The difference shows up only inside double quotes: "${my_array[@]}" expands each element as a separate argument, whereas "${my_array[*]}" merges all elements into one argument.

To see this, print each resulting argument in brackets:

printf '[%s]\n' "${my_array[@]}"
[zero]
[one]
[two]
[three]
[four]
printf '[%s]\n' "${my_array[*]}"
[zero one two three four]

Getting the Length of the Array

If you need to get the length of the array, use the symbol “#” before the name of the array:

echo "${#my_array[*]}"
5

How do I print an array in bash?

Print Bash Array

We can use the keyword ‘declare’ with a ‘-p’ option to print all the elements of a Bash Array with all the indexes and details. The syntax to print the Bash Array can be defined as: declare -p ARRAY_NAME.

How do I print in bash?

In Bash, the echo command and the printf command are used to print output to the console. If you put them in a script file, save the file before running it.

How do you print an array element in a new line in Shell?

To print each element on a new line, use printf with the format '%s\n'. ‘%s’ stands for a string argument, and ‘\n’ moves to the next line. To display the contents of the array, we do not use the “#” sign (that would give its length instead).
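
The printf approach can be sketched as follows. The array contents are made up, with one element containing spaces to show that quoting "${my_array[@]}" keeps it intact.

```bash
#!/bin/bash
my_array=(zero "one and a half" two)

# printf reuses the format for every argument, so each element
# ends up on its own line, even when it contains spaces.
printf '%s\n' "${my_array[@]}"

# Count the printed lines to confirm there is one per element.
line_count=$(( $(printf '%s\n' "${my_array[@]}" | wc -l) ))
echo "printed $line_count lines for ${#my_array[@]} elements"
```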

How do I display all array elements at once?

Program (Java):

public class PrintArray {
    public static void main(String[] args) {
        // Initialize array.
        int[] arr = new int[] {1, 2, 3, 4, 5};
        System.out.println("Elements of given array: ");
        // Loop through the array by incrementing value of i.
        for (int i = 0; i < arr.length; i++) {
            System.out.print(arr[i] + " ");
        }
    }
}

How do I create an array in bash?

To declare your array, follow these steps:

  1. Give your array a name.
  2. Follow that variable name with an equal sign. The equal sign should not have any spaces around it.
  3. Enclose the array in parentheses (not brackets like in JavaScript).
  4. Type your strings using quotes, but with no commas between them.

How do you create an array in bash?

Define An Array in Bash

You have two ways to create a new array in a Bash script. The first is to use the declare command, for example declare -a test_array, which defines an indexed array named test_array (declare -A would define an associative array instead). The second is to create the array simply by assigning elements to it.

How does printf work in bash?

What Is the Bash printf Function? As the name suggests, printf is a function that prints formatted strings of text. That means you can write a string structure (the format) and later fill it in with values (the arguments). If you’re familiar with the C/C++ programming languages, you might already know how printf works.

How do you pass an array to a function in bash?

  1. Expanding an array without an index only gives the first element, use copyFiles "${array[@]}" instead of copyFiles $array.
  2. Use a she-bang #!/bin/bash.
  3. Use the correct function syntax. Valid variants are function copyFiles {… …
  4. Use the right syntax to get the array parameter arr=("$@") instead of arr="$1"
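
Put together, the points above might look like the sketch below. The body of copyFiles is hypothetical (it only prints what it would copy); the parts that matter are the quoted "${files[@]}" at the call site and arr=("$@") inside the function.

```bash
#!/bin/bash
# Hypothetical function: report each file name it receives.
copyFiles() {
    local arr=("$@")   # rebuild the array from the function's arguments
    local item

    echo "received ${#arr[@]} items"
    for item in "${arr[@]}"; do
        echo "would copy: $item"
    done
}

files=("report.txt" "my notes.md" "data.csv")

# Quoting "${files[@]}" passes each element as one argument,
# so "my notes.md" is not split at the space.
copyFiles "${files[@]}"
```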

How do you create an empty array in bash?

To declare an empty array, the simplest method is given here. It contains the keyword “declare” following a constant “-a” and the array name. The name of the array is assigned with empty parenthesis.

How do I get the size of an array in bash?

To get the length of an array, we can use the ${#array[@]} syntax in Bash. The # in this syntax returns the number of elements; without the #, the expansion returns all elements of the array.

Bash Vs KSH

Linux and Unix have various shells. Two of these many shells are KSH and Bash.

KSH (the Korn shell) was developed many years before Bash. Ksh has associative arrays and handles loop syntax better than Bash. Ksh’s print command is also better than Bash’s echo command. On the other hand, ksh does not support history completion, process substitution, or rebindable command-line editing.

Bash has more extensions than ksh. Bash has tab completion and an easier way to set a prompt that displays the current directory.

Compared to ksh, bash is newer and more popular.

Here is an example of how a condition test is typically written in each shell. First Bash:

if [ $i -eq 3 ]

and condition test in ksh:

if (($i==3))

Bash can handle exit codes from pipes in a cleaner way. Bash and KSH are both Bourne-compatible shells; they share common functions and features and can often be used interchangeably.

Is ksh same as bash?

KSH and Bash shells are also products of combinations of other shells’ features. Bash and KSH are both Bourne-compatible shells. Since they share common features, they can be used interchangeably.

Is ksh faster than bash?

In one comparison, ksh and zsh appeared to be about seven times faster than bash. Ksh excelled in 17 tests and zsh in six tests.

What is the difference between ksh CSH and bash?

CSH is the C shell, while Bash is the Bourne Again shell. The C shell and Bash are both Unix and Linux shells. While CSH has its own features, Bash has incorporated the features of other shells, including those of CSH, alongside its own. This gives it more features and makes it the most widely used command processor.

Is ksh a Linux shell?

Ksh is an acronym for KornSHell. It is a shell and programming language that executes commands read from a terminal or a file. It was developed by David Korn at AT&T Bell Laboratories in the early 1980s. It is backwards-compatible with the Bourne shell and includes many features of the C shell.

Why is ksh used?

ksh is a command and programming language that executes commands read from a terminal or a file. rksh is a restricted version of the command interpreter ksh; it is used to set up login names and execution environments whose capabilities are more controlled than those of the standard shell.

How do I run a ksh script in Bash?

How do I run a .sh shell script file in Linux?

  • Open the Terminal application on Linux or Unix.
  • Create a new script file with .sh extension using a text editor.
  • Write the script file using nano script-name-here.sh.
  • Set execute permission on your script using chmod command : chmod +x script-name-here.sh.
    To run your script :

Is dash faster than Bash?

If you need speed, go definitely with dash, it is much faster than any other shell and about 4x faster than bash.

What is faster than Bash?

Perl is absurdly faster than Bash. And, for text manipulation, you can actually achieve better performances with Perl than with C, unless you take time to write complex algorithms.

How much faster is dash than Bash?

Dash is not Bash compatible, but Bash tries to be mostly compatible with POSIX, and thus Dash. Dash shines in: Speed of execution. Roughly 4x times faster than Bash and others.

What is difference between sh and ksh in Unix?

sh is the original Bourne shell. On many non-Linux systems, this is an old shell without the POSIX features. Thus bash and ksh (or even csh and tcsh) are better choices than sh. … Public domain ksh (pdksh) is Bourne-compatible and mostly POSIX-compatible.

Is zsh better than Bash?

It has many features like Bash but some features of Zsh make it better and improved than Bash, such as spelling correction, cd automation, better theme, and plugin support, etc. Linux users don’t need to install the Bash shell because it is installed by default with Linux distribution.

Bash How to Read from Keyboard

To read input from the keyboard and assign input value to a variable use the read command.

Read syntax:

read options var1 var2

To print the first word entered, use:

read var1 var2
echo $var1

If you don’t give any variable name to the read command, the input is assigned to the variable REPLY.

The -s option does not echo input while reading from the keyboard. Note that read takes the variable name without a leading $:

read -s -p "Enter password:" password

The -p “TEXT” option displays TEXT to the user without a trailing newline.

The -e option means that the Readline library is used to obtain the line.
The -u FD option reads input from file descriptor FD instead of standard input.
The -t TIME option causes read to return a failure if the input is not read within TIME seconds.
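
To try read without typing anything, input can be supplied from a here-string. The sketch below is only an illustration (the variable names and input text are made up); it shows how words are split across variables and how REPLY is used when no name is given.

```bash
#!/bin/bash
# Feed input from a here-string so the example runs without a keyboard.
# The first word goes to "first"; everything left over goes to "rest".
read -r first rest <<< "alpha beta gamma"
echo "first=$first"
echo "rest=$rest"

# With no variable name, the whole line lands in REPLY.
read -r <<< "whole line"
echo "REPLY=$REPLY"
```

The -r option used here stops read from treating backslashes as escape characters, which is usually what you want for raw user input.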

Bash How to Return from Function

Use the “return” statement to return from a function. You can also specify the return status. Example:

return 1

This returns status 1 from the function. The status must be an integer from 0 to 255; Bash keeps only the low 8 bits, so return 1234 would be seen as 210. Usually, we use 0 for success and a non-zero value such as 1 for failure. It is similar to the exit command, which we use to terminate the script.

If you don’t use the return statement in the whole function, the status of the last executed command will be returned.

To check the status returned by the last called function, use “$?”.

There is a difference between the return and exit commands. exit causes the whole script to end at the line where it is called. return causes only the current function to end, and execution continues with the command after the function call.
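
A small sketch of return and $? in practice follows. The function is_dir is a hypothetical example, not a Bash built-in.

```bash
#!/bin/bash
# Hypothetical function: succeed only if the argument is an existing directory.
is_dir() {
    if [ -d "$1" ]; then
        return 0
    fi
    return 1
}

is_dir /
root_status=$?
echo "status for /: $root_status"

is_dir /no/such/path
missing_status=$?
echo "status for /no/such/path: $missing_status"

# A function's return status can be tested directly in a condition.
if is_dir /; then
    echo "/ is a directory"
fi
```

Saving $? into a variable right after the call matters, because the next command (even echo) overwrites it.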

What is Bash Script

Metaphorically speaking, a Bash script is like a ‘to-do list’. After you read the first entry, you carry it out. After you finish the first entry, you continue with the second entry, and so on.

A Bash script is a text file that contains a sequence of commands.

A Bash script can also contain functions, loops, and conditional constructs. Scripts are commonly used for administration tasks such as changing file permissions or creating disk backups. Using Bash scripts is often faster than using a graphical user interface.

It’s important to mention that there is no difference between putting a series of 10 commands into a script file and executing that script, and entering the same commands one by one on the command line. In both situations, the result will be exactly the same.

After you create your script it is good practice to add the extension ‘.sh’ to the filename, for example ‘myFirstScript.sh’.

Bash Error Output Redirect

Each open file is assigned a file descriptor. The file descriptor for STDIN is 0, for STDOUT is 1, and for STDERR is 2.

If you want to redirect just STDERR (standard error output) to a file, do:

cmd_name 2> /file

If you want to redirect STDOUT to one file and STDERR to another, do:

cmd_name >/stdout_file 2>/stderr_file

If you want to merge both (STDERR, STDOUT) into one file, you can do:

cmd_name >/file_4_both 2>&1

For the same effect, you can also use this syntax:

cmd_name &> /file_4_both
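
To see these redirections in action without touching real system paths, here is a minimal sketch. The function noisy is a stand-in (an assumption for illustration) for any command that writes to both streams, and the files live in a temporary directory.

```bash
#!/bin/bash
# Hypothetical command that writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

work_dir=$(mktemp -d)

noisy > "$work_dir/out" 2> "$work_dir/err"   # STDOUT and STDERR to separate files
noisy > "$work_dir/both" 2>&1                # both streams merged into one file

out_text=$(cat "$work_dir/out")
err_text=$(cat "$work_dir/err")
both_lines=$(( $(wc -l < "$work_dir/both") ))

echo "out file:  $out_text"
echo "err file:  $err_text"
echo "both file: $both_lines lines"

rm -rf "$work_dir"
```

Note the order in the merged form: the file redirection comes first and 2>&1 second, so that STDERR is pointed at wherever STDOUT already goes.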

Even more useful is the tee command. By definition, tee reads from standard input and writes to standard output and to files at the same time.

In the next example, we merge STDOUT and STDERR of the “find /” command. Then we pipe the result to the STDIN of the tee command. tee is run with the -a parameter, which means append to the file if it already exists:

find / 2>&1 | tee -a /home/tee.output
ll /home

Output: (lines omitted) -rw-r--r--. 1 root root 2.2M Jun 10 13:16 tee.output

A child process inherits open file descriptors. If you want to prevent a file descriptor from being inherited, close it. For example:

<&-

This closes the stdin descriptor.

Bash Scripting Best Practices

Let’s begin with the first line of your script. The first rule: always start a script with a shebang. Without the shebang line, the system does not know which interpreter should run the script. For example:

#!/bin/bash

After the shebang line, describe what your script is about and what it does.

For debugging printout run your script with the “-x” or “-v” option like:

bash -x script.sh

Use the set lines:

set -e

set -u

set -o pipefail

“-e” exits immediately if any command has a non-zero exit status. “-u” causes the script to exit when it references a variable that has not been defined (except $* and $@). The “-o pipefail” option prevents failures in a pipeline from being masked: the pipeline returns the exit status of the last command that exited with a non-zero code.

Never use backticks, use:

$( ... )

Backticks are visually similar to single quotes; in a larger script with hundreds of lines, it is easy to confuse the two. The $( ... ) form is also easier to nest.

If you need to create temporary files, use “mktemp” for temporary files and cleanup with “trap”.
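
The mktemp and trap combination can be sketched as follows. This is a minimal example under simple assumptions (a single temporary file, cleanup on normal or failed exit); the variable names are made up.

```bash
#!/bin/bash
set -euo pipefail

# Create a private temporary file and make sure it is removed
# when the script exits, whether it succeeds or fails.
tmp_file=$(mktemp)
trap 'rm -f "$tmp_file"' EXIT

echo "working data" > "$tmp_file"
line_count=$(( $(wc -l < "$tmp_file") ))
echo "temp file $tmp_file has $line_count line(s)"
```

Because the trap is tied to EXIT, the cleanup also runs when "set -e" aborts the script halfway through.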

Bash Hello World Script

This Bash example archives the /home directory into a single tar.gz file under /var. Let’s create a file named backup.sh. It will consist of two lines:

#!/bin/bash
tar -czf /var/home-backup.tar.gz /home/

The first line is the hashpling (shebang). Basically, it says which program executes the script. In this example, we choose /bin/bash.

The second line is the tar command. It archives and compresses the whole directory (/home) into one file.

I recommend also adding a third (empty) line. Why? If you execute this script on some unusual UNIXes (e.g. SCO UNIX), the last line might not be executed, because the last symbol of the file, EOF (end of file), is different from the EOLN (end of line) symbol. An EOLN (Enter) or a semicolon (;) is what executes a command.

Would you like to extend this script with some output? Here is an example.

#!/bin/bash
echo -n "Creating backup of home directory in /var... "
tar -czf /var/home-backup.tar.gz /home/ >/dev/null 2>&1
echo done.

On the second line, echo with the -n parameter does not print a trailing newline.

On the third line, output from the tar command is redirected to the trash (/dev/null).

The last line just echoes done.

If you don’t know what to put on the first line (hashpling), type:

echo $SHELL

Output: /bin/bash

You will get the path to your shell, which you can use in the hashpling.