GCP IAM Policy Inheritance: How the Resource Hierarchy Controls Access

Reading Time: 11 minutes

Meta Description: Map how GCP resource hierarchy IAM inheritance works — design org-level policies that don’t accidentally grant access to resources lower in the tree.
What Is Cloud IAMAuthentication vs AuthorizationIAM Roles vs PoliciesAWS IAM Deep DiveGCP Resource Hierarchy IAMAzure RBAC Scopes


TL;DR

  • GCP IAM bindings inherit downward — a binding at Organization or Folder level applies to every project and resource beneath it
  • Basic roles (viewer/editor/owner) are legacy constructs; use predefined or custom roles in production
  • Service account keys are a long-lived credential antipattern — use ADC, impersonation, or Workload Identity Federation instead
  • allAuthenticatedUsers bindings expose resources to any of 3 billion Google accounts — audit for these in every environment
  • iam.serviceAccounts.actAs is the GCP equivalent of AWS iam:PassRole — a direct privilege escalation vector
  • Conditional bindings with time-bound expiry eliminate “I’ll remember to remove this” as an operational pattern

The Big Picture

                GCP IAM Inheritance Model
─────────────────────────────────────────────────────────
Organization (company.com)
│
├─ IAM binding at Org level ──────────────────┐
│                                             │ inherits down
├── Folder: Production                        ▼
│   │                                 ALL nodes below
│   ├── Folder: Shared-Services
│   │       └── Project: infra-core
│   │               ├── GCS: config-bucket  ← affected
│   │               └── Secret Manager      ← affected
│   │
│   └── Project: prod-web-app
│           ├── GCS: prod-assets             ← affected
│           ├── Cloud SQL: prod-db           ← affected
│           └── BigQuery: analytics          ← affected
│
└── Folder: Development                      ← NOT affected by
        └── Project: dev-app                    Production binding

GCP resource hierarchy IAM inheritance is the mechanism that makes a single binding cascade through an entire estate. It’s also the reason high-level bindings carry far more blast radius than they appear to.


Introduction

GCP resource hierarchy IAM operates on one rule: bindings propagate downward. Grant access at the Organization level and it applies to every Folder, every Project, and every resource in your GCP estate. Grant it at a Folder and it applies to every Project below. This is by design — and it’s the reason IAM misconfigurations in GCP can have a blast radius that teams migrating from AWS don’t anticipate.

I once inherited a GCP environment where the previous team had taken what they thought was a shortcut. They had a folder called Production with twelve projects in it. Rather than grant developers access to each project individually, they bound roles/editor at the folder level. One binding, twelve projects, all covered. Fast.

When I audited what roles/editor on that folder actually meant, I found it gave every developer in that binding write access to Cloud SQL databases they’d never heard of, BigQuery datasets from other teams, Pub/Sub topics in shared services, and Cloud Storage buckets that held data exports. Not because anyone intended that. Because permissions in GCP flow downward through the hierarchy, and a broad role at a high level means a broad role everywhere below it.

The developer who made that binding understood “Editor means edit access.” They didn’t think through what “edit access at the folder level” means across twelve projects. This is the GCP IAM trap that catches teams coming from AWS: the hierarchy feels like an organizational convenience feature, not an access control mechanism. It’s both.


The Resource Hierarchy — Not Just Org Structure

GCP’s resource hierarchy is the backbone of its IAM model:

Organization  (e.g., company.com)
  └── Folder  (e.g., Production, Development, Shared-Services)
        └── Folder  (nested, optional — up to 10 levels)
              └── Project  (unit of resource ownership and billing)
                    └── Resource  (GCE instance, GCS bucket, Cloud SQL, BigQuery, etc.)

The critical rule: IAM bindings at any level inherit downward to every node below.

Org IAM binding:
  [email protected] → roles/viewer (org-level)
    ↓ inherited by
  Folder: Production
    ↓ inherited by
  Project: prod-web-app
    ↓ inherited by
  GCS bucket "prod-assets"

Result: alice can list and read resources across the ENTIRE org,
        across every folder, every project, every resource.
        Even if none of those resources have a direct binding for alice.

roles/viewer at the org level sounds benign — it’s just read access. But read access to everything in the organization, including infrastructure configurations, customer data exports in GCS, BigQuery analytics, Cloud SQL connection details, and Kubernetes cluster configs. Not benign.

Before making any binding above the project level, trace it down. Ask: what does this role grant, and at every project and resource below this folder, am I comfortable with that?

# Understand your org structure before making changes
gcloud organizations list

gcloud resource-manager folders list --organization=ORG_ID

gcloud projects list --filter="parent.id=FOLDER_ID"

# See all existing bindings at the org level — do this regularly
gcloud organizations get-iam-policy ORG_ID --format=json | jq '.bindings[]'

Member Types — Who Can Hold a Binding

GCP uses the term member (being renamed to principal) for the identity in a binding:

Member Type Format Notes
Google Account user:[email protected] Individual Google/Workspace account
Service Account serviceAccount:[email protected] Machine identity
Google Group group:[email protected] Workspace group
Workspace Domain domain:company.com All users in a Workspace domain
All Authenticated allAuthenticatedUsers Any authenticated Google identity — extremely broad
All Users allUsers Anonymous + authenticated — public access
Workload Identity principal://iam.googleapis.com/... External workloads via WIF

The ones that have caused data exposure incidents: allAuthenticatedUsers and allUsers. Any GCS bucket or GCP resource bound to allAuthenticatedUsers is accessible to any of the ~3 billion Google accounts in existence. I have seen production customer data exposed this way. A developer testing a public CDN pattern applied the binding to the wrong bucket.

Audit for these regularly:

# Find any project-level binding with allUsers or allAuthenticatedUsers
gcloud projects get-iam-policy my-project --format=json \
  | jq '.bindings[] | select(.members[] | contains("allUsers") or contains("allAuthenticatedUsers"))'

# Check all GCS buckets in a project for public access
gsutil iam get gs://BUCKET_NAME \
  | grep -E "(allUsers|allAuthenticatedUsers)"

Role Types — Choose the Right Granularity

Basic (Primitive) Roles — Don’t Use in Production

roles/viewer   → read access to most resources across the entire project
roles/editor   → read + write to most resources
roles/owner    → full access including IAM management

These are legacy roles from before GCP had service-specific roles. roles/editor is particularly dangerous because it grants write access across almost every GCP service in the project. Use it in production and you have no meaningful separation of duties between your services.

I’ve seen roles/editor granted to a data pipeline service account because “it needed access to BigQuery, Cloud Storage, and Pub/Sub.” All three of those have predefined roles. Three specific bindings. Instead: one broad role that also grants access to Cloud SQL, Kubernetes, Secret Manager, and Compute Engine — none of which the pipeline needed.

Predefined Roles — The Default Correct Choice

Service-specific roles managed and updated by Google. For most use cases, these are the right choice:

# Find predefined roles for Cloud Storage
gcloud iam roles list --filter="name:roles/storage" --format="table(name,title)"
# roles/storage.objectViewer   — read objects (not list buckets)
# roles/storage.objectCreator  — create objects, cannot read or delete
# roles/storage.objectAdmin    — full object control
# roles/storage.admin          — full bucket + object control (much broader)

# See exactly what permissions a predefined role includes
gcloud iam roles describe roles/storage.objectViewer

The distinction between roles/storage.objectViewer and roles/storage.admin is the difference between “can read objects” and “can read objects, create objects, delete objects, and modify bucket IAM policies.” Use the narrowest role that covers the actual need.

Custom Roles — When Predefined Is Still Too Broad

When you need finer control than any predefined role offers, create a custom role:

cat > custom-log-reader.yaml << 'EOF'
title: "Log Reader"
description: "Read application logs from Cloud Logging — nothing else"
stage: "GA"
includedPermissions:
  - logging.logEntries.list
  - logging.logs.list
  - logging.logMetrics.get
  - logging.logMetrics.list
EOF

# Create at project level (available within one project)
gcloud iam roles create LogReader \
  --project=my-project \
  --file=custom-log-reader.yaml

# Or at org level (reusable across projects in the org)
gcloud iam roles create LogReader \
  --organization=ORG_ID \
  --file=custom-log-reader.yaml

# Grant the custom role
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:[email protected]" \
  --role="projects/my-project/roles/LogReader"

Custom roles have an operational overhead: when Google adds new permissions to a service, predefined roles are updated automatically. Custom roles are not — you have to update them manually. For roles like “Log Reader” that are unlikely to need new permissions, this isn’t a concern. For roles like “App Admin” that span many services, it becomes a maintenance burden.


IAM Policy Bindings — How Access Is Actually Granted

The mechanism for granting access in GCP is adding a binding to a resource’s IAM policy. A binding is: member + role + (optional condition).

# Grant a role on a project (all resources in the project inherit this)
gcloud projects add-iam-policy-binding my-project \
  --member="user:[email protected]" \
  --role="roles/storage.objectViewer"

# Grant on a specific GCS bucket (narrower — only this bucket)
gcloud storage buckets add-iam-policy-binding gs://prod-assets \
  --member="serviceAccount:[email protected]" \
  --role="roles/storage.objectViewer"

# Grant on a specific BigQuery dataset
bq update --add_iam_policy_binding \
  --member="group:[email protected]" \
  --role="roles/bigquery.dataViewer" \
  my-project:analytics_dataset

# View the current IAM policy on a project
gcloud projects get-iam-policy my-project --format=json

# View a specific resource's policy
gcloud storage buckets get-iam-policy gs://prod-assets

The choice between project-level and resource-level binding has real consequences. A binding on the GCS bucket affects only that bucket. A binding at the project level affects the bucket AND every other resource in the project. In practice, default to the most specific scope available. Only move up the hierarchy when the alternative is an unmanageable number of bindings.

Conditional Bindings — Time-Limited and Context-Scoped Access

Conditions scope when a binding applies. They use CEL (Common Expression Language):

# Temporary access for a contractor — automatically expires
gcloud projects add-iam-policy-binding my-project \
  --member="user:[email protected]" \
  --role="roles/storage.objectViewer" \
  --condition="expression=request.time < timestamp('2026-06-30T00:00:00Z'),title=Contractor access Q2 2026"

# Access only from corporate network
gcloud projects add-iam-policy-binding my-project \
  --member="user:[email protected]" \
  --role="roles/bigquery.admin" \
  --condition="expression=request.origin.ip.startsWith('10.0.'),title=Corp network only"

Temporary access that automatically expires is one of the most practical applications of conditional bindings. Instead of “I’ll grant access and remember to remove it,” you set an expiry and it removes itself. The cognitive overhead of tracking temporary grants doesn’t disappear — you still need to know the grant exists — but the risk of it outliving its purpose drops significantly.


Service Accounts — GCP’s Machine Identity

Service accounts are the machine identity in GCP. They should be used for every workload that needs to call GCP APIs — GCE instances, GKE pods, Cloud Functions, Cloud Run services.

# Create a service account
gcloud iam service-accounts create app-backend \
  --display-name="App Backend Service Account" \
  --project=my-project

SA_EMAIL="[email protected]"

# Grant it the specific role it needs — on the specific resource it needs
gcloud storage buckets add-iam-policy-binding gs://app-assets \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/storage.objectViewer"

# Attach to a GCE instance
gcloud compute instances create my-vm \
  --service-account="${SA_EMAIL}" \
  --scopes="cloud-platform" \
  --zone=us-central1-a

From inside the VM, Application Default Credentials (ADC) handles authentication automatically:

# From the VM — ADC uses the attached SA without any credential configuration
gcloud auth application-default print-access-token

# Or via the metadata server directly
curl -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"

Service Account Keys — The Antipattern to Avoid

A service account key is a JSON file containing a private key. It’s long-lived, it doesn’t expire automatically, and if it leaks it gives an attacker persistent access as that service account until someone discovers and revokes it.

# Creating a key — only if there is genuinely no alternative
gcloud iam service-accounts keys create key.json --iam-account="${SA_EMAIL}"
# This generates a long-lived credential. It will exist until explicitly deleted.

# List all active keys — do this in every audit
gcloud iam service-accounts keys list --iam-account="${SA_EMAIL}"

# Delete a key
gcloud iam service-accounts keys delete KEY_ID --iam-account="${SA_EMAIL}"

In the GCP environment I mentioned earlier — the one with roles/editor at the folder level — I also found 23 service account key files downloaded across the team’s laptops over 18 months. Nobody had a complete list of which keys were still valid and where they were stored. Several were for accounts that no longer existed. That’s not a hypothetical attack surface. It’s a breach waiting for a laptop to be stolen.

Never create service account keys when:
– Code runs on GCE/GKE/Cloud Run/Cloud Functions — use the attached service account and ADC
– Code runs in GitHub Actions — use Workload Identity Federation
– Code runs on-premises with Kubernetes — use Workload Identity Federation with OIDC

Service Account Impersonation — The Right Alternative to Keys

Instead of downloading a key, grant a user or service account permission to impersonate the service account. They generate a short-lived token, not a permanent credential:

# Allow alice to impersonate the service account
gcloud iam service-accounts add-iam-policy-binding "${SA_EMAIL}" \
  --member="user:[email protected]" \
  --role="roles/iam.serviceAccountTokenCreator"

# Alice generates a token for the SA — no key file, short-lived
gcloud auth print-access-token --impersonate-service-account="${SA_EMAIL}"

# Or configure ADC to use impersonation
export GOOGLE_IMPERSONATE_SERVICE_ACCOUNT="${SA_EMAIL}"
gcloud storage ls gs://app-assets  # runs as the SA

This is the right model for humans who need to act as service accounts for debugging or deployment: impersonate, use, done. The token expires. No file to manage.


Workload Identity Federation — Credentials Eliminated

The cleanest solution for any workload running outside GCP that needs to call GCP APIs: Workload Identity Federation. The external workload authenticates with its native identity (a GitHub Actions OIDC JWT, an AWS IAM role, a Kubernetes service account token), exchanges it for a short-lived GCP access token, and never handles a service account key.

# Create a Workload Identity Pool
gcloud iam workload-identity-pools create "github-actions-pool" \
  --project=my-project \
  --location=global \
  --display-name="GitHub Actions WIF Pool"

# Create a provider (GitHub OIDC)
gcloud iam workload-identity-pools providers create-oidc "github-provider" \
  --project=my-project \
  --location=global \
  --workload-identity-pool="github-actions-pool" \
  --issuer-uri="https://token.actions.githubusercontent.com" \
  --attribute-mapping="google.subject=assertion.sub,attribute.repository=assertion.repository" \
  --attribute-condition="assertion.repository_owner == 'my-org'"

# Allow a specific GitHub repo to impersonate the SA
gcloud iam service-accounts add-iam-policy-binding "${SA_EMAIL}" \
  --role="roles/iam.workloadIdentityUser" \
  --member="principalSet://iam.googleapis.com/projects/PROJECT_NUM/locations/global/workloadIdentityPools/github-actions-pool/attribute.repository/my-org/my-repo"

GitHub Actions workflow — no key files, no secrets stored in GitHub:

jobs:
  deploy:
    permissions:
      id-token: write   # required for OIDC token request
      contents: read
    steps:
      - uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: "projects/PROJECT_NUM/locations/global/workloadIdentityPools/github-actions-pool/providers/github-provider"
          service_account: "[email protected]"

      - run: gcloud storage cp dist/ gs://app-assets/ --recursive

The OIDC JWT from GitHub is presented to GCP, which verifies it against GitHub’s public keys, checks the attribute mapping and condition (only the specified repo can use this), and issues a short-lived GCP access token. The credential exists for the duration of the job and is then gone.


IAM Deny Policies — Org-Wide Guardrails

GCP added standalone deny policies separate from bindings. They override grants:

cat > deny-iam-escalation.json << 'EOF'
{
  "displayName": "Deny IAM escalation permissions to non-admins",
  "rules": [{
    "denyRule": {
      "deniedPrincipals": ["principalSet://goog/group/[email protected]"],
      "deniedPermissions": [
        "iam.googleapis.com/roles.create",
        "iam.googleapis.com/roles.update",
        "iam.googleapis.com/serviceAccounts.actAs"
      ]
    }
  }]
}
EOF

gcloud iam policies create deny-iam-escalation-policy \
  --attachment-point="cloudresourcemanager.googleapis.com/projects/my-project" \
  --policy-file=deny-iam-escalation.json

iam.serviceAccounts.actAs is worth calling out specifically. It’s the GCP equivalent of AWS’s iam:PassRole — it allows an identity to make a service act as a specified service account. If a developer can call actAs on a high-privileged service account, they can launch a GCE instance using that service account and then operate with its permissions. Same privilege escalation pattern as iam:PassRole, different name. Deny it for anyone who doesn’t explicitly need it.


⚠ Production Gotchas

roles/editor at folder level is a blast radius waiting to happen
The role sounds like “edit access.” At folder level it means edit access to every service in every project under that folder — including services nobody thought to trace. Always scope to the specific project or resource, never a folder unless the use case explicitly requires it.

allAuthenticatedUsers on a GCS bucket is public to 3 billion accounts
Any Google account — personal Gmail included — qualifies as “authenticated.” I’ve seen production customer data exposed this way while a developer tested a CDN pattern on the wrong bucket. Audit for these bindings before they become a breach notification.

Service account keys accumulate and nobody tracks them
In every GCP environment I’ve audited that allowed SA key creation, there were active keys for accounts that no longer existed, stored on laptops with no central inventory. Keys don’t expire. Audit with gcloud iam service-accounts keys list across every SA in every project.

iam.serviceAccounts.actAs is a privilege escalation path
If a principal can call actAs on a high-privileged SA, they can launch a GCE instance with that SA and operate with its full permissions — without ever being directly granted those permissions. Block this with a deny policy for everyone who doesn’t explicitly need it.

Org-level roles/viewer is not a safe broad grant
Read access to every project config, every service configuration, every infrastructure metadata object across your entire GCP estate is not a benign grant. Treat any binding above the project level as high-blast-radius, regardless of the role.


Quick Reference

# Audit org and folder structure before any high-level change
gcloud organizations list
gcloud resource-manager folders list --organization=ORG_ID
gcloud projects list --filter="parent.id=FOLDER_ID"

# Inspect all bindings at org level
gcloud organizations get-iam-policy ORG_ID --format=json | jq '.bindings[]'

# Find allUsers / allAuthenticatedUsers in a project
gcloud projects get-iam-policy PROJECT_ID --format=json \
  | jq '.bindings[] | select(.members[] | contains("allUsers") or contains("allAuthenticatedUsers"))'

# Check a GCS bucket for public access
gsutil iam get gs://BUCKET_NAME | grep -E "(allUsers|allAuthenticatedUsers)"

# Audit all user-managed SA keys across a project
gcloud iam service-accounts list --project=PROJECT_ID --format="value(email)" \
  | xargs -I{} gcloud iam service-accounts keys list --iam-account={} --managed-by=user

# List predefined roles for a service
gcloud iam roles list --filter="name:roles/storage" --format="table(name,title)"

# Inspect what permissions a role actually includes
gcloud iam roles describe roles/storage.objectViewer

# Grant time-limited access with conditional binding
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="user:[email protected]" \
  --role="roles/storage.objectViewer" \
  --condition="expression=request.time < timestamp('2026-06-30T00:00:00Z'),title=Contractor Q2 2026"

# Enable SA impersonation (avoids key creation)
gcloud iam service-accounts add-iam-policy-binding SA_EMAIL \
  --member="user:[email protected]" \
  --role="roles/iam.serviceAccountTokenCreator"

Framework Alignment

Framework Reference What It Covers Here
CISSP Domain 5 — Identity and Access Management GCP’s hierarchical model and service account patterns are the primary IAM constructs for GCP environments
CISSP Domain 3 — Security Architecture Resource hierarchy design determines access inheritance — architectural decisions with direct security implications
ISO 27001:2022 5.15 Access control GCP IAM bindings are the technical implementation of access control policy in GCP environments
ISO 27001:2022 5.18 Access rights Service account provisioning, conditional bindings with expiry, and workload identity federation
ISO 27001:2022 8.2 Privileged access rights Folder/org-level bindings and basic roles represent the highest-risk privilege grants in GCP
SOC 2 CC6.1 IAM bindings and Workload Identity Federation address machine identity controls for CC6.1
SOC 2 CC6.3 Conditional bindings with time-bound expiry directly satisfy access removal requirements

Key Takeaways

  • GCP IAM is hierarchical — bindings inherit downward; a binding at org or folder level has much larger scope than it appears
  • Basic roles (viewer/editor/owner) are too coarse for production; use predefined or custom roles and grant at the narrowest scope
  • Service account keys are a long-lived credential antipattern; use ADC on GCP infrastructure, impersonation for humans, and Workload Identity Federation for external workloads
  • allAuthenticatedUsers and allUsers bindings expose resources to the internet — audit for these in every environment
  • iam.serviceAccounts.actAs is a privilege escalation vector — treat it like iam:PassRole
  • Conditional bindings with expiry dates are better than “I’ll remember to remove this later”

What’s Next

EP06 covers Azure RBAC and Entra ID — the most directory-centric of the three models, where Active Directory’s 25 years of enterprise history shapes both the strengths and the complexity of Azure’s access control.

Next: Azure RBAC Scopes — Management Groups, Subscriptions, and how role inheritance works across the Microsoft estate.

Get EP06 in your inbox when it publishes → subscribe

What Is Cloud IAM — and Why Every API Call Depends on It

Reading Time: 11 minutes

Meta Description: Understand what cloud IAM is and why every API call in AWS, GCP, and Azure hits a deny-by-default check — the foundational model behind all cloud access.


What Is Cloud IAMAuthentication vs AuthorizationIAM Roles vs PoliciesAWS IAM Deep DiveGCP Resource Hierarchy IAMAzure RBAC Scopes


TL;DR

  • Cloud IAM is the system that decides whether any API call is allowed or denied — deny by default, explicit Allow required at every layer
  • Every API call answers four questions: Who? (Identity) What? (Action) On what? (Resource) Under what conditions? (Context)
  • Two identity types in every cloud account: human (engineers) and machine (Lambda, EC2, Kubernetes pods) — machine identities outnumber human by 10:1 in most production environments
  • AWS, GCP, and Azure share the same model: deny-by-default, policy-driven, principal-based — different syntax, same mental model
  • The gap between granted and used permissions is where attackers move — the average IAM entity uses under 5% of its granted permissions
  • IAM failure has two modes: over-permissioned (“it works”) and over-restricted (“it’s secure, engineers work around it”) — both end in incidents

The Big Picture

                        WHAT IS CLOUD IAM?

  Every API call in AWS, GCP, or Azure answers four questions:

  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
  │    WHO?     │   │   WHAT?     │   │  ON WHAT?   │   │  UNDER      │
  │             │   │             │   │             │   │  WHAT?      │
  │  Identity / │   │  Action /   │   │  Resource   │   │             │
  │  Principal  │   │  Permission │   │             │   │  Condition  │
  │             │   │             │   │             │   │             │
  │ IAM Role    │   │ s3:GetObject│   │ arn:aws:s3: │   │ MFA: true   │
  │ Svc Account │   │ ec2:Start   │   │ ::prod-data │   │ IP: 10.0/8  │
  │ Managed     │   │ iam:        │   │ /exports/*  │   │ Time: 09-17 │
  │ Identity    │   │   PassRole  │   │             │   │             │
  └─────────────┘   └─────────────┘   └─────────────┘   └─────────────┘
        └────────────────┴────────────────┴────────────────┘
                                  │
                     ┌────────────▼────────────┐
                     │    IAM Policy Engine    │
                     │    deny by default      │
                     │                         │
                     │  Explicit ALLOW?   ─────┼──→  PERMIT
                     │  Explicit DENY?    ─────┼──→  DENY (overrides Allow)
                     │  No matching rule? ─────┼──→  DENY (implicit)
                     └─────────────────────────┘

Cloud IAM is the answer to a question every growing infrastructure team hits: at scale, how do you know who can do what, why they can do it, and whether they still should?


Introduction

Cloud IAM (Identity and Access Management) is the control plane for access in every major cloud provider. Every API call — reading a file, starting an instance, invoking a function — goes through an IAM evaluation. The result is binary: explicit Allow or deny. There is no implicit access. Nothing is open by default. This is what makes cloud IAM fundamentally different from the access models that came before it.

Understanding why it works that way requires tracing how access control evolved — and what kept breaking at each stage.

A few years into my career managing Linux infrastructure, I was handed a production server audit. The task was straightforward: find out who had access to what. I pulled /etc/passwd, checked the sudoers file, reviewed SSH authorized_keys across the fleet.

Three days later, I had a spreadsheet nobody wanted to read.

The problem wasn’t that the access was wrong. Most of it was fine. The problem was that nobody — not the team lead, not the security team, not the engineers who’d been there five years — could tell me why a particular account had access to a particular server. It had accumulated. People joined, got access, changed teams, left. The access stayed.

That was a 40-server fleet in 2012.

Fast-forward to a cloud environment today: you might have 50 engineers, 300 Lambda functions, 20 microservices, CI/CD pipelines, third-party integrations, compliance scanners — all making API calls, all needing access to something. The identity sprawl problem I spent three days auditing manually on 40 servers now exists at a scale where manual auditing isn’t even a conversation.

This is the problem Identity and Access Management exists to solve. Not just in theory — in practice, at the scale cloud infrastructure demands.


How We Got Here — The Evolution of Access Control

To understand why cloud IAM works the way it does, you need to trace how access control evolved. The design decisions in AWS IAM, GCP, and Azure didn’t come out of nowhere. They’re answers to lessons learned the hard way across decades of broken systems.

The Unix Model (1970s–1990s): Simple and Sufficient

Unix got the fundamentals right early. Every resource (file, device, process) has an owner and a group. Every action is one of three: read, write, execute. Every user is either the owner, in the group, or everyone else.

-rw-r--r--  1 vamshi  engineers  4096 Apr 11 09:00 deploy.conf
# owner can read/write | group can read | others can read

For a single machine or a small network, this model is elegant. The permissions are visible in a ls -l. Reasoning about access is straightforward. Auditing means reading a few files.

However, the cracks started showing when organizations grew. You’d add sudo to give specific commands to specific users. Then sudoers files became 300 lines long. Then you’d have shared accounts because managing individual ones was “too much overhead.” Shared accounts mean no individual accountability. No accountability means no audit trail worth anything.

The Directory Era (1990s–2000s): Centralise or Collapse

As networks grew, every server managing its own /etc/passwd became untenable. Enter LDAP and Active Directory. Instead of distributing identity management across every machine, you centralised it: one directory, one place to add users, one place to disable them when someone left.

This was a significant step forward. Onboarding got faster. Offboarding became reliable. Group membership drove access to resources across the network.

Why Groups Became the New Problem

But the permission model was still coarse. You were either in the Domain Admins group or you weren’t. “Read access to the file share” was a group. “Deploy to the staging web server” was a group. Managing fine-grained permissions at scale meant managing hundreds of groups, and the groups themselves became the audit nightmare.

I spent time in environments like this. The group named SG_Prod_App_ReadWrite_v2_FINAL that nobody could explain. The AD group from a project that ended three years ago but was still in twenty user accounts. The contractor whose AD account was disabled but whose service account was still running a nightly job.

The directory model centralised identity. It didn’t solve the permissions sprawl problem.

The Cloud Shift (2006–2014): Everything Changes

AWS launched EC2 in 2006. In 2011, AWS IAM went into general availability. That date matters — for the first five years of AWS, access control was primitive. Root accounts. Access keys. No roles.

Early AWS environments I’ve seen (and had to clean up) reflect this era: a single root account access key shared across a team, rotated manually on a shared spreadsheet. Static credentials in application config files. EC2 instances with AdministratorAccess because “it was easier at the time.”

The Model That Changed Everything

The AWS team understood what they’d built was dangerous. IAM in 2011 introduced the model that all three major cloud providers now share: deny-by-default, policy-driven, principal-based access control. Not “who is in which group.” The question became: which policy explicitly grants this specific action on this specific resource to this specific identity.

GCP launched its IAM model with a different flavour in 2012 — hierarchical, additive, binding-based. Azure RBAC came to general availability in 2014, built on top of Active Directory’s identity model.

By 2015, the modern cloud IAM era was established. The primitives existed. The problem shifted from “does IAM exist?” to “are we using it correctly?” — and most teams were not.

In practice, that question is still the right one to ask today.


The Problem IAM Actually Solves

Here’s the honest version of what IAM is for, based on what I’ve seen go wrong without it.

Without proper IAM, you get one of two outcomes:

The first is what I call the “it works” environment. Everything runs. The developers are happy. Access requests take five minutes because everyone gets the same broad policy. And then a Lambda function’s execution role — which had s3:* on * because someone once needed to debug something — gets its credentials exposed through an SSRF vulnerability in the app it runs. That role can now read every bucket in the account, including the one with the customer database exports.

The second is the “it’s secure” environment. Access is locked down. Every request goes through a ticket. The ticket goes to a security team that approves it in three to five business days. Engineers work around it by storing credentials locally. The workarounds become the real access model. The formal IAM posture and the actual access posture diverge. The audit finds the formal one. Attackers find the real one.

IAM, done right, is the discipline of walking the line between those two outcomes. It’s not a product you buy or a feature you turn on. It’s a practice — a continuous process of defining what access exists, why it exists, and whether it’s still needed.


The Core Concepts — Taught, Not Listed

Let me walk you through the vocabulary you need, grounded in what each concept means in practice.

Identity: Who Is Making This Request?

An identity is any entity that can hold a credential and make requests. In cloud environments, identities split into two types:

Human identities are engineers, operators, and developers. They authenticate via the console, CLI, or SDK. They should ideally authenticate through a central IdP (Okta, Google Workspace, Entra ID) using federation — more on that in SAML vs OIDC: Which Federation Protocol Belongs in Your Cloud?.

Machine identities are everything else: Lambda functions, EC2 instances, Kubernetes pods, CI/CD pipelines, monitoring agents, data pipelines. In most production environments, machine identities outnumber human identities by 10:1 or more.

This ratio matters. When your security model is designed primarily for human access, the 90% of identities that are machines become an afterthought. That’s where access keys end up in environment variables, where Lambda functions get broad permissions because nobody thought carefully about what they actually need, where the real attack surface lives.

Principal: The Authenticated Identity Making a Specific Request

A principal is an identity that has been authenticated and is currently making a request. The distinction from “identity” is subtle but important: the principal includes the context of how the identity authenticated.

In AWS, an IAM role assumed by EC2, assumed by a Lambda, and assumed by a developer’s CLI session are three different principals — even if they all assume the same role. The session context, source, and expiration differ.

{
  "Principal": {
    "AWS": "arn:aws:iam::123456789012:role/DataPipelineRole"
  }
}

In GCP, the equivalent term is member. In Azure, it’s security principal — a user, group, service principal, or managed identity.

Resource: What Is Being Accessed?

A resource is whatever is being acted upon. In AWS, every resource has an ARN (Amazon Resource Name) — a globally unique identifier.

arn:aws:s3:::customer-data-prod          # S3 bucket
arn:aws:s3:::customer-data-prod/*        # everything inside that bucket
arn:aws:ec2:ap-south-1:123456789012:instance/i-0abcdef1234567890
arn:aws:iam::123456789012:role/DataPipelineRole

The ARN structure tells you: service, region, account, resource type, resource name. Once you can read ARNs fluently, IAM policies become much less intimidating.

Action: What Is Being Done?

An action (AWS/Azure) or permission (GCP) is the operation being attempted. Cloud providers express these as service:Operation strings:

# AWS
s3:GetObject           # read a specific object
s3:PutObject           # write an object
s3:DeleteObject        # delete an object — treat differently than read
iam:PassRole           # assign a role to a service — one of the most dangerous permissions
ec2:DescribeInstances  # list instances — often overlooked, but reveals infrastructure

# GCP
storage.objects.get
storage.objects.create
iam.serviceAccounts.actAs   # impersonate a service account — equivalent to iam:PassRole danger

When I audit IAM configurations, I pay special attention to any policy that includes iam:*, iam:PassRole, or wildcards like "Action": "*". These are the permissions that let a compromised identity create new identities, assign itself more power, or impersonate other accounts. They’re the privilege escalation primitives — more on that in AWS IAM Privilege Escalation: How iam:PassRole Leads to Full Compromise.

Policy: The Document That Connects Everything

A policy is a document that says: this principal can perform these actions on these resources, under these conditions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadCustomerDataBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::customer-data-prod",
        "arn:aws:s3:::customer-data-prod/*"
      ]
    }
  ]
}

Notice what’s explicit here: the effect (Allow), the exact actions (not s3:*), and the exact resource (not *). Every word in this document is a deliberate decision. The moment you start using wildcards to save typing, you’re writing technical debt that will come back as a security incident.


How IAM Actually Works — The Decision Flow

When any API call hits a cloud service, an IAM engine evaluates it. Understanding this flow is the foundation of debugging access issues, and more importantly, of understanding why your security posture is what it is.

Request arrives:
  Action:    s3:PutObject
  Resource:  arn:aws:s3:::customer-data-prod/exports/2026-04-11.csv
  Principal: arn:aws:iam::123456789012:role/DataPipelineRole
  Context:   { source_ip: "10.0.2.15", mfa: false, time: "02:30 UTC" }

IAM Engine evaluation (AWS):
  1. Is there an explicit Deny anywhere? → No
  2. Does the SCP (if any) allow this? → Yes
  3. Does the identity-based policy allow this? → Yes (via DataPipelinePolicy)
  4. Does the resource-based policy (bucket policy) allow or deny? → No explicit rule → implicit allow for same-account
  5. Is there a permissions boundary? → No
  Decision: ALLOW

The critical insight here: cloud IAM is deny-by-default. There is no implicit allow. If there is no policy that explicitly grants s3:PutObject to this role on this bucket, the request fails. The only way in is through an explicit "Effect": "Allow".

This is the opposite of how most traditional systems work. In a Unix permission model, if your file is world-readable (-r--r--r--), anyone can read it unless you actively restrict them. In cloud IAM, nothing is accessible unless you actively grant it.

When I’m debugging an AccessDenied error — and every engineer who works with cloud IAM spends significant time doing this — the mental model is always: “what is the chain of explicit Allows that should be granting this access, and at which layer is it missing?”


Why This Is Harder Than It Looks

Understanding the concepts is the easy part. The hard part is everything that happens at organisational scale over time.

Scale. A real AWS account in a growing company might have 600+ IAM roles, 300+ policies, and 40+ cross-account trust relationships. None of these were designed together. They evolved incrementally, each change made by someone who understood the context at the time and may have left the organisation since. The cumulative effect is an IAM configuration that no single person fully understands.

Drift. IAM configs don’t stay clean. An engineer needs to debug a production issue at 2 AM and grants themselves broad access temporarily. The temporary access never gets revoked. Multiply that by a team of 20 over three years. I’ve audited environments where 60% of the permissions in a role had never been used — not once — in the 90-day CloudTrail window. That unused 60% is pure attack surface.

The machine identity blind spot. Most IAM governance practices were built for human users. Service accounts, Lambda roles, and CI/CD pipeline identities get created rapidly and reviewed rarely. In my experience, these are the identities most likely to have excess permissions, least likely to be in the access review process, and most likely to be the initial foothold in a cloud breach.

The gap between granted and used. That said, this one surprised me most when I first started doing cloud security work. AWS data from real customer accounts shows the average IAM entity uses less than 5% of its granted permissions. That 95% excess isn’t just waste — it’s attack surface. Every permission that exists but isn’t needed is a permission an attacker can use if they compromise that identity.


IAM Across AWS, GCP, and Azure — The Conceptual Map

The three major providers implement IAM differently in syntax, but the same model underlies all of them. Once you understand one deeply, the others become a translation exercise.

Concept AWS GCP Azure
Identity store IAM users / roles Google accounts, Workspace Entra ID
Machine identity IAM Role (via instance profile or AssumeRole) Service Account Managed Identity
Access grant mechanism Policy document attached to identity or resource IAM binding on resource (member + role + condition) Role Assignment (principal + role + scope)
Hierarchy Account is the boundary; Org via SCPs Org → Folder → Project → Resource Tenant → Management Group → Subscription → Resource Group → Resource
Default stance Deny Deny Deny
Wildcard risk "Action": "*" on "Resource": "*" Primitive roles (viewer/editor/owner) Owner or Contributor assigned broadly

The hierarchy point is worth pausing on. AWS is relatively flat — the account is the primary security boundary. GCP’s hierarchy means a binding at the Organisation level propagates down to every project. Azure’s hierarchy means a role assignment at the Management Group level flows through every subscription beneath it.

The blast radius of a misconfiguration scales with how high in the hierarchy it sits.

This will matter in GCP IAM Policy Inheritance and Azure RBAC Explained when we go deep on GCP and Azure specifically. For now, the takeaway is: understand where in the hierarchy a permission is granted, because the same permission granted at the wrong level has a very different security implication.


Framework Alignment

If you’re mapping this episode to a control framework — for a compliance audit, a certification study, or building a security program — here’s where it lands:

Framework Reference What It Covers Here
CISSP Domain 1 — Security & Risk Management IAM as a risk reduction control; blast radius is a risk variable
CISSP Domain 5 — Identity and Access Management Direct implementation: who can do what, to which resources, under what conditions
ISO 27001:2022 5.15 Access control Policy requirements for restricting access to information and systems
ISO 27001:2022 5.16 Identity management Managing the full lifecycle of identities in the organization
ISO 27001:2022 5.18 Access rights Provisioning, review, and removal of access rights
SOC 2 CC6.1 Logical access security controls to protect against unauthorized access
SOC 2 CC6.3 Access removal and review processes to limit unauthorized access

Key Takeaways

  • IAM evolved from Unix file permissions → directory services → cloud policy engines, driven by scale and the failure modes of each prior model
  • Cloud IAM is deny-by-default: every access requires an explicit Allow somewhere in the policy chain
  • Identities are human or machine; in production, machines dominate — and they’re the under-governed majority
  • A policy binds a principal to actions on resources; every word is a deliberate security decision
  • The hardest IAM problems aren’t technical — they’re organisational: drift, unused permissions, machine identities nobody owns, and access reviews that never happen
  • The gap between granted and used permissions is where attackers find room to move

What’s Next

Now that you understand what IAM is and why it exists, the next question is the one that trips up even experienced engineers: what’s the difference between authentication and authorization, and why does conflating them cause security failures?

EP02 works through both — how cloud providers implement each, where the boundary sits, and why getting this boundary wrong creates exploitable gaps.

Next: Authentication vs Authorization: AWS AccessDenied Explained

Get EP02 in your inbox when it publishes → subscribe