TL;DR
- Zero Trust: trust nothing implicitly, verify everything explicitly, minimize blast radius by assuming you will be breached
- Network location is not identity — VPN is authentication for the tunnel, not authorization for the resource
- JIT privilege elevation removes standing admin access: engineers request elevation for a specific purpose, scoped to a specific duration
- Device posture is an access signal — a compromised endpoint with valid credentials is still a threat; Conditional Access gates on device compliance
- Continuous session validation re-evaluates signals throughout the session — device falls out of compliance, sessions revoke in minutes, not at expiry
- The highest-ROI early moves: eliminate machine static credentials, enforce MFA on all human access, federate to a single IdP
The Big Picture
ZERO TRUST IAM: EVERY REQUEST EVALUATED INDEPENDENTLY

API call arrives
       │
       ▼
Identity verified? ──── No ────► DENY
       │
      Yes
       │
       ▼
Device compliant? ───── No ────► DENY (or step-up MFA)
       │
      Yes
       │
       ▼
Policy allows this ──── No ────► DENY
action on this ARN?
       │
      Yes
       │
       ▼
Conditions met? ─────── No ────► DENY
(time, IP, MFA age,              (e.g., outside business hours,
 risk score, session)             impossible travel detected)
       │
      Yes
       │
       ▼
ALLOW ─────────────────────────► LOG every decision (allow and deny)
       │
       └── Continuous re-evaluation:
             device state changes → revoke
             anomaly detected → revoke or step-up
             credential age → require re-auth
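The flow above can be sketched as a chain of gates in which the first failure denies the request and every outcome is logged. This is a minimal illustration, not a provider implementation: the `Request` fields, the `ALLOWED` table, and the 60-minute MFA threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Request:
    principal: str          # e.g. a role name; empty string means unauthenticated
    device_compliant: bool  # posture signal reported by device management
    action: str
    resource: str
    mfa_age_minutes: int


# Hypothetical policy table: (principal, action, resource) tuples that are allowed.
ALLOWED = {
    ("role/order-processor", "sqs:ReceiveMessage",
     "arn:aws:sqs:us-east-1:111122223333:orders"),
}

MAX_MFA_AGE_MINUTES = 60  # assumed threshold, not a provider default

decision_log: List[dict] = []


def evaluate(req: Request) -> str:
    """Run the four gates in order; the first failing gate denies the request."""
    gates: List[Tuple[str, Callable[[Request], bool]]] = [
        ("identity", lambda r: bool(r.principal)),
        ("device", lambda r: r.device_compliant),
        ("policy", lambda r: (r.principal, r.action, r.resource) in ALLOWED),
        ("conditions", lambda r: r.mfa_age_minutes <= MAX_MFA_AGE_MINUTES),
    ]
    decision = "ALLOW"
    failed_gate: Optional[str] = None
    for name, check in gates:
        if not check(req):
            decision, failed_gate = "DENY", name
            break
    # Log every decision, allow and deny alike.
    decision_log.append({
        "principal": req.principal,
        "action": req.action,
        "decision": decision,
        "failed_gate": failed_gate,
    })
    return decision
```

Nothing is cached between calls: a request that passed a minute ago is evaluated from scratch the next time it arrives.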
Introduction
The perimeter model of network security made a bet: inside the network is trusted, outside is not. Lock down the perimeter tightly enough and you’re safe. VPN in, and you’re one of us.
I grew up professionally in that model. Firewalls, DMZs, trusted zones. The idea had intuitive appeal — you build walls, you control what crosses them. For a while it worked reasonably well.
Then I watched it fail, repeatedly, in ways that were predictable in hindsight. An engineer’s laptop gets compromised at a coffee shop. They VPN in. Now the attacker is “inside.” A contractor account gets phished. They have valid Active Directory credentials. They’re inside. A cloud service gets misconfigured and exposes a management interface. There’s no perimeter for that to be inside of.
The perimeter model failed not because the walls weren’t strong enough, but because the premise was wrong. There is no inside. There is no perimeter that reliably separates trusted from untrusted. In a world of remote work, cloud services, contractor access, and API integrations, the attack surface doesn’t respect network boundaries.
Zero Trust is the architecture built on a different premise: trust nothing implicitly. Verify everything explicitly. Minimize blast radius by assuming you will be breached.
This isn’t a product you buy. It’s a set of principles applied to how you design, build, and operate your IAM. This episode shows how those principles translate into concrete practices, building on everything we’ve covered in this series.
The Three Principles
Verify Explicitly
Every request must carry verifiable identity and context. Network location is not identity.
Old model:  request from 10.0.0.0/8 → trusted, proceed
Zero Trust: request from 10.0.0.0/8 → still must present verifiable identity
                                      still must pass authorization check
                                      still must pass context evaluation
                                      then proceed (or deny)
In cloud IAM terms: every API call carries identity claims (IAM role ARN, federated identity, managed identity), and those claims are verified against policy on every single request. There’s no concept of “once authenticated, trusted until logout.” Cloud provider APIs already work this way natively: each call is authenticated and authorized independently. The challenge is extending the same model to internal services, internal APIs, and human access patterns.
Implementation in practice:
- mTLS for service-to-service communication: both sides present certificates; identity is the certificate, not the network path
- Bearer tokens on every internal API call: no session cookies, no “we’re on the same VPC so it’s fine”
- Short-lived credentials everywhere: a compromised credential expires quickly on its own, rather than staying usable until an 8-hour session times out
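To make the bearer-token and short-lived-credential points concrete, here is a minimal sketch of a signed token that is checked on every call and stops working after five minutes. It uses only the standard library; the `SECRET` value, the five-minute TTL, and the token layout are assumptions for illustration, and a real deployment would more likely use a standard format such as JWT issued by the identity provider.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-signing-key"  # hypothetical; a real service would load this from a secret store
TTL_SECONDS = 300             # assumed lifetime of five minutes


def issue_token(subject: str, now: Optional[float] = None) -> str:
    """Return '<base64 payload>.<hex signature>' with an embedded expiry."""
    issued = time.time() if now is None else now
    payload = json.dumps({"sub": subject, "exp": issued + TTL_SECONDS}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the subject if the signature is valid and the token is unexpired, else None."""
    current = time.time() if now is None else now
    try:
        body, sig = token.rsplit(".", 1)
    except ValueError:
        return None  # no separator: not a token we issued
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # tampered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(body.encode()))
    if current >= claims["exp"]:
        return None  # expired: the caller must obtain a fresh token
    return claims["sub"]
```

The receiving service calls `verify_token` on every request, so a stolen token is useful to an attacker only until its `exp` passes.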
Use Least Privilege — Just-in-Time, Just-Enough
No standing access to sensitive resources. Access granted when needed, for the minimum scope, for the minimum duration.
Old model:  alice is in the DBA group → permanent access to all databases
Zero Trust: alice requests access to production DB →
              verified: alice's device is enrolled in MDM and compliant
              verified: alice has an open change ticket for this task
              verified: current time is within business hours
              granted: connection to this specific database, from alice's specific IP
                       for 2 hours, then revoked automatically
This is JIT access. It reduces the window where a compromised credential can cause damage. It requires a change in how engineers think about access: access is not a property you have, it’s something you request when you need it. The operational friction is a feature, not a bug — it’s the friction of having to justify each elevated access request that keeps the access model honest.
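The request flow described above can be sketched as a function that issues a time-boxed grant only when every precondition holds. The business-hours window, the two-hour duration, and the class names are assumptions for illustration; a real system would take them from policy and from the ticketing and MDM integrations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class AccessRequest:
    user: str
    resource: str
    device_compliant: bool
    change_ticket: Optional[str]  # None means no open ticket
    requested_at: datetime


@dataclass(frozen=True)
class Grant:
    user: str
    resource: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        # The grant carries its own expiry, so nobody has to remember to remove it.
        return now < self.expires_at


BUSINESS_HOURS = range(9, 18)        # assumed 09:00 to 17:59 in the request's timezone
GRANT_DURATION = timedelta(hours=2)  # matches the two-hour window in the example above


def request_access(req: AccessRequest) -> Optional[Grant]:
    """Issue a time-boxed grant only if every precondition holds."""
    if not req.device_compliant:
        return None
    if not req.change_ticket:
        return None
    if req.requested_at.hour not in BUSINESS_HOURS:
        return None
    return Grant(req.user, req.resource, req.requested_at + GRANT_DURATION)
```

Because expiry is part of the grant itself, the default outcome of doing nothing is that access goes away.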
Assume Breach
Design systems as if the attacker is already inside. This drives different decisions:
- Micro-segmentation: one role per service, minimum permissions per role. If one service is compromised, it can’t pivot to everything else.
- Log everything: every authorization decision, allow or deny. When you’re investigating an incident, you need to know what happened, not just that something happened.
- Automate response: anomalous API call pattern → trigger automated credential revocation or session termination. Don’t wait for a human to notice.
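As a rough sketch of the "automate response" item, the detector below watches authorization decisions and marks a principal for revocation after a burst of denies inside a sliding window. The window length, the threshold, and the `revoked` set are assumptions; in a real deployment the last step would call the provider's revocation or session-termination API.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Set

WINDOW_SECONDS = 60  # assumed detection window
DENY_THRESHOLD = 5   # assumed number of denies that counts as anomalous


class DenyBurstDetector:
    """Flag a principal when too many denied calls land within one window."""

    def __init__(self) -> None:
        self._denies: Dict[str, Deque[float]] = defaultdict(deque)
        self.revoked: Set[str] = set()

    def record(self, principal: str, decision: str, timestamp: float) -> None:
        if decision != "DENY":
            return
        window = self._denies[principal]
        window.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while window and timestamp - window[0] > WINDOW_SECONDS:
            window.popleft()
        if len(window) >= DENY_THRESHOLD:
            # Placeholder for a real revocation call.
            self.revoked.add(principal)
```

A burst of denies is only one possible signal; the point is that containment happens in code, without waiting for someone to read a dashboard.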
Building Zero Trust IAM — Block by Block
Block 1: Strong Identity Foundation
You can’t verify explicitly without strong authentication. The starting point:
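# AWS: deny any request made without MFA, enforced via SCP across the org
# This is a single statement; place it inside the "Statement" array of the SCP.
# BoolIfExists also matches requests where the MFA key is absent (access keys,
# workload roles), so exempt those roles in aws:PrincipalArn before attaching it.
{
  "Effect": "Deny",
  "Action": "*",
  "Resource": "*",
  "Condition": {
    "BoolIfExists": {
      "aws:MultiFactorAuthPresent": "false"
    },
    "StringNotLike": {
      "aws:PrincipalArn": [
        "arn:aws:iam::*:role/aws-service-role/*",
        "arn:aws:iam::*:role/OrganizationAccountAccessRole"
      ]
    }
  }
}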
# GCP: enforce OS Login for VM SSH (ties SSH access to Google identity, not SSH keys)
gcloud compute project-info add-metadata \
--metadata enable-oslogin=TRUE
# This means: SSH to a VM requires your Google identity to have roles/compute.osLogin
# or roles/compute.osAdminLogin. No more managing ~/.authorized_keys files on instances.
For human access: hardware FIDO2 keys (YubiKey, Google Titan) rather than TOTP where possible. TOTP codes can be phished in real-time adversary-in-the-middle attacks. FIDO2 keys resist this because the cryptographic challenge-response is bound to the origin: an assertion produced on a look-alike domain does not verify on the real one.
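A toy model shows why origin binding matters. This is deliberately not WebAuthn: real authenticators use per-site public-key credentials, whereas this sketch substitutes a shared HMAC key so it can run with the standard library alone. The class, function, and domain names are hypothetical.

```python
import hashlib
import hmac
import os


class ToyAuthenticator:
    """Stand-in for a hardware key. HMAC replaces the real public-key signature."""

    def __init__(self) -> None:
        self.key = os.urandom(32)

    def sign(self, challenge: bytes, origin: str) -> bytes:
        # The origin the browser actually sees is mixed into the signed data.
        message = challenge + b"|" + origin.encode()
        return hmac.new(self.key, message, hashlib.sha256).digest()


def relying_party_verify(key: bytes, challenge: bytes, assertion: bytes,
                         expected_origin: str = "https://login.example.com") -> bool:
    """The real site recomputes the value using its own origin."""
    message = challenge + b"|" + expected_origin.encode()
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(assertion, expected)
```

If a user is lured to `https://login.examp1e.com`, the assertion is computed over that origin, so relaying it to the real site fails verification. A TOTP code carries no such binding, which is why it can be relayed.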
Block 2: Device Posture as an Access Signal
In a Zero Trust model, the identity of the user is necessary but not sufficient. The state of the device matters too — a compromised endpoint with valid credentials is still a threat.
# Azure Conditional Access: block access from non-compliant devices
# (configured in the Entra ID Conditional Access portal)
conditions:
clientAppTypes: [browser, mobileAppsAndDesktopClients]
devices:
deviceFilter:
mode: exclude
rule: "device.isCompliant -eq True and device.trustType -eq 'AzureAD'"
grantControls:
builtInControls: [compliantDevice]
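# AWS Verified Access: identity + device posture for application access, no VPN
aws ec2 create-verified-access-instance \
  --description "Zero Trust app access"
# Create the identity trust provider (Okta OIDC)
# --policy-reference-name is required; "okta" and "jamf" below are placeholder names.
# The OIDC options also take AuthorizationEndpoint, TokenEndpoint, and UserInfoEndpoint (elided here).
aws ec2 create-verified-access-trust-provider \
  --trust-provider-type user \
  --user-trust-provider-type oidc \
  --policy-reference-name okta \
  --oidc-options Issuer=https://company.okta.com,ClientId=...,ClientSecret=...,Scope=openid
# Create the device trust provider (Jamf, CrowdStrike, or JumpCloud)
aws ec2 create-verified-access-trust-provider \
  --trust-provider-type device \
  --device-trust-provider-type jamf \
  --policy-reference-name jamf \
  --device-options TenantId=JAMF_TENANT_ID
# Then attach each provider to the instance with attach-verified-access-trust-provider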
AWS Verified Access allows users to reach internal applications by verifying both their identity (via OIDC) and their device health (via MDM) — without a VPN. The access gateway evaluates both signals on every connection, not just at login.
Block 3: Just-in-Time Privilege Elevation
No standing elevated access. Engineers are eligible for elevated roles; they activate them when needed.
# Azure PIM: engineer activates an eligible privileged role
az rest --method POST \
--uri "https://graph.microsoft.com/v1.0/roleManagement/directory/roleAssignmentScheduleRequests" \
--body '{
"action": "selfActivate",
"principalId": "USER_OBJECT_ID",
"roleDefinitionId": "ROLE_DEF_ID",
"directoryScopeId": "/",
"justification": "Investigating security alert in tenant — incident ticket INC-2026-0411",
"scheduleInfo": {
"startDateTime": "2026-04-11T09:00:00Z",
"expiration": {"type": "AfterDuration", "duration": "PT4H"}
}
}'
# Access activates, lasts 4 hours, then automatically removed
# AWS: temporary account assignment via Identity Center
# (typically triggered by ITSM workflow integration, not manual CLI)
aws sso-admin create-account-assignment \
--instance-arn "arn:aws:sso:::instance/ssoins-xxx" \
--target-id ACCOUNT_ID \
--target-type AWS_ACCOUNT \
--permission-set-arn "arn:aws:sso:::permissionSet/ssoins-xxx/ps-yyy" \
--principal-type USER \
--principal-id USER_ID
# Schedule deletion (using EventBridge + Lambda in a real deployment)
aws sso-admin delete-account-assignment \
--instance-arn "arn:aws:sso:::instance/ssoins-xxx" \
--target-id ACCOUNT_ID \
--target-type AWS_ACCOUNT \
--permission-set-arn "arn:aws:sso:::permissionSet/ssoins-xxx/ps-yyy" \
--principal-type USER \
--principal-id USER_ID
The operational change this requires: engineers stop thinking of access as something they hold permanently and start thinking of it as something they request for a specific purpose. This feels like friction until you’re investigating an incident and you have a precise record of who activated what elevated access and why.
Block 4: Continuous Session Validation
Traditional auth: verify once at login, trust the session until timeout.
Zero Trust auth: re-evaluate access signals continuously throughout the session.
Session starts: identity verified + device compliant + IP in expected range
                → access granted
15 minutes later: impossible travel detected (IP changes to different country)
                → step-up authentication required, or session terminated
Later: device compliance state changes (EDR detects malware)
                → all active sessions for this device revoked immediately
This requires integration between your identity platform and your device management / EDR tooling. Entra ID Conditional Access with Continuous Access Evaluation (CAE) implements this natively — when certain events occur (device compliance change, IP anomaly, token revocation), access tokens are invalidated within minutes rather than waiting for natural expiry.
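The re-evaluation loop can be sketched as a small state machine that reacts to signals arriving mid-session. The signal names and the three states are assumptions chosen to mirror the timeline above; this is not how CAE is implemented internally.

```python
from dataclasses import dataclass
from enum import Enum


class SessionState(Enum):
    ACTIVE = "active"
    STEP_UP_REQUIRED = "step_up_required"
    REVOKED = "revoked"


@dataclass
class Session:
    user: str
    device_id: str
    country: str  # country observed when the session started
    state: SessionState = SessionState.ACTIVE


def on_signal(session: Session, signal: str, value: str) -> SessionState:
    """Re-evaluate one session when a new signal arrives mid-session."""
    if session.state is SessionState.REVOKED:
        return session.state  # revocation is terminal
    if signal == "device_compliance" and value == "non_compliant":
        session.state = SessionState.REVOKED
    elif signal == "country" and value != session.country:
        session.state = SessionState.STEP_UP_REQUIRED
    return session.state
```

The session's fate is decided by events as they happen, not by a timeout that was set at login.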
# GCP: bind IAM access to an Access Context Manager access level
# The access level enforces device compliance: if the device falls out of compliance,
# the access level is no longer satisfied and requests are denied
# Note: request.auth.access_levels is documented for Identity-Aware Proxy resources;
# confirm support for the target service, or enforce the level via VPC Service Controls
gcloud projects add-iam-policy-binding my-project \
  --member="user:USER_EMAIL" \
  --role="roles/bigquery.admin" \
  --condition="expression='accessPolicies/POLICY_NUM/accessLevels/corporate_compliant_device' in request.auth.access_levels,title=Compliant device required"
Block 5: Micro-Segmented Permissions
Every service has its own identity. Every identity has only what it needs. Compromise of one service cannot propagate to others.
# Terraform: IAM as code, where each service gets a dedicated, scoped role
resource "aws_iam_role" "order_processor" {
  name                 = "svc-order-processor"
  permissions_boundary = aws_iam_policy.service_boundary.arn
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}
resource "aws_iam_role_policy" "order_processor" {
  name = "order-processor-policy"
  role = aws_iam_role.order_processor.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["sqs:ReceiveMessage", "sqs:DeleteMessage", "sqs:GetQueueAttributes"]
        Resource = aws_sqs_queue.orders.arn
      },
      {
        Effect   = "Allow"
        Action   = ["dynamodb:PutItem", "dynamodb:GetItem", "dynamodb:UpdateItem"]
        Resource = aws_dynamodb_table.orders.arn
      }
    ]
  })
}
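# Open Policy Agent: enforce IAM standards at the policy level
# Run this in CI/CD: fail the build if any Allow statement uses a wildcard Action
# or allows a Delete action on all resources
package iam.policy

# Action may be a single string or a list of strings; normalize to a set
actions(stmt) = {stmt.Action} { is_string(stmt.Action) }
actions(stmt) = {a | a := stmt.Action[_]} { is_array(stmt.Action) }

deny[msg] {
    stmt := input.Statement[i]
    stmt.Effect == "Allow"
    action := actions(stmt)[_]
    action == "*"
    msg := sprintf("Statement %d has wildcard Action, which is not allowed", [i])
}

deny[msg] {
    stmt := input.Statement[i]
    stmt.Effect == "Allow"
    stmt.Resource == "*"
    action := actions(stmt)[_]
    # IAM delete actions look like "s3:DeleteObject", so match on ":Delete"
    contains(action, ":Delete")
    msg := sprintf("Statement %d allows %s on all resources; a specific ARN is required", [i, action])
}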
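For teams that don't run OPA, the same two checks can be approximated in a few lines of Python inside a CI step. This is a sketch under the assumption that policies are available as parsed JSON documents; the function names are hypothetical, and it covers only the two rules above rather than a full policy linter.

```python
from typing import Any, Dict, List


def as_list(value: Any) -> List[str]:
    """IAM allows Action and Resource to be a string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def lint_policy(policy: Dict[str, Any]) -> List[str]:
    """Return one finding per violation; an empty list means the policy passes."""
    findings: List[str] = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    for i, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = as_list(stmt.get("Action", []))
        resources = as_list(stmt.get("Resource", []))
        if "*" in actions:
            findings.append(f"Statement {i} has wildcard Action")
        if "*" in resources:
            for action in actions:
                # Delete actions look like "s3:DeleteObject".
                if ":Delete" in action:
                    findings.append(f"Statement {i} allows {action} on all resources")
    return findings
```

A CI job would load each policy document, call `lint_policy`, and fail the build if the returned list is non-empty.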
Block 6: Universal Audit Trail
Zero Trust without logging is just obscurity. Every authorization decision — allow and deny — must be logged, retained, and queryable.
# AWS: verify CloudTrail is comprehensive
aws cloudtrail get-trail-status --name management-trail
# Must have: IsLogging=true
aws cloudtrail describe-trails --trail-name-list management-trail
# Must have: IsMultiRegionTrail=true, IncludeGlobalServiceEvents=true
# Verify no management events are excluded
aws cloudtrail get-event-selectors --trail-name management-trail \
  | jq '.EventSelectors[] | {ReadWrite: .ReadWriteType, Mgmt: .IncludeManagementEvents}'
# ReadWriteType should be "All"; IncludeManagementEvents should be true
# GCP: ensure Data Access audit logs are enabled for IAM
gcloud projects get-iam-policy my-project --format=json | jq '.auditConfigs'
# Should see auditLogConfigs for cloudresourcemanager.googleapis.com and iam.googleapis.com
# with both DATA_READ and DATA_WRITE enabled
# Azure: route Entra ID logs to Log Analytics for long-term retention and querying
# Entra ID diagnostic settings are tenant-level (microsoft.aadiam), so this sketch calls
# the ARM REST API directly; verify the api-version against current Azure documentation
az rest --method PUT \
  --uri "https://management.azure.com/providers/microsoft.aadiam/diagnosticSettings/entra-audit-to-la?api-version=2017-04-01-preview" \
  --body '{
    "properties": {
      "workspaceId": "/subscriptions/SUB_ID/resourceGroups/rg-monitoring/providers/Microsoft.OperationalInsights/workspaces/security-logs",
      "logs": [{"category":"AuditLogs","enabled":true},{"category":"SignInLogs","enabled":true}]
    }
  }'
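The provider logs above cover cloud API calls. For internal services that make their own authorization decisions, the same principle applies: emit one structured record per decision, allow or deny. A minimal sketch follows; the field names are assumptions rather than any provider's schema.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def decision_record(principal: str, action: str, resource: str, allowed: bool,
                    reason: str, at: Optional[datetime] = None) -> str:
    """Serialize one authorization decision as a single JSON line."""
    timestamp = (at or datetime.now(timezone.utc)).isoformat()
    return json.dumps({
        "timestamp": timestamp,
        "principal": principal,
        "action": action,
        "resource": resource,
        "decision": "ALLOW" if allowed else "DENY",
        "reason": reason,
    }, sort_keys=True)
```

One JSON object per line keeps the output easy to ship to whatever log store is already in use, and recording denies alongside allows is what makes failed probing visible during an investigation.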
Framework Alignment
Zero Trust IAM isn’t a framework itself — it’s a design philosophy. But it maps cleanly onto the controls that compliance frameworks are pushing organizations toward:
| Framework | Reference | What It Covers Here |
|---|---|---|
| CISSP | Domain 5 — IAM | Zero Trust reframes IAM as continuous, context-aware verification rather than perimeter-based trust |
| CISSP | Domain 1 — Security & Risk Management | Assume breach as a risk management posture; blast radius minimization through least privilege |
| CISSP | Domain 7 — Security Operations | Continuous monitoring, anomaly detection, and automated response are operational requirements of Zero Trust |
| ISO 27001:2022 | 5.15 Access control | Zero Trust access policy: verify explicitly, least privilege, assume breach |
| ISO 27001:2022 | 8.16 Monitoring activities | Continuous session validation and universal audit trail — all authorization decisions logged |
| ISO 27001:2022 | 8.20 Networks security | Micro-segmentation and mTLS replace implicit network trust with verified identity at every hop |
| ISO 27001:2022 | 5.23 Information security for use of cloud services | Zero Trust architecture applied to cloud IAM across AWS, GCP, and Azure |
| SOC 2 | CC6.1 | Zero Trust logical access controls — JIT, device posture, context-aware authorization |
| SOC 2 | CC6.7 | Continuous session validation and transmission controls across all system components |
| SOC 2 | CC7.1 | Threat detection through universal audit trails and anomaly-triggered automated response |
| SOC 2 | CC7.2 | Incident response — automated revocation and session termination on anomaly detection |
Zero Trust Maturity — Where to Start
Most organizations think about Zero Trust as a destination — something you achieve after a large, multi-year program. The reality is it’s a direction, and any movement in that direction reduces risk.
| Level | Where You Are | What to Build Next |
|---|---|---|
| 1 — Initial | Some MFA; static credentials for machines; no centralized IdP | Eliminate machine static keys → workload identity |
| 2 — Managed | Centralized IdP; SSO for most systems; some MFA enforcement | Close SSO gaps; enforce MFA everywhere; federate to cloud |
| 3 — Defined | Least privilege being enforced; audit tooling in use; JIT for some privileged access | Expand JIT; policy-as-code in CI/CD; quarterly access reviews |
| 4 — Contextual | Device posture in access decisions; conditional access policies | Continuous session evaluation; automated anomaly response |
| 5 — Optimizing | Policy-as-code everywhere; automated right-sizing; anomaly-triggered revocation | Refine and maintain — Zero Trust is never “done” |
The jump from Level 1 to Level 3 delivers the most security value per unit of effort. Start there. Don’t defer least privilege enforcement while you build a sophisticated device posture integration.
The Practical Sequence
If you’re building Zero Trust IAM from where most organizations are, this is the order that maximizes early security value:
- Inventory all identities: human and machine. You cannot secure what you can’t see. Build a complete picture before changing anything.
- Eliminate static credentials for machines: replace access keys and SA key files with workload identity. This is the highest-ROI change in most environments.
- Enforce MFA for all human access: especially cloud consoles, IdP admin, and VPN. Hardware keys for privileged accounts.
- Federate human identity: single IdP, SSO to cloud and major applications. Centralize the revocation path.
- Right-size IAM permissions: use last-accessed data and IAM Recommender to find and remove unused permissions. This is a continuous discipline, not a one-time clean-up.
- JIT for privileged access: Azure PIM, AWS Identity Center assignment automation, or equivalent for all elevated roles. No standing admin.
- IAM as code: all IAM changes via Terraform/Pulumi/CDK, reviewed in pull requests, validated by Access Analyzer or OPA in CI/CD, applied through automation.
- Continuous monitoring: alerts on IAM mutations, anomalous API call patterns, new cross-account trust relationships, new public resource exposures.
- Add context signals: Conditional Access policies incorporating device posture. Access Context Manager in GCP. AWS Verified Access for application access.
- Automated response: anomaly detected → automatic credential suspension or session termination. Close the window between detection and containment.
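The right-sizing step in the sequence above comes down to a set comparison between what an identity is granted and what it has recently used. A minimal sketch, assuming last-used dates have already been exported from the provider's last-accessed data; the 90-day idle threshold and the function name are assumptions.

```python
from datetime import date, timedelta
from typing import Dict, Optional, Set


def unused_permissions(granted: Set[str],
                       last_used: Dict[str, Optional[date]],
                       today: date,
                       max_idle_days: int = 90) -> Set[str]:
    """Return granted permissions that were never used or not used recently."""
    cutoff = today - timedelta(days=max_idle_days)
    unused: Set[str] = set()
    for permission in granted:
        used_on = last_used.get(permission)  # None means never used
        if used_on is None or used_on < cutoff:
            unused.add(permission)
    return unused
```

The result is a list of candidates for review, not for automatic deletion: some permissions are used rarely but legitimately, such as those needed only during disaster recovery.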
Series Complete
This series covered Cloud IAM from the question “what even is IAM?” to Zero Trust architecture:
| Episode | Topic | The Core Lesson |
|---|---|---|
| EP01 | What is IAM? | Access management is deny-by-default; every grant is an explicit decision |
| EP02 | AuthN vs AuthZ | Two separate gates; passing one doesn’t open the other |
| EP03 | Roles, Policies, Permissions | Structure prevents drift; wildcards accumulate into exposure |
| EP04 | AWS IAM Deep Dive | Trust policies and permission policies are both required; the evaluation chain has six layers |
| EP05 | GCP IAM Deep Dive | Hierarchy inheritance is a feature that needs careful handling; service account keys are an antipattern |
| EP06 | Azure RBAC and Entra ID | Two separate authorization planes; managed identities are the right model for workloads |
| EP07 | Workload Identity | Static credentials for machines are solvable at the root; OIDC token exchange replaces them |
| EP08 | IAM Attack Paths | The attack chain runs through IAM; iam:PassRole and its equivalents are privilege escalation primitives |
| EP09 | Least Privilege Auditing | 5% utilization is the average; the 95% excess is attack surface — and it’s measurable |
| EP10 | Federation, OIDC, SAML | The IdP is the trust anchor; everything downstream is bounded by its security |
| EP11 | Kubernetes RBAC | Two separate IAM layers; both must be secured; cluster-admin is the first thing to audit |
| EP12 | Zero Trust IAM | Trust nothing implicitly; verify everything explicitly; minimize blast radius through least privilege at every layer |
IAM is not a feature you configure. It’s a practice you maintain. The organizations that operate with genuinely low cloud IAM risk don’t have fewer identities — they have better visibility into what those identities can do, and why, and what happened when something went wrong.
That’s what this series has been building toward.
The full series is at linuxcent.com/cloud-iam-series. If you found it useful, the best thing you can do is subscribe — the next series covers eBPF: what’s actually running in kernel space when Cilium, Falco, and Tetragon are doing their work.