One Blueprint, Six Clouds — Multi-Provider OS Image Builds

Reading Time: 6 minutes

OS Hardening as Code, Episode 3
Cloud AMI Security Risks · Linux Hardening as Code · Multi-Cloud OS Hardening

Focus Keyphrase: multi-cloud OS hardening
Search Intent: Informational
Meta Description: Maintain one OS hardening baseline across AWS, GCP, and Azure without separate scripts that drift. One HardeningBlueprint YAML, six providers, zero duplication. (155 chars)


TL;DR

  • Multi-cloud OS hardening with separate scripts per provider means three scripts that drift within weeks
  • A HardeningBlueprint YAML separates compliance intent (portable) from provider details (handled by Stratum’s provider layer)
  • The same blueprint builds on AWS, GCP, Azure, DigitalOcean, Linode, and Proxmox with a single --provider flag change
  • Provider-specific differences — disk names, cloud-init ordering, metadata endpoint IPs — are abstracted away from the blueprint author
  • One YAML file becomes the single source of truth for OS security posture across your entire fleet, regardless of cloud
  • Drift detection works fleet-wide: rescan any instance against the original blueprint grade on any provider

The Problem: Three Clouds, Three Scripts, Three Ways to Drift

AWS hardening script          GCP hardening script          Azure hardening script
├── /dev/xvd* disk refs       ├── /dev/sda* disk refs       ├── /dev/sda* disk refs
├── 169.254.169.254 IMDS      ├── 169.254.169.254 IMDS      ├── 169.254.169.254 IMDS
├── cloud-init order A        ├── cloud-init order B        ├── cloud-init order C
└── Updated: Jan 2025         └── Updated: Aug 2024         └── Updated: Mar 2024
                                         │
                                         └─ 5 months behind
                                            on CIS updates

Multi-cloud OS hardening starts as a copy-paste of the AWS script. Within a month, the clouds diverge.

EP02 showed that a HardeningBlueprint YAML eliminates the skip-at-2am problem by making hardening a build artifact. What it assumed — quietly — is that you’re building for one provider. The moment you expand to a second cloud, the provider-specific details in the blueprint become a problem: disk names differ, cloud-init fires in a different order, and AWS-specific assumptions break silently on GCP.


We expanded from AWS to GCP six months ago. The EC2 hardening script had been working reliably for over a year. The GCP engineer took the AWS script, made some quick changes, and started building images.

The first GCP images had a subtle problem: the /tmp and /home separate partition entries in /etc/fstab referenced /dev/xvdb — an AWS disk naming convention. GCP uses /dev/sdb. The fstab entries were silently ignored. The mounts existed but weren’t restricted. The CIS controls for separate filesystem partitions were listed as passing in the scan output because the Ansible task had “run successfully” — it just hadn’t done what we thought.

It took a pentest three months later to catch it. The finding: six production GCP instances with /tmp not mounted with noexec, nosuid, nodev — despite our “CIS L1 hardened” label.

The root cause wasn’t the engineer. It was a hardening approach that required cloud-specific knowledge embedded in the script rather than in a provider abstraction layer.
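The failure mode is easy to reproduce in miniature. Below is a minimal sketch (not part of any tool mentioned here, and the helper name is hypothetical) of the gap between an fstab entry existing and the mount actually being enforced:

```python
# Illustrative check: an fstab entry referencing a device node that doesn't
# exist on this cloud is silently skipped at boot, so "entry present in
# fstab" is not the same as "mount options enforced".
def fstab_entries_with_missing_devices(fstab_text, existing_devices):
    """Return (device, mountpoint) pairs whose device node is absent."""
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        device, mountpoint = line.split()[:2]
        if device.startswith("/dev/") and device not in existing_devices:
            missing.append((device, mountpoint))
    return missing

# The fstab shipped in the AWS-derived image (devices illustrative)...
fstab = """
UUID=root-uuid  /     ext4  defaults                      0 1
/dev/xvdb       /tmp  ext4  defaults,noexec,nosuid,nodev  0 2
"""
# ...booted on a GCP instance, where the data disk is /dev/sdb, not /dev/xvdb
gcp_devices = {"/dev/sda1", "/dev/sdb"}
print(fstab_entries_with_missing_devices(fstab, gcp_devices))
# [('/dev/xvdb', '/tmp')]
```

A scanner that only confirms the fstab line is present reports the control as passing; only checking the live mount table catches the miss.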


How Stratum Separates Compliance Intent from Provider Details

Multi-cloud OS hardening works when the compliance intent and the provider details are kept strictly separate.

HardeningBlueprint YAML
(compliance intent — portable)
         │
         ▼
  Stratum Provider Layer
  ┌─────────────────────────────────────────────┐
  │  AWS         │  GCP         │  Azure        │
  │  /dev/xvd*   │  /dev/sda*   │  /dev/sda*    │
  │  IMDS v2     │  GCP IMDS    │  Azure IMDS   │
  │  cloud-init  │  cloud-init  │  waagent      │
  │  order A     │  order B     │  order C      │
  └─────────────────────────────────────────────┘
         │
         ▼
  Ansible-Lockdown + Provider-Aware Configuration
         │
         ▼
  OpenSCAP Scan
         │
         ▼
  Golden Image (AMI / GCP Image / Azure Image)

The blueprint author declares what should be true about the OS. Stratum’s provider layer handles how that’s achieved on each cloud.

The disk naming, cloud-init sequencing, metadata endpoint configuration, and provider-specific package repositories are all abstracted into the provider layer. They never appear in the blueprint file.
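For reference, here is a sketch of what the portable intent might look like in the blueprint. The field names are illustrative, not Stratum's exact schema; the point is what the file does not contain:

```yaml
# Illustrative blueprint sketch — field names abbreviated, not the exact schema.
# Note what is absent: no device paths, no metadata IPs, no cloud-init ordering.
name: ubuntu22-cis-l1
os: ubuntu-22.04
compliance:
  benchmark: cis-l1
filesystem:
  tmp:
    separate_partition: true
    options: [noexec, nosuid, nodev]
  home:
    separate_partition: true
    options: [nodev]
```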


The Same Blueprint Across Six Providers

# Build the same baseline on three clouds
stratum build --blueprint ubuntu22-cis-l1.yaml --provider aws
stratum build --blueprint ubuntu22-cis-l1.yaml --provider gcp
stratum build --blueprint ubuntu22-cis-l1.yaml --provider azure

# The other three supported providers
stratum build --blueprint ubuntu22-cis-l1.yaml --provider digitalocean
stratum build --blueprint ubuntu22-cis-l1.yaml --provider linode
stratum build --blueprint ubuntu22-cis-l1.yaml --provider proxmox

The blueprint file is identical across all six. The output — AMI, GCP machine image, Azure managed image — is equivalent in terms of security posture. The same 144 CIS L1 controls apply. The same OpenSCAP scan runs. The same grade lands in the image metadata.

If you change the blueprint — add a control, update the Ansible role version, add a custom audit logging configuration — you rebuild all providers from the same source and all images come out consistent.


What the Provider Layer Handles

The provider layer is where the cloud-specific knowledge lives, so the blueprint author doesn’t have to carry it:

Disk naming:

Provider       OS disk      Ephemeral              Data
AWS            /dev/xvda    /dev/xvdb              /dev/xvdc+
GCP            /dev/sda     —                      /dev/sdb+
Azure          /dev/sda     /dev/sdb (temp disk)   /dev/sdc+
DigitalOcean   /dev/vda     —                      /dev/vdb+

The CIS controls for separate /tmp and /home partitions reference disk paths that differ across these providers. The provider layer translates the blueprint’s filesystem.tmp declaration into the correct fstab entries for the target cloud.
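The translation can be pictured as a lookup plus a render step. A minimal sketch, assuming a simple device map (the values mirror the table above; the map and function names are hypothetical, not Stratum internals):

```python
# Illustrative sketch of the provider layer's disk translation.
# Device map values mirror the disk naming table above.
DATA_DISK = {
    "aws": "/dev/xvdb",
    "gcp": "/dev/sdb",
    "azure": "/dev/sdc",          # /dev/sdb is the Azure temp disk
    "digitalocean": "/dev/vdb",
}

def fstab_entry(provider, mountpoint, options):
    """Render one blueprint filesystem declaration for a target provider."""
    device = DATA_DISK[provider]
    return f"{device}  {mountpoint}  ext4  defaults,{','.join(options)}  0 2"

# The same blueprint declaration, rendered per cloud:
opts = ["noexec", "nosuid", "nodev"]
print(fstab_entry("aws", "/tmp", opts))
# /dev/xvdb  /tmp  ext4  defaults,noexec,nosuid,nodev  0 2
print(fstab_entry("gcp", "/tmp", opts))
# /dev/sdb  /tmp  ext4  defaults,noexec,nosuid,nodev  0 2
```

The blueprint author only ever writes the options; the device name is chosen at build time.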

Cloud-init ordering:

Different providers initialize services in different orders. On AWS, the network is available before cloud-init runs most tasks. On GCP, some network configuration happens after cloud-init starts. On Azure, the Azure Linux Agent (waagent) handles provisioning steps that cloud-init performs on other clouds.

The provider layer sequences the hardening steps to run in the correct order for each provider — specifically, it waits for network availability before applying network-level hardening, and ensures the package manager is configured before running Ansible roles that require package installation.

Metadata endpoint configuration:

CIS controls include restrictions on access to the instance metadata service (IMDSv2 enforcement on AWS, equivalent controls on GCP/Azure). The provider layer applies the correct restriction for each cloud — the blueprint just declares compliance: benchmark: cis-l1.


Building for All Providers Simultaneously

For fleet standardization, you can build all providers in a single operation:

# Build for all providers in parallel
stratum build \
  --blueprint ubuntu22-cis-l1.yaml \
  --provider aws,gcp,azure

# Output:
# [aws]   Launching build instance in ap-south-1...
# [gcp]   Launching build instance in asia-south1...
# [azure] Launching build instance in southindia...
# ...
# [aws]   Grade: A (98/100) — ami-0a7f3c9e82d1b4c05
# [gcp]   Grade: A (98/100) — projects/my-project/global/images/ubuntu22-cis-l1-20260419
# [azure] Grade: A (98/100) — /subscriptions/.../images/ubuntu22-cis-l1-20260419

All three builds run in parallel. All three images carry identical compliance grades. The image names embed the date and grade for easy identification.


Blueprint Versioning and Drift Detection

Version-controlling the blueprint file solves a problem that multi-cloud environments hit consistently: knowing what your OS security posture was six months ago.

# Check the current state of a fleet instance against the blueprint
stratum scan --instance i-0abc123 --blueprint ubuntu22-cis-l1.yaml

# Compare against original build grade
# Output:
# Instance: i-0abc123 (aws, ap-south-1)
# Original grade (build): A (98/100) — 2026-01-15
# Current grade (scan):   B (89/100) — 2026-04-19
# 
# Drifted controls (9):
#   3.3.2  — TCP SYN cookies: FAIL (sysctl net.ipv4.tcp_syncookies=0)
#   5.3.2  — sudo log_input: FAIL (removed from /etc/sudoers.d/)
#   ...

Drift detection compares the current instance state against the blueprint that built it. Controls that passed at build time and now fail indicate configuration drift — something changed after the image was deployed. This is how you find the three instances that a sysadmin “temporarily” modified and never reverted.
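Conceptually, the drift check is a set difference over per-control results. A minimal sketch, using control IDs from the output above (the function is illustrative, not Stratum's implementation):

```python
# Illustrative drift comparison: controls that passed at build time
# but fail in the current scan indicate post-deployment changes.
def drifted_controls(build_results, scan_results):
    """Return controls that passed at build time but fail now."""
    return sorted(
        control for control, passed in build_results.items()
        if passed and not scan_results.get(control, False)
    )

build = {"3.3.2": True, "5.3.2": True, "1.1.2": True}
scan  = {"3.3.2": False, "5.3.2": False, "1.1.2": True}
print(drifted_controls(build, scan))
# ['3.3.2', '5.3.2']
```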


Production Gotchas

Provider-specific CIS controls exist. CIS AWS Foundations Benchmark and CIS GCP Benchmark include cloud-specific controls (VPC flow logs, CloudTrail, etc.) that are separate from the OS-level CIS controls. The blueprint handles OS-level controls. Cloud-level controls (IAM, logging, network configuration) belong in your cloud security posture management tooling.

Build costs vary by provider. On AWS, the build instance is a t3.medium for 15–20 minutes (~$0.02). On GCP and Azure, comparably sized build instances cost roughly the same. For multi-provider builds, run them in regions close to your primary workloads to minimize image transfer time.

Proxmox builds require a local Stratum agent. Unlike cloud providers, Proxmox doesn’t have an API that Stratum can reach from outside. The Proxmox provider requires the Stratum agent running on the Proxmox host. The build process and blueprint format are identical; only the network topology differs.

GCP image sharing across projects requires explicit IAM. GCP machine images aren’t automatically available to other projects in the organization. After building, run stratum image share --provider gcp --image ubuntu22-cis-l1-20260419 --projects with the target projects, or configure sharing at the organization level.


Key Takeaways

  • Multi-cloud OS hardening with separate scripts per provider creates inevitable drift; a provider-abstracted blueprint eliminates it
  • The same HardeningBlueprint YAML builds on AWS, GCP, Azure, DigitalOcean, Linode, and Proxmox — the compliance intent is in the file, the provider details are in Stratum’s provider layer
  • Parallel multi-provider builds produce images with identical compliance grades on the same schedule
  • Drift detection works fleet-wide: any instance on any provider can be rescanned against the blueprint that built it
  • Blueprint version control is the single source of truth for OS security posture history — what was true on any given date, across any provider

What’s Next

One blueprint, six clouds, identical compliance grades. EP03 showed that the multi-cloud drift problem disappears when provider details are abstracted away from the blueprint.

What neither EP02 nor EP03 answered is the auditor’s question: how do you know the image is actually compliant? “We ran CIS L1” is not an answer. “Grade A, 98/100 controls, SARIF export attached” is.

EP04 covers automated OpenSCAP compliance: the post-build scan in detail — how the A-F grade is calculated, what controls block an A grade, how SARIF exports work, and how drift detection catches what changed after deployment.

Next: automated OpenSCAP compliance — CIS benchmark grading before deployment

Get EP04 in your inbox when it publishes → linuxcent.com/subscribe

Azure RBAC Explained: Management Groups, Subscriptions, and Scope

Reading Time: 11 minutes

Meta Description: Understand Azure RBAC scopes across management groups, subscriptions, and resources — assign roles at the right level without over-provisioning access.
What Is Cloud IAM · Authentication vs Authorization · IAM Roles vs Policies · AWS IAM Deep Dive · GCP Resource Hierarchy IAM · Azure RBAC Scopes


TL;DR

  • Entra ID and Azure RBAC are two separate authorization planes — Entra ID roles control the identity system; RBAC roles control Azure resources. Global Administrator doesn’t grant VM access.
  • Azure RBAC role assignments inherit downward through the hierarchy: Management Group → Subscription → Resource Group → Resource
  • Use managed identities for all Azure-hosted workloads — system-assigned for one-to-one resource binding, user-assigned for shared access across multiple resources
  • Contributor is the right role for most service identities — full resource management without the ability to modify RBAC assignments
  • The Actions vs DataActions split means you can audit management access and data access independently — an incomplete audit checks only one
  • PIM (Privileged Identity Management) should govern all Entra ID privileged roles — nobody should permanently hold Global Admin or Subscription Owner

The Big Picture

         Azure: Two Separate Authorization Planes
─────────────────────────────────────────────────────────
  Entra ID (Identity Plane)      Azure RBAC (Resource Plane)
  ─────────────────────────      ───────────────────────────
  Controls:                      Controls:
  · Users, groups, apps          · Azure resources
  · Tenant settings              · Management groups
  · App registrations            · Subscriptions
  · Conditional access           · Resource groups
                                 · Individual resources

  Roles (examples):              Scope hierarchy:
  · Global Administrator         Management Group
  · User Administrator             └─ Subscription
  · Security Reader                     └─ Resource Group
  · Application Administrator                └─ Resource

  Scope: tenant-wide             Role assignment at any level
                                 inherits down to all nodes below

  Both planes use Entra ID identities.
  Authorization in each plane is completely independent.
  Global Admin ≠ Subscription Owner.

Azure RBAC scopes determine how far a role assignment reaches — and the blast radius of a misconfiguration scales directly with how high in the hierarchy it sits.


Introduction

Azure RBAC scopes define where a role assignment applies and everything it inherits. A role at the Management Group level touches every subscription, every resource group, and every resource across your entire Azure estate. A role at the resource level touches only that resource. Understanding scope before making any assignment is the difference between “access for this storage account” and “access for your entire org.”

When I first worked seriously in Azure environments, I had a mental model carried over from Active Directory administration. Users, groups, directory roles — I knew how that worked. I assumed Azure’s IAM would be an extension of the same system, just with cloud resources bolted on.

That assumption got me into trouble within the first week.

I was trying to understand why an engineer had Global Administrator access in Entra ID but couldn’t see the resources in a Subscription. In Active Directory terms, if you’re a Domain Admin, you can see everything. In Azure, it doesn’t work that way.

Entra ID roles and Azure RBAC roles are two different systems. Global Administrator is an Entra ID role — it controls who can manage the identity plane: create users, manage app registrations, configure tenant settings. It has nothing to do with Azure resources like virtual machines, storage accounts, or Kubernetes clusters. Those are governed by Azure RBAC, which is an entirely separate authorization system.

I spent two hours trying to understand why a Global Admin couldn’t list VMs before someone explained this. I’m putting it at the top of this episode so you don’t lose those two hours.


Entra ID vs Azure RBAC — The Two Separate Planes

                     Entra ID                                      Azure RBAC
Controls access to   Entra ID itself — users, groups, apps,        Azure resources — VMs, storage,
                     tenant settings                               databases, subscriptions
Role types           Entra ID directory roles                      Azure resource roles
Example roles        Global Admin, User Admin, Security Reader     Owner, Contributor, Storage Blob Data Reader
Scope                Tenant-wide                                   Management group → Subscription →
                                                                   Resource Group → Resource
Managed via          Entra ID admin center                         Azure portal / ARM / Azure CLI

A user can be Global Administrator — the highest Entra ID role — and have zero access to Azure resources unless explicitly assigned an Azure RBAC role. And vice versa: a user with Subscription Owner (highest Azure RBAC role) has no ability to manage Entra ID user accounts without an Entra ID role assignment.

These are not the same system. They’re connected — both use Entra ID identities as principals — but authorization in each plane is independent.


The Azure Resource Hierarchy

Azure RBAC role assignments can be made at any level of the resource hierarchy, and they inherit downward:

Tenant (Entra ID)
  └── Management Group  (policy and RBAC inheritance across subscriptions)
        └── Management Group  (nested, up to 6 levels)
              └── Subscription  (billing and resource boundary)
                    └── Resource Group  (logical container for resources)
                          └── Resource  (VM, storage account, key vault, AKS cluster...)

A role assigned at the Subscription level applies to every resource group and resource in that subscription. A role at the Management Group level applies to every subscription beneath it.

The blast radius of a misconfiguration scales with how high in the hierarchy it sits. Subscription Owner at the subscription level is contained to that subscription. Management Group Contributor at the root management group touches your entire Azure estate.

# View management group hierarchy
az account management-group list --output table

# List subscriptions
az account list --output table

# View all role assignments at a scope — start here in any audit
az role assignment list \
  --scope /subscriptions/SUB_ID \
  --include-inherited \
  --output table
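Inheritance follows directly from the scope strings themselves: an assignment reaches a resource when the assignment scope is a prefix of the resource's fully qualified ID. A minimal sketch of that check (simplified; the function name is hypothetical):

```python
# Sketch of Azure RBAC scope inheritance: an assignment applies to any
# resource whose ID sits at or below the assignment scope.
def assignment_applies(assignment_scope, resource_id):
    a = assignment_scope.rstrip("/").lower()
    r = resource_id.rstrip("/").lower()
    # Segment-aligned prefix match: the scope itself, or anything beneath it
    return r == a or r.startswith(a + "/")

sub = "/subscriptions/SUB_ID"
rg  = sub + "/resourceGroups/rg-prod"
vm  = rg + "/providers/Microsoft.Compute/virtualMachines/my-vm"

print(assignment_applies(sub, vm))   # True — subscription-level role reaches the VM
print(assignment_applies(rg, sub))   # False — inheritance only flows downward
```

This is why blast radius scales with assignment height: the shorter the scope string, the more resource IDs it prefixes.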

Principal Types in Azure RBAC

Type                What It Is                                            Best For
User                Entra ID user account                                 Human access
Group               Entra ID security group                               Team-based access
Service Principal   App registration with credentials (secret or cert)   External systems, apps with their own identity
Managed Identity    Credential-less identity for Azure-hosted workloads   Everything running in Azure

Managed Identities — The Right Model for Workloads

Managed identities are Azure’s answer to AWS instance profiles and GCP service accounts attached to compute. Azure manages the entire credential lifecycle — tokens are issued automatically, there’s nothing to create, rotate, or revoke manually.

System-assigned managed identity is tied to a specific Azure resource. When the resource is deleted, the identity is deleted. One-to-one, no sharing.

# Enable system-assigned managed identity on a VM
az vm identity assign \
  --name my-vm \
  --resource-group rg-prod

# Get the principal ID (needed to assign RBAC roles to it)
az vm show \
  --name my-vm \
  --resource-group rg-prod \
  --query identity.principalId \
  --output tsv

User-assigned managed identity is a standalone resource that can be attached to multiple Azure resources and persists independently. This is the right model when multiple services need the same access — instead of assigning the same RBAC roles to ten separate system-assigned identities, you create one user-assigned identity, grant it the roles, and attach it to all ten resources.

# Create a user-assigned managed identity
az identity create \
  --name app-backend-identity \
  --resource-group rg-identities

# Get its identifiers
az identity show \
  --name app-backend-identity \
  --resource-group rg-identities \
  --query '{principalId:principalId, clientId:clientId}'

# Attach to a VM
az vm identity assign \
  --name my-vm \
  --resource-group rg-prod \
  --identities /subscriptions/SUB/resourceGroups/rg-identities/providers/Microsoft.ManagedIdentity/userAssignedIdentities/app-backend-identity

Code running inside an Azure VM or App Service with a managed identity gets tokens via IMDS, with no credential management required:

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# DefaultAzureCredential automatically picks up the managed identity in Azure
credential = DefaultAzureCredential()
client = BlobServiceClient(
    account_url="https://myaccount.blob.core.windows.net",
    credential=credential
)

The DefaultAzureCredential chain tries environment variables first, then workload identity, then managed identity, then developer tool credentials (Visual Studio / VS Code, Azure CLI). In Azure-hosted services, the managed identity path succeeds automatically. In local development, it falls through to the developer’s az login session.


Azure Role Definitions — Understanding Actions vs DataActions

A role definition specifies what actions it grants. Azure distinguishes two planes:

  • Actions: Control plane — managing the resource itself (create, delete, configure)
  • DataActions: Data plane — accessing data within the resource (read blob contents, get secrets)
  • NotActions / NotDataActions: Exceptions carved out from the grant

{
  "Name": "Storage Blob Data Reader",
  "IsCustom": false,
  "Actions": [
    "Microsoft.Storage/storageAccounts/blobServices/containers/read",
    "Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey/action"
  ],
  "NotActions": [],
  "DataActions": [
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
  ],
  "NotDataActions": [],
  "AssignableScopes": ["/"]
}

The control/data plane split matters in audits. An identity with Microsoft.Storage/storageAccounts/read (an Action) can see the storage account exists and view its properties. To actually read blob contents, it needs the DataAction Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read. These are separate grants. An access audit that checks only Actions and ignores DataActions gives an incomplete picture.
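The evaluation can be sketched as two independent checks, one per plane. This is a simplification (Azure's real evaluation also handles deny assignments and its own wildcard semantics); here fnmatch stands in for the wildcard matching and the function name is hypothetical:

```python
# Sketch of control-plane vs data-plane evaluation. An operation must be
# checked against the list for its own plane: Actions/NotActions for
# management operations, DataActions/NotDataActions for data access.
from fnmatch import fnmatch

def permits(operation, actions, not_actions=()):
    allowed = any(fnmatch(operation, p) for p in actions)
    denied  = any(fnmatch(operation, p) for p in not_actions)
    return allowed and not denied

role_actions      = ["Microsoft.Storage/storageAccounts/read"]
role_data_actions = []   # no data-plane grant in this role

# Control plane: the identity can see the account exists...
print(permits("Microsoft.Storage/storageAccounts/read", role_actions))   # True
# Data plane: ...but cannot read blob contents — separate grant, separate list
print(permits(
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
    role_data_actions))                                                  # False
```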

Built-in Roles Worth Understanding

Role                            Scope                What It Grants
Owner                           Any                  Full access + can manage RBAC assignments — the highest trust role
Contributor                     Any                  Full resource management, but cannot manage RBAC
Reader                          Any                  Read-only on all resources
User Access Administrator       Any                  Manage RBAC assignments plus read-only resource access
Storage Blob Data Contributor   Storage              Read/write/delete blob data
Storage Blob Data Reader        Storage              Read blob data only
Key Vault Secrets Officer       Key Vault            Manage secrets, not keys or certificates
AcrPush / AcrPull               Container Registry   Push or pull images

The gap between Owner and Contributor is important: Contributor can do everything to a resource except manage who has access to it. This is the right role for most service identities and automation — they need to manage resources, not manage permissions. If a compromised Contributor identity can’t modify RBAC assignments, it can’t grant itself or an attacker additional access.

Owner should be granted to people, not service identities, and only at the narrowest scope necessary.

Custom Roles

cat > custom-app-storage.json << 'EOF'
{
  "Name": "App Storage Blob Reader",
  "IsCustom": true,
  "Description": "Read app blobs only — no container management, no key operations",
  "Actions": [
    "Microsoft.Storage/storageAccounts/blobServices/containers/read"
  ],
  "DataActions": [
    "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
  ],
  "NotActions": [],
  "NotDataActions": [],
  "AssignableScopes": ["/subscriptions/SUB_ID"]
}
EOF

az role definition create --role-definition custom-app-storage.json

# Assign it — specifically to this storage account
az role assignment create \
  --assignee-object-id "$(az identity show --name app-backend-identity -g rg-identities --query principalId -o tsv)" \
  --assignee-principal-type ServicePrincipal \
  --role "App Storage Blob Reader" \
  --scope /subscriptions/SUB_ID/resourceGroups/rg-prod/providers/Microsoft.Storage/storageAccounts/appstore

Role Assignments — Where Access Is Actually Granted

The assignment brings everything together: principal + role + scope. This is the actual grant.

# Assign to a user (less common — prefer group assignments)
az role assignment create \
  --assignee [email protected] \
  --role "Storage Blob Data Reader" \
  --scope /subscriptions/SUB_ID/resourceGroups/rg-prod/providers/Microsoft.Storage/storageAccounts/prodstore

# Assign to a group (better — one assignment, maintained via group membership)
GROUP_ID=$(az ad group show --group "Backend-Team" --query id -o tsv)
az role assignment create \
  --assignee-object-id "$GROUP_ID" \
  --assignee-principal-type Group \
  --role "Contributor" \
  --scope /subscriptions/SUB_ID/resourceGroups/rg-dev

# Assign to a managed identity
MI_PRINCIPAL=$(az identity show --name app-backend-identity --resource-group rg-identities --query principalId -o tsv)
az role assignment create \
  --assignee-object-id "$MI_PRINCIPAL" \
  --assignee-principal-type ServicePrincipal \
  --role "Storage Blob Data Contributor" \
  --scope /subscriptions/SUB_ID/resourceGroups/rg-prod/providers/Microsoft.Storage/storageAccounts/appstore

# Audit all assignments at and below a scope (including inherited)
az role assignment list \
  --scope /subscriptions/SUB_ID/resourceGroups/rg-prod \
  --include-inherited \
  --output table

Group-based assignments are the right model for humans at scale. When an engineer joins the Backend team, they join the Entra ID group. Their access follows. When they leave, you remove them from the group or disable their account. You never need to hunt down individual role assignments.


Entra ID Roles — The Other Layer

Entra ID roles control the identity infrastructure itself. These are distinct from Azure RBAC roles and deserve separate treatment:

Role                            What It Controls
Global Administrator            Everything in the tenant — highest privilege
Privileged Role Administrator   Assign and remove Entra ID roles
User Administrator              Create and manage users and groups
Application Administrator       Register and manage app registrations
Security Administrator          Manage security features and read reports
Security Reader                 Read-only on security features

Global Administrator in Entra ID is one of the most powerful identities in a Microsoft environment. It can modify any user, any app registration, any conditional access policy. Combined with the fact that Entra ID is also the identity provider for Microsoft 365, a Global Admin compromise can extend far beyond Azure resources into email, Teams, SharePoint — the entire Microsoft 365 estate.

Nobody should hold Global Administrator as a permanent assignment. This is where Privileged Identity Management (PIM) matters.

Privileged Identity Management — Just-in-Time Elevated Access

PIM is Azure’s answer to the problem of permanent privileged role assignments. Instead of permanently holding Global Admin or Subscription Owner, users are made eligible for these roles. When they need elevated access, they activate it with a justification (and optionally an approval and MFA requirement). The access is time-limited — typically 8 hours — and automatically expires.

# List roles where the user is eligible (not permanently assigned)
az rest --method GET \
  --uri "https://graph.microsoft.com/v1.0/roleManagement/directory/roleEligibilitySchedules" \
  --query "value[?principalId=='USER_OBJECT_ID']"

# A user activates an eligible role (calls this themselves when needed)
az rest --method POST \
  --uri "https://graph.microsoft.com/v1.0/roleManagement/directory/roleAssignmentScheduleRequests" \
  --body '{
    "action": "selfActivate",
    "principalId": "USER_OBJECT_ID",
    "roleDefinitionId": "ROLE_DEF_ID",
    "directoryScopeId": "/",
    "justification": "Investigating security alert in tenant audit logs",
    "scheduleInfo": {
      "startDateTime": "2026-04-16T00:00:00Z",
      "expiration": { "type": "AfterDuration", "duration": "PT8H" }
    }
  }'

PIM is the right model for any role that could be used to escalate privileges: Global Administrator, Subscription Owner, Privileged Role Administrator, User Access Administrator. Nobody should have these permanently assigned unless there’s a strong operational reason — and even then, the assignment should be reviewed quarterly.

In one Azure environment I audited, I found 11 permanent Global Administrator assignments. The team thought this was normal because they’d all been made admins when the tenant was set up two years earlier and nobody had revisited it. Of the 11, three were former employees whose Entra ID accounts had been disabled — but the Global Admin role assignment was still there. Disabled users can’t use their accounts, but this is not a pattern you want to rely on.


Federated Identity for External Workloads

For GitHub Actions, Kubernetes workloads, and other external systems that need to call Azure APIs, federated credentials eliminate service principal secrets:

# Create an app registration
APP_ID=$(az ad app create --display-name "github-actions-deploy" --query appId -o tsv)
SP_ID=$(az ad sp create --id "$APP_ID" --query id -o tsv)

# Add a federated credential for a specific GitHub repo and branch
az ad app federated-credential create \
  --id "$APP_ID" \
  --parameters '{
    "name": "github-main-branch",
    "issuer": "https://token.actions.githubusercontent.com",
    "subject": "repo:my-org/my-repo:ref:refs/heads/main",
    "audiences": ["api://AzureADTokenExchange"]
  }'

# Grant the service principal an RBAC role
az role assignment create \
  --assignee-object-id "$SP_ID" \
  --role "Contributor" \
  --scope /subscriptions/SUB_ID/resourceGroups/rg-prod

GitHub Actions — no secrets stored in GitHub:

jobs:
  deploy:
    permissions:
      id-token: write   # required for OIDC token request
    steps:
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - run: az storage blob upload --account-name prodstore ...

The client-id, tenant-id, and subscription-id values are not secrets — they’re identifiers. The actual authentication is the OIDC JWT from GitHub, verified against GitHub’s public keys, subject-matched against the configured condition (repo:my-org/my-repo:ref:refs/heads/main). If the repo or branch doesn’t match, the token exchange fails. If it matches, a short-lived Azure token is issued.
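The trust check on Azure's side reduces to exact claim comparison against the configured federated credential. A minimal sketch with hypothetical claim dictionaries (real validation also verifies the JWT signature and expiry first):

```python
# Sketch of the federated credential check: after verifying the token's
# signature, Azure compares issuer/subject/audience claims against the
# configured federated credential — an exact match, not a pattern.
def token_exchange_allowed(claims, fc):
    return (claims["iss"] == fc["issuer"]
            and claims["sub"] == fc["subject"]
            and fc["audiences"][0] in claims.get("aud", []))

fc = {
    "issuer": "https://token.actions.githubusercontent.com",
    "subject": "repo:my-org/my-repo:ref:refs/heads/main",
    "audiences": ["api://AzureADTokenExchange"],
}
main_branch = {"iss": fc["issuer"],
               "sub": "repo:my-org/my-repo:ref:refs/heads/main",
               "aud": ["api://AzureADTokenExchange"]}
feature = {**main_branch, "sub": "repo:my-org/my-repo:ref:refs/heads/feature-x"}

print(token_exchange_allowed(main_branch, fc))  # True
print(token_exchange_allowed(feature, fc))      # False — branch doesn't match
```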


⚠ Production Gotchas

Global Admin ≠ Azure resource access
This trips up every team migrating from on-prem AD. Entra ID roles and Azure RBAC roles are independent systems. A Global Admin with no RBAC assignments cannot list VMs. Don’t assume directory privilege translates to resource access.

Permanent Global Admin assignments are a standing breach risk
In the environment I audited: 11 permanent Global Admins, three of them disabled accounts. Disabled accounts can’t authenticate, but relying on that is not a security control. PIM eligible assignments + regular access reviews is the right answer.

Owner on service identities lets compromised workloads modify RBAC
If a managed identity or service principal holds Owner, a compromised workload can grant additional permissions to itself or an attacker. Use Contributor for workloads — full resource management, no RBAC modification.

Checking only Actions misses data-plane access
An audit that enumerates role Actions and ignores DataActions will miss identities with read access to blob contents, Key Vault secrets, or database records. Both planes need to be in scope.

System-assigned identity is deleted with the resource
If you delete and recreate a VM using a system-assigned identity, the new identity is different. Any RBAC assignments made to the old identity are gone. User-assigned identities persist independently — use them for workloads where the resource lifecycle is separate from the identity lifecycle.


Quick Reference

# Audit all role assignments at a subscription (including inherited)
az role assignment list \
  --scope /subscriptions/SUB_ID \
  --include-inherited \
  --output table

# Find all Owner assignments at subscription scope
az role assignment list \
  --scope /subscriptions/SUB_ID \
  --role Owner \
  --output table

# Get principal ID of a VM's managed identity
az vm show \
  --name my-vm \
  --resource-group rg-prod \
  --query identity.principalId \
  --output tsv

# View role definition — check Actions AND DataActions
az role definition list --name "Storage Blob Data Reader" --output json \
  | jq '.[0] | {Actions: .permissions[0].actions, DataActions: .permissions[0].dataActions}'

# List management group hierarchy
az account management-group list --output table

# Create user-assigned managed identity
az identity create --name app-identity --resource-group rg-identities

# Assign role to managed identity at resource scope
az role assignment create \
  --assignee-object-id "$(az identity show -n app-identity -g rg-identities --query principalId -o tsv)" \
  --assignee-principal-type ServicePrincipal \
  --role "Storage Blob Data Contributor" \
  --scope /subscriptions/SUB_ID/resourceGroups/rg-prod/providers/Microsoft.Storage/storageAccounts/mystore

# Check PIM eligible roles for a user
az rest --method GET \
  --uri "https://graph.microsoft.com/v1.0/roleManagement/directory/roleEligibilitySchedules" \
  --query "value[?principalId=='USER_OBJECT_ID'].{role:roleDefinitionId,scope:directoryScopeId}"

Framework Alignment

Framework        Reference                                    What It Covers Here
CISSP            Domain 5 — Identity and Access Management    Azure’s directory-centric model; managed identities and PIM are the primary IAM constructs
CISSP            Domain 3 — Security Architecture             Entra ID spans Azure, M365, and third-party SaaS — scope boundaries determine the blast radius of a compromise
ISO 27001:2022   5.15 Access control                          Azure RBAC role definitions and assignments implement access control policy
ISO 27001:2022   5.16 Identity management                     Entra ID is the identity management platform — user lifecycle, group management, application registrations
ISO 27001:2022   8.2 Privileged access rights                 PIM (Privileged Identity Management) directly implements JIT controls for privileged roles
ISO 27001:2022   5.18 Access rights                           Role assignment scoping, managed identity provisioning, federated credential lifecycle
SOC 2            CC6.1                                        Managed identities and RBAC are the primary technical controls for CC6.1 in Azure-hosted environments
SOC 2            CC6.3                                        PIM activation expiry and access reviews directly satisfy time-bound access removal requirements

Key Takeaways

  • Entra ID and Azure RBAC are separate authorization planes — Entra ID roles control the identity system; RBAC roles control Azure resources. Global Administrator doesn’t grant VM access.
  • Use managed identities for all Azure-hosted workloads — system-assigned for one-to-one, user-assigned for shared identities across multiple resources
  • Contributor is the right role for most service identities — full resource management without RBAC modification ability
  • The control/data plane split (Actions vs DataActions) in role definitions means you can grant management access without data access or vice versa — use this
  • PIM should govern all Entra ID privileged roles and high-scope Azure roles — nobody should permanently hold Global Admin or Subscription Owner
  • Federated identity credentials replace service principal secrets for external workloads — no secrets stored in CI/CD systems

What’s Next

EP07 goes cross-cloud: workload identity federation — the shift away from static credentials entirely, with IRSA for EKS, GKE Workload Identity, AKS workload identity, and GitHub Actions-to-all-three-clouds patterns.

Next: OIDC Workload Identity — Eliminate Cloud Access Keys Entirely.

Get EP07 in your inbox when it publishes → subscribe