
Enterprise-Scale Azure RBAC: Terraform Structure Design

When I scale RBAC across many subscriptions, the repo design matters more than any single Terraform trick. I use a structure that keeps state small, blast radius bounded, and PRs readable without external lookup tables.

Folder structure I use

rbac/
├── _modules/
│   ├── mg-role-assignment/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── subscription-role-assignment/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
│
├── management-groups/
│   ├── backend.tf
│   ├── providers.tf
│   ├── main.tf
│   ├── locals.tf
│   ├── variables.tf
│   ├── role_assignments.yaml
│   └── role_definitions.yaml
│
├── platform/
│   ├── connectivity/
│   │   ├── backend.tf
│   │   ├── providers.tf
│   │   ├── main.tf
│   │   ├── locals.tf
│   │   ├── variables.tf
│   │   ├── subscriptions.yaml
│   │   ├── role_assignments.yaml
│   │   └── role_definitions.yaml
│   ├── identity/
│   │   └── (same layout)
│   └── management/
│       └── (same layout)
│
└── landingzones/
    ├── corp/
    │   ├── prod/
    │   │   ├── australiaeast/
    │   │   └── newzealandnorth/
    │   └── nonprod/
    │       ├── australiaeast/
    │       └── newzealandnorth/
    ├── online/
    │   ├── prod/
    │   └── nonprod/
    └── sap/
        └── prod/

I call one archetype/environment/region unit a cell. I model each cell as one independent Terraform state.

File contents for one landing zone cell

I keep examples grounded in landingzones/corp/prod/australiaeast.

subscriptions.yaml

subscriptions:
  - id: "00000000-0000-0000-0000-000000000001"
    name: corp-prod-australiaeast-01
    primary: true

  - id: "00000000-0000-0000-0000-000000000002"
    name: corp-prod-australiaeast-02
    primary: false

role_assignments.yaml

role_assignments:
  - principal_id: "aad-group-id-corp-prod-contributors"
    principal_name: "sg-corp-prod-contributors"
    role_definition_name: "Contributor"
    scope_suffix: ""

  - principal_id: "aad-group-id-security-readers"
    principal_name: "sg-security-readers"
    role_definition_name: "Security Reader"
    scope_suffix: ""

  - principal_id: "sp-object-id-github-actions"
    principal_name: "sp-github-actions-corp-prod"
    role_definition_name: "Reader"
    scope_suffix: ""

  - principal_id: "aad-group-id-network-ops"
    principal_name: "sg-network-operations"
    role_definition_name: "Network Contributor"
    scope_suffix: "/resourceGroups/rg-networking"

I use readable placeholder prefixes in docs so the principal type is obvious at a glance. In real files I store the real object IDs.

providers.tf

I cap each cell at two subscription aliases. Terraform provider blocks cannot use count or for_each, so every alias has to be declared statically. This is the one place where I accept subscription IDs duplicated between subscriptions.yaml and the provider variables.

terraform {
  required_version = ">= 1.9"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.117"
    }
  }
}

provider "azurerm" {
  subscription_id = var.management_subscription_id
  features {}
}

provider "azurerm" {
  alias           = "sub_primary"
  subscription_id = var.primary_subscription_id
  features {}
}

provider "azurerm" {
  alias           = "sub_secondary"
  subscription_id = var.secondary_subscription_id
  features {}
}

variables.tf

variable "management_subscription_id" {
  description = "Subscription ID of the management subscription."
  type        = string
}

variable "primary_subscription_id" {
  description = "Primary subscription ID for this cell."
  type        = string
}

variable "secondary_subscription_id" {
  description = "Secondary subscription ID for this cell."
  type        = string
}

variable "principal_ids" {
  description = "Optional map of principal name to object ID injected by pipeline."
  type        = map(string)
  # An empty default keeps the variable optional; locals.tf then falls back to the IDs in YAML.
  default   = {}
  sensitive = true
}

locals.tf

locals {
  subscriptions_raw    = yamldecode(file("${path.module}/subscriptions.yaml"))
  role_assignments_raw = yamldecode(file("${path.module}/role_assignments.yaml"))

  subscriptions  = local.subscriptions_raw["subscriptions"]
  primary_sub_id = one([for s in local.subscriptions : s.id if s.primary])
  # one() already returns null for an empty list and fails on more than one match,
  # so wrapping it in try() would hide a cell with two secondaries.
  secondary_sub_id = one([for s in local.subscriptions : s.id if !s.primary])

  raw_assignments = local.role_assignments_raw["role_assignments"]

  # Key: role name (spaces replaced) plus the first 8 characters of the principal ID.
  assignment_map = {
    for a in local.raw_assignments :
    "${replace(a.role_definition_name, " ", "_")}__${substr(a.principal_id, 0, 8)}" => {
      principal_id         = lookup(var.principal_ids, a.principal_name, a.principal_id)
      role_definition_name = a.role_definition_name
      scope_suffix         = lookup(a, "scope_suffix", "")
    }
  }
}

I keep one primary: true subscription and at most one secondary subscription in each cell.
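
The one-primary, at-most-one-secondary rule can also be asserted in configuration so that a malformed subscriptions.yaml is called out with a readable message. The following is a minimal sketch, assuming Terraform 1.5 or later (check blocks). It uses inline sample data in place of subscriptions.yaml so it runs on its own; the names sample_doc, sample_subscriptions, primary_count, secondary_count, and cell_subscription_shape are hypothetical.

```hcl
# Inline stand-in for subscriptions.yaml so the sketch is self-contained.
locals {
  sample_doc = yamldecode(<<-YAML
    subscriptions:
      - id: "00000000-0000-0000-0000-000000000001"
        name: corp-prod-australiaeast-01
        primary: true
      - id: "00000000-0000-0000-0000-000000000002"
        name: corp-prod-australiaeast-02
        primary: false
  YAML
  )

  sample_subscriptions = local.sample_doc["subscriptions"]

  # Count subscriptions on each side of the primary flag.
  primary_count   = length([for s in local.sample_subscriptions : s if s.primary])
  secondary_count = length([for s in local.sample_subscriptions : s if !s.primary])
}

check "cell_subscription_shape" {
  assert {
    condition     = local.primary_count == 1
    error_message = "Expected exactly one subscription with primary: true, found ${local.primary_count}."
  }

  assert {
    condition     = local.secondary_count <= 1
    error_message = "Expected at most one secondary subscription, found ${local.secondary_count}."
  }
}
```

A failed check block is reported as a warning during plan and apply rather than stopping the run, so this complements the one() calls in locals.tf instead of replacing them.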

main.tf

module "corp_prod_australiaeast_primary" {
  source          = "../../../../_modules/subscription-role-assignment"
  subscription_id = local.primary_sub_id
  assignment_map  = local.assignment_map

  providers = {
    azurerm.subscription = azurerm.sub_primary
  }
}

module "corp_prod_australiaeast_secondary" {
  source          = "../../../../_modules/subscription-role-assignment"
  subscription_id = local.secondary_sub_id
  assignment_map  = local.assignment_map

  providers = {
    azurerm.subscription = azurerm.sub_secondary
  }
}

backend.tf

terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "stterraformstate"
    container_name       = "rbac"
    key                  = "landingzones/corp/prod/australiaeast/terraform.tfstate"
    use_oidc             = true
  }
}

_modules/subscription-role-assignment/main.tf

terraform {
  required_providers {
    azurerm = {
      source                = "hashicorp/azurerm"
      configuration_aliases = [azurerm.subscription]
    }
  }
}

variable "subscription_id" { type = string }
variable "assignment_map" {
  type = map(object({
    principal_id         = string
    role_definition_name = string
    scope_suffix         = string
  }))
}

resource "azurerm_role_assignment" "this" {
  for_each = var.assignment_map
  provider = azurerm.subscription

  name  = uuidv5("dns", "${var.subscription_id}/${each.key}")
  scope = "/subscriptions/${var.subscription_id}${each.value.scope_suffix}"

  principal_id         = each.value.principal_id
  role_definition_name = each.value.role_definition_name
}
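
The folder tree also lists _modules/mg-role-assignment/, whose contents are not shown here. As a rough sketch only, assuming it mirrors the subscription module with the scope swapped for a management group, it might look like the following. The variable name management_group_id and the trimmed assignment_map shape are my assumptions, not the actual module.

```hcl
# _modules/mg-role-assignment/main.tf (hypothetical contents)
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
  }
}

variable "management_group_id" {
  description = "Management group ID (the short name, not the full resource ID)."
  type        = string
}

variable "assignment_map" {
  type = map(object({
    principal_id         = string
    role_definition_name = string
  }))
}

resource "azurerm_role_assignment" "this" {
  for_each = var.assignment_map

  # Same deterministic naming idea as the subscription module, seeded with the group ID.
  name  = uuidv5("dns", "${var.management_group_id}/${each.key}")
  scope = "/providers/Microsoft.Management/managementGroups/${var.management_group_id}"

  principal_id         = each.value.principal_id
  role_definition_name = each.value.role_definition_name
}
```

Management group assignments are tenant-level, so this sketch uses the default provider configuration and needs no configuration_aliases entry.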

Why I avoid subscription GUID folder names

I have inherited GUID-folder repos before. GUID folder names are unique, but they hide intent.

landingzones/corp/prod/australiaeast tells me context immediately. landingzones/00000000-0000-0000-0000-000000000001 tells me nothing until I open extra files.

The archetype/environment/region shape gives me:

  • clear operational targeting
  • cleaner PR review context
  • easier onboarding
  • cleaner subscription retirement without folder churn
  • alignment with Azure Landing Zones concepts

GitHub Actions workflow I use

I split workflow behavior into three goals.

  1. Detect changed Terraform states for PR plans
  2. Run plan only for changed states in parallel
  3. Apply in dependency order on merge to main

PR plan workflow pattern

name: PR - plan changed states

on:
  pull_request:
    paths:
      - 'rbac/**'

# azure/login with OIDC needs permission to request an ID token.
permissions:
  id-token: write
  contents: read

jobs:
  detect:
    runs-on: ubuntu-latest
    outputs:
      state_dirs: ${{ steps.find.outputs.dirs }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Find changed state directories
        id: find
        run: |
          changed=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- rbac/)
          # Keep only directories that are state roots (they contain backend.tf).
          dirs=$(echo "$changed" \
            | xargs -I{} dirname {} \
            | sort -u \
            | while IFS= read -r d; do
                if [ -f "$d/backend.tf" ]; then echo "$d"; fi
              done \
            | jq -R -s -c 'split("\n") | map(select(length > 0))')
          echo "dirs=$dirs" >> "$GITHUB_OUTPUT"

  plan:
    needs: detect
    if: needs.detect.outputs.state_dirs != '[]'
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        state_dir: ${{ fromJson(needs.detect.outputs.state_dirs) }}
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_MANAGEMENT_SUBSCRIPTION_ID }}
      - uses: hashicorp/setup-terraform@v3
      - name: Terraform init
        working-directory: ${{ matrix.state_dir }}
        run: terraform init -backend-config="use_oidc=true" -input=false
      - name: Terraform plan
        working-directory: ${{ matrix.state_dir }}
        env:
          # Terraform maps TF_VAR_<name> to variable "<name>"
          TF_VAR_principal_ids: ${{ secrets.PRINCIPAL_IDS_JSON }}
          TF_VAR_management_subscription_id: ${{ secrets.MANAGEMENT_SUBSCRIPTION_ID }}
        run: terraform plan -input=false -no-color

Merge apply workflow pattern

name: Apply - ordered by dependency layer

on:
  push:
    branches: [main]
    paths:
      - 'rbac/**'

# azure/login with OIDC needs permission to request an ID token.
permissions:
  id-token: write
  contents: read

jobs:
  apply-management-groups:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_MANAGEMENT_SUBSCRIPTION_ID }}
      - uses: hashicorp/setup-terraform@v3
      - name: Init and apply
        working-directory: rbac/management-groups
        env:
          TF_VAR_principal_ids: ${{ secrets.PRINCIPAL_IDS_JSON }}
          TF_VAR_management_subscription_id: ${{ secrets.MANAGEMENT_SUBSCRIPTION_ID }}
        run: |
          terraform init -input=false
          terraform apply -auto-approve -input=false

  apply-platform:
    needs: apply-management-groups
    runs-on: ubuntu-latest
    strategy:
      # Do not cancel sibling applies mid-run when one state fails.
      fail-fast: false
      matrix:
        state_dir:
          - rbac/platform/connectivity
          - rbac/platform/identity
          - rbac/platform/management
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_MANAGEMENT_SUBSCRIPTION_ID }}
      - uses: hashicorp/setup-terraform@v3
      - name: Init and apply
        working-directory: ${{ matrix.state_dir }}
        env:
          TF_VAR_principal_ids: ${{ secrets.PRINCIPAL_IDS_JSON }}
          TF_VAR_management_subscription_id: ${{ secrets.MANAGEMENT_SUBSCRIPTION_ID }}
        run: |
          terraform init -input=false
          terraform apply -auto-approve -input=false

  discover-landingzones:
    needs: apply-platform
    runs-on: ubuntu-latest
    outputs:
      state_dirs: ${{ steps.find.outputs.dirs }}
    steps:
      - uses: actions/checkout@v4
      - name: Discover landing zone state roots
        id: find
        run: |
          dirs=$(find rbac/landingzones -name backend.tf \
            | xargs -I{} dirname {} \
            | sort \
            | jq -R -s -c 'split("\n") | map(select(length > 0))')
          echo "dirs=$dirs" >> "$GITHUB_OUTPUT"

  apply-landingzones:
    needs: discover-landingzones
    runs-on: ubuntu-latest
    strategy:
      # Do not cancel sibling applies mid-run when one cell fails.
      fail-fast: false
      matrix:
        state_dir: ${{ fromJson(needs.discover-landingzones.outputs.state_dirs) }}
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_MANAGEMENT_SUBSCRIPTION_ID }}
      - uses: hashicorp/setup-terraform@v3
      - name: Init and apply
        working-directory: ${{ matrix.state_dir }}
        env:
          TF_VAR_principal_ids: ${{ secrets.PRINCIPAL_IDS_JSON }}
          TF_VAR_management_subscription_id: ${{ secrets.MANAGEMENT_SUBSCRIPTION_ID }}
        run: |
          terraform init -input=false
          terraform apply -auto-approve -input=false

I keep discover-landingzones dynamic by using find rbac/landingzones -name backend.tf. New cells get picked up without workflow edits.

How I expand this safely

Add a region

I copy an existing region cell and update:

  • subscriptions.yaml
  • providers.tf
  • backend.tf
  • role_assignments.yaml
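
Of the files in that list, backend.tf is the easiest one to forget, and a copied key would point two cells at the same state blob. As an illustration, assuming the existing australiaeast cell is copied to landingzones/corp/prod/newzealandnorth and the storage settings stay as in the backend.tf shown earlier, only the key changes:

```hcl
# landingzones/corp/prod/newzealandnorth/backend.tf
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "stterraformstate"
    container_name       = "rbac"
    # The key mirrors the cell's folder path, so each cell keeps its own state blob.
    key      = "landingzones/corp/prod/newzealandnorth/terraform.tfstate"
    use_oidc = true
  }
}
```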

Add an archetype

I add a new top-level directory under landingzones/ and reuse the same per-cell layout.

Add an environment

I add the environment folder under an archetype and wire approval gates with a matching GitHub Environment if needed.

Add a second subscription to a cell

I update subscriptions.yaml, add or un-comment the secondary provider alias, and add the second module block in main.tf.

Things I have gotten wrong

I used to centralize too much RBAC into one state. Plans became slow, locking got noisy, and failure blast radius became uncomfortable.

I also used subscription GUID folders early on. That looked clean on day one and became painful as soon as on-call work started.

Benefits I get from this design

  • State isolation per cell
  • Fast plans and refresh due to small state files
  • Human-readable folder context
  • Team ownership mapping via CODEOWNERS
  • Localized YAML config next to the state it drives
  • Deterministic role assignment IDs with uuidv5
  • Horizontal growth without structural rewrites
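
To make the deterministic-ID point concrete, here is a small standalone sketch that reuses the key expression from locals.tf and the name expression from the module on one row of the sample YAML. The local and output names prefixed with example_ are hypothetical; running terraform init and terraform plan in an empty directory containing only this file should print both outputs without touching Azure.

```hcl
locals {
  example_subscription_id = "00000000-0000-0000-0000-000000000001"

  # First row of the sample role_assignments.yaml (placeholder principal ID).
  example_assignment = {
    principal_id         = "aad-group-id-corp-prod-contributors"
    role_definition_name = "Contributor"
  }

  # Same key expression as assignment_map in locals.tf.
  example_key = "${replace(local.example_assignment.role_definition_name, " ", "_")}__${substr(local.example_assignment.principal_id, 0, 8)}"

  # Same name expression as azurerm_role_assignment.this in the module.
  example_name = uuidv5("dns", "${local.example_subscription_id}/${local.example_key}")
}

output "example_key" {
  value = local.example_key # "Contributor__aad-grou"
}

output "example_name" {
  # A name-based UUID: identical inputs give the identical value on every run.
  value = local.example_name
}
```

Because the name depends only on the subscription ID and the key, re-running a plan does not generate new assignment names. The flip side is that the key uses only the role name and the first eight characters of the principal ID: two YAML rows that share both (for example the same principal and role at two different scope_suffix values) would produce the same key, and Terraform rejects duplicate keys in a for expression.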