linuxlab.io
Tutorials▾
  • Linux & networking
    File system, processes, TCP/IP, BGP and OSPF
    →
  • Terraform & IaC
    HCL, state, plan/apply on a LocalStack sandbox
    →
  • Git & GitHub
    Object model, plumbing, branching, GitHub Actions
    →
All tutorials →
PricingAboutSign inCreate account
Copyright © 2026 LinuxLab. All rights reserved.

lesson ── terraform-advanced ── ~16 min ── 6 steps

Large-scale state, breaking up the monolith

A single state holding 1000 resources means lock contention, slow refresh, and a large blast radius when something goes wrong. The fix is to split it into layered stacks (network, apps, and so on) that exchange values through the terraform_remote_state data source. In this lesson you build a monolith, split it into network and apps, and watch the two communicate.

▶ interactive sandbox

A pair of containers starts up: terraform 1.9 and localstack 3.8 on the same network. A terminal opens in the browser, and you can run terraform init right away. Every step is checked automatically. TTL is 45 minutes, no registration required.

launch sandbox →

stack ── terraform · localstack · 1 GB RAM · self-destructs after 45 min of inactivity

Steps

  1. 01

    The baseline monolith

    bash
    cd /home/student/scale/monolith
    cat > main.tf <<'EOF'
    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"
      tags = { Name = "scale-vpc" }
    }
    resource "aws_subnet" "private" {
      count             = 2
      vpc_id            = aws_vpc.main.id
      cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
      availability_zone = "us-east-1${["a", "b"][count.index]}"
      tags = { Name = "private-${count.index}" }
    }
    resource "aws_s3_bucket" "app_logs" {
      bucket = "scale-app-logs"
    }
    resource "aws_s3_bucket" "app_data" {
      bucket = "scale-app-data"
    }
    EOF
    terraform init -no-color > /dev/null
    terraform apply -auto-approve -no-color > /dev/null
    terraform state list

    You see the VPC, 2 subnets, 2 buckets, all in one state.

    ✓ The monolith is created. Now you split it.
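
    Before splitting, it can help to see how the state breaks down by resource type. The following is a minimal sketch, assuming every address comes from the root module (no module. or data. prefixes). count_by_type is a hypothetical helper name, not a Terraform command.

```shell
# count_by_type: hypothetical helper. Reads resource addresses on stdin
# (one per line, as printed by `terraform state list`) and prints
# "<count> <type>" for each resource type.
# Assumption: root-module addresses only, so the type is the text
# before the first dot.
count_by_type() {
  awk -F. 'NF { count[$1]++ } END { for (t in count) print count[t], t }' | sort -k2
}

# Usage in the sandbox (needs terraform and an initialized directory):
#   terraform state list | count_by_type
```

    Types that cluster together (here the VPC and subnets versus the buckets) are natural candidates for separate stacks.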

  2. 02

    Move the VPC and subnets into a network state

    The strategy is to create a new state and move the resources into it:

    bash
    cd /home/student/scale/network
    cat > main.tf <<'EOF'
    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"
      tags = { Name = "scale-vpc" }
    }
    resource "aws_subnet" "private" {
      count             = 2
      vpc_id            = aws_vpc.main.id
      cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
      availability_zone = "us-east-1${["a", "b"][count.index]}"
      tags = { Name = "private-${count.index}" }
    }
    output "vpc_id" {
      value = aws_vpc.main.id
    }
    output "private_subnet_ids" {
      value = aws_subnet.private[*].id
    }
    EOF
    terraform init -no-color > /dev/null

    Now move the resources. Back up the monolith with state pull, move the network resources into a separate state file with state mv -state-out, then load that file into the network stack with state push:

    bash
    # back up the full monolith state before changing anything
    cd ../monolith
    terraform state pull > /tmp/full.tfstate
    # move only the network resources out of the monolith
    terraform state mv -state-out=/tmp/network.tfstate aws_vpc.main aws_vpc.main
    terraform state mv -state-out=/tmp/network.tfstate 'aws_subnet.private[0]' 'aws_subnet.private[0]'
    terraform state mv -state-out=/tmp/network.tfstate 'aws_subnet.private[1]' 'aws_subnet.private[1]'
    terraform state list
    # load the extracted state file into the network stack
    cd ../network
    terraform state push -force /tmp/network.tfstate
    terraform state list

    The network state now holds the VPC and 2 subnets. The monolith has only the buckets.

    ✓ Network is extracted. There are two states now.
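
    The state mv calls in this step differ only in the resource address, so with more resources they could be generated rather than typed. The sketch below prints the commands for review instead of running them. gen_mv_commands is a hypothetical helper, and the grep pattern in the usage comment is an assumption about which types belong to the network layer.

```shell
# gen_mv_commands: hypothetical helper. Reads resource addresses on stdin
# and prints (does not run) one `terraform state mv` command per address,
# keeping the address unchanged and writing to the given state file.
gen_mv_commands() {
  out_file="$1"
  while IFS= read -r addr; do
    [ -n "$addr" ] || continue          # skip blank lines
    printf "terraform state mv -state-out=%s '%s' '%s'\n" "$out_file" "$addr" "$addr"
  done
}

# Usage in the sandbox, run from the monolith directory:
#   terraform state list | grep -E '^aws_(vpc|subnet)\.' \
#     | gen_mv_commands /tmp/network.tfstate
# Review the printed commands before running any of them.
```

    Printing first keeps a human in the loop: a state move is hard to undo without the backup taken earlier.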

  3. 03

    An apps state with terraform_remote_state

    bash
    cd /home/student/scale/apps
    cat > main.tf <<'EOF'
    data "terraform_remote_state" "network" {
      backend = "local"
      config = {
        path = "../network/terraform.tfstate"
      }
    }
    resource "aws_s3_bucket" "app_logs" {
      bucket = "scale-app-logs"
      tags = {
        VPC = data.terraform_remote_state.network.outputs.vpc_id
      }
    }
    resource "aws_s3_bucket" "app_data" {
      bucket = "scale-app-data"
      tags = {
        SubnetCount = length(data.terraform_remote_state.network.outputs.private_subnet_ids)
      }
    }
    EOF
    terraform init -no-color > /dev/null

    Move the buckets out of the monolith:

    bash
    cd ../monolith
    terraform state mv -state-out=/tmp/apps.tfstate aws_s3_bucket.app_logs aws_s3_bucket.app_logs
    terraform state mv -state-out=/tmp/apps.tfstate aws_s3_bucket.app_data aws_s3_bucket.app_data
    terraform state list
    cd ../apps
    terraform state push -force /tmp/apps.tfstate
    terraform state list

    The apps state has 2 buckets. The monolith is empty.

    ✓ Apps reads network through remote_state. The monolith is taken apart.
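
    After a split, no resource should be tracked by two states at once. A rough way to check this is sketched below, assuming the terraform state list output of each stack has been saved to a text file. state_overlap and the file paths are hypothetical.

```shell
# state_overlap: hypothetical helper. Takes two files, each holding the
# output of `terraform state list` for one stack, and prints every
# address that appears in both. Empty output means a clean split.
state_overlap() {
  sorted_a=$(mktemp)
  sorted_b=$(mktemp)
  sort "$1" > "$sorted_a"
  sort "$2" > "$sorted_b"
  comm -12 "$sorted_a" "$sorted_b"     # lines common to both files
  rm -f "$sorted_a" "$sorted_b"
}

# Usage in the sandbox:
#   (cd /home/student/scale/network && terraform state list) > /tmp/network.txt
#   (cd /home/student/scale/apps && terraform state list) > /tmp/apps.txt
#   state_overlap /tmp/network.txt /tmp/apps.txt
```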

  4. 04

    Check the cross-state reference

    bash
    cd /home/student/scale/apps
    terraform plan -no-color 2>&1 | tail -10

    The apps plan shows in-place updates to the buckets: app_logs gains the tag VPC = <vpc-id> and app_data gains SubnetCount, both computed from the network state's outputs.

    Apply it:

    bash
    terraform apply -auto-approve -no-color > /dev/null
    terraform state show aws_s3_bucket.app_logs | grep -E "VPC|tags"

    The tag is there and the vpc-id is real: apps actually read the output from the network state.

    ✓ The cross-state reference works. The network outputs are available in apps.

  5. 05

    Apps and network now change independently

    Scenario: you add one more bucket to apps. Network is left alone.

    bash
    cd /home/student/scale/apps
    cat >> main.tf <<'EOF'
    resource "aws_s3_bucket" "metrics" {
      bucket = "scale-metrics"
    }
    EOF
    terraform plan -no-color 2>&1 | tail -10
    terraform apply -auto-approve -no-color > /dev/null
    # the network state was left untouched
    cd ../network
    terraform plan -no-color 2>&1 | tail -5

    The apps plan shows +1 resource. The network plan says No changes. The isolation works.

    On a large project this means a PR to apps does not block network PRs, the locks on the two states are separate, and the blast radius is contained.

    ✓ Isolation proven. Apps changes without touching network.
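
    Reading the tail of the plan output works for a person. A script can make the same check with the -detailed-exitcode flag of terraform plan, which exits with 0 when there are no changes, 1 on error, and 2 when changes are pending. A minimal sketch follows. describe_plan_exit is a hypothetical helper and /tmp/network-plan.txt is an arbitrary path.

```shell
# describe_plan_exit: hypothetical helper that maps the exit status of
# `terraform plan -detailed-exitcode` to a short label.
describe_plan_exit() {
  case "$1" in
    0) echo "no changes" ;;
    1) echo "error" ;;
    2) echo "changes pending" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage in the sandbox (no pipe, so $? is terraform's own status):
#   cd /home/student/scale/network
#   terraform plan -detailed-exitcode -no-color > /tmp/network-plan.txt 2>&1
#   describe_plan_exit "$?"
```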

    The same thing on OpenTofu

    OpenTofu keeps its CLI and state format compatible with Terraform for the commands in this step. Migration usually goes through mv .terraform .terraform.bak; tofu init -upgrade. On your first switch, though, back up the state and try it on a feature branch first: the differences cluster in the newer features (variables in the backend configuration, state encryption, OCI registry-backed modules). See tf-opentofu-parity for the full matrix.

    • → OpenTofu parity
  6. 06

    Blast radius: break apps, network survives

    You simulate the destruction of the apps state (do not do this in prod without a backup):

    bash
    cd /home/student/scale/apps
    cp terraform.tfstate /tmp/apps-backup.tfstate
    echo '{"corrupt": true}' > terraform.tfstate
    set +e
    terraform plan -no-color 2>&1 | tail -5
    code=${PIPESTATUS[0]}   # exit status of terraform; plain $? would be tail's
    set -e
    echo "apps plan exit: $code"
    # restore it
    cp /tmp/apps-backup.tfstate terraform.tfstate

    The apps state is corrupt, the plan fails. But the network state is intact:

    bash
    cd ../network
    terraform plan -no-color 2>&1 | tail -5
    [ "${PIPESTATUS[0]}" -eq 0 ] && echo "network plan ok"

    The VPC and subnets keep working. This is what an isolated blast radius means: one stack can burn down while the rest lives on.

    ✓ Network is protected from breakage in apps. That is blast-radius isolation.
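
    A cheap pre-flight check can catch a visibly broken local state file before plan runs. The sketch below is a heuristic only: it assumes a local state file and merely looks for the "lineage" and "serial" keys that Terraform writes at the top level of a state file. looks_like_state is a hypothetical helper, not a validator.

```shell
# looks_like_state: hypothetical heuristic, not a real validator.
# Succeeds if the file contains the "lineage" and "serial" keys that
# Terraform writes into a state file; fails otherwise.
looks_like_state() {
  grep -q '"lineage"' "$1" && grep -q '"serial"' "$1"
}

# Usage in the sandbox, from the apps directory:
#   looks_like_state terraform.tfstate \
#     || echo "terraform.tfstate looks corrupt, restore the backup first"
```

    The corrupted file from this step, {"corrupt": true}, fails the check because it has neither key.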

    When not to split

    Splitting is not always a win. The overhead:

    1. Orchestration. Stacks must be applied in dependency order, which usually means Terragrunt or a shell script.
    2. Cross-state coupling. Deleting or renaming an output breaks every stack that reads it.
    3. Harder for newcomers. The answer to "Where does the VPC id live?" is longer now.
    4. Replicated provider config. Every stack needs its own provider block. Across 10 stacks, that is 10 provider blocks.
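
    The orchestration overhead in the first item can be as small as a loop. The sketch below runs one command per stack in the order given and stops at the first failure. run_in_order is a hypothetical helper, and the network-before-apps order is taken from this lesson, where apps reads the network outputs.

```shell
# run_in_order: hypothetical helper. Runs one command in each stack
# directory, in the order given, and stops at the first failure so a
# broken lower layer is never followed by an apply on top of it.
#   $1      command string to run inside each directory
#   $2...   stack directories, dependencies first
run_in_order() {
  cmd="$1"
  shift
  for dir in "$@"; do
    echo "== $dir"
    ( cd "$dir" && sh -c "$cmd" ) || {
      echo "failed in $dir, stopping" >&2
      return 1
    }
  done
}

# Usage in the sandbox: network first, because apps reads its outputs.
#   run_in_order 'terraform apply -auto-approve -no-color' \
#     /home/student/scale/network /home/student/scale/apps
```

    A tool such as Terragrunt derives the order from declared dependencies, whereas here the caller has to keep the argument order correct by hand.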

    Do not split if:

    • You have fewer than 200 resources in the state.
    • One env, one team.
    • Plan finishes in under 30 seconds and apply time is reasonable.

    Split deliberately, once the symptoms (slow plan, lock contention, fear of apply) show up, not "for the future".

    See tf-large-scale-state.

    • → Large-scale state
    • → state mv in detail

What you learned

terraform_remote_state is a data source that reads outputs from another state file. Only outputs are visible cross-state; internal resources and locals are not. Each stack has its own backend key and its own lock.

commands

  • data.terraform_remote_state.network.outputs.X: a reference to an output from another state.
  • terraform state list: what is in the current state. One state is not everything.
  • terraform state pull > backup.json: a backup before a split operation.

concepts

  • One state per layer = blast-radius isolation
  • Cross-state references go through outputs: version them, and do not rename them casually
  • One DynamoDB lock table per org, with a separate partition-key value (LockID) per state
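
Because readers depend on output names, it is worth listing which outputs a stack actually consumes before renaming or deleting one. A minimal sketch, assuming the references are written out in full in the stack's .tf files as in this lesson. referenced_outputs is a hypothetical helper.

```shell
# referenced_outputs: hypothetical helper. Lists the output names that
# the .tf files in a directory read from one terraform_remote_state
# data source.
#   $1  directory of the reader stack
#   $2  name of the data source (for example: network)
referenced_outputs() {
  grep -ohE "data\.terraform_remote_state\.$2\.outputs\.[A-Za-z0-9_-]+" "$1"/*.tf \
    | sed 's/.*\.outputs\.//' \
    | sort -u
}

# Usage in the sandbox:
#   referenced_outputs /home/student/scale/apps network
```

For the apps stack built in this lesson, the names listed are the ones the network stack must keep stable.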

← previous

Native tests, .tftest.hcl and assert

next →

State: what lives inside the terraform.tfstate file
