linuxlab.io
Copyright © 2026 LinuxLab. All rights reserved.

kb/testing ── Testing ── intermediate

What to Test in Terraform, and What to Skip

Infrastructure is not an application, so do not apply the test pyramid literally. Test module contracts, business rules, complex expressions, and refactors that should produce no destroy. Do not test that the provider works, that the AWS API returns 200, or that a trivial `name = var.name` holds. The goal is to catch regressions, not to prove correctness.

aka: iac-testing, terraform-testing-strategy, testing-pyramid-terraform

The test pyramid does not fit infrastructure

The classic pyramid is usually quoted as roughly 70% unit, 20% integration, 10% e2e. That works for an application. For Terraform it gets awkward.

  • A unit test of a Terraform module is cheap (.tftest.hcl + mock_provider), but it checks something narrow: "the module declares what the HCL says it should declare." It will not protect you from a bug in the provider or in the AWS API.
  • Integration is an order of magnitude more expensive (you bring up LocalStack or AWS, which takes minutes), but it actually checks that the result is working infrastructure.
  • E2E is the production deploy itself. The deployment already lives in the pipeline, so a separate e2e stage is usually duplication.

A realistic profile for a Terraform repo is closer to 40% unit, 40% integration on LocalStack, 20% policy/compliance, with e2e being the production pipeline.

What to test

1. The module contract

A module takes var.name and declares aws_s3_bucket.this. The test: you pass "foo", and in the plan aws_s3_bucket.this.bucket == "foo". This is insurance against an accidental rename of the variable.

hcl
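# tests/contract.tftest.hcl
# mock_provider (Terraform 1.7+) lets the plan run without AWS credentials.
mock_provider "aws" {}

run "var_name_propagates" {
  command = plan

  variables {
    name = "foo"
  }

  # bucket is set directly from var.name, so its value is known at plan time.
  assert {
    condition     = aws_s3_bucket.this.bucket == "foo"
    error_message = "var.name does not reach bucket"
  }
}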

2. Business rules (policy)

Encryption is mandatory. The CostCenter tag is mandatory. Nothing is public by default. This is a company-level concern, not a module-level one. The better home for it is policy-as-code (tf-policy-as-code / terraform-compliance) running against the plan file in CI.

3. Complex expressions

A plain name = var.name is not worth a test. But this is:

hcl
locals {
  bucket_name = "${var.team}-${var.purpose}-${random_id.suffix.hex}"
  tags = merge(
    var.default_tags,
    { Team = var.team, ManagedBy = "terraform" },
  )
}

There is logic here. The test: "when team=ai and purpose=logs, the name starts with ai-logs-".
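
A minimal sketch of that test, under a few assumptions that are not in the original: the file name is hypothetical, the module declares `team`, `purpose`, and `default_tags` as variables, and both providers are mocked. Because `random_id.suffix.hex` is only known after apply, the sketch uses `command = apply` against the mocks instead of asserting at plan time.

```hcl
# tests/naming.tftest.hcl (hypothetical file name)
# Mocks require Terraform 1.7+; nothing is created in a real account.
mock_provider "aws" {}
mock_provider "random" {}

run "bucket_name_has_team_and_purpose_prefix" {
  # apply (against mocks) so that random_id.suffix.hex has a concrete value
  command = apply

  variables {
    team         = "ai"
    purpose      = "logs"
    default_tags = {} # assumed to be a map(string) variable
  }

  # Only the stable prefix is checked; the random suffix is left alone.
  assert {
    condition     = startswith(local.bucket_name, "ai-logs-")
    error_message = "bucket_name must start with <team>-<purpose>-"
  }

  # merge() must not lose the tags the module adds itself.
  assert {
    condition     = local.tags["Team"] == "ai" && local.tags["ManagedBy"] == "terraform"
    error_message = "Team and ManagedBy tags must survive the merge"
  }
}
```

Asserting on the prefix and on individual tag keys keeps the test stable when someone adds another default tag or changes the suffix length.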

4. Refactors (moved blocks)

You moved aws_s3_bucket.logs into a module. Write a test for a clean plan:

hcl
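run "no_diff_after_refactor" {
  command = plan
  # Assert on the attributes that must survive the move (name, tags, ...).
  # Caveat: a `terraform test` run starts from empty state, so "zero changes
  # against the existing state" has to be checked against the real state.
}

Terraform may say "no changes" on its own, but unless something in CI checks it, that fact is recorded nowhere. One way to record it is `terraform plan -detailed-exitcode` against the real state: it exits 0 for an empty plan and 2 when changes are pending.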
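
For reference, the refactor itself is usually recorded with a `moved` block, which is what lets the plan stay free of destroy/create pairs. The sketch below is illustrative only: the module name `logs`, its source path, and the in-module resource name `this` are assumptions, not names from the original.

```hcl
# Hypothetical layout: the bucket used to be declared at the root as
# aws_s3_bucket.logs and now lives inside a module called "logs".
module "logs" {
  source = "./modules/logs" # assumed path
}

# Tells Terraform the existing object has a new address, so the plan
# shows a move instead of a destroy followed by a create.
moved {
  from = aws_s3_bucket.logs
  to   = module.logs.aws_s3_bucket.this
}
```

If the `moved` block is missing or points at the wrong address, the plan shows the bucket being destroyed and recreated, which is exactly the regression this kind of test is meant to catch.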

5. Preconditions and postconditions

hcl
variable "env" {
  type = string
  validation {
    condition     = contains(["dev", "stage", "prod"], var.env)
    error_message = "env must be dev/stage/prod"
  }
}

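A test with expect_failures = [var.env] and env = "xyz" confirms that the validation fires. The same mechanism covers precondition and postcondition blocks: list the resource or output that carries them in expect_failures. See tf-test-framework.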
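
A minimal sketch of such a negative test. The file name is hypothetical, and it assumes `env` is the only variable without a default; any other required variables would need values too.

```hcl
# tests/validation.tftest.hcl (hypothetical file name)
mock_provider "aws" {} # Terraform 1.7+, no credentials needed

run "rejects_unknown_env" {
  command = plan

  variables {
    env = "xyz" # deliberately outside dev/stage/prod
  }

  # The run passes only if validation on var.env fails.
  expect_failures = [
    var.env,
  ]
}
```

If someone later loosens the `contains(...)` condition so that "xyz" is accepted, this run starts failing, which is the regression signal you want.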

What not to test

1. That HCL describes the AWS API correctly

hcl
resource "aws_s3_bucket" "this" {
  bucket = "foo"
}

Testing that "on apply, a bucket named foo appears in AWS" is pointless. That is the job of HashiCorp and the AWS SDK. You are not testing the compiler, you are testing your own code.

2. Trivial pass-through

hcl
output "arn" {
  value = aws_s3_bucket.this.arn
}

A test that "output arn equals the ARN" is meaningless. There is nothing here that can break.

3. That the cloud behaves like the cloud

"After apply, the bucket really does return 200 on a HEAD request" is an operations-level smoke test, not a module test. You do it in production through monitoring, not in a test suite.

4. Performance

A test like "apply 100 resources in under 60 seconds" tends to be flaky, because Terraform performance depends on provider latency and the network. If you want it, run a separate benchmark once a week, not on every PR.

Levels and tools

| Level | What | With |
|---|---|---|
| Static analysis | Syntax, formatting, common mistakes | terraform fmt -check, terraform validate, tf-checkov |
| Lint | Style, deprecated args, provider best practices | tflint with a rule set |
| Unit (module in isolation) | Module contract, naming, business rules | .tftest.hcl + mock_provider |
| Integration | Resources are really created, cross-resource interactions | .tftest.hcl with command = apply on LocalStack, or Terratest |
| Policy | Corporate rules (tags, security) | OPA+Rego, terraform-compliance, Checkov |
| E2E | Deploying the prod environment | The production pipeline itself |
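
The Integration row can be sketched as a test file that points the AWS provider at LocalStack and runs a real apply. Everything specific here is an assumption: the file name, LocalStack listening on its default edge port 4566 on localhost, the dummy credentials, and a module that creates `aws_s3_bucket.this` from `var.name`.

```hcl
# tests/integration.tftest.hcl (hypothetical file name)
# Assumes LocalStack is already running on http://localhost:4566.
provider "aws" {
  region     = "us-east-1"
  access_key = "test" # dummy credentials for the local emulator
  secret_key = "test"

  # Skip calls that only make sense against real AWS.
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true
  s3_use_path_style           = true

  endpoints {
    s3 = "http://localhost:4566"
  }
}

run "bucket_is_created" {
  # apply creates the bucket in LocalStack; terraform test destroys it afterwards
  command = apply

  variables {
    name = "foo"
  }

  assert {
    condition     = aws_s3_bucket.this.id == "foo"
    error_message = "bucket was not created under the expected name"
  }
}
```

Unlike the mocked unit test, this run fails if the provider or the emulated API rejects the request, at the cost of needing a running LocalStack container.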

The golden plan

One light but strong test: "the current code produces a plan that matches a saved reference text byte for byte." Any change (to HCL, the provider, or a module) shows up as a diff, and the reviewer sees exactly what changed.

Implementation:

bash
terraform plan -out=plan.tfplan
terraform show -no-color plan.tfplan > plan.golden

You commit plan.golden. In CI:

bash
terraform plan -out=plan.tfplan
terraform show -no-color plan.tfplan > plan.current
diff plan.golden plan.current || exit 1

When you change HCL, you update the golden. The PR then shows a diff in the HCL and a diff in the golden, so both sides are visible. This is useful on root modules where you expect zero diff.

How many tests

No more than you can justify. Signs you went too far:

  • The tests take longer than the apply itself.
  • They break more often from a provider upgrade than from your own code.
  • There is more copy-paste in the tests than in the production code.
  • Nobody can explain what this particular test is meant to catch.

Signs you tested too little:

  • Boilerplate mistakes reach prod.
  • Refactors break something that was not visible in the plan.
  • Business rules get violated (a tag is forgotten, encryption is turned off).

Find the balance between the two.

Pitfalls

  • Tests are a liability, not an asset. Every test has to be maintained. An old test that nobody understands but everyone is afraid to delete is toxic debt.

  • Mocks do not catch integration bugs. A unit test with mock_provider can pass while the real apply fails, for example because the AWS API rejects a name that breaks its format rules or a combination of arguments that the mock never validates.

  • A cheap test can be expensive to maintain. A scenario like "when var.foo=true, the plan shows 5 resources" is simple, but it breaks the moment you add a sixth. Test invariants ("every account has a KMS key") rather than exact counts.

  • Tests do not replace code review. Well-written HCL gets reviewed faster than bad code with 100% test coverage. Tests are an addition, not a substitute.

  • Write production incidents down as tests. Every time something breaks in production, add a test that would have caught it. That is the one reliable way to grow a test suite that actually catches real bugs.
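
The invariant-over-count advice from the pitfalls above can be sketched as follows. The names are hypothetical: it assumes the module takes a `var.accounts` set of account names and declares `aws_kms_key.per_account` with `for_each`.

```hcl
# tests/invariants.tftest.hcl (hypothetical file name)
mock_provider "aws" {} # Terraform 1.7+

run "every_account_has_a_kms_key" {
  command = plan

  variables {
    accounts = ["dev", "prod"] # assumed set(string) variable
  }

  # Checks the rule, not the resource count: the assertion stays valid
  # when unrelated resources are added to the module.
  assert {
    condition = alltrue([
      for account in var.accounts :
      contains(keys(aws_kms_key.per_account), account)
    ])
    error_message = "every account in var.accounts must have a KMS key"
  }
}
```

Adding a sixth resource to the module leaves this test green; filtering some accounts out of the key's `for_each` turns it red.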

§ See also

  • tf-test-framework: Native test framework: .tftest.hcl, run, and assert. Since version 1.6, Terraform ships a built-in test runner. Files named `*.tftest.hcl` describe scenarios through `run` blocks (each a mini plan or apply) and `assert` checks. The `terraform test` command runs all of them and reports pass/fail. With `command = plan` the runner evaluates expressions against plan output and creates no resources; combined with a mock provider, no cloud account is required.
  • tf-test-mocks: Mock providers: mock_provider, override_resource, override_data. A mock provider replaces a real AWS provider with synthetic responses. Tests run without the cloud, in seconds rather than minutes. Declare one in `*.tftest.hcl` with `mock_provider "aws"`. To substitute a single resource or data source, use `override_resource` or `override_data`. Without mocks, every `command = apply` block needs LocalStack or a real cloud account.
  • terraform-compliance: BDD checks against a plan file. terraform-compliance reads a plan file (`plan.json`) and applies BDD rules written in Gherkin. "Given a resource of type X, it must contain a property Y" reads cleanly for non-engineers and enforces policy before apply. It is an alternative to OPA/Rego for teams that prefer natural language, though it is less capable: complex cross-resource checks are hard to express.