Infrastructure testing — why your Terraform plan is not a test
“We run terraform plan in CI, so our infrastructure is tested.”
I hear this at least once a month. It’s wrong every time. Running terraform plan in your pipeline is like running a compiler and calling it a test suite. It tells you the code parses and the references resolve. It does not tell you the code does what you think it does.
Plan is a build step. Not a test.
What plan actually tells you
Terraform plan computes the difference between your declared configuration and the current state. It answers one question: what will change? It’ll tell you that a security group rule will be added, a VM will be resized, or a storage account will be destroyed and recreated.
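To make “what will change?” concrete, here is a minimal Python sketch that summarizes a plan after it has been converted to JSON with terraform show -json. It relies on the resource_changes list and the per-resource change.actions field from Terraform’s JSON plan format. The sample plan and its resource addresses are hand-written for illustration, not real tool output.

```python
from collections import Counter


def summarize_plan(plan: dict) -> Counter:
    """Count planned changes by kind from a JSON plan document."""
    counts = Counter()
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        # A replacement is encoded as two actions on the same resource.
        if set(actions) == {"delete", "create"}:
            counts["replace"] += 1
        else:
            counts[actions[0]] += 1
    return counts


# Hypothetical plan fragment, trimmed to the fields used above.
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "azurerm_network_security_rule.app_to_db",
         "change": {"actions": ["create"]}},
        {"address": "azurerm_linux_virtual_machine.app",
         "change": {"actions": ["update"]}},
        {"address": "azurerm_storage_account.logs",
         "change": {"actions": ["delete", "create"]}},
    ]
}

if __name__ == "__main__":
    print(dict(summarize_plan(SAMPLE_PLAN)))
```

Everything this function can report is a count of creates, updates, and replacements. That is the full extent of what a diff knows.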
That information is useful. Knowing what will change before you apply it is better than the alternative. Plan catches typos, unexpected resource replacements, and accidental deletions. It’s a critical part of the workflow.
But “what will change” is a different question from “is this correct.”
What plan can’t tell you
Plan doesn’t evaluate whether your changes produce a working system. It computes a diff. Here’s what it can’t answer.
Will the new network security group rules actually allow traffic between your application tier and your database? Plan shows you the rules. It doesn’t verify that the combination of rules, route tables, and network ACLs results in working connectivity.
Does the IAM policy you just modified still follow least-privilege principles? Plan shows you the policy document. It doesn’t tell you whether s3:* on * is an acceptable scope. (It isn’t.)
Will the VM size you selected handle the workload in production? Plan shows the size changing from Standard_D2s_v3 to Standard_D4s_v3. Whether that’s enough for your actual traffic is a question plan can’t touch.
Plan operates at the resource level. Your infrastructure operates at the system level. The gap between those two perspectives is where bugs live.
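The IAM question is a good example of something plan displays but never judges. A rough Python sketch of the judgment itself, assuming the policy document is available as a JSON string: flag any Allow statement that pairs a wildcard action with a wildcard resource. The function name and the sample policy are my own illustration, and the check is deliberately narrow. It is not a full least-privilege analysis.

```python
import json


def _as_list(value):
    """IAM allows a single string where a list is expected."""
    return [value] if isinstance(value, str) else list(value or [])


def overly_broad_statements(policy_json: str) -> list:
    """Return Allow statements that grant wildcard actions on all resources."""
    doc = json.loads(policy_json)
    statements = doc.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if wildcard_action and "*" in resources:
            findings.append(stmt)
    return findings


# Hypothetical policy: one scoped statement, one far too broad.
SAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::reports/*"},
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ],
})

if __name__ == "__main__":
    for stmt in overly_broad_statements(SAMPLE_POLICY):
        print("too broad:", stmt)
```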
Policy testing with OPA and Conftest
The first layer of real testing is policy validation. Tools like Open Policy Agent (OPA) with Conftest let you write assertions against the JSON representation of a Terraform plan.
The concept is simple. You save the plan with terraform plan -out, convert it to JSON with terraform show -json, then run Rego policies against the result. “No security group may allow ingress from 0.0.0.0/0 on port 22.” “All storage accounts must have encryption enabled.” “Every resource must have a cost-center tag.” These are rules that can be expressed as code and evaluated automatically.
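In a real pipeline these rules would be written in Rego and run by Conftest. To show the shape of the logic without that setup, here is a Python stand-in that checks two of the rules against plan JSON. It assumes the AWS provider’s aws_security_group schema (ingress blocks with from_port, to_port, and cidr_blocks) and applies the tag rule only to resources that expose a tags attribute. The sample plan is hand-written.

```python
def check_plan(plan: dict) -> list:
    """Return human-readable policy violations found in a JSON plan."""
    violations = []
    for rc in plan.get("resource_changes", []):
        after = rc["change"].get("after")
        if after is None:  # resource is being destroyed, nothing to check
            continue

        # Rule 1: no SSH ingress from the whole internet.
        if rc["type"] == "aws_security_group":
            for rule in after.get("ingress") or []:
                open_to_world = "0.0.0.0/0" in (rule.get("cidr_blocks") or [])
                covers_ssh = rule["from_port"] <= 22 <= rule["to_port"]
                if open_to_world and covers_ssh:
                    violations.append(f"{rc['address']}: SSH open to 0.0.0.0/0")

        # Rule 2: taggable resources must carry a cost-center tag.
        if "tags" in after and "cost-center" not in (after["tags"] or {}):
            violations.append(f"{rc['address']}: missing cost-center tag")
    return violations


# Hypothetical plan fragment with one compliant and one non-compliant resource.
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "aws_security_group.bastion", "type": "aws_security_group",
         "change": {"actions": ["create"], "after": {
             "tags": None,
             "ingress": [{"from_port": 22, "to_port": 22,
                          "cidr_blocks": ["0.0.0.0/0"]}]}}},
        {"address": "aws_security_group.web", "type": "aws_security_group",
         "change": {"actions": ["create"], "after": {
             "tags": {"cost-center": "platform"},
             "ingress": [{"from_port": 443, "to_port": 443,
                          "cidr_blocks": ["0.0.0.0/0"]}]}}},
    ]
}

if __name__ == "__main__":
    for violation in check_plan(SAMPLE_PLAN):
        print(violation)
```

This sketch ignores rules that open every port through protocol "-1", and it does not look at standalone rule resources. A real policy set would need to cover both.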
Policy tests are cheap to write and fast to run. They catch the obvious mistakes: public-facing resources that should be private, missing tags that break your billing reports, permissive firewall rules that violate your security baseline.
The limitation is that policies only enforce known rules. They don’t test system behavior. “No public buckets” is a policy. “The application can reach its database” is a behavior. You need different tools for each.
Static analysis with Checkov and tfsec
Static analysis tools like Checkov and tfsec scan your HCL files directly, without needing a plan output. They check for security misconfigurations using built-in rule sets based on CIS benchmarks, cloud provider best practices, and common mistakes.
These tools catch things like hardcoded secrets in resource definitions, security groups with overly permissive rules, storage without encryption at rest, databases without backup configurations, and logging disabled on resources that should have it.
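To give a feel for what scanning source files means, here is a toy Python check for one item on that list: string literals assigned to secret-looking attributes. This is not how Checkov or tfsec work internally. They parse HCL properly and ship far larger rule sets. The regex, the attribute name list, and the sample configuration are all assumptions made for illustration.

```python
import re

# Matches lines like:  password = "literal"
# Unquoted references such as var.db_password do not match.
SECRET_ASSIGNMENT = re.compile(
    r'^\s*(\w*(?:password|secret|token|api_key)\w*)\s*=\s*"([^"]*)"',
    re.IGNORECASE,
)


def find_hardcoded_secrets(hcl_text: str) -> list:
    """Return (line_number, attribute_name) for suspicious literal values."""
    findings = []
    for lineno, line in enumerate(hcl_text.splitlines(), start=1):
        match = SECRET_ASSIGNMENT.match(line)
        if not match:
            continue
        name, value = match.groups()
        # Skip empty strings and interpolations like "${var.db_password}".
        if value and "${" not in value:
            findings.append((lineno, name))
    return findings


# Hypothetical configuration: one hardcoded secret, one variable reference.
SAMPLE_HCL = '''
resource "aws_db_instance" "main" {
  username = "app"
  password = "hunter2"
}

resource "aws_db_instance" "replica" {
  username = "app"
  password = var.db_password
}
'''

if __name__ == "__main__":
    for lineno, name in find_hardcoded_secrets(SAMPLE_HCL):
        print(f"line {lineno}: literal value assigned to {name}")
```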
The value of static analysis is coverage with minimal effort. You add it to your pipeline, it scans everything, and it flags problems you didn’t think to look for. The false positive rate can be annoying, but it’s worth tuning the rules rather than disabling the scanner.
Static analysis and policy testing overlap somewhat. The difference is that static analysis works from built-in knowledge about what’s generally unsafe, while policy testing enforces your organization’s specific rules. You want both.
Integration testing with Terratest
When you need to verify that infrastructure actually works, not just that it looks correct on paper, you need integration tests. Terratest is the most established tool for this. It’s a Go library that applies your Terraform configuration to a real cloud environment, runs assertions against the deployed resources, and tears everything down afterward.
A typical Terratest scenario: apply a module that creates a virtual network, a subnet, and a network security group. Then make HTTP requests to verify that traffic is allowed on the expected ports and blocked on others. Or deploy a Kubernetes cluster and verify that kubectl commands succeed. Or provision a database and verify that the connection string works.
This is expensive. Every test run creates real cloud resources, runs for minutes, and costs money. Test isolation requires dedicated subscriptions or resource groups. Cleanup failures leave orphaned resources.
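The orphaned-resource problem is mostly a discipline problem: teardown has to run even when an assertion fails. In Terratest the usual idiom is a deferred terraform.Destroy call registered before the apply. The same idea as a Python sketch, where apply and destroy are hypothetical callables you would supply, for example thin wrappers around the terraform CLI:

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_stack(apply, destroy):
    """Apply infrastructure, yield its outputs, and always attempt teardown.

    `apply` and `destroy` are caller-supplied callables (hypothetical here).
    Destroy runs even if apply fails halfway or the test body raises.
    """
    try:
        outputs = apply()
        yield outputs
    finally:
        destroy()


if __name__ == "__main__":
    # Stand-ins for real apply/destroy steps, so the flow can be seen locally.
    def fake_apply():
        print("applying")
        return {"endpoint": "https://example.invalid"}

    def fake_destroy():
        print("destroying")

    with ephemeral_stack(fake_apply, fake_destroy) as outputs:
        print("asserting against", outputs["endpoint"])
```

This narrows the window for orphans but does not close it. If the destroy step itself fails, you still need a scheduled sweep of the test subscription or resource group.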
But integration tests catch what nothing else can. Static analysis can tell you the security group rules look right. Policy tests can verify they match your organization’s standards. Only an integration test can tell you the deployed infrastructure actually routes traffic correctly.
Use integration tests selectively. They’re appropriate for shared modules that many teams depend on, for infrastructure patterns that are complex enough to have emergent behavior, and for anything where a failure in production would be costly enough to justify the testing overhead.
The testing pyramid for IaC
Application developers are familiar with the testing pyramid: many fast unit tests at the base, fewer integration tests in the middle, a handful of end-to-end tests at the top. Infrastructure code benefits from the same structure.
At the base: policy tests. Many of them, fast to run, cheap to maintain. They validate that your infrastructure definitions meet your organization’s rules. Run them on every PR.
In the middle: static analysis. Automated scans that check for known misconfigurations and security issues. Run them alongside policy tests in CI.
At the top: integration tests. Few of them, slow and expensive, but high confidence. Run them on changes to shared modules or before major releases. Not on every commit.
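The scheduling rule in those three layers is simple enough to write down. A small Python sketch, assuming shared modules live under a modules/ directory. That path and the layer names are placeholders, so adjust them to your repository.

```python
def layers_to_run(changed_paths, is_release=False, shared_prefix="modules/"):
    """Pick which test layers a pipeline run should execute.

    Policy tests and static analysis run on every change. Integration tests
    run only for releases or changes under the shared-module directory.
    """
    layers = ["policy", "static-analysis"]
    touches_shared = any(path.startswith(shared_prefix) for path in changed_paths)
    if is_release or touches_shared:
        layers.append("integration")
    return layers


if __name__ == "__main__":
    print(layers_to_run(["envs/prod/main.tf"]))
    print(layers_to_run(["modules/network/main.tf"]))
    print(layers_to_run([], is_release=True))
```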
Most teams I’ve worked with have none of these layers. They have terraform plan, and they have production. Between “this looks like it’ll work” and “it’s broken in production” there is nothing.
Adding even the base layer, a handful of Conftest policies in your CI pipeline, puts you ahead of the majority. It takes an afternoon to set up and catches real problems from day one.
Treat infra code like app code
Terraform plan is necessary. Keep running it. But stop calling it a test. It’s a preview of intended changes, nothing more.
Real infrastructure testing means writing assertions about what your infrastructure should do, not just what it should look like. Start with policy tests because they’re cheap and fast. Add static analysis for security coverage. Build integration tests for the modules where correctness matters most.
Your application code has unit tests, integration tests, linters, and security scanners. Your infrastructure code deserves the same. The deployment pipeline doesn’t end at terraform apply. It ends when you’ve verified the result is correct.