lesson ── terraform-beginner ── ~10 min ── 2 steps

Data sources: reading what already exists

Not everything you need in Terraform code is something Terraform itself creates. Sometimes you need to find out the current AWS account ID, or read a VPC that someone else set up. That is what data blocks are for.

A data block looks like a resource block, but it creates nothing; it only reads. Terraform records what it read, but it does not manage the object's lifecycle and changes nothing in the cloud. See tf-data-source.
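To make the "VPC that someone else set up" case concrete, here is a minimal sketch of such a lookup. The block name shared and the tag value shared-vpc are assumptions made up for illustration; the lookup fails if no VPC, or more than one, matches.

```hcl
# Look up a VPC that was created outside this configuration.
# "shared-vpc" is a hypothetical Name tag; use whatever the VPC's owner set.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-vpc"
  }
}

# Nothing is created here; Terraform only reads the VPC's attributes.
output "shared_vpc_cidr" {
  value = data.aws_vpc.shared.cidr_block
}
```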

interactive sandbox

A pair of containers will start: terraform 1.9 and localstack 3.8 on the same network. A terminal opens in the browser, and you can run terraform init right away. Each step is checked automatically. TTL 45 minutes, no registration.

launch sandbox →

stack ── terraform · localstack · 1 GB RAM · self-destructs after 45 min of inactivity
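The sandbox presumably ships its own provider configuration. If you want to reproduce the setup outside it, a provider block along these lines is a common starting point. The hostname localstack and the region are assumptions; 4566 is LocalStack's default edge port, and the credentials are dummy values that the emulator does not verify.

```hcl
# Sketch of an AWS provider pointed at a LocalStack container.
# Assumption: the container is reachable as "localstack" on the shared network.
provider "aws" {
  region     = "us-east-1"
  access_key = "test" # dummy value, not a real credential
  secret_key = "test" # dummy value, not a real credential

  # Skip checks that only make sense against real AWS.
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  # Path-style URLs avoid needing DNS for per-bucket hostnames.
  s3_use_path_style = true

  endpoints {
    sts = "http://localstack:4566"
    s3  = "http://localstack:4566"
    iam = "http://localstack:4566"
    sqs = "http://localstack:4566"
  }
}
```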

Steps

01
Read your own account through data
Create main.tf:
```hcl
# Read-only lookups: who is calling, and in which region.
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

output "my_account" {
  value = data.aws_caller_identity.current.account_id
}

output "my_region" {
  value = data.aws_region.current.name
}
```
Notice:
- The block starts with data, not resource.
- aws_caller_identity takes no arguments, just empty {}.
- You reference it through data.aws_caller_identity.current.account_id.
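The same data block exposes more than account_id. As an optional sketch, these two extra outputs show the caller's ARN and user ID; the output names my_arn and my_user_id are made up for illustration, and the exact values LocalStack returns may differ from real AWS.

```hcl
# Optional: other attributes of the same aws_caller_identity data block.
output "my_arn" {
  value = data.aws_caller_identity.current.arn
}

output "my_user_id" {
  value = data.aws_caller_identity.current.user_id
}
```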
Run:
```bash
cd /home/student/tf-data
terraform init -input=false
terraform apply -auto-approve -input=false
```
The output should now show your account ID and region.
hint
In LocalStack the account ID is usually 000000000000 or 123456789012. Do not worry, that is normal for an emulator.
✓ The data blocks were read, the output contains the region.
02
Use the account ID in a bucket name
The most useful thing data does is feed values into resources. Add a bucket to main.tf whose name includes the account ID:
```hcl
resource "aws_s3_bucket" "demo" {
  # The account ID read by the data block becomes part of the bucket name.
  bucket = "linuxlab-${data.aws_caller_identity.current.account_id}-logs"

  tags = {
    Region    = data.aws_region.current.name
    AccountId = data.aws_caller_identity.current.account_id
  }
}
```
Run apply again:
```bash
terraform apply -auto-approve
```
Terraform: (1) queries the data; (2) computes the name; (3) creates the bucket. The dependency graph understands that the bucket depends on the data, so the data comes first.
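One way to see step (2) in isolation is to compute the name in a local value and print it. This is only a sketch: the names bucket_name (both the local and the output) are illustrative and not part of the lesson's checks. The reference to the data block inside the interpolation is what creates the dependency edge, so no explicit depends_on is needed.

```hcl
locals {
  # Referencing the data block here is what orders "read data" before "use name".
  bucket_name = "linuxlab-${data.aws_caller_identity.current.account_id}-logs"
}

# Shows the computed name without creating anything.
output "bucket_name" {
  value = local.bucket_name
}
```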
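hint
If apply complains "name too long": S3 bucket names must be 3 to 63 characters. The account ID is always 12 digits, so the name above stays well under the limit; the error is more likely after you lengthen the prefix or suffix. You can shorten the ID with `substr(data.aws_caller_identity.current.account_id, 0, 8)`.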
✓ The bucket was created with values from the data blocks. data to resource: it works.
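If you do shorten the ID with substr, a length check can catch a bad name at plan time. The sketch below assumes illustrative names (account_short, short_name, short_bucket_name) and relies on output preconditions, which need Terraform 1.2 or later. Keep in mind that truncating the ID weakens uniqueness: two accounts that share the first 8 digits would produce the same name.

```hcl
locals {
  # First 8 characters of the 12-digit account ID.
  account_short = substr(data.aws_caller_identity.current.account_id, 0, 8)
  short_name    = "linuxlab-${local.account_short}-logs"
}

output "short_bucket_name" {
  value = local.short_name

  # Fail early if the computed name violates the S3 length limits.
  precondition {
    condition     = length(local.short_name) >= 3 && length(local.short_name) <= 63
    error_message = "S3 bucket names must be 3 to 63 characters long."
  }
}
```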
data plus other resources: IAM, SQS, Lambda
S3 is not the only AWS resource. The same "data to resource" pattern works with any resource that LocalStack emulates:
```hcl
# IAM role whose name includes the region; the trust policy is limited to this account
resource "aws_iam_role" "worker" {
  name = "linuxlab-worker-${data.aws_region.current.name}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
      Condition = {
        StringEquals = {
          "aws:SourceAccount" = data.aws_caller_identity.current.account_id
        }
      }
    }]
  })
}

# SQS queue with a name from data
resource "aws_sqs_queue" "events" {
  name = "linuxlab-events-${data.aws_region.current.name}"
}
```
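Not every data source calls the cloud. As a sketch of an alternative to jsonencode, the aws_iam_policy_document data source builds the same trust policy locally and exposes it as JSON; the block name lambda_assume and the output name are assumptions for illustration.

```hcl
# Builds the trust policy document locally; no API call is made for this block.
data "aws_iam_policy_document" "lambda_assume" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }

    # One data block feeding another: the account ID comes from aws_caller_identity.
    condition {
      test     = "StringEquals"
      variable = "aws:SourceAccount"
      values   = [data.aws_caller_identity.current.account_id]
    }
  }
}

# The rendered JSON could be passed to assume_role_policy instead of jsonencode(...).
output "lambda_assume_policy_json" {
  value = data.aws_iam_policy_document.lambda_assume.json
}
```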
One more thing about data: drift. If someone deletes, outside Terraform, a resource that you read through data, nothing fails right away. The data block is read again on the next plan or apply, and only then does Terraform discover that the object is gone and fail with an error. data is a one-time read, not a subscription to changes. So for critical dependencies, prefer a resource (which you manage yourself) over data (where someone outside can break everything).
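For lookups that are not critical, one hedged option is a check block (Terraform 1.5 or later, so it should work on the sandbox's 1.9). According to Terraform's documentation, an error from a data block nested inside check is reported as a warning instead of stopping the run; verify this on your own version. The queue name team-b-events and the block names are hypothetical.

```hcl
# Surfaces a missing external queue as a warning rather than a failed run.
check "external_queue_present" {
  # Scoped data block: it is visible only inside this check.
  data "aws_sqs_queue" "external" {
    name = "team-b-events" # hypothetical queue owned by another team
  }

  assert {
    condition     = data.aws_sqs_queue.external.arn != ""
    error_message = "The external queue could not be read; it may have been deleted."
  }
}
```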

The same applies on OpenTofu: the AWS provider's data sources are identical, with the same arguments and the same attributes. See tf-opentofu-parity.
- → Data sources
- → State and drift
- → OpenTofu parity
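As a sketch of what that parity looks like in practice, the same required_providers block can be used with either tool. The version constraint below is an assumption and not something the lesson specifies; pinning a major version keeps attribute names such as data.aws_region.current.name stable across provider upgrades.

```hcl
terraform {
  required_providers {
    aws = {
      # Each tool resolves this address through its own default registry.
      source = "hashicorp/aws"

      # Assumed constraint for illustration; pick the version your project tests against.
      version = "~> 5.0"
    }
  }
}
```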

What you learned

You read the current account ID and region through data, then used them in a bucket name and in outputs. This is the basic pattern: a data block for context, a resource block for creation.

commands

data "aws_caller_identity" "current" {} ── find out who you are right now
data "aws_region" "current" {} ── which region you are working in
terraform refresh ── re-read data from the cloud (deprecated; prefer terraform apply -refresh-only)

concepts

· resource creates, data only reads
· A data block with no arguments: `data "..." "current" {}` (empty braces)
· Referencing data: `data.type.name.attribute` (the `data.` prefix is required)