Infrastructure as Code

01 · Why infrastructure as code 4 min

Stop clicking. Start committing your infrastructure.

Every cloud account starts the same way: someone clicks through a console, wires up a server and a database, and it works. Six months later nobody remembers exactly what they clicked, the staging environment drifted from production, and rebuilding from scratch is a day of archaeology. IaC turns all of that into a file you can read, review, and re-run.

Infrastructure as Code (IaC) — describing your servers, networks, databases, and permissions in machine-readable files, then letting a tool create and update the real thing to match. The same cloud resources you could provision by hand — compute, storage, load balancers — become text under version control. The console becomes a read-only dashboard, not the source of truth.

ClickOps vs. code

ClickOps — you build infrastructure by hand in a web console. Fast for one thing once; impossible to reproduce, review, or audit.
IaC — you write the desired setup as text. The tool reads it and makes reality match. Need a second copy? Run it again.
The difference shows up the day you need staging to look exactly like production — or to rebuild after an outage.

One console session makes one snowflake. One file makes as many identical environments as you need.

Repeatable

Spin up an identical environment in minutes — dev, staging, a disaster-recovery region — from the same source.

Reviewable

Infrastructure changes arrive as pull requests. A teammate reads the diff before it touches production.

Versioned

Every change is a commit. You can see who changed what, when, and why — and roll back like any other code.

Self-documenting

The files are the documentation. No stale wiki describing a console nobody opens.

02 · Declarative vs. imperative & idempotency 5 min

Describe the destination, not every turn.

There are two ways to tell a computer to build something. You can list the exact steps in order (imperative), or you can describe the end result and let the tool figure out the steps (declarative). Almost all modern IaC is declarative — and that choice is what makes re-running safe.

Declarative — you state the desired end result; the tool computes whatever actions are needed to reach it. Imperative is the opposite: you spell out each step yourself. Declarative wins for infrastructure because the tool can compare what you want against what already exists and do only the difference.

Imperative — a script of steps

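    # do exactly these steps, in this order
    INSTANCE_ID=$(aws ec2 run-instances --image-id ami-123 --instance-type t3.micro --count 1 \
      --query 'Instances[0].InstanceId' --output text)
    aws ec2 create-tags --resources "$INSTANCE_ID" --tags Key=env,Value=prod
    # re-run this? you get a SECOND server.
    # other create calls (a named security group, say) error out if the thing already exists.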

Declarative — the desired result

    # I want ONE server tagged prod to exist.
    resource "aws_instance" "web" {
      ami           = "ami-123"
      instance_type = "t3.micro"
      tags = {
        env = "prod"
      }
    }
    # re-run? already matches → nothing happens.

The engine reconciles desired against real and acts only on the gap — that is what makes a second run safe.
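The reconcile step can be sketched in a few lines of Python. This is a toy model, not how Terraform or any real engine is implemented: `desired` and `actual` are hypothetical dictionaries keyed by resource name, and `reconcile` is an illustrative name.

```python
def reconcile(desired: dict, actual: dict) -> list[tuple[str, str]]:
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name, config in desired.items():
        if name not in actual:
            actions.append(("create", name))   # wanted, but missing
        elif actual[name] != config:
            actions.append(("update", name))   # exists, but differs
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))   # exists, but no longer wanted
    return actions


desired = {"web": {"ami": "ami-123", "instance_type": "t3.micro"}}

print(reconcile(desired, actual={}))       # [('create', 'web')]
print(reconcile(desired, actual=desired))  # []
```

The second call returns an empty list because there is no gap left to act on.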

Idempotency — the payoff

Idempotent — running the same operation once or a hundred times leaves the system in the same final state. Apply your config; apply it again; nothing extra happens.

The imperative script above is not idempotent — re-running it makes a second server.
The declarative config is: it already matches, so the tool reports "no changes".
That property is why IaC is safe to run on every deploy, in CI, automatically.
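To make the contrast concrete, here is a small Python sketch that runs both styles three times against a pretend cloud (a plain list). The function names are made up for illustration.

```python
def launch_server(cloud: list[dict]) -> None:
    """Imperative step: always launches another server."""
    cloud.append({"env": "prod"})


def ensure_one_server(cloud: list[dict]) -> None:
    """Declarative intent: a prod server should exist; add one only if missing."""
    if not any(server["env"] == "prod" for server in cloud):
        cloud.append({"env": "prod"})


imperative_cloud: list[dict] = []
declarative_cloud: list[dict] = []
for _ in range(3):
    launch_server(imperative_cloud)
    ensure_one_server(declarative_cloud)

print(len(imperative_cloud))   # 3
print(len(declarative_cloud))  # 1
```

Only the second function can be run on every deploy without thinking about how many times it has run before.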

03 · State & drift — the concept that trips everyone up 5 min

How the tool knows what it already owns.

Here is the question that confuses every newcomer: your code says "one server", the cloud has a server, but how does the tool know that server is the one your code created, and not someone else's? The answer is state: a record that maps each resource in your code to a real object in the cloud.

State — a file the tool maintains that links every resource in your configuration to the real-world object it created (by its cloud ID). Without it, the tool couldn't tell "update the existing server" from "create a brand-new one". State is the tool's memory of what it owns.

State is the lookup table: the web resource in your code is the cloud object i-0ab9.

Why state matters in practice

It is precious. Lose the state file and the tool forgets what it owns — it may try to re-create resources that already exist.
It can hold secrets. Passwords and keys often land in state in plain text — which is why it must be stored securely (Part 7).
It must be shared. If two engineers keep their own copy, they will fight. Hence remote state, also Part 7.
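The lookup-table idea, and what goes wrong when the table is lost, can be sketched in Python. Everything here is a stand-in: `cloud` is a dictionary playing the cloud provider, `state` maps resource addresses to cloud IDs, and the ID format is invented.

```python
import itertools

_next_id = itertools.count(1)


def cloud_create(cloud: dict, config: dict) -> str:
    """Pretend cloud API: store the object and return its new ID."""
    cloud_id = f"i-{next(_next_id):04d}"
    cloud[cloud_id] = dict(config)
    return cloud_id


def apply(code: dict, state: dict, cloud: dict) -> list[str]:
    """Create or update resources, using `state` to find what we already own."""
    log = []
    for address, config in code.items():
        cloud_id = state.get(address)
        if cloud_id is None:                 # not in state: create it
            state[address] = cloud_create(cloud, config)
            log.append(f"create {address}")
        elif cloud[cloud_id] != config:      # ours, but out of date: update it
            cloud[cloud_id] = dict(config)
            log.append(f"update {address}")
    return log


code = {"aws_instance.web": {"instance_type": "t3.micro"}}
cloud = {"i-9999": {"instance_type": "t3.micro"}}   # someone else's server
state = {}

print(apply(code, state, cloud))  # ['create aws_instance.web']
print(state)                      # {'aws_instance.web': 'i-0001'}
print(apply(code, state, cloud))  # []
print(apply(code, {}, cloud))     # state lost: ['create aws_instance.web'] again
```

The last call passes an empty state, so the sketch creates a duplicate server even though one it made earlier is still running. The identical-looking `i-9999` is never touched, because it was never in state.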

Drift — when reality stops matching the code

Drift — the gap that opens when the real infrastructure is changed outside your code — someone edits a setting in the console, or an incident fix is applied by hand. Now the cloud no longer matches what your files say.

Someone resized the box by hand. The next plan spots the mismatch and offers to pull reality back to the code.

How drift gets handled

The tool refreshes state, sees reality differs, and shows the drift as a proposed change on the next run.
You decide: let the code win (revert the manual change) or update the code to match the new reality.
The cure for chronic drift: make the code the only way to change infrastructure — lock down console write access.
Adopting existing hand-built resources? import them into state so the tool starts tracking them.
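Drift detection and the "who wins" decision can be sketched as a comparison of attribute dictionaries. This is a simplification that compares the code directly against live values and skips the state refresh; `detect_drift` and `resolve` are hypothetical names.

```python
def detect_drift(code: dict, live: dict) -> dict:
    """Return {attribute: (value_in_code, live_value)} for every mismatch."""
    keys = code.keys() | live.keys()
    return {
        key: (code.get(key), live.get(key))
        for key in sorted(keys)
        if code.get(key) != live.get(key)
    }


def resolve(code: dict, live: dict, code_wins: bool) -> tuple[dict, dict]:
    """Return the new (code, live) pair after choosing which side wins."""
    if code_wins:
        return code, dict(code)   # revert the manual change
    return dict(live), live       # adopt the new reality into the code


code = {"instance_type": "t3.micro"}
live = {"instance_type": "t3.large"}      # someone resized the box by hand

print(detect_drift(code, live))           # {'instance_type': ('t3.micro', 't3.large')}
code, live = resolve(code, live, code_wins=True)
print(detect_drift(code, live))           # {}
```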

04 · The plan & apply workflow 5 min

See the change first, then make it.

The single best habit IaC gives you: nothing changes until you have read a preview of exactly what will change. The core loop is plan (a dry run that shows you the diff) then apply (do it). That preview is what turns "hope this works" into a reviewable, boring deploy.

Plan — a dry run that compares your code to state and real infrastructure and prints exactly what it would add, change, or destroy — without touching anything. Apply then executes that plan. Read the plan like a diff; the symbols tell the whole story.

    # a dry run: nothing has changed yet
    + aws_instance.web           # create
    ~ aws_security_group.sg      # update in place
        port: 80 -> 443
    - aws_s3_bucket.old          # DESTROY
    Plan: 1 to add, 1 to change, 1 to destroy.

The everyday loop: write → init → plan → apply, repeating on every change. destroy tears the whole thing down cleanly when you are done.

+ add

A resource in your code that does not exist yet. The tool will create it.

~ change

A resource that exists but differs from the code. Watch for changes that replace the resource rather than edit it in place: a replacement destroys and re-creates it, which can mean downtime or data loss.

- destroy

A resource the code no longer wants. Read these lines hardest — a stray destroy can delete a database.
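The three symbols fall out of a simple comparison. The Python sketch below is a toy planner, not Terraform's algorithm: the `forces_replace` tuple is an assumption standing in for the provider-specific rules about which attributes cannot be edited in place, and `-/+` marks a replacement.

```python
def plan(desired: dict, current: dict, forces_replace=("ami",)) -> list[str]:
    """Compare desired config to current state; return one line per change."""
    lines = []
    for address in sorted(desired.keys() | current.keys()):
        want, have = desired.get(address), current.get(address)
        if have is None:
            lines.append(f"+ {address}")
        elif want is None:
            lines.append(f"- {address}")
        elif want != have:
            changed = [key for key in want if want[key] != have.get(key)]
            replaced = any(key in forces_replace for key in changed)
            lines.append(f"-/+ {address}" if replaced else f"~ {address}")
    return lines


def summary(lines: list[str]) -> str:
    """Count a replacement as both one add and one destroy."""
    add = sum(line.startswith(("+ ", "-/+ ")) for line in lines)
    change = sum(line.startswith("~ ") for line in lines)
    destroy = sum(line.startswith(("- ", "-/+ ")) for line in lines)
    return f"Plan: {add} to add, {change} to change, {destroy} to destroy."


desired = {
    "aws_instance.web": {"ami": "ami-123"},
    "aws_security_group.sg": {"port": 443},
}
current = {
    "aws_security_group.sg": {"port": 80},
    "aws_s3_bucket.old": {"bucket": "old"},
}

lines = plan(desired, current)
print(lines)           # ['+ aws_instance.web', '- aws_s3_bucket.old', '~ aws_security_group.sg']
print(summary(lines))  # Plan: 1 to add, 1 to change, 1 to destroy.
```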

The plan is the review artifact

In a team, the plan output is posted on the pull request so a human approves the exact diff before apply.
A plan with surprising destroys is a stop sign — investigate before you ever type apply.
Running this in a pipeline (plan on PR, apply on merge) is the bridge to CI/CD tooling — covered in Part 7.
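The "surprising destroys are a stop sign" rule can be automated as a small gate. This Python sketch assumes a simplified plan format of one symbol and one address per line (not the real output of any tool) and an explicit allow-list of destroys that a reviewer has approved.

```python
def unexpected_destroys(plan_lines: list[str], approved: set[str]) -> list[str]:
    """Return addresses the plan would destroy that nobody signed off on."""
    destroyed = [
        line.split(maxsplit=1)[1]
        for line in plan_lines
        if line.startswith(("- ", "-/+ "))   # plain destroys and replacements
    ]
    return [address for address in destroyed if address not in approved]


plan_lines = [
    "+ aws_instance.web",
    "~ aws_security_group.sg",
    "- aws_s3_bucket.old",
    "- aws_db_instance.main",
]
blocked = unexpected_destroys(plan_lines, approved={"aws_s3_bucket.old"})
print(blocked)  # ['aws_db_instance.main']
```

A pipeline could refuse to continue whenever the returned list is not empty.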

What init does

Downloads the providers: the plugins that know how to talk to AWS, Azure, GCP, Cloudflare, and so on.
Sets up the backend where state is stored (local file or remote bucket).
Pulls in any modules your code references (next section). Run init once per checkout, and again after adding a provider or module.

05 · Modules & reuse 4 min

Write the pattern once, reuse it everywhere.

Once you have one environment described in code, you will want a second that is almost the same. Copy-pasting the files is the trap — now every fix has to be made in three places. A module packages a chunk of infrastructure once and lets you stamp it out with different inputs.

Module — a reusable, parameterized bundle of resources you call with inputs and get outputs back — the same idea as a function in normal code. Define "a network with two subnets and a gateway" once, then call it for dev, staging, and prod with different sizes. It is DRY for infrastructure.

    # call reusable modules with environment-specific inputs
    module "network" {
      source = "../modules/network"
      cidr   = "10.0.0.0/16"
      env    = "prod"
    }

    module "db" {
      source = "../modules/database"
      size   = "large" # prod gets the big one
    }

A root module wires together child modules; each child is reused across every environment with different inputs.
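The function analogy can be taken literally. In this Python sketch a hypothetical `network_module` takes inputs and returns outputs, and is called once per environment. The names and the choice of /24 subnets are assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class NetworkOutputs:
    vpc_name: str
    subnet_cidrs: tuple[str, ...]


def network_module(env: str, cidr: str, subnet_count: int = 2) -> NetworkOutputs:
    """Hypothetical 'module': inputs in, outputs back, like a function call."""
    network = ipaddress.ip_network(cidr)
    # Carve the first few /24 subnets out of a /16 (prefix length grows by 8).
    subnets = islice(network.subnets(prefixlen_diff=8), subnet_count)
    return NetworkOutputs(
        vpc_name=f"{env}-vpc",
        subnet_cidrs=tuple(str(subnet) for subnet in subnets),
    )


prod = network_module(env="prod", cidr="10.0.0.0/16")
dev = network_module(env="dev", cidr="10.1.0.0/16", subnet_count=1)
print(prod.subnet_cidrs)  # ('10.0.0.0/24', '10.0.1.0/24')
print(dev.vpc_name)       # dev-vpc
```

A fix to the subnet logic is made once, inside the function, and every environment picks it up on its next run.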

Inputs & outputs

A module takes variables (size, region, name) and exposes outputs (the database endpoint) so callers can wire results together — exactly like a function signature.

Don't over-modularize

A module wrapping a single resource adds indirection for no gain. Reach for one when a group of resources repeats — not by reflex.

Public registries

The Terraform / OpenTofu Registry has battle-tested modules (VPCs, EKS clusters). Pin a version and read the code before trusting it in production.

Like a blueprint for a standard room — draw it once, then build it on three floors with different paint and furniture. Spinning up a Kubernetes cluster is a classic job for a well-tested module rather than hand-written resources.

06 · The tooling landscape 5 min

Terraform, Pulumi, CloudFormation, and when each fits.

The tools mostly split on two questions: do you write a dedicated configuration language or a real programming language, and are you tied to one cloud or many? None is "best" — each is the right call in a different spot. Here are the honest trade-offs.

A note on names: in August 2023 HashiCorp moved Terraform from an open-source license to the source-available Business Source License, and the community forked the last open-source version into OpenTofu (now under the Linux Foundation). The two remain close to drop-in compatible and share the HCL language, so this deck treats them together.

Terraform / OpenTofu — the declarative default

    # HCL: a purpose-built config language
    resource "aws_s3_bucket" "assets" {
      bucket = "my-app-assets"
    }

    resource "cloudflare_record" "www" {
      # zone_id and the record's target are omitted here for brevity
      name = "www"
      type = "CNAME"
    }
    # one tool, many cloud providers

Declarative HCL. The industry default, with the largest provider ecosystem — it can manage almost any cloud or SaaS, not just AWS.

Pro — multi-cloud, huge module registry, declarative and readable; the de-facto standard.

Con — HCL is its own language with limited logic; complex conditionals get awkward.

Choose when you want a portable, declarative standard across more than one provider — the safe default.

Pulumi — real languages, declarative result

    // define infra in TypeScript / Python / Go / C#
    import * as aws from "@pulumi/aws";

    const bucket = new aws.s3.Bucket("assets");

    // real loops, real types, real IDE help
    for (const env of ["dev", "prod"]) {
      new aws.s3.Bucket(`logs-${env}`);
    }

You write infrastructure in a general-purpose language, but the engine still works declaratively — state, plan, and apply, just like Terraform underneath.

Pro — full language power: loops, functions, types, tests, IDE autocomplete; great for complex logic.

Con — that power invites complexity; smaller ecosystem; the team must know the language.

Choose when developers own infra and want real code — abstractions, unit tests, and shared libraries.

CloudFormation & CDK — the AWS-native option

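    // CDK: code that synthesizes to CloudFormation
    import * as s3 from "aws-cdk-lib/aws-s3";

    // inside a Stack's constructor, where `this` is the stack
    new s3.Bucket(this, "Assets", {
      versioned: true,
    });
    // CloudFormation itself is YAML/JSON templates,
    // managed by AWS: no state file to mind.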

CloudFormation is AWS's native declarative templates; CDK lets you write real code that generates those templates. AWS stores the state for you.

Pro — deep, day-one AWS integration; AWS manages state and rollbacks; no extra tooling.

Con — AWS-only, full lock-in; raw templates are verbose; updates can be slow.

Choose when you are all-in on AWS and want a first-party, fully-managed tool with no third party.

Ansible — config management, not provisioning

    # procedural-ish: tasks run top to bottom
    - name: install nginx
      apt: { name: nginx, state: present }
    - name: ensure it is running
      service: { name: nginx, state: started }
    # shines at configuring servers that exist

Ansible is task-based and leans imperative (a list of steps), though good modules are idempotent. Its sweet spot is configuring machines, not provisioning cloud resources.

Pro — agentless (just SSH), simple YAML, excellent for installing and configuring software.

Con — weak at managing cloud resource lifecycles; no real state/plan model like Terraform.

Choose for in-server configuration — often paired with Terraform: provision with one, configure with the other.

The declarative camp

Terraform / OpenTofu and CloudFormation: you state the end result. Easier to read, safer to re-run, the natural fit for provisioning cloud resources. This is where most teams should start.

The trade-off to weigh

Real-language tools (Pulumi, CDK) and imperative ones (Ansible) give you more power and flexibility — at the cost of more ways to write something clever that the next person can't follow. Reach for power only when a plain config language genuinely can't express it.

07 · Best practices + recap 4 min

Make the code the only way in.

IaC pays off when the code is the single source of truth and nobody edits production by hand. Three practices get you there: store state remotely and safely, keep secrets out of the files, and run the whole thing through a pipeline so every change is reviewed.

Remote state

Shared, locked, backed up

Store state in a shared backend (an S3 bucket, Azure Blob, GCS, or a managed service like HCP Terraform) with locking so two applies can't run at once, and versioning so you can recover. Never commit state to git.
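What locking buys you can be shown with a local lock file. This Python sketch is only an illustration of the idea: real backends implement locking in their own ways, and `state_lock`, `StateLocked`, and the lock-file name are invented for the example.

```python
import os
import tempfile
from contextlib import contextmanager


class StateLocked(RuntimeError):
    """Raised when another run already holds the state lock."""


@contextmanager
def state_lock(lock_path: str):
    try:
        # O_EXCL makes creation atomic: it fails if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise StateLocked(f"state is locked: {lock_path}") from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record who holds the lock
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)                     # release the lock


def can_apply(lock_path: str) -> bool:
    """Try to take the lock; report whether an apply could start right now."""
    try:
        with state_lock(lock_path):
            return True
    except StateLocked:
        return False


lock_path = os.path.join(tempfile.mkdtemp(), "state.lock")
with state_lock(lock_path):          # first apply is running
    print(can_apply(lock_path))      # False: second apply is refused
print(can_apply(lock_path))          # True: lock released, next apply may run
```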

Secrets

Never hard-code them

Don't paste passwords into .tf files. Pull them from a secrets manager (Vault, AWS Secrets Manager) at run time. Remember state can hold secrets in plain text — so encrypt the backend and lock down who can read it.
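As a minimal sketch of "look it up at run time", the Python below reads the password from an environment variable, which stands in for a real secrets-manager call. The variable name `DB_PASSWORD` and the helper names are assumptions for the example.

```python
import os


def get_secret(name: str, env=os.environ) -> str:
    """Stand-in for a secrets-manager lookup; reads an environment variable."""
    try:
        return env[name]
    except KeyError:
        raise RuntimeError(f"secret {name!r} is not set; refusing to continue") from None


def build_db_config(env=os.environ) -> dict:
    """The config never contains a literal password, only a run-time lookup."""
    return {
        "engine": "postgres",
        "username": "app",
        "password": get_secret("DB_PASSWORD", env),
    }


def redact(config: dict, sensitive=("password",)) -> dict:
    """Mask sensitive values before printing or logging a config."""
    return {k: "(sensitive)" if k in sensitive else v for k, v in config.items()}


config = build_db_config(env={"DB_PASSWORD": "example-only"})
print(redact(config))  # {'engine': 'postgres', 'username': 'app', 'password': '(sensitive)'}
```

The unredacted `config` still holds the real value, which is the same reason a state file needs an encrypted backend and tight read access.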

IaC in CI / GitOps

Plan on PR, apply on merge

Run plan automatically on every pull request, post the diff for review, and apply only after merge — never from a laptop. See Developer Tooling for the pipeline mechanics.

GitOps — git as the source of truth

GitOps — the desired state of your infrastructure lives in a git repository, and an automated process makes reality match what is merged. A change to infrastructure is a pull request: reviewed, approved, merged, and applied by the pipeline — not a person with console access.

The PR is where the plan is reviewed; the merge is what triggers apply. No human touches production directly.
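The plan-on-PR, apply-on-merge rule amounts to a small decision table. This Python sketch uses made-up event and branch names loosely modeled on common CI systems; it describes which steps a pipeline would run and executes nothing.

```python
def pipeline_steps(event: str, branch: str, main_branch: str = "main") -> list[str]:
    """Decide which IaC steps a pipeline run should perform."""
    if event == "pull_request":
        # Preview only: reviewers read the diff, nothing is changed.
        return ["init", "plan", "post plan on the pull request"]
    if event == "push" and branch == main_branch:
        # A merge landed on the main branch: apply what was reviewed.
        return ["init", "plan", "apply"]
    return []  # anything else never touches infrastructure


print(pipeline_steps("pull_request", "feature/resize-db"))
print(pipeline_steps("push", "main"))
print(pipeline_steps("push", "feature/resize-db"))  # []
```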

1. Describe the destination, not the steps. Declarative + idempotent is what makes IaC safe to re-run on every deploy.

2. Respect state. It is the tool's memory: store it remotely, lock it, back it up, and keep it out of git.

3. Always read the plan. Review the diff before apply; a stray destroy is the difference between a deploy and an outage.

4. Reuse with modules, but don't over-engineer. Package repeating groups of resources; skip the wrapper around a single one.

5. Make code the only door. Lock down console writes, run through CI, and drift stops being a problem you fight.

Just starting, or multi-cloud? → Terraform / OpenTofu. The portable, declarative default.
All-in on AWS, want first-party? → CloudFormation, or CDK if your team prefers real code.
Developers own infra and want real abstractions and tests? → Pulumi.
Configuring software on servers that already exist? → Ansible, often alongside Terraform.
When in doubt, pick the simpler, declarative option — you can add power later when you truly need it.
