(4 of 6) Monjo AI - IaC via Terraform

AI-powered tools for teachers
Author

Aaron Carver

Published

November 2024

IaC

Results

  • Versioned, immutable artifacts for almost all cloud infrastructure
  • I really struggled to enjoy using Terraform

Onboarding

  • I followed the great series of blog posts from Yevgeniy Brikman, A Comprehensive Guide to Terraform
  • I “promote[d] immutable, versioned infrastructure artifacts from environment to environment”
  • There are more notes about this in the infra repo
  • Still, the whole setup was a pain to initially create and to update
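
As a rough illustration of what promoting a versioned artifact between environments can look like, here is a minimal HCL sketch. It is not the configuration from the infra repo: the repository URL, module path, tag, and input variables are all hypothetical placeholders.

```hcl
# Hypothetical staging environment config.
# The repo URL, module path, tag, and inputs below are placeholders.
module "web_service" {
  # Pinning to a git tag means every environment deploys a known,
  # immutable version of the module rather than whatever is on main.
  source = "git::https://example.com/your-org/infra-modules.git//services/web?ref=v0.3.0"

  # Hypothetical input variables that such a module might define.
  environment   = "staging"
  instance_type = "t3.micro"
}
```

Under this assumption, promoting a change to production would mean updating only the `ref` in the production config once that tag has been verified in staging.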

The problem with Terraform

  • My strenuously crafted, reusable, immutable, versioned infrastructure modules and artifacts became a big spaghetti monolith covered in a sauce of a thousand inconveniences, like security groups and IAM roles that broke any attempted update by refusing deletion
  • The typical update went like this:
    • I have an idea of what needs to be added/updated/removed
    • The last infra update was 2 months earlier, so I study the repo for a bit to remember how the HCL files reference each other and which version of which module is currently deployed where
    • I add a resource, update a resource, and remove a resource
    • Oh no - the remote state config is messed up again. It relies on files saved locally in the repo but not committed to the remote repo, plus one of the several AWS CLI v2 credentials I was using on my machine, and somehow this unravels every 2 months
    • I fix that
    • while: the build verification thing fails because the HCL syntax is wrong and the static analysis isn’t helpful
      • me and my favorite LLM read the HCL site docs to figure out how to add what needs to be added
    • I fix it
    • while: the build fails because something needs to be deleted, but can’t because of…rules
      • It’s usually a Security Group or IAM resource. I start a side quest to decide between getting Terraform to delete that stubborn resource and removing it with ClickOps, which might mess up the tfstate and would then need its own remediation
    • I fix it. Once the new infra is up, I maneuver the git commits and tag things carefully, fear making updates in the future, and wonder how cloud infra management is still so bad.
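
Two of the pain points above, remote state that depends on uncommitted local files and security groups that refuse deletion, have commonly used mitigations, sketched below in HCL. This is an assumption-laden sketch, not what the project actually used: the bucket, state key, AWS profile, and resource names are hypothetical, and it may not have resolved the specific failures described here.

```hcl
# Sketch only: bucket, key, profile, and resource names are hypothetical.

terraform {
  # Keeping the full backend config in a committed file means it does not
  # depend on local, uncommitted files being present on one machine.
  backend "s3" {
    bucket  = "example-tfstate-bucket"
    key     = "staging/services/web/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true
    profile = "example-infra" # a single named AWS CLI profile
  }
}

variable "vpc_id" {
  type        = string
  description = "ID of the VPC the security group belongs to."
}

resource "aws_security_group" "web" {
  # name_prefix lets Terraform generate a unique name, so a replacement
  # group can exist alongside the old one during an update.
  name_prefix = "web-"
  vpc_id      = var.vpc_id

  lifecycle {
    # Create the replacement first, then delete the old group once
    # nothing references it any longer.
    create_before_destroy = true
  }
}
```

If a resource has already been removed through the console, `terraform state rm <address>` drops it from the state file so the next plan stops trying to manage it. Whether that is safer than fixing the deletion inside Terraform depends on what else references the resource.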

Next project

  • For my next project, I used AWS CloudFormation, which I now much prefer. You can read more about that experience here.