Three Terraform Use Cases You Should Start Implementing

In engineering organizations that use infrastructure-as-code (IaC) tools such as Terraform, those tools are often applied half-heartedly. This article explores three use cases for Terraform and IaC-style automation that go beyond the traditional infrastructure used to run application workloads.

We have noticed a clear thread running through the work of many platform teams: a gap in embracing the full “as code” philosophy. They naturally use OpenTofu or Terraform (hereinafter collectively “TF”) to manage compute and other cloud resources, but they rarely go on to apply the same principles to every other aspect of operating their domain.

The goal is not to sporadically automate individual tasks; we want to automate everything. Without exception, every process and operation should be expressed in code.

Why? Because once you achieve this, you reach a new level of consistency and reliability in managing your systems, even in those areas where infrastructure as code is usually the last thing anyone thinks of.

In this article we will focus on some resources that (you may be surprised!) can also be managed as code:

- team member accounts and role-based access control;
- Git repositories;
- monitoring and alerting.

We hope you come to the same conclusion that we preach to everyone we work with: if you want to build a solid platform, you need to express all aspects of its infrastructure as code.

❯ Why isn’t it customary to do everything through Terraform?

There are a number of reasons why developers may be reluctant to use Terraform for deploying certain resources, such as Git repositories, or for user management or monitoring. Often engineers simply don't think about it, because they are focused on building other parts of the system from scratch. With Git repositories, the problem can feel like a chicken-and-egg one: the repository is created manually before any code is deployed, so it never gets brought under Terraform at all. Engineers are also often under pressure to ship the product as quickly as possible, so they tend to stick to proven workflows rather than explore new ones, and managing the entire system through Terraform rarely counts as proven in their eyes.

❯ Manage team member accounts and role-based access control

Managing user accounts and roles across multiple SaaS products can be a real headache, especially if those products don't support single sign-on (SSO). It is a challenge that both management and engineering teams face regularly, and hours of work are wasted that could be better spent on strategic initiatives. Luckily, TF can make this process a lot easier.

Before we get into the examples, it's worth noting that many SaaS vendors seem to have forgotten how important SSO is from a security perspective. As The SSO Wall of Shame points out, many vendors offer SSO, but only as a premium feature: it is tied to expensive “enterprise” pricing plans, or priced at many times the base cost of the simplest functional version of the product. This practice discourages the use of SSO and, in effect, breeds an irresponsible attitude towards security. The problem is especially acute for small organizations that may not be able to afford expensive pricing plans.

A typical software development organization will have multiple service platforms whose users and access levels need to be managed. A bare-minimum stack might include AWS, GitHub, CloudFlare, and Datadog (for starters). For teams that can't afford SSO, access to these services must be managed manually, which is labor-intensive: every time someone leaves or a new employee joins the organization, somebody has to go into each of these platforms and add or remove that person by hand. With TF, you can centralize the management of users and their credentials, making the whole job much easier (and more secure).

AWS offers IAM Identity Center (formerly AWS SSO), which makes it easier to manage AWS accounts and their associated roles. But if an organization has not yet jumped on the SSO bandwagon, or uses a mix of services where some support SSO and some do not, TF lets you standardize the addition and removal of accounts regardless of the vendor. Using TF root modules, you can define your team's users and roles in a team.yaml file, create those entities automatically, and manage them simultaneously across all the platforms you use.

Here's an excerpt from a team.yaml file describing a hypothetical DevOps team:

devops_team:
  name: DevOps
  description: Internal DevOps Team
  privacy: closed
  members:
    - name: Jane Doe
      gh_username: JaneyDoe100
      email: doe@abccorp.com
      gh_role: maintainer
      datadog_role: Standard
    - name: John Smith
      gh_username: CloudWizard1212
      email: smith@abccorp.com
      gh_role: member
      datadog_role: Read-Only
    - name: Finn Mertens
      gh_username: IceKing99
      email: mertens@abccorp.com
      gh_role: member
      datadog_role: Standard

In this central file we manage our team information as code. Adding a new service or account is then no longer a matter of N clicks, just an update to this file. You can then read the file and roll the team out to every service managed with TF. In the following example, we update GitHub and Datadog:

locals {
  # We chose to express this information as a YAML file that we load, rather than as a variable,
  # so that other team members can easily add / edit / remove entries
  # without knowing any TF
  team_data = yamldecode(file("${path.root}/team.yaml"))
}

resource "github_team" "devops" {
  name        = local.team_data.devops_team.name
  description = local.team_data.devops_team.description
  privacy     = local.team_data.devops_team.privacy
}

resource "github_team_members" "devops_members" {
  for_each = { for member in local.team_data.devops_team.members : member.gh_username => member }

  team_id  = github_team.devops.id
  username = each.value.gh_username
  role     = each.value.gh_role
}

module "datadog_users" {
  source  = "masterpointio/datadog/users"
  version = "X.X.X"

  users = [ for member in local.team_data.devops_team.members: {
      email    = member.email,
      name     = member.name,
      role     = [member.datadog_role],
      username = member.gh_username
    }
  ]
}
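
The same file can serve as a single source of truth for other providers too. As a hedged sketch (not part of the modules above; it assumes a team without IAM Identity Center and derives IAM user names from the local part of the email address, which is our own convention), the same team.yaml could drive AWS IAM users:

resource "aws_iam_user" "devops" {
  # One IAM user per team member, keyed by email
  for_each = { for member in local.team_data.devops_team.members : member.email => member }

  # Deriving the user name from the email local part is an illustrative choice
  name = split("@", each.value.email)[0]

  tags = {
    team = local.team_data.devops_team.name
  }
}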

This is a simple example, but it should make clear what becomes possible once you manage users and roles with IaC. Instead of logging into each platform separately to add new team members or remove old ones, you simply change a single file, and everything updates automatically as soon as the change is applied. And since all changes are tracked in Git, we have the full history of who changed what, when, and why.

❯ Managing Git Repositories

If you manage code repositories hosted on GitHub, GitLab, or another Git provider, you know what a headache this can be, especially in a polyrepo setup. Managing branch protection and keeping access control consistent across all of these repositories is genuinely hard. It is no wonder that IaC is, once again, a good fit for maintaining consistent and secure repository configurations.

Many organizations require developers to manually set up code repositories. This results in a patchwork of inconsistent configurations and security settings from project to project. Without standardization, each repository may have different branch protection rules, access control principles, and other settings. This makes it harder to strictly enforce best practices. Rolling out a new configuration becomes a project. Ultimately, all this heterogeneity can create vulnerabilities and make it difficult to manage repositories at scale.

We are all for not reinventing the wheel, especially when there are so many great community-maintained modules available, and managing GitHub repositories with Terraform is no exception. We like the module built by our comrades at Mineiros (the team now maintains Terramate): https://github.com/mineiros-io/terraform-github-repository. It offers capabilities that go far beyond the basic github_repository resource, including private repositories, read-only deploy keys, branch management and branch protection, merge strategies, metadata, and much more. Here's a simplified example showing how you can deploy multiple repositories using this module:

locals {
  repositories = {
    backend-api = {
      name               = "backend-api"
      license_template   = "apache-2.0"
      gitignore_template = "Go"
    },
    infra = {
      name               = "infra"
      license_template   = "mit"
      gitignore_template = "Terraform"
    }
  }
}

module "repositories" {
  source  = "mineiros-io/repository/github"
  version = "0.18.0"

  for_each = local.repositories

  name               = each.value.name
  license_template   = each.value.license_template
  gitignore_template = each.value.gitignore_template
}
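
The module also covers branch protection, but even with the plain GitHub provider you can enforce one policy across every repository defined above. A minimal sketch, assuming the integrations/github provider and an illustrative review-count value:

resource "github_branch_protection" "main" {
  for_each = local.repositories

  # Recent provider versions accept the repository name here in place of
  # the node ID; verify this against the provider version you use
  repository_id = each.value.name
  pattern       = "main"

  required_pull_request_reviews {
    required_approving_review_count = 1  # illustrative value
  }

  # Make sure the repositories exist before protection rules are applied
  depends_on = [module.repositories]
}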

❯ Monitoring and alerting management

Another task that often causes difficulties for developers is manually setting up monitoring and alerting configurations. Unsurprisingly, this can result in a hodgepodge of inconsistent thresholds and settings that vary across services and stacks. If this work is not standardized, then similar instances of a deployed application may have different criteria for issuing an alert. Again, this makes it difficult to ensure consistent adherence to best practices. Due to this inconsistency, some alerts may not arrive, while others may be noisy. Therefore, it can be difficult to manage monitoring on a large scale.

But there is a better way! By expressing your metric thresholds and alert configurations in code, you ensure that all teams work from a shared context. It becomes easier for a developer to add a new alert, or to fix an existing one that has driven everyone crazy with false positives. Managing this layer as code also keeps us out of “ClickOps”: we don't have to assemble complex monitoring through a provider's UI. Instead, application resources and their monitoring configurations live side by side and are versioned together.

We are big fans of the Cloud Posse module library, and we are fortunate to be contributors and supporters of it. It has two great modules focused on this use case: terraform-datadog-platform and terraform-aws-datadog-integration. The integration module makes it easy to activate the initial integration between the AWS accounts of interest and a Datadog account, while the platform module helps configure a variety of Datadog resources, monitors among them.

Here is an example of a monitor configuration we use when working with many clients:

rds-cpuutilization:
  enabled: true
  name: "[${environment}] (RDS) CPU utilization is high"
  query: |
    avg(last_15m):avg:aws.rds.cpuutilization{env:${environment}} by {dbinstanceidentifier} > 90
  type: metric alert
  message: |
    {{#is_warning}}
    {{dbinstanceidentifier}} CPU Utilization above {{warn_threshold}}%
    {{/is_warning}}
    {{#is_alert}}
    {{dbinstanceidentifier}} CPU Utilization above {{threshold}}%
    {{/is_alert}}    
  escalation_message: ""
  tags: ${tags}
  priority: 3
  notify_no_data: false
  notify_audit: false
  require_full_window: true
  enable_logs_sample: false
  force_delete: true
  include_tags: true
  locked: false
  renotify_interval: 60
  timeout_h: 0
  evaluation_delay: 60
  new_group_delay: 0
  new_host_delay: 300
  groupby_simple_monitor: false
  renotify_occurrences: 0
  renotify_statuses: []
  validate: true
  no_data_timeframe: 10
  threshold_windows: {}
  thresholds:
    critical: 90
    warning: 85
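
Note that ${environment} and ${tags} in the YAML above are TF template placeholders, not Datadog syntax. Here is a minimal sketch of how such a file could be rendered and decoded before being handed to the platform module (the file path and variable names are our own, illustrative choices; check the module's README for its exact monitors input):

locals {
  # Render the templated monitor definitions for one environment.
  # jsonencode() is used because JSON is valid YAML flow syntax, so the
  # resulting list can be interpolated inline after "tags:"
  monitors_rendered = templatefile("${path.root}/monitors/rds.yaml", {
    environment = var.environment
    tags        = jsonencode(["env:${var.environment}", "managed-by:terraform"])
  })

  monitors = yamldecode(local.monitors_rendered)
}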

Programming monitoring in this way not only improves the overall consistency of SRE operations, it also significantly reduces the number of iterations during development and speeds up the delivery of applications to production.

❯ Conclusion

Infrastructure as Code (IaC) is a powerful approach that greatly simplifies managing a software platform. Unfortunately, in many engineering organizations IaC is not done well. Don't get us wrong: it's a big step forward if your organization uses IaC to deploy application infrastructure, particularly containers and databases. But by neglecting the technique for monitoring, user management, and repositories, you are missing out on much of the benefit it could bring. That's why we work with TF: it lets you express far more in code than just the configuration of compute and storage.

Moving to fully automated infrastructure requires serious commitment from the whole team, but the benefits are obvious. Used to its full potential, IaC lets you build more reliable and efficient platforms that will pay off as you scale your applications and your entire organization.

