Artur Carter

0 %
Zak Abdel-Illah
Automation Enthusiast
  • 📍 Location
    🇬🇧 London
Certifications
  • AWS Solutions Architect - Associate
Languages
  • Python
  • Go
Technologies
  • Ansible
  • Linux
  • Terraform
  • Kubernetes

Building a PKI using Terraform

February 24, 2024

As part of building a hybrid infrastructure, I explored different technologies for achieving a stable VPN connection from on-premises to the AWS Infrastructure and found AWS’ Client-to-Site feature nested within AWS VPC. I explored this prior to AWS Site-to-Site VPN as I didn’t have the right setup for handling IPSec/L2TP tunnels at the time, and had OpenVPN already handy from my MacBook.

Since I would be using OpenVPN (As that’s what AWS Client VPN uses), I require TLS certificates as a method of authentication and encryption. While AWS provides certificate management features, it does have a cost, making it less suitable for my testing requirements.

I’ve opted to use Terraform to create a custom PKI solution locally, and to prepare for the re-use in larger infrastructure projects.

Working Environment

  • Machine
    • MacBook Pro M2
  • Technologies
    • Terraform

Terraform Module Breakdown

terraform {
  required_providers {
    tls = {
      source = "hashicorp/tls"
    }
    pkcs12 = {
      source = "chilicat/pkcs12"
    }
  }
}

I’m making use of the following modules within my Terraform project

  • hashicorp/tls
    • For generating the private keys, certificate requests and certificates themselves
  • chilicat/pkcs12
    • For combining the private key & certificate together, a requirement for using OpenVPN client without embedding the data inside the *.ovpn configuration file (which didn’t come out-of-the-box from AWS)
/**
 * Private key for use by the self-signed certificate, used for
 * future generation of child certificates. As long as the state
 * remains unchanged, the private key and certificate should not
 * re-update at every re-run unless any variable is changed.
 */
resource "tls_private_key" "pem_ca" {
  algorithm = var.algorithm
}

I’ve made the algorithm of the certificates controllable from a global variable due to customer requirements possibly needing to adopt a different level of encryption. This resource returns a PEM-formatted key.

/**
 * Generation of the CA Certificate, which is in turn used by
 * the client.tf and server.tf submodules to generate child
 * certificates
 */
resource "tls_self_signed_cert" "ca" {
  private_key_pem = tls_private_key.pem_ca.private_key_pem
  is_ca_certificate = true

  subject {
    country             = var.ca_country
    province            = var.ca_province
    locality            = var.ca_locality
    common_name         = var.ca_cn
    organization        = var.ca_org
    organizational_unit = var.ca_org_name
  }

  validity_period_hours = var.ca_validity

  allowed_uses = [
    "digital_signature",
    "cert_signing",
    "crl_signing",
  ]
}

I then used the tls_self_signed_cert resource to generate the CA certificate itself, providing the private key generated prior into the private_key_pem attribute. Again, by providing global variables for the ca subject and validity, I’m able to re-run the same terraform module for multiple clients under different workspaces (or by referencing this into larger modules).

The subject fields I had decided to expose are a way to describe exactly what and where the TLS certificate belongs to without needing to dive back into the module.

By adding cert_signing and crl_signing to the allowed_uses list, it adds permissions to the certificate for signing child certificates. This is essential as I would still need to generate the certificates for the OpenVPN server and the client.

This resource returns a PEM-formatted certificate.

/**
 * Return the certificate itself. It's the responsibility of
 * the user of this module to determine whether the certificate should
 * be stored locally, transferred or submitted directly to a cloud
 * service
 */
output "ca_certificate" {
  value = tls_self_signed_cert.ca.cert_pem
  sensitive = true
  description = "generated ca certificate"
}

Finally, I return the CA Certificate and its’ key from the module for the user to place it where it needs to be, for example;

To a local file

resource "local_file" "ca_key" {
  content_base64 = module.pki.ca_private_key
  filename = "${path.module}/certs/ca.key"
}
resource "local_file" "ca" {
  content_base64 = module.pki.ca_certificate
  filename = "${path.module}/certs/ca.crt"
}

To the AWS Certificate Manager

resource "aws_acm_certificate" "ca" {
  private_key = module.pki.ca_private_key
  certificate_body = module.pki.ca_certificate
}

Server & Client Certificates

resource "tls_cert_request" "csr" {
  for_each = var.clients # or var.servers
  private_key_pem = tls_private_key.pem_clients[each.key].private_key_pem
    # or pem_servers[each.key]
  dns_names = [each.key]

  subject {
    country = try(each.value.country, try(var.default_client_subject.country, var.default_subject.country))
    province = try(each.value.province, try(var.default_client_subject.province, var.default_subject.province))
    locality = try(each.value.locality, try(var.default_client_subject.locality, var.default_subject.locality))
    common_name = try(each.value.cn, try(var.default_client_subject.cn, var.default_subject.cn))
    organization = try(each.value.org, try(var.default_client_subject.org, var.default_subject.org))
    organizational_unit = try(each.value.ou, try(var.default_client_subject.ou, var.default_subject.ou))
  }
}

Regardless of whether generating a server or client TLS certificate, both need to go through the ‘certificate request’ process, which is to;

  1. Generate a private key for the server or client
  2. Generate a certificate signing request based on the private key
  3. Using the CSR to get a CA-signed certificate

In this example, I made use of the try block to achieve a value priority in the following order;

  1. Resource-level
    • Do I have a value specific to the server or client?
  2. Class-level
    • Do I have a value specific to the target type?
  3. Module-level
    • Do I have a global default?

And each refers to a key / value pair that is identical for clients as it is servers, where the key is the machine name and the value is the subject data. Here is a sample of the *.tfvars.json file that drives this behaviour.

{
  "clients": {
    "mbp": {
      "country": "GB",
      "locality": "GB",
      "org": "ZAI",
      "org_name": "ZAI",
      "province": "GB"
    }
  }
}

In an ideal (and secure) scenario, the private keys should never be transmitted over the wire, instead, you generate a CSR and transmit that. Since this is aimed for test environments, security is not a concern for me. Should I want to do the generation securely, I’ve exposed the following variable as a way to override the CSR generation.

variable "client_csrs" {
  type = map
  description = "csrs to use instead of generating them within this module"
  default = {}
}

Getting the signed certificate

resource "tls_locally_signed_cert" "client" {
  for_each = var.clients
  cert_request_pem = tls_cert_request.csr_client[each.key].cert_request_pem
  ca_private_key_pem = tls_private_key.pem_ca.private_key_pem
  ca_cert_pem = tls_self_signed_cert.ca.cert_pem

  validity_period_hours = var.client_certificate_validity

  allowed_uses = [
    "digital_signature",
    "key_encipherment",
    "server_auth", # for server-side
    "client_auth", # for client-side
  ]
}

Once the *.csr is generated (or provided), I’m able to use the tls_locally_signed_cert resource type to connect that data with the CA Certificate for signing against the private key of the CA Certificate. The cert_request_pem, ca_private_key_pem and ca_cert_pem inputs allow me to do so using the raw PEM format, without needing to save to disk before passing the data in.

Relying on the data within the terraform state file allows me to also rule out any “external influence” when troubleshooting, as there will be only a single source of truth.

Adding either server_auth or client_auth (depending on use-case) to allowed_uses permits the use of the signed certificate for authentication, as required by OpenVPN.

Converting from *.PEM to PCKS12

resource "pkcs12_from_pem" "client" {
  for_each = var.clients
  ca_pem          = tls_self_signed_cert.ca.cert_pem
  cert_pem        = tls_locally_signed_cert.client[each.key].cert_pem
  private_key_pem = tls_private_key.pem_client[each.key].private_key_pem
  password = "123" # Testing purposes
  encoding = "legacyRC2"
}

Using the pkcs12_from_pem resource type from chilicat makes this process simple, as long as I have access to the private key in addition to the certificate and ca.

For compatibility with the OpenVPN Connect application, I needed to enforce the encoding of legacyRC2, rather than the modern encryption that’s offered by easy-rsa.

Returning the certificates

output "client_certificates" {
  value = [ for cert in tls_locally_signed_cert.client : cert.cert_pem ]
  description = "generated client certificates in ordered list form"
  sensitive = true
}

Finally, I return the generated certificates and their *.p12 equivalent from the module. I mark this data as sensitive due to the inclusion of private keys.

For the value, I needed to iterate over a list of resources (as I had used the foreach input earlier to handle a key/value pair) and re-build a single list with the result.

As mentioned above, it is then the responsibility of the user to determine what to do with the generated certificates, be it storing them locally or pushing them to AWS.

Posted in AWS, Cloud, DevOps, TerraformTags: