
Update example/test to use managed Node Group. Fix race conditions when applying the Kubernetes `aws-auth` ConfigMap (cloudposse#57)

* Change example/test to use `terraform-aws-eks-node-group`

* Add `data "null_data_source" "wait_for_cluster_and_kubernetes_configmap"`
aknysh authored Mar 27, 2020
1 parent 162d71e commit 79d7bf7
Showing 12 changed files with 174 additions and 184 deletions.
17 changes: 15 additions & 2 deletions README.md
@@ -92,6 +92,16 @@ The module provisions the following resources:

__NOTE:__ The module works with [Terraform Cloud](https://www.terraform.io/docs/cloud/index.html).

__NOTE:__ In `auth.tf`, we added `ignore_changes = [data["mapRoles"]]` to the `kubernetes_config_map` resource for the following reasons:
- We provision the EKS cluster and then the Kubernetes Auth ConfigMap to map additional roles/users/accounts to Kubernetes groups
- Then we wait for the cluster to become available and for the ConfigMap to get provisioned (see `data "null_data_source" "wait_for_cluster_and_kubernetes_configmap"` in `examples/complete/main.tf`)
- Then we provision a managed Node Group
- Then EKS updates the Auth ConfigMap and adds worker roles to it (for the worker nodes to join the cluster)
- Since the ConfigMap is modified outside of Terraform state, Terraform wants to update it (remove the roles that EKS added) on each `plan/apply`

If you want to modify the Node Group (e.g. add more Node Groups to the cluster) or need to map other IAM roles to Kubernetes groups,
set the variable `kubernetes_config_map_ignore_role_changes` to `false` and re-provision the module. Then set `kubernetes_config_map_ignore_role_changes` back to `true`.
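
For example, a minimal sketch of that workflow against the module block in `examples/complete/main.tf` (only the relevant inputs are shown; the additional role ARN is a placeholder, and the `rolearn`/`username`/`groups` shape follows the convention used for worker roles in `auth.tf`):

```hcl
module "eks_cluster" {
  source    = "../../"
  namespace = var.namespace
  stage     = var.stage
  name      = var.name

  # ... other inputs as in `examples/complete/main.tf` ...

  # Map an additional IAM role to a Kubernetes group in the `aws-auth` ConfigMap
  map_additional_iam_roles = [
    {
      rolearn  = "arn:aws:iam::111111111111:role/ops-admin" # placeholder ARN
      username = "ops-admin"
      groups   = ["system:masters"]
    }
  ]

  # Set to `false` (and apply) when Terraform needs to manage `mapRoles` again,
  # e.g. to add more Node Groups or map other IAM roles; then set it back to `true`
  kubernetes_config_map_ignore_role_changes = false
}
```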

## Usage


@@ -308,6 +318,7 @@ Available targets:
| endpoint_private_access | Indicates whether or not the Amazon EKS private API server endpoint is enabled. Defaults to `false`, matching the AWS EKS resource default | bool | `false` | no |
| endpoint_public_access | Indicates whether or not the Amazon EKS public API server endpoint is enabled. Defaults to `true`, matching the AWS EKS resource default | bool | `true` | no |
| environment | Environment, e.g. 'prod', 'staging', 'dev', 'pre-prod', 'UAT' | string | `` | no |
| kubernetes_config_map_ignore_role_changes | Set to `true` to ignore IAM role changes in the Kubernetes Auth ConfigMap | bool | `true` | no |
| kubernetes_version | Desired Kubernetes master version. If you do not specify a value, the latest available version is used | string | `1.15` | no |
| local_exec_interpreter | shell to use for local_exec | list(string) | `<list>` | no |
| map_additional_aws_accounts | Additional AWS account numbers to add to `config-map-aws-auth` ConfigMap | list(string) | `<list>` | no |
@@ -323,8 +334,8 @@ Available targets:
| tags | Additional tags (e.g. `map('BusinessUnit','XYZ')`) | map(string) | `<map>` | no |
| vpc_id | VPC ID for the EKS cluster | string | - | yes |
| wait_for_cluster_command | `local-exec` command to execute to determine if the EKS cluster is healthy. The cluster endpoint is available as the environment variable `ENDPOINT` | string | `curl --silent --fail --retry 60 --retry-delay 5 --retry-connrefused --insecure --output /dev/null $ENDPOINT/healthz` | no |
| workers_role_arns | List of Role ARNs of the worker nodes | list(string) | - | yes |
| workers_security_group_ids | Security Group IDs of the worker nodes | list(string) | - | yes |
| workers_role_arns | List of Role ARNs of the worker nodes | list(string) | `<list>` | no |
| workers_security_group_ids | Security Group IDs of the worker nodes | list(string) | `<list>` | no |

## Outputs

@@ -337,7 +348,9 @@ Available targets:
| eks_cluster_identity_oidc_issuer | The OIDC Identity issuer for the cluster |
| eks_cluster_identity_oidc_issuer_arn | The OIDC Identity issuer ARN for the cluster that can be used to associate IAM roles with a service account |
| eks_cluster_managed_security_group_id | Security Group ID that was created by EKS for the cluster. EKS creates a Security Group and applies it to ENI that is attached to EKS Control Plane master nodes and to any managed workloads |
| eks_cluster_role_arn | ARN of the EKS cluster IAM role |
| eks_cluster_version | The Kubernetes server version of the cluster |
| kubernetes_config_map_id | ID of `aws-auth` Kubernetes ConfigMap |
| security_group_arn | ARN of the EKS cluster Security Group |
| security_group_id | ID of the EKS cluster Security Group |
| security_group_name | Name of the EKS cluster Security Group |
10 changes: 10 additions & 0 deletions README.yaml
@@ -77,6 +77,16 @@ introduction: |-
__NOTE:__ The module works with [Terraform Cloud](https://www.terraform.io/docs/cloud/index.html).
__NOTE:__ In `auth.tf`, we added `ignore_changes = [data["mapRoles"]]` to the `kubernetes_config_map` resource for the following reasons:
- We provision the EKS cluster and then the Kubernetes Auth ConfigMap to map additional roles/users/accounts to Kubernetes groups
- Then we wait for the cluster to become available and for the ConfigMap to get provisioned (see `data "null_data_source" "wait_for_cluster_and_kubernetes_configmap"` in `examples/complete/main.tf`)
- Then we provision a managed Node Group
- Then EKS updates the Auth ConfigMap and adds worker roles to it (for the worker nodes to join the cluster)
- Since the ConfigMap is modified outside of Terraform state, Terraform wants to update it (remove the roles that EKS added) on each `plan/apply`
If you want to modify the Node Group (e.g. add more Node Groups to the cluster) or need to map other IAM roles to Kubernetes groups,
set the variable `kubernetes_config_map_ignore_role_changes` to `false` and re-provision the module. Then set `kubernetes_config_map_ignore_role_changes` back to `true`.
# How to use this project
usage: |-
25 changes: 23 additions & 2 deletions auth.tf
@@ -32,7 +32,8 @@ locals {
certificate_authority_data_map = local.certificate_authority_data_list_internal[0]
certificate_authority_data = local.certificate_authority_data_map["data"]

# Add worker nodes role ARNs (could be from many worker groups) to the ConfigMap
# Add worker node role ARNs (could be from many unmanaged worker groups) to the ConfigMap
# Note that we don't need to do this for managed Node Groups since EKS adds their roles to the ConfigMap automatically
map_worker_roles = [
for role_arn in var.workers_role_arns : {
rolearn : role_arn
@@ -80,8 +81,28 @@ provider "kubernetes" {
load_config_file = false
}

resource "kubernetes_config_map" "aws_auth_ignore_changes" {
count = var.enabled && var.apply_config_map_aws_auth && var.kubernetes_config_map_ignore_role_changes ? 1 : 0
depends_on = [null_resource.wait_for_cluster[0]]

metadata {
name = "aws-auth"
namespace = "kube-system"
}

data = {
mapRoles = yamlencode(distinct(concat(local.map_worker_roles, var.map_additional_iam_roles)))
mapUsers = yamlencode(var.map_additional_iam_users)
mapAccounts = yamlencode(var.map_additional_aws_accounts)
}

lifecycle {
ignore_changes = [data["mapRoles"]]
}
}

resource "kubernetes_config_map" "aws_auth" {
count = var.enabled && var.apply_config_map_aws_auth ? 1 : 0
count = var.enabled && var.apply_config_map_aws_auth && var.kubernetes_config_map_ignore_role_changes == false ? 1 : 0
depends_on = [null_resource.wait_for_cluster[0]]

metadata {
7 changes: 5 additions & 2 deletions docs/terraform.md
@@ -13,6 +13,7 @@
| endpoint_private_access | Indicates whether or not the Amazon EKS private API server endpoint is enabled. Defaults to `false`, matching the AWS EKS resource default | bool | `false` | no |
| endpoint_public_access | Indicates whether or not the Amazon EKS public API server endpoint is enabled. Defaults to `true`, matching the AWS EKS resource default | bool | `true` | no |
| environment | Environment, e.g. 'prod', 'staging', 'dev', 'pre-prod', 'UAT' | string | `` | no |
| kubernetes_config_map_ignore_role_changes | Set to `true` to ignore IAM role changes in the Kubernetes Auth ConfigMap | bool | `true` | no |
| kubernetes_version | Desired Kubernetes master version. If you do not specify a value, the latest available version is used | string | `1.15` | no |
| local_exec_interpreter | shell to use for local_exec | list(string) | `<list>` | no |
| map_additional_aws_accounts | Additional AWS account numbers to add to `config-map-aws-auth` ConfigMap | list(string) | `<list>` | no |
@@ -28,8 +29,8 @@
| tags | Additional tags (e.g. `map('BusinessUnit','XYZ')`) | map(string) | `<map>` | no |
| vpc_id | VPC ID for the EKS cluster | string | - | yes |
| wait_for_cluster_command | `local-exec` command to execute to determine if the EKS cluster is healthy. The cluster endpoint is available as the environment variable `ENDPOINT` | string | `curl --silent --fail --retry 60 --retry-delay 5 --retry-connrefused --insecure --output /dev/null $ENDPOINT/healthz` | no |
| workers_role_arns | List of Role ARNs of the worker nodes | list(string) | - | yes |
| workers_security_group_ids | Security Group IDs of the worker nodes | list(string) | - | yes |
| workers_role_arns | List of Role ARNs of the worker nodes | list(string) | `<list>` | no |
| workers_security_group_ids | Security Group IDs of the worker nodes | list(string) | `<list>` | no |

## Outputs

@@ -42,7 +43,9 @@
| eks_cluster_identity_oidc_issuer | The OIDC Identity issuer for the cluster |
| eks_cluster_identity_oidc_issuer_arn | The OIDC Identity issuer ARN for the cluster that can be used to associate IAM roles with a service account |
| eks_cluster_managed_security_group_id | Security Group ID that was created by EKS for the cluster. EKS creates a Security Group and applies it to ENI that is attached to EKS Control Plane master nodes and to any managed workloads |
| eks_cluster_role_arn | ARN of the EKS cluster IAM role |
| eks_cluster_version | The Kubernetes server version of the cluster |
| kubernetes_config_map_id | ID of `aws-auth` Kubernetes ConfigMap |
| security_group_arn | ARN of the EKS cluster Security Group |
| security_group_id | ID of the EKS cluster Security Group |
| security_group_name | Name of the EKS cluster Security Group |
26 changes: 10 additions & 16 deletions examples/complete/fixtures.us-east-2.tfvars
@@ -8,28 +8,22 @@ stage = "test"

name = "eks"

instance_type = "t2.small"

health_check_type = "EC2"

wait_for_capacity_timeout = "10m"

max_size = 3
kubernetes_version = "1.15"

min_size = 2
oidc_provider_enabled = true

autoscaling_policies_enabled = true
enabled_cluster_log_types = ["audit"]

cpu_utilization_high_threshold_percent = 80
cluster_log_retention_period = 7

cpu_utilization_low_threshold_percent = 20
instance_types = ["t3.small"]

associate_public_ip_address = true
desired_size = 2

kubernetes_version = "1.15"
max_size = 3

oidc_provider_enabled = true
min_size = 2

enabled_cluster_log_types = ["audit"]
disk_size = 20

cluster_log_retention_period = 7
kubernetes_labels = {}
57 changes: 28 additions & 29 deletions examples/complete/main.tf
@@ -51,33 +51,6 @@ module "subnets" {
tags = local.tags
}

module "eks_workers" {
source = "git::https://github.com/cloudposse/terraform-aws-eks-workers.git?ref=tags/0.12.0"
namespace = var.namespace
stage = var.stage
name = var.name
attributes = var.attributes
tags = var.tags
instance_type = var.instance_type
eks_worker_ami_name_filter = local.eks_worker_ami_name_filter
vpc_id = module.vpc.vpc_id
subnet_ids = module.subnets.public_subnet_ids
associate_public_ip_address = var.associate_public_ip_address
health_check_type = var.health_check_type
min_size = var.min_size
max_size = var.max_size
wait_for_capacity_timeout = var.wait_for_capacity_timeout
cluster_name = module.label.id
cluster_endpoint = module.eks_cluster.eks_cluster_endpoint
cluster_certificate_authority_data = module.eks_cluster.eks_cluster_certificate_authority_data
cluster_security_group_id = module.eks_cluster.security_group_id

# Auto-scaling policies and CloudWatch metric alarms
autoscaling_policies_enabled = var.autoscaling_policies_enabled
cpu_utilization_high_threshold_percent = var.cpu_utilization_high_threshold_percent
cpu_utilization_low_threshold_percent = var.cpu_utilization_low_threshold_percent
}

module "eks_cluster" {
source = "../../"
namespace = var.namespace
@@ -93,7 +66,33 @@ module "eks_cluster" {
oidc_provider_enabled = var.oidc_provider_enabled
enabled_cluster_log_types = var.enabled_cluster_log_types
cluster_log_retention_period = var.cluster_log_retention_period
}

# Ensure ordering of resource creation to eliminate the race conditions when applying the Kubernetes Auth ConfigMap.
# Do not create Node Group before the EKS cluster is created and the `aws-auth` Kubernetes ConfigMap is applied.
# Otherwise, EKS will create the ConfigMap first and add the managed node role ARNs to it,
# and the kubernetes provider will throw an error that the ConfigMap already exists (because it can't update the map, only create it).
# If we create the ConfigMap first (to add additional roles/users/accounts), EKS will just update it by adding the managed node role ARNs.
data "null_data_source" "wait_for_cluster_and_kubernetes_configmap" {
inputs = {
cluster_name = module.eks_cluster.eks_cluster_id
kubernetes_config_map_id = module.eks_cluster.kubernetes_config_map_id
}
}

workers_role_arns = [module.eks_workers.workers_role_arn]
workers_security_group_ids = [module.eks_workers.security_group_id]
module "eks_node_group" {
source = "git::https://github.com/cloudposse/terraform-aws-eks-node-group.git?ref=tags/0.4.0"
namespace = var.namespace
stage = var.stage
name = var.name
attributes = var.attributes
tags = var.tags
subnet_ids = module.subnets.public_subnet_ids
cluster_name = data.null_data_source.wait_for_cluster_and_kubernetes_configmap.outputs["cluster_name"]
instance_types = var.instance_types
desired_size = var.desired_size
min_size = var.min_size
max_size = var.max_size
kubernetes_labels = var.kubernetes_labels
disk_size = var.disk_size
}
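
As a follow-on, if you later add more Node Groups (the "add more Node Groups to the cluster" case called out in the README note), each additional group can reuse the same ordering guard. A hedged sketch, assuming the same module source and inputs as the `eks_node_group` block above (the extra attribute, instance type, and labels are illustrative):

```hcl
module "eks_node_group_gpu" {
  source     = "git::https://github.com/cloudposse/terraform-aws-eks-node-group.git?ref=tags/0.4.0"
  namespace  = var.namespace
  stage      = var.stage
  name       = var.name
  attributes = compact(concat(var.attributes, ["gpu"])) # distinguish the second group
  tags       = var.tags
  subnet_ids = module.subnets.public_subnet_ids

  # Reuse the same guard so this group is also created only after the cluster
  # and the `aws-auth` ConfigMap exist
  cluster_name = data.null_data_source.wait_for_cluster_and_kubernetes_configmap.outputs["cluster_name"]

  instance_types    = ["p2.xlarge"]
  desired_size      = 1
  min_size          = 1
  max_size          = 2
  kubernetes_labels = { workload = "gpu" }
  disk_size         = var.disk_size
}
```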
92 changes: 21 additions & 71 deletions examples/complete/outputs.tf
@@ -53,87 +53,37 @@ output "eks_cluster_identity_oidc_issuer" {
value = module.eks_cluster.eks_cluster_identity_oidc_issuer
}

output "workers_launch_template_id" {
description = "ID of the launch template"
value = module.eks_workers.launch_template_id
}

output "workers_launch_template_arn" {
description = "ARN of the launch template"
value = module.eks_workers.launch_template_arn
}

output "workers_autoscaling_group_id" {
description = "The AutoScaling Group ID"
value = module.eks_workers.autoscaling_group_id
}

output "workers_autoscaling_group_name" {
description = "The AutoScaling Group name"
value = module.eks_workers.autoscaling_group_name
}

output "workers_autoscaling_group_arn" {
description = "ARN of the AutoScaling Group"
value = module.eks_workers.autoscaling_group_arn
}

output "workers_autoscaling_group_min_size" {
description = "The minimum size of the AutoScaling Group"
value = module.eks_workers.autoscaling_group_min_size
}

output "workers_autoscaling_group_max_size" {
description = "The maximum size of the AutoScaling Group"
value = module.eks_workers.autoscaling_group_max_size
}

output "workers_autoscaling_group_desired_capacity" {
description = "The number of Amazon EC2 instances that should be running in the group"
value = module.eks_workers.autoscaling_group_desired_capacity
}

output "workers_autoscaling_group_default_cooldown" {
description = "Time between a scaling activity and the succeeding scaling activity"
value = module.eks_workers.autoscaling_group_default_cooldown
}

output "workers_autoscaling_group_health_check_grace_period" {
description = "Time after instance comes into service before checking health"
value = module.eks_workers.autoscaling_group_health_check_grace_period
}

output "workers_autoscaling_group_health_check_type" {
description = "`EC2` or `ELB`. Controls how health checking is done"
value = module.eks_workers.autoscaling_group_health_check_type
output "eks_cluster_managed_security_group_id" {
description = "Security Group ID that was created by EKS for the cluster. EKS creates a Security Group and applies it to ENI that is attached to EKS Control Plane master nodes and to any managed workloads"
value = module.eks_cluster.eks_cluster_managed_security_group_id
}

output "workers_security_group_id" {
description = "ID of the worker nodes Security Group"
value = module.eks_workers.security_group_id
output "eks_node_group_role_arn" {
description = "ARN of the worker nodes IAM role"
value = module.eks_node_group.eks_node_group_role_arn
}

output "workers_security_group_arn" {
description = "ARN of the worker nodes Security Group"
value = module.eks_workers.security_group_arn
output "eks_node_group_role_name" {
description = "Name of the worker nodes IAM role"
value = module.eks_node_group.eks_node_group_role_name
}

output "workers_security_group_name" {
description = "Name of the worker nodes Security Group"
value = module.eks_workers.security_group_name
output "eks_node_group_id" {
description = "EKS Cluster name and EKS Node Group name separated by a colon"
value = module.eks_node_group.eks_node_group_id
}

output "workers_role_arn" {
description = "ARN of the worker nodes IAM role"
value = module.eks_workers.workers_role_arn
output "eks_node_group_arn" {
description = "Amazon Resource Name (ARN) of the EKS Node Group"
value = module.eks_node_group.eks_node_group_arn
}

output "workers_role_name" {
description = "Name of the worker nodes IAM role"
value = module.eks_workers.workers_role_name
output "eks_node_group_resources" {
description = "List of objects containing information about underlying resources of the EKS Node Group"
value = module.eks_node_group.eks_node_group_resources
}

output "eks_cluster_managed_security_group_id" {
description = "Security Group ID that was created by EKS for the cluster. EKS creates a Security Group and applies it to ENI that is attached to EKS Control Plane master nodes and to any managed workloads"
value = module.eks_cluster.eks_cluster_managed_security_group_id
output "eks_node_group_status" {
description = "Status of the EKS Node Group"
value = module.eks_node_group.eks_node_group_status
}
