nesty92/terraform-aws-gitlab-runner-autoscaler

Terraform Module: GitLab Runner Autoscaler on AWS Spot Instances

This Terraform module provides a reusable infrastructure configuration for deploying GitLab Runner instances on AWS Spot Instances. The AWS Fleeting plugin scales the Runner instances automatically based on the number of jobs in the GitLab CI/CD pipeline. The module supports both ARM64 and AMD64 architectures.

Architecture

This module uses a single GitLab Runner Manager instance to manage multiple GitLab Runner instances. The Runner Manager schedules jobs on the Runner instances and scales the number of instances based on the number of jobs in the GitLab CI/CD pipeline.

The Runner Manager is deployed as an EC2 instance and the Runner instances are deployed as AWS Spot Instances. The Runner instances are managed by an Auto Scaling Group and are launched using a Launch Template.

Architecture Diagram - See the GitLab Runner Autoscaler documentation for more details.
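The relationship between the Launch Template and the Auto Scaling Group described above can be sketched roughly as follows. This is a simplified illustration and not the module's actual source: the resource names, the two variables, and every argument value are assumptions chosen for the example. The scale-in protection setting in particular should be checked against the fleeting plugin documentation.

```hcl
# Illustrative sketch only; names and values are hypothetical.
variable "runner_ami_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

# Launch Template that requests Spot capacity for the Runner instances.
resource "aws_launch_template" "runner" {
  name_prefix   = "gitlab-runner-"
  image_id      = var.runner_ami_id
  instance_type = "t3.micro"

  instance_market_options {
    market_type = "spot"
  }
}

# Auto Scaling Group that starts empty; the Runner Manager's fleeting
# plugin is expected to raise and lower the desired capacity.
resource "aws_autoscaling_group" "runner" {
  name_prefix         = "gitlab-runner-"
  vpc_zone_identifier = var.subnet_ids

  min_size         = 0
  max_size         = 2
  desired_capacity = 0

  # Assumption: keeps AWS from terminating instances that still run jobs.
  protect_from_scale_in = true

  launch_template {
    id      = aws_launch_template.runner.id
    version = "$Latest"
  }

  lifecycle {
    # Desired capacity changes at runtime, so Terraform should not reset it.
    ignore_changes = [desired_capacity]
  }
}
```

Starting the group at zero instances leaves all scaling decisions to the Runner Manager, which matches the division of responsibilities described in this section.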

License

This module is licensed under the Apache License. See the LICENSE file for more details.

Requirements

Name Version
terraform >= 1.3
aws >= 5.11

Providers

Name Version
aws >= 5.11
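A root configuration that meets these constraints might declare them as shown below. The version bounds come from the tables above; the provider source address `hashicorp/aws` is the standard registry address for the AWS provider.

```hcl
terraform {
  # Minimum Terraform version required by the module.
  required_version = ">= 1.3"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.11"
    }
  }
}
```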

Modules

No modules.

Resources

Name Type
aws_autoscaling_group.gitlab_runner_instance resource
aws_iam_instance_profile.runner_manager resource
aws_iam_role.runner_manager resource
aws_instance.runner_manager resource
aws_launch_template.gitlab_runner_instance resource
aws_security_group.runner resource
aws_security_group.runner_manager resource
aws_ami.amazon_linux_2 data source
aws_caller_identity.current data source
aws_region.current data source

Inputs

Name Description Type Default Required
architectures The architectures that the Runner will support (e.g., arm64, amd64) list(string) n/a yes
aws_azs n/a list(string) [] no
aws_key_name n/a string "" no
aws_security_group_ids n/a list(string) [] no
aws_subnet_ids n/a list(string) n/a yes
aws_vpc_id n/a string n/a yes
environment A name that identifies the environment, used as a prefix for tagging resources string n/a yes
fleeting_plugin_aws_version The version of the AWS Fleeting plugin to install. string n/a yes
gitlab_instance_url The URL of the GitLab instance string "https://gitlab.com" no
gitlab_runner_version The version of the GitLab Runner to install. string n/a yes
runner_autoscaler_options_amd64 Options added to the [runners.autoscaler] section of config.toml to configure the Runner Autoscaler. For
details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnersautoscaler-section

capacity_per_instance = The number of jobs that can be executed concurrently by a single instance.
max_use_count = The maximum number of times an instance can be used before it is scheduled for removal.
max_instances = The maximum number of instances that are allowed, regardless of the instance state (pending, running, deleting). The GitLab Runner default is 0 (unlimited); this module defaults to 2.

The fleeting-plugin-aws is the only supported plugin.

Default values if the option is not given:
capacity_per_instance = 1
max_use_count = 10
max_instances = 2
object({
capacity_per_instance = optional(number)
max_use_count = optional(number)
max_instances = optional(number)
})
{
"capacity_per_instance": 1,
"max_instances": 2,
"max_use_count": 10
}
no
runner_autoscaler_options_arm64 Options added to the [runners.autoscaler] section of config.toml to configure the Runner Autoscaler. For
details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnersautoscaler-section

capacity_per_instance = The number of jobs that can be executed concurrently by a single instance.
max_use_count = The maximum number of times an instance can be used before it is scheduled for removal.
max_instances = The maximum number of instances that are allowed, regardless of the instance state (pending, running, deleting). The GitLab Runner default is 0 (unlimited); this module defaults to 2.

The fleeting-plugin-aws is the only supported plugin.

Default values if the option is not given:
capacity_per_instance = 1
max_use_count = 10
max_instances = 2
object({
capacity_per_instance = optional(number)
max_use_count = optional(number)
max_instances = optional(number)
})
{
"capacity_per_instance": 1,
"max_instances": 2,
"max_use_count": 10
}
no
runner_autoscaler_plugin_connector_options_amd64 Options added to the [runners.autoscaler.connector_config] section of config.toml to configure the Runner Plugin Connector. For
details check https://gitlab.com/gitlab-org/fleeting/plugins/aws
object({
os = optional(string, "linux")
arch = optional(string)
protocol = optional(string)
username = optional(string)
password = optional(string)
key_path = optional(string)
use_static_credentials = optional(bool, false)
keepalive = optional(string)
timeout = optional(string)
use_external_addr = optional(bool, false)
})
{
"arch": "amd64"
}
no
runner_autoscaler_plugin_connector_options_arm64 Options added to the [runners.autoscaler.connector_config] section of config.toml to configure the Runner Plugin Connector. For
details check https://gitlab.com/gitlab-org/fleeting/plugins/aws
object({
os = optional(string, "linux")
arch = optional(string)
protocol = optional(string)
username = optional(string)
password = optional(string)
key_path = optional(string)
use_static_credentials = optional(bool, false)
keepalive = optional(string)
timeout = optional(string)
use_external_addr = optional(bool, false)
})
{
"arch": "arm64"
}
no
runner_autoscaler_plugin_options_amd64 Options added to the [runners.autoscaler.plugin_config] section of config.toml to configure the Runner Plugin. For
details check https://gitlab.com/gitlab-org/fleeting/plugins/aws

auto_scaling_group_name is set automatically to the Auto Scaling Group of the corresponding architecture.
object({
profile_name = optional(string)
config_file = optional(string)
credentials_file = optional(string)
})
{} no
runner_autoscaler_plugin_options_arm64 Options added to the [runners.autoscaler.plugin_config] section of config.toml to configure the Runner Plugin. For
details check https://gitlab.com/gitlab-org/fleeting/plugins/aws

auto_scaling_group_name is set automatically to the Auto Scaling Group of the corresponding architecture.
object({
profile_name = optional(string)
config_file = optional(string)
credentials_file = optional(string)
})
{} no
runner_autoscaler_policy_amd64 Options added to the [runners.autoscaler.policy] section of config.toml to configure the Runner Autoscaler Policy. For
details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnersautoscalerpolicy-section

Default values if the option is not given:
idle_count = 0
idle_time = "30m"
list(object({
periods = optional(list(string))
timezone = optional(string)
idle_count = optional(number)
idle_time = optional(string)
scale_factor = optional(number)
scale_factor_limit = optional(number)
}))
[
{
"idle_count": 0,
"idle_time": "30m"
}
]
no
runner_autoscaler_policy_arm64 Options added to the [runners.autoscaler.policy] section of config.toml to configure the Runner Autoscaler Policy. For
details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnersautoscalerpolicy-section

Default values if the option is not given:
idle_count = 0
idle_time = "30m"
list(object({
periods = optional(list(string))
timezone = optional(string)
idle_count = optional(number)
idle_time = optional(string)
scale_factor = optional(number)
scale_factor_limit = optional(number)
}))
[
{
"idle_count": 0,
"idle_time": "30m"
}
]
no
runner_docker_options_amd64 Options added to the [runners.docker] section of config.toml to configure the Docker container of the Runner Worker. For
details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html

Default values if the option is not given:
disable_cache = false
image = "busybox:latest"
privileged = false
pull_policy = ["always"]
shm_size = 0
tls_verify = false
volumes = ["/cache"]
object({
allowed_images = optional(list(string))
allowed_privileged_images = optional(list(string))
allowed_pull_policies = optional(list(string))
allowed_services = optional(list(string))
allowed_privileged_services = optional(list(string))
cache_dir = optional(string)
cap_add = optional(list(string))
cap_drop = optional(list(string))
cpuset_cpus = optional(string)
cpu_shares = optional(number)
cpus = optional(string)
devices = optional(list(string))
device_cgroup_rules = optional(list(string))
disable_cache = optional(bool, false)
disable_entrypoint_overwrite = optional(bool)
dns = optional(list(string))
dns_search = optional(list(string))
extra_hosts = optional(list(string))
gpus = optional(string)
group_add = optional(list(string))
helper_image = optional(string)
helper_image_flavor = optional(string)
helper_image_autoset_arch_and_os = optional(bool)
host = optional(string)
hostname = optional(string)
image = optional(string, "busybox:latest")
links = optional(list(string))
memory = optional(string)
memory_swap = optional(string)
memory_reservation = optional(string)
network_mode = optional(string)
mac_address = optional(string)
oom_kill_disable = optional(bool)
oom_score_adjust = optional(number)
privileged = optional(bool, false)
services_privileged = optional(string)
pull_policy = optional(list(string), ["always"])
runtime = optional(string)
isolation = optional(string)
security_opt = optional(list(string))
shm_size = optional(number, 0)
sysctls = optional(map(string))
tls_cert_path = optional(string)
tls_verify = optional(bool, false)
user = optional(string)
userns_mode = optional(string)
ulimit = optional(list(string))
volumes = optional(list(string), [])
volumes_from = optional(list(string))
volume_driver = optional(string)
wait_for_services_timeout = optional(number)
container_labels = optional(map(string))
services_limit = optional(number)
service_cpuset_cpus = optional(string)
service_cpu_shares = optional(number)
service_cpus = optional(string)
service_memory = optional(string)
service_memory_swap = optional(string)
service_memory_reservation = optional(string)
})
{
"disable_cache": false,
"image": "busybox:latest",
"privileged": false,
"pull_policies": [
"always"
],
"shm_size": 0,
"tls_verify": false,
"volumes": [
"/cache"
]
}
no
runner_docker_options_arm64 Options added to the [runners.docker] section of config.toml to configure the Docker container of the Runner Worker. For
details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html

Default values if the option is not given:
disable_cache = false
image = "busybox:latest"
privileged = false
pull_policy = ["always"]
shm_size = 0
tls_verify = false
volumes = ["/cache"]
object({
allowed_images = optional(list(string))
allowed_privileged_images = optional(list(string))
allowed_pull_policies = optional(list(string))
allowed_services = optional(list(string))
allowed_privileged_services = optional(list(string))
cache_dir = optional(string)
cap_add = optional(list(string))
cap_drop = optional(list(string))
cpuset_cpus = optional(string)
cpu_shares = optional(number)
cpus = optional(string)
devices = optional(list(string))
device_cgroup_rules = optional(list(string))
disable_cache = optional(bool, false)
disable_entrypoint_overwrite = optional(bool)
dns = optional(list(string))
dns_search = optional(list(string))
extra_hosts = optional(list(string))
gpus = optional(string)
group_add = optional(list(string))
helper_image = optional(string)
helper_image_flavor = optional(string)
helper_image_autoset_arch_and_os = optional(bool)
host = optional(string)
hostname = optional(string)
image = optional(string, "busybox:latest")
links = optional(list(string))
memory = optional(string)
memory_swap = optional(string)
memory_reservation = optional(string)
network_mode = optional(string)
mac_address = optional(string)
oom_kill_disable = optional(bool)
oom_score_adjust = optional(number)
privileged = optional(bool, false)
services_privileged = optional(string)
pull_policy = optional(list(string), ["always"])
runtime = optional(string)
isolation = optional(string)
security_opt = optional(list(string))
shm_size = optional(number, 0)
sysctls = optional(map(string))
tls_cert_path = optional(string)
tls_verify = optional(bool, false)
user = optional(string)
userns_mode = optional(string)
ulimit = optional(list(string))
volumes = optional(list(string), [])
volumes_from = optional(list(string))
volume_driver = optional(string)
wait_for_services_timeout = optional(number)
container_labels = optional(map(string))
services_limit = optional(number)
service_cpuset_cpus = optional(string)
service_cpu_shares = optional(number)
service_cpus = optional(string)
service_memory = optional(string)
service_memory_swap = optional(string)
service_memory_reservation = optional(string)
})
{
"disable_cache": false,
"image": "busybox:latest",
"privileged": false,
"pull_policies": [
"always"
],
"shm_size": 0,
"tls_verify": false,
"volumes": [
"/cache"
]
}
no
runner_instance_amd64 ami_id = The AMI ID to use for the Runner instance.
additional_tags = Map of tags that will be added to the Runner instance.
collect_autoscaling_metrics = A list of metrics to collect. The allowed values are GroupDesiredCapacity, GroupInServiceCapacity, GroupPendingCapacity, GroupMinSize, GroupMaxSize, GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupStandbyCapacity, GroupTerminatingCapacity, GroupTerminatingInstances, GroupTotalCapacity, GroupTotalInstances.
ebs_optimized = Enable EBS optimization for the Runner instance.
max_lifetime_seconds = The maximum time a Runner should live before it is killed.
monitoring = Enable detailed monitoring on the Runner instance. Default: false
name_prefix = Set the name prefix and override the Name tag for the Runner instance.
private_address_only = Restrict the Runner to use private IP addresses only. If this is set to true the Runner will use a private IP address only in case the Runner Workers use private addresses only.
block_device_mappings = The Runner's root block device configuration. Takes the following keys: device_name, delete_on_termination, volume_type, volume_size, encrypted, iops, throughput, kms_key_id
spot_price = Setting a spot bid price creates the Runner via a spot request. Be aware that spot instances can be stopped by AWS. Choose "on-demand-price" to pay up to the current on-demand price for the chosen instance type.
ssm_access = Allows connecting to the Runner via SSM.
type = EC2 instance type used. Default: t3.micro
use_eip = Assigns an EIP to the Runner.
iam_instance_profile = The IAM instance profile to associate with the Runner instance.
security_group_ids = The security group IDs to associate with the Runner instance.
object({
ami_id = string
additional_tags = optional(map(string))
collect_autoscaling_metrics = optional(list(string), null)
ebs_optimized = optional(bool, true)
max_lifetime_seconds = optional(number, null)
monitoring = optional(bool, false)
name_prefix = optional(string)
private_address_only = optional(bool, true)
block_device_mappings = optional(map(string), {})
spot_price = optional(string, null)
ssm_access = optional(bool, false)
type = optional(string, "t3.micro")
use_eip = optional(bool, false)
iam_instance_profile = optional(string)
security_group_ids = optional(list(string), [])
})
n/a yes
runner_instance_arm64 ami_id = The AMI ID to use for the Runner instance.
additional_tags = Map of tags that will be added to the Runner instance.
collect_autoscaling_metrics = A list of metrics to collect. The allowed values are GroupDesiredCapacity, GroupInServiceCapacity, GroupPendingCapacity, GroupMinSize, GroupMaxSize, GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupStandbyCapacity, GroupTerminatingCapacity, GroupTerminatingInstances, GroupTotalCapacity, GroupTotalInstances.
ebs_optimized = Enable EBS optimization for the Runner instance.
max_lifetime_seconds = The maximum time a Runner should live before it is killed.
monitoring = Enable detailed monitoring on the Runner instance. Default: false
name_prefix = Set the name prefix and override the Name tag for the Runner instance.
private_address_only = Restrict the Runner to use private IP addresses only. If this is set to true the Runner will use a private IP address only in case the Runner Workers use private addresses only.
block_device_mappings = The Runner's root block device configuration. Takes the following keys: device_name, delete_on_termination, volume_type, volume_size, encrypted, iops, throughput, kms_key_id
spot_price = Setting a spot bid price creates the Runner via a spot request. Be aware that spot instances can be stopped by AWS. Choose "on-demand-price" to pay up to the current on-demand price for the chosen instance type.
ssm_access = Allows connecting to the Runner via SSM.
type = EC2 instance type used. Default: t4g.micro
use_eip = Assigns an EIP to the Runner.
iam_instance_profile = The IAM instance profile to associate with the Runner instance.
security_group_ids = The security group IDs to associate with the Runner instance.
object({
ami_id = string
additional_tags = optional(map(string))
collect_autoscaling_metrics = optional(list(string), null)
ebs_optimized = optional(bool, true)
max_lifetime_seconds = optional(number, null)
monitoring = optional(bool, false)
name_prefix = optional(string)
private_address_only = optional(bool, true)
block_device_mappings = optional(map(string), {})
spot_price = optional(string, null)
ssm_access = optional(bool, false)
type = optional(string, "t4g.micro")
use_eip = optional(bool, false)
iam_instance_profile = optional(string)
security_group_ids = optional(list(string), [])
})
n/a yes
runner_manager For details check https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-global-section

gitlab_check_interval = Number of seconds between checking for available jobs (check_interval)
maximum_concurrent_jobs = The maximum number of jobs which can be processed by all Runners at the same time (concurrent).
prometheus_listen_address = Defines an address (<host>:<port>) the Prometheus metrics HTTP server should listen on (listen_address).
sentry_dsn = Sentry DSN of the project for the Runner Manager to use (uses legacy DSN format) (sentry_dsn)
object({
gitlab_check_interval = optional(number, 3)
maximum_concurrent_jobs = optional(number, 10)
prometheus_listen_address = optional(string, "")
sentry_dsn = optional(string, "")
})
{} no
runner_ssm_token_amd64 The SSM parameter that stores the authentication token for the Runner (amd64)
object({
name = string
arn = string
region = string
})
n/a yes
runner_ssm_token_arm64 The SSM parameter that stores the authentication token for the Runner (arm64)
object({
name = string
arn = string
region = string
})
n/a yes
suppressed_tags List of tag keys that are automatically removed and never added as default tags by the module. list(string) [] no
tags A map of tags to apply to all resources map(string) {} no
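As a rough illustration of how the inputs above fit together, the sketch below sets every required input plus two optional autoscaler settings. All concrete values are hypothetical placeholders: the VPC, subnet, and AMI IDs, the region, the SSM parameter names, and both version strings (whose exact expected format is an assumption). The GitHub source address is also an assumption about how the module would be referenced.

```hcl
# Hypothetical values throughout; replace them with your own.
variable "runner_token_amd64" {
  description = "Runner authentication token for the amd64 runner"
  type        = string
  sensitive   = true
}

variable "runner_token_arm64" {
  description = "Runner authentication token for the arm64 runner"
  type        = string
  sensitive   = true
}

locals {
  region = "eu-west-1" # assumed region
}

resource "aws_ssm_parameter" "runner_token_amd64" {
  name  = "/gitlab-runner/dev/token-amd64"
  type  = "SecureString"
  value = var.runner_token_amd64
}

resource "aws_ssm_parameter" "runner_token_arm64" {
  name  = "/gitlab-runner/dev/token-arm64"
  type  = "SecureString"
  value = var.runner_token_arm64
}

module "gitlab_runner" {
  # Assumed source address; pin a ref in real use.
  source = "github.com/nesty92/terraform-aws-gitlab-runner-autoscaler"

  environment   = "dev"
  architectures = ["amd64", "arm64"]

  aws_vpc_id     = "vpc-0123456789abcdef0"
  aws_subnet_ids = ["subnet-0123456789abcdef0"]

  gitlab_runner_version       = "16.11.0" # placeholder
  fleeting_plugin_aws_version = "0.5.0"   # placeholder

  # ami_id is the only mandatory attribute; the rest fall back to defaults.
  runner_instance_amd64 = {
    ami_id = "ami-0123456789abcdef0"
    type   = "t3.medium"
  }

  runner_instance_arm64 = {
    ami_id = "ami-0fedcba9876543210"
    type   = "t4g.medium"
  }

  runner_ssm_token_amd64 = {
    name   = aws_ssm_parameter.runner_token_amd64.name
    arn    = aws_ssm_parameter.runner_token_amd64.arn
    region = local.region
  }

  runner_ssm_token_arm64 = {
    name   = aws_ssm_parameter.runner_token_arm64.name
    arn    = aws_ssm_parameter.runner_token_arm64.arn
    region = local.region
  }

  # Optional: all three attributes are given so none is left null.
  runner_autoscaler_options_amd64 = {
    capacity_per_instance = 1
    max_use_count         = 10
    max_instances         = 5
  }

  # Optional: keep one idle amd64 instance during assumed working hours.
  runner_autoscaler_policy_amd64 = [
    {
      idle_count = 0
      idle_time  = "30m"
    },
    {
      periods    = ["* 8-17 * * mon-fri"]
      timezone   = "UTC"
      idle_count = 1
      idle_time  = "20m"
    }
  ]
}
```

The table marks the instance and SSM token inputs for both architectures as required, so the sketch supplies all four even though a deployment might list only one architecture in `architectures`.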

Outputs

Name Description
auto_scaling_group_ids n/a
launch_template_ids n/a
runner_manager_id n/a
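The outputs above could be re-exported from a root configuration along these lines. The sketch assumes the module is instantiated under the hypothetical name `gitlab_runner`; the output names on the left are also illustrative. Because the table does not document the output types, the values are passed through unchanged.

```hcl
# Assumes a module block named "gitlab_runner" exists in this configuration.
output "runner_manager_instance_id" {
  description = "ID of the Runner Manager instance"
  value       = module.gitlab_runner.runner_manager_id
}

output "runner_auto_scaling_group_ids" {
  description = "IDs of the Runner Auto Scaling Groups"
  value       = module.gitlab_runner.auto_scaling_group_ids
}

output "runner_launch_template_ids" {
  description = "IDs of the Runner Launch Templates"
  value       = module.gitlab_runner.launch_template_ids
}
```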