Automate GitHub runner deployment for stackhpc-release-train #252

Merged 2 commits on Oct 17, 2023
2 changes: 1 addition & 1 deletion .github/actions/setup/action.yml
@@ -15,7 +15,7 @@ runs:
   steps:
     - uses: actions/setup-python@v3
       with:
-        python-version: 3.x
+        python-version: 3.11.x
         # Cache Python dependencies
         cache: pip

46 changes: 46 additions & 0 deletions terraform/github-runners/.terraform.lock.hcl

Generated file, not rendered by default.

148 changes: 148 additions & 0 deletions terraform/github-runners/README.rst
@@ -0,0 +1,148 @@
========================
Terraform GitHub runners
========================

This Terraform configuration deploys GitHub Actions runner VMs on an
OpenStack cloud for the stackhpc-release-train repository.

Usage
=====

These instructions show how to use this Terraform configuration manually. They
assume you are running an Ubuntu host that will be used to run Terraform. The
machine should have network access to the VM that will be created by this
configuration.

Install Terraform:

.. code-block:: console

   wget -qO - https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/terraform-archive-keyring.gpg
   echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/terraform-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/terraform.list
   sudo apt update
   sudo apt install terraform

Clone the repo:

.. code-block:: console

   git clone https://github.com/stackhpc/stackhpc-release-train
   cd stackhpc-release-train

Change to the terraform/github-runners directory:

.. code-block:: console

   cd terraform/github-runners

Initialise Terraform:

.. code-block:: console

   terraform init

Create an OpenStack clouds.yaml file with your credentials to access an
OpenStack cloud. Alternatively, download one from Horizon. The credentials
should be scoped to the stackhpc-release project.

.. code-block:: console

   cat << EOF > clouds.yaml
   ---
   clouds:
     sms-lab:
       auth:
         auth_url: https://api.sms-lab.cloud:5000
         username: <username>
         project_name: <project>
         domain_name: default
       interface: public
   EOF

Export environment variables to use the correct cloud and provide a password:

.. code-block:: console

   export OS_CLOUD=sms-lab
   read -p OS_PASSWORD -s OS_PASSWORD
   export OS_PASSWORD
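
Before running Terraform it can help to confirm that both variables are
actually set. The following is a minimal sketch and is not part of this
repository; the helper name ``missing_vars`` is hypothetical.

```shell
# Hypothetical helper (not part of this repository): print the names of the
# given environment variables that are unset or empty, separated by spaces.
missing_vars() {
    missing=""
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="${missing:+$missing }$name"
        fi
    done
    printf '%s\n' "$missing"
}
```

For example, ``missing_vars OS_CLOUD OS_PASSWORD`` prints an empty line when
both variables are set, and the names of any that are missing otherwise.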

Verify that the Terraform variables in terraform.tfvars are correct.

Generate a plan:

.. code-block:: console

   terraform plan

Apply the changes:

.. code-block:: console

   terraform apply -auto-approve

Create a virtualenv:

.. code-block:: console

   python3 -m venv venv

Activate the virtualenv:

.. code-block:: console

   source venv/bin/activate

Install Python dependencies:

.. code-block:: console

   pip install -r ansible/requirements.txt

Install Ansible Galaxy dependencies:

.. code-block:: console

   ansible-galaxy collection install -r ansible/requirements.yml
   ansible-galaxy role install -r ansible/requirements.yml

Create a GitHub personal access token (classic) with the full repo scope.
Export an environment variable with the token:

.. code-block:: console

   read -p PERSONAL_ACCESS_TOKEN -s PERSONAL_ACCESS_TOKEN
   export PERSONAL_ACCESS_TOKEN

Deploy runners:

.. code-block:: console

   ansible-playbook ansible/site.yml -i ansible/inventory.yml

To remove runners:

.. code-block:: console

   ansible-playbook ansible/site.yml -i ansible/inventory.yml -e runner_state=absent

Troubleshooting
===============

Install service fails
---------------------

If you see the following::

   TASK [monolithprojects.github_actions_runner : Install service] ********************************************************************************************************************************************
   fatal: [10.205.0.50]: FAILED! => changed=true
     cmd: ./svc.sh install ubuntu
     msg: '[Errno 2] No such file or directory: b''./svc.sh'''
     rc: 2
     stderr: ''
     stderr_lines: <omitted>
     stdout: ''
     stdout_lines: <omitted>

It might mean the runner is already registered, possibly from a previous VM.
Remove the runner using Ansible or the GitHub settings.
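
To check which runners GitHub currently has registered, one option is the
REST API endpoint for repository self-hosted runners. The following is a
minimal sketch and is not part of this repository; the helper names
``runners_api_url`` and ``list_runners`` are hypothetical, and it assumes the
``PERSONAL_ACCESS_TOKEN`` variable exported earlier is still set.

```shell
# Hypothetical helpers, not part of this repository.

# Build the GitHub REST API URL that lists a repository's self-hosted runners.
runners_api_url() {
    owner="$1"
    repo="$2"
    printf 'https://api.github.com/repos/%s/%s/actions/runners\n' "$owner" "$repo"
}

# Query the API using the token exported earlier as PERSONAL_ACCESS_TOKEN.
list_runners() {
    curl -sS \
        -H "Accept: application/vnd.github+json" \
        -H "Authorization: Bearer ${PERSONAL_ACCESS_TOKEN}" \
        "$(runners_api_url "$1" "$2")"
}
```

Running ``list_runners stackhpc stackhpc-release-train`` should return a JSON
document listing each registered runner with its name and status, which makes
a stale registration from a previous VM easier to spot.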
5 changes: 5 additions & 0 deletions terraform/github-runners/ansible.cfg
@@ -0,0 +1,5 @@
[defaults]
stdout_callback = community.general.yaml
host_key_checking = False
pipelining = True
deprecation_warnings=False
6 changes: 6 additions & 0 deletions terraform/github-runners/ansible/group_vars/all
@@ -0,0 +1,6 @@
---
runner_user: "{{ ansible_facts.user_id }}"
github_account: stackhpc
github_repo: stackhpc-release-train
runner_labels:
- stackhpc-release-train
1 change: 1 addition & 0 deletions terraform/github-runners/ansible/inventory.yml
@@ -0,0 +1 @@
plugin: cloud.terraform.terraform_provider
1 change: 1 addition & 0 deletions terraform/github-runners/ansible/requirements.txt
@@ -0,0 +1 @@
ansible-core==2.15.*
6 changes: 6 additions & 0 deletions terraform/github-runners/ansible/requirements.yml
@@ -0,0 +1,6 @@
---
collections:
- name: cloud.terraform
- name: community.general
roles:
- name: monolithprojects.github_actions_runner
22 changes: 22 additions & 0 deletions terraform/github-runners/ansible/site.yml
@@ -0,0 +1,22 @@
---
- name: Deploy GitHub runners
  hosts: runners
  gather_facts: no
  tasks:
    - name: Wait for connection
      ansible.builtin.wait_for_connection:

    - name: Gather facts
      ansible.builtin.setup:

    - import_role:
        name: monolithprojects.github_actions_runner
      become: true

    - name: Ensure runner service is running
      ansible.builtin.service:
        name: actions.runner.stackhpc-stackhpc-release-train.{{ ansible_facts.hostname }}.service
        state: started
        enabled: true
      become: true
      when: runner_state | default('started') == 'started'
Review comment from the PR author:

@jackhodgkiss you might want to borrow this bit for SKC
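
The unit name in the final task of site.yml appears to follow the pattern
account-repo followed by the hostname. That pattern is an inference from the
literal name in the playbook and the values in group_vars/all, not something
the PR states. A minimal sketch of building the same name by hand, using a
hypothetical helper `runner_service_name` that is not part of this repository:

```shell
# Hypothetical helper, not part of this repository.
# Compose the systemd unit name that the playbook's final task manages,
# assuming the pattern actions.runner.<account>-<repo>.<hostname>.service.
runner_service_name() {
    account="$1"
    repo="$2"
    host="$3"
    printf 'actions.runner.%s-%s.%s.service\n' "$account" "$repo" "$host"
}
```

On a runner VM, something like
`systemctl status "$(runner_service_name stackhpc stackhpc-release-train "$(hostname)")"`
could then be used to inspect the service, assuming the system hostname matches
the `ansible_facts.hostname` value used by the playbook.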

24 changes: 24 additions & 0 deletions terraform/github-runners/outputs.tf
@@ -0,0 +1,24 @@
output "access_ip_v4" {
  value = openstack_compute_instance_v2.runners.*.access_ip_v4
}

output "access_cidr" {
  value = data.openstack_networking_subnet_v2.network.cidr
}

output "access_gw" {
  value = data.openstack_networking_subnet_v2.network.gateway_ip
}

resource "ansible_host" "runners" {
  for_each = { for host in openstack_compute_instance_v2.runners : host.name => host.access_ip_v4 }
  name     = each.value
  groups   = ["runners"]
}

resource "ansible_group" "runners" {
  name = "runners"
  variables = {
    ansible_user = var.ssh_username
  }
}
18 changes: 18 additions & 0 deletions terraform/github-runners/provider.tf
@@ -0,0 +1,18 @@
#provider "openstack" {
# use environment variables
#}

terraform {
  required_version = ">= 0.14"
  backend "local" {
  }
  required_providers {
    ansible = {
      source  = "ansible/ansible"
      version = "1.1.0"
    }
    openstack = {
      source = "terraform-provider-openstack/openstack"
    }
  }
}
18 changes: 18 additions & 0 deletions terraform/github-runners/templates/userdata.cfg.tpl
@@ -0,0 +1,18 @@
#cloud-config
# Configure SSH keys here, to avoid creating an ephemeral keypair.
# This means only the instance needs to be cleaned up if the destroy fails.
ssh_authorized_keys:
- ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJXNGOmsnRcZdmjMA5f3fgq8l8+QMjBywJQzuvxlhslx mark@mark-xps15
- ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICY4CwVM472QS5TYhEwAHge4vbkuBtpVUCC8cyIolYR5 [email protected]
- ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIHK1Kf1hRYskYfYtXnqnyzD6vFbpKVyUtcxcn2pYL2y+ [email protected]
- ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIC/F0fT8MuXUBEDeCbi9XoaefpZuWf35NOHHECN8VyIq [email protected]
- ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIATVu5COwRn9FirPl2GhQIY/RmZkNGM+CcXOhT7WujBf [email protected]
- ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJg3W6wKjLrKObikQz84K3oZJjUDpp2c6k62ZE4x1fev [email protected]
- ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAID1c17mnzF71ElO+xbJKwxc2bgoR8Yb+DhcENrWfYt0d dawud
- ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDeIsZEbrsSuxBcr8qI4jiZxXavEv+Crh+QwDNcezF6xgrTkIzCodYuJ6CNLhbee1A6IBIB5csmJWZ6BE8BD1RqxPH1AnCVIoQD+b606HCWYQ8UAhD7E485fE3DxDffAgsSsaK5LBatdIBu3epOrJiKGIMuJmGSAv3Hp86BkQ2y9bQXMp1jhn+cQ1WoZm9AK77viEM4X2sVmWMipUu3RV1tlI/W9mBO6GCi2N/+kDtzpZzjQEevr/xgu4zir99z2s/xDIMN2IssQm3nD4pMTrj7lkpKehtdYkHOFSW4qvhwgRuyONtoWAJjt7krOURYI4W6Inxva4yv/tqTyCFEIV4jfFGuoN7N9uaA7HtaEtSWtxtGXPkITutE3u4vsvKhPHOwjhHIL9cgy/EquP2Jm/gN8UG+iogSsLqbfJVbLdpO3X1i6m3dIsJ5WYKhPGBXRfTXRJQ0H/tbDTDr0zt7xqQDZREExirtyJ6auMYOXOrE60wMkH2hEPMeretx+4oZ1KE= will@juno
- ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCpMZpCHFX5yp7+5mrm4rW0x7dIQSS+49i5nVe5HrXxWSCvz8RVtEstFBN8OuOnUgZnDfpxrcxhoiCtv5WWEJQOzeUNlcjzMjTW7MdLqpVciLQLad7gUl1kepv6P0iyKrb0ze5xA0nfKVKInQmU3jSklE+aghHUXGre/pwAaqKJGsMLXLfW4uqjdG2RSAL/Bpg9126Ly0u/r1zhaNefkpIVqAYGqS7aXis6FTic/WfK4PbImqMuGjqiHNbNLT2pyzoBTRt3WDE0cRI6ryEYKKtHPtQ15H/J7kj18vek4ca7wA7nK+0UtrtSWnE2U4l/tybJ8AQQOKJ2ocjAyvDOrfumSRd+KS/ljxrOFSmyV4Zap6WJrUgzicQKOpZ0L9f3JxwTc/NlXk/BTA+COaoM3NwjYu+eHuJudJSiA0EHTQ3LdLhZiyFTo+DX+J6sZzr0keGWd0gRXQ8A0vKi1tHqerR7Reh1NGobaNzScO/1nyDAZJLt9BNI1OFMJEuZD83CTSU= [email protected]

# Not including these until claimed.
# - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDNG48SwtW6B9Exs08fuSyAq+KTQ7xMJaypDXiku8vIuP2Q7le6GTwmlULX12djtTvTewRzV6AsqkCyGq+2+Yo8EaKZbU08pKLuzlTH++89vKYaq+FxbbpgrDlvnT1rA5Af4AGrNcUGj3f3ovdc4XC9HL2MvVcb3lKZelCLae7HNeEES4oMVLd+GytkQPruj4SDqLLedGkJRmDN7Z88zjjhWDrVNFeXvcuJOXwlRXvuw8gR4NQgoPpET91Do8uMj5YpvH19yc9U3XOZx9wj1+sh2XJwQF6/anOpNxoFmICuSJLFc/Ulrmk8ogq3GRKHX+dT6eTIZrkX0IkSNfDnaOi20e1YP5yN0d8NffVrZHRzUwQz5LJB8fXyQgd1JIDcMyIluCjz3ND90oOoGMIDABZmz5+8eqECEh7Du7b1/XFSpPoTRWJ4YrpmSbCuYBpLKe/vNJAqUQgxwpgYuFUEJiBfWrE3B4w7FAUcc1l4/A78OrYt1umnK3OM41OZobChavE=
# - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIO4AyLmXAw4uceOvUte8H0tKEif7q63KIncL39I0QkpC
# - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOUaGDMtMzd93zbAhvZO0bDBqdna1QgHRQgDImKuPTXs
6 changes: 6 additions & 0 deletions terraform/github-runners/terraform.tfvars
@@ -0,0 +1,6 @@
vm_count = 2
ssh_username = "ubuntu"
vm_image = "Ubuntu-20.04"
vm_flavor = "general.v1.tiny"
vm_network = "stackhpc-release"
vm_subnet = "stackhpc-release-subnet"
62 changes: 62 additions & 0 deletions terraform/github-runners/vm.tf
@@ -0,0 +1,62 @@
variable "vm_count" {
  type = number
}

variable "ssh_username" {
  type = string
}

variable "vm_name" {
  type    = string
  default = "srt-runner"
}

variable "vm_image" {
  type = string
}

variable "vm_flavor" {
  type = string
}

variable "vm_network" {
  type = string
}

variable "vm_subnet" {
  type = string
}

locals {
  image_is_uuid = length(regexall("^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", var.vm_image)) > 0
}

data "openstack_images_image_v2" "image" {
  name        = var.vm_image
  most_recent = true
  count       = local.image_is_uuid ? 0 : 1
}

data "openstack_networking_subnet_v2" "network" {
  name = var.vm_subnet
}

resource "openstack_compute_instance_v2" "runners" {
  count        = var.vm_count
  name         = format("%s%s", var.vm_name, count.index)
  flavor_name  = var.vm_flavor
  config_drive = true
  user_data    = file("templates/userdata.cfg.tpl")
  network {
    name = var.vm_network
  }

  block_device {
    uuid                  = local.image_is_uuid ? var.vm_image : data.openstack_images_image_v2.image[0].id
    source_type           = "image"
    volume_size           = 20
    boot_index            = 0
    destination_type      = "volume"
    delete_on_termination = true
  }
}
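
The `image_is_uuid` local decides whether `vm_image` is used directly as an
image ID or looked up by name through the image data source. A minimal sketch
of the same check outside Terraform, using hypothetical helpers `is_uuid` and
`image_lookup_mode` that are not part of this repository:

```shell
# Hypothetical helpers, not part of this repository.

# Succeed when the argument matches the lowercase UUID pattern from vm.tf.
is_uuid() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Report how vm.tf would treat a given vm_image value:
# "uuid" means it is used as-is, "name" means it is looked up by name.
image_lookup_mode() {
    if is_uuid "$1"; then
        echo "uuid"
    else
        echo "name"
    fi
}
```

With the value from terraform.tfvars, `image_lookup_mode Ubuntu-20.04` reports
`name`, so the image would be resolved through the data source with
`most_recent = true`.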