Table of Contents
[TOC]
The design.gitlab.com site runs the Pajamas Design System and contains brand and product design guidelines and UI components for all things GitLab. The project is located at https://gitlab.com/gitlab-org/gitlab-services/design.gitlab.com. You can read more about this system here.
This is an internally developed Rails app running on a GKE cluster, using an Auto DevOps deployment configuration. There is no production database; the staging/review databases currently run in pods provisioned by Auto DevOps.
- Read the README file for the GitLab Services Base project
- Note the location of the Metrics Dashboards
- Note the location of the CI Pipelines for the infrastructure components
- Note the location of the CI Pipelines for the application components
For more detailed information on the setup, see the version.gitlab.com runbooks.
The application is deployed using Auto DevOps from the design-gitlab-com project. It uses a Review/Production scheme with no staging deployment. If deployment problems are suspected, check for failed or incomplete jobs, and check the Environments page to make sure everything looks reasonable.
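If you want to check this from a terminal instead of the UI, the GitLab REST API can list recent pipelines and environments. A minimal sketch, assuming a personal access token with `read_api` scope (`<your-token>` is a placeholder):

```
# List the five most recent pipelines for the project
curl --header "PRIVATE-TOKEN: <your-token>" \
  "https://gitlab.com/api/v4/projects/gitlab-org%2Fgitlab-services%2Fdesign.gitlab.com/pipelines?per_page=5"

# List the project's environments (review apps and production)
curl --header "PRIVATE-TOKEN: <your-token>" \
  "https://gitlab.com/api/v4/projects/gitlab-org%2Fgitlab-services%2Fdesign.gitlab.com/environments"
```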
The production deployment of the design.gitlab.com application is in the `design-prod` GCP project. The components to be aware of are:
- The Kubernetes cluster `design-prod-gke` and its node pool
- The load balancer (provisioned by the k8s ingress; see the check after this list)
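A quick way to confirm the ingress-provisioned load balancer exists and has an external IP (a sketch; run from Cloud Shell with the context set to `design-prod-gke`):

```
kubectl get svc -n gitlab-managed-apps | grep LoadBalancer
```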
The review apps are in the `design-staging` GCP project.
This project and its contents are managed by the GitLab Services project. Any infrastructure changes to the environment or K8s cluster should be made as an MR there; changes will be applied via CI jobs when the MR is merged. `design-prod` and `design-staging` are represented as Environments in that project.
The resources in the cluster, including the KAS agent, namespaces, and service account roles and permissions, are all configured from the Cluster management project.
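To spot-check that these managed resources are actually present, something like the following works from Cloud Shell. This is a sketch: the `gitlab-agent` namespace is an assumption, so use whatever the cluster management project actually configures.

```
# Confirm the GitLab agent (agentk) pods are running
kubectl get pods -n gitlab-agent

# List the namespaces and GitLab-related service accounts it manages
kubectl get namespaces
kubectl get serviceaccounts --all-namespaces | grep gitlab
```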
Monitoring is currently limited to Pingdom alerts.
Note: The Kubernetes endpoint is protected, so kubectl commands need to be run from Google Cloud Shell; they won't work from a workstation.
Switch contexts to the `design-prod-gke` cluster in the `design-prod` project.
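From Cloud Shell, the context switch looks roughly like this (the zone is an assumption; check the GKE console for the cluster's actual location):

```
gcloud container clusters get-credentials design-prod-gke \
  --project design-prod --zone <cluster-zone>

# Confirm the active context
kubectl config current-context
```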
Make sure there is at least one ingress controller pod, and that it hasn't been restarting. Note the age in the last field.
```
$ kubectl get pods -n gitlab-managed-apps | grep ingress-nginx-ingress-controller
ingress-nginx-ingress-controller-85ff56cfdd-cjd9b   1/1   Running   0   20h
ingress-nginx-ingress-controller-85ff56cfdd-fmqnh   1/1   Running   0   20h
ingress-nginx-ingress-controller-85ff56cfdd-tg77w   1/1   Running   0   42h
```
Check for Events:
```
kubectl describe deployment -n gitlab-managed-apps ingress-nginx-ingress-controller
```
The bottom of this output will show health check failures, pod migrations and restarts, and other events which might affect availability of the ingress. `Events: <none>` means the problem is probably elsewhere.
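If the events do implicate the ingress, the controller logs are the next place to look. A sketch, where the label selector is an assumption (confirm it with `kubectl get pods --show-labels -n gitlab-managed-apps`):

```
kubectl logs -n gitlab-managed-apps \
  -l app=nginx-ingress,component=controller --tail=100
```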
Certificates are managed by the `cert-manager` pod, installed via CI from the cluster management project. It is configured with this helmfile.
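cert-manager exposes its state through `Certificate` resources, so expiry or renewal problems can be spotted like this (a sketch; the certificate name and namespace below are placeholders):

```
# List certificates and whether they are Ready
kubectl get certificates --all-namespaces

# Inspect events for a specific certificate
kubectl describe certificate <cert-name> -n <namespace>
```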
Overall node usage can be checked like this:
```
$ kubectl top nodes
NAME                                            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
gke-design-prod-gke-node-pool-0-58e08e59-popj   132m         1%     3183Mi          11%
gke-design-prod-gke-node-pool-0-870e91bf-n1jh   125m         1%     2534Mi          9%
gke-design-prod-gke-node-pool-0-b4ecf86b-qhl6   178m         2%     1705Mi          6%
```
Pods can be checked like this:
```
$ kubectl top pods -n design-prod
NAME                          CPU(cores)   MEMORY(bytes)
production-5f476b4f58-6jlb4   1m           10Mi
production-5f476b4f58-gjb7b   1m           10Mi
production-5f476b4f58-ql6jv   1m           10Mi
```
Currently, the only alerting is the Pingdom blackbox alerts. This is the same as what was set up in the previous AWS environment, but it probably needs to be improved. The preference is to use built-in GitLab functionality where possible.
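In the meantime, a manual blackbox-style check of what Pingdom probes is trivial from any shell (a sketch):

```
# Expect a 200 from the production site
curl -s -o /dev/null -w "%{http_code}\n" https://design.gitlab.com/
```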