diff --git a/.github/actions/spelling/expect.txt b/.github/actions/spelling/expect.txt index a00fa2d8..0438b4df 100644 --- a/.github/actions/spelling/expect.txt +++ b/.github/actions/spelling/expect.txt @@ -1,8 +1,13 @@ Bance cdef Enda +jfmd +jfrou +jfrt +pvc Jacomb Qube +statefulset toset totp TTLs diff --git a/source/aks/patching-aks-apps.html.md.erb b/source/aks/patching-aks-apps.html.md.erb index 86b2978b..194d43d1 100644 --- a/source/aks/patching-aks-apps.html.md.erb +++ b/source/aks/patching-aks-apps.html.md.erb @@ -168,11 +168,12 @@ AAD Pod Identity has been deprecated hence there is not going to be further patc ## Application specific examples -- [Traefik](patching-traefik.html) -- [KEDA](patching-keda-example.html) +- [Artifactory](patching-artifactory.html) +- [dns-node-cache](patching-flux-example.html) - [Dynatrace Operator](patching-dynatrace.html) - [Flux](patching-flux-example.html) -- [dns-node-cache](patching-flux-example.html) -- [pact-broker](patching-pact-broker-example.html) +- [KEDA](patching-keda-example.html) - [kured](patching-kured-example.html) -- [Prometheus/Grafana](patching-prometheus-grafana-example.html) \ No newline at end of file +- [pact-broker](patching-pact-broker-example.html) +- [Prometheus/Grafana](patching-prometheus-grafana-example.html) +- [Traefik](patching-traefik.html) \ No newline at end of file diff --git a/source/aks/patching-artifactory.html.md.erb b/source/aks/patching-artifactory.html.md.erb index 08f0a1ea..c70c8747 100644 --- a/source/aks/patching-artifactory.html.md.erb +++ b/source/aks/patching-artifactory.html.md.erb @@ -7,35 +7,62 @@ review_in: 12 months # <%= current_page.data.title %> -Artifactory is utilized in the context of managing and updating software dependencies and configurations within a Kubernetes environment. +Artifactory is utilized in the context of managing and updating software dependencies and configurations within a Kubernetes environment. These dependencies are cached and are utilized by teams if upstream issues arise. By bringing the cached dependencies closer to build agents, fewer teams are impacted. **Artifactory can be disabled** by following the [example pr](https://github.com/hmcts/cnp-jenkins-library/pull/1266/files) as a temporary bypass to unblock teams. -The steps below will show you how to check and update **self-hosted version** of Artifactory being used. This process involves accessing [Artifactory's website](https://jfrog.com/help/r/jfrog-release-information/artifactory-release-notes) to determine the latest version available. +There are [3 additional files](https://github.com/hmcts/cnp-flux-config/tree/master/apps/artifactory/ptl-intsvc) within artifactory: `admin-pw.yaml`, `join-key.yaml` and `master-key.yaml`. These files are necessary for Artifactory to operate, `admin-pw.yaml` is used to login to artifactory, `join-key.yaml` and `master-key.yaml` are used internally and are needed for the startup of Artifactory. -Before applying the update on PTL AKS Cluster, a testing process is initiated and is conducted on PTLSBOX AKS Cluster. This involves raising a PR to the [Artifactory repo](https://github.com/hmcts/cnp-flux-config) containing configuration files, creating 'artifactory-sbox.yaml' file and editing 'kustomization.yaml' files with your changes as seen in the example PR below. In addition, checking Artifactory pods are healthy and access to [Artifcatory](https://artifactory.sandbox.platform.hmcts.net/ui/repos/tree/General/hmcts) is available. +This documentation will guide you through checking and updating the **self-hosted version** of Artifactory being used. This process includes accessing [Artifactory's website](https://jfrog.com/help/r/jfrog-release-information/artifactory-release-notes) to determine the latest version available, verifying that the Artifactory Helm Release has accepted the latest changes, and ensuring the pods are healthy and that [Artifcatory](https://artifactory.sandbox.platform.hmcts.net/ui/repos/tree/General/hmcts) is accessible. -[Example PR](https://github.com/hmcts/cnp-flux-config/pull/32185/files) of PTLSBOX changes. +**Note:** This activity is disruptive and will cause artifactory to stop working as the pods are redeployed in a statefulset. To minimize disruptions to teams the following paths can be taken: -[Example PR](https://github.com/hmcts/cnp-flux-config/pull/32191/files) of PTL changes after testing on PTLSBOX. +- A: Complete the patch early in the morning, similar to a jenkins upgrade, to avoid disruptions. +- B: Temporarily disable Artifactory, following the [example pr](https://github.com/hmcts/cnp-jenkins-library/pull/1266/files), **before** upgrading. -If any issues arise during the update process, troubleshooting steps are provided. These steps involve identifying and resolving any errors or conflicts that may occur, such as missing namespaces or configuration discrepancies. +If any issues arise during the patching process, [troubleshooting steps](#troubleshooting) are provided at the bottom of this page. These steps involve identifying and resolving any errors or conflicts that may occur, such as missing namespaces or configuration discrepancies. ## Artifactory Patching Process ### 1. Updating CFT PTLSBOX - • [Connect to PTLSBOX cluster](https://hmcts.github.io/cloud-native-platform/troubleshooting/index.html#connecting-to-aks-clusters) + • Create a new file named `artifactory-sbox.yaml` in the [artifactory directory](https://github.com/hmcts/cnp-flux-config/tree/master/apps/artifactory/artifactory "apps/artifactory/artifactory"), add the following and update to the latest version. + + ```yaml + apiVersion: helm.toolkit.fluxcd.io/v2 + kind: HelmRelease + metadata: + name: artifactory-oss + namespace: artifactory + spec: + chart: + spec: + chart: artifactory-oss + version: 107.90.9 <- update to latest version + ``` + + **Note:** The layout of Artifactory versioning includes 10 at the beginning. This version would be `7.90.9` - • Make the changes to the 'kustomization.yaml' and create a new file named 'artifactory-sbox.yaml' following the [Example PR](https://github.com/hmcts/cnp-flux-config/pull/32185/files) and updating the latest version. + • Make the changes to the bottom of `kustomization.yaml` file found in [sbox-intsvc directory](https://github.com/hmcts/cnp-flux-config/blob/master/apps/artifactory/sbox-intsvc/base/kustomization.yaml "apps/artifactory/sbox-intsvc/base/kustomization.yaml"). - • **Note** the file path of 'kustomization.yaml' and 'artifactory-sbox.yaml' in the PR. + ```yaml + ... + + patchesStrategicMerge: + - ../../artifactory/artifactory-sbox.yaml + ``` - • Raise a PR similar to the [Example PR](https://github.com/hmcts/cnp-flux-config/pull/32185/files) and get your PR approved and merged. + • Submit your PR and ensure it gets reviewed and merged. ### 2. Check new version on CFT PTLSBOX + + • [Connect to CFT PTLSBOX cluster](https://hmcts.github.io/cloud-native-platform/troubleshooting/index.html#connecting-to-aks-clusters) - • Run the following commands ensuring pods are **healthy** and has applied new version. **Note** version number found at "artifactory-oss". + • Run the command below and ensure the Helm Release is showing **True** and is running the latest chart version. - • Ensure you can access [Artifacory](https://artifactory.sandbox.platform.hmcts.net/ui/repos/tree/General/hmcts) website. **Note** you will need to be connected to F5 VPN to access this link. + ```command + kubectl get hr -n artifactory + ``` + + • Run the following commands ensuring the pod is **healthy** and has applied new version. **Note:** version number found at "artifactory-oss". ```command kubectl get pods -n artifactory @@ -45,28 +72,155 @@ If any issues arise during the update process, troubleshooting steps are provide kubectl describe pods -n artifactory artifactory-oss-0 ``` + • For further confirmation, ensure that the grep command below returns the chart version that you have applied. + + ```command + kubectl describe pods -n artifactory artifactory-oss-0 | grep chart=artifactory-107.90.9 + ``` + + • Ensure you can access [Artifacory](https://artifactory.sandbox.platform.hmcts.net/ui/repos/treeGeneral/hmcts) website. **Note:** you will need to be connected to F5 VPN to access this link. + ### 3. Update CFT PTL - - • [Connect to PTL cluster](https://hmcts.github.io/cloud-native-platform/troubleshooting/index.html#connecting-to-aks-clusters) - • Remove 'artifactory-sbox.yaml' file and its corresponding code from kustomization.yaml. + • Remove `artifactory-sbox.yaml` file and its corresponding code from `kustomization.yaml` made in step 1. - • Update the version in 'artifactory.yaml' to latest version eg, (7.84.12). + • Update the version in `artifactory.yaml` to latest version eg, (7.90.9) <- ensure `10` is in front of the version. - ```command - containers: - - image: docker.bintray.io/jfrog/artifactory-oss:7.84.12 - name: artifactory-oss + ```yaml + ... + + chart: + spec: + chart: artifactory-oss + version: 107.90.9 + + ... ``` - • Raise a PR similar to the [Example](https://github.com/hmcts/cnp-flux-config/pull/32191/files) and get your PR approved and merged. + • Submit your PR and ensure it gets reviewed and merged. - • Repeat [step 2](#2-check-new-version-on-cft-ptlsbox) checking the new version on CFT PTL ensuring the pods are healthy. + • Repeat [step 2](#2-check-new-version-on-cft-ptlsbox) for **'CFT PTL cluster'** checking the Helm release is on the new version and ensuring the pods are healthy. -### Related Links +## Related Links - [GitHub repo](https://github.com/hmcts/cnp-flux-config) + [cnp-flux repo](https://github.com/hmcts/cnp-flux-config) - [Artifactory](https://artifactory.sandbox.platform.hmcts.net/ui/repos/tree/General/hmcts) **Note** you will need to be connected to F5 VPN to access this link. + [Artifactory](https://artifactory.sandbox.platform.hmcts.net/ui/repos/tree/General/hmcts) **Note:** you will need to be connected to F5 VPN to access this link. [Artifactory Release Information](https://jfrog.com/help/r/jfrog-release-information/artifactory-release-notes) + +## Troubleshooting + +### How to view logs of artifactory + +To view the logs while artifactory is booting up / running and to spot any errors, enter the following commands below. **Note:** The default logs are from the router container. + +```command +kubectl logs -n artifactory artifactory-oss-0 +``` + +- **Tip:** To keep watching logs use `-f` at the end. + +• To view all containers in a pod and to get container specific logs enter the following commands. + +```command +kubectl get pods -n artifactory artifactory-oss-0 -o jsonpath='{.spec.containers[*].name}' +``` + +```commands +kubectl logs -n artifactory artifactory-oss-0 -c +``` + +- **Tip:** The main containers to view are `router` and `artifactory`, here you will see most errors likely to occur. + +### Missing required services + +```json +[jfrou] [WARN] [217db729f7674ef0] [local_topology_helper.go:68] [main] [] - Missing required services: [jfmd] +``` + +• You will see this error upon startup of artifactory as all services startup, if they do not clear after 60 - 90 seconds investigate further. This often occurs when one of the containers is not healthy, usually the router service. These errors most likely point to Master / Join key issues as the router cannot establish connection with the internal DB. + +### Database errors + +```json +[jfrt ] [ERROR] [ee4c12af0700e30b] [o.a.s.d.i.DbConnectionUtils:86] [Catalina-utility-2 ] - Cannot start the application with a database other than PostgreSQL. For more information, see JFrog documentation +``` + +• This error is unlikely to appear, if it does it means that the artifactory configuration has not successfully applied and is using other DB than the one specified. Restart artifactory and watch the startup logs of the container and ensure this error does not appear. + +To watch the containers enter the command below. + +```command +kubectl logs -n artifactory -c artifactory artifactory-oss-0 +``` + +### Join / Master key errors + +Join / Master key errors **can be a false error** and point to a separate issue. Investigate the `artifactory` container first before attempting to look at other areas of artifactory as this can be a bootstrapping error upon startup. Artifactory masks most issues with `join-key` or `master-key` so it is a good case to check artifactory container first. More info can be found in this [GitHub comment](https://github.com/jfrog/charts/issues/1917#issuecomment-2342990400) where a more in depth explanation can be found, but is not relevant to this documentation. + +```json +[jfrou] Cluster join: Access Service ping failed, will retry. Error: cluster join: error from service registry on ping: +``` + +• Join key issues are found before master key issues arise, these are shown upon startup as the internal database boots and usually disappear after 30 seconds. If `join-key missing` is still showing investigate containers themselves if the key has been picked up by artifactory correctly. + +```json +[jfmd ] [ERROR] - Failed resolving master key: failed resolving 'shared.security.masterKey' key; file does not exist: /opt/jfrog/artifactory/var/etc/security/master.key +``` + +• Master key is also used for the database, for internal communication purposes. If `master-key` issues are still showing ensure artifactory has these in the containers. + +#### Check keys in container + +```command +kubectl exec -it artifactory-oss-0 -n artifactory -c router -- /bin/sh +``` + +• Once inside the container cd to this directory, list the files and ensure `join.key` and `master.key` are there. + +```command +cd var/etc/security/ +``` + +```command +ls -la +``` + +#### How to fix + +After investigating artifactory containers further and if the key issues are is persisting, restart the artifactory service by deleting the `pvc`, `statefulset` and the `pod` of artifactory namespace. If the service is still not restarting successfully, then a full rebuild of the secret keys is most likely needed. + +#### Recreating new keys + +This will cover creating a new key, sops encrypt new key, rename file and update the [cnp flux repo](https://github.com/hmcts/cnp-flux-config) in a new pr with the new key. **Note:** The following example uses `master-key` replace this with `join-key` when creating that key. + +• Before we start we need a random string to be used by our key. Copy its output as we need it later. + +```command +openssl rand -hex 32 +``` + +```command +kubectl create secret generic master-key -n artifactory --from-literal=master-key='' --type=Opaque -o yaml --dry-run=client > file.enc.yaml +``` + +• **Note:** Ensure you have the correct key vault sops link, search for your cluster in [azure portal](https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.KeyVault%2Fvaults) and copy the sops key uri. The Sops key uri is found in objects -> keys -> sops-key and is under `Key Identifier`. + +```command +sops --encrypt --azure-kv --encrypted-regex "^(data|stringData)$" --in-place file.enc.yaml +``` + +• Rename the file so that artifactory can use the new key. + +```command +mv file.enc.yaml master-key.yaml +``` + +• Follow the steps again for `join-key.yaml`. + +• Submit your PR with the new keys and ensure it gets reviewed and merged. + +• Restart artifactory by deleting the `pvc`, `statefulset` and the `pod` of artifactory namespace. Monitor artifactory as it boots up and ensure the errors are gone. + +• If the new set of keys and restart of artifactory has not started up correctly reach out to a member of your team for assistance.