awxbackups fail with => The task includes an option with an undefined variable #1902

Open
3 tasks done
salanisor opened this issue Jun 19, 2024 · 10 comments

Comments

@salanisor

salanisor commented Jun 19, 2024

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.

Bug Summary

Backups were working fine, then suddenly started failing with the error below, similar to issue 1577.

The problem first appeared on v2.17.0; I upgraded to v2.18.0 to see if the issue would go away.

awx-operator.v2.18.0                AWX                              2.18.0    awx-operator.v2.17.0

AWX Operator version

2.18.0

AWX version

v1beta1

Kubernetes platform

openshift

Kubernetes/Platform version

4.13.17

Modifications

no

Steps to reproduce

Apply the following YAML:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
  name: awxbackup-test-sandbox-2
  namespace: awx
spec:
  deployment_name: awx-infra
  backup_pvc: awx-sandbox-backup-claim
  no_log: false

Expected results

Backups should succeed. When testing locally, I can see that the task returns the right field, although I probably should have used the same Ansible version as the operator.

cat awx.json | jq '.this_awx.resources[0].status.postgresConfigurationSecret'
"awx-sandbox-postgres-configuration"
ansible-playbook --version
ansible-playbook [core 2.17.0]
  config file = None
  configured module search path = ['/Users/x/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /opt/homebrew/Cellar/ansible/10.0.1/libexec/lib/python3.12/site-packages/ansible
  ansible collection location = /Users/x/.ansible/collections:/usr/share/ansible/collections
  executable location = /opt/homebrew/bin/ansible-playbook
  python version = 3.12.3 (main, Apr  9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] (/opt/homebrew/Cellar/ansible/10.0.1/libexec/bin/python)
  jinja version = 3.1.4
  libyaml = True
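
The jq check above can be mirrored in a short Python sketch that also reports the case the operator hits, an empty `resources` list. It assumes the dump has the same shape as the jq path (`this_awx.resources[0].status...`); the function name `postgres_secret_name` and the inline sample are hypothetical.

```python
import json


def postgres_secret_name(dump: dict) -> str:
    """Return status.postgresConfigurationSecret of the first AWX resource.

    Assumes `dump` has the shape {"this_awx": {"resources": [...]}} used by
    the jq path above. Raises LookupError when the list is empty.
    """
    resources = dump.get("this_awx", {}).get("resources", [])
    if not resources:
        raise LookupError("this_awx.resources is empty: no AWX object was returned")
    return resources[0]["status"]["postgresConfigurationSecret"]


# Inline stand-in for the contents of awx.json (assumed shape).
sample = json.loads(
    '{"this_awx": {"resources": [{"status": '
    '{"postgresConfigurationSecret": "awx-sandbox-postgres-configuration"}}]}}'
)
print(postgres_secret_name(sample))
```

An empty list raises a readable `LookupError` instead of an index error, which makes it easier to tell "no object found" apart from "field missing".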

Actual results

The backup fails with the following error:

TASK [backup : Get PostgreSQL configuration] ***********************************
task path: /opt/ansible/roles/backup/tasks/postgres.yml:3
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: list object has no element 0. list object has no element 0\n\nThe error appears to be in '/opt/ansible/roles/backup/tasks/postgres.yml': line 3, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Get PostgreSQL configuration\n  ^ here\n"}

PLAY RECAP *********************************************************************
localhost                  : ok=16   changed=2    unreachable=0    failed=1    skipped=8    rescued=0    ignored=0

"job":"8280544932261614561","name":"awxbackup-test-sandbox-2","namespace":"awx","error":"exit status 2","stacktrace":"github.com/operator-framework/ansible-operator-plugins/internal/ansible/runner.(*runner).Run.func1\n\tansible-operator-plugins/internal/ansible/runner/runner.go:269"}
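
Operator log entries like this one embed the Ansible output as a JSON string with ANSI color codes, which makes them hard to read. A minimal sketch of cleaning such text follows; `strip_ansi` is a hypothetical helper, not part of the operator, and the sample string is a shortened stand-in for the real log.

```python
import json
import re

# Matches ANSI SGR color sequences such as ESC[0;31m and ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove color codes and normalize CRLF line endings."""
    return ANSI_SGR.sub("", text).replace("\r\n", "\n")


# A JSON string literal as it appears in the log: \u001b and \r\n are still escaped.
encoded = '"\\u001b[0;31mfatal: [localhost]: FAILED!\\u001b[0m\\r\\n\\u001b[0;31mfailed=1\\u001b[0m"'
decoded = json.loads(encoded)  # turns the escapes into real control characters
print(strip_ansi(decoded))
```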

Additional information

The AWX definition is below. This setup was working fine, and I had run a few recovery tests in this namespace.

We had installed OCP Service Mesh and thought it might be the culprit, but it was not: we removed it and the issue persists.

oc get awxbackups
NAME                   AGE
awxbackup-2024         43d
awxbackup-2024-06-03   14d

apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  annotations:
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
    argocd.argoproj.io/sync-wave: "200"
  name: awx-sandbox
  namespace: awx
spec:
  postgres_keepalives_count: 5
  postgres_keepalives_idle: 5
  ee_resource_requirements:
    requests:
      cpu: 50m
      memory: 64M
  create_preload_data: true
  garbage_collect_secrets: false
  loadbalancer_port: 80
  no_log: true
  task_resource_requirements:
    requests:
      cpu: 50m
      memory: 128M
  image_pull_policy: IfNotPresent
  loadbalancer_ip: ''
  projects_storage_size: 8Gi
  auto_upgrade: true
  task_privileged: false
  postgres_keepalives: true
  postgres_keepalives_interval: 5
  ipv6_disabled: false
  projects_storage_access_mode: ReadWriteMany
  set_self_labels: true
  web_resource_requirements:
    requests:
      cpu: 50m
      memory: 128M
  projects_persistence: false
  replicas: 1
  admin_user: admin
  loadbalancer_protocol: http
  
  # Set the default admin password secret 
  admin_password_secret: awx-admin-secret
  
  # Give route information for this instance 
  service_type: ClusterIP
  ingress_type: route
  route_host: awx-sandbox.apps.help.example.com
  route_tls_termination_mechanism: Edge
  hostname: awx-sandbox.apps.help.example.com
  
  # Configure this instance to trust OpenLDAP for access checks and for general lookups 
  ldap_cacert_secret: awx-ca-secret
  bundle_cacert_secret: awx-ca-secret
  
  # Configure this instance to successfully perform Kerberos lookups in domain
  # First, create the ConfigMap as part of the kustomization.yaml 
  # Second, turn the ConfigMap into a volume 
  # Third, mount the volume to the Pod Members 
  extra_volumes: |
    - name: awx-corp-krb5
      configMap:
        defaultMode: 420
        items:
        - key: krb5.conf
          path: krb5.conf
        name: awx-corp-krb5-conf-configmap
  web_extra_volume_mounts: |
    - name: awx-corp-krb5
      mountPath: /etc/krb5.conf
      subPath: krb5.conf
  task_extra_volume_mounts: |
    - name: awx-corp-krb5
      mountPath: /etc/krb5.conf
      subPath: krb5.conf
  ee_extra_volume_mounts: |
    - name: awx-corp-krb5
      mountPath: /etc/krb5.conf
      subPath: krb5.conf

Operator Logs

log.tar.gz

@D1StrX

D1StrX commented Jun 26, 2024

Yup, same as in here: #1518

@djyasin
Member

djyasin commented Jul 3, 2024

Hello @salanisor, is this the same issue described in #1518?

@fritz0011

Same issue here: Rancher + k8s 1.28.9 + awx-operator 2.19.0/1

--------------------------- Ansible Task StdOut -------------------------------

TASK [Get PostgreSQL configuration] ********************************
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: list object has no element 0. list object has no element 0\n\nThe error appears to be in '/opt/ansible/roles/backup/tasks/postgres.yml': line 3, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Get PostgreSQL configuration\n ^ here\n"}

@fritz0011

root@mgmtrke2m01:~# kubectl get awx -o yaml -n awx-prod
apiVersion: v1
items:
- apiVersion: awx.ansible.com/v1beta1
  kind: AWX
  metadata:
    creationTimestamp: "2024-06-24T14:35:32Z"
    generation: 1
    labels:
      app.kubernetes.io/component: awx
      app.kubernetes.io/managed-by: awx-operator
      app.kubernetes.io/operator-version: 2.19.1
      app.kubernetes.io/part-of: awx
    name: awx
    namespace: awx-prod
  ...
  ...
    image: quay.io/ansible/awx:24.6.1
    postgresConfigurationSecret: awx-db-descret
    secretKeySecret: awx-secret-key
    version: 24.6.1

root@mgmtrke2m01:~# kubectl get secret awx-db-descret -n awx-prod
NAME             TYPE     DATA   AGE
awx-db-descret   Opaque   6      14d

Could this be related to the fact that the secret uses base64-encoded strings instead of cleartext?
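
On the base64 question: Kubernetes stores every value under a Secret's `data` field base64-encoded, so encoded values in `kubectl get secret -o yaml` output are normal on their own. A small sketch of decoding such a mapping for inspection; the key names and values below are made-up samples, not taken from the secret in this thread, and `decode_secret_data` is a hypothetical helper.

```python
import base64


def decode_secret_data(data: dict) -> dict:
    """Decode the base64 values of a Secret's `data` mapping to text."""
    return {key: base64.b64decode(value).decode("utf-8") for key, value in data.items()}


# Hypothetical sample values, encoded the way the API server returns them.
sample_data = {
    "host": base64.b64encode(b"awx-postgres").decode("ascii"),
    "port": base64.b64encode(b"5432").decode("ascii"),
}
print(decode_secret_data(sample_data))
```

If a decoded value still looks like base64, the secret may have been encoded twice when it was created, which would be worth checking separately.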

@djyasin
Member

djyasin commented Jul 24, 2024

There is a workaround described in #1518.

We are still investigating this issue and may have more information soon!

@fritz0011

fritz0011 commented Jul 24, 2024

@djyasin, I just did a bit of troubleshooting.
According to https://github.com/ansible/awx-operator/blob/devel/roles/backup/tasks/postgres.yml, this line may trigger the error:
name: "{{ this_awx['resources'][0]['status']['postgresConfigurationSecret'] }}"

That traces back to this task:

- name: Look up details for this deployment
  k8s_info:
    api_version: "{{ api_version }}"
    kind: "AWX"
    name: "{{ deployment_name }}"
    namespace: "{{ ansible_operator_meta.namespace }}"
  register: this_awx
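
The expression indexes `resources[0]` directly, so it fails with "list object has no element 0" whenever the lookup registers an empty `resources` list, for example when no AWX object named `deployment_name` exists in the namespace. A minimal Python sketch of that failure mode and a guard; `find_awx`, `secret_name_or_none`, and the sample object are hypothetical stand-ins for the `k8s_info` lookup, not operator code.

```python
def find_awx(objects: list, name: str, namespace: str) -> dict:
    """Mimic the registered result: matching objects go into a `resources` list."""
    matches = [
        obj for obj in objects
        if obj["metadata"]["name"] == name and obj["metadata"]["namespace"] == namespace
    ]
    return {"resources": matches}


def secret_name_or_none(this_awx: dict):
    """Guarded version of this_awx['resources'][0]['status'][...]."""
    resources = this_awx.get("resources", [])
    if not resources:
        return None  # nothing matched, so there is no element 0 to read
    return resources[0].get("status", {}).get("postgresConfigurationSecret")


# Sample object modeled on the kubectl output earlier in the thread.
cluster = [{
    "metadata": {"name": "awx", "namespace": "awx-prod"},
    "status": {"postgresConfigurationSecret": "awx-db-descret"},
}]
print(secret_name_or_none(find_awx(cluster, "awx", "awx-prod")))
print(secret_name_or_none(find_awx(cluster, "other-name", "awx-prod")))
```

Comparing `deployment_name` in the AWXBackup spec against the `metadata.name` of the AWX object in the same namespace is a quick way to rule this case in or out.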

@fritz0011

As of today (25.07):

TASK [Create new AWXBackup resource and wait for complete] *********************
changed: [localhost] => {"changed": true, "duration": 50, "method": "create", "result": {"apiVersion": "awx.ansible.com/v1beta1", "kind": "AWXBackup", "metadata": {"creationTimestamp": "2024-07-25T08:30:58Z", "finalizers": ["awx.ansible.com/finalizer"], "generation": 1, "labels": {"app.kubernetes.io/component": "awx", "app.kubernetes.io/managed-by": "awx-operator", "app.kubernetes.io/operator-version": "2.19.1", "app.kubernetes.io/part-of": "awxbackup-2024-07-25-08-30-57"}, "managedFields": [{"apiVersion": "awx.ansible.com/v1beta1", "fieldsType": "FieldsV1", "fieldsV1": {"f:metadata": {"f:finalizers": {".": {}, "v:\"awx.ansible.com/finalizer\"": {}}}}, "manager": "ansible-operator", "operation": "Update", "time": "2024-07-25T08:30:58Z"}, {"apiVersion": "awx.ansible.com/v1beta1", "fieldsType": "FieldsV1", "fieldsV1": {"f:metadata": {"f:labels": {".": {}, "f:app.kubernetes.io/component": {}, "f:app.kubernetes.io/managed-by": {}, "f:app.kubernetes.io/operator-version": {}, "f:app.kubernetes.io/part-of": {}}}, "f:spec": {".": {}, "f:backup_pvc": {}, "f:clean_backup_on_delete": {}, "f:deployment_name": {}, "f:image_pull_policy": {}, "f:no_log": {}, "f:postgres_image": {}, "f:postgres_image_version": {}, "f:set_self_labels": {}}}, "manager": "OpenAPI-Generator", "operation": "Update", "time": "2024-07-25T08:31:00Z"}, {"apiVersion": "awx.ansible.com/v1beta1", "fieldsType": "FieldsV1", "fieldsV1": {"f:status": {"f:backupClaim": {}, "f:backupDirectory": {}}}, "manager": "OpenAPI-Generator", "operation": "Update", "subresource": "status", "time": "2024-07-25T08:31:42Z"}, {"apiVersion": "awx.ansible.com/v1beta1", "fieldsType": "FieldsV1", "fieldsV1": {"f:status": {".": {}, "f:conditions": {}}}, "manager": "ansible-operator", "operation": "Update", "subresource": "status", "time": "2024-07-25T08:31:46Z"}], "name": "awxbackup-2024-07-25-08-30-57", "namespace": "awx-prod", "resourceVersion": "44044680", "uid": "c98de577-a319-43c3-b734-43d6bf6d1f8f"}, "spec": {"backup_pvc": 
"backupawx", "clean_backup_on_delete": true, "deployment_name": "awx", "image_pull_policy": "IfNotPresent", "no_log": false, "postgres_image": "postgres", "postgres_image_version": "14", "set_self_labels": true}, "status": {"backupClaim": "backupawx", "backupDirectory": "/backups/tower-openshift-backup-2024-07-25-083121", "conditions": [{"lastTransitionTime": "2024-07-25T08:31:42Z", "reason": "", "status": "False", "type": "Failure"}, {"lastTransitionTime": "2024-07-25T08:30:58Z", "reason": "Successful", "status": "True", "type": "Running"}, {"lastTransitionTime": "2024-07-25T08:31:46Z", "reason": "Successful", "status": "True", "type": "Successful"}]}}}

backup creation: Successful
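
The `status.conditions` list in the result above is what marks the backup as successful: the `Successful` condition is `True` while `Failure` is `False`. A sketch of reading that list programmatically, assuming the condition shape shown in the log; `backup_succeeded` is a hypothetical helper.

```python
def backup_succeeded(status: dict) -> bool:
    """True when a 'Successful' condition is "True" and no 'Failure' is "True"."""
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
    return conditions.get("Successful") == "True" and conditions.get("Failure") != "True"


# Status block copied from the task result above.
status = {
    "backupClaim": "backupawx",
    "backupDirectory": "/backups/tower-openshift-backup-2024-07-25-083121",
    "conditions": [
        {"lastTransitionTime": "2024-07-25T08:31:42Z", "reason": "", "status": "False", "type": "Failure"},
        {"lastTransitionTime": "2024-07-25T08:30:58Z", "reason": "Successful", "status": "True", "type": "Running"},
        {"lastTransitionTime": "2024-07-25T08:31:46Z", "reason": "Successful", "status": "True", "type": "Successful"},
    ],
}
print(backup_succeeded(status))
```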

@bigtree21cn

In reply to @fritz0011's comment above (backup creation: Successful):

Facing the same issue. @fritz0011, could you share how you fixed this?

@fritz0011

fritz0011 commented Aug 2, 2024

@bigtree21cn

awx-operator 2.19.1

I am using this approach: back up AWX from within AWX as a job template.
https://github.com/kurokobo/awx-on-k3s
Also important: https://github.com/kurokobo/awx-on-k3s/tree/main/containergroup#create-container-group

AWX is deployed to this namespace: awx-prod
Job template extra vars:

awxbackup_namespace: awx-prod
awxbackup_keep_days: 10
awxbackup_spec:
  deployment_name: awx
  clean_backup_on_delete: true
  backup_pvc: backupawx
  postgres_image: postgres
  postgres_image_version: '14'
  no_log: false
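
For reference, extra vars like these map onto an AWXBackup manifest of the kind shown in the successful run earlier in the thread. A rough sketch of that mapping, assuming the timestamped name format seen in that log (`awxbackup-2024-07-25-08-30-57`); `build_awxbackup` is a hypothetical helper and is not how the linked playbook is implemented.

```python
from datetime import datetime, timezone


def build_awxbackup(namespace: str, spec: dict, now: datetime) -> dict:
    """Build an AWXBackup manifest dict with a timestamped name (assumed format)."""
    name = "awxbackup-" + now.strftime("%Y-%m-%d-%H-%M-%S")
    return {
        "apiVersion": "awx.ansible.com/v1beta1",
        "kind": "AWXBackup",
        "metadata": {"name": name, "namespace": namespace},
        "spec": dict(spec),  # copy so the caller's extra vars are not mutated
    }


extra_vars = {
    "awxbackup_namespace": "awx-prod",
    "awxbackup_spec": {
        "deployment_name": "awx",
        "clean_backup_on_delete": True,
        "backup_pvc": "backupawx",
        "postgres_image": "postgres",
        "postgres_image_version": "14",
        "no_log": False,
    },
}
manifest = build_awxbackup(
    extra_vars["awxbackup_namespace"],
    extra_vars["awxbackup_spec"],
    datetime(2024, 7, 25, 8, 30, 57, tzinfo=timezone.utc),
)
print(manifest["metadata"]["name"])
```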

@salanisor
Author

salanisor commented Aug 28, 2024

> Hello @salanisor, is this the same issue described in #1518?

Sorry for the late reply. I can't rule out that it's the same issue. However, even with the workaround provided by @fritz0011 and the solution in #1518, I still get the same error on OpenShift 4.13.17 with AWX Operator 2.19.1.
