
First helm release always succeeds and doesn't wait for all pods running #672

Open
FischlerA opened this issue Jan 26, 2021 · 26 comments

@FischlerA

Terraform, Provider, Kubernetes and Helm Versions

Terraform version: 0.14.4
Provider version: 2.0.2
Kubernetes version: AWS EKS 1.18
Helm version: 3

Affected Resource(s)

  • helm_release

Debug Output

https://gist.github.com/FischlerA/7930aff18d68a7b133ff22aadc021517

Steps to Reproduce

  1. terraform apply

Expected Behavior

The Helm deployment should fail, since the pod being deployed runs an image that will always fail (a private image which I can't share).

Actual Behavior

The first time the Helm release is deployed, it always succeeds after reaching the timeout (5 min); any further deployments fail as they are supposed to after reaching the timeout (5 min).

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@FischlerA FischlerA added the bug label Jan 26, 2021
@jrhouston
Contributor

jrhouston commented Jan 27, 2021

Thanks for opening @FischlerA. Did you try using the wait attribute? By default helm will not wait for all pods to become ready, just create the API resources.

@FischlerA
Author

> Thanks for opening @FischlerA. Did you try using the wait attribute? By default helm will not wait for all pods to become ready, just create the API resources.

Per the documentation, the wait attribute defaults to true. But even after explicitly setting it to true, the behavior didn't change, and the release was still reported as a success with a crashing pod.
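For reference, the explicit configuration being described looks roughly like this (a minimal sketch; the repository and chart names are hypothetical stand-ins, since the real image is private):

resource "helm_release" "failing_app" {
  name       = "failing-app"                 # hypothetical release name
  repository = "https://example.com/charts"  # hypothetical repository
  chart      = "failing-app"                 # chart whose pod always crashes

  wait    = true  # the documented default, set explicitly here
  timeout = 300   # seconds; 5 minutes, also the default
}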

@jrhouston
Contributor

Ah yep, you're right – I will try and reproduce this.

The provider itself doesn't do the waiting, it just passes along the wait flag to the install action in the helm package. Do you get the same issue if you do a helm install --wait with your chart?
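For comparison, the equivalent direct Helm invocation would be something along these lines (release and chart names are placeholders):

helm install failing-app example/failing-app --wait --timeout 5m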

@isabellemartin2

isabellemartin2 commented Feb 1, 2021

@jrhouston
We deployed the chart again using Helm directly with helm install --wait, and the behaviour was as expected:
after waiting for five minutes, we got the error message Error: timed out waiting for the condition.

@dinandr

dinandr commented Feb 11, 2021

I had the same experience when using helm_release in Terraform. If something goes wrong and the pod status stays at "Pending", "Error", "CreateContainer", or some other unusual status for a longer time, the Helm Terraform provider will not wait until the pods are running; it exits and reports the release as completed. However, the Terraform state was updated as failed.

@whiskeysierra

Saw the same behavior today when I deployed ingress-nginx and the very first job failed because it was rejected by another webhook. The terraform apply run waited for 5 minutes but reported success, even though not a single resource was created successfully. In fact, the only resource present was one job, and it was rejected.

@FischlerA
Author

@jrhouston were you able to take a look at this?

@jgreat

jgreat commented Mar 24, 2021

I'm running into this too. I pretty regularly have a successful terraform apply (everything shows successful and complete) and end up with helm_release resources that show ~ status = "failed" -> "deployed" on a second run.

@loreleimccollum-work

I think we are hitting this as well, but I'm not entirely sure. We are seeing helm_release pass on the first run (with wait = true) even though not all the pods come online, because of a Gatekeeper/PSP policy we have in the cluster. We are not sure how to get our helm_release to fail in that case.
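For reference, the attributes intended to surface such failures are wait and atomic; a minimal sketch of that combination, assuming the wait logic behaves as documented (names are hypothetical):

resource "helm_release" "guarded" {
  name  = "my-app"   # hypothetical
  chart = "my-app"   # hypothetical

  wait    = true   # wait for pods to become ready
  atomic  = true   # roll back / uninstall the release if the wait times out
  timeout = 300    # seconds
}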

@thecb4

thecb4 commented Apr 12, 2021

Hi all. I'm new to Terraform. I've had to split up my Terraform deployments and include a time_sleep because of this issue. Looking forward to an update here.
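A sketch of that workaround using the hashicorp/time provider (the duration and resource names here are illustrative):

resource "helm_release" "app" {
  name  = "my-app"   # hypothetical
  chart = "my-app"
}

# Pause after the release so dependent resources don't proceed
# before the pods have actually had time to come up.
resource "time_sleep" "wait_for_pods" {
  depends_on      = [helm_release.app]
  create_duration = "120s"
}

resource "null_resource" "next_step" {
  depends_on = [time_sleep.wait_for_pods]
}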

@descampsk

Same thing with a Helm Job and wait_for_jobs = true.
It waits for the timeout and then returns success.
If I re-apply, I get the following:

$ terraform apply -var image_tag=dev-ed4854d
helm_release.job_helm_release: Refreshing state... [id=api-migration]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.job_helm_release.helm_release.job_helm_release will be updated in-place
  ~ resource "helm_release" "job_helm_release" {
        id                         = "api-migration"
        name                       = "api-migration"
      ~ status                     = "failed" -> "deployed"
        # (24 unchanged attributes hidden)


        # (22 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.
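For reference, the shape of the configuration being described (a minimal sketch, not the actual config; the chart path is hypothetical, the release name is taken from the plan output above):

resource "helm_release" "job_helm_release" {
  name  = "api-migration"
  chart = "./charts/api-migration"  # hypothetical chart containing a Job

  wait          = true
  wait_for_jobs = true  # also wait for Jobs to complete, not just for pods
  timeout       = 300
}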

@vinothkumarsubs2019

I faced this issue as well. The helm_release timeout option seems not to work: the helm_release was reported as successfully completed within 5 seconds, even though the pods were still in the init stage.

@Monsterlin2018

Monsterlin2018 commented Dec 17, 2021

Me too. The pod status stays at "Pending" when I use helm_release in Terraform, but it worked well with the Helm CLI.
Error: release nginx failed, and has been uninstalled due to atomic being set: timed out waiting for the condition

@Monsterlin2018

Monsterlin2018 commented Dec 17, 2021

I don't know what happened, but it's back to working normally. In the past 6 hours, I upgraded Kubernetes to 1.23.1.

resource "helm_release" "traefik" {
  name       = "traefik"
  repository = "https://helm.traefik.io/traefik"
  chart      = "traefik"
  version    = "10.3.2"
  
  # I just tried to add this line
  wait = false
}

Versions :

bash-5.1# terraform version
Terraform v1.0.9
on linux_amd64
+ provider registry.terraform.io/hashicorp/helm v2.4.1

# kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:41:01Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:34:54Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}

# helm version
version.BuildInfo{Version:"v3.7.0-rc.2", GitCommit:"4a7c306aa9dcbdeecf79c7517851921a21b72e56", GitTreeState:"clean", GoVersion:"go1.16.7"}

@BBBmau
Contributor

BBBmau commented Jun 23, 2022

Is anyone still encountering this issue on the latest version of the provider? I think we fixed this in #727.

I just tried to reproduce this and saw the error in provider version v2.0.2, but now I see the appropriate failure diagnostic in v2.6.0.
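To test against a specific provider version, the version constraint can be pinned, e.g.:

terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "2.6.0"
    }
  }
}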

@jgreat

jgreat commented Jun 23, 2022

I can't speak for everyone, but we haven't seen this issue in a while.

@n1vgabay

This happens to me as well.

@FischlerA
Author

> Is anyone still encountering this issue on the latest version of the provider? I think we fixed this in #727.
>
> I just tried to reproduce this and saw the error in provider version v2.0.2, but now I see the appropriate failure diagnostic in v2.6.0.

I haven't tried it with v2.6.0 yet, but I will do so and report back; it might take me a few days.

@enterdv

enterdv commented Sep 12, 2022

Reproduced on version 2.6.0 for me

@BBBmau
Contributor

BBBmau commented Oct 13, 2022

> Reproduced on version 2.6.0 for me

Hello @enterdv! Are you able to include the config you used so we can reproduce this issue? We'll want to look into it again if we're still seeing this bug.

@enterdv

enterdv commented Oct 19, 2022

> Reproduced on version 2.6.0 for me
>
> Hello @enterdv! Are you able to include the config you used so we can reproduce this issue? We'll want to look into it again if we're still seeing this bug.

Hello, I tried it with a simple Helm release:

resource "helm_release" "redis" {
  name             = "${var.project}-redis"
  repository      = "https://charts.bitnami.com/bitnami"
  chart             = "redis"
  version          = "17.0.5"
  atomic           = true
  create_namespace = true
  namespace        = "${var.project}-infra"

  values = [
    file("${path.module}/values.yaml")
  ]

  set {
    name  = "fullnameOverride"
    value = "${var.project}-redis"
  }
  set {
    name  = "master.persistence.size"
    value = var.storage_size
  }
  set {
    name  = "master.resources.requests.memory"
    value = var.memory
  }
  set {
    name  = "master.resources.requests.cpu"
    value = var.cpu
  }
  set {
    name  = "master.resources.limits.memory"
    value = var.memory
  }
  set {
    name  = "master.resources.limits.cpu"
    value = var.cpu
  }
  set {
    name  = "replica.persistence.size"
    value = var.storage_size
  }
  set {
    name  = "replica.resources.requests.memory"
    value = var.memory
  }
  set {
    name  = "replica.resources.requests.cpu"
    value = var.cpu
  }
  set {
    name  = "replica.resources.limits.memory"
    value = var.memory
  }
  set {
    name  = "replica.resources.limits.cpu"
    value = var.cpu
  }
  set {
    name  = "replica.replicaCount"
    value = var.replica_count
  }
  set {
    name  = "sentinel.quorum"
    value = var.sentinel_quorum
  }
}

@BBBmau
Contributor

BBBmau commented Dec 2, 2022

@enterdv Hello! Thank you for providing the TF config. Could you provide the output after running TF_LOG=debug terraform apply?

@ricardorqr

> Me too. The pod status stays at "Pending" when I use helm_release in Terraform, but it worked well with the Helm CLI. Error: release nginx failed, and has been uninstalled due to atomic being set: timed out waiting for the condition

Have you fixed this problem?


Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

@github-actions github-actions bot added the stale label May 10, 2024
@naxim92

naxim92 commented May 29, 2024

up

@UtopiaWorld

up
