Application Insights feature block hanging and Failure Anomalies Still Auto-Generated #18026

mindlessroman · 2022-08-17T22:34:38Z

Is there an existing issue for this?

I have searched the existing issues

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

1.2.7

AzureRM Provider Version

3.11.0

Affected Resource(s)/Data Source(s)

provider azurerm

Terraform Configuration Files

# main.tf
terraform {
  backend "azurerm" {
  }
}

provider "azurerm" {
  features {
    application_insights {
      disable_generated_rule = true
    }
  }
}

# appinsights.tf
resource "azurerm_application_insights" "appinsights" {
  name                = "${local.name}-ai"
  location            = azurerm_resource_group.main_resource_group.location
  resource_group_name = azurerm_resource_group.main_resource_group.name
  application_type    = "web"
}

Debug Output/Panic Output

(see below)

Expected Behaviour

I would expect that, with disable_generated_rule set to true, the auto-created Smart Detector Rule would not be generated and/or the auto-created Failure Anomalies smart detector alert rule would be turned off. Creating an App Insights resource should take about 30 seconds at most, and destroying the resource group should not be impeded.

Actual Behaviour

In the terraform apply step of our pipeline, the App Insights resource seemingly hits a 10-minute timeout. The resource has already been created and is visible in the Azure portal, but the pipeline still reports it as creating. The step keeps waiting for something that has already completed, as if Terraform never receives that message.

azurerm_application_insights.appinsights: Still creating... [10m0s elapsed]
azurerm_application_insights.appinsights: Still creating... [10m10s elapsed]
azurerm_application_insights.appinsights: Still creating... [10m20s elapsed]
azurerm_application_insights.appinsights: Still creating... [10m30s elapsed]
azurerm_application_insights.appinsights: Creation complete after 10m39s

The "Failure Anomalies - {{name of App insights resource}}" rule is still created (as a hidden resource).

This then causes our terraform destroy step to fail:

... # other resources destroyed, uneventfully
azurerm_application_insights.appinsights: Destroying... [id=...]
azurerm_application_insights.appinsights: Destruction complete after 2s
... # other resources destroyed, uneventfully
azurerm_resource_group.main_resource_group: Destroying... [id=...]
azurerm_resource_group.main_resource_group: Still destroying... [id=..., 10s elapsed]
azurerm_resource_group.main_resource_group: Still destroying... [id=..., 9m50s elapsed]
... # a different unrelated warning
│ Error: deleting Resource Group "...": the Resource Group still contains Resources.
│ 
│ Terraform is configured to check for Resources within the Resource Group when deleting the Resource Group - and
│ raise an error if nested Resources still exist to avoid unintentionally deleting these Resources.
│ 
│ Terraform has detected that the following Resources still exist within the Resource Group:
│ 
│ * `/subscriptions/.../resourceGroups/.../providers/microsoft.alertsmanagement/smartDetectorAlertRules/Failure Anomalies - {{app insights resource name}}`
│ 
│ This feature is intended to avoid the unintentional destruction of nested Resources provisioned through some
│ other means (for example, an ARM Template Deployment) - as such you must either remove these Resources, or
│ disable this behaviour using the feature flag `prevent_deletion_if_contains_resources` within the `features`
│ block when configuring the Provider, for example:
│ 
│ provider "azurerm" {
│   features {
│     resource_group {
│       prevent_deletion_if_contains_resources = false
│     }
│   }
│ }
│ 
│ When that feature flag is set, Terraform will skip checking for any Resources within the Resource Group and
│ delete this using the Azure API directly (which will clear up any nested resources).
##[error]Error: The process '/opt/hostedtoolcache/terraform/1.2.7/x64/terraform' failed with exit code 1

My theory is that the 10 minutes the App Insights resource spent waiting was enough time for the auto-generated, hidden alert to come online.

First issue: Setting that feature flag makes the build take up to 10 minutes while it waits, even though the resource has in fact finished being created.

If we remove the feature block and instead explicitly declare a smart detection rule to disable:

# appinsights.tf
resource "azurerm_application_insights_smart_detection_rule" "smart_detection_rule" {
  name                    = "Slow server response time"
  application_insights_id = azurerm_application_insights.appinsights.id
  enabled                 = false
}

# main.tf
terraform {
  backend "azurerm" {
  }
}

provider "azurerm" {
  features {
  }
}

Then in the terraform apply stage:

azurerm_application_insights.appinsights: Creating...
azurerm_application_insights.appinsights: Creation complete after 2s [id=...]
azurerm_application_insights_smart_detection_rule.smart_detection_rule: Creating...
azurerm_application_insights_smart_detection_rule.smart_detection_rule: Creation complete after 1s [id=...]

App Insights does not hang, and we can usually delete the resource group before the Failure Anomalies rule gets generated.

Second issue: Failure Anomalies are "Smart Detection Alert Rules", not "Smart Detection Rules", and are seemingly not under the purview of the disable_generated_rule flag. See the note in this documentation section:

This Azure Resource Manager template is unique to the Failure Anomalies alert rule and is different from the other classic Smart Detection rules described in this article. If you want to manage Failure Anomalies manually this is done in Azure Monitor Alerts whereas all other Smart Detection rules are managed in the Smart Detection pane of the UI.

The Request / The Ask

Fix the hanging when declaring that feature flag
Once that is fixed, include the Failure Anomalies as either:
- an included rule that's turned off when that disable_generated_rule flag is true
- OR, have more explicit ways in the Azure provider to disable Smart Detection Alert Rules

This documentation describes creating it explicitly. However, it feels counterintuitive to explicitly create the resource in Terraform (a hidden resource we are never told about) just so we have the control to delete it. We never define this hidden resource in our builds in the first place, so we have no means to explicitly destroy it.

All this may stem from a recent change under the hood for Azure, but if the terraform equivalents could match, that would be great.

Steps to Reproduce

terraform apply
terraform destroy

Important Factoids

No response

References

PR #16170

On Azure's end, I'm trying to figure out whether some functionality changed under the hood recently that caused this to pop up? Or if it moved to be controlled by something else?

DanLauerman · 2022-10-19T14:38:46Z

This seems like a major oversight on Microsoft's part for Azure. Even if an Application Insights resource is deleted in the Portal, the automatically created Smart Detector alerts do not get removed.

Link to feedback provided to Azure for upvoting on the Azure side: https://feedback.azure.com/d365community/idea/cdb1fc68-bb4f-ed11-a81b-000d3adfeb99

egorshulga · 2023-01-17T08:53:35Z

I wonder why AzureRM creates the Failure alert rule in the first place? 🤔
I just checked: when we create AppInsights from the Azure Portal, no hidden alert is created.
I am sorry, but am I missing something here?

JohnRAristizabal · 2023-01-23T11:18:32Z

We are using this to mitigate the issue:

provider "azurerm" {
  features {
    resource_group {
      # This flag is set to mitigate an open bug in Terraform. As soon as this is fixed, we should remove this.
      prevent_deletion_if_contains_resources = false
    }
  }
}

egorshulga · 2023-01-23T11:57:44Z

Update: this answer turned out to be wrong.

And it seems we also managed to find a workaround for the issue by declaring the resource explicitly:

resource "azurerm_monitor_smart_detector_alert_rule" "failureAnomalies" {
  count               = var.isProd ? 1 : 0
  name                = "Failure Anomalies"
  resource_group_name = azurerm_resource_group.resourceGroup.name
  detector_type       = "FailureAnomaliesDetector"
  scope_resource_ids  = [azurerm_application_insights.appInsights.id]
  severity            = "Sev3"
  frequency           = "PT1M"
  action_group {
    ids = [one(azurerm_monitor_action_group.actionGroup).id]
  }
}

The funny thing is that this alert is conditional, so it is provisioned for prod only, yet somehow the declaration fixes the non-prod environments as well.

pgagliano5 · 2023-02-24T15:40:57Z

Even with "prevent_deletion_if_contains_resources = false" the destroy fails.

IgorZhavoronok · 2023-05-05T07:11:49Z

We have the same problem too. When I run the destroy pipeline, it creates an "Application Insights Smart Detection" resource and sometimes "Failure Anomalies", which blocks the resource group destruction. That really looks like a bug.

ameyaagashe · 2023-05-26T01:07:06Z

I have the same issue. Even if you add "prevent_deletion_if_contains_resources = false", destroy fails. Indeed, a bug. Hoping Microsoft resolves this soon. For now this means resorting to "Click Ops": one has to manually delete the resource and then rerun Terraform to destroy the resource group.

miniemi · 2023-06-05T08:45:15Z

I also had the same issue.

"prevent_deletion_if_contains_resources = false" works for me. With this flag set to false, destroy deletes, as it says in the documentation, all the nested resources and the resource group even if some resources are not in the tf state.

"When that feature flag is set, Terraform will skip checking for any Resources within the Resource Group and delete this using the Azure API directly (which will clear up any nested resources)."

After I added this flag to my tf code, I manually deleted the old state and all the resources in Azure and redeployed everything. After this, destroy runs without any errors.

GlibMartynenko · 2023-07-07T12:33:39Z

To prevent the creation of the "Application Insights smart detection rules" and action group, I added this code to my observability package:

resource "azurerm_application_insights_smart_detection_rule" "example" {
  name                    = "Slow server response time"
  application_insights_id = azurerm_application_insights.example.id
  enabled                 = false
}

In that case, I have my custom action group, and Azure does not create its own action group and rule.
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/application_insights_smart_detection_rule

I didn't find any information about this approach in the documentation, but it works for me.

Update:
Sorry, but it looks like it was a bug on Azure's side :(
Even with the code above, the "Application Insights Smart Detection" group and the "Failure Anomalies" smart detector alert rule are still created :(

Does anyone know how to prevent the creation of these two resources?

ezequielan · 2023-08-21T07:52:38Z

Hello,
any news about this?

Thanks.

DenisBalan · 2023-08-28T11:43:27Z

Workaround for this

resource "azurerm_application_insights" "application_insights" {
  name                                  = local.name
  resource_group_name                   = var.resource_group_name
   ...
}


# This resource sits here just to have it imported in the state
resource "azurerm_monitor_action_group" "this" {
  name                = join("-", ["amag", local.name])
  resource_group_name = var.resource_group_name
  short_name          = "amag" # used only for sms
}

resource "azurerm_monitor_smart_detector_alert_rule" "failure_anomalies" {
  name                = "Failure Anomalies - ${local.name}"
  resource_group_name = var.resource_group_name
  detector_type       = "FailureAnomaliesDetector"
  scope_resource_ids  = [azurerm_application_insights.application_insights.id]
  severity            = "Sev0"
  frequency           = "PT1M"
  action_group {
    ids = [azurerm_monitor_action_group.this.id]
  }
}

This way the rule is in state, so on destroy it gets destroyed automatically before the resource group is.
Give it a try; at least for us, it's working fine.

danpetitt · 2023-09-24T21:15:09Z

@DenisBalan This does not work for me; the apply fails because the rules already exist. There is nothing that can be done except to remove the protections that stop resources being deleted when the resource group is deleted, and those protections are good to have in place.
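
One possible avenue for the "already exists" failure, offered as an untested assumption and not as a fix confirmed anywhere in this thread: on Terraform 1.5 or later (newer than the 1.2.7 in the original report), an import block can adopt the auto-created rule into state instead of trying to create it. The resource address below matches the workaround above. The angle-bracket placeholders are hypothetical and must be replaced with real values, following the path shape shown in the error message from the original report.

```hcl
# Sketch only: adopt the auto-generated Failure Anomalies rule into state.
# Requires Terraform >= 1.5 (import blocks). All <...> values are placeholders.
import {
  to = azurerm_monitor_smart_detector_alert_rule.failure_anomalies
  id = "/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.AlertsManagement/smartDetectorAlertRules/Failure Anomalies - <app-insights-name>"
}
```

If Azure has not yet generated the rule when the plan runs, the import would presumably fail, so the timing problem described in this thread may still apply.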

stas-sultanov · 2023-10-09T08:14:57Z

It looks like there is some kind of policy that automatically creates a Failure Anomalies alert rule for each newly created Application Insights instance.

I hit this issue when creating Application Insights with Bicep/ARM.

danpetitt · 2023-10-09T16:34:12Z

@stas-sultanov I tried using the portal and the Azure CLI and it doesn't auto-create these alerts, so that's a bit weird.

stas-sultanov · 2023-11-25T17:23:14Z

@danpetitt there is some kind of glitch in Azure.
I still have the issue with auto-creation of the Failure Anomalies detector.
I have raised a question with MS, but no luck...
Are we the only two who face this issue?

sboulema · 2023-11-25T17:36:28Z

Nope! I also have this issue, lurking around waiting for a fix...

stas-sultanov · 2023-11-25T17:42:42Z

Unfortunately, I do not have a support plan from MS to raise an issue via the Azure portal.

I just wonder how underqualified the people at Microsoft who implemented this automatic provisioning of the Failure Anomalies Detector are.

danpetitt · 2023-11-25T20:26:31Z

@stas-sultanov I have a support plan. I will write up some clear steps, log a ticket, and see what they say. Probably not a lot, but we can hope.

I can understand that, as a first experience, it's useful to have this happen by default, but we should at least be able to opt out, especially those of us using IaC solutions rather than the portal.

I will report back if they say anything.

stas-sultanov · 2023-11-26T11:39:58Z

@danpetitt, thank you very much!
I guess you could include the activity log from Monitor, which clearly shows that the system is doing this on its own...

… with Terraform. (#63) * Update APIM type to use api version `2023-03-01-preview` which does not have the issue when deleting the APIM. * Added dependency (`depends_on`) for `azurerm_api_management_named_value.tenant_id` for the `azurerm_api_management_api_policy.policy` which is required when deleting the APIM due to an indirect dependency with the Tenant ID value. * Add `prevent_deletion_if_contains_resources` flag as `false` to mitigate an open bug in Terraform. For instance, the Resource Group is not deleted when a `Failure Anomalies` resource is present. Reference: hashicorp/terraform-provider-azurerm#18026

mdsharpe · 2024-09-25T12:34:57Z

Still an issue for me

agullotti · 2024-09-25T17:32:00Z

"Still an issue for me"

Yes, same. Why would this be marked solved? Multiple people here have stated that the proposed fix in that merge does not work.

stas-sultanov · 2024-09-25T18:46:51Z

The problem is that Microsoft states in the documentation that this behavior is by design.
Which is 100500% extremely stupid as it breaks the whole idea of IaC via declarative programming.

Patrik-Berglund · 2024-10-08T17:17:43Z

@danpetitt how did it go with the support ticket?

mindlessroman added the bug label Aug 17, 2022

github-actions bot removed the bug label Aug 17, 2022

Amier3 added the bug and service/application-insights labels Aug 18, 2022

yasar-observe mentioned this issue Dec 14, 2022: "fix: remove app insights until provider bug is fixed" observeinc/terraform-azure-collection#44 (merged)

msanft mentioned this issue Apr 17, 2023: "cli: force-delete Azure resource group" edgelesssys/constellation#1667 (merged)

catriona-m self-assigned this May 31, 2023

catriona-m added the v/3.x label Jul 20, 2023

rcskosir added the upstream/microsoft label (indicates that there's an upstream issue blocking this issue/PR) Jul 20, 2023

rliberoff mentioned this issue Jun 12, 2024: "Solved Issue 62: Unable to delete APIM when destroying infrastructure with Terraform." Azure/aihub#63 (merged)
