I have upgraded kube-prometheus from v0.12 to v0.13, and the API server availability now shows more than 100%. In the previous version it was reported correctly.
$ git diff | cat
diff --git a/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/kube-apiserver-availability.rules.yaml b/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/kube-apiserver-availability.rules.yaml
index 27399b19..e978af06 100644
--- a/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/kube-apiserver-availability.rules.yaml
+++ b/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/kube-apiserver-availability.rules.yaml
@@ -82,7 +82,7 @@ spec:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
- - expr: sum by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, verb, scope, le) (increase(apiserver_request_sli_duration_seconds_bucket[1h]))
+ - expr: sum by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, verb, scope, le) (increase(apiserver_request_sli_duration_seconds_bucket{job="apiserver"}[1h]))
record: cluster_verb_scope_le:apiserver_request_sli_duration_seconds_bucket:increase1h
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubeApiserverAvailability }}
labels:
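To illustrate the effect of the change (my own sketch, with the templated aggregation labels dropped): without the job selector the recording rule sums the histogram across every job that exposes it, so on k3s each API request is counted several times; with the selector only the series scraped from the apiserver job contribute.

# Unfiltered: sums the increase over every job that exposes the metric (kubelet, kube-proxy, etc. on k3s)
sum by (cluster, verb, scope, le) (increase(apiserver_request_sli_duration_seconds_bucket[1h]))

# Filtered: only the apiserver scrape job is counted
sum by (cluster, verb, scope, le) (increase(apiserver_request_sli_duration_seconds_bucket{job="apiserver"}[1h]))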
K3s runs several Kubernetes components in a single process, so the same component metrics are duplicated across different Prometheus jobs. That is why this filter was important in my case. See k3s-io/k3s#2262.
In my k3s cluster, the apiserver_request_sli_duration_seconds_bucket metric is collected by the following 5 jobs (and availability was reported at around 500% :) ):
kube-proxy
kube-controller-manager
kube-scheduler
apiserver
kubelet
I guess the current code, which omits the job="apiserver" filter, probably doesn't cause a problem on a normal kubeadm cluster, which may be why this issue hasn't received much attention.
(I haven't verified this because I don't have a test cluster built with kubeadm.)
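For anyone who wants to verify this on their own cluster, a query along these lines (assuming the default metric and job names) shows which scrape jobs expose the histogram; on a kubeadm cluster I would expect only the apiserver job, while on my k3s cluster it returns all five jobs listed above.

# Count series per scrape job for the SLI histogram
count by (job) (apiserver_request_sli_duration_seconds_bucket)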
Attaching an image of the dashboard:
Environment
Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.7+k3s1", GitCommit:"8432d7f239676dfe8f748c0c2a3fabf8cf40a826", GitTreeState:"clean", BuildDate:"2022-02-24T23:03:47Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.8+k3s2", GitCommit:"02fcbd1f57f0bc0ca1dc68f98cfa0e7d3b008225", GitTreeState:"clean", BuildDate:"2023-12-07T02:48:20Z", GoVersion:"go1.20.11", Compiler:"gc", Platform:"linux/amd64"}
Kubernetes cluster kind:
k3s
ts=2024-02-13T11:59:38.233Z caller=main.go:585 level=info msg="Starting Prometheus Server" mode=server version="(version=2.46.0, branch=HEAD, revision=cbb69e51423565ec40f46e74f4ff2dbb3b7fb4f0)"
ts=2024-02-13T11:59:38.233Z caller=main.go:590 level=info build_context="(go=go1.20.6, platform=linux/amd64, user=root@42454fc0f41e, date=20230725-12:31:24, tags=netgo,builtinassets,stringlabels)"
ts=2024-02-13T11:59:38.233Z caller=main.go:591 level=info host_details="(Linux 4.18.0-513.11.1.el8_9.x86_64 #1 SMP Thu Dec 7 03:06:13 EST 2023 x86_64 prometheus-k8s-0 (none))"
ts=2024-02-13T11:59:38.233Z caller=main.go:592 level=info fd_limits="(soft=1048576, hard=1048576)"
ts=2024-02-13T11:59:38.233Z caller=main.go:593 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2024-02-13T11:59:38.235Z caller=web.go:563 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2024-02-13T11:59:38.236Z caller=main.go:1026 level=info msg="Starting TSDB ..."
ts=2024-02-13T11:59:38.237Z caller=tls_config.go:274 level=info component=web msg="Listening on" address=[::]:9090
ts=2024-02-13T11:59:38.238Z caller=head.go:595 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2024-02-13T11:59:38.260Z caller=head.go:676 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=4.959µs
ts=2024-02-13T11:59:38.260Z caller=head.go:684 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2024-02-13T11:59:38.260Z caller=tls_config.go:313 level=info component=web msg="TLS is disabled." http2=false address=[::]:9090
ts=2024-02-13T11:59:38.261Z caller=head.go:755 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
ts=2024-02-13T11:59:38.261Z caller=head.go:792 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=64.923µs wal_replay_duration=591.788µs wbl_replay_duration=142ns total_replay_duration=702.782µs
ts=2024-02-13T11:59:38.261Z caller=main.go:1047 level=info fs_type=XFS_SUPER_MAGIC
ts=2024-02-13T11:59:38.261Z caller=main.go:1050 level=info msg="TSDB started"
ts=2024-02-13T11:59:38.261Z caller=main.go:1231 level=info msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
ts=2024-02-13T11:59:38.290Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=kubernetes-cadvisor msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:38.291Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=kubernetes-pods msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:38.291Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/alertmanager-main/1 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:38.291Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/coredns/0 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:38.292Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/kafka-service-monitor/0 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:38.292Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/kube-apiserver/0 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:38.292Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=kubernetes-services msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:38.292Z caller=kubernetes.go:329 level=info component="discovery manager notify" discovery=kubernetes config=config-0 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:38.432Z caller=main.go:1268 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml totalDuration=170.302838ms db_storage=1.527µs remote_storage=1.349µs web_handler=561ns query_engine=792ns scrape=316.7µs scrape_sd=2.107591ms notify=25.468µs notify_sd=176.505µs rules=139.444513ms tracing=10.148µs
ts=2024-02-13T11:59:38.432Z caller=main.go:1011 level=info msg="Server is ready to receive web requests."
ts=2024-02-13T11:59:38.432Z caller=manager.go:1009 level=info component="rule manager" msg="Starting rule manager..."
ts=2024-02-13T11:59:42.826Z caller=main.go:1231 level=info msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
ts=2024-02-13T11:59:42.849Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/kube-state-metrics/1 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:42.851Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/coredns/0 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:42.852Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/kube-apiserver/0 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:42.852Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=kubernetes-pods msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:42.853Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/kafka-service-monitor/0 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:42.854Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=kubernetes-cadvisor msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:42.854Z caller=kubernetes.go:329 level=info component="discovery manager scrape" discovery=kubernetes config=kubernetes-services msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:42.854Z caller=kubernetes.go:329 level=info component="discovery manager notify" discovery=kubernetes config=config-0 msg="Using pod service account via in-cluster config"
ts=2024-02-13T11:59:43.006Z caller=main.go:1268 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml totalDuration=179.767429ms db_storage=2.059µs remote_storage=1.432µs web_handler=667ns query_engine=978ns scrape=65.891µs scrape_sd=5.288807ms notify=16.27µs notify_sd=171.972µs rules=150.995181ms tracing=6.049µs