Add metrics for monitoring the broker and connection status #22
- Added metric Gauge "up" for the state of the connection with the broker
- Added get dependencies in the Makefile

- Moved the connection loop into a function, which is also called when the connection is lost
- Changed the name of the up metric to "broker_connection_up"

- Added metric
- Added functions to increment that metric and reset it to zero
- Added ID to exporter mqtt client
- Changed the name of the broker up/down metric to broker_connection_up
- Launch the first connection in an independent thread, to be able to start gathering metrics before connecting (like the status of the connection)
main.go
Outdated
gaugeMetrics["up"].Set(0)
gaugeMetrics["broker_connection_up"].Set(0)
// try to reconnect
mqttConnect()
Is this actually required? When connecting after having used NewClientOptions, AutoReconnect is set to true.
I don't know for sure, but I think the code will reconnect by itself even without this code.
I don't think it's the exporter's responsibility (or even Prometheus's responsibility) to care if the broker is up.
If we're not scraping, there's a problem. If the scrape is old, there's a problem. Don't add dimensions or complexity to it.
It's really an anti-pattern for exporters to use "up". In the event THIS exporter breaks, the last value of "up" is true.
Really, it's the orchestrator's problem if it's down (e.g. Docker). Your Orchestrator dashboard should see a rise in failing containers.
You don't even have to publish a last_scrape_time: metrics would not be coming in, so set your alerts there!
As proposed in issue #12, a way is needed to check the availability and health of the broker, so that alarms can be raised when the broker is down or malfunctioning.
This is because the metrics shown are the last ones received: since the data arrives as a push service, the exporter acts as a proxy, and after a disconnection it will keep showing the last results.
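To illustrate the problem described above, here is a minimal sketch of a push-to-pull proxy that caches the last value per topic. The `exporter` type, its methods, and the simplified text output are all hypothetical and stdlib-only; they are not the exporter's real code. The sketch shows that once the broker goes away, the cached value is still served, and only a connection-state metric reveals that it is stale.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// exporter is a hypothetical, stripped-down model of a push-to-pull proxy:
// it caches the last value pushed per topic and serves it on every scrape.
type exporter struct {
	mu        sync.Mutex
	last      map[string]float64
	connected bool
}

func newExporter() *exporter {
	return &exporter{last: make(map[string]float64)}
}

// onMessage records a value pushed by the broker.
func (e *exporter) onMessage(topic string, v float64) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.last[topic] = v
}

// setConnected records the current broker connection state.
func (e *exporter) setConnected(up bool) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.connected = up
}

// scrape renders what a scraper would see, in a simplified text format.
func (e *exporter) scrape() string {
	e.mu.Lock()
	defer e.mu.Unlock()
	up := 0
	if e.connected {
		up = 1
	}
	var b strings.Builder
	fmt.Fprintf(&b, "broker_connection_up %d\n", up)
	topics := make([]string, 0, len(e.last))
	for t := range e.last {
		topics = append(topics, t)
	}
	sort.Strings(topics) // stable output order
	for _, t := range topics {
		fmt.Fprintf(&b, "%s %g\n", t, e.last[t])
	}
	return b.String()
}

func main() {
	e := newExporter()
	e.setConnected(true)
	e.onMessage("temperature", 21.5)
	e.setConnected(false) // broker goes away; no new messages arrive
	fmt.Print(e.scrape()) // the cached temperature is still served
}
```

Without the `broker_connection_up` line, a scrape taken after the disconnection would be indistinguishable from one taken while the broker was healthy.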
Two scenarios are contemplated: