
Add metrics for monitoring the broker and connection status #22

Open: wants to merge 6 commits into base: master

@daviddetorres daviddetorres commented Nov 10, 2019

As proposed in issue #12, we need a way to check the availability and health of the broker so that alarms can be raised when the broker is down or malfunctioning.

This is needed because the metrics shown are always the last ones received. Since the broker pushes them, the exporter acts as a proxy, and after a disconnection it keeps serving the last results.

Two scenarios are contemplated:

  • Loss of connection with the broker: new metric "broker_connection_up" (0 if down, 1 if up).
  • The connection is OK, but the broker does not send updates because of problems in the queue, a blockage, etc.: new metric "seconds_since_last_update" (-1 if no update has ever been received; otherwise >= 0, counting the seconds since the last update and resetting to 0 after every update from the broker).

- Added Gauge metric "up" for the state of the connection with the broker
- Added get dependencies to the Makefile
- Moved the connection loop into a function, which is also called on loss of connection
- Changed the name of the up metric to "broker_connection_up"
@daviddetorres daviddetorres changed the title WIP: Add metrics for monitoring the broker connection status WIP: Add metrics for monitoring the broker and connection status Nov 11, 2019
- Added metric
- Added functions to increase that metric and to reset it to zero
- Added ID to exporter mqtt client
- Changed the name of the broker up/down metric to broker_connection_up
- Launch the first connection in an independent thread so that metrics (such as the connection status) can be gathered before the connection is established
@daviddetorres daviddetorres changed the title WIP: Add metrics for monitoring the broker and connection status Add metrics for monitoring the broker and connection status Nov 11, 2019
@daviddetorres daviddetorres marked this pull request as ready for review November 11, 2019 22:16
main.go Outdated
gaugeMetrics["up"].Set(0)
gaugeMetrics["broker_connection_up"].Set(0)
// try to reconnect
mqttConnect()
Contributor

Is this actually required? When connecting after having used NewClientOptions, AutoReconnect is set to true.

I don't know for sure, but I think the code will reconnect by itself even without this code.

@jnovack jnovack Jun 20, 2020
I don't think it's the exporter's responsibility (or even Prometheus's responsibility) to care if the broker is up.

If we're not scraping, there's a problem. If the scrape is old, there's a problem. Don't add dimensions or complexity to it.

It's really an anti-pattern for exporters to use up. In the event THIS exporter breaks, the last value of up is still true.

Really, it's the orchestrator's problem if it's down (e.g. Docker). Your Orchestrator dashboard should see a rise in failing containers.

You don't even have to publish a last_scrape_time: metrics would not be coming in, so set your alerts there!


3 participants