Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAMZA-2412 : Adding SamzaContainer metric that is 0 when container-starting, 1 when started. #1228

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

rmatharu-zz
Copy link
Contributor

@rmatharu-zz rmatharu-zz commented Dec 2, 2019

Adding metric that is 0 when container-starting, 1 when started.

Symptom: External monitoring and control (e.g., autosizing) require a way to know when a container has started (e.g., state-restore has completed). Currently, they have to rely on restore-time or container-startup-time metrics, which are a bit tricky to use.

Cause: None

Fix: Added a new metric, a gauge, which is 0 and remains at 0 after the container has started. A Timer on the other hand will go to 0 if its not updated.

Tests: Samza test job with and without state.

@rmatharu-zz rmatharu-zz changed the title Adding metric that is 0 when container-starting, 1 when started. Adding SamzaContainer metric that is 0 when container-starting, 1 when started. Dec 2, 2019
@prateekm
Copy link
Contributor

prateekm commented Dec 2, 2019

You should use a gauge, not counter. In inGraphs, a counter is displayed as a rate (v2-v1/t2-t1).

  1. That's probably not what you want.
  2. Not sure what will happen if v2 < v1.

A gauge should not reset to 0 if not updated, it should retain its last updated value.

@rmatharu-zz
Copy link
Contributor Author

You should use a gauge, not counter. In inGraphs, a counter is displayed as a rate (v2-v1/t2-t1).

  1. That's probably not what you want.
  2. Not sure what will happen if v2 < v1.

A gauge should not reset to 0 if not updated, it should retain its last updated value.

Fair enough, I was confused between Timers and Gauges.

@rmatharu-zz rmatharu-zz changed the title Adding SamzaContainer metric that is 0 when container-starting, 1 when started. SAMZA-2412 : Adding SamzaContainer metric that is 0 when container-starting, 1 when started. Dec 2, 2019
@prateekm
Copy link
Contributor

prateekm commented Dec 3, 2019

@rmatharu Let's also add this to the metrics docs.

@@ -764,6 +764,7 @@ class SamzaContainer(
containerListener.afterStart()
}
metrics.containerStartupTime.set((System.nanoTime() - startTime)/1000000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use (container-startup-time) metric instead to infer if the container has started?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can but its relatively harder to automate since we need to look for changes

@@ -764,6 +764,7 @@ class SamzaContainer(
containerListener.afterStart()
}
metrics.containerStartupTime.set((System.nanoTime() - startTime)/1000000)
metrics.containerStarted.set(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR, but curious what will be the behavior of the auto sizing controller if this metric is broken. Wondering if it can differentiate between metrics broken vs container not started.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this (or another similar metric) the controller has no way of figuring out that a container has finished state-restore, upto which point any input-lag increase does not warrant scale-up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants