Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(anomaly detection): Dynamic Window for Matrix Profiling #1451

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

aayush-se
Copy link
Member

@aayush-se aayush-se commented Nov 19, 2024

  • Use 2 windows (1 determined by the SuSS algorithm, and a "fixed" window at size 10) to perform detection reducing the recovery time for the algorithm ultimately making alerts persist for a shorter period of time after anomalous behavior is over
  • Performing detection twice for each window to maintain the matrix profiles
  • Updated the anomaly_algo_data to store matrix profile related information for both windows for a given alert
  • Created MPTimeSeriesAnomaliesSingleWindow class and updated the MPTimeSeriesAnomalies class to effectively manage and store the proper matrix profile related information and the boolean for determining which window logic to use
  • Recalculates both matrix profiles and update the respective rows if they are not present to ensure backwards compatibility with the older MPTimeSeriesAnomalies class
  • Updated current tests to match the new classes and their fields along with shapes for anomaly_algo_data

@aayush-se aayush-se requested a review from a team as a code owner November 19, 2024 00:22
@aayush-se aayush-se force-pushed the anomaly-detection/dynamic-window branch from cc46826 to bf302c1 Compare November 19, 2024 00:41
@aayush-se
Copy link
Member Author

Waiting for codecov to give report -- will update test coverage upon receiving

if original_flag is None:
original_flag = "none"
original_flags.append(original_flag)
algo_data = MPTimeSeriesAnomalies.extract_algo_data(point.anomaly_algo_data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is not backward compatible as both mp_suss and mp_fixed are not there. So detection for existing alerts in production will fail, right?

Copy link
Member Author

@aayush-se aayush-se Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was one thing I wanted to ask about -- if we were to use this class for prod, could we rerun calculation to populate the objects appropriately?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because to get the algo data for both suss and fixed, we would need to store the MP as a field in the object

Copy link
Member

@ram-senth ram-senth Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be best to just let the recalculation happen when data pruning happens. So this code should be backward compatible. We will need an unit test to check that. Also, test it locally with an alert pre-populated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will include those tests and confirm through local testing

convert_external_ts_to_internal(timeseries), config, window_size=window_size
)
anomalies_fixed = batch_detector.detect(
convert_external_ts_to_internal(timeseries), config, window_size=10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add this 10 as a constant to the MPConfig class?

@aayush-se aayush-se force-pushed the anomaly-detection/dynamic-window branch from 66e45c6 to 29d9139 Compare November 21, 2024 18:02
and "mp_suss" not in timeseries[-1].anomaly_algo_data
and "mp_fixed" not in timeseries[-1].anomaly_algo_data
):
timeseries = self._recalculate_batch_detection(db_alert)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we recalculate but we do not save to the database here. However, the new time step that will be stored in the db will have these additional information. So next detect call will actually fail, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within _recalculate_batch_detection, another function update_timeseries is called that directly updates the DbDynamicAlertTimeSeries with the new anomaly_algo_data

@ram-senth
Copy link
Member

I'm good with the changes. Just wanted to confirm that you tested locally with a saved alert that does not have the fixed window history to confirm the backward compatibility. We do not want production failures when we deploy. Also, seems like the unit test coverage needs increasing. There are a few low hanging fruits like testing for length mismatch etc that should take it over the goal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants