Question about the 'strategy' parameter in SymbolicAggregateApproximation() #128

Open
GiovannaR opened this issue Apr 6, 2022 · 4 comments

Comments

@GiovannaR

Description

Hi!

I have a question about how SymbolicAggregateApproximation() relates to its description in the article "Experiencing SAX: a novel symbolic representation of time series". Section "3.2 Discretization" of the article states that the data follow a Gaussian distribution and that the "breakpoints" are chosen to produce equal-sized areas under the Gaussian curve. So, I understand that the parameter strategy='normal' uses the same strategy as the article, right? What about the uniform and quantile strategies? Are they a departure from the article?

Thank you for your help! Have a nice day!

@johannfaouzi
Owner

Hi,

Indeed, the parameter strategy='normal' uses the same strategy as the article (quantiles from the standard normal distribution). The justification for using quantiles from the standard normal distribution is given in the article:

"[...] the normalized time series have highly Gaussian distribution [...]."

strategy='uniform' and strategy='quantile' actually use the values of the time series. strategy='uniform' creates K bins of equal width between the minimum and maximum values of the time series. strategy='quantile' is similar to strategy='normal', but instead of using the quantiles of the standard normal distribution, it uses the quantiles of the time series (so that all the symbols have (almost) the same number of occurrences).
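To make the difference between the three strategies concrete, here is a minimal pure-Python sketch of how the bin edges could be computed for a single time series. It is not the pyts implementation: the helper names `breakpoints` and `discretize` are hypothetical, and the tie-breaking and quantile interpolation choices are assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, quantiles


def breakpoints(series, n_bins, strategy):
    """Return the n_bins - 1 inner bin edges for one series (illustrative only)."""
    if strategy == "normal":
        # Quantiles of the standard normal distribution; the data are ignored.
        return [NormalDist().inv_cdf(k / n_bins) for k in range(1, n_bins)]
    if strategy == "uniform":
        # Equal-width bins between the minimum and maximum of the series.
        lo, hi = min(series), max(series)
        width = (hi - lo) / n_bins
        return [lo + k * width for k in range(1, n_bins)]
    if strategy == "quantile":
        # Empirical quantiles of the series itself (linear interpolation).
        return quantiles(series, n=n_bins, method="inclusive")
    raise ValueError(f"unknown strategy: {strategy!r}")


def discretize(series, n_bins, strategy, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Map each value to the letter of the bin it falls into."""
    edges = breakpoints(series, n_bins, strategy)
    return "".join(alphabet[bisect_right(edges, x)] for x in series)


series = [0.0, 1.0, 2.0, 10.0]
print(discretize(series, 2, "uniform"))   # aaab: the single edge is 5.0
print(discretize(series, 2, "quantile"))  # aabb: the single edge is the median, 1.5
```

With the outlier 10.0 in the series, the uniform edge is pulled far from most of the values, whereas the quantile edge keeps the two symbols balanced.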

It should be noted that the dimensionality reduction with Piecewise Aggregate Approximation is not included in this implementation, so you should use pyts.approximation.PiecewiseAggregateApproximation first (if you want to). To standardize time series, you can use pyts.preprocessing.StandardScaler (it is assumed that the time series are standardized to use strategy='normal').
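The order of operations described above (standardize, then reduce with PAA, then discretize) can be sketched in pure Python as follows. This only illustrates the idea and is not how pyts implements it: the function names are hypothetical, the population standard deviation is an assumption, and the PAA helper assumes the series length is a multiple of the window size.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def standardize(series):
    """Rescale one series to zero mean and unit variance."""
    mu, sigma = fmean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]


def paa(series, window_size):
    """Piecewise Aggregate Approximation: mean of each non-overlapping window."""
    return [fmean(series[i:i + window_size])
            for i in range(0, len(series), window_size)]


def sax_normal(series, n_bins, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Discretize with edges taken from standard normal quantiles."""
    edges = [NormalDist().inv_cdf(k / n_bins) for k in range(1, n_bins)]
    return "".join(alphabet[bisect_right(edges, x)] for x in series)


series = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
word = sax_normal(paa(standardize(series), window_size=2), n_bins=4)
print(word)  # abcd
```

With pyts itself, the corresponding steps would be the StandardScaler, PiecewiseAggregateApproximation and SymbolicAggregateApproximation classes mentioned above, applied in that order.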

Best,
Johann

@ivan-marroquin

Hi @johannfaouzi

Thanks for your comments. I have a follow-up question regarding the normalization of a time series. Do we really need to normalize the time series prior to computing the SymbolicAggregateApproximation?

Ivan

@johannfaouzi
Owner

johannfaouzi commented Apr 21, 2023

Hi,

It will depend on the 'strategy' used to discretize the time series with SymbolicAggregateApproximation:

  • 'normal' uses quantiles from the standard normal distribution, so it is assumed that the time series is standardized (zero mean, unit variance): any normalization may have an impact.
  • 'quantile' is invariant to any strictly increasing transformation because the order of the values remains identical: normalization has no impact.
  • 'uniform' is invariant to any strictly increasing linear transformation, so most common normalization techniques have no impact.

>>> import numpy as np
>>> from pyts.approximation import SymbolicAggregateApproximation
>>> from pyts.datasets import load_gunpoint

>>> X, _, _, _ = load_gunpoint(return_X_y=True)

>>> # 2 * X + 6 is strictly increasing and linear; np.exp(X) is strictly increasing but not linear
>>> sax_normal = SymbolicAggregateApproximation(strategy='normal')
>>> np.all(sax_normal.transform(X) == sax_normal.transform(2 * X + 6))
False
>>> np.all(sax_normal.transform(X) == sax_normal.transform(np.exp(X)))
False

>>> sax_quantile = SymbolicAggregateApproximation(strategy='quantile')
>>> np.all(sax_quantile.transform(X) == sax_quantile.transform(2 * X + 6))
True
>>> np.all(sax_quantile.transform(X) == sax_quantile.transform(np.exp(X)))
True

>>> sax_uniform = SymbolicAggregateApproximation(strategy='uniform')
>>> np.all(sax_uniform.transform(X) == sax_uniform.transform(2 * X + 6))
True
>>> np.all(sax_uniform.transform(X) == sax_uniform.transform(np.exp(X)))
False
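As a complement, the following pure-Python sketch suggests why standardizing first matters for the 'normal' strategy: once each series is standardized, a rescaled and shifted copy yields the same symbols. The helpers are hypothetical and do not reproduce pyts internals; the toy values were chosen so that none of them lies on a bin edge.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def standardize(series):
    """Rescale one series to zero mean and unit variance."""
    mu, sigma = fmean(series), pstdev(series)
    return [(v - mu) / sigma for v in series]


def sax_normal(series, n_bins=4, alphabet="abcd"):
    """Discretize with standard normal quantiles as bin edges."""
    edges = [NormalDist().inv_cdf(k / n_bins) for k in range(1, n_bins)]
    return "".join(alphabet[bisect_right(edges, v)] for v in series)


x = [0.1, -0.4, 1.3, 2.2, -1.0, 0.7, 0.3, -0.9]
y = [2 * v + 6 for v in x]  # strictly increasing linear transformation of x

# Without standardization, the shift and scale change the symbols.
raw_equal = sax_normal(x) == sax_normal(y)
# After standardizing each series, the symbols agree.
std_equal = sax_normal(standardize(x)) == sax_normal(standardize(y))
print(raw_equal, std_equal)  # False True
```

Standardization only removes the effect of shifting and positive rescaling; a nonlinear transformation such as the exponential would still change the result.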

Best,
Johann

@ivan-marroquin

Hi @johannfaouzi

Thanks for the explanations.

Ivan
