Rolling window forecast with rolling demean #633

Open
gavincyi opened this issue Mar 13, 2023 · 3 comments

@gavincyi

The documentation shows that rolling-window forecasts can be produced with the first_obs and last_obs parameters. I am looking for an approach with minimal runtime overhead to

  1. demean the series on a rolling basis, i.e. returns[first_obs:last_obs] - mean(returns[first_obs:last_obs]), and
  2. fit the GARCH model

Is the constant mean estimated on each rolling window, or on the whole time series passed as the argument y?

model = arch_model(tseries, vol="GARCH", mean="Constant", ...)
for i in range(len(tseries) - rolling_window):
  model.fit(first_obs=i, last_obs=i + rolling_window - 1, ...)

I looked into the source code but could not work it out at a glance. Could you clarify?

@bashtage
Owner

The mean is jointly estimated with the variance parameters. If you want the exact in-sample mean, you would need to first demean the data using the rolling mean, and then fit a model with ZeroMean. This would involve recreating the model for each sample.

If you call fit with first_obs and last_obs, the mean and variance parameters are all estimated jointly on that sample.

The fastest way is to use the previous fit values for starting values. Here is a demo:

import arch
from arch.data import sp500
import datetime as dt

r = 100 * sp500.load().iloc[:, -2].pct_change().dropna()

last_obs = 1100
now = dt.datetime.now()
for i in range(1000, last_obs):
    res = arch.arch_model(r.iloc[i - 1000 : i]).fit(disp="off")
print(f"{(dt.datetime.now() - now).total_seconds()} (new model, no starting values)")

now = dt.datetime.now()
for i in range(1000, last_obs):
    arch.arch_model(r).fit(disp="off", first_obs=i - 1000, last_obs=i)
print(f"{(dt.datetime.now() - now).total_seconds()} (no starting values)")

last = None
now = dt.datetime.now()
for i in range(1000, last_obs):
    res = arch.arch_model(r.iloc[i - 1000 : i]).fit(disp="off", starting_values=last)
    last = res.params
print(f"{(dt.datetime.now() - now).total_seconds()} (starting values)")

On my machine I see

1.941473 (new model, no starting values)
1.928971 (no starting values)
1.236577 (starting values)

One final option is to update the parameters only occasionally. The example below re-estimates every 10 observations and otherwise forecasts with the most recent estimates.

last = None
now = dt.datetime.now()
for i in range(1000, last_obs):
    mod = arch.arch_model(r.iloc[i - 1000 : i])
    if i % 10 == 0 or last is None:
        res = mod.fit(disp="off", starting_values=last)
        last = res.params
    mod.forecast(res.params, horizon=1)
print(
    f"{(dt.datetime.now() - now).total_seconds()} (starting values, occasionally update)"
)

0.224142 (starting values, occasionally update)

@bashtage
Owner

One final answer -- when using first_obs and last_obs, the parameters are estimated only using the selected sample.

@gavincyi
Author

Thanks for your prompt response. All the above makes perfect sense to me.

Do you think it would be a good idea to add an argument that demeans on a rolling basis, e.g. fit(demean=True, ...), which would demean the selected sample before fitting the model? If so, I can create a PR for it.
