Releases: awslabs/gluonts
0.10.0
Overview
Arrow based datasets
We have added support for Parquet files, as well as Arrow's binary format. This is an opt-in feature that requires pyarrow to be installed. Use pip install 'gluonts[pro]' or pip install 'gluonts[arrow]' to ensure the correct version is installed.
FileDataset has been reworked to support .parquet and .arrow files. Previously, it assumed that all files used jsonlines. To continue using jsonlines, ensure that the files use one of the .json, .jsonl, .json.gz, or .jsonl.gz suffixes.
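As a rough illustration of the suffix rule described above, the following sketch maps a file name to the format its suffix implies. Note that guess_format is a hypothetical helper written for this example, not part of GluonTS, and the actual detection logic inside FileDataset may differ.

```python
# Hypothetical helper, NOT part of GluonTS: it only illustrates the
# suffix rule stated in the release notes.

# Suffixes that the release notes list for jsonlines files.
JSONLINES_SUFFIXES = (".json", ".jsonl", ".json.gz", ".jsonl.gz")


def guess_format(path):
    """Return "jsonlines", "parquet", "arrow", or None for unknown suffixes."""
    name = str(path)
    # str.endswith accepts a tuple and matches any of its entries.
    if name.endswith(JSONLINES_SUFFIXES):
        return "jsonlines"
    if name.endswith(".parquet"):
        return "parquet"
    if name.endswith(".arrow"):
        return "arrow"
    return None


if __name__ == "__main__":
    for example in ["data.json", "data.jsonl.gz", "data.parquet", "data.arrow", "data.csv"]:
        print(example, "->", guess_format(example))
```

A check like this can be handy before upgrading: any existing jsonlines file for which it returns None would need to be renamed to one of the listed suffixes.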
Depending on the dataset size and shape, Arrow can be much faster than the json variant. In more extreme cases we saw speedups of more than 100x when using arrow vs jsonlines (see #2003 for some examples).
To convert a given dataset into arrow, you can use the gluonts.dataset.arrow utility:
python -m gluonts.dataset.arrow write </path/to/dataset> my-dataset.arrow
PandasDataset
We have added support for pandas.DataFrame and pandas.Series as well. You can now directly model data given in a DataFrame using gluonts.dataset.pandas.PandasDataset. In this tutorial we describe in depth how you can use PandasDataset to speed up modelling using GluonTS.
Changelog
New Features
- #1631 - Add TimeLimitCallback to mx/trainer callbacks. (by @yx1215)
- #1780 - adding MQF2 (Multi-horizon) (by @KelvinKan)
- #1903 - Added QuarterlyBegin time feature (by @kashif)
- #1924 - Porting SimpleFeedForwardEstimator to PyTorch (by @lostella)
- #1925 - DeepAR PyTorch: make samplers configurable (by @lostella)
- #1935 - added support for pandas dataframes (by @rsnirwan)
- #1962 - Add support for beta-NLL loss (by @kashif)
- #1982 - Add Uber-TLC dataset to dataset repository. (by @Hongqing-work)
- #1990 - Add info cli. (by @jaheba)
- #1987 - Add HP tuning example with Optuna (by @npnv)
- #2000 - Add arrow-based dataset. (by @vafl, @lostella, @jaheba)
- #2002 - add ND for item_metrics (by @melopeo)
- #2006 - Added support of "long" RTS, making short RTS be "past_feat_dynamic_real" (by @zoolhasson)
- #2061 - Add DatasetWriter. (by @jaheba)
- #2074 - Add support for second frequency. (by @kashif)
Breaking Changes
- #1917 - Breaking: Fix return types of features (by @lostella)
- #1941 - Breaking: Update dependency fbprophet -> prophet (by @lostella)
- #1946 - Breaking: Split incremental quantile output into separate class (by @lostella)
- #1965 - Breaking: reorg torch package, shorten import paths (by @lostella)
- #1980 - Use pd.Period instead of pd.Timestamp. (by @jaheba)
- #1997 - Remove freq argument from Forecast. (by @kashif)
- #2011 - Remove dct_reduce. (by @jaheba)
- #2017 - Remove mandatory freq attribute of Predictor. (by @kashif)
- #2018 - Remove multiprocessing dataloader. (by @jaheba)
- #2019 - Rework FileDataset. (by @jaheba)
- #2053 - Add dataset_writer to get_dataset. (by @Hongqing-work)
- #2070 - Add jsonl.encode_json, remove serialize_data_entry. (by @jaheba)
Bug Fixes / Minor Improvements
- #1704 - Settings._let will pop element it added instead of just the last one. (by @jaheba)
- #1905 - Fix typing issues in torch estimators, update base estimators docstrings (by @lostella)
- #1909 - Fix the use of the scaling parameter in Transformer model (by @StanislasGuinel)
- #1916 - Fix AddTimeFeatures transformation for multiples of base frequencies (by @lostella)
- #1920 - Fix: use broadcast_lesser in place of comparisons in ISQF (by @vincentqb)
- #1931 - Fix dummy estimator (by @canerturkmen)
- #1933 - Fix Pytorch Lightning tutorial. (by @jaheba)
- #1938 - Fixed autograd inplace operations error in Transformed Distribution (by @shubhamkapoor)
- #1950 - Fix: Hard threshold positive distribution parameters (by @lostella)
- #1952 - Fix forecast keys (quantiles) output by TemporalFusionTransformer (by @lostella)
- #1968 - Fix: use of num_parallel_samples in deepAR (by @kashif)
- #1969 - Fix: torch DeepAR observed indicator in multivariate case (by @kashif)
- #1975 - use FieldName (by @kashif)
- #1983 - Documentation: add docstrings for torch-based models (by @lostella)
- #1986 - Fix OffsetSplitter for negative offsets (by @lostella)
- #1989 - Pin protobuf version. (by @jaheba)
- #1991 - Remove packaged pytorch-ts from gluonts.nursery.SCott (by @lostella)
- #1999 - Documentation: fix and speed up tutorials (by @lostella)
- #2004 - Refactor splitter assertion and add error message (by @rsnirwan)
- #2005 - Rework itertools, add col-to-row and row-to-col functions. (by @jaheba)
- #2008 - Re-add cache for parsing 'pd.Period'. (by @jaheba)
- #2013 - Update website template, clean up homepage and tutorials (by @lostella)
- #2014 - Expose Estimator, Predictor, Forecast in gluonts.model. (by @jaheba)
- #2015 - Fix mean in AffineTransformedDistribution (by @stailx)
- #2016 - Fix torch affine transformed distribution (by @lostella)
- #2020 - Remove unnecessary files from docs folder, update gitignore (by @lostella)
- #2021 - Update references to dev branch. (by @lostella)
- #2024 - Fix README. Use DataFramesDataset. (by @jaheba)
- #2025 - Make HP tuning tutorial more accurate (by @jaheba)
- #2028 - Re-add support for Python 3.6 (by @jaheba)
- #2029 - Add support for nan values in Rotbaum (by @zoolhasson)
- #2035 - Simplify lag values computation in torch DeepAR (by @lostella)
- #2036 - Minor improvements to the hierarchical model (by @rshyamsundar)
- #2047 - Make Quantile derive from pydantic.BaseModel. (by @jaheba)
- #2050 - Add concepts section to docs. (by @jaheba)
- #2051 - Add tutorial on DataFramesDataset (by @rsnirwan)
- #2057 - Add optional parameter time_axis to forecast_start. (by @melopeo)
- #2062 - Fix type annotations for predict_to_numpy (by @lostella)
- #2066 - Always pass freq explicitly to pd.period_range. (by @kashif)
- #2068 - Docs: simplify call to evaluator (by @lostella)
- #2092 - Fix: DistributionLoss not encodable. (by @jaheba)
- #2098 - Add Airtraffic dataset. (by @jaheba)
- #2108 - Fixup trainer in case of non-finite loss. (by @jaheba)
- #2121 - Change default behavior for TrainDatasets overwrite (by @nklingen)
0.9.6
0.10.0 rc1
Overview
Arrow based datasets
We have added support for Parquet files, as well as Arrow's binary format. This is an opt-in feature that requires pyarrow to be installed. Use pip install 'gluonts[pro]' or pip install 'gluonts[arrow]' to ensure the correct version is installed.
FileDataset has been reworked to support .parquet and .arrow files. Previously, it assumed that all files used jsonlines. To continue using jsonlines, ensure that the files use one of the .json, .jsonl, .json.gz, or .jsonl.gz suffixes.
Depending on the dataset size and shape, Arrow can be much faster than the json variant. In more extreme cases we saw speedups of more than 100x when using arrow vs jsonlines (see #2003 for some examples).
To convert a given dataset into arrow, you can use the gluonts.dataset.arrow utility:
python -m gluonts.dataset.arrow write </path/to/dataset> my-dataset.arrow
PandasDataset
We have added support for pandas.DataFrame and pandas.Series as well. You can now directly model data given in a DataFrame using gluonts.dataset.pandas.PandasDataset. In this tutorial we describe in depth how you can use PandasDataset to speed up modelling using GluonTS.
Changelog
New Features
- #1631 - Add TimeLimitCallback to mx/trainer callbacks. (by @yx1215)
- #1780 - adding MQF2 (Multi-horizon) (by @KelvinKan)
- #1903 - Added QuarterlyBegin time feature (by @kashif)
- #1924 - Porting SimpleFeedForwardEstimator to PyTorch (by @lostella)
- #1925 - DeepAR PyTorch: make samplers configurable (by @lostella)
- #1935 - added support for pandas dataframes (by @rsnirwan)
- #1962 - Add support for beta-NLL loss (by @kashif)
- #1982 - Add Uber-TLC dataset to dataset repository. (by @Hongqing-work)
- #1990 - Add info cli. (by @jaheba)
- #1987 - Add HP tuning example with Optuna (by @npnv)
- #2000 - Add arrow-based dataset. (by @vafl, @lostella, @jaheba)
- #2002 - add ND for item_metrics (by @melopeo)
- #2006 - Added support of "long" RTS, making short RTS be "past_feat_dynamic_real" (by @zoolhasson)
- #2061 - Add DatasetWriter. (by @jaheba)
- #2074 - Add support for second frequency. (by @kashif)
Breaking Changes
- #1917 - Breaking: Fix return types of features (by @lostella)
- #1941 - Breaking: Update dependency fbprophet -> prophet (by @lostella)
- #1946 - Breaking: Split incremental quantile output into separate class (by @lostella)
- #1965 - Breaking: reorg torch package, shorten import paths (by @lostella)
- #1980 - Use pd.Period instead of pd.Timestamp. (by @jaheba)
- #1997 - Remove freq argument from Forecast. (by @kashif)
- #2011 - Remove dct_reduce. (by @jaheba)
- #2018 - Remove multiprocessing dataloader. (by @jaheba)
- #2019 - Rework FileDataset. (by @jaheba)
- #2053 - Add dataset_writer to get_dataset. (by @Hongqing-work)
- #2070 - Add jsonl.encode_json, remove serialize_data_entry. (by @jaheba)
Bug Fixes / Minor Improvements
- #1704 - Settings._let will pop element it added instead of just the last one. (by @jaheba)
- #1905 - Fix typing issues in torch estimators, update base estimators docstrings (by @lostella)
- #1909 - Fix the use of the scaling parameter in Transformer model (by @StanislasGuinel)
- #1916 - Fix AddTimeFeatures transformation for multiples of base frequencies (by @lostella)
- #1920 - Fix: use broadcast_lesser in place of comparisons in ISQF (by @vincentqb)
- #1931 - Fix dummy estimator (by @canerturkmen)
- #1933 - Fix Pytorch Lightning tutorial. (by @jaheba)
- #1938 - Fixed autograd inplace operations error in Transformed Distribution (by @shubhamkapoor)
- #1950 - Fix: Hard threshold positive distribution parameters (by @lostella)
- #1952 - Fix forecast keys (quantiles) output by TemporalFusionTransformer (by @lostella)
- #1968 - Fix: use of num_parallel_samples in deepAR (by @kashif)
- #1969 - Fix: torch DeepAR observed indicator in multivariate case (by @kashif)
- #1975 - use FieldName (by @kashif)
- #1983 - Documentation: add docstrings for torch-based models (by @lostella)
- #1986 - Fix OffsetSplitter for negative offsets (by @lostella)
- #1989 - Pin protobuf version. (by @jaheba)
- #1991 - Remove packaged pytorch-ts from gluonts.nursery.SCott (by @lostella)
- #1999 - Documentation: fix and speed up tutorials (by @lostella)
- #2004 - Refactor splitter assertion and add error message (by @rsnirwan)
- #2005 - Rework itertools, add col-to-row and row-to-col functions. (by @jaheba)
- #2008 - Re-add cache for parsing 'pd.Period'. (by @jaheba)
- #2013 - Update website template, clean up homepage and tutorials (by @lostella)
- #2014 - Expose Estimator, Predictor, Forecast in gluonts.model. (by @jaheba)
- #2015 - Fix mean in AffineTransformedDistribution (by @stailx)
- #2016 - Fix torch affine transformed distribution (by @lostella)
- #2020 - Remove unnecessary files from docs folder, update gitignore (by @lostella)
- #2021 - Update references to dev branch. (by @lostella)
- #2024 - Fix README. Use DataFramesDataset. (by @jaheba)
- #2025 - Make HP tuning tutorial more accurate (by @jaheba)
- #2028 - Re-add support for Python 3.6 (by @jaheba)
- #2029 - Add support for nan values in Rotbaum (by @zoolhasson)
- #2035 - Simplify lag values computation in torch DeepAR (by @lostella)
- #2036 - Minor improvements to the hierarchical model (by @rshyamsundar)
- #2047 - Make Quantile derive from pydantic.BaseModel. (by @jaheba)
- #2050 - Add concepts section to docs. (by @jaheba)
- #2051 - Add tutorial on DataFramesDataset (by @rsnirwan)
- #2057 - Add optional parameter time_axis to forecast_start. (by @melopeo)
- #2062 - Fix type annotations for predict_to_numpy (by @lostella)
- #2068 - Docs: simplify call to evaluator (by @lostella)
0.9.5
Backporting fixes:
0.9.4
0.9.3
Backporting fixes:
- Fix: use broadcast_lesser in place of comparisons in ISQF (#1920 by @vincentqb)
- Fix dummy estimator (#1931 by @canerturkmen)
- Fix Pytorch Lightning tutorial (#1933 by @jaheba)
- Fixed autograd inplace operations error in Transformed Distribution (#1938 by @shubhamkapoor)
0.9.2
0.9.1
0.9.0
Changelog
New Features
- Add ckpt_path argument to PyTorchLightningEstimator. (#1872)
- Add TSBench (#1865)
- add SCott code to nursery (#1827)
- Add dynamic code for shell. (#1821)
- Adding torch.isqf (#1815)
- Add tsbench readme placeholder (#1808)
- Adding ISQF distribution class (#1746)
- Adding IQF to remove quantile crossing and required retraining for ne… (#1693)
- Hierarchical Forecaster: End-to-End model based on DeepVAR (#1665)
- Adding gluonts.torch.piecewise_linear (#1663)
- Add quantile regression mode to AutoGluon-based TabularEstimator (#1611)
- add dummy estimator to trivial models (#1602)
Bug Fixes
- Add file path argument to m5 dataset generation (#1896)
- Fix negative binomial parameter map (#1893)
- Fix negative binomial sampling (#1884)
- Fixes for Monash Forecasting Repository datasets (#1879)
- Fix serde.flat type handling. (#1851)
- Fix datesplitter. (#1850)
- changed metadata creation function (#1847)
- Check equality of transformations. (#1844)
- Fix samples scaling in PyTorch DeepAR (#1836)
- Fix _version for cases when git is not installed. (#1825)
- Fixed data leakage bug in implementation of dynamic real and categorical features (#1809)
- fix for #1725, reverse breaking changes to data loader and handle all zero batches (#1779)
- Upgrade pytorch and pytorch-lightning requirements and some fixes. (#1765)
- Fix torch NOPScaler shape. (#1752)
- Convert batchify list to np array (#1732)
- Fix gluonts.json; added bdump/bdumps. (#1721)
- Fix scaling for pytorch negative binomial output (#1702)
- Fix frequency string conversion from ts format, add test (#1652)
- Fix NegativeBinomial constructor args in NegativeBinomialOutput (torch) (#1651)
- Add batch_size attribute to MQCNNEstimator and MQRNNEstimator (#1645)
- Add additional datasets from the Monash Time Series Forecasting Repository (#1632)
Breaking Changes
- Extend default quantiles for MQ* Estimators to match MSIS quantiles. (#1866)
- changed metadata creation function (#1847)
- Remove support module. (#1792)
- Set minimum Python version to 3.7. (#1791)
- Exceptions cleanup. (#1615)
Other Changes & Improvements
- Update mypy to 0.910. (#1875)
- Bump ujson from 4.3.0 to 5.1.0 in /src/gluonts/nursery/tsbench (#1869)
- Update black to v22. (#1867)
- Fix docstring typo in feature.py (#1863)
- Fix scott checks. (#1845)
- Remove requirement for @validated in from_hyperparameters. (#1826)
- Fix test collect ignore. (#1817)
- Split tests into one workflow for each framework. (#1805)
- Mark transformer as flaky. (#1801)
- Mark empirical_distribution test as flaky. (#1798)
- Use of int/float/object over np.int/float/object for dtype. (#1795)
- Rework tests. (#1786)
- Update typing_extension version. (#1785)
- Use of independent random seed. (#1767)
- Upgrade pytorch and pytorch-lightning requirements and some fixes. (#1765)
- Remove sphinx-autobuild sphinx-autorun, update sphinx version. (#1745)
- Exclude bin folders from apidoc. (#1744)
- Don't run doctest on nursery. (#1743)
- Hierarchical: Compute relative reconciliation error and add tests (#1722)
- Fixing doc build from mqcnn-iqf commit (#1699)
- Replace miniver with custom versioning code. (#1662)
- Cap numba<0.54, ipykernel<6.2.0 (#1661)
- Removed assert for cardinality and static feats (#1659)
0.8.1
Backporting fixes:
- loosen RTOL in test/distribution/test_flows.py to make test_flow_invertibility pass (#1604)
- Add batch_size attribute to MQCNNEstimator and MQRNNEstimator (#1645)
- Fix NegativeBinomial constructor args in NegativeBinomialOutput (torch) (#1651)
- Fix frequency string conversion from ts format, add test (adapted from #1652)