Umbrella issue for Chronos refactor to enable customized installation #3170

Open
3 tasks done
TheaperDeng opened this issue Oct 18, 2021 · 16 comments

@TheaperDeng
Contributor

TheaperDeng commented Oct 18, 2021

This issue is a detailed plan to realize issue intel-analytics/analytics-zoo#107 and to decouple the major functions of Chronos from other BigDL components.

@shane-huang @yushan111

Overall design strategy [edited after discussion with Jason]

  • Chronos should stick to nano for single-node acceleration when appropriate, and be self-contained: able to complete most of its functionality without any other dependencies (ray/orca).
  • Chronos will rely on orca/ray for distributed functionality (this will be reflected in the following table).
  • Chronos will have a lightweight inference installation strategy (maybe not a new whl).
  • tensorflow is not my first priority since tf2 is intel's AI strategy while Chronos has no tf2 model right now.

Here is a full functionality table (will be updated)

installation status installation cmd Chronos nano orca TSDataset XShardsTSDataset F/D/S*(distributed=False) F/D/S*(distributed=True) ONNX Auto Model AutoTSEstimator TSPipeline
chronos with nano** pip install bigdl-chronos; pip install bigdl-nano ✔(Pseudo distributed)
full chronos pip install bigdl-chronos[all]

* F/D/S means Forecaster/Detector/Simulators

** These two versions can be used as a lightweight inference install strategy.

To complete these, we mainly need these steps (which can be done simultaneously)

@jason-dai
Contributor

  1. chronos should always install nano as its dependency (and use it for single node acceleration when appropriate)
  2. Installation/deployment of chronos should have 4 options: (with or without orca) × (with or without ray)
  3. chronos should use nano API instead of pytorch-lightning; this functionality is however orthogonal to the installation/deployment requirement here

@TheaperDeng
Contributor Author

> 1. chronos should always install nano as its dependency (and use it for single node acceleration when appropriate)
> 2. Installation/deployment of chronos should have 4 options: (with or without orca) × (with or without ray)
> 3. chronos should use nano API instead of pytorch-lightning; this functionality is however orthogonal to the installation/deployment requirement here

  1. Nothing will happen when a user installs only ray but not orca. Chronos can theoretically do nothing with it, since AutoTS relies on orca.automl, XShardsTSDataset relies on orca.data, and distributed training relies on orca.learn.
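
A minimal sketch of how that reliance could be checked at runtime, so a missing orca install surfaces as a readable report instead of a failed import deep inside Chronos. The `bigdl.` prefix on the module paths and all function names here are assumptions for illustration, not existing Chronos code:

```python
import importlib.util

# Feature -> module it relies on, as listed in the comment above.
# The "bigdl." prefix is an assumption about the final package layout.
FEATURE_REQUIRES = {
    "AutoTS": "bigdl.orca.automl",
    "XShardsTSDataset": "bigdl.orca.data",
    "distributed training": "bigdl.orca.learn",
}


def module_available(name: str) -> bool:
    """Return True if `name` can be located.

    find_spec does not import the module itself, but it does import its
    parent packages, so a missing parent raises ModuleNotFoundError.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def unavailable_features() -> dict:
    """Map each feature that cannot run to the module it is missing."""
    return {
        feature: module
        for feature, module in FEATURE_REQUIRES.items()
        if not module_available(module)
    }


if __name__ == "__main__":
    for feature, module in unavailable_features().items():
        print(f"{feature} is disabled: {module} is not installed")
```

Note that ray does not appear in the mapping at all, which matches the point that installing ray without orca enables nothing.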

@jason-dai
Contributor

> > 1. chronos should always install nano as its dependency (and use it for single node acceleration when appropriate)
> > 2. Installation/deployment of chronos should have 4 options: (with or without orca) × (with or without ray)
> > 3. chronos should use nano API instead of pytorch-lightning; this functionality is however orthogonal to the installation/deployment requirement here
>
> 1. Nothing will happen when a user installs only ray but not orca. Chronos can theoretically do nothing with it, since AutoTS relies on orca.automl, XShardsTSDataset relies on orca.data, and distributed training relies on orca.learn.

So you have three options:

  1. nano only
  2. nano & orca
  3. nano & orca & ray
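
As a small sketch of the reasoning: the original 2 × 2 grid (orca on/off, ray on/off) shrinks to three options once "ray without orca" is excluded. The function name is illustrative only:

```python
from itertools import product


def valid_combinations():
    """Enumerate orca/ray combinations, dropping ray-without-orca."""
    combos = []
    for orca, ray in product([False, True], repeat=2):
        if ray and not orca:
            # ray alone enables nothing, since every distributed
            # feature goes through orca.
            continue
        parts = ["nano"]  # nano is always installed
        if orca:
            parts.append("orca")
        if ray:
            parts.append("ray")
        combos.append(" & ".join(parts))
    return combos


if __name__ == "__main__":
    for combo in valid_combinations():
        print(combo)
```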

@TheaperDeng
Contributor Author

> > > 1. chronos should always install nano as its dependency (and use it for single node acceleration when appropriate)
> > > 2. Installation/deployment of chronos should have 4 options: (with or without orca) × (with or without ray)
> > > 3. chronos should use nano API instead of pytorch-lightning; this functionality is however orthogonal to the installation/deployment requirement here
> >
> > 1. Nothing will happen when a user installs only ray but not orca. Chronos can theoretically do nothing with it, since AutoTS relies on orca.automl, XShardsTSDataset relies on orca.data, and distributed training relies on orca.learn.
>
> So you have three options:
>
> 1. nano only
> 2. nano & orca
> 3. nano & orca & ray

yep, that looks good.

@shanyu-sys
Contributor

Note that nano hasn't been fully supported yet.

For the first release of BigDL-2.0, we will only include one extra install option, [all], which will install all Chronos dependencies (PR intel-analytics/analytics-zoo#5033).

We will support other install options after the corresponding code is ready.

@shanyu-sys
Contributor

shanyu-sys commented Oct 19, 2021

Shall we also consider dependencies like TensorFlow and PyTorch? Chronos contains both TensorFlow models and PyTorch models, and we might add more TensorFlow models in the future.

I extended the table above to detail all Chronos components and their corresponding dependencies on Orca, Ray, TF, and PyTorch.

Component | Orca | Ray | Tensorflow | Pytorch | Notes
TSDataset | | | | |
TSDataset (distributed) | | | | |
Forecaster (LSTM, S2S, TCN, TCMF) | | | | | Issue intel-analytics/analytics-zoo#5006
Forecaster (distributed, backend=ray) | | | | | Supported pytorch only
Forecaster (distributed, backend!=ray) | | | | | Supported pytorch only
Forecaster (MTNet) | | | | | Issue intel-analytics/analytics-zoo#5037
Forecaster (Arima, Prophet) | | | | | Issue intel-analytics/analytics-zoo#5037
Detector | | | | |
Simulator | | | | |
AutoTSEstimator | | | | | Supported pytorch only
TSPipeline | | | | | Issue intel-analytics/analytics-zoo#5006, intel-analytics/analytics-zoo#5007

Therefore, with Tensorflow and Pytorch considered, the dependency options might be

  • nano & pytorch: Single node PyTorch-based Forecaster and Simulator, AutoTS inference
  • nano & tensorflow: Single node Tensorflow-based Forecaster and Detector
  • nano & orca & pytorch: Distributed PyTorch-based Forecaster without Ray backend, distributed TSDataset
  • nano & orca & ray & pytorch: Distributed PyTorch-based Forecaster with Ray backend, AutoTS training
  • None: Statistical Forecaster
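
The options above can be read as a lookup from installed dependencies to unlocked features. A rough sketch, with the entries transcribed from the bullets and the names `OPTIONS` and `unlocked` made up for illustration:

```python
# (required dependency set, features it unlocks), one entry per bullet above.
OPTIONS = [
    (frozenset(), ["Statistical Forecaster"]),
    (frozenset({"nano", "pytorch"}),
     ["Single node PyTorch-based Forecaster and Simulator",
      "AutoTS inference"]),
    (frozenset({"nano", "tensorflow"}),
     ["Single node Tensorflow-based Forecaster and Detector"]),
    (frozenset({"nano", "orca", "pytorch"}),
     ["Distributed PyTorch-based Forecaster without Ray backend",
      "distributed TSDataset"]),
    (frozenset({"nano", "orca", "ray", "pytorch"}),
     ["Distributed PyTorch-based Forecaster with Ray backend",
      "AutoTS training"]),
]


def unlocked(installed):
    """Return every feature whose required set is covered by `installed`."""
    installed = frozenset(installed)
    features = []
    for required, names in OPTIONS:
        if required <= installed:  # subset test
            features.extend(names)
    return features


if __name__ == "__main__":
    for name in unlocked({"nano", "orca", "pytorch"}):
        print(name)
```

Because the check is a subset test, a larger install always unlocks everything a smaller one does, and ray without orca adds nothing.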

In the future, we may support distributed training or tuning with Tensorflow-based model, then we may add:

  • nano & orca & tensorflow: Distributed TF-based Forecaster without Ray backend
  • nano & orca & ray & tensorflow: Distributed TF-based Forecaster with Ray backend, AutoTS training

@TheaperDeng
Contributor Author

Small update: Detector also relies on some pytorch models.

@TheaperDeng TheaperDeng transferred this issue from intel-analytics/BigDL-2.x Oct 21, 2021
@shane-huang
Contributor

Separating pytorch and tensorflow installation seems a valid request only for inference with size concerns. And it seems to me the option should belong to nano instead of chronos (e.g. separate installation of libs such as pytorch-lightning, IPEX, intel-tensorflow, etc.). So the install options can be simplified to train/nano[pytorch]/nano[tensorflow]. Further, as our pytorch support is much better and the tensorflow layer is thin (correct me if I'm wrong), we can simplify it to nano (both py+tf) & [nano+tensorflow]. So the options could be changed as below:

  • nano
  • nano (tensorflow-only)
  • nano + orca
  • nano + orca + ray

@shane-huang
Contributor

> Note that nano hasn't been fully supported yet.
>
> For the first release of BigDL-2.0, we will only include one extra install option, [all], which will install all Chronos dependencies (PR intel-analytics/analytics-zoo#5033).
>
> We will support other install options after the corresponding code is ready.

We should expect new customers to use our upcoming release. What's the target usage for the options with and without [all]? We may have to explain it to our customers. Does the option without [all] equal nano only? If that is correct, the functionality in this installation is quite limited:

  • no training or inference for DL-based forecasters (models are inherited from orca.automl.BaseModel)
  • no preprocessing (tsdataset depends on orca.data)
  • only a few ML-based models and detectors can be used.

We might need to consider our target usage when defining the options and which modules to include.

@jason-dai
Contributor

How about:

  • chronos default: nano+orca+pytorch
  • chronos[nano] or chronos[lite]: nano+pytorch
  • chronos[automl]: nano+orca+ray+pytorch
  • chronos[all]: nano+orca+ray+pytorch+tf

@TheaperDeng
Contributor Author

I prefer to simplify them to 3 options

  • chronos: nano+pytorch
  • chronos[distributed]: nano+orca+pytorch
  • chronos[all]: nano+orca+ray+pytorch
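
A hedged sketch of how these three options might map onto setuptools extras. `bigdl-nano` and `bigdl-orca` are the package names used in this thread; `torch` and `ray` as requirement names, the lack of version pins, and the `resolve` helper are assumptions for illustration:

```python
# Base requirements for a plain `pip install bigdl-chronos` (nano+pytorch).
INSTALL_REQUIRES = ["bigdl-nano", "torch"]

# Extras are additive: each one adds packages on top of the base list.
EXTRAS_REQUIRE = {
    "distributed": ["bigdl-orca"],
    "all": ["bigdl-orca", "ray"],
}

# In a setup.py these would be passed as
#   setuptools.setup(..., install_requires=INSTALL_REQUIRES,
#                    extras_require=EXTRAS_REQUIRE)


def resolve(extra=None):
    """Return the packages pulled in by `bigdl-chronos` or `bigdl-chronos[extra]`."""
    deps = list(INSTALL_REQUIRES)
    if extra is not None:
        deps += EXTRAS_REQUIRE[extra]
    return deps


if __name__ == "__main__":
    for extra in (None, "distributed", "all"):
        print(extra, resolve(extra))
```

Since extras can only add requirements to the base install, this layout works naturally when the smallest option is the default.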

And once our tf support is enhanced, we may change the options. Currently, if users want to use tf, we can let them install tf themselves.

The other reason is that tf1 conflicts with pytorch-lightning on some dependencies, which may cause "dependency hell".

nano+orca+pytorch will only support 2 more features than nano+pytorch:

  • XShardsTSDataset (experimental)
  • Distributed Training (w/o ray backend, while ray backend is our recommended & default backend)

@jason-dai
Contributor

> I prefer to simplify them to 3 options
>
> • chronos: nano+pytorch
> • chronos[distributed]: nano+orca+pytorch
> • chronos[all]: nano+orca+ray+pytorch
>
> And once our tf support is enhanced, we may change the options. Currently, if users want to use tf, we can let them install tf themselves.
>
> The other reason is that tf1 conflicts with pytorch-lightning on some dependencies, which may cause "dependency hell".
>
> nano+orca+pytorch will only support 2 more features than nano+pytorch:
>
> • XShardsTSDataset (experimental)
> • Distributed Training (w/o ray backend, while ray backend is our recommended & default backend)

The default should be distributed; a light version can be single node only.

@TheaperDeng
Contributor Author

TheaperDeng commented Oct 21, 2021

bigdl-chronos: nano+pytorch+orca
bigdl-chronos[lite]: nano+pytorch
bigdl-chronos[all]: nano+orca+ray+pytorch

And once our tf support is enhanced, we may change the options. Currently, if users want to use tf, we can let them install tf themselves.

@TheaperDeng
Contributor Author

> > Note that nano hasn't been fully supported yet.
> > For the first release of BigDL-2.0, we will only include one extra install option, [all], which will install all Chronos dependencies (PR intel-analytics/analytics-zoo#5033).
> > We will support other install options after the corresponding code is ready.
>
> We should expect new customers to use our upcoming release. What's the target usage for the options with and without [all]? We may have to explain it to our customers. Does the option without [all] equal nano only? If that is correct, the functionality in this installation is quite limited:
>
> • no training or inference for DL-based forecasters (models are inherited from orca.automl.BaseModel)
> • no preprocessing (tsdataset depends on orca.data)
> • only a few ML-based models and detectors can be used.
>
> We might need to consider our target usage when defining the options and which modules to include.

Preprocessing is available, since TSDataset is based on single-node pandas; only XShardsTSDataset is based on orca.data.

Still, we should not have this nearly useless option for this release. So we may simply let the [all] option be our default option for this release early next week.

I am not sure whether some of our customers require a lighter version for this release. And of course we always have the nightly built version later.

@shanyu-sys
Contributor

For this release, the default chronos dependency (pip install bigdl-chronos) includes bigdl-orca. We haven't resolved the issues mentioned before that would let the default chronos run without orca, including the Forecasters' dependencies on orca.automl.BaseModel and orca.automl.metrics, and TSPipeline evaluation with orca.automl.metrics.

So with the default chronos, we could support TSDataset, Forecasters, Simulators, Detectors, and TSPipeline. Note that users may need to manually install pytorch or tensorflow for the corresponding component they want to use. I could also add pytorch as a default dependency.

With the extra [all] option, users could enable the distributed functions, including distributed tuning with AutoTS, distributed forecasting, and distributed datasets.

@TheaperDeng
Contributor Author

> For this release, the default chronos dependency (pip install bigdl-chronos) includes bigdl-orca. We haven't resolved the issues mentioned before that would let the default chronos run without orca, including the Forecasters' dependencies on orca.automl.BaseModel and orca.automl.metrics, and TSPipeline evaluation with orca.automl.metrics.
>
> So with the default chronos, we could support TSDataset, Forecasters, Simulators, Detectors, and TSPipeline. Note that users may need to manually install pytorch or tensorflow for the corresponding component they want to use. I could also add pytorch as a default dependency.
>
> With the extra [all] option, users could enable the distributed functions, including distributed tuning with AutoTS, distributed forecasting, and distributed datasets.

Confirmed, thx. This will also be reflected in our user guide.
