Skip to content

lukeconibear/distributed_deep_learning

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Distributed deep learning

Examples of how to distribute deep learning on a High Performance Computer (HPC).

Contents

These examples use Ray Train in a static job on a HPC. Ray handles most of the complexity of distributing the work, with minimal changes to your TensorFlow or PyTorch code.

First, install the Python environments for the required HPC: install_python_environments.md.

It's preferable to use a static job on the HPC. To do this, you could test out different ideas locally in a Jupyter Notebook, then when ready convert this to an executable script (.py) and move it over. However, it is also possible to use Jupyter Notebooks interactively on the HPC following the instructions here: jupyter_notebook_to_hpc.md.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published