MEDomicsTools Good Practices

This document presents good practices for various deep learning applications.

Contributors

Simon Giard-Leroux

Changelog

Revision	Date	Description
A	2022-04-28	Creation

To-Do

Learning rate warmup
Transfer Cool Links
Mixed Precision

Good Practices

Cool Links

Deep Learning Drizzle: https://deep-learning-drizzle.github.io/?s=03
INF8953DE - Reinforcement Learning Course:
- Main Website: https://chandar-lab.github.io/INF8953DE/
- Course Playlist: https://www.youtube.com/playlist?list=PLImtCgowF_ES_JdF_UcM60EXTcGZg67Ua
NetworkX to draw graphs: https://towardsdatascience.com/visualizing-networks-in-python-d70f4cbeb259
Hiddenlayer to draw block diagrams from PyTorch model: https://github.com/waleedka/hiddenlayer
Ontology Tutorial: https://protegewiki.stanford.edu/wiki/Protege4Pizzas10Minutes
Very cool business card: https://bruno-simon.com/
Made with ML: https://madewithml.com/?utm_campaign=Made+With+ML&utm_medium=email&utm_source=Revue+newsletter
Imaging data:
- MedMNIST: https://medmnist.com/
- NMDID: https://nmdid.unm.edu/
Simpson's Paradox: https://compucademy.net/exploring-simpsons-paradox-with-python/
PyTorch Profiler: https://pytorch.org/blog/introducing-pytorch-profiler-the-new-and-improved-performance-tool/
Python @cache decorator:
- Video: https://www.youtube.com/watch?v=DnKxKFXB4NQ
- Tutorial: https://realpython.com/lru-cache-python/
Python Numba:
- Video: https://www.youtube.com/watch?v=x58W9A2lnQc
- Main Page: https://numba.pydata.org/
Machine Learning Datasets: https://sebastianraschka.com/blog/2021/ml-dl-datasets.html
Machine Learning Basic Code:
- Machine Learning From Scratch: https://github.com/eriklindernoren/ML-From-Scratch
- Machine Learning for Big Code and Naturalness: https://ml4code.github.io/papers.html
Nvidia AI Playground: https://www.nvidia.com/en-us/research/ai-playground/
Comet.ml: https://www.comet.ml/site/
Optuna: https://optuna.org/

Learning Rate Warmup

A warmup period of a few epochs can be used at the start of training, before the learning rate reaches its peak value. For the Adam optimizer, this can improve early-stage training stability by limiting the size of the parameter updates. The most common schedule is a linear warmup, in which the global learning rate starts at zero and increases by a constant amount at each iteration until it reaches its intended value.

See the following paper for details: On the adequacy of untuned warmup for adaptive optimization (https://arxiv.org/pdf/1910.04209.pdf)
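The linear warmup rule described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `linear_warmup_lr` and its parameters are hypothetical, and the rate is assumed to be updated once per step.

```python
def linear_warmup_lr(step: int, peak_lr: float, warmup_steps: int) -> float:
    """Learning rate at a 0-indexed step under linear warmup starting from zero.

    Hypothetical helper for illustration only.
    """
    if warmup_steps <= 0:
        # No warmup requested: use the peak rate from the first step
        return peak_lr
    # The rate grows by peak_lr / warmup_steps at each step, then stays at the peak
    return peak_lr * min(step, warmup_steps) / warmup_steps


# Rates for the first six steps with a 4-step warmup toward a peak of 1e-3
schedule = [linear_warmup_lr(step, peak_lr=1e-3, warmup_steps=4) for step in range(6)]
```

The rate starts at zero, rises by the same amount at each step, and is held at the peak once `warmup_steps` is reached.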

The following is an example implementation in PyTorch:

from typing import Any

from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

def __build_scheduler(
    self,
    n_warmup: int,
    n_epochs: int,
    warmup_floor: float = 0.1,
    decay_floor: float = 0.1,
) -> Any:
    """Method to build the learning rate scheduler for training.

    Args:
        n_warmup (int): number of warmup epochs
        n_epochs (int): total number of epochs
        warmup_floor (float, optional): ratio of the peak learning rate used at the start of the warmup phase. Defaults to 0.1.
        decay_floor (float, optional): ratio of the peak learning rate reached at the end of the decay phase. Defaults to 0.1.

    Returns:
        Any: scheduler chaining the warmup and decay phases
    """
    # Sets a warmup period for the learning rate
    # https://arxiv.org/pdf/1910.04209.pdf
    warmup_scheduler = LinearLR(
        optimizer=self.__optimizer,
        start_factor=warmup_floor,
        end_factor=1.0,
        total_iters=n_warmup,
    )

    # Reduces the learning rate after warmup period
    decay_scheduler = CosineAnnealingLR(
        optimizer=self.__optimizer,
        T_max=n_epochs - n_warmup,
        eta_min=self.__learning_rate * decay_floor,
    )

    # Creating the scheduler
    return SequentialLR(
        optimizer=self.__optimizer,
        schedulers=[warmup_scheduler, decay_scheduler],
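        # Switch to the decay scheduler once the n_warmup warmup epochs are done,
        # so the decay spans exactly the remaining n_epochs - n_warmup epochs
        milestones=[n_warmup],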
    )
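To sanity-check the values such a scheduler should produce, the intended shape can be written in closed form without PyTorch. The sketch below is an approximation, not a reproduction of PyTorch's internals. The function name `lr_at_epoch` is hypothetical, and the cosine phase is assumed to start exactly at epoch `n_warmup`.

```python
import math


def lr_at_epoch(
    epoch: int,
    peak_lr: float,
    n_warmup: int,
    n_epochs: int,
    warmup_floor: float = 0.1,
    decay_floor: float = 0.1,
) -> float:
    """Closed-form learning rate for linear warmup followed by cosine decay.

    Hypothetical helper; assumes the decay phase begins at epoch n_warmup.
    """
    if epoch < n_warmup:
        # Linear ramp of the multiplicative factor from warmup_floor up to 1.0
        factor = warmup_floor + (1.0 - warmup_floor) * epoch / n_warmup
        return peak_lr * factor
    # Cosine decay from peak_lr down to peak_lr * decay_floor
    eta_min = peak_lr * decay_floor
    t_max = max(n_epochs - n_warmup, 1)
    t = min(epoch - n_warmup, t_max)
    return eta_min + (peak_lr - eta_min) * (1.0 + math.cos(math.pi * t / t_max)) / 2.0


# Learning rate for every epoch of a 20-epoch run with 5 warmup epochs
rates = [lr_at_epoch(e, peak_lr=1e-3, n_warmup=5, n_epochs=20) for e in range(21)]
```

With these settings the rate starts at 10% of the peak, reaches the peak at epoch 5, and decays back to 10% of the peak by epoch 20. Plotting `rates` is a quick way to check a schedule before launching a long training run.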

Model Calibration

Model calibration involves adjusting the predicted probabilities generated by a statistical or machine learning model to ensure they accurately reflect the true probabilities of events occurring. This process is crucial for applications requiring reliable probability estimates, such as risk assessment and medical diagnosis. Calibration is typically assessed using calibration curves or metrics like the Brier score, with techniques like Platt Scaling or Isotonic Regression employed to improve calibration if necessary. A well-calibrated model provides more trustworthy predictions, enhancing decision-making in uncertain scenarios.

Quick overview of model calibration: A Comprehensive Guide on Model Calibration: What, When, and How
Sklearn's model calibration: sklearn
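As a rough illustration of the assessment tools mentioned above, the Brier score and the points of a calibration curve can be computed by hand for a binary classifier. This is a minimal sketch: the helper names `brier_score` and `reliability_bins`, the equal-width binning, and the toy data are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probabilities and binary outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def reliability_bins(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float]]:
    """Return (mean predicted probability, observed positive rate) per non-empty bin."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # Equal-width bins over [0, 1]; a probability of exactly 1.0 goes in the last bin
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((y, p))
    curve = []
    for members in bins:
        if members:
            mean_prob = sum(p for _, p in members) / len(members)
            positive_rate = sum(y for y, _ in members) / len(members)
            curve.append((mean_prob, positive_rate))
    return curve


# Toy data, made up for illustration
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.6, 0.9]
score = brier_score(y_true, y_prob)  # roughly 0.085; lower is better
curve = reliability_bins(y_true, y_prob, n_bins=2)
```

For a well-calibrated model, the two values in each pair of `curve` are close to each other. In practice, scikit-learn offers equivalents (`sklearn.metrics.brier_score_loss`, `sklearn.calibration.calibration_curve`), and `sklearn.calibration.CalibratedClassifierCV` applies Platt scaling (`method="sigmoid"`) or isotonic regression (`method="isotonic"`).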
