
Pruning the Neural Networks

Pruning a neural network is a method of reducing the number of weights and neurons in the network so that it requires less computation power and less storage space.

Pruning is one of the techniques for making inference efficient: it produces models that are smaller, more memory-efficient, more power-efficient, and faster at inference with minimal loss in accuracy. Other such techniques include weight sharing and quantization. Deep learning draws inspiration from neuroscience in several respects, and pruning in deep learning is likewise biologically inspired.

Pruning helps address the following problems:

  1. Models are getting larger
  2. Inference speed, especially on a CPU
  3. Energy efficiency (memory usage)

Here we compare two different pruning techniques:

  1. Weight pruning
  2. Unit/neuron pruning

Weight Pruning

  1. Set individual weights in the weight matrix to zero. This corresponds to deleting connections.

  2. To achieve a sparsity of k%, rank the individual weights in the weight matrix W by their absolute magnitude and set the smallest k% to zero, as in the sketch below.
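As a minimal sketch of this (NumPy-based; the helper name prune_weights is ours, not from any library):

import numpy as np

def prune_weights(W, k):
    """Zero out the smallest k% of weights in W by absolute magnitude."""
    W = W.copy()
    n_prune = int(W.size * k / 100)
    if n_prune == 0:
        return W
    # Rank all weights by |w| and find the magnitude cut-off for the smallest k%.
    threshold = np.sort(np.abs(W).flatten())[n_prune - 1]
    # Deleting a connection == setting its weight to zero.
    W[np.abs(W) <= threshold] = 0.0
    return W

# Example: prune 50% of the connections in a random 4x4 weight matrix.
W = np.random.randn(4, 4).astype(np.float32)
print(prune_weights(W, 50))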

Tensors with several values set to zero can be considered sparse. This results in important benefits:

  • Compression. Sparse tensors are amenable to compression by only keeping the non-zero values and their corresponding coordinates.
  • Speed. Sparse tensors allow us to skip otherwise unnecessary computations involving the zero values.
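A small illustration of the compression point (assuming TensorFlow 2.x): a pruned, mostly-zero matrix can be stored as just its non-zero values plus their coordinates:

import tensorflow as tf

# A pruned weight matrix in which most entries are zero.
dense = tf.constant([[0.0, 0.0, 3.0],
                     [0.0, 2.0, 0.0],
                     [0.0, 0.0, 0.0]])

# tf.sparse keeps only the non-zero values and where they live.
sparse = tf.sparse.from_dense(dense)
print(sparse.indices.numpy())  # coordinates of non-zeros: [[0 2], [1 1]]
print(sparse.values.numpy())   # the non-zero values:      [3. 2.]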

Unit/Neuron Pruning

  1. Set entire columns of the weight matrix to zero, in effect deleting the corresponding output neuron.

  2. To achieve a sparsity of k%, rank the columns of the weight matrix by their L2 norm and zero out the smallest k%, as in the sketch below.
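The corresponding sketch (again NumPy; prune_units is our own illustrative name):

import numpy as np

def prune_units(W, k):
    """Zero out the k% of columns of W with the smallest L2 norm.
    Each zeroed column deletes one output neuron."""
    W = W.copy()
    n_prune = int(W.shape[1] * k / 100)
    if n_prune == 0:
        return W
    # Rank columns by their L2 norm and zero the weakest ones.
    norms = np.linalg.norm(W, axis=0)
    weakest = np.argsort(norms)[:n_prune]
    W[:, weakest] = 0.0
    return W

# Example: prune half of the 6 output neurons of a 4x6 weight matrix.
W = np.random.randn(4, 6).astype(np.float32)
print(prune_units(W, 50))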

Graph: accuracy vs. sparsity for weight pruning and for unit pruning.
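A plot like this can be reproduced along the following lines. This is a sketch under our own assumptions (a small dense MNIST classifier trained for two epochs, pruning applied to every Dense layer), not the exact experiment script; the original architecture and settings may differ:

import numpy as np
import tensorflow as tf

# Train a small fully-connected MNIST classifier.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, verbose=0)

def accuracy_at_sparsity(k, unit=False):
    """Prune k% of each Dense layer (weight- or unit-wise) and evaluate."""
    saved = [layer.get_weights() for layer in model.layers]
    for layer in model.layers:
        params = layer.get_weights()
        if not params:            # e.g. Flatten has no weights
            continue
        W, b = params
        W = W.copy()
        if unit:
            # Unit pruning: zero the columns with the smallest L2 norm.
            weakest = np.argsort(np.linalg.norm(W, axis=0))
            W[:, weakest[:int(W.shape[1] * k / 100)]] = 0.0
        else:
            # Weight pruning: zero the smallest k% of weights by magnitude.
            n = int(W.size * k / 100)
            if n > 0:
                threshold = np.sort(np.abs(W).flatten())[n - 1]
                W[np.abs(W) <= threshold] = 0.0
        layer.set_weights([W, b])
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    # Restore the unpruned weights before the next sparsity level.
    for layer, params in zip(model.layers, saved):
        layer.set_weights(params)
    return acc

for k in (0, 25, 50, 75, 90):
    print(k, accuracy_at_sparsity(k), accuracy_at_sparsity(k, unit=True))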

Getting Started

If you want to do research on or experiment with pruning neural networks, follow the instructions below.

Clone

  1. Fork the Repository
  2. Clone this repo to your local machine using https://github.com/jinsel/Pruning-the-Neural-Networks

Installation

Tensorflow

(Requires the latest pip)
$ pip install --upgrade pip
$ pip install tensorflow

(If you want to use GPU then)
$ pip install tensorflow-gpu

For NumPy

$ pip install numpy
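Optionally (our suggestion, since the references below point at it), the TensorFlow Model Optimization Toolkit ships a ready-made pruning API:

$ pip install tensorflow-model-optimization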

If you are using Google Colab:

Enable the GPU with: Runtime > Change runtime type > Hardware accelerator, and make sure GPU is selected.

References:

Research Papers and Blogs

  1. To prune, or not to prune: exploring the efficacy of pruning for model compression, Michael H. Zhu and Suyog Gupta, 2017

  2. Learning to Prune Filters in Convolutional Neural Networks, Qiangui Huang et al., 2018

  3. Pruning deep neural networks to make them fast and small – https://jacobgil.github.io/deeplearning/pruning-deep-learning

  4. Optimize machine learning models with the TensorFlow Model Optimization Toolkit

  5. Train sparse models (TensorFlow Model Optimization guide) – https://www.tensorflow.org/model_optimization/guide/pruning/train_sparse_models

  6. Pruning Deep Neural Networks – https://towardsdatascience.com/pruning-deep-neural-network-56cae1ec5505

Videos

  1. Toward Efficient Deep Neural Network Deployment: Deep Compression and EIE, Song Han – https://www.youtube.com/watch?v=CrDRr2fxbsg&t=656s

  2. Deep Compression, DSD Training and EIE – https://www.youtube.com/watch?v=vouEMwDNopQ
