Optimization and Learning


  • Open in Colab: The first four problems

  • Open in Colab: The fifth problem

  • Open in Colab: The final problem


  • Problem 1 demonstrates the use of second-order methods to calculate an optimal learning rate for gradient descent. We examine how convergence is affected by changes in the learning rate (a minimal step-size sketch appears below).

  • Problems 2 and 3 try out several optimization algorithms, contrasting plain gradient descent, gradient descent with Polyak's learning rate, Nesterov's accelerated gradient descent, and the Adam optimizer. All of them are implemented from scratch and applied to regression on two bivariate functions with an MSE loss (see the optimizer sketch below).

  • Problem 4 shows how data normalization can lead to faster training, with further analysis of how the structure of a dataset relates to 'good' learning rates (see the normalization sketch below).

  • Problem 5 explores gradient ascent for finding local maxima of functions (a one-dimensional sketch is included below).

  • Problem 6 shows the use of Rprop and Quickprop for a regression task. We compare a neural network with a single hidden layer across different numbers of hidden neurons and different activation functions, trained with standard batch backpropagation, Rprop, and Quickprop (an Rprop update sketch appears below). The dataset used for this problem is the Concrete Compressive Strength dataset, which can be found here.


The notebooks contain the required equations and explanations for all the problems; the short sketches below illustrate the core update rules.
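For Problem 1, a minimal sketch of how second-order information yields an 'optimal' learning rate: for a quadratic objective, the exact line-search step along the negative gradient follows from the Hessian. The matrix `A`, vector `b`, and iteration count here are illustrative assumptions, not the notebook's actual setup.

```python
# Problem 1 sketch (illustrative): optimal learning rate for gradient descent
# on an assumed quadratic f(w) = 0.5 * w^T A w - b^T w.
import numpy as np

A = np.array([[3.0, 0.5],
              [0.5, 1.0]])        # assumed symmetric positive-definite Hessian
b = np.array([1.0, -2.0])

def grad(w):
    return A @ w - b

w = np.zeros(2)
for _ in range(25):
    g = grad(w)
    # Exact line search for a quadratic: the second-order term g^T A g
    # gives the step size that minimizes f along the descent direction -g.
    eta = (g @ g) / (g @ (A @ g))
    w = w - eta * g

print("gradient descent estimate:", w)
print("closed-form minimizer:    ", np.linalg.solve(A, b))
```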
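For Problems 2 and 3, a minimal sketch of the four update rules implemented from scratch, applied to an MSE fit of a bivariate function. The synthetic data, target function, and hyper-parameters are assumptions made for illustration; the notebooks use their own functions and settings.

```python
# Problems 2 and 3 sketch (illustrative): gradient descent, Polyak's step size,
# Nesterov's accelerated gradient, and Adam on a least-squares (MSE) fit.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))            # assumed bivariate inputs
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5          # assumed target function

def mse(w):
    return np.mean((X @ w[:2] + w[2] - y) ** 2)

def grad(w):
    r = X @ w[:2] + w[2] - y
    return np.array([2 * np.mean(r * X[:, 0]),
                     2 * np.mean(r * X[:, 1]),
                     2 * np.mean(r)])

def gd(w, lr=0.1, steps=500):
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def polyak(w, f_star=0.0, steps=500):
    # Polyak's step size: eta_t = (f(w_t) - f*) / ||grad f(w_t)||^2
    for _ in range(steps):
        g = grad(w)
        w = w - (mse(w) - f_star) / (g @ g + 1e-12) * g
    return w

def nesterov(w, lr=0.1, mu=0.9, steps=500):
    v = np.zeros_like(w)
    for _ in range(steps):
        v = mu * v - lr * grad(w + mu * v)       # gradient at the look-ahead point
        w = w + v
    return w

def adam(w, lr=0.05, b1=0.9, b2=0.999, eps=1e-8, steps=500):
    m, v = np.zeros_like(w), np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g                # first-moment estimate
        v = b2 * v + (1 - b2) * g ** 2           # second-moment estimate
        m_hat, v_hat = m / (1 - b1 ** t), v / (1 - b2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w0 = np.zeros(3)
for name, opt in [("GD", gd), ("Polyak", polyak), ("Nesterov", nesterov), ("Adam", adam)]:
    print(f"{name:8s} final MSE: {mse(opt(w0.copy())):.6f}")
```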
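For Problem 4, a small sketch of why normalization speeds up training, under the assumption of a least-squares loss: z-scoring the features shrinks the condition number of the Hessian, which raises the largest stable learning rate for gradient descent. The feature ranges below are invented for illustration.

```python
# Problem 4 sketch (illustrative): badly scaled features give an ill-conditioned
# MSE loss surface; z-score normalization allows a much larger stable learning rate.
import numpy as np

rng = np.random.default_rng(1)
raw = np.column_stack([rng.uniform(0, 1, 500),       # assumed feature on [0, 1]
                       rng.uniform(0, 1000, 500)])   # assumed feature on [0, 1000]
norm = (raw - raw.mean(axis=0)) / raw.std(axis=0)    # z-score normalization

for name, X in [("raw", raw), ("normalized", norm)]:
    H = 2 * X.T @ X / len(X)                  # Hessian of the least-squares loss
    eig = np.linalg.eigvalsh(H)               # eigenvalues in ascending order
    print(f"{name:10s}  condition number: {eig[-1] / eig[0]:.3g}, "
          f"max stable learning rate: {2 / eig[-1]:.2e}")
```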
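For Problem 5, a one-dimensional gradient ascent sketch on an assumed example function (the notebook's functions may differ): stepping along the gradient, rather than against it, climbs towards a local maximum.

```python
# Problem 5 sketch (illustrative): gradient ascent to a local maximum.
import numpy as np

def f(x):
    return -(x - 2.0) ** 2 + np.sin(3.0 * x)          # assumed example function

def df(x):
    return -2.0 * (x - 2.0) + 3.0 * np.cos(3.0 * x)   # its analytic derivative

x, lr = 0.0, 0.05
for _ in range(200):
    x = x + lr * df(x)          # ascend: step *along* the gradient
print(f"local maximum near x = {x:.4f}, f(x) = {f(x):.4f}")
```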
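For Problem 6, a sketch of the Rprop update rule only (the iRprop− variant), using a linear model on synthetic data as a stand-in so the snippet stays short; the notebook itself applies Rprop and Quickprop to a one-hidden-layer network on the Concrete Compressive Strength data.

```python
# Problem 6 sketch (illustrative): Rprop adapts a per-weight step size from the
# sign of the gradient, growing it while the sign is stable and shrinking it
# when the sign flips.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = X @ np.array([1.5, -0.7, 2.0, 0.3]) + 1.0         # assumed regression target
Xb = np.column_stack([X, np.ones(len(X))])            # add a bias column

def grad(w):
    return 2 * Xb.T @ (Xb @ w - y) / len(Xb)          # gradient of the MSE loss

w = np.zeros(5)
delta = np.full_like(w, 0.1)          # per-weight step sizes
prev_g = np.zeros_like(w)
eta_plus, eta_minus = 1.2, 0.5
for _ in range(200):
    g = grad(w)
    same_sign = g * prev_g
    delta = np.where(same_sign > 0, np.minimum(delta * eta_plus, 1.0), delta)
    delta = np.where(same_sign < 0, np.maximum(delta * eta_minus, 1e-6), delta)
    g = np.where(same_sign < 0, 0.0, g)   # iRprop-: skip the weight whose sign flipped
    w = w - np.sign(g) * delta
    prev_g = g

print("learned weights and bias:", np.round(w, 3))
print("final MSE:", round(float(np.mean((Xb @ w - y) ** 2)), 4))
```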