Hillel_projects

Projects from Hillel Machine Learning class

Several machine learning projects were completed during the classes.

1. The formula for the flight range of a bullet fired at an angle to the horizon.

  • Generate data arrays for angle and initial velocity from normal and uniform distributions
  • Calculate the flight range distribution
  • Construct a histogram of the flight range
  • Fill out a report based on research results
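The steps above can be sketched with the standard library alone. This is a minimal illustration, assuming the usual no-drag, level-ground range formula R = v0² · sin(2θ) / g; the helper names, the angle interval, and the velocity mean and sigma are hypothetical values chosen for the example, not taken from the course materials.

```python
import math
import random

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def flight_range(v0, angle_deg):
    """Range on level ground without air resistance: v0^2 * sin(2*theta) / g."""
    return v0 ** 2 * math.sin(2 * math.radians(angle_deg)) / G

def sample_ranges(n, seed=0):
    """Draw angles from a uniform and speeds from a normal distribution."""
    rng = random.Random(seed)
    ranges = []
    for _ in range(n):
        angle = rng.uniform(10.0, 80.0)        # degrees, assumed interval
        v0 = max(0.0, rng.gauss(300.0, 20.0))  # m/s, assumed mean and sigma
        ranges.append(flight_range(v0, angle))
    return ranges

def histogram(values, bins=10):
    """Count values per equal-width bin; returns (bin_edges, counts)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts

ranges = sample_ranges(10_000)
edges, counts = histogram(ranges)
```

The `counts` list can be passed to any plotting tool to draw the histogram; swapping which input is uniform and which is normal shows how the shape of the range distribution changes.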

2. Experiments with minimization

  • Plot the loss function value over training (it should decrease: loss = f(epoch))
  • Try RMSE, MAE and possibly other losses for linear regression
  • Make an animation of the fit: plots of the fitted curve (line) changing over the data (see slides)
  • Experiment with non-linear data, for example: y = 2 * x**2 + x + 3.5 + noise
  • Experiment with number of samples, sigma, and optimization algorithms
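As a rough sketch of the kind of experiment listed above, the following fits a line with plain gradient descent and records the loss per epoch for either MSE or MAE. The function names, the ground-truth line y = 3x + 1, and the default hyperparameters are assumptions made for illustration; the course demo code may differ.

```python
import random

def make_data(n=100, sigma=0.5, seed=0):
    """Noisy samples around an assumed ground-truth line y = 3x + 1."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + 1.0 + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys

def fit_line(xs, ys, loss="mse", lr=0.1, epochs=500):
    """Fit y = w*x + b by gradient descent; returns (w, b, loss history)."""
    w, b = 0.0, 0.0
    n = len(xs)
    history = []
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        if loss == "mse":
            history.append(sum(e * e for e in errors) / n)
            grads = [2.0 * e for e in errors]            # d(e^2)/de
        elif loss == "mae":
            history.append(sum(abs(e) for e in errors) / n)
            grads = [(e > 0) - (e < 0) for e in errors]  # sign(e), subgradient of |e|
        else:
            raise ValueError(f"unknown loss: {loss}")
        grad_w = sum(g * x for g, x in zip(grads, xs)) / n
        grad_b = sum(grads) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, history

xs, ys = make_data()
w, b, history = fit_line(xs, ys, loss="mse")
w_mae, b_mae, history_mae = fit_line(xs, ys, loss="mae")
```

Plotting `history` against the epoch index gives the loss = f(epoch) curve; changing `n`, `sigma`, `lr`, or `loss` reproduces the other experiments in the list.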

3. Computation Graph. Derivative, gradient, learning rate, gradient descent

  • Take the analytical derivative of the sigmoid function (on paper with a pencil; take a photo of your calculation and attach it to the PDF report)
  • Experiment with the demo code (gradient descent)
    • Vary the learning rate
    • Vary the number of epochs
    • Plot MSE over training (over epochs for a specific learning rate)
  • Perform one forward and one backward step for L = (2a + b)(c - d), where a, b, c, d are arbitrary numbers
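For the last item, one forward and one backward step through the computation graph of L = (2a + b)(c - d) can be written out by hand. The sketch below introduces two intermediate nodes, u = 2a + b and v = c - d (names chosen here for illustration), and applies the chain rule; the input values are arbitrary.

```python
def forward_backward(a, b, c, d):
    """One forward and one backward pass for L = (2a + b)(c - d)."""
    # Forward pass: evaluate the intermediate nodes, then the output.
    u = 2 * a + b
    v = c - d
    L = u * v
    # Backward pass: gradients flow from L back to the inputs via the chain rule.
    dL_du = v  # dL/du = v
    dL_dv = u  # dL/dv = u
    grads = {
        "a": dL_du * 2,   # du/da = 2
        "b": dL_du * 1,   # du/db = 1
        "c": dL_dv * 1,   # dv/dc = 1
        "d": dL_dv * -1,  # dv/dd = -1
    }
    return L, grads

L, grads = forward_backward(1.0, 2.0, 5.0, 3.0)
# Here u = 4 and v = 2, so L = 8 and the gradients are a: 4, b: 2, c: 4, d: -4.
```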

4. Experiments with NN

--== PyTorch NN ==--

  • Plot loss curve (both train and test)
  • Print model weights (before and after training) and compare with original ones
  • Plot results (predicted line along with dataset data)

5. Classification Training and Tuning

Make 3 Jupyter notebooks for classification of 3 datasets (one notebook per dataset). Achieve at least 95% accuracy on each dataset, and 100% if possible (accuracy is measured on the test set), with as few neural network parameters as possible.

  • make_blobs() -- classification into 4 classes
  • make_circles() -- classification into 2 classes
  • make_moons() -- classification into 2 classes

6. Experiment with multiclass classification (MLP) on the MNIST dataset

  • Include dropout layers
  • Include batch normalization layers
  • Include more layers
  • Experiment with activation functions
  • Experiment with the presence / absence of dropout and batch normalization
  • Experiment with batch size
  • Plot loss = f(epochs) for each experiment

7. PCA use cases

  • Get dataset from Kaggle (any tabular dataset you want)
  • Make simple classifier / regressor on the dataset
  • Reasonably reduce dataset dimensionality
    • Plot explained variance
    • Explain the chosen number of components
  • Retrain the same classifier / regressor on the dataset with reduced dimensionality
  • Compare accuracies / MSEs and speed of the two approaches (with and without dimensionality reduction)

8. Classification metrics on MNIST

  • Accuracy (per class and general)
  • Precision (per class and general)
  • Recall (per class and general)
  • F1-score (per class and general)
  • Confusion matrix
  • Classification report
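All of the metrics above can be derived from the confusion matrix. The following is a minimal pure-Python sketch on a tiny made-up label set (not MNIST results); the function names are hypothetical, and in practice a library such as scikit-learn would typically be used instead.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts samples with true class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def accuracy(cm):
    """Overall accuracy: correct predictions (diagonal) over all samples."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total

def per_class_metrics(cm):
    """Precision, recall and F1 for each class, computed one-vs-rest."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted as another class
        fp = sum(cm[r][k] for r in range(n)) - tp   # another class, predicted as k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics

# Toy labels for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
acc = accuracy(cm)
metrics = per_class_metrics(cm)
```

Averaging the per-class values (macro average) gives one common choice for the "general" precision, recall and F1.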

9. Convolutional neural network (CNN)

  • Calculate number of weights on each layer
  • Calculate shape of tensors before and after each layer
  • Make model overfit the data. Show loss curves with overfit
  • Reduce model complexity (number of parameters) while keeping accuracy
  • Add batch norm as well (add loss curve plot, accuracy plot, classification report and confusion matrix as well)
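The first two items can be checked by hand with the standard formulas for convolution and linear layers. The sketch below uses a hypothetical small network on 1×28×28 inputs (the layer sizes are assumptions, not the architecture used in the course) and assumes square kernels and the usual floor-based output size formula.

```python
def conv2d_params(in_ch, out_ch, kernel, bias=True):
    """Weights of a square-kernel convolution: out_ch * (in_ch * k * k [+ 1 bias])."""
    return out_ch * (in_ch * kernel * kernel + (1 if bias else 0))

def linear_params(in_features, out_features, bias=True):
    """Weights of a fully connected layer: out * (in [+ 1 bias])."""
    return out_features * (in_features + (1 if bias else 0))

def out_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# Hypothetical network: conv(1->8, k=3, pad=1), pool(2), conv(8->16, k=3, pad=1), pool(2), linear(->10)
size = 28
size = out_size(size, kernel=3, padding=1)   # 28: padding 1 keeps the size for k=3
p1 = conv2d_params(1, 8, 3)                  # 8 * (1*3*3 + 1) = 80
size = out_size(size, kernel=2, stride=2)    # 14 after 2x2 pooling
size = out_size(size, kernel=3, padding=1)   # 14
p2 = conv2d_params(8, 16, 3)                 # 16 * (8*3*3 + 1) = 1168
size = out_size(size, kernel=2, stride=2)    # 7
flat = 16 * size * size                      # 784 features after flattening
p3 = linear_params(flat, 10)                 # 10 * (784 + 1) = 7850
total = p1 + p2 + p3                         # 9098 trainable parameters
```

Pooling layers contribute no trainable weights, so only their effect on the tensor shape is tracked.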

10. CV Architectures, fine-tuning

  • Transfer learning for your dataset on a chosen pretrained model
  • Construct your own simple architecture and train it on your dataset

11. Ensemble methods

  • Train classifiers on the dataset
  • There should be 3 classifiers (stacking, boosting, bagging). Mandatory steps:
    • primary data analysis (missing values, presence of categorical features, ...)
    • feature engineering (build 1-2 new features)
    • feature scaling
    • division of the dataset into training, validation and test parts
    • training the base model with default hyperparameters
    • selection of hyperparameters
    • evaluation of results

12. Text classification

  • Negative/Positive review classification

13. RNN

  • Negative/Positive review classification based on RNN

14. Use Transformer: Translation

  • Based on the T5 (or any other) model from the HuggingFace library, make a translator

15. Text classification

  • Negative/Positive review classification using SpaCy