Skip to content

Using various regression techniques and optimising them to select the best model that predicts correct number of rings from 8 features

Notifications You must be signed in to change notification settings

VIVAN-RANGRA/abalone_analysis-

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 

Repository files navigation

Predicting Abalone Age Using Various Machine Learning Models

This project aims to predict the age of abalone (a type of edible sea snail) from physical measurements using several machine learning models. The dataset used is the well-known Abalone dataset, which contains 4,177 instances and 8 attributes, including length, diameter, height, whole weight, shucked weight, viscera weight, shell weight, and sex.

Models Employed

The following machine learning models are explored in this project:

  1. Stochastic Gradient Descent Regressor (SGDRegressor)
  2. Multi-Layer Perceptron Regressor (MLPRegressor)
  3. Linear Regression Model
  4. AdaBoost Regressor
  5. Random Forest Regressor
  6. Gradient Boosting Regressor
  7. XGBoost

Libraries Used

The project primarily utilizes the following Python libraries:

  • NumPy: For numerical computations and array operations.
  • Pandas: For data manipulation and analysis.
  • Scikit-learn: For implementing various machine learning models, preprocessing techniques, and model evaluation metrics.

Methodology

  1. Data Preprocessing: The dataset is loaded, and necessary preprocessing steps are performed, such as handling missing values, encoding categorical features, and splitting the data into training and testing sets.

  2. Model Training and Hyperparameter Tuning: For each machine learning model, the following steps are performed:

    • Instantiate the model with default hyperparameters.
    • Perform a grid search using GridSearchCV to find the optimal hyperparameters for the model.
    • Train the model with the optimal hyperparameters on the training data.
  3. Model Evaluation: The trained models are evaluated on the test data using appropriate evaluation metrics from sklearn.metrics, such as Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R²) score.

  4. Model Comparison: The performance of all models is compared based on the evaluation metrics, and the best-performing model is identified.

Results

The results section will provide a comprehensive analysis of the performance of each model, including the optimal hyperparameters found during the grid search process, evaluation metrics on the test data using sklearn.metrics, and a comparison of all models. The best-performing model will be highlighted and recommended for predicting the age of abalone based on the physical measurements.

Contributing

Contributions to this project are welcome. If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

Acknowledgments

The Abalone dataset is a widely used benchmark dataset in the field of machine learning and data mining. We acknowledge the researchers and contributors who made this dataset available for educational and research purposes.

About

Using various regression techniques and optimising them to select the best model that predicts correct number of rings from 8 features

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published