Regression Analysis on Combined Cycle Power Plant Dataset

This project applies regression analysis to the "Combined Cycle Power Plant" dataset, fitting five regression models: multiple linear regression, polynomial regression, support vector regression, decision tree regression, and random forest regression. The goal is to identify the model that best predicts the electrical energy output of a combined cycle power plant, using the coefficient of determination (R-squared) as the selection criterion.

Data

The "Combined Cycle Power Plant" dataset contains four features and one target variable:

  • Temperature (T): The temperature, measured in °C.
  • Ambient Pressure (AP): The ambient pressure, measured in millibars.
  • Relative Humidity (RH): The relative humidity, measured in percent.
  • Exhaust Vacuum (V): The exhaust vacuum, measured in cm Hg.
  • Electrical Energy Output (PE): The electrical energy output of the power plant, measured in MW. This is the target variable the models predict.

You can find more information about the "Combined Cycle Power Plant" dataset at the UCI Machine Learning Repository.

Methodology

The analysis fits each of the five models (multiple linear regression, polynomial regression, support vector regression, decision tree regression, and random forest regression) to the data and compares them using the coefficient of determination (R-squared) as the evaluation metric.
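
The comparison described above can be sketched with scikit-learn as follows. This is a minimal illustration rather than the project's actual code: the synthetic data from make_regression stands in for the real dataset, and the 80/20 split, the polynomial degree, the scaling choices, and all hyperparameters are assumptions made for the example.

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Stand-in data (assumption): four synthetic features and one target,
# mimicking the shape of the dataset (T, AP, RH, V -> PE), not its values.
X, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The five model families named in the text; every setting here is illustrative.
models = {
    "multiple linear": LinearRegression(),
    "polynomial (degree 2)": make_pipeline(
        PolynomialFeatures(degree=2), LinearRegression()
    ),
    # SVR is sensitive to scale, so both the features and the target are standardized.
    "support vector": TransformedTargetRegressor(
        regressor=make_pipeline(StandardScaler(), SVR(kernel="rbf")),
        transformer=StandardScaler(),
    ),
    "decision tree": DecisionTreeRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Print the models from best to worst R-squared.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>22}: R^2 = {score:.4f}")
```

On the synthetic data the ranking will differ from the one reported for the real dataset, since make_regression generates a linear target.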

Results

Based on the coefficients of determination obtained for each model, the best-performing model for predicting the electrical energy output is random forest regression. It achieved the highest R-squared (0.9616, or 96.16%) of all the models evaluated.
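
For reference, the coefficient of determination is defined as one minus the ratio of the residual sum of squares to the total sum of squares. The function below is a small pure-Python sketch of that definition. The name r_squared and the sample values are made up for illustration and are not taken from the project.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean of the true values.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not from the power plant dataset).
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(f"R^2 = {r_squared(y_true, y_pred):.4f}")
```

A value of 1 means the predictions match the observations exactly, while a value of 0 means the model does no better than always predicting the mean.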

Limitations and Future Work

One limitation of this analysis is that the coefficient of determination was the sole metric used for model selection. Further analysis could consider other evaluation metrics, such as root mean squared error or mean absolute error, or use cross-validation to assess how well the models generalize. Future work could also explore feature engineering techniques or incorporate additional relevant variables to improve predictive accuracy.
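
One way such an extended evaluation might look is sketched below, using scikit-learn's cross_validate to report R-squared, RMSE, and MAE across five folds. As before, this is a hedged example: the synthetic data, the choice of five folds, and the random forest settings are assumptions, not part of the original analysis.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_validate

# Stand-in data (assumption): replace with the real features and target.
X, y = make_regression(n_samples=300, n_features=4, noise=10.0, random_state=0)

# Five shuffled folds; each sample is used for testing exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_results = cross_validate(
    RandomForestRegressor(n_estimators=50, random_state=0),
    X,
    y,
    cv=cv,
    scoring=("r2", "neg_root_mean_squared_error", "neg_mean_absolute_error"),
)

# scikit-learn negates error metrics so that higher is always better; undo that here.
r2_folds = cv_results["test_r2"]
rmse_folds = -cv_results["test_neg_root_mean_squared_error"]
mae_folds = -cv_results["test_neg_mean_absolute_error"]

print(f"R^2 : {r2_folds.mean():.4f} +/- {r2_folds.std():.4f}")
print(f"RMSE: {rmse_folds.mean():.2f} +/- {rmse_folds.std():.2f}")
print(f"MAE : {mae_folds.mean():.2f} +/- {mae_folds.std():.2f}")
```

Reporting the spread across folds alongside the mean gives a rough sense of how stable each model's score is, which a single train/test split cannot show.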

Usage

Feel free to use this code for educational purposes.

Contributions

Contributions to this repository are welcome. If you find a bug or have suggestions for improvement, please open an issue or submit a pull request.

Credits

This project was created by Santiago Moreno Velasquez as part of a Udemy Guided Project.

Acknowledgements

Thanks to the UCI Machine Learning Repository for providing the "Combined Cycle Power Plant" dataset.