California Housing Prices

California Housing Prices EDA and prediction models using sklearn

Introduction:

This project involves a detailed exploratory data analysis (EDA) of California housing price data collected in 1990. I have also created various prediction models using linear regression, decision tree regression, and random forest regression algorithms. The project was created to understand the machine learning process followed in regression problems. The following topics are studied as part of this project:

  • Data understanding
  • Correlations and histograms
  • Splitting the dataset into training and test sets using different approaches
  • Imputing data using different methods
  • Encoding categorical features
  • Creating custom transformers and pipelines using BaseEstimator and TransformerMixin
  • Scaling the data using StandardScaler
  • Cross-validation
  • Grid search
  • Hyperparameter tuning
  • Creating and training Regression models
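
The steps listed above can be sketched end to end as follows. This is a minimal illustration, not the project's actual notebook code: it uses a small synthetic DataFrame in place of the real housing file, and the column names, the `RoomsPerHousehold` transformer, the income bins, and the parameter grid are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the housing data (column names are assumptions).
rng = np.random.default_rng(42)
n = 300
housing = pd.DataFrame({
    "median_income": rng.uniform(0.5, 15.0, n),
    "total_rooms": rng.integers(100, 5000, n).astype(float),
    "households": rng.integers(50, 1500, n).astype(float),
    "ocean_proximity": rng.choice(["INLAND", "NEAR BAY", "NEAR OCEAN"], n),
})
housing["median_house_value"] = (
    40000 * housing["median_income"] + rng.normal(0, 20000, n)
)
# Knock out a few values so the imputer has something to fill in.
housing.loc[rng.choice(n, 20, replace=False), "total_rooms"] = np.nan


class RoomsPerHousehold(BaseEstimator, TransformerMixin):
    """Hypothetical custom transformer: appends rooms / households."""

    def __init__(self, rooms_ix=1, households_ix=2):
        self.rooms_ix = rooms_ix
        self.households_ix = households_ix

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        ratio = X[:, self.rooms_ix] / X[:, self.households_ix]
        return np.c_[X, ratio]


X = housing.drop(columns=["median_house_value"])
y = housing["median_house_value"]

# Stratified split on binned income (one of several possible approaches).
income_cat = pd.cut(
    housing["median_income"], bins=[0, 1.5, 3, 4.5, 6, np.inf], labels=False
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=income_cat, random_state=42
)

num_cols = ["median_income", "total_rooms", "households"]
cat_cols = ["ocean_proximity"]

# Numeric columns: impute, add the ratio feature, then standardize.
num_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),
    ("ratio", RoomsPerHousehold()),
    ("scaler", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", num_pipeline, num_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
])
model = Pipeline([
    ("prep", preprocess),
    ("model", RandomForestRegressor(random_state=42)),
])

# Grid search with 3-fold cross-validation over a tiny illustrative grid.
search = GridSearchCV(
    model,
    param_grid={"model__n_estimators": [10, 30], "model__max_features": [2, 4]},
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

test_rmse = np.sqrt(mean_squared_error(y_test, search.predict(X_test)))
print("Best parameters:", search.best_params_)
print("Test RMSE:", test_rmse)
```

Keeping preprocessing inside the pipeline means the imputer and scaler are refit on each cross-validation fold, so no statistics from the validation fold leak into training. Swapping `RandomForestRegressor` for `LinearRegression` or `DecisionTreeRegressor` (with a matching parameter grid) gives the other two model types mentioned above.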

References:

Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron - 2019 - O’Reilly Media, Inc.