Boston Housing Kaggle Challenge with Linear Regression:

Boston housing data: The dataset comes from the StatLib library, which is maintained at Carnegie Mellon University. It concerns housing prices in the Boston area and has 506 instances with 13 features. We will now work through the challenge in Python.

The description of the dataset has been taken from the following:

Boston Housing Kaggle with Linear Regression in Python for Data Science

Now we will make a linear model that predicts the house prices.

Importing libraries and dataset.
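The code for this step is not reproduced in the text. As a minimal sketch, assuming the classic scikit-learn loader `load_boston` is what gets imported here: that loader was removed in scikit-learn 1.2, so the hypothetical helper `load_housing` below falls back to a randomly generated stand-in with the same shape (506 x 13). The stand-in values are not real housing data and only keep the rest of the walkthrough runnable.

```python
import numpy as np

# The 13 feature names of the Boston housing dataset.
FEATURE_NAMES = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]


def load_housing(seed=0):
    """Return (data, target, feature_names). Hypothetical helper."""
    try:
        # Available up to scikit-learn 1.1; importing it raises ImportError from 1.2 on.
        from sklearn.datasets import load_boston
        boston = load_boston()
        return boston.data, boston.target, list(boston.feature_names)
    except ImportError:
        # Assumption: random stand-in with the same shape, NOT real data.
        rng = np.random.default_rng(seed)
        data = rng.normal(size=(506, len(FEATURE_NAMES)))
        target = rng.uniform(5.0, 50.0, size=506)
        return data, target, FEATURE_NAMES


data, target, feature_names = load_housing()
print(data.shape, target.shape)
```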

Shaping of input data and getting the feature_names.

Conversion of data from nd-array to the dataframe and adding the feature names to the data.
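A short sketch of these two steps, assuming the inputs arrive as a NumPy array with one column per feature. The array `raw` is a random stand-in (an assumption, not the real data); the name `data` for the dataframe follows the `data.info()` call used later in the article.

```python
import numpy as np
import pandas as pd

feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Stand-in for the raw nd-array of inputs (random values, not real data).
rng = np.random.default_rng(0)
raw = rng.normal(size=(506, 13))

# Shape of the input data: (number of instances, number of features).
print(raw.shape)

# Convert the nd-array to a dataframe and attach the feature names as column labels.
data = pd.DataFrame(raw, columns=feature_names)
print(data.head())
```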

Addition of ‘Price’ column to the dataset.

Describing the Boston dataset.
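Adding the target as a column and summarising the dataframe could look like the sketch below. The three feature columns and all values are random stand-ins (assumptions), chosen only to keep the example small; `describe()` reports count, mean, standard deviation, min, quartiles and max per column.

```python
import numpy as np
import pandas as pd

# Small stand-in dataframe (random values, not real data).
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(506, 3)), columns=["CRIM", "RM", "LSTAT"])
target = rng.uniform(5.0, 50.0, size=506)

# Add the target values as a new 'Price' column.
data["Price"] = target

# Summary statistics for every numeric column, including 'Price'.
summary = data.describe()
print(summary)
```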

Information of Boston Dataset

data.info()

Obtaining the input and output data and splitting them into training and testing datasets.
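One way to sketch this step uses `train_test_split` from scikit-learn. The 80/20 split, the `random_state` value and the variable names `xtrain`, `xtest`, `ytrain`, `ytest` are assumptions, and the dataframe is again a random stand-in rather than the real data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Stand-in dataframe with 13 features plus a 'Price' column (random values).
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(506, 13)), columns=feature_names)
data["Price"] = rng.uniform(5.0, 50.0, size=506)

# Inputs are all feature columns; the output is the 'Price' column.
x = data.drop(columns="Price")
y = data["Price"]

# Hold out 20% of the rows for testing (assumed ratio).
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.2, random_state=0
)
print(xtrain.shape, xtest.shape, ytrain.shape, ytest.shape)
```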

Application of the model on the dataset and prediction of prices
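Fitting the model and predicting prices can be sketched with scikit-learn's `LinearRegression`. The data here is synthetic (an assumption): it is generated from a known linear relationship plus noise so the fitted coefficients can be checked. The name `regressor` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: a known linear relationship plus a little noise.
rng = np.random.default_rng(0)
x = rng.normal(size=(506, 13))
true_coef = rng.normal(size=13)
y = x @ true_coef + 22.5 + rng.normal(scale=0.5, size=506)

xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.2, random_state=0
)

# Fit ordinary least squares on the training split.
regressor = LinearRegression()
regressor.fit(xtrain, ytrain)

# Predict prices for the held-out rows.
y_pred = regressor.predict(xtest)
print(y_pred[:5])
```

On real data the coefficients are of course unknown in advance; the synthetic setup only makes the example verifiable.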

Plotting of scatter graph for showing prediction results, ‘ytrue’ vs. ‘y_pred’ value
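A scatter plot of true against predicted values might be drawn as in this sketch, assuming matplotlib is available. The `ytest` and `y_pred` arrays are made-up stand-ins, and the colour and labels are arbitrary choices. The `Agg` backend is selected so the code also runs without a display; in an interactive session that line can be dropped and `plt.show()` called at the end.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove when plotting on screen
import matplotlib.pyplot as plt
import numpy as np

# Stand-in true and predicted values (random, not real results).
rng = np.random.default_rng(0)
ytest = rng.uniform(5.0, 50.0, size=100)
y_pred = ytest + rng.normal(scale=4.0, size=100)

fig, ax = plt.subplots()
# Each point is one test instance: x = true price, y = predicted price.
ax.scatter(ytest, y_pred, c="green")
ax.set_xlabel("Price (true value)")
ax.set_ylabel("Predicted value")
ax.set_title("True value vs. predicted value: linear regression")
```

Points close to the diagonal are well predicted; a wide spread around it indicates large errors.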

The result of the linear regression, measured as mean squared error.

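From the result, the model comes out at about 66.55% accurate, so it is not a good predictor of house prices. Note that mean squared error is expressed in squared price units rather than as a percentage; a percentage figure for a regression model normally refers to a score such as R².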
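To make the evaluation concrete, here is a small sketch that computes both the mean squared error and the R² score with scikit-learn. The five true and predicted values are illustrative numbers chosen for easy arithmetic, not results from the model in this article.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Illustrative true and predicted prices (hypothetical values).
ytest = np.array([24.0, 21.6, 34.7, 33.4, 36.2])
y_pred = np.array([25.0, 20.6, 32.7, 33.4, 38.2])

# Mean of the squared differences, in squared price units.
# Errors are -1, 1, 2, 0, -2, so the squares sum to 10 and the mean is 2.0.
mse = mean_squared_error(ytest, y_pred)

# R^2: share of the variance in ytest explained by the predictions (1.0 is perfect).
r2 = r2_score(ytest, y_pred)

print("Mean squared error:", mse)
print("R^2 score:", r2)
```

A lower mean squared error and an R² closer to 1 both indicate a better fit; the two numbers are on different scales and should not be converted into each other by simple subtraction.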

So, to learn more about it in python for data science, you can check this and this as well.
