PATTERN RECOGNITION
HOUSE PRICE PREDICTION
MUMBAI
Presented By :
Shashi Ranjan (20BTCSE0061)
Sandeep Yadav (20BTCSE0056)
Submitted to :
Mr. Shubhashish Goswami
Objective
This project aims to predict the price of
houses in ‘Mumbai’ using Machine Learning.
Introduction
● Searching for a house in a city like Mumbai is very difficult. Even if
you find one, it is hard to judge whether the price asked for it is
fair.
● Machine Learning techniques can be used to address this problem.
● With a Machine Learning model, the price of a house can be estimated
from the area available, the number of bedrooms, and the facilities
available.
Dataset Description
Dataset Size : 6338 rows x 17 columns
The dataset contains the following parameters for house price prediction in ‘Mumbai’ :
• Price
• Area
• Location
• Number of Bedrooms
• New/Resale
• Gymnasium
• Lift Available
• Car Parking
• Maintenance Staff
• 24x7 Security
• Club House
• Intercom
• Landscaped Gardens
• Indoor Games
• Gas Connection
• Jogging Track
• Swimming Pool
Dataset Type
⮚ A dataset is a collection of related sets of information that is composed of separate elements
but can be manipulated as a unit by a computer.
[Diagram: Dataset contents]
• Categorical: Nominal, Ordinal, Binary
• Quantitative: Continuous, Discrete
Categorical Attributes
CATEGORICAL (NOMINAL, BINARY, ORDINAL)
NOMINAL
• Location
BINARY
• New/Resale
• Gymnasium
• Car Parking
• Maintenance Staff
• 24x7 Security
• Club House
• Intercom
• Landscaped Gardens
• Indoor Games
• Gas Connection
• Jogging Track
• Swimming Pool
Quantitative Attributes
QUANTITATIVE
DISCRETE
• Number of Bedrooms
• Lift Available
CONTINUOUS
• Price
• Area
Snapshot of dataset
The following snapshot shows the tail (last rows) of the dataset used for house
price prediction.
Data Visualisation (Scatter Plot)
The scatter plot shows that the ‘Price’ of a house
increases as its ‘Area’ increases.
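The trend seen in a scatter plot can also be summarised with a single number. The sketch below computes the Pearson correlation coefficient between ‘Area’ and ‘Price’ in plain Python. The values are made up for illustration and are not taken from the Mumbai dataset, and the helper name pearson_r is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up (area, price) pairs for illustration only, NOT from the dataset.
area = [450, 600, 750, 900, 1200, 1500]
price = [4_500_000, 6_200_000, 7_400_000, 9_800_000, 12_500_000, 16_000_000]

r = pearson_r(area, price)
print(f"Pearson r between Area and Price: {r:.3f}")
```

A value of r close to +1 indicates that price rises almost linearly with area; a value near 0 would indicate no linear relationship.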
Data Visualisation (Pie Chart)
The pie chart shows the percentage of available houses for each
value of ‘No. of Bedrooms’.
Data Visualisation (Histogram)
The histogram confirms the pie chart: it shows the number of
houses for each value of ‘No. of Bedrooms’.
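The counts behind a histogram and the percentages behind a pie chart come from the same column. A minimal sketch in plain Python, using made-up ‘No. of Bedrooms’ values that are not taken from the dataset:

```python
from collections import Counter

# Made-up 'No. of Bedrooms' values for illustration only, NOT from the dataset.
bedrooms = [1, 2, 2, 3, 1, 2, 4, 3, 2, 1]

# Histogram view: number of houses for each bedroom count.
counts = Counter(bedrooms)

# Pie chart view: the same counts expressed as a percentage of all houses.
total = sum(counts.values())
shares = {k: 100 * v / total for k, v in counts.items()}

for k in sorted(counts):
    print(f"{k} bedrooms: {counts[k]} houses ({shares[k]:.1f}%)")
```

Because both charts are derived from the same counts, they should always agree; a mismatch would point to a plotting mistake.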
Data Visualisation (Box-Whisker Plot)
The box-and-whisker plot shows the outliers
present in each parameter of the dataset.
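The slides do not state which rule was used to draw the whiskers. The sketch below assumes the common convention that a point is an outlier when it lies more than 1.5 x IQR below the first quartile or above the third quartile. The helper name iqr_outliers and the sample values are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # 'inclusive' interpolates linearly between data points, treating the
    # sample minimum and maximum as the 0th and 100th percentiles.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up 'Area' values for illustration only, NOT from the dataset.
area = [450, 500, 550, 600, 650, 700, 750, 800, 5000]
print(iqr_outliers(area))
```

Applying the same function to each quantitative column gives the outlier count per parameter that the plot displays visually.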
Data Visualisation (Box-Whisker Plot)
The pie chart shows the percentage of houses for each value of ‘No. of Bedrooms’.
Applying Machine Learning Algorithm
1. Set the independent variables as X and the target variable as Y.
2. Split the dataset into training and testing sets in a 70:30 ratio.
3. Fit multiple linear regression to the training set and get the score of the
model.
4. Fit a decision tree to the training set and get the score of the model.
5. Fit a random forest to the training set and get the score of the model.
6. Compare the model scores, along with the explained variance score, to
understand how each model performed.
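The steps above can be sketched as follows. The slides do not name the library used, so this sketch assumes scikit-learn and NumPy. It runs on synthetic stand-in data rather than the Mumbai dataset, and it assumes the models are scored on the held-out test set. For scikit-learn regressors, score() returns the coefficient of determination (R squared).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import explained_variance_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the Mumbai dataset): area, bedrooms, one amenity flag.
rng = np.random.default_rng(0)
n = 500
area = rng.uniform(300, 2500, n)
bedrooms = rng.integers(1, 5, n)
gym = rng.integers(0, 2, n)

# Step 1: independent variables X and target variable y (price).
X = np.column_stack([area, bedrooms, gym])
y = 9000 * area + 500_000 * bedrooms + 300_000 * gym + rng.normal(0, 1_000_000, n)

# Step 2: 70:30 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Steps 3 to 5: the three models named in the slides.
models = {
    "Multiple linear regression": LinearRegression(),
    "Decision tree": DecisionTreeRegressor(random_state=42),
    "Random forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

# Step 6: model score (R squared) and explained variance on the test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    r2 = model.score(X_test, y_test)
    evs = explained_variance_score(y_test, model.predict(X_test))
    results[name] = (r2, evs)
    print(f"{name}: score={r2:.3f}, explained variance={evs:.3f}")
```

The scores printed here describe the synthetic data only and say nothing about the results reported for the real dataset. On real data, categorical columns such as Location would need to be encoded as numbers before fitting.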
Conclusion
After applying the Linear Regression, Decision Tree, and
Random Forest Machine Learning algorithms, we observe
that Random Forest gives the highest model score, almost
46%.
Thank you
