SlideShare a Scribd company logo
1 of 37
HOUSE PRICE
PREDICTION USING
DEEP LEARNING
By Sabah Begum
1
ABSTRACT
Real estate is the least transparent industry in our ecosystem.
Housing prices keep changing day in and day out and
sometimes are hyped rather than being based on valuation.
Predicting housing prices with real factors is the main crux of
our research project. Here we aim to make our evaluations
based on every basic parameter that is considered while
determining the price. We use various regression techniques in
this pathway, and our results are not sole determination of one
technique rather it is the weighted mean of various techniques
to give most accurate results.
2
EXISTING SYSTEM
Data is at the heart of technical innovations, achieving any
result is now possible using predictive models. Machine
learning is extensively used in this approach. Although
development of correct and accurate model is needed.
Previous algorithms like SVM are not suitable for complex real-
world data.
3
PROPOSED SYSTEM
 Our main focus here is to develop a model which predicts the
property cost based on vast datasets using data mining
approach.
 The data mining process in a real estate industry provides an
advantage to the developers by processing the data,
forecasting future trends and thus assisting them to make
favourable knowledge-driven decisions.
 Our dataset comprises of various essential parameters and
data mining has been at the root of our system.
 We will initially clean up our entire dataset and also truncate
the outlier values.
 Finally a system will be made to predict the Real Estate
Prices using Deep Learning Approach.
4
FEASIBILITY STUDY
Basic areas to be covered are –
 Operational Feasibility: The applying can scale back the time
consumed to take care of manual records and isn’t dull and
cumbersome to take care of the records, therefore
operational practicableness is assured.
 Technical Feasibility: Minimum hardware requirements: 1.66
GHz Pentium Processor or Intel compatible processor. 1 GB
RAM net property, eighty MB disk area.
5
 Economical Feasibility :Once the hardware and software
package needs get consummated, there’s no want for the
user of our system to pay for any further overhead. For the
user, the applying are economically possible within the
following aspects: the applying can scale back tons of labor
work. therefore the efforts are reduced. Our application can
scale back the time that’s wasted in manual processes. The
storage and handling issues of the registers are resolved .
6
SYSTEM SPECIFICATIONS
Since this is an AI based Deep Learning approach, we are
going to use a dataset extracting the features of houses and
algorithmic models to train the dataset for predictions.
 Dataset- We are using an unsupervised dataset,i.e, data
without class labels. The objective of this unsupervised
technique, is to find patterns in data based on the
relationship between data points themselves.
7
8
• The dataset consists of more than 20,000 records.
• Around nine features are provided in the dataset like latitude,
total rooms, total bedrooms etc.
• Data mining technique is used for processing of the dataset
to extract relevant data to be further used in data models.
 Data model-The basic technique used for the model is
Regression Analysis which is a statistical method to model
the relationship between a dependent (target) and
independent (predictor) variables with one or more
independent variables. It predicts continuous/real values.
9
• For our application we are using three types of algorithms to
train the model, which are-
• KN-Neighbors: K nearest neighbors is a simple algorithm
that stores all available cases and predict the numerical
target based on a similarity measure (e.g., distance
functions)
• Support Vector Machine: The goal of the SVM algorithm is
to create the best line or decision boundary that can
segregate n-dimensional space into classes so that we can
easily put the new data point in the correct category in the
future.
• Artificial Neural Network(ANN): It’s a computational model
in which artificial neurons are nodes, and directed edges with
weights are connections between neuron outputs and neuron
inputs. For our project we will be using TensorFlow library to
integrate ANN in code.
10
HARDWARE AND SOFTWARE
REQUIREMENTS
SOFTWARE REQUIREMENTS-
 Python version- 3.8
 Libraries-
Pandas: library for manipulating data in numerical tables.
scikit-learn: machine learning library that features various
classification, regression and clustering algorithms.
Matplotlib: Matplotlib is a plotting library for creating static,
animated, and interactive visualizations in python.
11
Seaborn: library based on matplotlib. It provides a high-level
interface for drawing attractive and informative statistical
graphics
Tensorflow: The core open source library to help you develop
and train ML models.
Jupyterlab: JupyterLab is a web-based interactive
development environment for Jupyter notebooks, code, and
data.
12
HARDWARE REQUIREMENTS-
1.66 GHz Pentium Processor or Intel compatible processor.
1 GB RAM
 80 MB disk area.
13
DESIGN AND DEVELOPMENT
Since this project mainly integrates software features rather
than hardware ones, the emphasis is primarily on code
development as follows-
OVERVIEW OF STEPS
Step 1
 Import required libraries
 Import data from dataset
 Get some information about dataset.
14
15
Step 2-
 Visualizing the data in terms of house prices, affordability,
closeness to ocean, number of bedrooms etc.
 First we display a histogram for column
‘median_house_value’ from the dataset
 Next we display a Heatmap of the dataset. A heatmap()
method contains values representing various shades of the
same colour for each value to be plotted. Usually the darker
shades of the chart represent higher values than the lighter
shade.
 Next we display a Distplot() for columns
‘median_house_value’, ‘affordable’, and ‘ocean_proximity’. A
distplot() method lets you show a histogram with a line on it.
It plots a univariate distribution of observations.
16
17
18
 Next we display a Countplot() for the column
‘ocean_proximity’. A countplot() method is used to show the
counts of observations in each categorical bin using bars.
 Next we concatenate the values of dataset with values of
‘ocean_proximity’ column according to the different levels of
distance between the ocean and housing area.
 Next we view the countplot() of ‘affordable’ column, convert it
into string and replace string type “true, false” values by int
values 0,1 to represent if the house is affordable or not.
19
20
21
 Next we use the info() function on the dataset which basically
shows the datatypes used for different columns.
22
Step 3-
 Detecting and removing outliers using ‘IsolationForest’
algorithm.
 Isolation forest is an unsupervised learning algorithm for
anomaly detection that works on the principle of isolating
anomalies. It isolates the outliers by randomly selecting a
feature from the given set of features and then randomly
selecting a split value between the maximum and minimum
values of the selected feature.
 In our dataset after running the algorithm we get around 7170
outliers in the form of negative values(-1), which we remove
from the data, i.e., we delete 7170 rows from around 20,000
rows in the dataset.
23
24
Step 4-
 Splitting into training and testing data.
 We use the Train set to make the algorithm learn the data’s
behavior and then check the accuracy of our model on
the Test set.
 Divide the data into two values-
Features (X): The columns that are inserted into our model
will be used to make predictions.
Prediction (y): Target variable that will be predicted by the
features.
 After defining values of x, y we import train and test functions
from scikit learn library.
25
TRAINING AND TESTING
To test the accuracy of the model we use a metric
mean_absolute_percentage_error which we import from scikit
library. The algorithms used for training and testing the model
are as follows-
 KNN- First we import the KNeighborsRegressor() function
from scikit library and then run the algorithm on training set
and then on test set.
 We obtain the mean absolute percentage error of knn as
38.91504096125152 %
26
27
 SVM- Similarly we import SVR() from scikit library.
 Then we run the svm algorithm on the training data and then
the test data.
 The mean absolute percentage error of svm obtained is
32.246431895552135 %, which is slightly less than that of
knn algorithm.
 ANN- We use tensor flow library to build an ANN here for the
application.
 We also use the Keras library here which is an open source
deep learning framework for python. Keras is one of the most
powerful and easy to use python library, which is built on top
of popular deep learning libraries like TensorFlow, Theano,
etc., for creating deep learning models.
28
 To build the different layers of ANN we use ‘Flatten’ and
‘Dense’ function of keras framework.
 Dense layer is the regular deeply connected neural network
layer. It is most common and frequently used layer.
 Flatten layer is used to flatten the input.
 Then after building the layers of the model we need to
compile the model so that the code is converted into machine
readable code.
 We can view the summary of the model using
model.summary() function.
29
30
31
 Next we train the model on the training set and then test the
model using the test set.
 After testing, the mean absolute percentage error obtained is
18.315523022396036 %.
32
 As we notice , among the three models we find the ANN
model to be the most accurate on the dataset.
 Next we visualize the output of the three models graphically,
where the original output is represented in blue colour, the
svm output is represented in green colour and the ANN
output is represented in red colour.
33
IMPLEMENTATION AND OUTPUT
 To implement and check the accurate output of the model,
we first declare variables for each attribute of the dataset and
provide user defined values to the variables.
 These values are fed as input to the model which predicts the
final output, i.e, the price of the house based on the given
input.
 The final output or prediction is as follows-
34
35
CONCLUSION
From the training and testing of dataset on the model, a strong
deduction can be made that the model ANN works most
effectively on the data producing lower levels of errors, and
hence we can conclude that Deep Learning techniques prove
useful in the implementation and estimation of HOUSE PRICE
PREDICTION.
36
THANK YOU
37

More Related Content

What's hot

Predicting house price
Predicting house pricePredicting house price
Predicting house priceDivya Tiwari
 
House Price Prediction.pptx
House Price Prediction.pptxHouse Price Prediction.pptx
House Price Prediction.pptxCodingWorld5
 
House Price Estimates Based on Machine Learning Algorithm
House Price Estimates Based on Machine Learning AlgorithmHouse Price Estimates Based on Machine Learning Algorithm
House Price Estimates Based on Machine Learning Algorithmijtsrd
 
Data analytics with python introductory
Data analytics with python introductoryData analytics with python introductory
Data analytics with python introductoryAbhimanyu Dwivedi
 
IRJET- House Rent Price Prediction
IRJET- House Rent Price PredictionIRJET- House Rent Price Prediction
IRJET- House Rent Price PredictionIRJET Journal
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine LearningScaleway
 
housepriceprediction-ml.pptx
housepriceprediction-ml.pptxhousepriceprediction-ml.pptx
housepriceprediction-ml.pptxtommychauhan
 
Credit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning AlgorithmsCredit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning Algorithmsankit panigrahy
 
Random forest
Random forestRandom forest
Random forestUjjawal
 
CAR PRICE PREDICTION.pptx
CAR PRICE PREDICTION.pptxCAR PRICE PREDICTION.pptx
CAR PRICE PREDICTION.pptxNAVINCHACKO1
 
Overfitting & Underfitting
Overfitting & UnderfittingOverfitting & Underfitting
Overfitting & UnderfittingSOUMIT KAR
 
Supervised Machine Learning
Supervised Machine LearningSupervised Machine Learning
Supervised Machine LearningAnkit Rai
 
Machine Learning and its Applications
Machine Learning and its ApplicationsMachine Learning and its Applications
Machine Learning and its ApplicationsDr Ganesh Iyer
 
Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learningwgyn
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision treesKnoldus Inc.
 
Decision Tree - C4.5&CART
Decision Tree - C4.5&CARTDecision Tree - C4.5&CART
Decision Tree - C4.5&CARTXueping Peng
 

What's hot (20)

Predicting house price
Predicting house pricePredicting house price
Predicting house price
 
Housing price prediction
Housing price predictionHousing price prediction
Housing price prediction
 
House Price Prediction.pptx
House Price Prediction.pptxHouse Price Prediction.pptx
House Price Prediction.pptx
 
House Price Estimates Based on Machine Learning Algorithm
House Price Estimates Based on Machine Learning AlgorithmHouse Price Estimates Based on Machine Learning Algorithm
House Price Estimates Based on Machine Learning Algorithm
 
Data analytics with python introductory
Data analytics with python introductoryData analytics with python introductory
Data analytics with python introductory
 
IRJET- House Rent Price Prediction
IRJET- House Rent Price PredictionIRJET- House Rent Price Prediction
IRJET- House Rent Price Prediction
 
Machine learning
Machine learningMachine learning
Machine learning
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine Learning
 
housepriceprediction-ml.pptx
housepriceprediction-ml.pptxhousepriceprediction-ml.pptx
housepriceprediction-ml.pptx
 
Credit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning AlgorithmsCredit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning Algorithms
 
Random forest
Random forestRandom forest
Random forest
 
CAR PRICE PREDICTION.pptx
CAR PRICE PREDICTION.pptxCAR PRICE PREDICTION.pptx
CAR PRICE PREDICTION.pptx
 
Overfitting & Underfitting
Overfitting & UnderfittingOverfitting & Underfitting
Overfitting & Underfitting
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Supervised Machine Learning
Supervised Machine LearningSupervised Machine Learning
Supervised Machine Learning
 
Machine Learning and its Applications
Machine Learning and its ApplicationsMachine Learning and its Applications
Machine Learning and its Applications
 
Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learning
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
 
Decision Tree - C4.5&CART
Decision Tree - C4.5&CARTDecision Tree - C4.5&CART
Decision Tree - C4.5&CART
 
supervised learning
supervised learningsupervised learning
supervised learning
 

Similar to House price prediction

AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONIRJET Journal
 
Build a simple image recognition system with tensor flow
Build a simple image recognition system with tensor flowBuild a simple image recognition system with tensor flow
Build a simple image recognition system with tensor flowDebasisMohanty37
 
IRJET - Hand Gesture Recognition to Perform System Operations
IRJET -  	  Hand Gesture Recognition to Perform System OperationsIRJET -  	  Hand Gesture Recognition to Perform System Operations
IRJET - Hand Gesture Recognition to Perform System OperationsIRJET Journal
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnBenjamin Bengfort
 
AMAZON STOCK PRICE PREDICTION BY USING SMLT
AMAZON STOCK PRICE PREDICTION BY USING SMLTAMAZON STOCK PRICE PREDICTION BY USING SMLT
AMAZON STOCK PRICE PREDICTION BY USING SMLTIRJET Journal
 
DLT UNIT-3.docx
DLT  UNIT-3.docxDLT  UNIT-3.docx
DLT UNIT-3.docx0567Padma
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple stepsRenjith M P
 
Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)Fatimakhan325
 
Handwritten Digit Recognition using Convolutional Neural Networks
Handwritten Digit Recognition using Convolutional Neural  NetworksHandwritten Digit Recognition using Convolutional Neural  Networks
Handwritten Digit Recognition using Convolutional Neural NetworksIRJET Journal
 
Rachit Mishra_stock prediction_report
Rachit Mishra_stock prediction_reportRachit Mishra_stock prediction_report
Rachit Mishra_stock prediction_reportRachit Mishra
 
Real Estate Investment Advising Using Machine Learning
Real Estate Investment Advising Using Machine LearningReal Estate Investment Advising Using Machine Learning
Real Estate Investment Advising Using Machine LearningIRJET Journal
 
Machine learning Experiments report
Machine learning Experiments report Machine learning Experiments report
Machine learning Experiments report AlmkdadAli
 
Text Recognition using Convolutional Neural Network: A Review
Text Recognition using Convolutional Neural Network: A ReviewText Recognition using Convolutional Neural Network: A Review
Text Recognition using Convolutional Neural Network: A ReviewIRJET Journal
 
Hand Written Digit Classification
Hand Written Digit ClassificationHand Written Digit Classification
Hand Written Digit Classificationijtsrd
 
Bangla Handwritten Digit Recognition Report.pdf
Bangla Handwritten Digit Recognition  Report.pdfBangla Handwritten Digit Recognition  Report.pdf
Bangla Handwritten Digit Recognition Report.pdfKhondokerAbuNaim
 
Performance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsPerformance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsDinusha Dilanka
 
Blood Cell Image Classification for Detecting Malaria using CNN
Blood Cell Image Classification for Detecting Malaria using CNNBlood Cell Image Classification for Detecting Malaria using CNN
Blood Cell Image Classification for Detecting Malaria using CNNIRJET Journal
 

Similar to House price prediction (20)

AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTION
 
Build a simple image recognition system with tensor flow
Build a simple image recognition system with tensor flowBuild a simple image recognition system with tensor flow
Build a simple image recognition system with tensor flow
 
IRJET - Hand Gesture Recognition to Perform System Operations
IRJET -  	  Hand Gesture Recognition to Perform System OperationsIRJET -  	  Hand Gesture Recognition to Perform System Operations
IRJET - Hand Gesture Recognition to Perform System Operations
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
AMAZON STOCK PRICE PREDICTION BY USING SMLT
AMAZON STOCK PRICE PREDICTION BY USING SMLTAMAZON STOCK PRICE PREDICTION BY USING SMLT
AMAZON STOCK PRICE PREDICTION BY USING SMLT
 
DLT UNIT-3.docx
DLT  UNIT-3.docxDLT  UNIT-3.docx
DLT UNIT-3.docx
 
Stock Market Prediction Using ANN
Stock Market Prediction Using ANNStock Market Prediction Using ANN
Stock Market Prediction Using ANN
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple steps
 
Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)
 
Handwritten Digit Recognition using Convolutional Neural Networks
Handwritten Digit Recognition using Convolutional Neural  NetworksHandwritten Digit Recognition using Convolutional Neural  Networks
Handwritten Digit Recognition using Convolutional Neural Networks
 
Rachit Mishra_stock prediction_report
Rachit Mishra_stock prediction_reportRachit Mishra_stock prediction_report
Rachit Mishra_stock prediction_report
 
Real Estate Investment Advising Using Machine Learning
Real Estate Investment Advising Using Machine LearningReal Estate Investment Advising Using Machine Learning
Real Estate Investment Advising Using Machine Learning
 
Machine learning Experiments report
Machine learning Experiments report Machine learning Experiments report
Machine learning Experiments report
 
Text Recognition using Convolutional Neural Network: A Review
Text Recognition using Convolutional Neural Network: A ReviewText Recognition using Convolutional Neural Network: A Review
Text Recognition using Convolutional Neural Network: A Review
 
Hand Written Digit Classification
Hand Written Digit ClassificationHand Written Digit Classification
Hand Written Digit Classification
 
Bangla Handwritten Digit Recognition Report.pdf
Bangla Handwritten Digit Recognition  Report.pdfBangla Handwritten Digit Recognition  Report.pdf
Bangla Handwritten Digit Recognition Report.pdf
 
Performance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsPerformance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning Algorithms
 
Blood Cell Image Classification for Detecting Malaria using CNN
Blood Cell Image Classification for Detecting Malaria using CNNBlood Cell Image Classification for Detecting Malaria using CNN
Blood Cell Image Classification for Detecting Malaria using CNN
 
fINAL ML PPT.pptx
fINAL ML PPT.pptxfINAL ML PPT.pptx
fINAL ML PPT.pptx
 
rscript_paper-1
rscript_paper-1rscript_paper-1
rscript_paper-1
 

Recently uploaded

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

House price prediction

  • 1. HOUSE PRICE PREDICTION USING DEEP LEARNING By Sabah Begum 1
  • 2. ABSTRACT Real estate is the least transparent industry in our ecosystem. Housing prices keep changing day in and day out and sometimes are hyped rather than being based on valuation. Predicting housing prices with real factors is the main crux of our research project. Here we aim to make our evaluations based on every basic parameter that is considered while determining the price. We use various regression techniques in this pathway, and our results are not sole determination of one technique rather it is the weighted mean of various techniques to give most accurate results. 2
  • 3. EXISTING SYSTEM Data is at the heart of technical innovations, achieving any result is now possible using predictive models. Machine learning is extensively used in this approach. Although development of correct and accurate model is needed. Previous algorithms like SVM are not suitable for complex real- world data. 3
  • 4. PROPOSED SYSTEM  Our main focus here is to develop a model which predicts the property cost based on vast datasets using data mining approach.  The data mining process in a real estate industry provides an advantage to the developers by processing the data, forecasting future trends and thus assisting them to make favourable knowledge-driven decisions.  Our dataset comprises of various essential parameters and data mining has been at the root of our system.  We will initially clean up our entire dataset and also truncate the outlier values.  Finally a system will be made to predict the Real Estate Prices using Deep Learning Approach. 4
  • 5. FEASIBILITY STUDY Basic areas to be covered are –  Operational Feasibility: The applying can scale back the time consumed to take care of manual records and isn’t dull and cumbersome to take care of the records, therefore operational practicableness is assured.  Technical Feasibility: Minimum hardware requirements: 1.66 GHz Pentium Processor or Intel compatible processor. 1 GB RAM net property, eighty MB disk area. 5
  • 6.  Economical Feasibility :Once the hardware and software package needs get consummated, there’s no want for the user of our system to pay for any further overhead. For the user, the applying are economically possible within the following aspects: the applying can scale back tons of labor work. therefore the efforts are reduced. Our application can scale back the time that’s wasted in manual processes. The storage and handling issues of the registers are resolved . 6
  • 7. SYSTEM SPECIFICATIONS Since this is an AI based Deep Learning approach, we are going to use a dataset extracting the features of houses and algorithmic models to train the dataset for predictions.  Dataset- We are using an unsupervised dataset,i.e, data without class labels. The objective of this unsupervised technique, is to find patterns in data based on the relationship between data points themselves. 7
  • 8. 8
  • 9. • The dataset consists of more than 20,000 records. • Around nine features are provided in the dataset like latitude, total rooms, total bedrooms etc. • Data mining technique is used for processing of the dataset to extract relevant data to be further used in data models.  Data model-The basic technique used for the model is Regression Analysis which is a statistical method to model the relationship between a dependent (target) and independent (predictor) variables with one or more independent variables. It predicts continuous/real values. 9
  • 10. • For our application we are using three types of algorithms to train the model, which are- • KN-Neighbors: K nearest neighbors is a simple algorithm that stores all available cases and predict the numerical target based on a similarity measure (e.g., distance functions) • Support Vector Machine: The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes so that we can easily put the new data point in the correct category in the future. • Artificial Neural Network(ANN): It’s a computational model in which artificial neurons are nodes, and directed edges with weights are connections between neuron outputs and neuron inputs. For our project we will be using TensorFlow library to integrate ANN in code. 10
  • 11. HARDWARE AND SOFTWARE REQUIREMENTS SOFTWARE REQUIREMENTS-  Python version- 3.8  Libraries- Pandas: library for manipulating data in numerical tables. scikit-learn: machine learning library that features various classification, regression and clustering algorithms. Matplotlib: Matplotlib is a plotting library for creating static, animated, and interactive visualizations in python. 11
  • 12. Seaborn: library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics Tensorflow: The core open source library to help you develop and train ML models. Jupyterlab: JupyterLab is a web-based interactive development environment for Jupyter notebooks, code, and data. 12
  • 13. HARDWARE REQUIREMENTS- 1.66 GHz Pentium Processor or Intel compatible processor. 1 GB RAM  80 MB disk area. 13
  • 14. DESIGN AND DEVELOPMENT Since this project mainly integrates software features rather than hardware ones, the emphasis is primarily on code development as follows- OVERVIEW OF STEPS Step 1  Import required libraries  Import data from dataset  Get some information about dataset. 14
  • 15. 15
  • 16. Step 2-  Visualizing the data in terms of house prices, affordability, closeness to ocean, number of bedrooms etc.  First we display a histogram for column ‘median_house_value’ from the dataset  Next we display a Heatmap of the dataset. A heatmap() method contains values representing various shades of the same colour for each value to be plotted. Usually the darker shades of the chart represent higher values than the lighter shade.  Next we display a Distplot() for columns ‘median_house_value’, ‘affordable’, and ‘ocean_proximity’. A distplot() method lets you show a histogram with a line on it. It plots a univariate distribution of observations. 16
  • 17. 17
  • 18. 18
  • 19.  Next we display a Countplot() for the column ‘ocean_proximity’. A countplot() method is used to show the counts of observations in each categorical bin using bars.  Next we concatenate the values of dataset with values of ‘ocean_proximity’ column according to the different levels of distance between the ocean and housing area.  Next we view the countplot() of ‘affordable’ column, convert it into string and replace string type “true, false” values by int values 0,1 to represent if the house is affordable or not. 19
  • 20. 20
  • 21. 21
  • 22.  Next we use the info() function on the dataset which basically shows the datatypes used for different columns. 22
  • 23. Step 3-  Detecting and removing outliers using ‘IsolationForest’ algorithm.  Isolation forest is an unsupervised learning algorithm for anomaly detection that works on the principle of isolating anomalies. It isolates the outliers by randomly selecting a feature from the given set of features and then randomly selecting a split value between the maximum and minimum values of the selected feature.  In our dataset after running the algorithm we get around 7170 outliers in the form of negative values(-1), which we remove from the data, i.e., we delete 7170 rows from around 20,000 rows in the dataset. 23
  • 24. 24
  • 25. Step 4-  Splitting into training and testing data.  We use the Train set to make the algorithm learn the data’s behavior and then check the accuracy of our model on the Test set.  Divide the data into two values- Features (X): The columns that are inserted into our model will be used to make predictions. Prediction (y): Target variable that will be predicted by the features.  After defining values of x, y we import train and test functions from scikit learn library. 25
  • 26. TRAINING AND TESTING To test the accuracy of the model we use a metric mean_absolute_percentage_error which we import from scikit library. The algorithms used for training and testing the model are as follows-  KNN- First we import the KNeighborsRegressor() function from scikit library and then run the algorithm on training set and then on test set.  We obtain the mean absolute percentage error of knn as 38.91504096125152 % 26
  • 27. 27
  • 28.  SVM- Similarly we import SVR() from scikit library.  Then we run the svm algorithm on the training data and then the test data.  The mean absolute percentage error of svm obtained is 32.246431895552135 %, which is slightly less than that of knn algorithm.  ANN- We use tensor flow library to build an ANN here for the application.  We also use the Keras library here which is an open source deep learning framework for python. Keras is one of the most powerful and easy to use python library, which is built on top of popular deep learning libraries like TensorFlow, Theano, etc., for creating deep learning models. 28
  • 29.  To build the different layers of ANN we use ‘Flatten’ and ‘Dense’ function of keras framework.  Dense layer is the regular deeply connected neural network layer. It is most common and frequently used layer.  Flatten layer is used to flatten the input.  Then after building the layers of the model we need to compile the model so that the code is converted into machine readable code.  We can view the summary of the model using model.summary() function. 29
  • 30. 30
  • 31. 31
  • 32.  Next we train the model on the training set and then test the model using the test set.  After testing, the mean absolute percentage error obtained is 18.315523022396036 %. 32
  • 33.  As we notice , among the three models we find the ANN model to be the most accurate on the dataset.  Next we visualize the output of the three models graphically, where the original output is represented in blue colour, the svm output is represented in green colour and the ANN output is represented in red colour. 33
  • 34. IMPLEMENTATION AND OUTPUT  To implement and check the accurate output of the model, we first declare variables for each attribute of the dataset and provide user defined values to the variables.  These values are fed as input to the model which predicts the final output, i.e, the price of the house based on the given input.  The final output or prediction is as follows- 34
  • 35. 35
  • 36. CONCLUSION From the training and testing of dataset on the model, a strong deduction can be made that the model ANN works most effectively on the data producing lower levels of errors, and hence we can conclude that Deep Learning techniques prove useful in the implementation and estimation of HOUSE PRICE PREDICTION. 36