This document discusses using machine learning to predict laptop prices from laptop specifications. It proposes using a random forest algorithm on a dataset containing variables such as laptop model, RAM, storage, GPU, CPU, display, and touchscreen to predict laptop price. Exploratory data analysis and preprocessing are performed before implementing the random forest model, which achieves 89% prediction accuracy. A Streamlit web app is created to demonstrate the model's laptop price predictions for user-selected configurations. The conclusion is that the model can help students select appropriately priced laptops that meet their needs.
3. Introduction
• Laptop price prediction, especially when laptops ship directly from the factory to electronics markets and stores, is a critical and important task. The mad rush that we saw in 2020 for laptops to support remote work and learning is no longer there. In India, demand for laptops soared after the nationwide lockdown, leading to 4.1 million unit shipments in the June quarter of 2021, the highest in five years. Accurate laptop price prediction requires expert knowledge, because price usually depends on many distinctive features and factors; typically, the most significant ones are brand and model, RAM, ROM, GPU, and CPU. In this paper, we applied different methods and techniques in order to achieve higher precision in laptop price prediction.
4. Abstract
This paper presents a laptop price prediction system using supervised machine learning. The research uses the random forest algorithm as the prediction method, which achieved 89% prediction precision. With the random forest algorithm there are multiple independent variables but only one dependent variable, whose actual and predicted values are compared to measure the precision of the results. This paper proposes a system in which price is the dependent variable to be predicted, derived from factors such as the laptop's model, RAM, ROM (HDD/SSD), GPU, CPU, IPS display, and touchscreen.
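The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact code: the feature encoding, weights, and synthetic prices below are assumptions made up for the example, and the score is computed as R² on a held-out split rather than the paper's 89% figure.

```python
# Minimal sketch: train a random forest regressor on hypothetical encoded
# laptop features and compare actual vs. predicted prices on held-out data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Hypothetical encoded features: RAM (GB), storage (GB), GPU tier,
# CPU tier, IPS display flag, touchscreen flag.
X = np.column_stack([
    rng.choice([4, 8, 16, 32], n),    # RAM
    rng.choice([256, 512, 1024], n),  # storage
    rng.integers(0, 3, n),            # GPU tier
    rng.integers(0, 4, n),            # CPU tier
    rng.integers(0, 2, n),            # IPS display
    rng.integers(0, 2, n),            # touchscreen
]).astype(float)
# Synthetic price: a weighted sum of the features plus noise.
y = (200 + 15 * X[:, 0] + 0.2 * X[:, 1] + 120 * X[:, 2]
     + 90 * X[:, 3] + 40 * X[:, 4] + 60 * X[:, 5]
     + rng.normal(0, 30, n))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
score = r2_score(y_test, model.predict(X_test))
print(f"R^2 on held-out data: {score:.2f}")
```

On a real dataset the categorical columns (brand, model, CPU name) would first be label- or one-hot-encoded before being passed to the forest.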
5. Problem Statement
• The problem statement is that when a user wants to buy a laptop, our application should be able to provide a tentative laptop price for the user's chosen configuration.
• Although it looks like a simple project of just developing a model, the dataset we have is noisy and requires extensive feature engineering and preprocessing, which is what makes this project interesting to develop.
7. Exploratory Data Analysis
(EDA)
• EDA helps us form and test hypotheses about the data. We start from the first column, explore each column in turn, and examine the impact it has on the target column. Where required, we also perform preprocessing and feature engineering. Our aim in performing in-depth EDA is to prepare and clean the data for better machine learning modeling, so as to achieve high-performing, well-generalized models.
10. Implementation
• Working of the random forest algorithm
• Before understanding the working of the random forest, we must look into the ensemble technique. Ensemble simply means combining multiple models: a collection of models is used to make predictions rather than an individual model.
• Ensembles use two main types of methods:
• 1. Bagging – It creates different training subsets from the sample training data with replacement, and the final output is based on majority voting (or averaging). Random forest is an example.
• 2. Boosting – It combines weak learners into a strong learner by building models sequentially, so that the final model has the highest accuracy.
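The two ensemble styles above can be contrasted in a short scikit-learn sketch. The dataset here is synthetic; the point is only the structural difference between the parallel (bagging) and sequential (boosting) estimators:

```python
# Bagging vs. boosting on a synthetic regression task.
# Bagging trains trees on bootstrap samples independently and averages
# them (random forest is a special case); gradient boosting fits shallow
# trees sequentially, each one correcting the previous ensemble's errors.
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor

X, y = make_regression(n_samples=300, n_features=6, noise=10,
                       random_state=0)

# Bagging: bootstrap samples + averaged decision trees (the default base
# estimator is a decision tree).
bagging = BaggingRegressor(n_estimators=50, random_state=0).fit(X, y)

# Boosting: sequential shallow trees (weak learners).
boosting = GradientBoostingRegressor(n_estimators=50, max_depth=2,
                                     random_state=0).fit(X, y)

print("bagging  R^2:", round(bagging.score(X, y), 2))
print("boosting R^2:", round(boosting.score(X, y), 2))
```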
11. • Steps involved in the random forest algorithm:
• Step 1: n random samples are drawn with replacement from a data set having k records.
• Step 2: An individual decision tree is constructed for each sample.
• Step 3: Each decision tree generates an output.
• Step 4: The final output is based on majority voting (classification) or averaging (regression).
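The four steps can be hand-rolled in a few lines, which makes the bootstrap-and-average structure explicit. This is an illustrative sketch on synthetic data, not the paper's implementation:

```python
# Steps 1-4 of random forest (regression), written out by hand:
# draw bootstrap samples, fit one decision tree per sample, then
# average the trees' outputs.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=5,
                       random_state=0)
rng = np.random.default_rng(0)
k = len(X)

trees = []
for _ in range(25):                    # Step 1: n bootstrap samples
    idx = rng.integers(0, k, size=k)   # k records, with replacement
    trees.append(                      # Step 2: one tree per sample
        DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx]))

# Step 3: each tree predicts; Step 4: average the outputs (regression).
preds = np.mean([t.predict(X) for t in trees], axis=0)
print("mean absolute error:", round(float(np.mean(np.abs(preds - y))), 2))
```

For classification, the averaging in the last step would be replaced by a majority vote over the trees' predicted classes.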
13. Important Features of
Random Forest
• 1. Diversity – Not all attributes/variables/features are considered while making an individual tree; each tree is different.
• 2. Immune to the curse of dimensionality – Since each tree does not consider all the features, the feature space is reduced.
• 3. Parallelization – Each tree is created independently from different data and attributes, so we can make full use of the CPU to build random forests.
• 4. Train-test split – In a random forest we do not have to segregate the data into train and test sets, as roughly one-third of the data (the out-of-bag samples) is never seen by a given decision tree.
• 5. Stability – Stability arises because the result is based on majority voting/averaging.
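Point 4 is what scikit-learn exposes as the out-of-bag score: because each tree's bootstrap sample misses roughly a third of the rows, those rows can serve as a built-in validation set. A brief sketch on synthetic data:

```python
# Out-of-bag evaluation: enabling oob_score=True makes the forest score
# each row using only the trees that did NOT see it during training,
# giving a validation estimate without a separate test split.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=400, n_features=6, noise=10,
                       random_state=0)
forest = RandomForestRegressor(n_estimators=200, oob_score=True,
                               bootstrap=True, random_state=0).fit(X, y)
print("out-of-bag R^2:", round(forest.oob_score_, 2))
```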
14. Result
• Streamlit is an open-source Python library that makes it easy to create and share custom web apps for machine learning and data science. The result, along with the backend code, is shown in the following figures.
16. Conclusions
• Predicting prices through machine learning with the random forest algorithm makes it easy for students to choose the laptop specifications that best meet their needs and match their purchasing power. Students no longer need to consult multiple sources to find the laptop specifications they require, because the machine learning application presents the most desirable specifications together with their laptop prices.